<span style="color:Blue">
Determine a set of test cases/questions and answers and use these to test each package/method.<br>

Check that:<br>
<ol>
<li>The correct features are used, and</li>
<li>The calculations are correct.</li>
</ol>
</span>
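One way to organize such test cases is sketched below. The names `QuestionCase` and `check_case` are hypothetical helpers introduced only for illustration; the figures are the Alabama 2012 values from the ACS rows displayed in this notebook.

```python
from typing import Callable, NamedTuple

import pandas as pd


class QuestionCase(NamedTuple):
    question: str                             # question posed to the package/method
    features: list                            # columns the answer should rely on
    expected: float                           # answer worked out by hand
    compute: Callable[[pd.DataFrame], float]  # reference calculation


def check_case(case: QuestionCase, df: pd.DataFrame, tol: float = 1e-9) -> bool:
    """Return True when every required feature exists and the calculation matches."""
    if any(f not in df.columns for f in case.features):
        return False
    return bool(abs(case.compute(df) - case.expected) <= tol)


# Alabama, 2012, copied from the ACS rows displayed in this notebook
sample_df = pd.DataFrame({
    "Occupied housing units": [1837576],
    "Owner-occupied units": [1289324],
})

case = QuestionCase(
    question="What is the overall homeownership rate in the area under study?",
    features=["Owner-occupied units", "Occupied housing units"],
    expected=1289324 / 1837576,
    compute=lambda df: df["Owner-occupied units"].sum() / df["Occupied housing units"].sum(),
)

print(check_case(case, sample_df))
```

Keeping the required features next to the expected answer lets a single check cover both points in the list above.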

# Questions

The ten questions below examine housing tenure, in the context of housing and affordability, using American Community Survey (ACS) data:

1. **What is the overall homeownership rate in the area under study?**
   - This question provides an initial understanding of the prevalence of homeownership in the community.

2. **How has the homeownership rate changed over the past decade?**
   - Analyzing trends in homeownership rates can reveal shifts in housing tenure patterns.

3. **What is the homeownership rate broken down by age groups (e.g., young adults, middle-aged, seniors)?**
   - Examining tenure by age group can highlight generational differences in homeownership.

4. **What is the homeownership rate broken down by race and ethnicity?**
   - Investigating tenure by race and ethnicity can uncover disparities in access to homeownership opportunities.

5. **Are there notable gender differences in homeownership rates within the community?**
   - Exploring tenure by gender can reveal gender-based disparities in housing.

6. **What percentage of homeowners have mortgages, and what is the average mortgage debt for homeowners in the area?**
   - Understanding mortgage usage and debt levels can provide insights into housing affordability.

7. **What is the median home value for owner-occupied units in the area?**
   - This question helps assess the affordability of homes for potential buyers.

8. **How do homeownership rates vary across different neighborhoods or regions within the community?**
   - Analyzing geographic variations can highlight areas with higher or lower homeownership rates.

9. **What is the percentage of households in the area that receive housing assistance or subsidies (e.g., Section 8 vouchers)?**
   - This question can shed light on the role of government programs in housing tenure.

10. **What are the reasons cited by renters for not owning homes, and how do these reasons differ across demographic groups?**
    - Survey respondents may provide reasons such as affordability constraints, credit issues, or personal preferences for renting.

These questions can serve as a starting point for your analysis of housing tenure in ACS data. Depending on your specific research objectives and the available data, you may want to explore additional dimensions or conduct more in-depth analyses to better understand housing tenure patterns in your chosen area.

In [1]:
import os
import pandas as pd
from pathlib import Path
from typing import NamedTuple

In [2]:
pd.set_option("display.max_columns", None)  # display all columns
pd.set_option("display.float_format", "{:,}".format)  # display floats with thousands separators; use "{:,.2f}".format for 2 decimals

In [3]:
# Typed container for the notebook parameters
class ParametersType(NamedTuple):
    acs_path: Path  # Platform-neutral pathlib.Path to the ACS data
    openai_api_key: str  # OpenAI API key

In [4]:
Parameters: ParametersType = ParametersType(
    acs_path = Path("./Data/ACS_2012_21.csv"),
    openai_api_key = os.environ["OPENAI_API_KEY"]
)

# ACS Data

In [5]:
acs_df: pd.DataFrame = pd.read_csv(Parameters.acs_path)
acs_df.drop(columns=["Unnamed: 0"], inplace=True)
acs_df.info()  # info() prints its summary directly and returns None
display(acs_df.head())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 520 entries, 0 to 519
Data columns (total 95 columns):
 #   Column                                                        Non-Null Count  Dtype  
---  ------                                                        --------------  -----  
 0   Geography                                                     520 non-null    object 
 1   Geographic Area Name                                          520 non-null    object 
 2   Total population                                              520 non-null    int64  
 3   Male                                                          520 non-null    int64  
 4   Female                                                        520 non-null    int64  
 5   Under 5 years                                                 520 non-null    int64  
 6   5 to 9 years                                                  520 non-null    int64  
 7   10 to 14 years                                                520 non-n

Unnamed: 0,Geography,Geographic Area Name,Total population,Male,Female,Under 5 years,5 to 9 years,10 to 14 years,15 to 19 years,20 to 24 years,25 to 34 years,35 to 44 years,45 to 54 years,55 to 59 years,60 to 64 years,65 to 74 years,75 to 84 years,85 years and over,Median age (years),18 years and over,21 years and over,65 years and over,One race,Two or more races,White,Black or African American,American Indian and Alaska Native,Asian,Native Hawaiian and Other Pacific Islander,Some other race,Hispanic or Latino (of any race),Not Hispanic or Latino,Total housing units_x,YEAR,Total households,Households with one or more people under 18 years,Households with one or more people 65 years and over,Average household size,Males 15 years and over,"Never married, Males 15 years and over","Now married, except separated, Males 15 years and over","Separated, Males 15 years and over","Widowed, Males 15 years and over","Divorced, Males 15 years and over",Females 15 years and over,"Never married, Females 15 years and over","Now married, except separated, Females 15 years and over","Separated, Females 15 years and over","Widowed, Females 15 years and over","Divorced, Females 15 years and over",Population 3 years and over enrolled in school,Population 25 years and over,Less than 9th grade,"9th to 12th grade, no diploma",High school graduate (includes equivalency),"Some college, no degree",Associate's degree,Bachelor's degree,Graduate or professional degree,Native,Foreign born,Language other than English,Population 16 years and over,In labor force,Civilian labor force,Employed,Unemployed,Armed Forces,Not in labor force,Civilian employed population 16 years and over,"Management, business, science, and arts occupations",Service occupations,Sales and office occupations,"Natural resources, construction, and maintenance occupations","Production, transportation, and material moving occupations",Median family income (dollars),Mean family income (dollars),Per capita income (dollars),Civilian noninstitutionalized population,With health insurance coverage,With private health insurance,With public coverage,No health insurance coverage,Total housing units_y,Occupied housing units,Vacant housing units,Homeowner vacancy rate,Rental vacancy rate,Median rooms,"Median (dollars), Value",Owner-occupied units,Housing units with a mortgage,Housing units without a mortgage,"Median (dollars), Rent",No rent paid
0,0400000US01,Alabama,4777326,2317520,2459806,305091,309360,318484,337159,340808,607797,619112,686672,310336,279202,374441,211411,77453,37.8,3647097,3433673,663305,4710487,66839,3379235,1285740,57219,66512,2901,56469,182268,4595058,2172647,2012,1837576.0,597337.0,475330.0,2.54,1841356.0,584355.0,957174.0,39467.0,52899.0,207461.0,2003035.0,517693.0,937087.0,61290.0,222434.0,264531.0,1220805.0,3166424.0,187882.0,363148.0,991406.0,691686.0,227301.0,448117.0,256884.0,4610592.0,166734.0,230806.0,3779457,2265008,2248665,2017887,230778,16343,1514449,2017887,643951,332351,507206,218389,315990,54326,70237,23587,4693822,4039446,3112613,1524117,654376,2172647,1837576,335071,2.5,9.0,5.7,122300,1289324,776946,512378,691,63064
1,0400000US01,Alabama,4817678,2336020,2481658,299571,304412,321104,327579,347110,618482,610792,675347,322017,292003,401417,217634,80210,38.2,3699760,3491373,699261,4741250,76428,3393927,1304167,58134,72528,3737,65898,191838,4625840,2190638,2013,1838683.0,592093.0,486161.0,2.55,1854423.0,596005.0,952726.0,40229.0,54972.0,210491.0,2016442.0,530755.0,932261.0,61058.0,222995.0,269373.0,1222995.0,3193338.0,180671.0,358529.0,991730.0,703243.0,236473.0,458393.0,264299.0,4631045.0,168232.0,235741.0,3806434,2259344,2244093,2002163,241930,15251,1547090,2002163,652201,335865,494371,209494,310232,54362,70661,23680,4716915,4061521,3093955,1568842,655394,2178116,1838683,339433,2.5,8.9,5.7,122500,1281604,762450,519154,705,62178
2,0400000US01,Alabama,4799277,2328592,2470685,301925,306456,320031,332287,345240,612596,615375,681953,316960,285880,387589,214771,78214,38.1,3675910,3463812,680574,4727132,72145,3388895,1294437,57376,69176,3345,62114,188294,4610983,2178116,2014,1842174.0,585393.0,497517.0,2.55,1863744.0,603407.0,950855.0,39418.0,56153.0,213911.0,2028847.0,540860.0,932449.0,60148.0,222066.0,273324.0,1215753.0,3217902.0,174324.0,350044.0,999761.0,708087.0,243873.0,465268.0,276545.0,4651201.0,166477.0,235709.0,3828799,2253005,2239169,2010453,228716,13836,1575794,2010453,661335,339569,489899,203989,315661,54724,71423,23936,4735953,4095677,3097222,1607735,640276,2190638,1842174,348464,2.6,9.0,5.7,123800,1274196,751234,522962,715,62842
3,0400000US01,Alabama,4841164,2346193,2494971,292771,305707,313980,324809,342489,626564,606216,656639,332234,297361,434510,225663,82221,38.6,3735975,3530912,742394,4755752,85412,3400118,1320276,58977,77731,4858,69651,193503,4647661,2209335,2015,1848325.0,581680.0,511256.0,2.55,1872257.0,613105.0,950508.0,38073.0,57128.0,213443.0,2039158.0,550811.0,933161.0,59843.0,221378.0,273965.0,1206014.0,3239351.0,166885.0,343006.0,1005295.0,711180.0,251335.0,478812.0,282838.0,4663396.0,167224.0,235540.0,3846845,2242401,2229422,2022325,207097,12979,1604444,2022325,673400,339082,488066,200200,321577,55341,71994,24091,4749786,4148627,3122837,1648163,601159,2199329,1848325,351004,2.5,9.0,5.7,125500,1269145,738618,530527,717,63906
4,0400000US01,Alabama,4830620,2341093,2489527,295054,305714,318437,324020,348044,621592,609415,665372,326349,297297,416983,220721,81622,38.4,3718646,3514202,719326,4748974,81646,3396662,1312584,58251,75634,5186,69042,193492,4637128,2199329,2016,1851061.0,572325.0,525341.0,2.55,1881213.0,621892.0,948574.0,37310.0,58482.0,214955.0,2047493.0,559931.0,932449.0,58487.0,222097.0,274529.0,1193757.0,3261408.0,162018.0,334018.0,1009593.0,714201.0,258502.0,492382.0,290694.0,4675660.0,165504.0,233260.0,3864302,2238654,2226504,2042025,184479,12150,1625648,2042025,685523,339793,489112,199303,328294,56828,74189,24736,4761291,4208373,3162223,1687781,552918,2209335,1851061,358274,2.4,9.4,5.7,128500,1267824,730637,537187,728,65161


In [22]:
features: list = [
    "Geographic Area Name",
    "YEAR",
    # "Total housing units_x",
    # "Total housing units_y",
    # "Total households",
    "Occupied housing units",
    "Vacant housing units",
    "Owner-occupied units",
    "Housing units with a mortgage",
    "Housing units without a mortgage",
    "Homeowner vacancy rate",
    "Rental vacancy rate",
]
display(acs_df[features].head())

Unnamed: 0,Geographic Area Name,YEAR,Occupied housing units,Vacant housing units,Owner-occupied units,Housing units with a mortgage,Housing units without a mortgage,Homeowner vacancy rate,Rental vacancy rate
0,Alabama,2012,1837576,335071,1289324,776946,512378,2.5,9.0
1,Alabama,2013,1838683,339433,1281604,762450,519154,2.5,8.9
2,Alabama,2014,1842174,348464,1274196,751234,522962,2.6,9.0
3,Alabama,2015,1848325,351004,1269145,738618,530527,2.5,9.0
4,Alabama,2016,1851061,358274,1267824,730637,537187,2.4,9.4


In [13]:
[c for c in acs_df.columns.tolist() if c.find("rate") > -1]

['Now married, except separated, Males 15 years and over',
 'Separated, Males 15 years and over',
 'Now married, except separated, Females 15 years and over',
 'Separated, Females 15 years and over',
 'Homeowner vacancy rate',
 'Rental vacancy rate']
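The substring test above also matches "separated", which is why the marital-status columns appear in the result. A minimal sketch of a whole-word match follows; the `columns` list is a small subset copied from the output above, and `rate_pattern` is a name introduced here for illustration.

```python
import re

# A few column names copied from the output above
columns = [
    "Now married, except separated, Males 15 years and over",
    "Separated, Females 15 years and over",
    "Homeowner vacancy rate",
    "Rental vacancy rate",
]

# \b anchors the match to the whole word "rate", so "separated" is skipped
rate_pattern = re.compile(r"\brate\b", flags=re.IGNORECASE)
rate_columns = [c for c in columns if rate_pattern.search(c)]
print(rate_columns)
```

On this subset only the two vacancy-rate columns remain.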

In [14]:
1837576+335071

2172647

In [15]:
1838683+339433

2178116

In [16]:
1842174+348464

2190638

In [17]:
776946+512378

1289324
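The hand additions above can be turned into column-wise checks. The sketch below rebuilds the five Alabama rows shown earlier as a small DataFrame (on the full data the same expressions could be applied to `acs_df`); the variable names `units_ok`, `owners_ok`, and `totals_agree` are introduced here for illustration.

```python
import pandas as pd

# First five Alabama rows, copied from the notebook output above
rows = pd.DataFrame({
    "YEAR": [2012, 2013, 2014, 2015, 2016],
    "Total housing units_x": [2172647, 2190638, 2178116, 2209335, 2199329],
    "Total housing units_y": [2172647, 2178116, 2190638, 2199329, 2209335],
    "Occupied housing units": [1837576, 1838683, 1842174, 1848325, 1851061],
    "Vacant housing units": [335071, 339433, 348464, 351004, 358274],
    "Owner-occupied units": [1289324, 1281604, 1274196, 1269145, 1267824],
    "Housing units with a mortgage": [776946, 762450, 751234, 738618, 730637],
    "Housing units without a mortgage": [512378, 519154, 522962, 530527, 537187],
})

# Occupied + vacant should equal total housing units
units_ok = (
    rows["Occupied housing units"] + rows["Vacant housing units"]
    == rows["Total housing units_y"]
)
# Mortgaged + non-mortgaged should equal owner-occupied units
owners_ok = (
    rows["Housing units with a mortgage"] + rows["Housing units without a mortgage"]
    == rows["Owner-occupied units"]
)
# The two merged "Total housing units" columns should agree
totals_agree = rows["Total housing units_x"] == rows["Total housing units_y"]

print(units_ok.all(), owners_ok.all())
print(rows.loc[~totals_agree, "YEAR"].tolist())
```

On these five rows both identities hold against `Total housing units_y`, while `Total housing units_x` differs from it for 2013 through 2016. That may point to a misaligned merge for the `_x` column, though this is an observation from the displayed rows only and would need confirming on the full data.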

In [23]:
acs_df.YEAR.describe()

count              520.0
mean             2,016.5
std     2.87504712200639
min              2,012.0
25%              2,014.0
50%              2,016.5
75%              2,019.0
max              2,021.0
Name: YEAR, dtype: float64

1. **What is the overall homeownership rate in the area under study?**
   - This question provides an initial understanding of the prevalence of homeownership in the community.

<span style="color:brown">Code was generated by GitHub Copilot</span>

In [21]:
# Owner occupied rate
acs_df["Owner occupied rate"] = acs_df["Owner-occupied units"] / acs_df["Occupied housing units"]

2. **How has the homeownership rate changed over the past decade?**
   - Analyzing trends in homeownership rates can reveal shifts in housing tenure patterns.

<span style="color:brown">Code was generated by GitHub Copilot</span>

In [24]:
# How has the homeownership rate changed over time?
# Note: this is the unweighted mean of the per-geography rates for each year
acs_df.groupby("YEAR")["Owner occupied rate"].mean()

YEAR
2012    0.669935989046527
2013   0.6650096353511181
2014    0.659445496175559
2015    0.655557058689484
2016   0.6530587935124246
2017   0.6550756543431873
2018   0.6554526803778757
2019   0.6571024559104524
2020    0.661275851255221
2021   0.6642988346293597
Name: Owner occupied rate, dtype: float64
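Averaging per-geography rates gives every geography equal weight regardless of size, so the series above is not necessarily the rate for all households combined. The sketch below contrasts the two calculations on toy values for two hypothetical geographies (these numbers are invented for illustration and are not ACS data).

```python
import pandas as pd

# Toy values for two hypothetical geographies of very different size (not ACS data)
toy = pd.DataFrame({
    "Geographic Area Name": ["Small", "Large", "Small", "Large"],
    "YEAR": [2012, 2012, 2013, 2013],
    "Owner-occupied units": [80, 500, 82, 490],
    "Occupied housing units": [100, 1000, 100, 1000],
})
toy["Owner occupied rate"] = toy["Owner-occupied units"] / toy["Occupied housing units"]

# Unweighted: every geography counts equally
unweighted = toy.groupby("YEAR")["Owner occupied rate"].mean()

# Weighted: total owner-occupied units over total occupied units
totals = toy.groupby("YEAR")[["Owner-occupied units", "Occupied housing units"]].sum()
weighted = totals["Owner-occupied units"] / totals["Occupied housing units"]

print(pd.DataFrame({"unweighted": unweighted, "weighted": weighted}))
```

Which version is the "correct" answer depends on how the question is read, so a test case for this question should state which one it expects.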

<span style="color:brown">Code was generated by GitHub Copilot</span>

In [25]:
# By state, how has the homeownership rate changed over time?
# Each (geography, year) pair has a single row, so mean() returns that row's rate unchanged
acs_df.groupby(["Geographic Area Name", "YEAR"])["Owner occupied rate"].mean()

Geographic Area Name  YEAR
Alabama               2012   0.7016439047963187
                      2013   0.6970228146994343
                      2014   0.6916805904328256
                      2015   0.6866460173400241
                      2016   0.6849174608508309
                                    ...        
Wyoming               2017   0.6920521028331675
                      2018   0.6942765468499328
                      2019   0.7039169755889805
                      2020   0.7097255510631091
                      2021   0.7169167537382996
Name: Owner occupied rate, Length: 520, dtype: float64
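To read the change directly, the long series can be reshaped so that each year becomes a column. This is a minimal sketch using only the five Alabama rows displayed earlier; `wide` and `change_pp` are names introduced here, and on the full data the same steps could be applied to `acs_df`.

```python
import pandas as pd

# Alabama rows copied from the ACS output earlier in this notebook
al = pd.DataFrame({
    "Geographic Area Name": ["Alabama"] * 5,
    "YEAR": [2012, 2013, 2014, 2015, 2016],
    "Owner-occupied units": [1289324, 1281604, 1274196, 1269145, 1267824],
    "Occupied housing units": [1837576, 1838683, 1842174, 1848325, 1851061],
})
al["Owner occupied rate"] = al["Owner-occupied units"] / al["Occupied housing units"]

# One row per geography, one column per year
wide = al.pivot(index="Geographic Area Name", columns="YEAR", values="Owner occupied rate")

# Change in percentage points between the first and last year present
first_year, last_year = wide.columns.min(), wide.columns.max()
change_pp = (wide[last_year] - wide[first_year]) * 100
print(change_pp.round(2))
```

For these rows the Alabama rate falls by roughly 1.67 percentage points between 2012 and 2016, which agrees with the first and fifth values in the output above.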

# <span style="color:red">Cannot answer using data</span>

3. **What is the homeownership rate broken down by age groups (e.g., young adults, middle-aged, seniors)?**
   - Examining tenure by age group can highlight generational differences in homeownership.

<span style="color:brown">Code was generated by GitHub Copilot</span>

In [26]:
# By state and year, what is the homeownership rate broken down by age group?
acs_df.groupby(["Geographic Area Name", "YEAR", "Age"])["Owner occupied rate"].mean()

KeyError: 'Age'

<span style="color:green">Modified code was generated by GitHub Copilot</span>

In [31]:
age_features: list = [
    c for c in acs_df.columns.tolist()
    if "years" in c and (c[0].isdigit() or "Under" in c)
]
age_features

['Under 5 years',
 '5 to 9 years',
 '10 to 14 years',
 '15 to 19 years',
 '20 to 24 years',
 '25 to 34 years',
 '35 to 44 years',
 '45 to 54 years',
 '55 to 59 years',
 '60 to 64 years',
 '65 to 74 years',
 '75 to 84 years',
 '85 years and over',
 '18 years and over',
 '21 years and over',
 '65 years and over']
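The last three entries are cumulative groups that overlap the disjoint bins before them, so the list mixes two kinds of columns. The sketch below separates them and checks the bins against the total, using the Alabama 2012 values from the first ACS row displayed in this notebook; `cumulative` and `age_bins` are names introduced here for illustration.

```python
import pandas as pd

# Alabama, 2012, copied from the first ACS row displayed in this notebook
row = pd.Series({
    "Total population": 4777326,
    "Under 5 years": 305091,
    "5 to 9 years": 309360,
    "10 to 14 years": 318484,
    "15 to 19 years": 337159,
    "20 to 24 years": 340808,
    "25 to 34 years": 607797,
    "35 to 44 years": 619112,
    "45 to 54 years": 686672,
    "55 to 59 years": 310336,
    "60 to 64 years": 279202,
    "65 to 74 years": 374441,
    "75 to 84 years": 211411,
    "85 years and over": 77453,
    "18 years and over": 3647097,
    "21 years and over": 3433673,
    "65 years and over": 663305,
})

# Cumulative groups overlap the disjoint bins and must be excluded from sums
cumulative = ["18 years and over", "21 years and over", "65 years and over"]
age_bins = [c for c in row.index if "years" in c and c not in cumulative]

# The disjoint bins should add up to the total population
print(row[age_bins].sum() == row["Total population"])
# 65+ should equal the sum of its three component bins
print(row[["65 to 74 years", "75 to 84 years", "85 years and over"]].sum() == row["65 years and over"])
```

Both checks pass for this row. These columns are population counts by age and are not cross-tabulated with tenure, so they cannot yield a homeownership rate by age group, which is consistent with the "Cannot answer using data" note above.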

4. **What is the homeownership rate broken down by race and ethnicity?**
   - Investigating tenure by race and ethnicity can uncover disparities in access to homeownership opportunities.

5. **Are there notable gender differences in homeownership rates within the community?**
   - Exploring tenure by gender can reveal gender-based disparities in housing.

6. **What percentage of homeowners have mortgages, and what is the average mortgage debt for homeowners in the area?**
   - Understanding mortgage usage and debt levels can provide insights into housing affordability.

7. **What is the median home value for owner-occupied units in the area?**
   - This question helps assess the affordability of homes for potential buyers.

8. **How do homeownership rates vary across different neighborhoods or regions within the community?**
   - Analyzing geographic variations can highlight areas with higher or lower homeownership rates.

9. **What is the percentage of households in the area that receive housing assistance or subsidies (e.g., Section 8 vouchers)?**
   - This question can shed light on the role of government programs in housing tenure.

10. **What are the reasons cited by renters for not owning homes, and how do these reasons differ across demographic groups?**
    - Survey respondents may provide reasons such as affordability constraints, credit issues, or personal preferences for renting.