# Looking at changing family structures

To do this, we need to use data from the census, which the first part of our project was dedicated to retrieving.

## Finding a package

We ended up going with the censusdata package. We found a clear write up about it at  https://towardsdatascience.com/accessing-census-data-with-python-3e2f2b56e20d and found its functions clear and useful. We originally looked at the census package, but this required an API key to access the data that often failed and the functions to get the data were less clear.

In [4]:
import pandas as pd
import censusdata

## Searching for relevant tables

This package contains a search function that allows us to search a data set for relevant tables.

In this search we looked at the American Community Survey 5-year estimates. This is a more in depth but slightly less accurate version of the 10 year census and is conducted on a rolling basis.

In this search we look at this data set for 2019 and look for tables related to marriage.

In [3]:
tables = censusdata.search('acs5', 2019,'concept', 'marriage')
tables

[('B12007A_001E',
  'MEDIAN AGE AT FIRST MARRIAGE (WHITE ALONE)',
  'Estimate!!Median age at first marriage --!!Male'),
 ('B12007A_002E',
  'MEDIAN AGE AT FIRST MARRIAGE (WHITE ALONE)',
  'Estimate!!Median age at first marriage --!!Female'),
 ('B12007B_001E',
  'MEDIAN AGE AT FIRST MARRIAGE (BLACK OR AFRICAN AMERICAN ALONE)',
  'Estimate!!Median age at first marriage --!!Male'),
 ('B12007B_002E',
  'MEDIAN AGE AT FIRST MARRIAGE (BLACK OR AFRICAN AMERICAN ALONE)',
  'Estimate!!Median age at first marriage --!!Female'),
 ('B12007C_001E',
  'MEDIAN AGE AT FIRST MARRIAGE (AMERICAN INDIAN AND ALASKA NATIVE ALONE)',
  'Estimate!!Median age at first marriage --!!Male'),
 ('B12007C_002E',
  'MEDIAN AGE AT FIRST MARRIAGE (AMERICAN INDIAN AND ALASKA NATIVE ALONE)',
  'Estimate!!Median age at first marriage --!!Female'),
 ('B12007D_001E',
  'MEDIAN AGE AT FIRST MARRIAGE (ASIAN ALONE)',
  'Estimate!!Median age at first marriage --!!Male'),
 ('B12007D_002E',
  'MEDIAN AGE AT FIRST MARRIAGE (ASIAN A

This outputs a list of tuples with the code name, table name, and variable.

We then looked at a list of just the table names to be able to more easily look for the table we want.

In [21]:
table_names = list(set([row[1] for row in tables[:len(tables)-1]]))
table_names.sort()
table_names

['HOUSEHOLD TYPE (INCLUDING LIVING ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (AMERICAN INDIAN AND ALASKA NATIVE ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (ASIAN ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (BLACK OR AFRICAN AMERICAN ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (HISPANIC OR LATINO)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (NATIVE HAWAIIAN AND OTHER PACIFIC ISLANDER ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (SOME OTHER RACE ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (TWO OR MORE RACES)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (WHITE ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (WHITE ALONE, NOT HISPANIC OR LATINO)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) BY RELATIONSHIP',
 'HOUSEHOLD TYPE BY HOUSEHOLD SIZE',
 'HOUSEHOLD TYPE BY RELATIVES AND NONRELATIVES FOR POPULATION IN HOUSEHOLDS',
 'HOUSEHOLD TYPE BY RELATIVES AND NONRELATIVES FOR POPULATION IN HOUSEHOLDS (AMERICAN INDIAN AND ALASKA NATIVE ALONE)',
 'HOUSE

In [13]:
set([row[1] for row in tables[:len(tables)] if len(row[1]) < 500])

{'MARRIAGES ENDING IN WIDOWHOOD IN THE LAST YEAR BY SEX BY MARITAL STATUS FOR THE POPULATION 15 YEARS AND OVER',
 'MARRIAGES IN THE LAST YEAR BY SEX BY MARITAL STATUS FOR THE POPULATION 15 YEARS AND OVER',
 'MEDIAN AGE AT FIRST MARRIAGE',
 'MEDIAN AGE AT FIRST MARRIAGE (AMERICAN INDIAN AND ALASKA NATIVE ALONE)',
 'MEDIAN AGE AT FIRST MARRIAGE (ASIAN ALONE)',
 'MEDIAN AGE AT FIRST MARRIAGE (BLACK OR AFRICAN AMERICAN ALONE)',
 'MEDIAN AGE AT FIRST MARRIAGE (HISPANIC OR LATINO)',
 'MEDIAN AGE AT FIRST MARRIAGE (NATIVE HAWAIIAN AND OTHER PACIFIC ISLANDER ALONE)',
 'MEDIAN AGE AT FIRST MARRIAGE (SOME OTHER RACE ALONE)',
 'MEDIAN AGE AT FIRST MARRIAGE (TWO OR MORE RACES)',
 'MEDIAN AGE AT FIRST MARRIAGE (WHITE ALONE)',
 'MEDIAN AGE AT FIRST MARRIAGE (WHITE ALONE, NOT HISPANIC OR LATINO)',
 'MEDIAN DURATION OF CURRENT MARRIAGE IN YEARS BY SEX BY MARITAL STATUS FOR THE MARRIED POPULATION 15 YEARS AND OVER'}

In [26]:
def see_tables(year, keyword):
    tables = censusdata.search('acs5', year,'concept', keyword)
    names = list(set([row[1] for row in tables[:len(tables)] if len(row[1]) < 500]))
    names.sort()
    return names

In [142]:
see_tables(2019, "household type")

['HOUSEHOLD TYPE (INCLUDING LIVING ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (AMERICAN INDIAN AND ALASKA NATIVE ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (ASIAN ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (BLACK OR AFRICAN AMERICAN ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (HISPANIC OR LATINO)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (NATIVE HAWAIIAN AND OTHER PACIFIC ISLANDER ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (SOME OTHER RACE ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (TWO OR MORE RACES)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (WHITE ALONE)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (WHITE ALONE, NOT HISPANIC OR LATINO)',
 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) BY RELATIONSHIP',
 'HOUSEHOLD TYPE BY HOUSEHOLD SIZE',
 'HOUSEHOLD TYPE BY RELATIVES AND NONRELATIVES FOR POPULATION IN HOUSEHOLDS',
 'HOUSEHOLD TYPE BY RELATIVES AND NONRELATIVES FOR POPULATION IN HOUSEHOLDS (AMERICAN INDIAN AND ALASKA NATIVE ALONE)',
 'HOUSE

In [137]:
def get_code(name, full_name, year):
    tables = censusdata.search('acs5', year,'concept', name)
    #full_name = name + " (" + race + ")"
    codes = []
    variables = []
    display = []
    display2 = "<ul>"
    indents = []
    for item in tables:
        if item[1] == full_name:
            codes.append(item[0])
            
            parts = item[2].split('!!')
            indent = len(parts)  - 2
            
            if indent > 0:
                variable = " - ".join(parts[2:])
            else:
                variable = "Grand total"
                
            previous = 0
            if len(indents):
                previous = indents[-1]
            while indent < previous:
                display2 += "</li></ul>"
                previous = previous - 1
            if indent > previous:
                display2 += f"</li><ul><li>{variable}"
            elif indent == previous:
                if len(indents):
                    display2+="</li>"
                display2 += f"<li>{variable}"
                
            
            line = "".join("   "*(indent))
            display.append(line + variable)
            
            indents.append(indent)
            
            
            if variable[0] == "$":
                variable = "\\" + variable
            variables.append(variable)
            
    if len(indents)>2:
        previous = indents[0]
        while indents[-1] > previous:
            display2 += "</li></ul>"
            previous = previous + 1
    
    return codes, variables, display, indents, display2

In [None]:
f"<li>{variable}"

In [143]:
c, v, d, i,d2 = get_code("household type", 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE)', 2012)

In [144]:
d, d2, i

(['Grand total',
  '   Family households',
  '      Family households - Married-couple family',
  '      Family households - Other family',
  '         Family households - Other family - Male householder, no wife present',
  '         Family households - Other family - Female householder, no husband present',
  '   Nonfamily households',
  '      Nonfamily households - Householder living alone',
  '      Nonfamily households - Householder not living alone'],
 '<ul><li>Grand total</li><ul><li>Family households</li><ul><li>Family households - Married-couple family</li><li>Family households - Other family</li><ul><li>Family households - Other family - Male householder, no wife present</li><li>Family households - Other family - Female householder, no husband present</li></ul></li></ul></li><li>Nonfamily households</li><ul><li>Nonfamily households - Householder living alone</li><li>Nonfamily households - Householder not living alone</li></ul></li></ul>',
 [0, 1, 2, 2, 3, 3, 1, 2, 2])

In [145]:
for i in d:
    print(i)

Grand total
   Family households
      Family households - Married-couple family
      Family households - Other family
         Family households - Other family - Male householder, no wife present
         Family households - Other family - Female householder, no husband present
   Nonfamily households
      Nonfamily households - Householder living alone
      Nonfamily households - Householder not living alone


In [112]:
for i in np.arange(len(v)):
    print(v)

NameError: name 'np' is not defined

In [105]:
int('2012')

2012

In [95]:
c,v,d = get_code("household", 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (AMERICAN INDIAN AND ALASKA NATIVE ALONE)')

In [106]:
for i in d:
    print(i)

Grand total
   Imputed
   Not imputed


In [29]:
for v in varis:
    indents = len(v.split('!!'))
    line = "".join("   "*(indents-2))
    if indents > 2:
        parts = v.split("!!")[2:]
        line += " - ".join(parts)
    else:
        line += "Grand total"
    print(line)
    
        

Grand total
   Under $25,000:
      Under $25,000: - With health insurance coverage
         Under $25,000: - With health insurance coverage - With private health insurance
         Under $25,000: - With health insurance coverage - With public coverage
      Under $25,000: - No health insurance coverage
   $25,000 to $49,999:
      $25,000 to $49,999: - With health insurance coverage
         $25,000 to $49,999: - With health insurance coverage - With private health insurance
         $25,000 to $49,999: - With health insurance coverage - With public coverage
      $25,000 to $49,999: - No health insurance coverage
   $50,000 to $74,999:
      $50,000 to $74,999: - With health insurance coverage
         $50,000 to $74,999: - With health insurance coverage - With private health insurance
         $50,000 to $74,999: - With health insurance coverage - With public coverage
      $50,000 to $74,999: - No health insurance coverage
   $75,000 to $99,999:
      $75,000 to $99,999: - With hea

In [22]:
"".join((["n"]*0))

''

In [19]:
varis[1].split('!!')

['Estimate', 'Total:', 'Under $25,000:']

In [48]:
"""
This function downloads the specified tables from the specified years and reformats them

Inputs: codes - a list of the table codes
        names - what to rename the variables for each table code
        years - years to get a table from
        
Output: one dataframe with the requested data compiling the different years
"""
def get_tables(codes, names, years):
    tables = []
    for year in years:
        #Get table
        df = censusdata.download('acs5', year,
                   censusdata.censusgeo([('state', '*')]),
                    codes)
        
        #Rename columns
        name_dict = dict(zip(codes, names))
        name_dict['index'] = 'State'
        df = df.reset_index() #Turns row names into row
        df = df.rename(columns = name_dict)
        
        #Shorten states column to state name
        df = df.astype({'State':'str'})
        df['State'] = df['State'].str.split(':').str.get(0) 
        
        #Add column for year
        df['Year'] = year
        
        tables.append(df)
    return pd.concat(tables)

In [104]:
get_tables(c, v, [2012])

Unnamed: 0,State,Grand total,Imputed,Not imputed,Year
0,Alabama,3844391,156475,3687916,2012
1,Alaska,556204,13495,542709,2012
2,Arizona,5056561,199891,4856670,2012
3,Arkansas,2325562,73387,2252175,2012
4,California,29700084,964986,28735098,2012
5,Colorado,4022530,142081,3880449,2012
6,Delaware,730642,19843,710799,2012
7,District of Columbia,518785,16222,502563,2012
8,Connecticut,2911421,88938,2822483,2012
9,Florida,15597732,677129,14920603,2012


In [97]:
tab = get_tables(c, v, [2019])
tab.head()

Unnamed: 0,State,Grand total,Family households:,Family households: - Married-couple family,Family households: - Other family:,"Family households: - Other family: - Male householder, no spouse present","Family households: - Other family: - Female householder, no spouse present",Nonfamily households:,Nonfamily households: - Householder living alone,Nonfamily households: - Householder not living alone,Year
0,Alabama,9848,6534,5260,1274,306,968,3314,2830,484,2019
1,Alaska,29726,20792,10356,10436,3375,7061,8934,7027,1907,2019
2,Arizona,86934,61088,27986,33102,8898,24204,25846,20664,5182,2019
3,Arkansas,7270,4626,3122,1504,337,1167,2644,2163,481,2019
4,California,96536,68408,39602,28806,9311,19495,28128,21220,6908,2019


In [98]:
tab.to_csv("house.csv")

## Getting data froma specific table

We one we pick the name we needed to find the code

In [29]:
name = 'HOUSEHOLD TYPE (INCLUDING LIVING ALONE) (AMERICAN INDIAN AND ALASKA NATIVE ALONE)'
#name = 'MEDIAN AGE AT FIRST MARRIAGE (HISPANIC OR LATINO)'
for item in tables:
    if item[1] == name:
        print(item[0])
        code = item[0][:6]
code

'B12007'

Then we can use the printtable function to get a look at the structure of the table.

In [94]:
censusdata.printtable(censusdata.censustable('acs5', 2019, "B12007"))

Variable     | Table                          | Label                                                    | Type 
-------------------------------------------------------------------------------------------------------------------
B12007_001E  | MEDIAN AGE AT FIRST MARRIAGE   | !! !! Estimate Median age at first marriage -- Male      | float
B12007_002E  | MEDIAN AGE AT FIRST MARRIAGE   | !! !! Estimate Median age at first marriage -- Female    | float
-------------------------------------------------------------------------------------------------------------------


Then we can download the variables we choose into a pandas data frame by using the codes above.

In [6]:
marriage_2019 = censusdata.download('acs5', 2019,
                   censusdata.censusgeo([('state', '*')]),
                    ['B12007_001E', 'B12007_002E'])
marriage_2019.head()

Unnamed: 0,B12007_001E,B12007_002E
"Alabama: Summary level: 040, state:01",28.5,26.7
"Alaska: Summary level: 040, state:02",29.2,26.4
"Arizona: Summary level: 040, state:04",29.9,27.8
"Arkansas: Summary level: 040, state:05",27.2,25.7
"California: Summary level: 040, state:06",30.8,29.0


In [40]:
df = censusdata.download('acs5', 2019,
                   censusdata.censusgeo([('state', '*')]),
                    ['B12007_001E', 'B12007_002E'])
df = df.reset_index()
df = df.astype({'index':'str'})
print(list(df['index'].str.split(':').str.get(0)))

['Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Delaware', 'District of Columbia', 'Connecticut', 'Florida', 'Georgia', 'Idaho', 'Hawaii', 'Illinois', 'Indiana', 'Iowa', 'Kansas', 'Kentucky', 'Louisiana', 'Maine', 'Maryland', 'Massachusetts', 'Michigan', 'Minnesota', 'Mississippi', 'Missouri', 'Montana', 'Nebraska', 'Nevada', 'New Hampshire', 'New Jersey', 'New Mexico', 'New York', 'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon', 'Pennsylvania', 'Rhode Island', 'South Carolina', 'South Dakota', 'Tennessee', 'Texas', 'Vermont', 'Utah', 'Virginia', 'Washington', 'West Virginia', 'Wisconsin', 'Wyoming', 'Puerto Rico']


## Streamlining this process

To reformat this table and others, we wrote the following function. We can follow the same process to find other tables and variables of interest, and then plug that information into this function to get a nicer table.

In [7]:
"""
This function downloads the specified tables from the specified years and reformats them

Inputs: codes - a list of the table codes
        names - what to rename the variables for each table code
        years - years to get a table from
        
Output: one dataframe with the requested data compiling the different years
"""
def get_tables(codes, names, years):
    tables = []
    for year in years:
        #Get table
        df = censusdata.download('acs5', year,
                   censusdata.censusgeo([('state', '*')]),
                    codes)
        
        #Rename columns
        name_dict = dict(zip(codes, names))
        name_dict['index'] = 'State'
        df = df.reset_index() #Turns row names into row
        df = df.rename(columns = name_dict)
        
        #Shorten states column to state name
        df = df.astype({'State':'str'})
        df['State'] = df['State'].str.split(':').str.get(0) 
        
        #Add column for year
        df['Year'] = year
        
        tables.append(df)
    return pd.concat(tables)

In [8]:
marriage = get_tables(['B12007_001E', 'B12007_002E'], ['Male age', 'Female age'], [2009, 2014, 2019])
marriage.head()

Unnamed: 0,State,Male age,Female age,Year
0,Alaska,27.2,25.2,2009
1,Alabama,26.8,25.3,2009
2,Arkansas,25.8,24.3,2009
3,Arizona,27.8,25.8,2009
4,California,28.8,26.8,2009


In [11]:
household_type = get_tables(['B11001_001E', 'B11001_002E','B11001_003E','B11001_004E','B11001_007E','B11001_008E','B11001_009E'], ['Total', 'Total Family','Married-couple Family', 'Single Householder, no spouse','Total Nonfamily','Nonfamily Living Alone','Nonfamily Not Alone'], [2009, 2014, 2019])
household_type

Unnamed: 0,State,Total,Total Family,Married-couple Family,"Single Householder, no spouse",Total Nonfamily,Nonfamily Living Alone,Nonfamily Not Alone,Year
0,Alaska,234779,159319,118716,40603,75460,57718,17742,2009
1,Alabama,1819441,1236035,894351,341684,583406,508317,75089,2009
2,Arkansas,1109635,754486,563199,191287,355149,305252,49897,2009
3,Arizona,2248170,1492544,1115833,376711,755626,603300,152326,2009
4,California,12187191,8333690,6085094,2248596,3853501,2993951,859550,2009
...,...,...,...,...,...,...,...,...,...
47,Washington,2848396,1841954,1430460,411494,1006442,759370,247072,2019
48,West Virginia,732585,473856,356024,117832,258729,217699,41030,2019
49,Wisconsin,2358156,1482213,1148844,333369,875943,696118,179825,2019
50,Wyoming,230101,148652,119353,29299,81449,64997,16452,2019


In [12]:
divorces = get_tables(['B12503_001E','B12503_003E','B12503_005E','B12503_006E','B12503_008E','B12503_010E','B12503_011E'],['Total','Male Never Married', 'Male Married; Divorced Last Year', 'Male Married; Not Divorced Last Year','Female Never Married','Female Married; Divorced Last Year','Female Married; Not Divorced Last Year'],[2012, 2019])
divorces

Unnamed: 0,State,Total,Male Never Married,Male Married; Divorced Last Year,Male Married; Not Divorced Last Year,Female Never Married,Female Married; Divorced Last Year,Female Married; Not Divorced Last Year,Year
0,Alabama,3844391,584355,22564,1234437,517693,26151,1459191,2012
1,Alaska,556204,105333,3389,181330,74461,3330,188361,2012
2,Arizona,5056561,876661,24077,1595535,711047,27608,1821633,2012
3,Arkansas,2325562,330745,13975,784152,276122,16462,904106,2012
4,California,29700084,5778554,112136,8772035,4819717,129478,10088164,2012
...,...,...,...,...,...,...,...,...,...
47,Washington,6031108,1042361,23783,1934478,826429,26890,2177167,2019
48,West Virginia,1512469,232652,7665,502692,183487,7214,578759,2019
49,Wisconsin,4734360,825525,15745,1497749,695245,15502,1684594,2019
50,Wyoming,466549,72140,2069,163131,53486,2687,173036,2019


## Data visualization

Now we can make charts from the data.

In [None]:
from matplotlib import pyplot as plt
import plotly.io as pio
from plotly import express as px

### 1. Median Age of Marriage in California

In [None]:
mar_cal = marriage[marriage['State'] == "California"]
mar_cal

In [None]:
fig = px.scatter(data_frame = mar_cal, 
                x = "Year", 
                y = ["Male age", "Female age"],
                title = "Median Age of First Marriage (CA)",
                trendline = "ols", # ordinary least squares regression trendline
                width = 800,
                height = 600)

fig.show()

### 2. Frequency of Different Household Types in California

In [None]:
household_cal = household_type[household_type['State']=="California"]
household_cal

In [None]:
mc_percentage = household_cal['Married-couple Family'] / household_cal['Total']
sh_percentage = household_cal['Single Householder, no spouse'] / household_cal['Total']
nla_percentage = household_cal['Nonfamily Living Alone'] / household_cal['Total']
nna_percentage = household_cal['Nonfamily Not Alone'] / household_cal['Total']
year = household_cal['Year']

household_percentage = pd.DataFrame({
    'Married-couple Family': mc_percentage,
    'Single Householder': sh_percentage, 
    'Nonfamily Living Alone': nla_percentage, 
    'Nonfamily Not Alone': nna_percentage,
    'Year': year
})

household_percentage

In [None]:
household_percentage = household_percentage.round(decimals = 4)

In [None]:
fig = px.bar(household_percentage,
             x="Year", 
             y=["Married-couple Family","Single Householder", "Nonfamily Living Alone","Nonfamily Not Alone"],  
             title="Household Types (CA)")
fig.show()

### 3. Divorces in the Last Year in California

In [None]:
divorces_cal = divorces[divorces['State']=='California']
divorces_cal

In [None]:
never_married_total = divorces_cal["Male Never Married"] + divorces_cal["Female Never Married"]
div_last_year = divorces_cal["Male Married; Divorced Last Year"] + divorces_cal["Female Married; Divorced Last Year"]
married_not_div = divorces_cal["Male Married; Not Divorced Last Year"] + divorces_cal["Female Married; Not Divorced Last Year"]
total = divorces_cal["Total"]
year = divorces_cal["Year"]

divorces_cal_totals = pd.DataFrame({
    'Total': total,
    'Never Married Total': never_married_total,
    'Ever Married; Divorced Last Year': div_last_year, 
    'Ever Married; Did Not Divorce Last Year': married_not_div, 
    'Year': year
})


In [None]:
never_married_per = divorces_cal_totals["Never Married Total"] / divorces_cal_totals["Total"]
married_div_per = divorces_cal_totals["Ever Married; Divorced Last Year"] / divorces_cal_totals["Total"]
married_not_div_per = divorces_cal_totals["Ever Married; Did Not Divorce Last Year"] / divorces_cal_totals["Total"]

divorces_cal_percentages = pd.DataFrame({
    'Never Married': never_married_per,
    'Ever Married; Divorced Last Year': married_div_per, 
    'Ever Married; Did Not Divorce Last Year': married_not_div_per, 
    'Year': year
})

divorces_cal_percentages
divorces_cal_percentages = divorces_cal_percentages.round(decimals = 4)
divorces_cal_percentages

In [None]:
fig = px.scatter(divorces_cal_percentages, 
                x = "Year", 
                y = ["Never Married", "Ever Married; Divorced Last Year", "Ever Married; Did Not Divorce Last Year"],
                title = "Divorces in the Past Year in CA",
                trendline = "ols", # ordinary least squares regression trendline
                width = 800,
                height = 600)

fig.show()