## Ex1 Use data from Danmarks Statistik - Databanken
1. Go to https://www.dst.dk/da/Statistik/brug-statistikken/muligheder-i-statistikbanken/api#testkonsol
2. Open 'Konsol' and click 'Start Konsol'
3. In the console, at pt. 1 choose 'Retrieve tables', at pt. 2 choose a GET request and JSON format, and at pt. 3 execute:
  1. Check the result.
  2. In the code below, this same GET request is used to get information about all available data tables in 'databanken'.
4. Change pt. 1 in the console to 'Retrieve data', set pt. 2 to 'get request' with Table id: 'FOLK1A', format: CSV, delimiter: semicolon, then click 'Variable and value codes' and choose some subcategories (Hint: hover over the codes to see their meaning). Finally, execute and see what data you get.
5. Using data aggregation and data visualization, answer the following questions:
  1. What is the change in pct of divorced Danes from 2008 to 2020?
  2. Which of the 5 biggest cities has the highest percentage of 'Never Married' in 2020?
  3. Show a bar chart of changes in marital status in Copenhagen from 2008 till now
  4. Show 2 plots in the same figure: 'Married' and 'Never Married' for all ages in DK in 2020 (Hint: x axis is age from 0-125, y axis is how many people in the 2 categories). Add a legend to show the names on the graphs

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook

In [2]:
#%pylab inline 
# %pylab is a magic function in ipython, and triggers the import of various modules within Matplotlib

In [3]:
import requests

# url = 'http://api.worldbank.org/v2/en/country/DNK;URY' 
# response = requests.get(url, params={'downloadformat': 'csv'})
url = 'https://api.statbank.dk/v1/data/FOLK1A/CSV?delimiter=Semicolon&OMR%C3%85DE=000%2C084%2C083%2C085%2C082%2C081&Tid=2008K4%2C2009K4%2C2010K4%2C2011K4%2C2012K4%2C2013K4%2C2014K4%2C2015K4%2C2016K4%2C2017K4%2C2018K4%2C2019K4%2C2020K4&ALDER=IALT&K%C3%98N=TOT&CIVILSTAND=*'
response = requests.get(url)

print(response.headers)

{'Cache-Control': 'no-cache', 'Pragma': 'no-cache', 'Transfer-Encoding': 'chunked', 'Content-Type': 'text/csv; charset=utf-8', 'Expires': '-1', 'Server': 'Microsoft-IIS/10.0', 'StatbankAPI-Request-Id': 'b782b7d4-daef-4652-b3c8-e5f75b78d498', 'Access-Control-Expose-Headers': 'StatbankAPI-Request-Id', 'Content-Disposition': 'attachment; filename=FOLK1A.csv', 'X-AspNet-Version': '4.0.30319', 'X-Powered-By': 'ASP.NET', 'Date': 'Fri, 24 Sep 2021 07:03:27 GMT'}
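
The long query string in the request above can also be assembled from a dictionary, which makes it easier to change the areas or years. The following is a minimal sketch using only the standard library; the names `BASE_URL` and `params` are introduced here for illustration, and the parameter values are the ones from the hand-written URL.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.statbank.dk/v1/data/FOLK1A/CSV'

# Same selection as in the hand-written URL: six area codes, Q4 of 2008-2020,
# all ages, both sexes, every marital status.
params = {
    'delimiter': 'Semicolon',
    'OMRÅDE': ','.join(['000', '084', '083', '085', '082', '081']),
    'Tid': ','.join(f'{year}K4' for year in range(2008, 2021)),
    'ALDER': 'IALT',
    'KØN': 'TOT',
    'CIVILSTAND': '*',
}

# safe='*' keeps the wildcard unescaped; everything else is percent-encoded
query = urlencode(params, safe='*')
url = BASE_URL + '?' + query
print(url)
```

The resulting `url` can be passed to `requests.get` exactly like the hard-coded string.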


In [4]:
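import os

# get the filename from the Content-Disposition header
fname = response.headers['Content-Disposition'].split('=')[1]
fname = 'data/'+'i_got_'+fname
# write the response body (CSV) to the file as bytes
if response.ok:  # status code below 400
    os.makedirs('data', exist_ok=True)  # make sure the target folder exists
    with open(fname, 'wb') as f:
        f.write(response.content)
    print('-----------------')
    print('Downloaded {}'.format(fname))
else:
    print('Download failed with status {}'.format(response.status_code))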

-----------------
Downloaded data/i_got_FOLK1A.csv
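
Splitting the `Content-Disposition` header on `=` works for the value shown above, but it would keep the quotes if the server ever sent `filename="FOLK1A.csv"`. As a hedged alternative, the standard library's email header parser can extract the filename; `filename_from_header` and its `default` argument are hypothetical helper names, not part of the original notebook.

```python
from email.message import Message
from pathlib import Path

def filename_from_header(content_disposition, default='download.csv'):
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg['Content-Disposition'] = content_disposition
    # get_filename() returns None when no filename parameter is present
    return msg.get_filename() or default

header = 'attachment; filename=FOLK1A.csv'  # value taken from the response headers above
target = Path('data') / ('i_got_' + filename_from_header(header))
print(target.as_posix())
```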


In [5]:
%%bash
head ./data/i_got_FOLK1A.csv

﻿OMRÅDE;TID;ALDER;KØN;CIVILSTAND;INDHOLD
Hele landet;2008K4;I alt;I alt;I alt;5505995
Hele landet;2008K4;I alt;I alt;Ugift;2568255
Hele landet;2008K4;I alt;I alt;Gift/separeret;2191033
Hele landet;2008K4;I alt;I alt;Enke/enkemand;314551
Hele landet;2008K4;I alt;I alt;Fraskilt;432156
Hele landet;2009K4;I alt;I alt;I alt;5532531
Hele landet;2009K4;I alt;I alt;Ugift;2588198
Hele landet;2009K4;I alt;I alt;Gift/separeret;2193779
Hele landet;2009K4;I alt;I alt;Enke/enkemand;311126


In [6]:
data_file = './data/i_got_FOLK1A.csv'
df = pd.read_csv(data_file,sep=';', encoding='utf-8')
df['TID'] = df['TID'].map(lambda x:x[:-2]) #cut the last 2 characters
df.to_csv('./data/i_got_FOLK1A_cleaned.csv',header=True, index=False)

In [7]:
%%bash
head ./data/i_got_FOLK1A_cleaned.csv

OMRÅDE,TID,ALDER,KØN,CIVILSTAND,INDHOLD
Hele landet,2008,I alt,I alt,I alt,5505995
Hele landet,2008,I alt,I alt,Ugift,2568255
Hele landet,2008,I alt,I alt,Gift/separeret,2191033
Hele landet,2008,I alt,I alt,Enke/enkemand,314551
Hele landet,2008,I alt,I alt,Fraskilt,432156
Hele landet,2009,I alt,I alt,I alt,5532531
Hele landet,2009,I alt,I alt,Ugift,2588198
Hele landet,2009,I alt,I alt,Gift/separeret,2193779
Hele landet,2009,I alt,I alt,Enke/enkemand,311126


In [8]:
data = pd.read_csv('./data/i_got_FOLK1A_cleaned.csv')

In [9]:
data

Unnamed: 0,OMRÅDE,TID,ALDER,KØN,CIVILSTAND,INDHOLD
0,Hele landet,2008,I alt,I alt,I alt,5505995
1,Hele landet,2008,I alt,I alt,Ugift,2568255
2,Hele landet,2008,I alt,I alt,Gift/separeret,2191033
3,Hele landet,2008,I alt,I alt,Enke/enkemand,314551
4,Hele landet,2008,I alt,I alt,Fraskilt,432156
...,...,...,...,...,...,...
385,Region Nordjylland,2020,I alt,I alt,I alt,590739
386,Region Nordjylland,2020,I alt,I alt,Ugift,281410
387,Region Nordjylland,2020,I alt,I alt,Gift/separeret,223755
388,Region Nordjylland,2020,I alt,I alt,Enke/enkemand,33279


### What is the change in pct of divorced Danes from 2008 to 2020?

In [10]:
def pct_of_divorced_danes(year_start, year_end, data, area='Hele landet'):
    # Percentage of divorced ('Fraskilt') people per year for one area.
    # Filtering on OMRÅDE avoids relying on the whole-country rows coming first.
    divorced_danes_pct_dict = {}
    in_area = data[data['OMRÅDE'] == area]
    
    for year in range(year_start, year_end + 1):
        print(year)
        in_year = in_area[in_area['TID'] == year]
        total_population = in_year.loc[in_year['CIVILSTAND'] == 'I alt', 'INDHOLD'].iloc[0]
        print("Total population: {}".format(total_population))
        
        divorced_danes = in_year.loc[in_year['CIVILSTAND'] == 'Fraskilt', 'INDHOLD'].iloc[0]
        print("Total divorced danes: {}".format(divorced_danes))
        
        divorced_danes_pct = (divorced_danes / total_population) * 100
        print("Pct divorced danes: {:0.2f}%".format(divorced_danes_pct))
        
        divorced_danes_pct_dict[str(year)] = divorced_danes_pct
        
    return divorced_danes_pct_dict
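
The per-year loop above can also be written as a pivot, which computes all years at once. This is only a sketch: the small `sample` frame below stands in for the full `data` frame and contains four whole-country rows copied from the outputs in this notebook.

```python
import pandas as pd

# Four whole-country rows taken from the FOLK1A output in this notebook
sample = pd.DataFrame({
    'OMRÅDE': ['Hele landet'] * 4,
    'TID': [2008, 2008, 2009, 2009],
    'CIVILSTAND': ['I alt', 'Fraskilt', 'I alt', 'Fraskilt'],
    'INDHOLD': [5505995, 432156, 5532531, 439428],
})

# Select the area explicitly, then turn each marital status into its own column
national = sample[sample['OMRÅDE'] == 'Hele landet']
wide = national.pivot(index='TID', columns='CIVILSTAND', values='INDHOLD')

# Column-wise division gives the percentage for every year in one step
divorced_pct = wide['Fraskilt'] / wide['I alt'] * 100
print(divorced_pct.round(2))
```

Selecting the area before pivoting matters because the downloaded file also contains the five regions, so year and status alone do not identify a single row.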

In [11]:
divorced_danes_pct_dict = pct_of_divorced_danes(2008, 2020, data)

2008
Total population: 5505995
Total divorced danes: 432156
Pct divorced danes: 7.85%
2009
Total population: 5532531
Total divorced danes: 439428
Pct divorced danes: 7.94%
2010
Total population: 5557709
Total divorced danes: 447258
Pct divorced danes: 8.05%
2011
Total population: 5579204
Total divorced danes: 455846
Pct divorced danes: 8.17%
2012
Total population: 5599665
Total divorced danes: 466356
Pct divorced danes: 8.33%
2013
Total population: 5623501
Total divorced danes: 477056
Pct divorced danes: 8.48%
2014
Total population: 5655750
Total divorced danes: 500795
Pct divorced danes: 8.85%
2015
Total population: 5699220
Total divorced danes: 513060
Pct divorced danes: 9.00%
2016
Total population: 5745526
Total divorced danes: 526340
Pct divorced danes: 9.16%
2017
Total population: 5778570
Total divorced danes: 535372
Pct divorced danes: 9.26%
2018
Total population: 5806015
Total divorced danes: 542280
Pct divorced danes: 9.34%
2019
Total population: 5827463
Total divorced danes: 5

In [12]:
print(divorced_danes_pct_dict)

{'2008': 7.848826597190881, '2009': 7.942621559644221, '2010': 8.047524618507374, '2011': 8.170448687662255, '2012': 8.328283924127604, '2013': 8.483256249087534, '2014': 8.854616982716704, '2015': 9.002284523145272, '2016': 9.16086708162142, '2017': 9.264783501800618, '2018': 9.33996898044528, '2019': 9.332465946158731, '2020': 9.463745797866208}
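
The dictionary holds one percentage per year, but the question asks for the change between the end points. A minimal sketch of that last step, using the 2008 and 2020 values printed above (the variable names are illustrative):

```python
# Values copied from the printed dictionary above
divorced_pct = {2008: 7.848826597190881, 2020: 9.463745797866208}

# Absolute change, in percentage points
pp_change = divorced_pct[2020] - divorced_pct[2008]

# Relative change, as a percentage of the 2008 share
relative_change = pp_change / divorced_pct[2008] * 100

print(f"{pp_change:.2f} percentage points")
print(f"{relative_change:.1f}% relative increase")
```

The two numbers answer slightly different questions: the share of divorced people rose by about 1.61 percentage points, which is roughly a 20.6% increase relative to the 2008 share.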


In [13]:
plt.figure()
x_values = list(divorced_danes_pct_dict.keys())
y1 = list(divorced_danes_pct_dict.values())
df=pd.DataFrame({'x_values': x_values,'Divorced Danes %' : y1, })
 
# single line plot of the yearly percentages
plt.plot( 'x_values', 'Divorced Danes %', data=df, marker='', markerfacecolor='blue', color='skyblue', linewidth=4)

plt.title('Divorced Danes in % 2008 to 2020', fontsize=16)

# show legend
plt.legend()

# show graph
plt.show()


### Which of the 5 biggest cities has the highest percentage of 'Never Married' in 2020?

In [14]:
import csv

In [15]:
def get_areas(path):
    # Collect the distinct area names (first column) in order of appearance.
    area_list = []

    # The cleaned file is comma-separated (it was written by DataFrame.to_csv)
    with open(path, encoding='utf-8', newline='') as f:
        reader = csv.reader(f, delimiter=',')
        next(reader)  # skip the header row
        for row in reader:
            if row[0] not in area_list:
                print(row[0])
                area_list.append(row[0])

    return area_list

    

In [16]:
city_list = get_areas('./data/i_got_FOLK1A_cleaned.csv')

Hele landet
Region Hovedstaden
Region Syddanmark
Region Sjælland
Region Midtjylland
Region Nordjylland


In [17]:
city_list

['Hele landet',
 'Region Hovedstaden',
 'Region Syddanmark',
 'Region Sjælland',
 'Region Midtjylland',
 'Region Nordjylland']

In [18]:
def never_married_pct(year, city_list, data):
    
    never_married_pct_dict = {}
    
    for city in city_list:
        print(city)
        total_population = data.loc[(data['TID'] == year) & (data['OMRÅDE'] == city) & (data['CIVILSTAND'] == 'I alt')].iloc[0][4:]
        print("Total population: {}".format(total_population['INDHOLD']))
        
        never_married = data.loc[(data['TID'] == year) & (data['OMRÅDE'] == city) & (data['CIVILSTAND'] == 'Ugift')].iloc[0][4:]
        print("Total never married danes: {}".format(never_married['INDHOLD']))
        
        never_married_pct = (never_married['INDHOLD']/total_population['INDHOLD']) * 100
        print("Pct never married danes: {:0.2f}% in year: {}".format(never_married_pct, year))
        
        never_married_pct_dict[city] = never_married_pct
        
    return never_married_pct_dict

In [19]:
never_married_pct_dict = never_married_pct(2020, city_list, data)

Hele landet
Total population: 5837213
Total never married danes: 2859116
Pct never married danes: 48.98% in year: 2020
Region Hovedstaden
Total population: 1854296
Total never married danes: 981652
Pct never married danes: 52.94% in year: 2020
Region Syddanmark
Total population: 1223183
Total never married danes: 568943
Pct never married danes: 46.51% in year: 2020
Region Sjælland
Total population: 838129
Total never married danes: 369548
Pct never married danes: 44.09% in year: 2020
Region Midtjylland
Total population: 1330866
Total never married danes: 657563
Pct never married danes: 49.41% in year: 2020
Region Nordjylland
Total population: 590739
Total never married danes: 281410
Pct never married danes: 47.64% in year: 2020


In [20]:
never_married_pct_dict

{'Hele landet': 48.98084068544355,
 'Region Hovedstaden': 52.93933654605306,
 'Region Syddanmark': 46.513318121654734,
 'Region Sjælland': 44.092019247633715,
 'Region Midtjylland': 49.40865571740506,
 'Region Nordjylland': 47.636942880019774}
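
To read the answer off the dictionary programmatically, the whole-country row has to be left out before taking the maximum. A short sketch with the values printed above (note that the request returned regions, not cities, so the result is a region):

```python
# Values copied from the printed dictionary above
never_married_pct_2020 = {
    'Hele landet': 48.98084068544355,
    'Region Hovedstaden': 52.93933654605306,
    'Region Syddanmark': 46.513318121654734,
    'Region Sjælland': 44.092019247633715,
    'Region Midtjylland': 49.40865571740506,
    'Region Nordjylland': 47.636942880019774,
}

# 'Hele landet' is the national total, not an area to compare
regions_only = {area: pct for area, pct in never_married_pct_2020.items()
                if area != 'Hele landet'}

# max() over the keys, ranked by their percentage
top_area = max(regions_only, key=regions_only.get)
print(top_area, round(regions_only[top_area], 2))
```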

In [21]:
plt.figure()
plt.bar(never_married_pct_dict.keys(), never_married_pct_dict.values(), width=0.5, linewidth=0, align='center')
plt.title('Pct never married danes in 2020', fontsize=12)
plt.tick_params(axis='both', which='major', labelsize=10)
plt.xticks(rotation=45, horizontalalignment='right',fontweight='light')


([0, 1, 2, 3, 4, 5], <a list of 6 Text major ticklabel objects>)

### Show a bar chart of changes in marital status in Copenhagen from 2008 till now

In [22]:
def change_in_marrital_status(year_start, year_end, data, area):
    
    change_in_marrital_status_dict = {}
    
    for year in range(year_start, year_end + 1):
        print(year)
        change_in_marrital_status = data.loc[(data['TID'] == year) & (data['OMRÅDE'] == area) & (data['CIVILSTAND'] == 'Gift/separeret')].iloc[0][4:]
        print("Total with marrital status: {} in year: {} in area: {}".format(change_in_marrital_status['INDHOLD'], year, area))
        
        change_in_marrital_status_dict[str(year)] = change_in_marrital_status['INDHOLD']
        
    return change_in_marrital_status_dict

In [23]:
change_in_marrital_status_dict = change_in_marrital_status(2008, 2020, data, 'Region Hovedstaden')

2008
Total with marrital status: 602049 in year: 2008 in area: Region Hovedstaden
2009
Total with marrital status: 606165 in year: 2009 in area: Region Hovedstaden
2010
Total with marrital status: 609109 in year: 2010 in area: Region Hovedstaden
2011
Total with marrital status: 609487 in year: 2011 in area: Region Hovedstaden
2012
Total with marrital status: 609041 in year: 2012 in area: Region Hovedstaden
2013
Total with marrital status: 610228 in year: 2013 in area: Region Hovedstaden
2014
Total with marrital status: 607557 in year: 2014 in area: Region Hovedstaden
2015
Total with marrital status: 609331 in year: 2015 in area: Region Hovedstaden
2016
Total with marrital status: 611311 in year: 2016 in area: Region Hovedstaden
2017
Total with marrital status: 614079 in year: 2017 in area: Region Hovedstaden
2018
Total with marrital status: 617464 in year: 2018 in area: Region Hovedstaden
2019
Total with marrital status: 620854 in year: 2019 in area: Region Hovedstaden
2020
Total with 

In [24]:
change_in_marrital_status_dict

{'2008': 602049,
 '2009': 606165,
 '2010': 609109,
 '2011': 609487,
 '2012': 609041,
 '2013': 610228,
 '2014': 607557,
 '2015': 609331,
 '2016': 611311,
 '2017': 614079,
 '2018': 617464,
 '2019': 620854,
 '2020': 619225}
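
The dictionary above holds yearly totals for 'Gift/separeret'. If "changes" is read as the year-over-year difference, it can be derived from those totals as sketched below; the names `married_by_year` and `yearly_change` are illustrative.

```python
# Yearly totals copied from the printed dictionary above
married_by_year = {
    2008: 602049, 2009: 606165, 2010: 609109, 2011: 609487, 2012: 609041,
    2013: 610228, 2014: 607557, 2015: 609331, 2016: 611311, 2017: 614079,
    2018: 617464, 2019: 620854, 2020: 619225,
}

years = sorted(married_by_year)

# Pair each year with the previous one and subtract the totals
yearly_change = {
    year: married_by_year[year] - married_by_year[prev]
    for prev, year in zip(years, years[1:])
}

for year, change in yearly_change.items():
    print(year, change)
```

Because the differences are small compared to the totals, passing `yearly_change.keys()` and `yearly_change.values()` to `plt.bar` should show the changes more clearly than plotting the totals themselves.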

In [25]:
plt.figure()
x_values = list(change_in_marrital_status_dict.keys())
y1 = list(change_in_marrital_status_dict.values())
df=pd.DataFrame({'x_values': x_values,'Married/separated' : y1, })
 
# line plot of the yearly number of married/separated people
plt.plot( 'x_values', 'Married/separated', data=df, marker='', markerfacecolor='blue', color='skyblue', linewidth=4)

plt.title('Married/separated in Region Hovedstaden from 2008 to 2020', fontsize=16)

# show legend
plt.legend()

# show graph
plt.show()


### Show 2 plots in the same figure: 'Married' and 'Never Married' for all ages in DK in 2020 (Hint: x axis is age from 0-125, y axis is how many people in the 2 categories). Add a legend to show the names on the graphs

In [26]:
# url = 'http://api.worldbank.org/v2/en/country/DNK;URY' 
# response = requests.get(url, params={'downloadformat': 'csv'})
url = 'https://api.statbank.dk/v1/data/FOLK1A/CSV?delimiter=Semicolon&OMR%C3%85DE=000&CIVILSTAND=U%2CG&ALDER=*&Tid=2020K4'
response = requests.get(url)

print(response.headers)

{'Cache-Control': 'no-cache', 'Pragma': 'no-cache', 'Transfer-Encoding': 'chunked', 'Content-Type': 'text/csv; charset=utf-8', 'Expires': '-1', 'Server': 'Microsoft-IIS/10.0', 'StatbankAPI-Request-Id': '5d5953e1-09ab-4bb1-adfe-d76f8dd3551c', 'Access-Control-Expose-Headers': 'StatbankAPI-Request-Id', 'Content-Disposition': 'attachment; filename=FOLK1A.csv', 'X-AspNet-Version': '4.0.30319', 'X-Powered-By': 'ASP.NET', 'Date': 'Fri, 24 Sep 2021 07:03:29 GMT'}


In [27]:
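import os

# get the filename from the Content-Disposition header
fname = response.headers['Content-Disposition'].split('=')[1]
fname = 'data/'+'dk_stat_'+fname
# write the response body (CSV) to the file as bytes
if response.ok:  # status code below 400
    os.makedirs('data', exist_ok=True)  # make sure the target folder exists
    with open(fname, 'wb') as f:
        f.write(response.content)
    print('-----------------')
    print('Downloaded {}'.format(fname))
else:
    print('Download failed with status {}'.format(response.status_code))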

-----------------
Downloaded data/dk_stat_FOLK1A.csv


In [28]:
data_file = './data/dk_stat_FOLK1A.csv'
df = pd.read_csv(data_file,sep=';', encoding='utf-8')
df['TID'] = df['TID'].map(lambda x:x[:-2]) #cut the last 2 characters
df['ALDER'] = df['ALDER'].map(lambda x:x[:-3]) #cut the last 3 characters (the age unit suffix); note that 'I alt' becomes 'I '
df.to_csv('./data/dk_stat_FOLK1A_cleaned.csv',header=True, index=False)

In [29]:
%%bash
head ./data/dk_stat_FOLK1A_cleaned.csv

OMRÅDE,CIVILSTAND,ALDER,TID,INDHOLD
Hele landet,Ugift,I ,2020,2859116
Hele landet,Ugift,0,2020,61381
Hele landet,Ugift,1,2020,61650
Hele landet,Ugift,2,2020,62532
Hele landet,Ugift,3,2020,61928
Hele landet,Ugift,4,2020,62597
Hele landet,Ugift,5,2020,59132
Hele landet,Ugift,6,2020,58528
Hele landet,Ugift,7,2020,59354
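
Cutting a fixed number of characters also mangles the total row, which is why 'I alt' shows up as 'I ' above. A hedged alternative is to parse the age with a regular expression. The sketch assumes the raw labels look like `'7 år'` (inferred from the three characters that are cut); `parse_age` is a hypothetical helper.

```python
import re

def parse_age(label):
    """Return the age as an int, or None for non-age labels such as 'I alt'."""
    match = re.fullmatch(r'(\d+) år', label.strip())
    return int(match.group(1)) if match else None

labels = ['I alt', '0 år', '7 år', '125 år']  # assumed raw label format
ages = [parse_age(label) for label in labels]
print(ages)
```

Applied with `df['ALDER'].map(parse_age)`, the total row would become a missing value that can be dropped or handled separately.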


In [30]:
dk_stat_data = pd.read_csv('./data/dk_stat_FOLK1A_cleaned.csv')

In [31]:
dk_stat_data

Unnamed: 0,OMRÅDE,CIVILSTAND,ALDER,TID,INDHOLD
0,Hele landet,Ugift,I,2020,2859116
1,Hele landet,Ugift,0,2020,61381
2,Hele landet,Ugift,1,2020,61650
3,Hele landet,Ugift,2,2020,62532
4,Hele landet,Ugift,3,2020,61928
...,...,...,...,...,...
249,Hele landet,Gift/separeret,121,2020,0
250,Hele landet,Gift/separeret,122,2020,0
251,Hele landet,Gift/separeret,123,2020,0
252,Hele landet,Gift/separeret,124,2020,0


In [41]:
def married_never_married(age_start, age_end, data):
    # Collect the number of married and never-married people for each age.
    # Keys are str in the married dict and int in the never-married dict;
    # only the values are used for plotting.
    total_married_dict = {}
    total_never_married_dict = {}
    
    for age in range(age_start, age_end + 1):
        print(age)
        # ALDER is compared as text, so the age is converted with str(age)
        total_married = data.loc[(data['ALDER'] == str(age)) & (data['CIVILSTAND'] == 'Gift/separeret')].iloc[0][4:]
        print("Total with marrital status: {} with age: {}".format(total_married['INDHOLD'], age))
        
        total_married_dict[str(age)] = total_married['INDHOLD']
        
        total_never_married = data.loc[(data['ALDER'] == str(age)) & (data['CIVILSTAND'] == 'Ugift')].iloc[0][4:]
        print("Total with never marrital status: {} with age: {}".format(total_never_married['INDHOLD'], age))
        
        total_never_married_dict[int(age)] = total_never_married['INDHOLD']
        
    return (total_married_dict, total_never_married_dict)

In [42]:
total_married_dict, total_never_married_dict = married_never_married(0,125, dk_stat_data)

0
Total with marrital status: 0 with age: 0
Total with never marrital status: 61381 with age: 0
1
Total with marrital status: 0 with age: 1
Total with never marrital status: 61650 with age: 1
2
Total with marrital status: 0 with age: 2
Total with never marrital status: 62532 with age: 2
3
Total with marrital status: 0 with age: 3
Total with never marrital status: 61928 with age: 3
4
Total with marrital status: 0 with age: 4
Total with never marrital status: 62597 with age: 4
5
Total with marrital status: 0 with age: 5
Total with never marrital status: 59132 with age: 5
6
Total with marrital status: 0 with age: 6
Total with never marrital status: 58528 with age: 6
7
Total with marrital status: 0 with age: 7
Total with never marrital status: 59354 with age: 7
8
Total with marrital status: 0 with age: 8
Total with never marrital status: 60586 with age: 8
9
Total with marrital status: 0 with age: 9
Total with never marrital status: 63442 with age: 9
10
Total with marrital status: 0 with ag

In [43]:
total_married_dict

{'0': 0,
 '1': 0,
 '2': 0,
 '3': 0,
 '4': 0,
 '5': 0,
 '6': 0,
 '7': 0,
 '8': 0,
 '9': 0,
 '10': 0,
 '11': 0,
 '12': 0,
 '13': 0,
 '14': 0,
 '15': 0,
 '16': 0,
 '17': 0,
 '18': 13,
 '19': 74,
 '20': 231,
 '21': 575,
 '22': 1104,
 '23': 1971,
 '24': 3323,
 '25': 5390,
 '26': 7782,
 '27': 11080,
 '28': 14725,
 '29': 18106,
 '30': 22054,
 '31': 24633,
 '32': 27145,
 '33': 28876,
 '34': 30431,
 '35': 31269,
 '36': 31867,
 '37': 32675,
 '38': 33934,
 '39': 35034,
 '40': 37745,
 '41': 38791,
 '42': 40136,
 '43': 39917,
 '44': 42480,
 '45': 44804,
 '46': 44188,
 '47': 44168,
 '48': 46608,
 '49': 45038,
 '50': 43636,
 '51': 43729,
 '52': 45571,
 '53': 49121,
 '54': 51013,
 '55': 49791,
 '56': 48603,
 '57': 47007,
 '58': 45561,
 '59': 44281,
 '60': 44092,
 '61': 43280,
 '62': 42965,
 '63': 42660,
 '64': 42952,
 '65': 41960,
 '66': 40955,
 '67': 42051,
 '68': 39872,
 '69': 39285,
 '70': 40086,
 '71': 39555,
 '72': 40784,
 '73': 42657,
 '74': 42096,
 '75': 38275,
 '76': 34132,
 '77': 29610,
 '78'

In [44]:
total_never_married_dict

{0: 61381,
 1: 61650,
 2: 62532,
 3: 61928,
 4: 62597,
 5: 59132,
 6: 58528,
 7: 59354,
 8: 60586,
 9: 63442,
 10: 66642,
 11: 66984,
 12: 69063,
 13: 67463,
 14: 68501,
 15: 68110,
 16: 68059,
 17: 67736,
 18: 67886,
 19: 70314,
 20: 71873,
 21: 72919,
 22: 73281,
 23: 76513,
 24: 74894,
 25: 76716,
 26: 72077,
 27: 68725,
 28: 64341,
 29: 58801,
 30: 53898,
 31: 48657,
 32: 42918,
 33: 38418,
 34: 35288,
 35: 32035,
 36: 28768,
 37: 26542,
 38: 25126,
 39: 23939,
 40: 23638,
 41: 22435,
 42: 21848,
 43: 20471,
 44: 20984,
 45: 21246,
 46: 19926,
 47: 19284,
 48: 19343,
 49: 18116,
 50: 16711,
 51: 16501,
 52: 16584,
 53: 17551,
 54: 17493,
 55: 16433,
 56: 15761,
 57: 14757,
 58: 13577,
 59: 12616,
 60: 11951,
 61: 11161,
 62: 10690,
 63: 10048,
 64: 9495,
 65: 8689,
 66: 8074,
 67: 7504,
 68: 6670,
 69: 5928,
 70: 5457,
 71: 4815,
 72: 4407,
 73: 4290,
 74: 3913,
 75: 3277,
 76: 2859,
 77: 2361,
 78: 1973,
 79: 1652,
 80: 1509,
 81: 1382,
 82: 1216,
 83: 1062,
 84: 942,
 85: 813,
 8

In [76]:
plt.figure()
x_values = list(total_never_married_dict.keys())
y1 = list(total_never_married_dict.values())
y2 = list(total_married_dict.values())

plt.bar(x_values, y1, width=0.5, linewidth=0, align='center', color = 'blue', alpha = 0.9, label="Never Married")
plt.bar(x_values, y2, width=0.5, linewidth=0, align='center', color = 'red', alpha = 0.5, label="Married")
plt.title('Marital status by age in 2020', fontsize=12)
plt.tick_params(axis='both', which='major', labelsize=10)
plt.xticks(rotation=45, horizontalalignment='right',fontweight='light')
plt.legend()
plt.show()



In [71]:
plt.figure(figsize = (10,10))
x_values = list(total_never_married_dict.keys())
y1 = list(total_never_married_dict.values())
y2 = list(total_married_dict.values())
df=pd.DataFrame({'x_values': x_values,'Never Married' : y1, 'Married' : y2 })
 
# multiple line plots
plt.plot( 'x_values', 'Never Married', data=df, marker='', markerfacecolor='blue', color='skyblue', linewidth=4)
plt.plot( 'x_values', 'Married', data=df, marker='', color='red', linewidth=4)

plt.title('Marital status by age in 2020', fontsize=16)


# show legend
plt.legend()

# show graph
plt.show()
