# Answer Key for Breakout sessions

# Lecture 4


In breakout groups you will work in teams to practice APIs

Let's stick to APIs that don't ask for authentication (it can be time consuming to figure this out because of documentation)

[Open Notify](http://open-notify.org/) gives access to data about the international space station.

As a group:
1. Look at the documentation 
2. Make a GET Request and find out:
    1. How many people are in space
    2. Where the ISS right now 
    3. The pass time over Pace 


In [43]:
import requests
#A. How many people are in space
url = 'http://api.open-notify.org/astros.json' #the specific url for this question
response = requests.get(url) #request the get
response.json() #show json information in get request

{'number': 3,
 'people': [{'craft': 'ISS', 'name': 'Chris Cassidy'},
  {'craft': 'ISS', 'name': 'Anatoly Ivanishin'},
  {'craft': 'ISS', 'name': 'Ivan Vagner'}],
 'message': 'success'}

In [85]:
#B. Where the ISS right now 
url = 'http://api.open-notify.org/iss-now.json'
response = requests.get(url)
print(response.json())

#fancier way to do this
#create a function to return the response in json
def get_json(url):  #define a function called get_json with argument url
    response = requests.get(url) #name the get request response
    response_json = response.json() #save the json response in resonse_json
    return response_json #return response_json 

get_json('http://api.open-notify.org/iss-now.json')


{'iss_position': {'longitude': '-60.4220', 'latitude': '23.8940'}, 'message': 'success', 'timestamp': 1601518568}


{'iss_position': {'longitude': '-60.4220', 'latitude': '23.8940'},
 'message': 'success',
 'timestamp': 1601518568}

In [68]:
#C. The pass time over Pace 
# 1. First, google what's the lat/lon of pace 
# 40.7111° N, 74.0048° W

In [86]:
url = 'http://api.open-notify.org/iss-pass.json?lat=40.7111&lon=74.0048'
response = requests.get(url)
response.json()

# more simply
get_json('http://api.open-notify.org/iss-pass.json?lat=40.7111&lon=74.0048')

{'message': 'success',
 'request': {'altitude': 100,
  'datetime': 1601517496,
  'latitude': 40.7111,
  'longitude': 74.0048,
  'passes': 5},
 'response': [{'duration': 555, 'risetime': 1601538975},
  {'duration': 654, 'risetime': 1601544712},
  {'duration': 594, 'risetime': 1601550581},
  {'duration': 564, 'risetime': 1601556467},
  {'duration': 628, 'risetime': 1601562296}]}

In [58]:
#notice the time stamp. The documentation says it is in unix code.
# let's translate this to human readable format
import time

time.strftime("%Y %m %d %H:%M:%S", time.gmtime(1601562296))

'2020 10 01 14:24:56'

In [91]:
# Bonus: Turn the output of duration into a dataframe
duration = get_json('http://api.open-notify.org/iss-pass.json?lat=40.7111&lon=74.0048')
#get the json of the request
date_passthrough = pd.DataFrame(duration['response'])
#turn the item response into a dataframe


# turn the timestamp into a column in the dataframe
out = []
#create an empty list
for i in date_passthrough['risetime']: #for every item in the column risetime in the dataframe
    a = time.strftime("%Y %m %d %H:%M:%S", time.gmtime(i))
    # use the time package and usr the function to convert unix time into this format: year, month, day, hour:minute: seconds
    out.append(a)
    #append the time I converted into a list
out = pd.DataFrame(out)    
# turn that list into a dataframe

out.merge(date_passthrough, left_index=True, right_index=True).rename(columns={0:'date & time'})
# merge the passthrough data with the converted day/time information into one dataframe based on the index values

# Lecture 6

In [3]:
pwd

'/Users/mkaltenberg/Documents/Data Analysis Python R Lectures/Data_Analysis_Python_R/Assignments'

In [7]:
# get to this point
import pandas as pd

path = '/Users/mkaltenberg/Documents/Data Analysis Python R Lectures/Data_Analysis_Python_R/Lecture_6/ds/'
wjp = pd.read_csv(path+'wjp.csv') 
ineq = pd.read_csv(path+'ineq.csv') 
wjp_ineq = pd.merge(wjp, ineq, left_on=['Country','year'],right_on=['country','year'],how='left')

1. Filter out the dataset to show only data from the region 'Sub-Saharan Africa'
2. How many countries are in the region?
3. Calculate the average gini of the region.
4. What's the maximum population in the region? What's the countries name?
5. Can you do a for loop that can do this calculation for all of the regions?
6. Export the filtered dataset of 1 (only countries that are from the region)

In [9]:
# 1. Filter out the dataset to show only data from the region 'Sub-Saharan Africa'
wjp_ineq[wjp_ineq['Region']=='Sub-Saharan Africa']

Unnamed: 0,Country,Region,Income Group,isocode,year,id,isocode.1,factor1,f1.2 Government powers are effectively limited by the legislature,f1.3 Government powers are effectively limited by the judiciary,...,population,databasesource,sharetop1,sharetop5,source1,sqcoeffvariation,surveysource2,surveyyears,consumptionsurvey,theil
9,Botswana,Sub-Saharan Africa,Upper middle income,BWA,2012,BWA2012,BWA,0.73,0.76,0.79,...,2132822.0,,,,,1.191579,,0.0,,0.783651
12,Burkina Faso,Sub-Saharan Africa,Low income,BFA,2012,BFA2012,BFA,0.43,0.4,0.41,...,16590813.0,,,,,0.959153,,0.0,,0.636224
14,Cameroon,Sub-Saharan Africa,Lower middle income,CMR,2012,CMR2012,CMR,0.31,0.24,0.27,...,21659488.0,,,,,0.980201,,0.0,,0.654881
19,Cote d'Ivoire,Sub-Saharan Africa,Lower middle income,CIV,2012,CIV2012,CIV,0.43,0.46,0.35,...,21102641.0,,,,,0.93217,,0.0,,0.641412
28,Ethiopia,Sub-Saharan Africa,Low income,ETH,2012,ETH2012,ETH,0.36,0.41,0.34,...,92191211.0,,,,,0.852139,,0.0,,0.578986
33,Ghana,Sub-Saharan Africa,Low income,GHA,2012,GHA2012,GHA,0.72,0.84,0.7,...,25544565.0,,,,,0.872084,,0.0,,0.590276
46,Kenya,Sub-Saharan Africa,Low income,KEN,2012,KEN2012,KEN,0.45,0.56,0.39,...,42542978.0,,,,,1.067078,,0.0,,0.701574
49,Liberia,Sub-Saharan Africa,Low income,LBR,2012,LBR2012,LBR,0.53,0.73,0.49,...,4190155.0,,,,,0.851184,,0.0,,0.590845
51,Madagascar,Sub-Saharan Africa,Low income,MDG,2012,MDG2012,MDG,0.45,0.5,0.41,...,22293720.0,,,,,0.943059,,0.0,,0.63321
52,Malawi,Sub-Saharan Africa,Low income,MWI,2012,MWI2012,MWI,0.49,0.53,0.51,...,15700436.0,,,,,1.026015,,0.0,,0.675875


In [36]:
# 2. How many countries are in the region?
sa = wjp_ineq[wjp_ineq['Region']=='Sub-Saharan Africa']
sa['Country'].nunique()

18

In [11]:
# 3. Calculate the average gini of the region.
sa['gini'].mean()

0.5840616764705883

In [13]:
# 4. What's the maximum population in the region? What's the countries name?
print(sa['population'].max())
sa[sa['population']==sa['population'].max()]

168000000.0


Unnamed: 0,Country,Region,Income Group,isocode,year,id,isocode.1,factor1,f1.2 Government powers are effectively limited by the legislature,f1.3 Government powers are effectively limited by the judiciary,...,population,databasesource,sharetop1,sharetop5,source1,sqcoeffvariation,surveysource2,surveyyears,consumptionsurvey,theil
62,Nigeria,Sub-Saharan Africa,Lower middle income,NGA,2012,NGA2012,NGA,0.45,0.62,0.49,...,168000000.0,,,,,0.525289,,0.0,,0.396058


In [40]:
# 5. Can you do a for loop that can do this calculation for all of the regions?
region = wjp_ineq['Region'].unique()
out = []

for i in region:
    reg_pop = wjp_ineq[wjp_ineq['Region']==str(i)]['population'].max()
    reg_pop = [reg_pop, str(i)]
    out.append(reg_pop)
out    


[[74099255.0, 'Eastern Europe & Central Asia'],
 [202000000.0, 'Latin America & Caribbean'],
 [1350000000.0, 'East Asia & Pacific'],
 [314000000.0, 'Western Europe & North America'],
 [1260000000.0, 'South Asia'],
 [168000000.0, 'Sub-Saharan Africa'],
 [85660902.0, 'Middle East & North Africa']]

In [27]:
# 6. Export the filtered dataset of 1 (only countries that are from the region 'Sub-Saharan Africa' )

#two ways to do this

wjp_ineq[wjp_ineq['Region']=='Sub-Saharan Africa'].to_csv('ssa_data.csv')

sa.to_csv('ssa_data.csv')

