The goal of this notebook is to make all required API calls and save the results in CSV/JSON format, so that we can read the responses without re-running the calls and without needing to track API keys.

In [None]:
# Imports (including api_key, if one is needed).
# The World Bank API requires no API key, so there is no need to change .gitignore or import from config.py.

import pandas as pd
import numpy as np
import requests
import json
import time
from pathlib import Path

### Sources for API Calls for Population

uses of open data - API to get the population by country - Open Data Stack Exchange
https://opendata.stackexchange.com/questions/20863/api-to-get-the-population-by-country

Developer Information – World Bank Data Help Desk
https://datahelpdesk.worldbank.org/knowledgebase/topics/125589

Country API Queries – World Bank Data Help Desk
https://datahelpdesk.worldbank.org/knowledgebase/articles/898590-country-api-queries


From the links above:


Requesting Country Data
To list all countries: http://api.worldbank.org/v2/country

The following information will appear, when available, in the response when using this country query through the World Bank API:

3 letter ISO 3166-1 alpha-3 code
2 letter ISO 3166-1 alpha-2 code
Name
Region: ID, name and World Bank 2 letter code
Income Level: ID, name and World Bank 2 letter code
Lending Type: ID, name and World Bank 2 letter code
Capital City
Longitude
Latitude



Sample Request Format: Country Query
For XML format: http://api.worldbank.org/v2/country/br

For JSON format: http://api.worldbank.org/v2/country/br?format=json

Note: “br” is the two-letter ISO code for Brazil.

Sample response for JSON format country query for Brazil:

```
[
  {
    "page": 1,
    "pages": 1,
    "per_page": "50",
    "total": 1
  },
  [
    {
      "id": "BRA",
      "iso2Code": "BR",
      "name": "Brazil",
      "region": {
        "id": "LCN",
        "iso2code": "ZJ",
        "value": "Latin America & Caribbean (all income levels)"
      },
      "adminregion": {
        "id": "LAC",
        "iso2code": "XJ",
        "value": "Latin America & Caribbean (developing only)"
      },
      "incomeLevel": {
        "id": "UMC",
        "iso2code": "XT",
        "value": "Upper middle income"
      },
      "lendingType": {
        "id": "IBD",
        "iso2code": "XF",
        "value": "IBRD"
      },
      "capitalCity": "Brasilia",
      "longitude": "-47.9292",
      "latitude": "-15.7801"
    }
  ]
]
```
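
The sample response above is a two-element list: paging metadata first, then a list of country records. A minimal sketch of flattening one such payload into rows follows. The function name `flatten_country` and the output column names are hypothetical, and the embedded `SAMPLE` is a trimmed copy of the response above so the code runs without a network call. Treating an empty coordinate string as missing is an assumption, not something the sample shows.

```python
import json

# Trimmed copy of the sample country response shown above (offline stand-in).
SAMPLE = json.loads("""
[
  {"page": 1, "pages": 1, "per_page": "50", "total": 1},
  [
    {
      "id": "BRA",
      "iso2Code": "BR",
      "name": "Brazil",
      "region": {"id": "LCN", "iso2code": "ZJ",
                 "value": "Latin America & Caribbean (all income levels)"},
      "incomeLevel": {"id": "UMC", "iso2code": "XT",
                      "value": "Upper middle income"},
      "capitalCity": "Brasilia",
      "longitude": "-47.9292",
      "latitude": "-15.7801"
    }
  ]
]
""")


def to_float(text):
    """Convert a coordinate string to float; empty or missing becomes None (assumption)."""
    return float(text) if text else None


def flatten_country(payload):
    """Return one flat dict per country record in a country-query payload."""
    _metadata, records = payload  # element 0 is paging info, element 1 is the data
    rows = []
    for rec in records:
        rows.append({
            "iso3": rec["id"],
            "iso2": rec["iso2Code"],
            "name": rec["name"],
            "region": rec["region"]["value"],
            "income_level": rec["incomeLevel"]["value"],
            "capital": rec["capitalCity"],
            # Coordinates arrive as strings in the sample, so convert them.
            "longitude": to_float(rec["longitude"]),
            "latitude": to_float(rec["latitude"]),
        })
    return rows


rows = flatten_country(SAMPLE)
print(rows[0]["name"], rows[0]["capital"])
```

A list of flat dictionaries like `rows` can be passed directly to `pd.DataFrame(rows)` later in the notebook.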


API Basic Call Structures – World Bank Data Help Desk
https://datahelpdesk.worldbank.org/knowledgebase/articles/898581-api-basic-call-structures

That page gives us the following query to get population:

https://api.worldbank.org/v2/country/br/indicator/SP.POP.TOTL?format=json

which gives us output like:

```

[{"page":1,"pages":2,"per_page":50,"total":63,"sourceid":"2","lastupdated":"2023-12-18"},[{"indicator":{"id":"SP.POP.TOTL","value":"Population, total"},"country":{"id":"BR","value":"Brazil"},"countryiso3code":"BRA","date":"2022","value":215313498,"unit":"","obs_status":"","decimal":0},{"indicator":{"id":"SP.POP.TOTL","value":"Population, total"},"country":{"id":"BR","value":"Brazil"},"countryiso3code":"BRA","date":"2021","value":214326223,"unit":"","obs_status":"","decimal":0},{"indicator":{"id":"SP.POP.TOTL","value":"Population, total"},"country":{"id":"BR","value":"Brazil"},"countryiso3code":"BRA","date":"2020","value":213196304,"unit":"","obs_status":"","decimal":0},{"indicator":{"id":"SP.POP.TOTL","value":"Population, total"},"country":{"id":"BR","value":"Brazil"},"countryiso3code":"BRA","date":"2019","value":211782878,"unit":"","obs_status":"","decimal":0},
...
]]

```


This gives us country population data by year, although extracting it takes some unpacking and may need some trial and error.

The response is a list. Its second element is itself a list of dictionaries, one per year for the queried country.
Each dictionary has the keys indicator, country, countryiso3code, date, value, unit, obs_status, and decimal.
We only care about date and value, because each query is generated for a specific country code.

The first element of the response holds paging information (page, pages, per_page, total), so longer results are split across pages. Paging shouldn't be needed here: the sample lists the most recent years first, 50 entries per page, and we only need the entries for date = 2024, 2023, 2022, 2021, and 2020 (although 2024 and 2023 data are likely not yet available).
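
As a rough illustration of that unpacking, the sketch below pulls date and value out of a population payload for a chosen set of years. The function name `population_by_year` is hypothetical, and `SAMPLE` holds the first four Brazil records from the output above, trimmed to the keys used here. It assumes a successful response with exactly two elements and skips any year whose value is null.

```python
import json

# First records of the SP.POP.TOTL sample above, trimmed to the keys used here.
SAMPLE = json.loads("""
[
  {"page": 1, "pages": 2, "per_page": 50, "total": 63},
  [
    {"countryiso3code": "BRA", "date": "2022", "value": 215313498},
    {"countryiso3code": "BRA", "date": "2021", "value": 214326223},
    {"countryiso3code": "BRA", "date": "2020", "value": 213196304},
    {"countryiso3code": "BRA", "date": "2019", "value": 211782878}
  ]
]
""")


def population_by_year(payload, years):
    """Map year -> population for the requested years, skipping missing values."""
    _metadata, records = payload  # paging info first, yearly records second
    wanted = {str(year) for year in years}  # "date" is a string in the response
    return {
        int(rec["date"]): rec["value"]
        for rec in records
        if rec["date"] in wanted and rec["value"] is not None
    }


pops = population_by_year(SAMPLE, range(2020, 2025))
print(pops)
```

Years with no data yet (2023 and 2024 in this sample) are simply absent from the result. The World Bank documentation linked above also describes `date` and `per_page` query parameters that could narrow the request itself; that option is not exercised in this sketch.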



With this, I should be able to generate some country-level visuals for the data science job listings from ai-jobs.net, which is the source of the data set we are using.

Gaps in that site's coverage might limit the usefulness of these metrics. If the site carries far fewer postings for one country than for another, country-specific metrics would rest on an unreliable basis.

Some potential questions to answer using this and the other data set:

different salaries in different countries
(choose several indicators from the World Bank API and compare them across countries and data science salaries)
job postings in different countries
job postings per capita in different countries
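
For the per-capita question, one possible calculation is postings divided by population, scaled to postings per million people. The sketch below assumes the 'company_location' values are two-letter country codes that match the population lookup; the list of locations is illustrative only, "XX" is a hypothetical code with no population data, and the Brazil population is the 2022 value from the sample output earlier in this notebook.

```python
from collections import Counter

# Illustrative 'company_location' values, not real job-listing data.
company_locations = ["BR", "BR", "BR", "XX"]

# Population lookup keyed by the same codes (BR: 2022 value from the sample above).
populations = {"BR": 215313498}


def postings_per_million(locations, populations):
    """Return postings per million inhabitants for codes with a known population."""
    counts = Counter(locations)
    return {
        code: count / populations[code] * 1_000_000
        for code, count in counts.items()
        if populations.get(code)  # skip codes with missing or zero population
    }


rates = postings_per_million(company_locations, populations)
print(rates)
```

Codes without a population entry are dropped rather than reported as zero, so missing World Bank data is not mistaken for a country with no postings.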


In [None]:
#Pseudocode:

#grab all countries from our dataset (create a list of the unique entries in 'company_location')
#API call to list all countries and store the response as json
#create a list of two letter ISO country codes that correspond to the list of countries in our dataset
#create a list/dictionary to store country data

#for loop - for each country_code in that list, we want to make a specific country query
    #url = f"http://api.worldbank.org/v2/country/{country_code}?format=json"
    #country_response = requests.get(url).json()
    #store data in country_data

#create pandas dataframe from country_data list/dictionary we generated (see module 6)
#save the dataframe to a csv, save it as "worldbank_api_results.csv" in "resources" directory


#read the csv in the other notebook, can create dataframes and try merging if needed. Otherwise, maybe just find key statistics. 
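
One way the pseudocode above might be turned into code is sketched here. The function names (`fetch_country`, `collect_country_data`, `save_rows`) and the output columns are hypothetical. The `fetch` parameter is an assumption added so the loop can be tried with a stand-in function instead of live requests, and the standard `csv` module is swapped in for pandas to keep the sketch light; `pd.DataFrame(rows).to_csv(path, index=False)` would be the pandas route the pseudocode describes.

```python
import csv
from pathlib import Path

FIELDS = ["iso2", "iso3", "name", "region", "income_level"]


def fetch_country(country_code):
    """Default fetcher: one country query against the World Bank API."""
    import requests  # imported lazily so the rest of the sketch runs offline

    url = f"http://api.worldbank.org/v2/country/{country_code}?format=json"
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()


def collect_country_data(country_codes, fetch=fetch_country):
    """Query each distinct code once and return a list of flat row dicts."""
    rows = []
    for code in sorted(set(country_codes)):
        payload = fetch(code)
        # A successful payload is [metadata, [record, ...]]; skip anything else.
        if not isinstance(payload, list) or len(payload) < 2 or not payload[1]:
            continue
        rec = payload[1][0]
        rows.append({
            "iso2": rec["iso2Code"],
            "iso3": rec["id"],
            "name": rec["name"],
            "region": rec["region"]["value"],
            "income_level": rec["incomeLevel"]["value"],
        })
    return rows


def save_rows(rows, path):
    """Write the rows to CSV, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


# Possible usage, assuming `df` is the job-listings dataframe (hypothetical name)
# and that 'company_location' already holds two-letter country codes:
# rows = collect_country_data(df["company_location"].unique())
# save_rows(rows, "resources/worldbank_api_results.csv")
```

Codes the API does not recognise are skipped silently in this sketch; logging them instead would make gaps in the saved CSV easier to trace.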