## Project Phase 1: Proposal


### Introduction

A country's economic development matters greatly to its people, since it shapes their quality of life and general well-being. This proposal describes a project that studies how a country's economic factors affect its population's well-being, measured through indicators such as the national poverty rate. I am also interested in how a country's political factors affect its economy. Answers to both questions could offer insight into how a country can be run to improve the lives of its population.

### Data Collection

I plan to collect data from the World Bank Indicators API. This source provides key information on a country's economic and socio-political factors. I plan to collect a small number of indicators for each country for a single year (2023 in the code below). The data is retrieved through HTTP API calls using the requests library.

**Note 1**: None of the indicators provides categorical data. To obtain a categorical variable, I queried the list of countries, each of which has an associated region. The region serves as the categorical data, but retrieving it requires a slightly different call to the same API.

**Note 2**: The API does not require any API keys.
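
To make Note 1 concrete, here is a minimal sketch of how the region could be pulled out of the country list as a categorical label. The two sample records are copied (and trimmed) from the country-list output printed later in this notebook. `region_by_country` is a hypothetical helper name, and treating a region id of `NA` as "aggregate, not a country" is an assumption based on those two records.

```python
# Two records copied from the country-list output shown in this notebook,
# trimmed to the fields used here.
sample_countries = [
    {'id': 'ABW', 'name': 'Aruba',
     'region': {'id': 'LCN', 'iso2code': 'ZJ', 'value': 'Latin America & Caribbean '}},
    {'id': 'AFE', 'name': 'Africa Eastern and Southern',
     'region': {'id': 'NA', 'iso2code': 'NA', 'value': 'Aggregates'}},
]


def region_by_country(country_records):
    """Map each ISO3 country code to its region name, skipping aggregates."""
    regions = {}
    for record in country_records:
        region = record['region']
        # Assumption: aggregate rows are the ones whose region id is 'NA',
        # as in the AFE record above.
        if region['id'] == 'NA':
            continue
        # Region names can carry trailing whitespace, so strip it.
        regions[record['id']] = region['value'].strip()
    return regions


print(region_by_country(sample_countries))
```

Applied to the full country list, this would give one region label per real country and leave out regional aggregates, which would otherwise be counted alongside countries.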

In [None]:
import requests
import pandas as pd
import json



baseurl = 'https://api.worldbank.org/v2/country/all/indicator/'
    
#API input parameters
params = {
    'format': 'json',
    'per_page': '300', # This makes sure all countries are returned
    'date': '2023' #Query just one year
}

#What indicator data to get
indicators = {
    'SP.POP.TOTL': 'Population, total',
    'NY.GNP.PCAP.CD': 'GNI per capita, Atlas method (current US$)',  # GDP per capita would be NY.GDP.PCAP.CD
    'SI.POV.DDAY': 'Poverty headcount ratio at $3.00 a day (2021 PPP)',
    'SI.POV.GINI': 'Gini index',
    'MS.MIL.XPND.GD.ZS': 'Military expenditure (% of GDP)',
    'VA.EST': 'Voice and Accountability: Estimate'
}
def get_api_url(indicator, params):
    """
    Constructs a URL for the API call, to query a given indicator and with a given set of parameters.

    Args:
        indicator: the indicator ID string
        params: a dictionary containing the API call parameters

    Returns:
        A URL string that can be sent as an HTTP GET request to retrieve the data
    """
    # Join the parameters as key=value pairs separated by '&'
    query = '&'.join(f'{key}={value}' for key, value in params.items())
    return f'{baseurl}{indicator}?{query}'



# Loop through the indicators and make one API call for each; each call here requests a single indicator.
Indicator_Data = {}
for indicator in indicators:
    Indicator_Data[indicator] = json.loads(requests.get(get_api_url(indicator, params)).text)


# One additional piece of data: the region code. This must be pulled from a separate, slightly different call to the same API.
Country_Data = json.loads(requests.get('https://api.worldbank.org/v2/country?format=json&per_page=296').text)[1]


print(Country_Data)
print(Indicator_Data)

[{'id': 'ABW', 'iso2Code': 'AW', 'name': 'Aruba', 'region': {'id': 'LCN', 'iso2code': 'ZJ', 'value': 'Latin America & Caribbean '}, 'adminregion': {'id': '', 'iso2code': '', 'value': ''}, 'incomeLevel': {'id': 'HIC', 'iso2code': 'XD', 'value': 'High income'}, 'lendingType': {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not classified'}, 'capitalCity': 'Oranjestad', 'longitude': '-70.0167', 'latitude': '12.5167'}, {'id': 'AFE', 'iso2Code': 'ZH', 'name': 'Africa Eastern and Southern', 'region': {'id': 'NA', 'iso2code': 'NA', 'value': 'Aggregates'}, 'adminregion': {'id': '', 'iso2code': '', 'value': ''}, 'incomeLevel': {'id': 'NA', 'iso2code': 'NA', 'value': 'Aggregates'}, 'lendingType': {'id': '', 'iso2code': '', 'value': 'Aggregates'}, 'capitalCity': '', 'longitude': '', 'latitude': ''}, {'id': 'AFG', 'iso2Code': 'AF', 'name': 'Afghanistan', 'region': {'id': 'MEA', 'iso2code': 'ZQ', 'value': 'Middle East, North Africa, Afghanistan & Pakistan'}, 'adminregion': {'id': 'MNA', 'iso2code': 'XQ',
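
The printed structures are deeply nested, so a flattening step will probably be needed before any analysis. The sketch below shows one possible way to do that. It assumes that each indicator response is a two-element list (paging metadata, then records), as the `[1]` index in the country call suggests, and that each record carries `countryiso3code` and `value` keys. `flatten_indicators` is a hypothetical helper, and the sample values are placeholders, not real World Bank figures.

```python
def flatten_indicators(indicator_data):
    """Turn {indicator: API payload} into {country code: {indicator: value}}."""
    table = {}
    for indicator, payload in indicator_data.items():
        # Assumption: payload is [metadata, records]; records may be missing
        # or None when the API has nothing to return.
        records = payload[1] if len(payload) > 1 and payload[1] else []
        for record in records:
            code = record.get('countryiso3code')
            if not code:
                continue  # skip rows without a usable country code
            table.setdefault(code, {})[indicator] = record.get('value')
    return table


# Placeholder payloads with made-up values, only to show the shapes involved.
sample = {
    'SP.POP.TOTL': [{'page': 1}, [
        {'countryiso3code': 'AAA', 'value': 100},
        {'countryiso3code': 'BBB', 'value': None},
    ]],
    'SI.POV.GINI': [{'page': 1}, [
        {'countryiso3code': 'AAA', 'value': 30.5},
    ]],
}

rows = flatten_indicators(sample)
print(rows)
```

From here, `pd.DataFrame.from_dict(rows, orient='index')` would give one row per country and one column per indicator, with missing values left as gaps to handle later.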

### Data Usage

The World Bank's country indicators cover a wide range of factors that could affect a country's population and economic performance. By taking a sufficient sample of these factors for each country, we can build a profile that could be used to predict other factors. Most of these factors are numerical; the exception is country region, which is categorical. Because the interplay between indicators is complex, a machine learning model could be used for predictions. Such a model could take as inputs indicators that a government can set fairly directly, such as military spending, and output indicators that are harder to control, such as national poverty. These predictions could help answer both of my questions.