# ODA data by Sector (with donor and recipient)
This tutorial shows you how to build a Pandas DataFrame containing ODA data from any number of donors to any number of recipients, across sectors available in the list of indicators, over a range of years.

The tutorial has the following steps:
1. Import `ODAData`, the main tool used to interact with the data
2. Create an instance of `ODAData` with the specific arguments you would like to use
3. Load the indicators and get the DataFrame
4. Optionally, export the DataFrame

## 1. Import ODAData

We can gather the data we need using the `ODAData` class. An object from this class can:
    - Get data for specific indicators
    - Optionally, filter the data for specific donors, recipients and years
    - Optionally, exchange and deflate data

The `oda_data` package automatically downloads the relevant datasets (i.e. DAC1, DAC2a or CRS tables) depending on the indicators selected in step 3.

It is highly recommended that you specify the folder where you want to store, and from where you want to read, this raw data. This must be done once in each notebook or script where we use the `oda_data` package. You can specify the data path if the raw data has already been downloaded, or if you haven't yet specified the download path in your notebook and script.

In [2]:
from oda_data import ODAData, set_data_path

# in section 3, we give the option to retreive data for specific donor and recipient groupings, so we need to import the following
from oda_data.tools.groupings import donor_groupings, recipient_groupings

# If you haven't set the data path, you can do it now.
set_data_path(path="../tutorials/data")

# 2. Create an instance of ODAData

Next, we need to set the right arguments which are used to create an instance of ODAData to produce the right DataFrame. The arguments required are:
- *years*: you must specify the `years`, as an `int`, `list` or `range`
- *donors*: you can _optionally_ specify the `donors` you want the output to have (as `int`, or `list`) of donor codes.
- *recipients*: you can _optionally_ specify the `recipients` you want the output to have. Not all indicators
    need or accept recipients. If using an indicator for which recipients are not an option, a warning will be logged to
    the console and the recipients ignored for that indicator.
- *currency*: you can _optionally_ specify the `currency` in which you want your data to be shown. If not specified,
    by default, `USD` will be used. Other options include `EUR`, `GBP` and `CAN`.
- *prices*: you can _optionally_ specify the `prices` in which you want your data to be shown. If not specified,
    by default, `current` will be used. The other option is `constant`. If specifying `constant` a `base_year` must be set.
- *base_year*: you must specify a `base_year` if you have set `prices = 'constant'`. If you have chosen `current` prices,
    by default, `base_year` will be `None`.

You can use a few methods provided by the ODAData object to see the available donors, recipients and donor/recipient groupings.

In [6]:
# print a list of available donors
ODAData().available_donors()

INFO 2023-02-06 12:42:25,674 [oda_data.py:available_donors:445] Note that not all donors may be available for all indicators


{
1: Austria,
2: Belgium,
3: Denmark,
4: France,
5: Germany,
6: Italy,
7: Netherlands,
8: Norway,
9: Portugal,
10: Sweden,
11: Switzerland,
12: United Kingdom,
18: Finland,
20: Iceland,
21: Ireland,
22: Luxembourg,
40: Greece,
50: Spain,
61: Slovenia,
68: Czech Republic,
69: Slovak Republic,
75: Hungary,
76: Poland,
301: Canada,
302: United States,
701: Japan,
742: Korea,
801: Australia,
820: New Zealand,
30: Cyprus,
45: Malta,
55: Turkey,
62: Croatia,
70: Liechtenstein,
72: Bulgaria,
77: Romania,
82: Estonia,
83: Latvia,
84: Lithuania,
87: Russia,
130: Algeria,
133: Libya,
358: Mexico,
543: Iraq,
546: Israel,
552: Kuwait,
561: Qatar,
566: Saudi Arabia,
576: United Arab Emirates,
611: Azerbaijan,
613: Kazakhstan,
732: Chinese Taipei,
764: Thailand,
765: Timor-Leste,
104: Nordic Development Fund,
807: UNEP,
811: Global Environment Facility,
812: Montreal Protocol,
901: International Bank for Reconstruction and Development,
902: Multilateral Investment Guarantee Agency,
903: Internationa

In [7]:
# print a list of available recipients
ODAData().available_recipients()

INFO 2023-02-06 12:42:27,733 [oda_data.py:available_recipients:451] Note that not all recipients may be available for all indicators


{
30: Cyprus,
35: Gibraltar,
45: Malta,
55: Turkey,
57: Kosovo,
61: Slovenia,
62: Croatia,
63: Serbia,
64: Bosnia and Herzegovina,
65: Montenegro,
66: North Macedonia,
71: Albania,
85: Ukraine,
86: Belarus,
88: States Ex-Yugoslavia unspecified,
89: Europe, regional,
93: Moldova,
130: Algeria,
133: Libya,
136: Morocco,
139: Tunisia,
142: Egypt,
189: North of Sahara, regional,
218: South Africa,
225: Angola,
227: Botswana,
228: Burundi,
229: Cameroon,
230: Cabo Verde,
231: Central African Republic,
232: Chad,
233: Comoros,
234: Congo,
235: Democratic Republic of the Congo,
236: Benin,
237: East African Community,
238: Ethiopia,
239: Gabon,
240: Gambia,
241: Ghana,
243: Guinea,
244: Guinea-Bissau,
245: Equatorial Guinea,
247: Cote d'Ivoire,
248: Kenya,
249: Lesotho,
251: Liberia,
252: Madagascar,
253: Malawi,
255: Mali,
256: Mauritania,
257: Mauritius,
258: Mayotte,
259: Mozambique,
260: Niger,
261: Nigeria,
265: Zimbabwe,
266: Rwanda,
268: Sao Tome and Principe,
269: Senegal,
270: Seyche

In [8]:
# Print available donor groupings
print(donor_groupings().keys())

dict_keys(['dac_members', 'dac_countries', 'non_dac_countries', 'multilateral', 'all_bilateral', 'all_official', 'g7', 'eu27_total', 'eu27_countries', 'dac1_aggregates'])


In [9]:
# Print available recipient groupings
print(recipient_groupings().keys())

dict_keys(['all_recipients', 'all_developing_countries_regions', 'african_countries', 'africa_regional', 'african_countries_regional', 'sahel', 'ldc_countries', 'france_priority', 'dac2a_aggregates'])


Next, in order to get the data we want, we will create an instance of the `ODAData` class by specifying the correct arguments.

We will store this instance in a variable called `oda`, which we will use later to load the indicators and get the DataFrame we're after.

Below are some example settings for this tutorial. For clarity, we will first store them in variables, but you can
always pass them directly as arguments to the `ODAData` class.

In [18]:
# Select years as (for example) a range. Remember ranges are exclusive of the upper bound.
years = range(2018,2021)

# Select donors, which must be specified by their codes. To get all donors, do not use this argument.
donors = [4, 5, 12, 302]

# Select recipients, which must be specified by their codes or available country groupings. In this example, we are selecting LDCs (which we define below).
recipients = list(recipient_groupings()['ldc_countries'])

# Select the currency. By default 'USD' is shown but we'll get the data in Euros.
currency = 'EUR'

# Select the prices. By default, 'current' is shown, but we'll get the data in constant prices.
prices = 'constant'

# Set the base year. We must set this given that we've asked for constant data.
base_year = 2021

# Instantiate the `ODAData` class and store it in a variable called 'oda'
oda = ODAData(years=years,
              donors=donors,
              currency=currency,
              prices=prices,
              base_year=base_year,
              include_names=True)

## 3. Load the indicators and get the DataFrame

Then we tell `oda` to load the indicator(s) that are useful for our analysis. For this tutorial, we will use the *"crs_gender_significant_flow_disbursement_gross"*, *"crs_gender_principal_flow_disbursement_gross"*, and *"crs_gender_total_flow_gross"* indicators to get total gender relevant ODA from our 4 donors to LDCs.

A full list of indicators can be seen by using the `.available_indicators()` method.

In [19]:
# print available indicators
ODAData().available_indicators()

[
total_oda_flow_net,
total_oda_ge,
total_oda_official_definition,
total_oda_bilateral_flow_net,
total_oda_bilateral_ge,
total_oda_multilateral_flow_net,
total_oda_multilateral_ge,
total_oda_flow_gross,
total_oda_flow_commitments,
total_oda_grants_flow,
total_oda_grants_ge,
total_oda_non_grants_flow,
total_oda_non_grants_ge,
total_covid_oda_ge,
total_covid_oda_flow,
total_covid_oda_ge_linked,
total_health_covid_oda_ge,
total_health_covid_oda_flow,
total_health_covid_oda_ge_linked,
total_covid_vaccine_donations_oda_ge,
total_covid_vaccine_donations_oda_flow,
total_covid_vaccine_donations_oda_ge_linked,
total_covid_vaccine_donations_domestic_supply_oda_ge,
total_covid_vaccine_donations_domestic_supply_oda_flow,
total_covid_vaccine_donations_domestic_supply_oda_ge_linked,
total_covid_vaccine_donations_dev_purchase_oda_ge,
total_covid_vaccine_donations_dev_purchase_oda_flow,
total_covid_vaccine_donations_dev_purchase_oda_ge_linked,
total_covid_ancillary_oda_ge,
total_covid_ancillary_oda_fl

In [22]:
# Create a variable with the list of indicators for this analysis.
# The indicator can also be directly passed as an argument in the step below.
indicators = ['crs_gender_significant_flow_disbursement_gross',
              'crs_gender_principal_flow_disbursement_gross',
              'crs_gender_total_flow_gross']

# Add all the indicators in our `indicators` list
for indicator in indicators:
    oda.load_indicator(indicator)

# Get a DataFrame with all the data. Here, we also add the share of total.
df = oda.add_share_of_total(True).get_data('all')

# Finally, group the DataFrame rows by year, currency, prices and indicator
df = df.groupby(['donor_name','year','currency','prices','indicator'], observed=True, dropna=False)['value'].sum(numeric_only=True).reset_index(drop=False)

# show the resulting dataframe
df

Unnamed: 0,donor_name,year,currency,prices,indicator,value
0,France,2018,EUR,constant,crs_gender_principal_flow_disbursement_gross,275.432951
1,France,2018,EUR,constant,crs_gender_significant_flow_disbursement_gross,945.007322
2,France,2018,EUR,constant,crs_gender_total_flow_gross,1220.440273
3,France,2019,EUR,constant,crs_gender_principal_flow_disbursement_gross,193.153019
4,France,2019,EUR,constant,crs_gender_significant_flow_disbursement_gross,1390.450238
5,France,2019,EUR,constant,crs_gender_total_flow_gross,1583.603257
6,France,2020,EUR,constant,crs_gender_principal_flow_disbursement_gross,276.828589
7,France,2020,EUR,constant,crs_gender_significant_flow_disbursement_gross,2716.558708
8,France,2020,EUR,constant,crs_gender_total_flow_gross,2993.387297
9,Germany,2018,EUR,constant,crs_gender_principal_flow_disbursement_gross,227.704653


## 4. Optionally export DataFrame as CSV

Finally, we can export the DataFrame as a CSV if required.

In [17]:
df.to_csv(r'../tutorials/output/total_gender_oda.csv', index = False)