# Data reading example 4 - FAOstat #
To run this example, the file `Emissions_Agriculture_Cultivated_Organic_Soils_E_All_Data_(Normalized).csv` must be placed in the same folder as this notebook.
The data is available from [FAOstat](http://www.fao.org/faostat/en/#data/GV/metadata).

In [1]:
import primap2 as pm2

## Dataset Specifications ##
Here we define which columns of the CSV file hold each dimension, together with default values, terminologies, value mappings, a row filter, and metadata.

In [2]:
file = "Emissions_Agriculture_Cultivated_Organic_Soils_E_All_Data_(Normalized).csv"
coords_cols = {
    "unit": "Unit",
    "entity": "Element",
    "area": "Area",
    "category": "Item",
    "data": "Value",
    "time": "Year",
}
coords_defaults = {
    "source": "FAOstat",
}
coords_terminologies = {
    "area": "FAOstat",
    "category": "FAOstat",
}

# TODO: proper mapping of the area to ISO3
coords_value_mapping = {
    "unit": {"gigagrams": "Gg N2O / year"},
    "entity": {"Emissions (N2O) (Cultivation of organic soils)": "N2O"},
}

filter_keep = {
    "f1": {
        "Element": "Emissions (N2O) (Cultivation of organic soils)",
    },
}

meta_data = {
    "references": "http://www.fao.org/faostat/en/#data/GV/metadata"
}
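
To make the roles of `filter_keep` and `coords_value_mapping` more concrete, here is a minimal standard-library sketch of what they conceptually do to the rows of a long-format CSV. It is not how primap2 implements the reader. The inline CSV is a toy stand-in for the FAOstat file (the second row and its `Some other element` value are hypothetical), the helpers `keep` and `map_values` are made up for illustration, and the matching rule (a row is kept if it satisfies every condition of at least one filter) is an assumption.

```python
import csv
import io

# Toy stand-in for the FAOstat file. The second row is hypothetical and only
# exists so that the filter has something to drop.
RAW = """Area,Item,Element,Year,Unit,Value
Africa,Cropland organic soils,Emissions (N2O) (Cultivation of organic soils),1990,gigagrams,17.4597
Africa,Cropland organic soils,Some other element,1990,ha,1.0
Albania,Cropland organic soils,Emissions (N2O) (Cultivation of organic soils),1990,gigagrams,0.0378
"""

# Same structures as in the notebook cell above, reduced to what the sketch needs.
coords_cols = {"unit": "Unit", "entity": "Element"}
coords_value_mapping = {
    "unit": {"gigagrams": "Gg N2O / year"},
    "entity": {"Emissions (N2O) (Cultivation of organic soils)": "N2O"},
}
filter_keep = {
    "f1": {"Element": "Emissions (N2O) (Cultivation of organic soils)"},
}


def keep(row, filters):
    """Assumed rule: keep a row if it matches all conditions of any one filter."""
    return any(
        all(row[col] == val for col, val in conditions.items())
        for conditions in filters.values()
    )


def map_values(row):
    """Replace raw column values with their mapped names where a mapping exists."""
    out = dict(row)
    for dim, mapping in coords_value_mapping.items():
        col = coords_cols[dim]
        out[col] = mapping.get(out[col], out[col])
    return out


# Filter on the original values first, then apply the value mapping.
rows = [
    map_values(row)
    for row in csv.DictReader(io.StringIO(RAW))
    if keep(row, filter_keep)
]

for row in rows:
    print(row["Area"], row["Element"], row["Unit"], row["Value"])
```

Note that the filter is written against the original column name and value (`Element`), while the mapping is keyed by dimension name (`entity`), mirroring the dictionaries in the cell above.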

## Reading the data to interchange format ##

In [3]:
AgN2O_if = pm2.pm2io.read_long_csv_file_if(
    file,
    coords_cols=coords_cols,
    coords_defaults=coords_defaults,
    coords_terminologies=coords_terminologies,
    coords_value_mapping=coords_value_mapping,
    filter_keep=filter_keep,
    meta_data=meta_data,
    time_format="%Y",
)
AgN2O_if.head()

| | source | area (FAOstat) | entity | unit | category (FAOstat) | 1990 | 1991 | 1992 | 1993 | 1994 | ... | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2030 | 2050 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | FAOstat | Africa | N2O | Gg N2O / year | Cropland and grassland organic soils | 34.3224 | 34.3224 | 34.3224 | 34.3224 | 34.3211 | ... | 35.2262 | 35.2203 | 35.1451 | 35.1434 | 35.1327 | 35.1277 | 35.0546 | 35.0546 | 35.0546 | 35.0546 |
| 1 | FAOstat | Africa | N2O | Gg N2O / year | Cropland organic soils | 17.4597 | 17.4597 | 17.4597 | 17.4597 | 17.4588 | ... | 18.1675 | 18.1546 | 18.1318 | 18.1307 | 18.1014 | 18.0904 | 18.0273 | 18.0273 | 18.0273 | 18.0273 |
| 2 | FAOstat | Africa | N2O | Gg N2O / year | Grassland organic soils | 16.8627 | 16.8627 | 16.8627 | 16.8627 | 16.8623 | ... | 17.0586 | 17.0657 | 17.0133 | 17.0127 | 17.0313 | 17.0373 | 17.0272 | 17.0272 | 17.0272 | 17.0272 |
| 3 | FAOstat | Albania | N2O | Gg N2O / year | Cropland and grassland organic soils | 0.0471 | 0.0471 | 0.0471 | 0.0471 | 0.0471 | ... | 0.047 | 0.047 | 0.047 | 0.047 | 0.0468 | 0.0468 | 0.046 | 0.046 | 0.046 | 0.046 |
| 4 | FAOstat | Albania | N2O | Gg N2O / year | Cropland organic soils | 0.0378 | 0.0378 | 0.0378 | 0.0378 | 0.0378 | ... | 0.0377 | 0.0377 | 0.0377 | 0.0377 | 0.0376 | 0.0376 | 0.0369 | 0.0369 | 0.0369 | 0.0369 |
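
The output above is in wide layout: each row is one combination of the non-time dimensions and each year is its own column, whereas the input CSV has one row per value. A rough sketch of that long-to-wide reshaping, using only the standard library, might look as follows. The helper `to_wide` is hypothetical and is not part of primap2; the records reuse a few values from the output above.

```python
from collections import defaultdict

# Long-format records: (area, category, year, value).
# Values are taken from the table shown above.
long_records = [
    ("Africa", "Cropland organic soils", "1990", 17.4597),
    ("Africa", "Cropland organic soils", "1994", 17.4588),
    ("Albania", "Cropland organic soils", "1990", 0.0378),
    ("Albania", "Cropland organic soils", "1994", 0.0378),
]


def to_wide(records):
    """Group records by their non-time dimensions, with one entry per year."""
    wide = defaultdict(dict)
    for area, category, year, value in records:
        wide[(area, category)][year] = value
    return dict(wide)


wide = to_wide(long_records)

for (area, category), years in wide.items():
    print(area, category, years)
```

Each key of `wide` corresponds to one row of the wide table, and each inner dictionary to its year columns.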


In [4]:
AgN2O_if.attrs

{'attrs': {'references': 'http://www.fao.org/faostat/en/#data/GV/metadata',
  'area': 'area (FAOstat)',
  'cat': 'category (FAOstat)'},
 'time_format': '%Y',
 'dimensions': {'*': ['source',
   'area (FAOstat)',
   'entity',
   'unit',
   'category (FAOstat)']}}

## Transformation to PRIMAP2 xarray format ##
The function `from_interchange_format` takes an interchange format DataFrame and converts it to the PRIMAP2 xarray format.
The resulting xarray Dataset is already quantified: its variables are pint arrays, so each value carries a unit.

In [5]:
AgN2O = pm2.pm2io.from_interchange_format(AgN2O_if)
AgN2O

2021-04-13 17:03:06.126 | DEBUG    | primap2.pm2io._interchange_format:from_interchange_format:266 - Expected array shapes: [[1, 147, 1, 3]], resulting in size 441.
2021-04-13 17:03:06.145 | INFO     | primap2._data_format:ensure_valid_attributes:245 - Reference information is not a DOI: 'http://www.fao.org/faostat/en/#data/GV/metadata'


Magnitude,[[[[34.3224 0.0471 52.6792 ... 382.035 0.1305 9.1669]  [17.4597 0.0378 30.6613 ... 250.4325 0.0731 3.1896]  [16.8627 0.0093 22.0179 ... 131.6025 0.0574 5.9773]]]  [[[34.3224 0.0471 52.6792 ... 382.035 0.1305 9.1669]  [17.4597 0.0378 30.6613 ... 250.4325 0.0731 3.1896]  [16.8627 0.0093 22.0179 ... 131.6025 0.0574 5.9773]]]  [[[34.3224 0.0471 52.6792 ... 382.035 nan 9.1669]  [17.4597 0.0378 30.6613 ... 250.4325 nan 3.1896]  [16.8627 0.0093 22.0179 ... 131.6025 nan 5.9773]]]  ...  [[[35.0546 0.046 55.6044 ... 416.8022 nan 9.1346]  [18.0273 0.0369 32.0142 ... 280.9176 nan 3.2132]  [17.0272 0.0091 23.5902 ... 135.8846 nan 5.9214]]]  [[[35.0546 0.046 55.6044 ... 416.8022 nan 9.1346]  [18.0273 0.0369 32.0142 ... 280.9176 nan 3.2132]  [17.0272 0.0091 23.5902 ... 135.8846 nan 5.9214]]]  [[[35.0546 0.046 55.6044 ... 416.8022 nan 9.1346]  [18.0273 0.0369 32.0142 ... 280.9176 nan 3.2132]  [17.0272 0.0091 23.5902 ... 135.8846 nan 5.9214]]]]
Units,N2O gigagram/year
