# Example usage

To use `unesco_reader` in a project:

In [1]:
import unesco_reader

print(unesco_reader.__version__)

0.1.2


To explore UIS data, import the `uis` module from `unesco_reader`.

In [1]:
from unesco_reader import uis

Explore the available datasets and get information about a particular dataset

In [3]:
uis.available_datasets() # see available datasets - returns a list of dataset codes

['SDG', 'OPRI', 'SCI', 'SDG11', 'DEM']

In [4]:
uis.available_datasets(as_names=True) # get the full names for available datasets

['SDG Global and Thematic Indicators',
 'Other Policy Relevant Indicators',
 'Research and Development (R&D) SDG 9.5',
 'SDG 11.4',
 'Demographic and Socio-economic Indicators']

In [6]:
uis.available_datasets(category='education')

['SDG', 'OPRI']

In [8]:
uis.dataset_info('SDG') # get information about the SDG dataset

----------------  ----------------------------------------------------------------------------------------
dataset_name      SDG Global and Thematic Indicators
dataset_code      SDG
dataset_category  education
regional          True
link              https://apimgmtstzgjpfeq2u763lag.blob.core.windows.net/content/MediaLibrary/bdds/SDG.zip
----------------  ----------------------------------------------------------------------------------------
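When switching between codes and full names, a small lookup table can be handy. The sketch below pairs the two lists shown in the outputs above. It assumes that `available_datasets()` and `available_datasets(as_names=True)` return their items in the same order, which the outputs suggest but which is not confirmed here. The lists are copied in by hand so that the snippet runs without a network connection.

```python
# Values copied from the outputs above. In a live session they would come from
# uis.available_datasets() and uis.available_datasets(as_names=True).
codes = ["SDG", "OPRI", "SCI", "SDG11", "DEM"]
names = [
    "SDG Global and Thematic Indicators",
    "Other Policy Relevant Indicators",
    "Research and Development (R&D) SDG 9.5",
    "SDG 11.4",
    "Demographic and Socio-economic Indicators",
]

# Assumption: both lists are in the same order, so they can be paired by position.
code_to_name = dict(zip(codes, names))

print(code_to_name["SDG"])  # SDG Global and Thematic Indicators
```

The pairing for `SDG` agrees with the `dataset_name` and `dataset_code` rows printed by `dataset_info('SDG')`.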


To access and explore the data for a particular UIS dataset, use the `UIS` object

In [2]:
# First, instantiate a `UIS` object, passing the dataset code or name that you want to explore
# Here we are going to instantiate the `SDG` dataset

sdg = uis.UIS('SDG') # you can also pass the dataset name `SDG Global and Thematic Indicators`
sdg

<unesco_reader.uis.UIS at 0x226c5d54a60>

You can get information about the dataset, such as its name, code, category, and the link to download the zipped data file.


In [15]:
sdg.dataset_code

'SDG'

In [14]:
sdg.dataset_name

'SDG Global and Thematic Indicators'

In [16]:
sdg.dataset_category

'education'

In [17]:
sdg.link

'https://apimgmtstzgjpfeq2u763lag.blob.core.windows.net/content/MediaLibrary/bdds/SDG.zip'

To explore the data, use the `load_data` method, which downloads the data from
UNESCO, cleans it, and stores it on the object as a pandas DataFrame.

If you have already downloaded the zipped file,
you can pass its path, and the data will be read from the local file instead of being downloaded.

In [3]:
sdg.load_data() # optionally pass `local_path = "path to zipped file..."` to use a locally downloaded file

INFO 2023-01-27 12:25:49,934 [uis.py:load_data:303] Data loaded for dataset: SDG


<unesco_reader.uis.UIS at 0x226c5d54a60>
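One way to combine the two options is to use the local file when it exists and fall back to downloading otherwise. The following is a minimal sketch: `load_data_kwargs` is a hypothetical helper (not part of `unesco_reader`), and the path `data/SDG.zip` is only an illustration. The only package detail it relies on is the `local_path` argument mentioned above.

```python
from pathlib import Path


def load_data_kwargs(zip_path):
    """Build keyword arguments for `load_data` (hypothetical helper).

    Returns {"local_path": ...} when the file exists, otherwise an empty dict
    so that `load_data()` falls back to downloading.
    """
    path = Path(zip_path)
    if path.is_file():
        return {"local_path": str(path)}
    return {}


# "data/SDG.zip" is an illustrative location, not a path the package defines.
kwargs = load_data_kwargs("data/SDG.zip")
print(kwargs)

# With a `UIS` object such as `sdg` from above, the call would then be:
# sdg.load_data(**kwargs)
```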

Now that the data is loaded into the object, you can start exploring it!

To get general information about the dataset, use the `info()` method.

In [19]:
sdg.info()

--------------------  ----------------------------------------------------------------------------------------
dataset_name          SDG Global and Thematic Indicators
dataset_code          SDG
dataset_category      education
regional              True
link                  https://apimgmtstzgjpfeq2u763lag.blob.core.windows.net/content/MediaLibrary/bdds/SDG.zip
available indicators  1609
available countries   241
time range            1950 - 2022
available regions     179
--------------------  ----------------------------------------------------------------------------------------


You can take a look at the available indicators

In [4]:
# return a list of available indicator codes
indicators = sdg.available_indicators()
indicators[0:6] # these are only the first 6 indicators

['ADMI.ENDOFLOWERSEC.MAT',
 'ADMI.ENDOFLOWERSEC.READ',
 'ADMI.ENDOFPRIM.MAT',
 'ADMI.ENDOFPRIM.READ',
 'ADMI.GRADE2OR3PRIM.MAT',
 'ADMI.GRADE2OR3PRIM.READ']

In [6]:
# get the names of indicators
indicator_names = sdg.available_indicators(as_names=True)
indicator_names[0:3] # these are the first 3 indicators

[' Administration of a nationally-representative learning assessment at the end of lower secondary education in mathematics (number)',
 ' Administration of a nationally-representative learning assessment at the end of lower secondary education in reading (number)',
 ' Administration of a nationally-representative learning assessment at the end of primary in mathematics (number)']
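With more than a thousand indicators, a keyword search over the names is often quicker than scrolling. Here is a minimal sketch using plain Python: `search_names` is a hypothetical helper, and the three names are copied from the output above (including their leading space, which the helper strips).

```python
# First three names copied from the output above, including the leading space.
indicator_names = [
    " Administration of a nationally-representative learning assessment at the end of lower secondary education in mathematics (number)",
    " Administration of a nationally-representative learning assessment at the end of lower secondary education in reading (number)",
    " Administration of a nationally-representative learning assessment at the end of primary in mathematics (number)",
]


def search_names(names, keyword):
    """Return the names containing `keyword` (case-insensitive), whitespace trimmed."""
    keyword = keyword.lower()
    return [name.strip() for name in names if keyword in name.lower()]


matches = search_names(indicator_names, "mathematics")
print(len(matches))  # 2
```

In a live session, the full list returned by `sdg.available_indicators(as_names=True)` could be passed in place of `indicator_names`.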

You can explore the countries that are available

In [8]:
# get a list of available countries
countries = sdg.available_countries()
countries[0:10] # these are only the first 10 countries

['AFG', 'ALB', 'DZA', 'ASM', 'AND', 'AGO', 'AIA', 'ATG', 'ARG', 'ARM']

In [9]:
# get the list of countries as country names
country_names = sdg.available_countries(as_names=True)
country_names[0:6]

['Afghanistan', 'Albania', 'Algeria', 'American Samoa', 'Andorra', 'Angola']

In [10]:
# You can also see which countries belong to a particular region
# here we will see which countries belong to the World Bank's country grouping for Small States
small_state_countries = sdg.available_countries(as_names=True, region='WB: Small states')
small_state_countries

['Antigua and Barbuda',
 'Bahamas',
 'Bahrain',
 'Barbados',
 'Belize',
 'Bhutan',
 'Botswana',
 'Solomon Islands',
 'Palau',
 'Brunei Darussalam',
 'Cabo Verde',
 'Comoros',
 'Cyprus',
 'Dominica',
 'Equatorial Guinea',
 'Estonia',
 'Fiji',
 'Djibouti',
 'Gabon',
 'Gambia',
 'Kiribati',
 'Grenada',
 'Guyana',
 'Iceland',
 'Jamaica',
 'Lesotho',
 'Maldives',
 'Malta',
 'Mauritius',
 'Micronesia (Federated States of)',
 'Montenegro',
 'Namibia',
 'Nauru',
 'Vanuatu',
 'Marshall Islands',
 'Guinea-Bissau',
 'Timor-Leste',
 'Qatar',
 'Saint Kitts and Nevis',
 'Saint Lucia',
 'Saint Vincent and the Grenadines',
 'San Marino',
 'Sao Tome and Principe',
 'Seychelles',
 'Suriname',
 'Eswatini',
 'Tonga',
 'Tuvalu',
 'Trinidad and Tobago',
 'Samoa']
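A region's country list can also be used to check which countries from your own selection belong to it. The sketch below uses a hand-copied subset of the `WB: Small states` names above and a hypothetical `countries_of_interest` list. Because it is only a subset, a real check should use the full list returned by `available_countries(as_names=True, region=...)`.

```python
# A subset of the 'WB: Small states' names from the output above.
small_states = {"Fiji", "Malta", "Bhutan", "Iceland", "Jamaica"}

# Hypothetical selection of countries to check.
countries_of_interest = ["Fiji", "Albania", "Malta"]

# Keep the original order of the selection while testing membership in the set.
in_region = [c for c in countries_of_interest if c in small_states]
not_in_region = [c for c in countries_of_interest if c not in small_states]

print(in_region)      # ['Fiji', 'Malta']
print(not_in_region)  # ['Albania']
```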

In [11]:
# you can also see the regions that are available.
# Note that some datasets may not have regional data, so calling this function may raise an error explaining that regional data is not available
regions = sdg.available_regions()
regions[0:6] # these are only the first 6 regions

['AIMS: Asia and the Pacific',
 'AIMS: Central Asia',
 'AIMS: East Asia',
 'AIMS: East Asia and the Pacific',
 'AIMS: Pacific',
 'AIMS: South and West Asia']

To get the data, use the `get_data()` method.

In [13]:
df = sdg.get_data()
print(df.head())

             INDICATOR_ID                                     INDICATOR_NAME  \
0  ADMI.ENDOFLOWERSEC.MAT   Administration of a nationally-representative...   
1  ADMI.ENDOFLOWERSEC.MAT   Administration of a nationally-representative...   
2  ADMI.ENDOFLOWERSEC.MAT   Administration of a nationally-representative...   
3  ADMI.ENDOFLOWERSEC.MAT   Administration of a nationally-representative...   
4  ADMI.ENDOFLOWERSEC.MAT   Administration of a nationally-representative...   

  COUNTRY_ID COUNTRY_NAME  YEAR  VALUE  
0        ABW        Aruba  2014    0.0  
1        ABW        Aruba  2015    0.0  
2        ABW        Aruba  2016    0.0  
3        ABW        Aruba  2017    0.0  
4        ABW        Aruba  2018    0.0  
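Since `get_data()` returns a regular pandas DataFrame, standard pandas filtering applies. The sketch below rebuilds the five rows printed above by hand (so it runs without downloading anything) and selects one country and a range of years. The column names are taken from the output above, and the filter values are only illustrative.

```python
import pandas as pd

# Rows rebuilt from the printed output above; a real session would use sdg.get_data().
df = pd.DataFrame(
    {
        "INDICATOR_ID": ["ADMI.ENDOFLOWERSEC.MAT"] * 5,
        "COUNTRY_ID": ["ABW"] * 5,
        "COUNTRY_NAME": ["Aruba"] * 5,
        "YEAR": [2014, 2015, 2016, 2017, 2018],
        "VALUE": [0.0] * 5,
    }
)

# Boolean masks combined with `&` keep rows that satisfy both conditions.
subset = df[(df["COUNTRY_ID"] == "ABW") & (df["YEAR"] >= 2016)]
print(subset[["COUNTRY_ID", "YEAR", "VALUE"]])
```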


In [14]:
# if you are interested in regional data, you can specify the grouping
df = sdg.get_data(grouping='regional')
print(df.head())

  INDICATOR_ID                                     INDICATOR_NAME  \
0  AIR.1.GLAST  Gross intake ratio to the last grade of primar...   
1  AIR.1.GLAST  Gross intake ratio to the last grade of primar...   
2  AIR.1.GLAST  Gross intake ratio to the last grade of primar...   
3  AIR.1.GLAST  Gross intake ratio to the last grade of primar...   
4  AIR.1.GLAST  Gross intake ratio to the last grade of primar...   

                    REGION_ID  YEAR     VALUE  
0  AIMS: Asia and the Pacific  1970  71.41793  
1  AIMS: Asia and the Pacific  1971  71.43174  
2  AIMS: Asia and the Pacific  1972  71.86948  
3  AIMS: Asia and the Pacific  1973  72.56180  
4  AIMS: Asia and the Pacific  1974  72.72099  
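The regional data comes in long format, with one row per indicator, region, and year. For comparing regions side by side, a wide table can be easier to read. The following sketch rebuilds the five rows printed above by hand and reshapes them with `DataFrame.pivot`. With the full dataset you would first filter to a single indicator, since `pivot` requires each (year, region) pair to be unique.

```python
import pandas as pd

# Rows rebuilt from the printed output above; a real session would use
# sdg.get_data(grouping='regional').
regional = pd.DataFrame(
    {
        "INDICATOR_ID": ["AIR.1.GLAST"] * 5,
        "REGION_ID": ["AIMS: Asia and the Pacific"] * 5,
        "YEAR": [1970, 1971, 1972, 1973, 1974],
        "VALUE": [71.41793, 71.43174, 71.86948, 72.56180, 72.72099],
    }
)

# One row per year, one column per region.
wide = regional.pivot(index="YEAR", columns="REGION_ID", values="VALUE")
print(wide)
```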


In [16]:
# You can also include metadata in the returned DataFrame
df = sdg.get_data(include_metadata=True)
print(df.head())

             INDICATOR_ID COUNTRY_ID  YEAR  VALUE MAGNITUDE QUALIFIER  \
0  ADMI.ENDOFLOWERSEC.MAT        ABW  2014    0.0       NaN       NaN   
1  ADMI.ENDOFLOWERSEC.MAT        ABW  2015    0.0       NaN       NaN   
2  ADMI.ENDOFLOWERSEC.MAT        ABW  2016    0.0       NaN       NaN   
3  ADMI.ENDOFLOWERSEC.MAT        ABW  2017    0.0       NaN       NaN   
4  ADMI.ENDOFLOWERSEC.MAT        ABW  2018    0.0       NaN       NaN   

  COUNTRY_NAME                                     INDICATOR_NAME  \
0        Aruba   Administration of a nationally-representative...   
1        Aruba   Administration of a nationally-representative...   
2        Aruba   Administration of a nationally-representative...   
3        Aruba   Administration of a nationally-representative...   
4        Aruba   Administration of a nationally-representative...   

  Different Coverage Source Under Coverage  
0                NaN    NaN            NaN  
1                NaN    NaN            NaN  
2          

Much more functionality is coming soon! If you have suggestions to improve or add to the package, please contribute by opening an issue!