# **SDG Prediction**

## **Dependencies**

In [17]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import pandas as pd
import numpy as np

## **SDG Classifier**

### Load Model
Model card: https://huggingface.co/jonas/sdg_classifier_osdg

In [6]:
# use_auth_token=True reads the token saved by `huggingface-cli login` instead of hard-coding it.
model = AutoModelForSequenceClassification.from_pretrained("jonas/sdg_classifier_osdg", use_auth_token=True)


### Load CSV

In [11]:
df = pd.read_csv("../../../src/transformed/transformed_eib.csv")
df.head(1)

Unnamed: 0,iati_id,title_en,title_other,title_main,organization,country_code,country,region,location,description_en,...,planned_start,actual_start,planned_end,actual_end,last_update,crs_5_code,crs_5_name,crs_3_code,crs_3_name,docs
0,XM-DAC-918-3-20160430-86400,RLRS LOAN FOR SMES AND OTHER PRIORITIES II,,RLRS LOAN FOR SMES AND OTHER PRIORITIES II,European Investment Bank,['RS'],RS;,,,Loan for financing small and medium-sized ente...,...,,2020-12-11T00:00:00Z,2023-12-16T00:00:00Z,2021-01-08T00:00:00Z,2024-02-15T09:32:54Z,,,,,['http://www.eib.org/en/registers/all/index.ht...


### Load SDG CSV

In [43]:
sdg_df = pd.read_csv("../../../src/codelists/sdg_goals.csv")
sdg_df.head(1)

Unnamed: 0,code,name,description,language,category,category-name,category-description
0,1,Goal 1. End poverty in all its forms everywhere,,en,,,
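The codelist maps each numeric SDG code to its goal name. As a minimal sketch, the mapping can be built once as a dictionary rather than filtered on every lookup. The helper name `load_code_to_name` and the inline sample are illustrative assumptions; the sample only mirrors the column layout shown above, with goal names copied from this notebook's output.

```python
import csv
import io

# Illustrative sample in the codelist's column layout (not the real file).
SAMPLE_CSV = """\
,code,name,description,language,category,category-name,category-description
0,1,Goal 1. End poverty in all its forms everywhere,,en,,,
1,4,"Goal 4. Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all",,en,,,
2,14,"Goal 14. Conserve and sustainably use the oceans, seas and marine resources for sustainable development",,en,,,
"""


def load_code_to_name(handle):
    """Build a {code: name} dictionary from a codelist CSV file handle (hypothetical helper)."""
    reader = csv.DictReader(handle)
    return {int(row["code"]): row["name"] for row in reader}


code_to_name = load_code_to_name(io.StringIO(SAMPLE_CSV))
print(code_to_name[1])
```

With the real file, the same helper could be called on `open("../../../src/codelists/sdg_goals.csv", newline="")`, assuming the file follows the layout shown above.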


### Apply Model

In [45]:
# use_auth_token=True reads the token saved by `huggingface-cli login` instead of hard-coding it.
tokenizer = AutoTokenizer.from_pretrained("jonas/sdg_classifier_osdg", use_auth_token=True)

# Result columns; the "sgd" spelling is kept so existing column names stay stable.
df["sgd_pred_code"] = "NaN"
df["sgd_pred_str"] = "NaN"

for index, row in df.iterrows():
    descr_row = row['description_main']
    try:
        # pandas reads missing values as float NaN; keep the "NaN" defaults for those rows.
        if isinstance(descr_row, float):
            continue

        # Tokenize the description and run the classifier.
        inputs = tokenizer(descr_row, return_tensors="pt")
        sdg_pred = model(**inputs)

        # Pick the most probable class; class indices start at 0, SDG codes at 1.
        sdg_np = sdg_pred.logits[0].detach().numpy()
        sdg_code = int(sdg_np.argmax()) + 1

        # Map the SDG code to its goal name.
        sdg_translation = sdg_df.loc[sdg_df['code'] == sdg_code, 'name'].values[0]

        # Write with .at so the values land in df itself, not in a temporary copy.
        df.at[index, "sgd_pred_code"] = sdg_code
        df.at[index, "sgd_pred_str"] = sdg_translation
    except Exception as e:
        print(f"{e}: {descr_row}")

df.head()

14 Goal 14. Conserve and sustainably use the oceans, seas and marine resources for sustainable development
4 Goal 4. Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all
8 Goal 8. Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all
8 Goal 8. Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all



Unnamed: 0,iati_id,title_en,title_other,title_main,organization,country_code,country,region,location,description_en,...,actual_end,last_update,crs_5_code,crs_5_name,crs_3_code,crs_3_name,docs,sgd_pred,sgd_pred_code,sgd_pred_str
0,XM-DAC-918-3-20160430-86400,RLRS LOAN FOR SMES AND OTHER PRIORITIES II,,RLRS LOAN FOR SMES AND OTHER PRIORITIES II,European Investment Bank,['RS'],RS;,,,Loan for financing small and medium-sized ente...,...,2021-01-08T00:00:00Z,2024-02-15T09:32:54Z,,,,,['http://www.eib.org/en/registers/all/index.ht...,"{'logits': [[tensor(0.0060, grad_fn=<UnbindBac...",14,Goal 14. Conserve and sustainably use the ocea...
1,XM-DAC-918-3-20160434-86405,BMCE LIGNE VERTE,,BMCE LIGNE VERTE,European Investment Bank,['MA'],MA;,,,The EIB loan will co-finance solid waste manag...,...,2021-01-08T00:00:00Z,2024-02-15T09:32:54Z,,,,,['http://www.eib.org/en/registers/all/index.ht...,"{'logits': [[tensor(-1.5721, grad_fn=<UnbindBa...",4,Goal 4. Ensure inclusive and equitable quality...
2,XM-DAC-918-3-20160453-86438,KENYA AGRICULTURE VALUE CHAIN,,KENYA AGRICULTURE VALUE CHAIN,European Investment Bank,['KE'],KE;,,,Multibeneficiary intermediated loan to be blen...,...,2021-01-08T00:00:00Z,2024-02-15T09:32:54Z,,,,,['http://www.eib.org/en/registers/all/index.ht...,"{'logits': [[tensor(0.0430, grad_fn=<UnbindBac...",8,"Goal 8. Promote sustained, inclusive and susta..."
3,XM-DAC-918-3-20160453-92653,KENYA AGRICULTURE VALUE CHAIN FACILITY EQUITY ...,,KENYA AGRICULTURE VALUE CHAIN FACILITY EQUITY ...,European Investment Bank,['KE'],KE;,,,Multibeneficiary intermediated loan to be blen...,...,2021-01-08T00:00:00Z,2024-02-15T09:32:54Z,,,,,['http://www.eib.org/en/registers/all/index.ht...,"{'logits': [[tensor(0.0430, grad_fn=<UnbindBac...",8,"Goal 8. Promote sustained, inclusive and susta..."
4,XM-DAC-918-3-20160466-86480,WEST AFRICA MICROFINANCE FACILITY (MC MALI),,WEST AFRICA MICROFINANCE FACILITY (MC MALI),European Investment Bank,,,['289'],,Framework credit line of up to EUR 50 m to pro...,...,2021-01-08T00:00:00Z,2024-02-15T09:32:54Z,,,,,['http://www.eib.org/en/registers/all/index.ht...,"{'logits': [[tensor(0.2423, grad_fn=<UnbindBac...",14,Goal 14. Conserve and sustainably use the ocea...
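
The prediction step keeps only the index of the largest logit. As a rough sketch, a softmax over the logits also gives a probability for the chosen class, which may help when judging how much weight to put on a predicted goal. The function names `softmax` and `top_sdg` and the example logits are assumptions for illustration; the `+ 1` offset follows the `argmax() + 1` step used in the notebook.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities, shifting by the maximum for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_sdg(logits):
    """Return (sdg_code, probability) for the most likely class.

    Assumes class index i corresponds to SDG code i + 1.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best + 1, probs[best]


# Made-up logits with three classes, for illustration only.
example_logits = [0.1, 2.0, -1.0]
code, prob = top_sdg(example_logits)
print(code, round(prob, 3))
```

In the notebook, `sdg_np.tolist()` could be passed to `top_sdg`, and the probability stored in an extra column so that low-confidence rows can be reviewed manually.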
