SDG Mapper API
https://knowsdgs.jrc.ec.europa.eu/sdgmapper#learn

The online tool lets you submit documents in large batches, but it seems to have an unofficial limit of 200 documents per batch.
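If a corpus is larger than that apparent limit, one option is to split the list of documents into groups of at most 200 before uploading. The sketch below is only an illustration: the `batches` helper and the file names are hypothetical, and the value 200 is the unofficial limit observed above, not a documented one.

```python
from typing import Iterator, List, Sequence

# Apparent (unofficial) per-batch limit noted above; an assumption, not a documented value
BATCH_SIZE = 200


def batches(paths: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths` with at most `size` items each."""
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


# Hypothetical file names, for illustration only
docs = [f"doc_{n:04d}.pdf" for n in range(450)]
chunks = list(batches(docs))
print([len(c) for c in chunks])  # [200, 200, 50]
```

Each chunk can then be submitted as a separate batch, and the resulting exports combined before further processing.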

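This is good, but the results need to be reprocessed: within each document the SDGs are listed in order of percentage, so a given column does not refer to the same SDG from one document to the next.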

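The code in the next cell fixes this.
The first row of the file holds the variable names. After that, each document takes up a group of consecutive rows. In the original export that the code reads, a group has three rows: the SDGs, their counts, and the corresponding percentages, so rows 2 to 4 describe the first document. (If the count rows are removed by hand, each document takes two rows instead: row 2 gives the SDGs and row 3 gives the percentage for each of those SDGs.) The goal is to order each document's values by SDG number instead of by percentage.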


In [17]:
# Here we use the original format of the results without manually adjusting it first.
# Just need to create the .csv input file first from the SDG Mapper results.

import pandas as pd

# Load the CSV file
df = pd.read_csv('/Users/mlafleur/Projects/SDGfusion/classifier results/sdg-mapper/sdgmapper-input.csv')

# Output columns in SDG order: 'Document', 'SDG 1', ..., 'SDG 17'
columns = ['Document'] + [f'SDG {n}' for n in range(1, 18)]

# Collect one dict per document; building the DataFrame once at the end
# avoids calling pd.concat inside the loop
rows = []

# Iterate through the DataFrame three rows at a time (SDGs, counts, percentages);
# stopping at len(df) - 2 skips an incomplete group at the end of the file
for i in range(0, len(df) - 2, 3):
    doc_name = df.loc[i, 'Document']
    sdgs = df.loc[i, '1st':'17th'].values
    # counts = df.loc[i + 1, '1st':'17th'].values  # Uncomment if you need counts for future use
    percentages = df.loc[i + 2, '1st':'17th'].values

    # Pair SDGs with their percentages, skipping empty cells
    # (documents where fewer than 17 SDGs were detected)
    new_row = {'Document': doc_name}
    new_row.update({sdg: perc for sdg, perc in zip(sdgs, percentages) if pd.notna(sdg)})
    rows.append(new_row)

# Build the output DataFrame; the column order above puts the SDGs in numerical order.
# Labels that do not match an 'SDG n' column name are kept as extra columns at the end.
output_df = pd.DataFrame(rows)
extra = [c for c in output_df.columns if c not in columns]
output_df = output_df.reindex(columns=columns + extra)

# Write the output DataFrame to a new CSV file
output_df.to_csv('sdgmapper-output.csv', index=False)
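To check the reshaping logic without the real export, the same idea can be run on a tiny made-up table. This is a sketch under stated assumptions: the data are invented, only three rank columns are used instead of `'1st'` to `'17th'`, the `reshape` helper is hypothetical, and the labels are assumed to look like `'SDG 13'` (the actual label format in the SDG Mapper export may differ).

```python
import pandas as pd

# Only three rank columns here; the real export runs from '1st' to '17th'
ranks = ['1st', '2nd', '3rd']

# Made-up data: one document, three rows (SDGs, counts, percentages)
toy = pd.DataFrame(
    [
        ['doc-a', 'SDG 13', 'SDG 2', 'SDG 7'],
        ['doc-a', 12, 6, 2],
        ['doc-a', 60.0, 30.0, 10.0],
    ],
    columns=['Document'] + ranks,
)


def reshape(df, sdg_columns):
    """Turn three-row groups into one row per document, one column per SDG."""
    rows = []
    for i in range(0, len(df) - 2, 3):
        labels = df.loc[i, ranks[0]:ranks[-1]].values
        percentages = df.loc[i + 2, ranks[0]:ranks[-1]].values
        row = {'Document': df.loc[i, 'Document']}
        # Skip empty cells so missing SDGs simply stay NaN in the output
        row.update({l: p for l, p in zip(labels, percentages) if pd.notna(l)})
        rows.append(row)
    # The requested column order is what puts the SDGs in numerical order
    return pd.DataFrame(rows).reindex(columns=['Document'] + sdg_columns)


wide = reshape(toy, ['SDG 2', 'SDG 7', 'SDG 13'])
print(wide)
```

The output has a single row for `doc-a`, with the percentages now sitting under fixed SDG columns instead of under rank columns.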



