# SBTi-Finance Tool - Calculate portfolio coverage
This notebook provides a simple way to calculate portfolio coverage with a cut-off date specified by the user. The intention is to allow the SBTi TVT as well as financial institutions to calculate the portfolio coverage at a date in the past corresponding to a base date or base year for the FI's portfolio target.

This notebook currently only supports aggregation via the WATS method.

This notebook does not calculate any temperature scores.

### Install the necessary Python modules
This is only required if you have not already installed the module.

In [1]:
from datetime import datetime
import pandas as pd
import requests
import openpyxl

## Create the data directory and download the example portfolio
We have prepared dummy data so that you can run the tool as it is and familiarise yourself with how it works. To use your own data, please see the [Data Requirements section](https://sciencebasedtargets.github.io/SBTi-finance-tool/DataRequirements.html) of the technical documentation for more details on data requirements and formatting.

*The dummy data may include some company names, but the data associated with those company names is completely random, and any similarity with real-world data is purely coincidental.


In [2]:
import urllib.request
import os

if not os.path.isdir("data"):
    os.mkdir("data")
if not os.path.isfile("data/example_portfolio.csv"):
    urllib.request.urlretrieve("https://github.com/ScienceBasedTargets/SBTi-finance-tool/raw/main/examples/data/example_portfolio.csv", "data/example_portfolio.csv")

## Load your portfolio
The example portfolio is stored as a .csv file. Alternatively, you can upload an .xlsx file; just choose one of the loading options below.

You can upload your portfolio file using the folder icon on the left-hand side of the screen. Once you have uploaded your file, you can load it into the notebook using the code below.

The portfolio should have at least an "id" (the identifier of the company) and a "proportion" (the weight of the company in your portfolio, e.g. the value of the shares you hold) column. To calculate the weighted portfolio coverage, the data also needs to include identifiers for the portfolio's constituents, preferably LEI, although the SBTi data also recognizes ISIN.
The following column headers are required to upload the file to the tool. Please note that the tool will not run unless these exact headers are included:

company_name: Name of the company in your portfolio - each row must have a unique name  
company_id: Unique identifier for the company in your portfolio  
company_isin: The ISIN of the company in your portfolio, used to identify the company e.g. for SBTi status *  
company_lei: Legal Entity Identifier for the company in your portfolio, used to identify the company e.g. for SBTi status *  
investment_value: Needed to weight the portfolio coverage by the value of the investment in the company  
engagement_target: Not needed for the tool to run but is included to be compatible with the finance tool portfolio format  

\* These columns may be left blank, but they are needed to check the SBTi status of the company. If you do not have this data, leave them blank; the tool will still run, but the SBTi status will not be included in the output.

Please see the technical documentation on [Data Legends](https://sciencebasedtargets.github.io/SBTi-finance-tool/Legends.html#) for details on data requirements. 
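
Because the tool will not run without the exact headers, a quick check on the header row can catch a problem early. The following is a minimal sketch and not part of the SBTi tool: the helper name `missing_portfolio_columns` is hypothetical, and treating all six listed headers as required is an assumption based on the list above.

```python
# Hypothetical helper (not part of the SBTi tool): report missing headers.
# Assumption: all six headers listed above are treated as required.
REQUIRED_COLUMNS = [
    "company_name",
    "company_id",
    "company_isin",
    "company_lei",
    "investment_value",
    "engagement_target",
]


def missing_portfolio_columns(columns):
    """Return the required headers not present in `columns`, in list order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Example with a header row that lacks two of the required columns
example_headers = ["company_name", "company_id", "company_isin", "investment_value"]
print(missing_portfolio_columns(example_headers))
```

Once the portfolio has been loaded, and before any columns are renamed, `missing_portfolio_columns(df_portfolio.columns)` should return an empty list.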

### Load the portfolio from a CSV file or an xlsx file
Enter the path to your portfolio file inside the quotation marks below. Then remove the # at the beginning of the appropriate line and run the cell.

In [None]:
df_portfolio = pd.read_csv("data/example_portfolio.csv", encoding="iso-8859-1")
#df_portfolio = pd.read_excel("data/example_portfolio.xlsx", engine="openpyxl")

In [3]:
# Change the column names to match the API
df_portfolio.rename(columns={'company_isin': 'ISIN', 'company_lei': 'LEI'}, inplace=True)
# Check for duplicate values in the 'company_id' column
duplicate_ids = df_portfolio[df_portfolio.duplicated('company_id', keep=False)]

if not duplicate_ids.empty:
    print("Error: Duplicate values found in the 'company_id' column:")
    print(duplicate_ids)
else:
    print("No duplicate values found in the 'company_id' column.")

## Enter the date to be used in calculating the portfolio coverage
The date has to be earlier than today's date.
The format is: year, month, and day, each entered as an integer in the cell below.

In [4]:
year = 2020 #enter the year for which you want to calculate the portfolio coverage
month = 12 #enter the month for which you want to calculate the portfolio coverage
day = 31 #enter the day for which you want to calculate the portfolio coverage

In [5]:
user_date = datetime(year, month, day)
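
The requirement that the date lies in the past can also be checked explicitly. This is a minimal sketch, assuming a hypothetical helper `build_cutoff_date` that is not part of the notebook. `datetime` itself already raises a `ValueError` for impossible dates such as 31 February.

```python
from datetime import datetime


def build_cutoff_date(year, month, day, now=None):
    """Build the cut-off date and reject dates that are not in the past.

    `now` is a parameter only so the check can be exercised with a fixed
    reference time; it defaults to the current time.
    """
    now = now or datetime.now()
    cutoff = datetime(year, month, day)  # raises ValueError for invalid dates
    # Compare calendar dates so that today's date is rejected as well
    if cutoff.date() >= now.date():
        raise ValueError(
            f"Cut-off date {cutoff:%Y-%m-%d} must be earlier than today"
        )
    return cutoff


print(build_cutoff_date(2020, 12, 31, now=datetime(2024, 1, 1)))
```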

Now load the CTA file (Companies Taking Action) from the SBTi website.


In [6]:
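from io import BytesIO

CTA_FILE_URL = "https://sciencebasedtargets.org/download/target-dashboard"
# A timeout prevents the cell from hanging if the server does not respond
resp = requests.get(CTA_FILE_URL, timeout=60)
if resp.status_code != 200:
    raise ValueError(f"Could not download CTA file (HTTP status {resp.status_code})")
# pd.read_excel reads from a path or file-like object, so wrap the raw bytes
cta_file = pd.read_excel(BytesIO(resp.content))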

In [None]:
cta_file.head()

## Filter the CTA file
Filter the CTA file down to the columns needed here, including "Action" and "Target".
Only the rows where Action = Target and Target = Near-term are kept.
        

In [8]:
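# Keep only the columns needed for matching and date filtering
targets = cta_file[
    ['Company Name', 'ISIN', 'LEI', 'Action', 'Target', 'Date Published']
]
# Keep only near-term targets (Action == 'Target' and Target == 'Near-term')
df_nt_targets = targets[
    (targets['Action'] == 'Target') & (targets['Target'] == 'Near-term')
]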
        

## Filter out dates
Now keep only the targets published on or before the date entered above, and drop rows that lack an ISIN or an LEI.

In [9]:
# Convert the "Date Published" column to datetime type
df_targets = df_nt_targets.copy()
df_targets['Date Published'] = pd.to_datetime(df_targets['Date Published'])
# Filter rows based on user-entered date
filtered_df = df_targets.loc[df_targets['Date Published'] <= user_date]
filtered_df = filtered_df[filtered_df['ISIN'].notnull() & filtered_df['LEI'].notnull()]
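
To make the effect of the two filters concrete, here is the same logic applied to a few records in plain Python. The company names, identifiers, and dates are invented for illustration and do not come from the CTA file.

```python
from datetime import datetime

# Invented records mimicking the relevant CTA columns
records = [
    {"Company Name": "A", "ISIN": "XX0000000001", "LEI": "LEI-A",
     "Date Published": datetime(2019, 5, 1)},
    {"Company Name": "B", "ISIN": "XX0000000002", "LEI": "LEI-B",
     "Date Published": datetime(2021, 3, 15)},
    {"Company Name": "C", "ISIN": None, "LEI": "LEI-C",
     "Date Published": datetime(2018, 1, 10)},
]
cutoff = datetime(2020, 12, 31)

# Keep targets published on or before the cut-off that have both identifiers
kept = [
    r for r in records
    if r["Date Published"] <= cutoff
    and r["ISIN"] is not None
    and r["LEI"] is not None
]
print([r["Company Name"] for r in kept])
```

Company B is dropped because its target was published after the cut-off. Company C is dropped even though its target predates the cut-off, because the filter requires both identifiers to be present.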

## Check CTA file for companies with validated targets

In [None]:
isin_set = set(filtered_df['ISIN'])
lei_set = set(filtered_df['LEI'])

# Function to check if ISIN or LEI is validated
def is_validated(row):
    return row['ISIN'] in isin_set or row['LEI'] in lei_set

# Apply the function to create the 'validated' column
df_portfolio['validated'] = df_portfolio.apply(is_validated, axis=1)
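
The matching rule is an either/or test: a holding counts as validated if its ISIN or its LEI appears among the filtered targets. A small sketch with invented identifiers, using plain dictionaries in place of dataframe rows:

```python
# Invented identifiers standing in for the filtered CTA data
isin_set = {"XX0000000001"}
lei_set = {"LEI-A", "LEI-D"}

portfolio = [
    {"company_id": "1", "ISIN": "XX0000000001", "LEI": None},     # matches on ISIN
    {"company_id": "2", "ISIN": None, "LEI": "LEI-D"},            # matches on LEI
    {"company_id": "3", "ISIN": "XX0000000009", "LEI": "LEI-Z"},  # no match
    {"company_id": "4", "ISIN": None, "LEI": None},               # no identifiers
]


def is_validated(row):
    """True if either identifier is found among the validated targets."""
    return row["ISIN"] in isin_set or row["LEI"] in lei_set


flags = {row["company_id"]: is_validated(row) for row in portfolio}
print(flags)
```

A holding with no identifiers at all, like company 4 here, can never be matched and therefore counts as not validated.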

## Portfolio coverage

The portfolio coverage shows the proportion of the portfolio invested in companies that have set SBTi-approved GHG emissions reduction targets. Only companies with SBTi-status "Approved" are included in the portfolio coverage.

To calculate the portfolio coverage we use the same aggregation methods we use for the Portfolio Temperature Score. Currently, in this simplified notebook, only the "Weighted Average Temperature Score" (WATS) is used. For more details on aggregation methods and the portfolio coverage method, please refer to section 3.2 of the [methodology document](https://sciencebasedtargets.org/wp-content/uploads/2020/09/Temperature-Rating-Methodology-V1.pdf) and see notebook 4 (on [Colab](https://colab.research.google.com/github/OFBDABV/SBTi/blob/master/examples/4_portfolio_aggregations.ipynb) or [GitHub](https://github.com/ScienceBasedTargets/SBTi-finance-tool/blob/master/examples/4_portfolio_aggregations.ipynb)) for more aggregation examples.

In [None]:
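total_investment_weight = df_portfolio['investment_value'].sum()
# Dividing by a zero NumPy sum yields NaN or inf instead of raising
# ZeroDivisionError, so check the total explicitly.
if total_investment_weight == 0:
    raise ValueError("The portfolio weight is not allowed to be zero")
# Weight each validated company by its share of the total investment value
pc_weighted = (
    df_portfolio['investment_value'] * df_portfolio['validated']
) / total_investment_weight
pc_result = round(pc_weighted.sum(), 2)
# Format as a percentage to avoid floating-point noise from multiplying by 100
print(f"Portfolio coverage is: {pc_result:.0%}.")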
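
As a worked example of the weighting, consider three holdings with invented investment values. Coverage is the validated share of the total investment value; this sketch uses plain Python rather than the dataframe so the arithmetic is easy to follow.

```python
# Invented holdings: (investment_value, validated)
holdings = [(100.0, True), (300.0, False), (600.0, True)]

total_value = sum(value for value, _ in holdings)
if total_value == 0:
    raise ValueError("The portfolio weight is not allowed to be zero")

# Sum the investment value of validated holdings and divide by the total
covered_value = sum(value for value, validated in holdings if validated)
coverage = covered_value / total_value
print(f"Portfolio coverage is: {coverage:.0%}")
```

Here 700 of the 1,000 invested is in validated companies, so the coverage is 70%.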

## Save the portfolio
If you want to save the portfolio, you can use the code in the following cell.


In [None]:
df_portfolio.to_csv('data/validated_portfolio.csv', index=False)