This notebook extracts the issuers and ISINs of the green bonds in order to build a data set of all companies that have issued green bonds, which we then use to assemble the matching conventional bond data. The ISIN lets us pull in external credit rating data if needed, and the URLs in the matched data will allow us to web scrape the relevant spread data at the end.
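
Because the ISINs are meant to serve as keys for external rating data, it may help to normalize and sanity-check them first. The sketch below is not part of the original workflow: the helper names `normalize_isin` and `is_valid_isin` are hypothetical, and it assumes the ISINs arrive in lower case, as they appear in the spreadsheet. It upper-cases each value and verifies the standard ISIN check digit (Luhn algorithm applied after expanding letters to numbers).

```python
import re

# Two-letter prefix, nine alphanumeric characters, one numeric check digit
ISIN_RE = re.compile(r"[A-Z]{2}[A-Z0-9]{9}[0-9]")


def normalize_isin(raw: str) -> str:
    """Strip whitespace and upper-case, e.g. 'xs2694874533' -> 'XS2694874533'."""
    return raw.strip().upper()


def is_valid_isin(isin: str) -> bool:
    """Check the ISIN format and its Luhn check digit."""
    if not ISIN_RE.fullmatch(isin):
        return False
    # Letters expand to two digits each: A=10, B=11, ..., Z=35
    digits = "".join(str(int(ch, 36)) for ch in isin)
    total = 0
    # Walk from the right; double every second digit, starting next to the check digit
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# ISINs taken from the spreadsheet sample
for raw in ["xs2694874533", "xs1702729275"]:
    isin = normalize_isin(raw)
    print(isin, is_valid_isin(isin))
```

A failed check usually points to a truncated or mistyped identifier, which is cheaper to catch here than after a failed lookup.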

In [13]:
import pandas as pd
import re

# Load the Excel file, specifying the sheet name
excel_file = pd.ExcelFile("Green_Bond_URLs.xlsx")  # Create an ExcelFile object
df = excel_file.parse("Data")  # Parse the "Data" sheet into a DataFrame
print(df.head())

                                                 URL          ISIN  \
0  https://www.boerse-frankfurt.de/anleihe/xs2694...  xs2694874533   
1  https://www.boerse-frankfurt.de/anleihe/xs1702...  xs1702729275   
2  https://www.boerse-frankfurt.de/anleihe/xs2482...  xs2482887879   
3  https://www.boerse-frankfurt.de/anleihe/de000a...  de000a3lh6t7   
4  https://www.boerse-frankfurt.de/anleihe/xs2694...  xs2694872594   

                        Company + Kupon and Maturity  
0                -volkswagen-leasing-gmbh-4-75-23-31  
1         -e-on-international-finance-b-v-1-25-17-27  
2                                 -rwe-ag-2-75-22-30  
3  -mercedes-benz-international-finance-b-v-3-5-2...  
4               -volkswagen-leasing-gmbh-4-625-23-29  
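
The combined column shown above packs the issuer slug, the coupon, and two two-digit years into one hyphen-separated string. As a compact cross-check on the two-step split in the next cell, here is a minimal sketch of a single-pattern alternative using named groups. The function name `parse_slug`, the group names, and the `century` parameter are hypothetical; reading both years as 20xx is an assumption that would need checking against the actual bonds. Issuer slugs that themselves end in a number would be ambiguous under either approach.

```python
import re

# Hypothetical single-pattern alternative; group names are illustrative
SLUG_RE = re.compile(
    r"^-?(?P<company>.+?)"          # issuer slug (shortest match that lets the rest fit)
    r"-(?P<coupon>\d+(?:-\d+)?)"    # coupon: integer part plus optional decimals
    r"-(?P<start>\d{2})"            # first two-digit year
    r"-(?P<end>\d{2})$"             # second two-digit year
)


def parse_slug(slug, century=2000):
    """Return the parsed parts of a slug as a dict, or None if it does not fit."""
    m = SLUG_RE.match(slug)
    if m is None:
        return None
    return {
        "company": m["company"],
        "coupon": float(m["coupon"].replace("-", ".")),  # '4-75' -> 4.75
        "start_year": century + int(m["start"]),         # assumes 20xx
        "end_year": century + int(m["end"]),             # assumes 20xx
    }


# Sample slugs copied from the output above
for slug in [
    "-volkswagen-leasing-gmbh-4-75-23-31",
    "-e-on-international-finance-b-v-1-25-17-27",
    "-rwe-ag-2-75-22-30",
]:
    print(parse_slug(slug))
```

Converting the coupon to a float and the years to integers at parse time keeps later filtering and sorting numeric rather than string-based.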


In [14]:
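def split_coupon_maturity(text):
    """Split a URL slug into (company, 'coupon-start-end'); the second item is None if absent."""
    if not isinstance(text, str):  # Empty Excel cells arrive as NaN
        return None, None
    # Trailing run of hyphen-separated numbers, e.g. "4-75-23-31"
    match = re.search(r"(\d+(?:\.\d+)?(?:-\d+)+)$", text)
    if match:
        coupon_maturity = match.group(1)
        company = text[:match.start()].rstrip('-')  # Exclude trailing hyphens
        return company, coupon_maturity
    return text.rstrip('-'), None  # No numeric tail found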

df[['Company', 'Kupon_Maturity']] = df['Company + Kupon and Maturity'].apply(split_coupon_maturity).tolist()

df['Company'] = df['Company'].str.replace(r'^-', '', regex=True)

# Split Kupon_Maturity into Coupon, Maturity_Start, and Maturity_End
def split_coupon_maturity_details(text):
    """Split 'coupon-start-end' (e.g. '4-75-23-31') into (coupon, start, end) strings."""
    if isinstance(text, str):  # Skip None/NaN from rows without a numeric tail
        parts = text.split('-')
        if len(parts) >= 3:  # Need at least a coupon and the two year parts
            maturity_start = parts[-2]
            maturity_end = parts[-1]
            coupon = '.'.join(parts[:-2])  # Remaining parts form the coupon, e.g. ['4', '75'] -> '4.75'
            return coupon, maturity_start, maturity_end
    return None, None, None  # Handle cases where splitting fails

df[['Coupon', 'Maturity_Start', 'Maturity_End']] = df['Kupon_Maturity'].apply(split_coupon_maturity_details).tolist()

# Save the processed DataFrame to a new CSV file (the source Excel workbook is not modified)
df.to_csv("Green_Bond_URLs_Processed.csv", index=False)

print("Data processed and saved to Green_Bond_URLs_Processed.csv")

# Display the DataFrame with the separated columns, including the original
print(df[['Company + Kupon and Maturity', 'Company', 'Coupon', 'Maturity_Start', 'Maturity_End']].head())

# Write the CSV again (same content as the earlier save)
df.to_csv("Green_Bond_URLs_Processed.csv", index=False)

print("Data processed and saved to Green_Bond_URLs_Processed.csv")

# Display the DataFrame with the separated columns (optional)
print(df[['Company', 'Kupon_Maturity']].head())

Data processed and saved to Green_Bond_URLs_Processed.csv
                        Company + Kupon and Maturity  \
0                -volkswagen-leasing-gmbh-4-75-23-31   
1         -e-on-international-finance-b-v-1-25-17-27   
2                                 -rwe-ag-2-75-22-30   
3  -mercedes-benz-international-finance-b-v-3-5-2...   
4               -volkswagen-leasing-gmbh-4-625-23-29   

                                   Company Coupon Maturity_Start Maturity_End  
0                  volkswagen-leasing-gmbh   4.75             23           31  
1           e-on-international-finance-b-v   1.25             17           27  
2                                   rwe-ag   2.75             22           30  
3  mercedes-benz-international-finance-b-v    3.5             23           26  
4                  volkswagen-leasing-gmbh  4.625             23           29  
Data processed and saved to Green_Bond_URLs_Processed.csv
                                   Company Kupon_Maturity
0        