# Import Libraries

In [1]:
import yfinance as yf
import pandas as pd
import yesg
from datetime import datetime
import numpy as np
from tqdm import trange


# Data Retrieval

We are looking for the tickers of publicly listed Dutch companies. Using the Euronext CSV file, we obtain:

In [2]:
tickers = pd.read_csv("./datas/Euronext_Equities_2022-12-02.csv", sep=";")
# Keep the symbols quoted in EUR and append Yahoo Finance's Amsterdam suffix
tickers_amsterdam = tickers[tickers['Currency']=='EUR']['Symbol'].tolist()
tickers_amsterdam = [symbol + ".AS" for symbol in tickers_amsterdam]
print(f"We have {len(tickers_amsterdam)} stocks")

We have 168 stocks


## Price Retrieval

Now let's retrieve the price history of all available stocks.

In [3]:
tickers = yf.Tickers(tickers_amsterdam)
datas = tickers.history(period='max')
datas.index = pd.to_datetime(datas.index)

[*********************100%***********************]  168 of 168 completed

26 Failed downloads:
- EHCT.AS: No data found, symbol may be delisted
- FAGR.AS: No data found, symbol may be delisted
- AED.AS: No data found, symbol may be delisted
- RET.AS: No data found, symbol may be delisted
- SPR1T.AS: No data found, symbol may be delisted
- VAMW.AS: No data found, symbol may be delisted
- NAIW.AS: No data found, symbol may be delisted
- ADUX.AS: No data found, symbol may be delisted
- CTCT1.AS: No data found, symbol may be delisted
- WDP.AS: No data found, symbol may be delisted
- ONWD.AS: No data found, symbol may be delisted
- VAMT.AS: No data found, symbol may be delisted
- ENX.AS: No data found for this date range, symbol may be delisted
- NAITR.AS: No data found, symbol may be delisted
- SGO.AS: No data found, symbol may be delisted
- BHNDT.AS: No data found, symbol may be delisted
- EHCW.AS: No data found, symbol may be delisted
- SPR1W.AS: No data found, symbol may be delisted
- E

Let's keep only the closing price ('Close').

In [4]:
datas_price = datas['Close']

In [21]:
date_from = pd.Timestamp('2010-01-01')
data_filter = datas_price.loc[date_from:]
data_filter

Date,AALB.AS,ABN.AS,ACOMO.AS,AD.AS,ADUX.AS,ADYEN.AS,AED.AS,AF.AS,AGN.AS,AJAX.AS,...,VAMW.AS,VASTN.AS,VEON.AS,VLK.AS,VPK.AS,VTA.AS,VVY.AS,WDP.AS,WHA.AS,WKL.AS
2010-01-04,8.183500,,3.089916,6.128685,,,,10.09,2.645846,6.127661,...,,17.244049,,17.517265,20.687561,0.684595,,,21.835211,10.764404
2010-01-05,8.148952,,3.078679,6.077343,,,,10.09,2.639794,6.165899,...,,17.331297,,17.505108,20.774902,0.441419,,,22.011978,10.760954
2010-01-06,8.310165,,3.089916,6.111789,,,,10.09,2.630441,6.165899,...,,17.114992,,17.682592,20.745787,0.417630,,,21.822359,10.733355
2010-01-07,8.267945,,3.075870,5.977907,,,,10.09,2.683809,5.974709,...,,17.162252,,17.505108,20.967766,0.420273,,,21.565245,10.726456
2010-01-08,8.283298,,3.101151,6.039649,,,,10.09,2.766887,6.070304,...,,17.216785,,17.502676,20.825844,0.422916,,,21.533110,10.691953
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-12-09,38.790001,12.590,19.799999,28.549999,,1411.000000,,,4.690000,11.500000,...,,20.950001,0.595,22.900000,28.200001,4.900000,10.88,,12.870000,104.849998
2022-12-12,38.639999,12.660,19.400000,28.174999,,1440.199951,,,4.657000,11.350000,...,,20.250000,0.580,22.250000,28.370001,4.850000,10.96,,12.620000,104.650002
2022-12-13,39.509998,12.590,19.400000,28.084999,,1497.199951,,,4.715000,11.300000,...,,20.600000,0.560,22.650000,28.260000,4.980000,11.38,,12.740000,104.099998
2022-12-14,39.459999,12.405,19.400000,27.735001,,1504.000000,,,4.707000,11.400000,...,,21.200001,0.570,22.600000,27.969999,4.980000,11.26,,12.730000,104.599998


Let's drop the columns whose last value is NaN, since those stocks are no longer traded on the financial markets. We also drop the columns whose first value is NaN: those stocks have been traded for too short a time, so we do not have enough history on them.

In [35]:
# Flag the columns whose last / first price is missing
last_row_NaN = data_filter.iloc[-1].isna()
first_row_NaN = data_filter.iloc[0].isna()
# No price on the last date: the stock is no longer traded
missing_price_end = last_row_NaN[last_row_NaN].index.to_list()
# No price on the first date only: the stock has too short a history
missing_price_begin = first_row_NaN[first_row_NaN & ~last_row_NaN].index.to_list()
data_filter = data_filter.drop(columns=missing_price_end + missing_price_begin)
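
To make the filtering rule concrete, here is a minimal sketch on a toy table. The tickers (`FULL`, `DELISTED`, `RECENT`, `GAP`) and all prices are made up for illustration; only the rule itself (drop a column when its first or last value is NaN) comes from the cell above.

```python
import numpy as np
import pandas as pd

# Toy price table (made-up values): four hypothetical tickers over five days.
dates = pd.date_range("2022-01-03", periods=5, freq="B")
prices = pd.DataFrame(
    {
        "FULL": [10.0, 10.1, 10.2, 10.3, 10.4],       # complete history: kept
        "DELISTED": [5.0, 5.1, 5.2, np.nan, np.nan],  # NaN at the end: dropped
        "RECENT": [np.nan, np.nan, 7.0, 7.1, 7.2],    # NaN at the start: dropped
        "GAP": [3.0, np.nan, 3.1, 3.2, 3.3],          # NaN in the middle: kept
    },
    index=dates,
)

# A column is dropped when its first or its last value is missing.
to_drop = prices.iloc[0].isna() | prices.iloc[-1].isna()
filtered = prices.loc[:, ~to_drop]

print(filtered.columns.to_list())  # ['FULL', 'GAP']
```

Note that a gap in the middle of a series is not caught by this rule, so the remaining columns may still contain NaN values.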

Let's save this file. Here is a preview of the resulting DataFrame:

In [36]:
data_filter.to_csv('./datas/prices.csv')
data_filter.head()


Date,AALB.AS,ACOMO.AS,AD.AS,AGN.AS,AJAX.AS,AKZA.AS,AMG.AS,AMUND.AS,ARCAD.AS,ASM.AS,...,TOM2.AS,TWEKA.AS,URW.AS,VALUE.AS,VASTN.AS,VLK.AS,VPK.AS,VTA.AS,WHA.AS,WKL.AS
2010-01-04,8.1835,3.089916,6.128685,2.645846,6.127661,29.32498,8.194963,12.383609,10.344408,11.537247,...,6.60039,9.511971,68.601776,3.62446,17.244049,17.517265,20.687561,0.684595,21.835211,10.764404
2010-01-05,8.148952,3.078679,6.077343,2.639794,6.165899,29.102234,8.169353,12.383609,10.731913,11.630808,...,6.921458,9.702949,68.51342,3.983885,17.331297,17.505108,20.774902,0.441419,22.011978,10.760954
2010-01-06,8.310165,3.089916,6.111789,2.630441,6.165899,29.177523,8.08016,12.383609,10.791023,11.788898,...,6.905454,9.746504,69.109764,4.757104,17.114992,17.682592,20.745787,0.41763,21.822359,10.733355
2010-01-07,8.267945,3.07587,5.977907,2.683809,5.974709,28.703781,8.124314,12.383609,10.804158,11.511437,...,7.026479,9.602432,68.380913,5.134651,17.162252,17.505108,20.967766,0.420273,21.565245,10.726456
2010-01-08,8.283298,3.101151,6.039649,2.766887,6.070304,28.499855,8.106655,12.383609,10.731913,11.92763,...,7.065488,9.629236,68.402985,5.43669,17.216785,17.502676,20.825844,0.422916,21.53311,10.691953


## ESG Score Retrieval

To build our stock portfolio, we need the ESG scores of all available companies.

In [39]:
# NOTE: the frame is sized on every downloaded ticker, so rows beyond len(tickers_price) stay empty
esg_scores = pd.DataFrame(columns = ['Ticker Yahoo', 'Environment Score', 'Social Score', 'Governance Score', 'Total Score'], index = range(len(datas['Close'].columns)))

tickers_price = data_filter.columns.to_list()

for i in trange(len(tickers_price)):
    ticker = tickers_price[i]
    try:
        sus = yf.Ticker(ticker).sustainability
        scores = sus.loc[['environmentScore','socialScore','governanceScore','totalEsg'],'Value']
        esg_scores.loc[i] = [ticker, *scores.to_list()]
    except Exception:
        # No sustainability data for this ticker: leave the row empty
        pass
esg_scores

100%|██████████| 60/60 [10:21<00:00, 10.36s/it]


Unnamed: 0,Ticker Yahoo,Environment Score,Social Score,Governance Score,Total Score
0,,,,,
1,,,,,
2,AD.AS,6.82,9.63,4.35,20.8
3,AGN.AS,0.51,7.74,6.63,14.88
4,,,,,
...,...,...,...,...,...
163,,,,,
164,,,,,
165,,,,,
166,,,,,


Let's save this file in the datas folder.

In [40]:
esg_scores.to_csv('./datas/esg_scores.csv')
esg_scores.head()

Unnamed: 0,Ticker Yahoo,Environment Score,Social Score,Governance Score,Total Score
0,,,,,
1,,,,,
2,AD.AS,6.82,9.63,4.35,20.8
3,AGN.AS,0.51,7.74,6.63,14.88
4,,,,,


Many ESG scores are clearly missing, so we will have to look up the remaining scores by hand.
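
One possible way to organise that manual step is sketched below: list the tickers that still lack a score, then merge in a hand-built table. This is only an illustration. `DEMO.AS` is a made-up ticker, the `manual` table and its values are placeholders rather than real data, and only the `AD.AS` and `AGN.AS` rows reuse scores shown above.

```python
import numpy as np
import pandas as pd

columns = ["Ticker Yahoo", "Environment Score", "Social Score",
           "Governance Score", "Total Score"]

# Tickers we want scores for; "DEMO.AS" is a made-up ticker for illustration.
tickers_price = ["DEMO.AS", "AD.AS", "AGN.AS"]

# Stand-in for the downloaded table: a failed download leaves an all-NaN row.
esg_scores = pd.DataFrame(
    [
        [np.nan, np.nan, np.nan, np.nan, np.nan],
        ["AD.AS", 6.82, 9.63, 4.35, 20.8],
        ["AGN.AS", 0.51, 7.74, 6.63, 14.88],
    ],
    columns=columns,
)

# Keep only the rows that were actually filled, indexed by ticker.
found = esg_scores.dropna(subset=["Ticker Yahoo"]).set_index("Ticker Yahoo")

# Tickers that still need a manual lookup.
missing = [t for t in tickers_price if t not in found.index]
print(missing)  # ['DEMO.AS']

# Hypothetical manually collected scores (placeholder values, not real data).
manual = pd.DataFrame(
    [["DEMO.AS", 1.0, 2.0, 3.0, 6.0]], columns=columns
).set_index("Ticker Yahoo")

# Combine both sources and restore the original ticker order.
complete = pd.concat([found, manual]).reindex(tickers_price)
print(complete)
```

Indexing the result by ticker rather than by row position also makes it easier to align the scores with the price columns later on.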

# Pre-processing

We have to follow a few steps:

* Analyse the liquidity of all firms
    * Market capitalization
    * Average daily traded volume
    * Free-float share
* ESG filter
    * exclude x% of firms with the worst ESG score
    * keep firms with the best ESG momentum
    * take a specific KPI
* Financial analysis
    * Profit Margin
    * Return on assets
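
As an illustration of the first ESG filter in the list above, the sketch below excludes a given share of firms with the worst score. The tickers and scores are made up, and `exclude_worst` is a hypothetical helper. We assume here that a higher total score means a worse ESG profile (as with risk-type ratings); set `higher_is_worse=False` if the score in use works the other way round.

```python
import pandas as pd

# Made-up total ESG scores for five hypothetical tickers.
total_score = pd.Series(
    {"AAA": 12.0, "BBB": 31.5, "CCC": 18.2, "DDD": 25.0, "EEE": 9.4},
    name="Total Score",
)


def exclude_worst(scores: pd.Series, share: float, higher_is_worse: bool = True) -> pd.Series:
    """Drop the `share` fraction of firms with the worst score."""
    n_drop = int(len(scores) * share)  # number of firms to exclude (rounded down)
    # Sort so that the worst firms come last, then cut them off.
    ranked = scores.sort_values(ascending=higher_is_worse)
    return ranked.iloc[: len(ranked) - n_drop]


kept = exclude_worst(total_score, share=0.2)
print(kept.index.to_list())  # ['EEE', 'AAA', 'CCC', 'DDD']
```

Firms without a score need a separate decision (exclude them, or fill the score in by hand) before applying such a filter.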

We can also analyse the correlation between our chosen stocks.
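
A minimal sketch of such a correlation analysis is given below. The prices and tickers are made up; with the real data, `prices` would be replaced by the filtered closing prices. Correlations are computed on daily returns rather than on price levels, which is a common choice but an assumption on our part.

```python
import pandas as pd

# Made-up closing prices for three hypothetical tickers.
prices = pd.DataFrame(
    {
        "AAA": [100.0, 102.0, 101.0, 105.0, 107.0],
        "BBB": [50.0, 51.0, 50.5, 52.5, 53.5],  # always half of AAA: same returns
        "CCC": [20.0, 19.5, 20.5, 19.8, 20.1],
    }
)

# Daily simple returns; the first row is NaN and is dropped.
returns = prices.pct_change().dropna()

# Pairwise Pearson correlation of the returns.
corr = returns.corr()
print(corr.round(2))
```

Pairs with a correlation close to 1 bring little diversification, so they are natural candidates for keeping only one of the two stocks.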

Once we have selected the stocks for our portfolio, we need to find the best weights. We will use two different methods:
* Mean-variance method
* Black-Litterman method
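
To give an idea of the first method, here is a minimal mean-variance sketch using the textbook closed-form solutions for a fully invested portfolio. The expected returns and covariance matrix are made-up numbers, the risk-free rate is assumed to be zero, and short positions are allowed. It is not the final implementation, and the Black-Litterman method is not covered here.

```python
import numpy as np

# Made-up annualised expected returns and covariance matrix for three assets.
mu = np.array([0.06, 0.08, 0.10])
cov = np.array(
    [
        [0.040, 0.006, 0.004],
        [0.006, 0.090, 0.010],
        [0.004, 0.010, 0.160],
    ]
)

ones = np.ones(len(mu))

# Minimum-variance portfolio: weights proportional to inv(cov) @ 1.
w_min = np.linalg.solve(cov, ones)
w_min /= w_min.sum()

# Maximum Sharpe ratio (tangency) portfolio with a zero risk-free rate:
# weights proportional to inv(cov) @ mu.
w_tan = np.linalg.solve(cov, mu)
w_tan /= w_tan.sum()

for name, w in [("min variance", w_min), ("tangency", w_tan)]:
    ret = float(w @ mu)
    vol = float(np.sqrt(w @ cov @ w))
    print(name, w.round(3), "return:", round(ret, 4), "vol:", round(vol, 4))
```

With real data, `mu` and `cov` would be estimated from the return history, and constraints such as no short selling would require a numerical optimiser instead of these closed forms.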