# The role of passive effects in the relationship between active management and short-term performance: Evidence from mutual fund portfolio holdings
 

#### Group

- Kretz Henri
- Lin Cécile 
- Montagard Tristan
- Montariol Enzo


### Quantitative Management - Project


### Importing the required packages

In [6]:
import pandas as pd
import numpy as np
import statsmodels.api as sm 


### Importing the data

Mutual funds vs. peer synthetic portfolios:
- $w_{p,i}$ : weight of asset $i$ in mutual fund $p$ at the end of the quarter
- $r_{s,t}$ : daily return of the synthetic portfolio on day $t$ of the following quarter
    - $r_{s,t} = \sum_{i=1}^N w_{p,i}r_{i,t} - e_{p,t}$
    - $r_{i,t}$ : return of asset $i$ in excess of the risk-free asset on day $t$
    - $e_{p,t}$ : expenses borne by the emulated fund

- $r_{s,t}$ : daily return, in excess of the risk-free asset, of a passively managed synthetic portfolio that mimics the mutual fund's holdings at the end of the previous quarter
- $r_{p,t}$ : daily net return of the mutual fund in excess of the risk-free asset
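
The synthetic return formula above can be sketched on a toy example. The weights, asset returns and daily expense figure below are made up for illustration; only the formula $r_{s,t} = \sum_i w_{p,i}r_{i,t} - e_{p,t}$ comes from the notes.

```python
import pandas as pd

# Hypothetical quarter-end weights w_{p,i} of fund p in three assets (they sum to 1)
weights = pd.Series({"asset_1": 0.5, "asset_2": 0.3, "asset_3": 0.2})

# Hypothetical daily excess returns r_{i,t} (in %) for three days of the following quarter
asset_returns = pd.DataFrame(
    {
        "asset_1": [0.10, -0.20, 0.05],
        "asset_2": [0.00, 0.30, -0.10],
        "asset_3": [-0.05, 0.10, 0.20],
    },
    index=pd.to_datetime(["2000-04-03", "2000-04-04", "2000-04-05"]),
)

# Hypothetical daily expenses e_{p,t} (in %), assumed constant over the quarter
daily_expenses = 0.004

# r_{s,t} = sum_i w_{p,i} * r_{i,t} - e_{p,t}, with the weights fixed as in the formula
synthetic_returns = asset_returns.dot(weights) - daily_expenses
print(synthetic_returns)
```

`DataFrame.dot` aligns the weight labels with the return columns, so a missing or misnamed asset raises an error instead of being silently ignored.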

- active management = correlation coefficient, computed each quarter, between $r_{s,t}$ and $r_{p,t}$
    - A lower (higher) correlation during a period implies a larger (smaller) deviation between the returns of the mutual fund, obtained through active management, and the returns of the synthetic portfolio, obtained from a passive buy-and-hold strategy on the assets held by the fund at the beginning of that period.
    - The advantages of this measure are:
        - simplicity
        - standardisation: it ranges from -1 to 1
    - If the managed fund does not deviate from its comparable synthetic portfolio during a period, both portfolios experience the same daily returns, leading to a correlation of 1.
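
A minimal sketch of the quarterly correlation measure, assuming daily series named `r_p` and `r_s` (hypothetical column names) that share a date index. The data are simulated so that the fund tracks its synthetic portfolio with some noise; they are not taken from the study.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.bdate_range("2000-01-03", "2000-12-29")

# Hypothetical daily excess returns: the fund equals the synthetic portfolio plus noise
r_s = pd.Series(rng.normal(0.02, 1.0, len(dates)), index=dates, name="r_s")
r_p = (r_s + rng.normal(0.0, 0.3, len(dates))).rename("r_p")
returns = pd.concat([r_p, r_s], axis=1)

# One correlation coefficient per calendar quarter
quarters = returns.index.to_period("Q")
quarterly_corr = returns.groupby(quarters).apply(lambda q: q["r_p"].corr(q["r_s"]))
print(quarterly_corr)
```

Raising the noise scale (0.3 here) lowers the quarterly correlations, which corresponds to a more actively managed fund under this measure.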


***compare the results with the active share measure proposed by***?

- $A_{p}$ : alpha gap, i.e. the difference between the fund alpha $\alpha_p$ and the synthetic portfolio alpha $\alpha_s$
    - $A_{p} = \alpha_p - \alpha_s$
    - measures the value added by active management within the quarter.


Alphas for the fund and its synthetic portfolio are estimated with the four-factor model of Carhart (1997):
- $r_{p,t} = \alpha_p + \beta_{p,m}r_{m,t} + \beta_{p,smb}r_{smb,t} + \beta_{p,hml}r_{hml,t} + \beta_{p,wml}r_{wml,t} + \epsilon_{p,t}$
    - $r_{m,t}$ : excess market return
    - $r_{smb,t}$ : return of small-cap stocks minus the return of large-cap stocks
    - $r_{hml,t}$ : return of high book-to-market stocks minus the return of low book-to-market stocks
    - $r_{wml,t}$ : return of past winners minus the return of past losers

In [7]:
# path = "" # the data are in the same folder
# data = pd.read_csv(path + "data_exemple.csv")

In [8]:
# SIMULATION DATA
# Define the statistics from the table
stats = {
    "Risk-free asset": {"mean": 1.62, "std_dev": 0.11},
    "Mutual funds – risk-free asset": {"mean": 4.80, "std_dev": 20.97},
    "Synthetic portfolios – risk-free asset": {"mean": 5.34, "std_dev": 19.52},
    "Market – risk-free asset": {"mean": 5.34, "std_dev": 19.97},
    "SMB": {"mean": 1.67, "std_dev": 9.78},
    "HML": {"mean": 1.74, "std_dev": 10.76},
    "WML": {"mean": 4.06, "std_dev": 15.56},
}

date_range = pd.bdate_range(start="2000-01-01", end="2020-03-31", freq="B")  # Business days

# List of funds
funds = ["Fund A", "Fund B", "Fund C", "Fund D", "Fund E"]

n = len(date_range)  # Number of observations per fund
np.random.seed(42)  # For reproducibility

# One DataFrame per fund is collected in this list
dfs = []

# Simulate every series independently for each fund
# (note: the factor series are therefore redrawn for each fund rather than shared)
for fund in funds:
    data = {}
    for name, values in stats.items():
        mean, std_dev = values["mean"], values["std_dev"]
        simulated = np.random.normal(loc=mean, scale=std_dev, size=n)

        # Rescale so that the mean and (population) standard deviation match the targets exactly
        simulated = (simulated - np.mean(simulated)) / np.std(simulated) * std_dev + mean
        data[name] = simulated

    # Build the fund's DataFrame and tag it with the fund identifier
    df = pd.DataFrame(data, index=date_range)
    df["Fund"] = fund
    dfs.append(df)

# Stack all funds into a single DataFrame
df = pd.concat(dfs)
df.describe()

| | Risk-free asset | Mutual funds – risk-free asset | Synthetic portfolios – risk-free asset | Market – risk-free asset | SMB | HML | WML |
|---|---|---|---|---|---|---|---|
| count | 26410.0 | 26410.0 | 26410.0 | 26410.0 | 26410.0 | 26410.0 | 26410.0 |
| mean | 1.62 | 4.8 | 5.34 | 5.34 | 1.67 | 1.74 | 4.06 |
| std | 0.110002 | 20.970397 | 19.52037 | 19.970378 | 9.780185 | 10.760204 | 15.560295 |
| min | 1.138634 | -86.006124 | -74.395792 | -71.761602 | -38.947999 | -39.328062 | -64.006271 |
| 25% | 1.545847 | -9.228899 | -7.729272 | -8.286427 | -4.918174 | -5.543141 | -6.57213 |
| 50% | 1.620678 | 4.743149 | 5.267819 | 5.404181 | 1.646686 | 1.751698 | 4.028207 |
| 75% | 1.693797 | 18.772267 | 18.387739 | 18.92759 | 8.244025 | 9.100966 | 14.542002 |
| max | 2.05309 | 89.156709 | 91.96527 | 83.549106 | 38.607505 | 51.235285 | 69.355047 |
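
The `std` values above are slightly larger than the targets (0.110002 instead of 0.11, for example). This is consistent with the rescaling step using `np.std`, which divides by $n$, while `describe()` reports the sample standard deviation, which divides by $n-1$. A small sketch of the effect, on a hypothetical sample of 1000 draws:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
target_mean, target_std = 1.62, 0.11

x = rng.normal(target_mean, target_std, 1000)
# Same rescaling as in the simulation cell: np.std uses ddof=0 by default
x = (x - np.mean(x)) / np.std(x) * target_std + target_mean

print(np.std(x))           # population standard deviation (ddof=0), equal to the target
print(pd.Series(x).std())  # sample standard deviation (ddof=1), as reported by describe()
```

Passing `ddof=1` to `np.std` in the rescaling step would make the `describe()` output match the targets for a single fund.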


In [9]:
df

| | Risk-free asset | Mutual funds – risk-free asset | Synthetic portfolios – risk-free asset | Market – risk-free asset | SMB | HML | WML | Fund |
|---|---|---|---|---|---|---|---|---|
| 2000-01-03 | 1.674190 | 26.445394 | 49.743406 | -7.372742 | -2.735569 | 4.515154 | -2.213222 | Fund A |
| 2000-01-04 | 1.604037 | 17.545553 | -22.747081 | 36.296012 | -4.619737 | 23.433724 | -8.773684 | Fund A |
| 2000-01-05 | 1.690870 | 38.772900 | 30.157734 | 26.932638 | -8.100543 | 3.883072 | -11.761962 | Fund A |
| 2000-01-06 | 1.787579 | 13.718526 | 17.898244 | 42.510212 | 0.684834 | -0.748863 | 2.270079 | Fund A |
| 2000-01-07 | 1.593443 | -20.286616 | -4.111216 | 41.544195 | 7.479057 | 16.693566 | 6.436408 | Fund A |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2020-03-25 | 1.754272 | 15.644959 | -10.145051 | 12.570499 | 0.558858 | -6.188384 | -5.723477 | Fund E |
| 2020-03-26 | 1.580736 | 5.562279 | 17.804200 | 13.173629 | 7.048452 | -11.314103 | 6.188780 | Fund E |
| 2020-03-27 | 1.566232 | -11.888444 | 1.827020 | -22.897450 | -9.966822 | 4.641429 | -9.294370 | Fund E |
| 2020-03-30 | 1.496224 | 16.484521 | -13.889602 | 27.190179 | -1.414677 | -4.245367 | -4.352787 | Fund E |


### Data cleaning


### Model to estimate

- $r_{p,t} = \alpha_p + \beta_{p,m}r_{m,t} + \beta_{p,smb}r_{smb,t} + \beta_{p,hml}r_{hml,t} + \beta_{p,wml}r_{wml,t} + \epsilon_{p,t}$
    - $r_{m,t}$ : excess market return
    - $r_{smb,t}$ : return of small-cap stocks minus the return of large-cap stocks
    - $r_{hml,t}$ : difference of the return between higher and lower book-to-market ratio stocks
    - $r_{wml,t}$ : return of past winners minus past losers
   

### Regression function

In [10]:
# Explanatory and dependent variables
factors = ["Market – risk-free asset", "SMB", "HML", "WML"]  # Explanatory variables (risk factors)
target = "Mutual funds – risk-free asset"  # Excess return of the fund

# Storage for the regression results
coefficients_list = []
r_squared_list = []

for fund in df["Fund"].unique():
    # Subset of the data for one fund
    fund_data = df[df["Fund"] == fund]

    # Explanatory variables (factors)
    X = fund_data[factors]
    X = sm.add_constant(X)  # Add the intercept (alpha)

    # Dependent variable (excess return of the fund)
    y = fund_data[target]

    # Linear regression (OLS)
    model = sm.OLS(y, X).fit()

    # Save the coefficients and the R^2
    coefficients_list.append(model.params)
    r_squared_list.append(model.rsquared)

# Convert the results to a DataFrame (one row per fund)
coefficients_df = pd.DataFrame(coefficients_list)
coefficients_df["R2"] = r_squared_list
print(coefficients_df)

# Descriptive statistics across funds (mean, standard deviation)
summary_stats = coefficients_df.describe().T
summary_stats["std_dev"] = coefficients_df.std()

# Summarise the results for the mutual funds
mutual_funds_results = summary_stats[["mean", "std_dev"]]

# Display the results
print("Mutual Funds Results:")
print(mutual_funds_results)


      const  Market – risk-free asset       SMB       HML       WML        R2
0  4.786644                 -0.015539  0.039676  0.013619  0.001571  0.000608
1  4.721694                  0.015167 -0.022447  0.034183 -0.006078  0.000646
2  4.994625                 -0.000019 -0.037399 -0.026739 -0.021069  0.000738
3  5.113883                 -0.012640 -0.031128 -0.025434 -0.036982  0.001292
4  4.765710                 -0.003664  0.025419  0.046823 -0.017257  0.000889
Mutual Funds Results:
                              mean   std_dev
const                     4.876511  0.169276
Market – risk-free asset -0.003339  0.012134
SMB                      -0.005176  0.035206
HML                       0.008490  0.033718
WML                      -0.015963  0.014788
R2                        0.000834  0.000278
