# Example of Fixed Effect Regressions (GICS Sectors) to Explain Asset Intensity (=TotalAssets/TotalSales) in Python
* Using StatsModels Python Library
* This example is for Industry Sector Fixed Effects.

## Upload CSV file with data
* Note that the example CSV file contains only 2020 data
* The GICS sector is stored as a numeric code in the `gsector` column (for example, 20 or 40).

In [13]:
# Upload CSV file from your local machine to working directory in Colab
from google.colab import files
uploaded = files.upload()

Saving lect-3.csv to lect-3 (1).csv


## Import Necessary Libraries

In [14]:
import pandas as pd
import numpy as np
import statsmodels.api as sm

## Put Data in Pandas Dataframe



In [18]:
df = pd.read_csv('lect-3.csv')
df


Unnamed: 0,at,sale,gsector
0,62008.000,17337.000,20
1,419.314,316.011,20
2,20020.421,3586.982,55
3,1317.404,2484.595,40
4,72548.000,34608.000,35
...,...,...,...
2613,202771.000,14764.000,40
2614,4065.000,3587.000,20
2615,4366.100,1998.600,20
2616,2646.700,1935.600,20


## Baseline Regression
* Estimate a regression: *Asset_Intensity = constant*
* The coefficient estimate gives the average Asset Intensity across firms in the sample in 2020
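
The point that a constant-only regression recovers the sample average can be checked on a small example. The sketch below is illustrative only: the values are hypothetical (not taken from lect-3.csv), and it uses NumPy's `np.linalg.lstsq` in place of StatsModels so that it runs with NumPy alone.

```python
import numpy as np

# Hypothetical asset-intensity values (not from lect-3.csv)
y = np.array([3.6, 1.3, 5.6, 0.5, 2.1])

# Design matrix with a single column of ones, mirroring df[['constant']]
X = np.ones((len(y), 1))

# Ordinary least squares via NumPy's least-squares solver
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

intercept = beta[0]
sample_mean = y.mean()

# The two numbers agree up to floating-point error
print(intercept, sample_mean)
```

Because the only regressor is a column of ones, minimizing the sum of squared residuals amounts to choosing the single number closest to every observation, which is the mean.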

In [19]:


# add constant column to the original dataframe
df['constant'] = 1

# define x as a subset of original dataframe
x = df[['constant']]
# define y as a series (Asset Intensity = total assets / total sales)
y = df['at'] / df['sale']
# pass x as a DataFrame and y as a Series
sm.OLS(y, x).fit().summary()



0,1,2,3
Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,0.0
Method:,Least Squares,F-statistic:,inf
Date:,"Tue, 08 Jun 2021",Prob (F-statistic):,
Time:,01:04:01,Log-Likelihood:,-10483.0
No. Observations:,2618,AIC:,20970.0
Df Residuals:,2617,BIC:,20970.0
Df Model:,0,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
constant,6.3962,0.259,24.665,0.000,5.888,6.905

0,1,2,3
Omnibus:,5365.353,Durbin-Watson:,1.868
Prob(Omnibus):,0.0,Jarque-Bera (JB):,22846583.484
Skew:,16.611,Prob(JB):,0.0
Kurtosis:,459.44,Cond. No.,1.0


## Create Industry Indicators

In [20]:
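# dtype=int keeps the indicators as 0/1 integers (recent pandas versions default to True/False)
df = pd.get_dummies(df, columns=['gsector'], dtype=int)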
df

Unnamed: 0,at,sale,constant,gsector_10,gsector_15,gsector_20,gsector_25,gsector_30,gsector_35,gsector_40,gsector_45,gsector_50,gsector_55,gsector_60
0,62008.000,17337.000,1,0,0,1,0,0,0,0,0,0,0,0
1,419.314,316.011,1,0,0,1,0,0,0,0,0,0,0,0
2,20020.421,3586.982,1,0,0,0,0,0,0,0,0,0,1,0
3,1317.404,2484.595,1,0,0,0,0,0,0,1,0,0,0,0
4,72548.000,34608.000,1,0,0,0,0,0,1,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2613,202771.000,14764.000,1,0,0,0,0,0,0,1,0,0,0,0
2614,4065.000,3587.000,1,0,0,1,0,0,0,0,0,0,0,0
2615,4366.100,1998.600,1,0,0,1,0,0,0,0,0,0,0,0
2616,2646.700,1935.600,1,0,0,1,0,0,0,0,0,0,0,0


## Create dataframe for regression that includes only Asset Intensity and GICS Sector Indicators
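
Written out, the specification estimated in this section can be sketched as follows. The notation is introduced here for illustration and does not appear in the original notebook: $D_{is}$ is the 0/1 indicator that firm $i$ belongs to GICS sector $s$, and $\beta_s$ is the coefficient on that indicator. The constant column is left out because the sector indicators sum to one for every firm, so including both would make the regressors perfectly collinear.

```latex
\text{AssetIntensity}_i
  = \frac{\text{at}_i}{\text{sale}_i}
  = \sum_{s} \beta_s \, D_{is} + \varepsilon_i
```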

In [22]:

# define x as the sector indicators only (drop the inputs to y and the constant)
x = df.drop(columns=['at', 'sale', 'constant'])
# define y as a series (Asset Intensity = total assets / total sales)
y = df['at'] / df['sale']

# pass x as a DataFrame and y as a Series
sm.OLS(y, x).fit().summary()

0,1,2,3
Dep. Variable:,y,R-squared:,0.231
Model:,OLS,Adj. R-squared:,0.228
Method:,Least Squares,F-statistic:,78.12
Date:,"Tue, 08 Jun 2021",Prob (F-statistic):,1.46e-140
Time:,01:06:02,Log-Likelihood:,-10140.0
No. Observations:,2618,AIC:,20300.0
Df Residuals:,2607,BIC:,20370.0
Df Model:,10,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
gsector_10,3.1806,1.143,2.782,0.005,0.938,5.423
gsector_15,1.6734,1.065,1.572,0.116,-0.414,3.761
gsector_20,1.7668,0.604,2.926,0.003,0.583,2.951
gsector_25,1.6289,0.670,2.432,0.015,0.315,2.943
gsector_30,1.2276,1.172,1.047,0.295,-1.070,3.526
gsector_35,4.7076,0.582,8.094,0.000,3.567,5.848
gsector_40,18.5698,0.515,36.068,0.000,17.560,19.579
gsector_45,2.2083,0.618,3.573,0.000,0.996,3.420
gsector_50,7.3713,1.184,6.226,0.000,5.050,9.693

0,1,2,3
Omnibus:,6176.522,Durbin-Watson:,1.978
Prob(Omnibus):,0.0,Jarque-Bera (JB):,60242570.922
Skew:,23.029,Prob(JB):,0.0
Kurtosis:,744.715,Cond. No.,2.71
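
When the regressors are a full set of sector indicators with no constant, each coefficient is the average of the dependent variable within that sector. The sketch below checks this on hypothetical data (the sector codes and values are made up, not taken from lect-3.csv) and again uses `np.linalg.lstsq` rather than StatsModels so that it needs only NumPy.

```python
import numpy as np

# Hypothetical sector codes and asset-intensity values (not from lect-3.csv)
sector = np.array([20, 20, 40, 40, 40, 55])
y = np.array([1.0, 3.0, 10.0, 20.0, 30.0, 5.0])

# One 0/1 indicator column per sector, with no constant column
codes = np.unique(sector)  # sorted unique sector codes
X = (sector[:, None] == codes[None, :]).astype(float)

# OLS on the indicators alone
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Sector-by-sector averages for comparison
group_means = np.array([y[sector == c].mean() for c in codes])

# Each coefficient matches the corresponding sector mean
for c, b, m in zip(codes, beta, group_means):
    print(c, round(float(b), 4), round(float(m), 4))
```

Read this way, a coefficient in the table above is the mean Asset Intensity of firms in that sector, and its t-statistic tests whether that mean differs from zero rather than whether it differs from other sectors.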
