#  Introduction

### Econometrics A (ØkA)

Wooldridge (Ch. 1)

Bertel Schjerning

Department of Economics, University of Copenhagen


# Estimation of a Model for Consumer Demand

## Modeling Market Shares: Multiple Markets and Products

**Utility function for consumer $i$, product $j$ in market $m$:**
$$
U_{imj} = \beta \mathbf{x}_{mj} - \alpha p_{mj} + \xi_{mj} + \epsilon_{imj}
$$

- $\mathbf{x}_{mj}$: Observed characteristics of product $j$ in market $m$ (e.g. size, quality).
- $p_{mj}$: Price of product $j$ in market $m$.
- $\xi_{mj}$: Unobserved characteristics (e.g. brand, location).
- $\epsilon_{imj}$: Idiosyncratic error term, which follows a type I extreme value (Gumbel) distribution.

**Decision rule:**
- Consumer $i$ chooses the product that maximizes utility:
$$
j^* = \arg \max_j \ U_{imj} \quad \text{for all } j \text{ in market } m
$$
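The decision rule can be illustrated with a small simulation. The sketch below assumes hypothetical mean utilities `delta` for one market (index 0 is an outside alternative normalized to zero), draws Gumbel taste shocks, lets each simulated consumer pick the utility-maximizing alternative, and compares the resulting choice frequencies with the closed-form logit shares. All numbers are illustrative and not taken from the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mean utilities (beta*x - alpha*p + xi) for one market.
# Index 0 is the outside alternative, normalized to zero.
delta = np.array([0.0, 1.0, 0.5, -0.5])

n_consumers = 200_000
# Type I extreme value (Gumbel) taste shocks, one per consumer and alternative.
eps = rng.gumbel(loc=0.0, scale=1.0, size=(n_consumers, delta.size))

# Each consumer picks the alternative with the highest total utility.
choices = np.argmax(delta + eps, axis=1)
simulated_shares = np.bincount(choices, minlength=delta.size) / n_consumers

# Closed-form logit shares implied by the Gumbel assumption.
logit_shares = np.exp(delta) / np.exp(delta).sum()

print(np.round(simulated_shares, 3))
print(np.round(logit_shares, 3))
```

With a large number of simulated consumers, the two printed vectors should agree up to simulation noise.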


## Logit Demand Function and Market Shares

**Demand function for product $j$ in market $m$:**
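$$
S_{mj} = \frac{\exp(\beta \mathbf{x}_{mj} - \alpha p_{mj} + \xi_{mj})}{1 + \sum_{k=1}^{J} \exp(\beta \mathbf{x}_{mk} - \alpha p_{mk} + \xi_{mk})}
$$

The outside alternative $j = 0$ (buying no new car) has its utility normalized to zero, which contributes the $1$ in the denominator. Its share $S_{m0}$ has the same denominator and a numerator of $1$.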

**Log-linearized demand:**
$$
\log(S_{mj}) - \log(S_{m0}) = \beta \mathbf{x}_{mj} - \alpha p_{mj} + \xi_{mj}
$$

**We can estimate $\alpha$ and $\beta$** as the coefficients in the linear model above, where $\xi_{mj}$ plays the role of the error term and the coefficient on price equals $-\alpha$.
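As a rough illustration of this estimation idea, the sketch below simulates markets with known parameters, builds the log share differences, and recovers the parameters by least squares. The parameter values, dimensions, and variable names are hypothetical. Prices are drawn independently of $\xi_{mj}$ here, which is an assumption of the simulation; in real data prices are typically treated as endogenous, and plain OLS may then be biased.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" parameters and dimensions for the simulation.
alpha_true, beta_true = 0.8, 1.5
n_markets, n_products = 200, 5

x = rng.normal(size=(n_markets, n_products))              # observed characteristic
p = rng.uniform(1.0, 3.0, size=(n_markets, n_products))   # price, exogenous by construction
xi = rng.normal(scale=0.3, size=(n_markets, n_products))  # unobserved characteristic

# Mean utilities and logit shares with the outside utility normalized to zero.
delta = beta_true * x - alpha_true * p + xi
denom = 1.0 + np.exp(delta).sum(axis=1, keepdims=True)
s_inside = np.exp(delta) / denom
s_outside = 1.0 / denom

# Dependent variable: log(S_mj) - log(S_m0).
y = (np.log(s_inside) - np.log(s_outside)).ravel()
X = np.column_stack([np.ones(y.size), x.ravel(), p.ravel()])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_hat, alpha_hat = coef[1], -coef[2]  # price coefficient estimates -alpha
print(f"beta_hat = {beta_hat:.3f}, alpha_hat = {alpha_hat:.3f}")
```

The estimates should land close to the values used to generate the data, since the simulated error term is uncorrelated with both regressors.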


**How does the log-sum cancel?**
- Subtracting the log market share of the outside alternative eliminates the sum in the denominator, because both shares have the same denominator.
- This yields a model that is linear in log market shares.
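The cancellation can also be checked numerically. Writing $D_m$ for the common denominator, $\log S_{mj}$ equals the mean utility minus $\log D_m$, while $\log S_{m0} = -\log D_m$, so the difference is the mean utility itself. The sketch below uses hypothetical mean utilities for three products and assumes the outside utility is normalized to zero.

```python
import numpy as np

# Hypothetical mean utilities for J = 3 inside products in one market.
delta = np.array([1.2, -0.3, 0.4])

# Common denominator: the outside alternative contributes exp(0) = 1.
denom = 1.0 + np.exp(delta).sum()
s_inside = np.exp(delta) / denom
s_outside = 1.0 / denom

# The log of the denominator appears in both terms and drops out.
recovered = np.log(s_inside) - np.log(s_outside)
print(recovered)
print(delta)
```

The two printed arrays should match up to floating-point precision.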




### Load relevant libraries

In [None]:
%pip install statsmodels linearmodels matplotlib seaborn

In [None]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
pd.set_option('display.float_format', '{:.8f}'.format)

### Load data

In [12]:
df = pd.read_csv("data/data_blp.csv")
df["prices"] = df["msrp"]
df["market_ids"] = df["year"]
yearly_sales = df.groupby("year")["sales"].sum()
# Market size: assume every household owns 2 cars and replaces each one every 5 years,
# i.e. one new-car purchase per household every 2.5 years.
df["shares"] = df["sales"] / (df["number_households"] / 2.5)

### Transform variables

In [13]:
# Outside ("no new car") share per year: one minus the sum of observed market shares
no_car_share = 1 - df.groupby("year")["shares"].sum()
# Dependent variable: log market share of each car minus the log outside share in the same year
df["dlogS"] = np.log(df["shares"]) - np.log(df["year"].map(no_car_share))
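To make the share construction concrete, here is a minimal sketch on a hypothetical toy dataset with two years and two products per year. The column names mirror those used above, but the numbers are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two markets (years), two products each.
toy = pd.DataFrame({
    "year": [2000, 2000, 2001, 2001],
    "sales": [100.0, 50.0, 120.0, 30.0],
    "number_households": [1000.0, 1000.0, 1000.0, 1000.0],
})

# Market size: one potential new-car purchase per household every 2.5 years.
toy["shares"] = toy["sales"] / (toy["number_households"] / 2.5)

# Outside share per year, mapped back onto each product row.
outside = 1 - toy.groupby("year")["shares"].sum()
toy["dlogS"] = np.log(toy["shares"]) - np.log(toy["year"].map(outside))

print(toy)
```

In this toy example the market size is 400 per year, so the first product has a share of 0.25 and both years have an outside share of 0.625.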

In [14]:
endogen_var = ["prices"] # Endogenous variable
exogen_cont_vars = [ # Exogenous continuous variables
    "log_height",
    "log_footprint",
    "log_hp",
    "log_mpg",
    "log_curbweight",
    "log_number_trims"
] 
exogen_discrete_vars = [ # Exogenous discrete variables
    "releaseYear",
    "yearsSinceDesign",
    "sport",
    "EV",
    "truck",
    "suv",
    "van"
]

In [15]:
# Rescale prices from units of $1,000 to units of $10,000
df["prices"] = df["prices"] * 0.1
# Year dummies (reference year: 1980) and make (brand/producer) dummies (reference make: volvo)
year_dummies = pd.get_dummies(df["year"]).drop(columns=[1980])
make_dummies = pd.get_dummies(df["make"]).drop(columns=["volvo"])

### Estimate model

In [16]:
y = df["dlogS"] 
X = pd.concat([df[exogen_cont_vars + exogen_discrete_vars + endogen_var], year_dummies, make_dummies], axis=1)
X = sm.add_constant(X)  # Adds a constant term for the intercept
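# Cast boolean dummy columns to integers so the design matrix is fully numeric
for column in X.columns:
    if X[column].dtype == bool:
        X[column] = X[column].astype(int)
# OLS with heteroskedasticity-robust (HC1) standard errors
model = sm.OLS(y, X).fit(cov_type='HC1')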
print(model.summary())

                            OLS Regression Results                            
Dep. Variable:                  dlogS   R-squared:                       0.589
Model:                            OLS   Adj. R-squared:                  0.584
Method:                 Least Squares   F-statistic:                     146.4
Date:                Mon, 02 Sep 2024   Prob (F-statistic):               0.00
Time:                        09:13:45   Log-Likelihood:                -15978.
No. Observations:                9694   AIC:                         3.218e+04
Df Residuals:                    9580   BIC:                         3.300e+04
Df Model:                         113                                         
Covariance Type:                  HC1                                         
                       coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------------
const              -15.8955      1.814  



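### Compute the Own-Price Elasticity

The own-price elasticity follows from the estimates of our logit model. Let $\hat{\theta}_p$ denote the estimated coefficient on price, which corresponds to $-\alpha$ in the utility specification. The elasticity of product $j$ in market $m$ is then:

$$
\eta_{mj} = \hat{\theta}_p \cdot p_{mj} \cdot (1 - S_{mj})
$$

We report the mean across all product-year observations.

In [21]:
alpha = model.params["prices"]  # estimated price coefficient (corresponds to -alpha in the utility specification)
df["elasticity"] = df.apply(lambda x: alpha * x["prices"] * (1 - x["shares"]) , axis=1)
print(f"Own-price elasticity = {df['elasticity'].mean()}")

Own-price elasticity = -1.2121931691993502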
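As a sanity check on the elasticity expression (price coefficient times price times one minus the market share), the sketch below compares it with a finite-difference elasticity computed directly from logit shares. The helper `logit_shares` and all numerical values are hypothetical and unrelated to the estimates above.

```python
import numpy as np

def logit_shares(prices, price_coef, other_utility):
    """Inside-good logit shares with the outside utility normalized to zero."""
    v = np.exp(other_utility + price_coef * prices)
    return v / (1.0 + v.sum())

# Hypothetical values, not taken from the estimation above.
price_coef = -0.5
prices = np.array([2.0, 3.5, 1.0])
other_utility = np.array([0.3, 1.0, -0.2])

s = logit_shares(prices, price_coef, other_utility)
analytic = price_coef * prices * (1 - s)

# Numerical elasticity of product j: (dS_j / dp_j) * p_j / S_j.
h = 1e-6
numeric = np.empty_like(prices)
for j in range(prices.size):
    bumped = prices.copy()
    bumped[j] += h
    s_bumped = logit_shares(bumped, price_coef, other_utility)
    numeric[j] = (s_bumped[j] - s[j]) / h * prices[j] / s[j]

print(np.round(analytic, 4))
print(np.round(numeric, 4))
```

The analytic and numerical elasticities should agree closely, and both are negative whenever the price coefficient is negative.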
