 ## Consumer Food Demand Estimation

In this notebook we estimate a Constant Frisch Elasticity (CFE) demand system for our selected population, Panama.

We first installed the prerequisites.

In [1]:
%pip install -r requirements.txt
%pip install CFEDemands

Collecting CFEDemands>=0.6
  Using cached CFEDemands-0.6.1-py2.py3-none-any.whl (45 kB)
Collecting ConsumerDemands
  Using cached ConsumerDemands-0.4.2.dev0-py2.py3-none-any.whl (12 kB)
Collecting eep153_tools>=0.11
  Using cached eep153_tools-0.12.3-py2.py3-none-any.whl (4.8 kB)
Collecting python-gnupg
  Using cached python_gnupg-0.5.2-py2.py3-none-any.whl (20 kB)
Collecting ray>=2.0.0
  Using cached ray-2.10.0-cp39-cp39-manylinux2014_x86_64.whl (65.1 MB)
Collecting xarray>=0.20.1
  Using cached xarray-2024.2.0-py3-none-any.whl (1.1 MB)
Collecting dvc>=2.18.1
  Using cached dvc-3.48.4-py3-none-any.whl (450 kB)
Collecting voluptuous>=0.11.7
  Using cached voluptuous-0.14.2-py3-none-any.whl (31 kB)
Collecting dvc-http>=2.29.0
  Using cached dvc_http-2.32.0-py3-none-any.whl (12 kB)
Collecting dvc-task<1,>=0.3.0
  Using cached dvc_task-0.3.0-py3-none-any.whl (21 kB)
Collecting shtab<2,>=1.3.4
  Using cached shtab-1.7.1-py3-none-any.whl (14 kB)
Collecting dpath<3,>=2.1.0
  Using cached dpa

Because our target population is Panama, we import the corresponding Google Sheets workbook and convert each sheet to a pandas DataFrame.

In [25]:
import pandas as pd
import numpy as np

# Upgrade eep153_tools before importing it, so the import picks up the new version.
%pip install eep153_tools --upgrade
from eep153_tools.sheets import read_sheets

# Key of the Google Sheets workbook that holds the Panama data.
Panama_Data = '1gcAb2jlGQNrD2zrrTEbjL47vbXoxCHkkjHSYzD0-Tiw'
#url = 'https://docs.google.com/spreadsheets/d/1gcAb2jlGQNrD2zrrTEbjL47vbXoxCHkkjHSYzD0-Tiw/edit#gid=2085637103'
#Panama_prices = read_sheets(url,sheet='Food Prices',nheaders=2)

# p is our dataframe for the food prices sheet.
p = read_sheets(Panama_Data, sheet='Food Prices')

p.columns.name = 't'
p = p.transpose()
p.columns = p.iloc[0]  # promote the first row to column labels
p.columns.name = 't'
p = p.rename(columns={'j': 'm'})
p.iloc[3:]  # display rows from the fourth onward; p itself is unchanged



t,m,Aceite Vegetal,Aceite Vegetal.1,Aceite Vegetal.2,Aceite Vegetal.3,Aceite Vegetal.4,Aceite Vegetal.5,Aceite Vegetal.6,Aceite Vegetal.7,Aceite Vegetal.8,...,Zapallo / Chayote,Zapallo / Chayote.1,Zapallo / Chayote.2,Ñame,Ñame.1,Ñame.2,Ñame.3,Ñame.4,Ñame.5,Ñame.6
1997,Chíriqui,3.25,0.1,1.35,,5.95,,1.35,,1.325,...,,0.1,0.08441558442,,,,,0.4,0.15,0.03896103896
1997,Coclé,2.99,0.1,1.3,,5.9,,1.375,,1.275,...,0.25,0.15,0.2012987013,,,,,0.5,0.5,
1997,Colón,3.25,0.1,1.34,,5.75,,,,0.85,...,,0.25,0.04545454545,,,,,0.4,0.6,0.06493506494
1997,Darién,3.2,0.25,1.4,,4.85,,1.525,,1.4,...,,,,,,,,0.15,,
1997,Herrera,3.6,0.25,1.3,2.09,5.76,,1.3,,1.35,...,,0.275,0.06493506494,,,,,0.4,,
1997,Los Santos,3.1,0.1,1.35,,5.99,,1.3,1.47,0.85,...,,0.15,0.07142857143,,,,,0.4,,0.4155844156
1997,Panamá,2.99,0.1,1.35,7.8,5.9,,1.35,3.175,1.24,...,0.25,0.2,0.1298701299,,,,0.65,0.5,0.375,0.1168831169
1997,Veraguas,2.95,0.1,1.3,,5.86,,1.35,,0.85,...,,0.125,,,,,,0.5,,
2003,Bocas del Toro,,,,,4.800000191,,,1.250000017,,...,,,,,,0.02083333395,,0.400000006,,
2003,Chíriqui,,,,,4.875,,,1.250000033,,...,,,,,,0.04500000067,,0.4766666691,,
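The reshaping above (transpose, then promote a row to the header) can be easier to follow on a tiny table. The sketch below is only an illustration: the goods, years, and prices are made up, and the layout of `raw` is an assumption, not the layout of the actual Panama sheet.

```python
import pandas as pd

# Hypothetical miniature of a "goods in rows" price sheet (all values made up).
raw = pd.DataFrame(
    {
        "j": ["Arroz", "Yuca"],
        "1997": [0.45, 0.20],
        "2003": [0.50, 0.25],
    }
)

wide = raw.transpose()       # row labels are now: j, 1997, 2003
wide.columns = wide.iloc[0]  # promote the row of good names to the header
wide = wide.iloc[1:]         # drop the row that has become the header
wide.index.name = "t"        # the remaining row labels are survey years
wide = wide.astype(float)    # transposing mixed dtypes leaves object columns

print(wide)
```

Converting back to `float` at the end matters because a transpose of mixed text and numbers produces `object` columns, which numeric functions such as `np.log` do not handle well.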


In [27]:
# x is our dataframe for the food expenditures sheet.
x = read_sheets(Panama_Data, sheet='Food Expenditures')
x.columns.name = 'j'
x = x.replace(0,np.nan) # Replace zeros with missing
x.head()

Unnamed: 0,i,t,m,Aceite Vegetal,Aguacates,Ahí Verde,Ajo,Alimento Infantil,Apio,Arroz,...,"Sodas, Refrescos Y Jugos",Sopa Enlatada,Tercer otro,Tomate,Viscera De Res,Visceras De Pollo O Gallina,Yuca,Zanahoria,Zapallo / Chayote,Ñame
0,19971000,1997,Chíriqui,6.0,,0.5,,2.7,0.3,7.25,...,8.0,0.7,,,,,,0.8,,0.4
1,19971001,1997,Chíriqui,,,,,,,4.8,...,1.5,,,0.3,,,,,,
2,19971002,1997,Chíriqui,3.5,,,,,,6.0,...,,3.5,,,,,,,2.0,
3,19971003,1997,Chíriqui,3.5,,,,,,,...,,,,,,,,,,
4,19971005,1997,Chíriqui,3.7,,,,,,,...,,0.7,,,,,1.0,,,
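Zeros are replaced with missing values because the expenditures are logged later on, and the log of zero is not finite. A minimal sketch with made-up expenditures (the frame `spend` is hypothetical) shows the difference:

```python
import numpy as np
import pandas as pd

# Made-up expenditures for two households; 0 means "nothing purchased".
spend = pd.DataFrame({"Arroz": [7.25, 0.0], "Yuca": [0.0, 1.0]})

# Logging the raw data turns every zero into -inf (warning silenced here).
with np.errstate(divide="ignore"):
    naive = np.log(spend)

# Replacing zeros first turns them into NaN, which pandas treats as missing.
cleaned = np.log(spend.replace(0, np.nan))

print(naive)
print(cleaned)
```

Missing values are skipped by most pandas reductions, whereas `-inf` would propagate into means and regression inputs.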


In [28]:
# d is our dataframe for the household characteristics sheet.
d = read_sheets(Panama_Data,sheet="Household Characteristics")
d.columns.name = 'k'

In [29]:
idx = x.columns
x.columns = [i[0] for i in idx]

In [30]:
x.head()

Unnamed: 0,i,t,m,Aceite Vegetal,Aguacates,Ahí Verde,Ajo,Alimento Infantil,Apio,Arroz,...,"Sodas, Refrescos Y Jugos",Sopa Enlatada,Tercer otro,Tomate,Viscera De Res,Visceras De Pollo O Gallina,Yuca,Zanahoria,Zapallo / Chayote,Ñame
0,19971000,1997,Chíriqui,6.0,,0.5,,2.7,0.3,7.25,...,8.0,0.7,,,,,,0.8,,0.4
1,19971001,1997,Chíriqui,,,,,,,4.8,...,1.5,,,0.3,,,,,,
2,19971002,1997,Chíriqui,3.5,,,,,,6.0,...,,3.5,,,,,,,2.0,
3,19971003,1997,Chíriqui,3.5,,,,,,,...,,,,,,,,,,
4,19971005,1997,Chíriqui,3.7,,,,,,,...,,0.7,,,,,1.0,,,


In [31]:
#print(x.columns)
#print(len(x['i']))
#print(len(x['t']))
#print(len(x['m']))

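# Move the household, year, and region identifiers into the index.
x = x.set_index(['i', 't', 'm'])

# y is the log of household expenditures (numeric columns only).
# x is already indexed by (i, t, m), so set_index must not be called again here:
# a second call raises KeyError because those columns no longer exist.
y = np.log(x.select_dtypes(include=[np.number]))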

In [None]:
d.head()

In [None]:
y.head()

### Estimation regression

Let $y^j_i$ be the log expenditure of household $i$ in Panama on food item $j$. Our estimating equation takes the following form:
$$
y^j_i = A^j(p) + \gamma_j'd_i + \beta_j w_i + \zeta^j_i.
$$

In [27]:
y = y.stack()

d = d.stack()

assert y.index.names == ['i','t','m','j']
assert d.index.names == ['i','t','m','k']

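The stacking step moves the column axis into the row index, so each observation is identified by household, year, region, and good. A minimal sketch on a made-up frame (the name `wide` and all values are hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up log expenditures indexed by household (i), year (t), and region (m).
index = pd.MultiIndex.from_tuples(
    [(1, 1997, "Coclé"), (2, 1997, "Colón")], names=["i", "t", "m"]
)
wide = pd.DataFrame({"Arroz": [1.5, np.nan], "Yuca": [0.2, 0.7]}, index=index)
wide.columns.name = "j"  # the column axis name becomes the new index level

long = wide.stack()      # a Series with one row per (i, t, m, j)

print(list(long.index.names))
```

The name given to the column axis (`j` here, `k` for the household characteristics) is what makes the asserts on `index.names` pass. Depending on the pandas version, `stack` may or may not drop missing entries, so the length of the result is not checked here.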

The formula above models log household expenditure as the sum of four terms: <br>

$A^j(p)$: a price index for food $j$, capturing how prices $p$ affect expenditure on food $j$;
<br>
$\gamma_j'd_i$: the effect of household characteristics $d_i$ (demographics) on expenditure on food $j$, with coefficient vector $\gamma_j$;
<br>
$\beta_j w_i$: the effect of the household's overall wealth $w_i$ on its expenditure on food $j$, with coefficient $\beta_j$;
<br>
$\zeta^j_i$: a residual capturing other unobserved effects that influence food expenditure.
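To make the structure of this equation concrete, the sketch below simulates data from it and recovers $\beta_j$ and $w_i$ up to scale and sign. It is an illustration under stated assumptions, not a description of how the `cfe` package estimates the model: every parameter value is made up, there is a single price regime (so $A^j(p)$ is one intercept per good), one household characteristic, and no missing data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_households, n_goods = 500, 6

# All parameter values below are made up for illustration.
a = rng.normal(size=n_goods)                 # A^j(p): one intercept per good
gamma = rng.normal(scale=0.3, size=n_goods)  # gamma_j (one characteristic)
beta = rng.uniform(0.5, 1.5, size=n_goods)   # beta_j
d = rng.normal(size=n_households)            # d_i, observed
w = rng.normal(size=n_households)            # w_i, unobserved by the analyst
zeta = rng.normal(scale=0.1, size=(n_households, n_goods))

# y[i, j] = A^j(p) + gamma_j d_i + beta_j w_i + zeta^j_i
y = a + np.outer(d, gamma) + np.outer(w, beta) + zeta

# Step 1: least squares of each good's y on a constant and d_i.
X = np.column_stack([np.ones(n_households), d])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef                         # roughly beta_j w_i + zeta^j_i

# Step 2: the residual matrix is close to rank one; take its leading singular pair.
u, s, vt = np.linalg.svd(resid, full_matrices=False)
w_hat = u[:, 0] * s[0]                       # proportional to w_i
beta_hat = vt[0]                             # proportional to beta_j

corr_w = abs(np.corrcoef(w_hat, w)[0, 1])
corr_beta = abs(np.corrcoef(beta_hat, beta)[0, 1])
print(f"|corr(w_hat, w)| = {corr_w:.3f}, |corr(beta_hat, beta)| = {corr_beta:.3f}")
```

Because $\beta_j w_i$ enters as a product, the two factors are identified only up to a common scale and sign, which is why the comparison uses absolute correlations rather than levels.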


### Set up basic estimation

In [7]:
from cfe import Regression

result = Regression(y=y,d=d)


In [None]:
result.predicted_expenditures()