## Preface



If need be…



In [1]:
!pip install CFEDemands
!pip install ConsumerDemands
!pip install oauth2client
!pip install eep153_tools
#!pip install dvc

## Introduction



Here we give a set of generic instructions for analyzing demand for
food and nutrition.  Inputs include datasets of consumption
quantities, consumption expenditures, and household characteristics,
along with a food conversion table.

The different datasets should be indexed as follows:

-   **Expenditures:**
    -   Indexed by `(j,t,m)`
    -   Columns `i`
-   **Consumption:**
    -   Indexed by `(j,t,m,u)`
    -   Columns `i`
-   **Household characteristics:**
    -   Indexed by `(j,t,m)`
    -   Columns `k`
-   **Food Conversion Table:**
    -   Indexed by `(i,u)`
    -   Columns `n`
-   **RDI:**
    -   Indexed by `n`
    -   Columns `k`

where `j` indexes households, `t` indexes periods, `m` indexes
markets, `i` indexes goods, `k` indexes different kinds of household
characteristics, `u` indexes different unit names, and `n` indexes
different nutrients.  Finally, any RDI (“recommended daily intake”)
tables should be indexed by nutrients, with columns corresponding to
characteristics of persons within the household (e.g., age & sex
categories).
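
To make the indexing conventions concrete, here is a minimal sketch that builds toy dataframes with the layouts listed above. All labels and numbers (`h1`, `good A`, `nutrient 1`, and so on) are made-up placeholders, not values from any real dataset.

```python
import numpy as np
import pandas as pd

# Placeholder household/period/market labels (illustration only)
hh = pd.MultiIndex.from_tuples([("h1", 1, 1), ("h2", 1, 1)], names=["j", "t", "m"])

# Expenditures: rows (j,t,m), one column per good i
x_toy = pd.DataFrame({"good A": [10.0, np.nan], "good B": [4.0, 6.0]}, index=hh)
x_toy.columns.name = "i"

# Consumption quantities: rows (j,t,m,u), one column per good i
hh_u = pd.MultiIndex.from_tuples([("h1", 1, 1, "kg"), ("h2", 1, 1, "kg")],
                                 names=["j", "t", "m", "u"])
q_toy = pd.DataFrame({"good A": [2.0, np.nan], "good B": [1.0, 3.0]}, index=hh_u)
q_toy.columns.name = "i"

# Household characteristics: rows (j,t,m), one column per characteristic k
z_toy = pd.DataFrame({"group 1": [1, 0], "group 2": [2, 3]}, index=hh)
z_toy.columns.name = "k"

# Food conversion table: rows (i,u), one column per nutrient n
iu = pd.MultiIndex.from_tuples([("good A", "kg"), ("good B", "kg")], names=["i", "u"])
fct_toy = pd.DataFrame({"nutrient 1": [1.0, 2.0], "nutrient 2": [3.0, 4.0]}, index=iu)
fct_toy.columns.name = "n"

# RDI: rows n, one column per characteristic k
rdi_toy = pd.DataFrame({"group 1": [5.0, 6.0], "group 2": [7.0, 8.0]},
                       index=pd.Index(["nutrient 1", "nutrient 2"], name="n"))
rdi_toy.columns.name = "k"
```

Naming the index levels and the column axis this way lets later steps align dataframes by label rather than by position.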

Note that some countries have more than one dataframe of consumption,
distinguished by source; for example Malawi has consumption items
purchased as well as consumption items produced.  Here we focus on
consumption purchases, since one of our immediate aims is to infer
prices paid.
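
The text does not say how prices are to be inferred. One common approach, shown here only as a sketch, is to compute unit values: expenditure on a good divided by the quantity purchased. The helper `unit_values` and the toy data below are hypothetical, and the sketch assumes the index layouts described above.

```python
import numpy as np
import pandas as pd

def unit_values(x, q):
    """Hypothetical helper: expenditure per unit purchased, by (j,t,m,i,u)."""
    # Reshape both dataframes to long format, one row per good
    x_long = x.reset_index().melt(id_vars=["j", "t", "m"],
                                  var_name="i", value_name="expenditure")
    q_long = q.reset_index().melt(id_vars=["j", "t", "m", "u"],
                                  var_name="i", value_name="quantity")
    # Match each quantity with the household's expenditure on the same good
    both = q_long.merge(x_long, on=["j", "t", "m", "i"], how="inner")
    both = both.dropna(subset=["expenditure", "quantity"])
    both = both.assign(unit_value=both["expenditure"] / both["quantity"])
    return both.set_index(["j", "t", "m", "i", "u"])["unit_value"]

# Toy data (made-up numbers): one household buying 4 kg of good A for 100
x_toy = pd.DataFrame({"j": ["h1"], "t": [1], "m": [1],
                      "A": [100.0], "B": [60.0]}).set_index(["j", "t", "m"])
x_toy.columns.name = "i"
q_toy = pd.DataFrame({"j": ["h1"], "t": [1], "m": [1], "u": ["kg"],
                      "A": [4.0], "B": [np.nan]}).set_index(["j", "t", "m", "u"])
q_toy.columns.name = "i"

p_toy = unit_values(x_toy, q_toy)
print(p_toy)
```

Good B drops out of the result because its quantity is missing, so only good A yields a unit value (100 / 4 = 25 per kg).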



## Step 1: Acquire DataFrames



Here are the spreadsheet keys and worksheet names of the Google Sheets
holding the different dataframes for the case of Niger:



In [1]:
InputFiles = {'Expenditures':('1ySP8lrXlQ2ChaMdz0HQY85Md65cRRKOZgz-T0zBN2K0','Expenditures'),
              'Consumption':('1kr2NI57xiTQm20A_68NEcLKihVTJw2ZgWCwV98ZD4JE','Consumption'),
              'HH Characteristics':('1ySP8lrXlQ2ChaMdz0HQY85Md65cRRKOZgz-T0zBN2K0','HH Characteristics'),
              'FCT':('1TM7FpKURXFAuXW4dLpGt98QA2CH4WTDty-4nPOUv1Mg','05 NV_sum_57 (per 100g EP)')}

Note that the food items for the FCT for Niger are **not** yet matched
up with food labels indexed by `i` in the expenditure and consumption datasets.



### Worksheets to dataframes



Start by defining a function that “cleans up” the data:



In [1]:
from eep153_tools.sheets import read_sheets
import numpy as np
import pandas as pd

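def get_clean_sheet(key,sheet=None):
    """Read a Google Sheet and return one worksheet as a tidied DataFrame."""

    # The code below treats the result as a dict of DataFrames keyed by sheet name
    dfs = read_sheets(key)

    if sheet is not None:
        df = dfs[sheet]
    else:
        # No sheet named: take the first one.  A dict view cannot be
        # subscripted in Python 3, so go through an iterator.
        df = next(iter(dfs.values()))

    # Strip stray whitespace from column labels
    df.columns = [c.strip() for c in df.columns.tolist()]

    # Keep only the first of any duplicated columns
    df = df.loc[:,~df.columns.duplicated(keep='first')]

    # Drop columns whose label starts with 'Unnamed'
    df = df.drop([col for col in df.columns if col.startswith('Unnamed')], axis=1)

    # Keep only the first row for any duplicated index value
    df = df.loc[~df.index.duplicated(), :]

    return df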
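
Since `get_clean_sheet` needs access to a Google Sheet, the effect of its cleaning steps is easier to see on an in-memory dataframe. The sketch below uses a hypothetical helper, `clean_frame`, that applies the same steps without the download; the toy data are made up.

```python
import pandas as pd

def clean_frame(df):
    """Hypothetical helper: the cleaning steps of get_clean_sheet, minus the download."""
    df = df.copy()
    df.columns = [c.strip() for c in df.columns.tolist()]          # trim labels
    df = df.loc[:, ~df.columns.duplicated(keep='first')]           # drop repeated columns
    df = df.drop([col for col in df.columns if col.startswith('Unnamed')], axis=1)
    df = df.loc[~df.index.duplicated(), :]                         # drop repeated rows
    return df

# Toy sheet with padded labels, a repeated column, an unnamed column,
# and a repeated index value
raw = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
                   columns=[' j ', 'Rice', 'Rice ', 'Unnamed: 3'],
                   index=[0, 1, 1])

clean = clean_frame(raw)
print(clean)
```

Only the columns `j` and `Rice` survive, and the second row with index `1` is dropped because an earlier row already uses that index value.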

Next read in data on expenditures…



In [1]:
# Get expenditures...
x = get_clean_sheet(InputFiles['Expenditures'][0],
                    sheet=InputFiles['Expenditures'][1])

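# If the sheet has no market column, put every household in a single market
if 'm' not in x.columns:
    x['m'] = 1

x = x.set_index(['j','t','m'])
x.columns.name = 'i'

# Coerce every column to numeric (unparseable entries become NaN),
# then treat zero expenditures as missing
x = x.apply(lambda col: pd.to_numeric(col,errors='coerce'))
x = x.replace(0,np.nan)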

x
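
The two conversion steps in the cell above can be checked on a toy dataframe. This is a sketch with made-up entries: `errors='coerce'` turns anything that cannot be parsed as a number into `NaN`, and `replace(0, np.nan)` then marks zeros as missing too.

```python
import numpy as np
import pandas as pd

# Made-up raw entries mixing strings, numbers, and an unparseable value
raw = pd.DataFrame({"Millet": ["500", "n/a", "0"],
                    "Rice": [200, 0, "350"]})

# Column-by-column conversion; unparseable entries become NaN
num = raw.apply(lambda col: pd.to_numeric(col, errors="coerce"))

# Zeros are then recoded as missing
num = num.replace(0, np.nan)

print(num)
```

In the result, `"n/a"` and both zeros are missing, leaving three observed values.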

…on household characteristics…



In [1]:
# Get HH characteristics...
z = get_clean_sheet(InputFiles['HH Characteristics'][0],
                    sheet=InputFiles['HH Characteristics'][1])

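# If the sheet has no market column, put every household in a single market
if 'm' not in z.columns:
    z['m'] = 1

z = z.set_index(['j','t','m'])
z.columns.name = 'k'

# Coerce every column to numeric (unparseable entries become NaN)
z = z.apply(lambda col: pd.to_numeric(col,errors='coerce'))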

z

…on quantities (in units `u`)…



In [1]:
# Get purchased consumption quantities
q = get_clean_sheet(InputFiles['Consumption'][0],
                    sheet=InputFiles['Consumption'][1])

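# If the sheet has no market column, put every household in a single market
if 'm' not in q.columns:
    q['m'] = 1

q = q.set_index(['j','t','m','u'])
q.columns.name = 'i'

# Coerce every column to numeric (unparseable entries become NaN),
# then treat zero quantities as missing
q = q.apply(lambda col: pd.to_numeric(col,errors='coerce'))
q = q.replace(0,np.nan)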

q

…and finally a food conversion table.



In [1]:
fct = get_clean_sheet(InputFiles['FCT'][0],
                    sheet=InputFiles['FCT'][1])

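#### This bit peculiar to Niger FCT #####
# Keep only rows whose Code is exactly six characters long
fct = fct.loc[fct.Code.str.len()==6]
fct = fct.set_index('Code')
# Replace line breaks inside column labels with spaces
fct.columns = [v.replace('\n',' ') for v in fct.columns]
########################################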

fct.index.name = 'i'

fct = fct.apply(lambda x: pd.to_numeric(x,errors='coerce'))

fct
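
The Niger-specific filtering can be tried on a toy table. In this sketch the codes, the column label, and the values are all made up, and the real sheet's code format may differ; it only illustrates that rows whose `Code` is not six characters long are dropped and that line breaks in column labels are replaced.

```python
import pandas as pd

# Made-up table: two six-character item codes plus two other rows
raw_fct = pd.DataFrame({"Code": ["01", "01_001", "01_002", "Total"],
                        "Energy\n(kcal)": ["100", "120", "250", "x"]})

# Keep rows with six-character codes and index by code
fct_toy = raw_fct.loc[raw_fct.Code.str.len() == 6]
fct_toy = fct_toy.set_index('Code')

# Tidy column labels, name the index, and convert values to numeric
fct_toy.columns = [v.replace('\n', ' ') for v in fct_toy.columns]
fct_toy.index.name = 'i'
fct_toy = fct_toy.apply(lambda col: pd.to_numeric(col, errors='coerce'))

print(fct_toy)
```

Only the rows coded `01_001` and `01_002` remain, under a single column labeled `Energy (kcal)`.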