## Lecture III: How is agricultural production increasing?



We’ve seen that growth in food production is typically greater than
population growth.  But where is this growth coming from?

Food production (and crop production in particular) is the classic
example of production.  The “classical” economists of the 18th century, when
most income came from agriculture, thought there were three main
“factors” of production:

-   Land
-   Labor
-   Capital

Thus one might write the technical relationship between “factors” (or
inputs) and output as
$$
    \text{Crop output} = F(\text{Land},\text{Labor},\text{Capital}).
$$



### Functional forms



We have good reason to think that $F$ displays constant returns to
scale; i.e., is homogeneous of degree one.  Write it as
$F(x_1,x_2,\dots,x_n)$ (thus abstracting from the classical factors
of production).  
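
Stated formally, homogeneity of degree one means that scaling every input by the same positive factor $\lambda$ scales output by that same factor:
$$
    F(\lambda x_1,\lambda x_2,\dots,\lambda x_n) = \lambda F(x_1,x_2,\dots,x_n) \quad\text{for all } \lambda>0.
$$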

Observationally, it’s also often the case that the cost shares of
different factors of production remain constant, even when prices
change.  If we combine these facts (linear homogeneity, constant
cost shares) with an assumption that farmers operating this
production function are profit-maximizing price-takers and a
technical assumption that $F$ is continuously differentiable, then
one can prove that $F$ is “Cobb-Douglas”, or
$$
      F(x_1,x_2,\dots,x_n) = A\prod_{i=1}^nx_i^{\alpha_i},
  $$
where $\sum_{i=1}^n\alpha_i=1$.  This is a result first established
by the economist Paul Douglas and the mathematician Charles Cobb in 1928.
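
As a rough numerical check of these two properties, the sketch below evaluates a Cobb-Douglas function at an arbitrary input bundle. The exponents, input levels, and the value of $A$ are hypothetical, chosen only for illustration. It checks that doubling all inputs doubles output, and that marginal product times input divided by output (which equals the cost share for a profit-maximizing price-taker) recovers each $\alpha_i$.

```python
import numpy as np

def cobb_douglas(x, alpha, A=1.0):
    """Cobb-Douglas output A * prod(x_i ** alpha_i) for an input vector x."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return A * np.prod(x ** alpha)

# Hypothetical exponents for (land, labor, capital); they sum to one.
alpha = np.array([0.3, 0.5, 0.2])
x = np.array([4.0, 9.0, 2.0])      # made-up input quantities
y = cobb_douglas(x, alpha, A=1.5)

# Constant returns to scale: doubling every input should double output.
crs_ratio = cobb_douglas(2 * x, alpha, A=1.5) / y

# Marginal products by central finite differences, one input at a time.
h = 1e-6
marginal_products = np.array([
    (cobb_douglas(x + h * e, alpha, A=1.5)
     - cobb_douglas(x - h * e, alpha, A=1.5)) / (2 * h)
    for e in np.eye(len(x))
])

# With p * MP_i = w_i, the cost share w_i * x_i / (p * y) equals MP_i * x_i / y.
cost_shares = marginal_products * x / y

print(crs_ratio)
print(cost_shares)
```

Because the shares depend only on the exponents, they do not move when the input bundle (and hence relative prices) changes, which is the constant-cost-share property described above.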

If we observe output $y^j_t$ and inputs $x^j_{it}$ for country $j$ at time $t$, we can take the
logarithm of the Cobb-Douglas production function, obtaining
$$
     \log y^j_t = \log A^j_t + \sum_{i=1}^n\alpha^j_i\log(x^j_{it}).
  $$
Note that we’ve allowed the cost-share parameters $\alpha$ to vary
across both inputs and countries, but *not* over time.



### Total Factor Productivity



The term $A$ is sometimes called “Total Factor Productivity” (TFP),
because increases in $A$ increase the productivity of all factors.  If
we take differences in log output over time we get
$$
      \Delta\log y^j_t = \Delta\log A^j_t + \sum_{i=1}^n\alpha^j_i\Delta\log(x^j_{it}).
   $$
Recall that changes in logs approximate percent changes or growth
rates, so we can use this equation to decompose output growth into
growth in input use and TFP growth.
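
A minimal sketch of this decomposition follows. The cost shares, input series, and TFP series are all made up for illustration; none of them come from the USDA data. Output is built from the Cobb-Douglas form, and TFP growth is then recovered as the residual of output growth over share-weighted input growth.

```python
import numpy as np
import pandas as pd

# Hypothetical cost shares (sum to one) and made-up input and TFP series.
alpha = pd.Series({'Land': 0.2, 'Labor': 0.5, 'Capital': 0.3})
years = [2000, 2001, 2002, 2003]
inputs = pd.DataFrame({'Land': [100.0, 101.0, 101.5, 102.0],
                       'Labor': [100.0, 99.0, 98.5, 98.0],
                       'Capital': [100.0, 104.0, 108.0, 113.0]},
                      index=years)
A = pd.Series([1.00, 1.01, 1.03, 1.04], index=years)

# Output implied by the Cobb-Douglas form; alpha aligns with the columns.
y = A * (inputs ** alpha).prod(axis=1)

# Log differences approximate growth rates.
dlog_y = np.log(y).diff()
dlog_x = np.log(inputs).diff()

# Share-weighted input growth, and TFP growth as the residual.
input_growth = dlog_x.dot(alpha)
tfp_growth = dlog_y - input_growth

print(pd.DataFrame({'Output': dlog_y,
                    'Inputs': input_growth,
                    'TFP': tfp_growth}))
```

The first row is `NaN` because a difference needs a previous year. In the remaining rows the residual matches the log change in the made-up $A$ series.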



### Data on Food Production



What’s happened to food production over recent decades?
See
[https://www.ers.usda.gov/data-products/international-agricultural-productivity/](https://www.ers.usda.gov/data-products/international-agricultural-productivity/).

Data on TFP, output, factor use, and factor shares can be found at
[https://docs.google.com/spreadsheets/d/1DLn9owcS7ggojJGWlI9vKSz0hqozn6cbcqNGWgzMZ8k](https://docs.google.com/spreadsheets/d/1DLn9owcS7ggojJGWlI9vKSz0hqozn6cbcqNGWgzMZ8k),
which is publicly readable.



### Reading Sheets



I’ve written a Python package, `eep153_tools`, which includes tools to
handle authentication as well as to read Google Sheets as pandas
DataFrames.  First we have to deal with authentication, by decrypting
credentials to access particular files (you should only have to do
this part once):



In [1]:
#!pip install eep153_tools
#!pip install python_gnupg

from eep153_tools.sheets import decrypt_credentials
decrypt_credentials('../students.json.gpg')

Input secret passphrase for ../students.json.gpg to create google drive credentials: noodle octopus


To check that this worked, the following lists the
emails of the “service accounts” that now have credentials; you can
then “share” Google Sheets with these.



In [2]:
!ls ~/.eep153.service_accounts/

instructors@eep153.iam.gserviceaccount.com
students@eep153.iam.gserviceaccount.com


With those credentials established, we can read the data:



In [3]:
from eep153_tools.sheets import read_sheets

# read_sheets returns a dictionary of dataframes, one per worksheet; keep the 'Data' worksheet
data = read_sheets('https://docs.google.com/spreadsheets/d/1J_Yoo2eBgABBy8Hnvh2Vf7GYowELZBYTVwt-3NtjUsc/edit#gid=1532023339')['Data']

data.keys()

Key available for instructors@eep153.iam.gserviceaccount.com.
Key available for students@eep153.iam.gserviceaccount.com.




Index(['Order', 'FAO', 'ISO3', 'Level', 'Country/Territory', 'Region',
       'Subregion', 'Income', 'Year', 'TFP_Index', 'Outall_Index',
       'Input_Index', 'Land_Index', 'Labor_Index', 'Capital_Index',
       'Materials_Index', 'Outall_Q', 'Outcrop_Q', 'Outanim_Q', 'Outfish_Q',
       'Land_Q', 'Labor_Q', 'Capital_Q', 'Machinery_Q', 'Livestock_Q',
       'Fertilizer_Q', 'Feed_Q', 'Cropland_Q', 'Pasture_Q', 'IrrigArea_Q'],
      dtype='object')

Indexing with `'Data'` picks a single dataframe out of the dictionary of
dataframes that `read_sheets` returns, but this dataframe needs to be
tidied up some.  For example, look at the last few rows:



In [8]:
data.tail()

| | Order | FAO | ISO3 | Level | Country/Territory | Region | Subregion | Income | Year | TFP_Index | ... | Land_Q | Labor_Q | Capital_Q | Machinery_Q | Livestock_Q | Fertilizer_Q | Feed_Q | Cropland_Q | Pasture_Q | IrrigArea_Q |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13331 | 243 | | | Region | | G20 (19 countries 2021) | | | 2016 | 101.668 | ... | 1243977.0 | 481284.9228 | 3995280.0 | 2236022.0 | 1191882.0 | 159201300.0 | 2885579000.0 | 885980.5888 | 1628312.0 | 214470.0666 |
| 13332 | 243 | | | Region | | G20 (19 countries 2021) | | | 2017 | 104.6567 | ... | 1249681.0 | 472232.4342 | 4104074.0 | 2276166.0 | 1204587.0 | 158381100.0 | 2991555000.0 | 887794.2355 | 1657323.0 | 215797.8895 |
| 13333 | 243 | | | Region | | G20 (19 countries 2021) | | | 2018 | 106.0598 | ... | 1249286.0 | 461882.18 | 4216871.0 | 2308464.0 | 1208838.0 | 156884700.0 | 3009592000.0 | 887969.7809 | 1643916.0 | 216816.8026 |
| 13334 | 243 | | | Region | | G20 (19 countries 2021) | | | 2019 | 105.8433 | ... | 1253635.0 | 448426.2779 | 4323576.0 | 2354034.0 | 1189710.0 | 162293800.0 | 3126475000.0 | 887471.6416 | 1648720.0 | 219884.9 |
| 13335 | 243 | | | Region | | G20 (19 countries 2021) | | | 2020 | 106.8674 | ... | 1252543.0 | 436136.9191 | 4487207.0 | 2276589.0 | 1204267.0 | 168865300.0 | 3188809000.0 | 886764.9694 | 1642048.0 | 220368.7 |


We really just want selected columns, and an index that depends on country and year:



In [9]:
import pandas as pd

# Map tidy column names (keys) to the column names used in the sheet (values)
Data = {'Level':'Level', 'Country':'Country/Territory', 'WDI Code':'ISO3',
        'Year':'Year', 'Output':'Outall_Index', 'TFP':'TFP_Index',
        'Land':'Land_Index', 'Labor':'Labor_Index',
        'Capital':'Capital_Index', 'Materials':'Materials_Index'}

df = data.rename(columns={v:k for k,v in Data.items()})
df = df[list(Data)].set_index(['WDI Code','Level','Country','Year'])

# Deal with some duplicate indices
df = df.loc[~df.index.duplicated(),:]
df

| WDI Code | Level | Country | Year | Output | TFP | Land | Labor | Capital | Materials |
|---|---|---|---|---|---|---|---|---|---|
| NGA | Country | Nigeria | 1961 | 19.5115 | 86.5804 | 40.8615 | 57.7794 | 8.6628 | 8.2788 |
| NGA | Country | Nigeria | 1962 | 20.3487 | 87.6861 | 41.4735 | 58.8275 | 8.9988 | 8.8999 |
| NGA | Country | Nigeria | 1963 | 21.2221 | 87.0572 | 44.9049 | 59.9037 | 9.3708 | 9.4456 |
| NGA | Country | Nigeria | 1964 | 21.8586 | 87.3097 | 46.1668 | 61.0112 | 9.7532 | 9.4536 |
| NGA | Country | Nigeria | 1965 | 22.7648 | 86.1499 | 49.1628 | 62.1489 | 10.4784 | 9.9916 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| | Region | | 2016 | 98.9888 | 99.7940 | 99.6151 | 100.9345 | 99.1207 | 98.2452 |
| | Region | | 2017 | 99.6899 | 101.1619 | 99.7266 | 99.1944 | 98.8609 | 97.8932 |
| | Region | | 2018 | 99.2904 | 101.3774 | 98.9821 | 97.9636 | 99.4202 | 97.2640 |
| | Region | | 2019 | 99.1303 | 100.2704 | 99.2921 | 97.4115 | 99.3946 | 99.3756 |
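
The duplicate-index step, and the `xs` selections used in later cells, can be illustrated on a toy frame. The values below are made up and the frame has only two index levels, so this is a sketch of the mechanics rather than of the real data.

```python
import pandas as pd

# Toy data with made-up values; the second USA/1961 row repeats an index entry.
toy = pd.DataFrame(
    {'WDI Code': ['USA', 'USA', 'USA', 'IND'],
     'Year': [1961, 1961, 1962, 1961],
     'Output': [40.0, 41.0, 42.0, 20.0]}
).set_index(['WDI Code', 'Year'])

# index.duplicated() flags every repeat after the first occurrence,
# so the negated mask keeps only the first row for each index value.
deduped = toy.loc[~toy.index.duplicated(), :]

# xs selects one value of an index level and drops that level.
usa = deduped.xs('USA', level='WDI Code')
print(usa)
```

Note that this keeps whichever duplicate row happens to come first, so it is worth checking that the duplicates really do carry the same values.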


### Visualizing data on ag production



Plot growth in output for all countries from 1961 on:



In [10]:
import cufflinks as cf
cf.go_offline()

df['Output'].unstack().T.iplot(title="Index of Agricultural Output",
                               yTitle='Value of Output Index',
                               xTitle='Year')

Compare world growth in outputs, inputs, and TFP:



In [11]:
import numpy as np

world = df.xs('World',level='Level').replace(0,np.nan).dropna(how='any')

# Drop unnecessary index levels
world = world.droplevel(['WDI Code','Country'])

# Put in log differences
dworld = np.log(world).diff()
dworld['Inputs'] = dworld['Output'] - dworld['TFP']

dworld.mean()

Output       0.023150
TFP          0.009915
Land         0.004572
Labor        0.004167
Capital      0.020977
Materials    0.024321
Inputs       0.013235
dtype: float64
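
To get a feel for these magnitudes, the sketch below converts the mean log growth rates printed above into annual percent growth and into doubling times. The three rates are copied by hand from that output. The doubling-time formula $\log 2/g$ assumes growth continues at a constant rate, which is an illustrative assumption rather than a forecast.

```python
import numpy as np
import pandas as pd

# Mean annual log growth rates copied from the output above.
g = pd.Series({'Output': 0.023150, 'TFP': 0.009915, 'Inputs': 0.013235})

# Exact annual percent growth implied by a log difference g.
pct = 100 * (np.exp(g) - 1)

# Years needed to double at a constant log growth rate g.
doubling_years = np.log(2) / g

print(pd.DataFrame({'pct per year': pct,
                    'doubling years': doubling_years}).round(2))
```

At the average world rate, output doubles in about 30 years. TFP growth alone would take about 70 years to double output.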

And a graph of growth rates:



In [12]:
dworld[['Output','Inputs','TFP']].iplot(title="Growth rates of output, inputs, & TFP",
                                           xTitle="Year")

That’s the overall picture for the world.  Now “drill down” and
consider just production in the US:



In [13]:
select = df.xs('USA',level='WDI Code').dropna(how='any')

# Drop unnecessary index levels
select = select.droplevel(['Level','Country'])

dselect = np.log(select).diff()
dselect['Inputs'] = dselect['Output'] - dselect['TFP']

dselect.mean()

Output       0.015012
TFP          0.011983
Land        -0.000656
Labor       -0.013495
Capital      0.008603
Materials    0.009642
Inputs       0.003029
dtype: float64
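
One way to compare the US with the world is to ask what fraction of output growth is accounted for by TFP rather than by inputs. The sketch below does this with the mean growth rates copied by hand from the world and US outputs above.

```python
import pandas as pd

# Mean annual log growth rates copied from the world and US outputs above.
means = pd.DataFrame(
    {'World': {'Output': 0.023150, 'TFP': 0.009915, 'Inputs': 0.013235},
     'USA': {'Output': 0.015012, 'TFP': 0.011983, 'Inputs': 0.003029}}
)

# Fraction of output growth attributed to TFP and to input growth.
shares = means.loc[['TFP', 'Inputs']] / means.loc['Output']
print(shares.round(3))
```

On these numbers TFP accounts for about 80 percent of US output growth, against about 43 percent for the world as a whole.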

And here is a graph of growth in indices of inputs & outputs since 1961:



In [14]:
select = df.xs('USA',level='WDI Code').dropna(how='any').droplevel(['Level','Country'])

select = select/select.loc[1961,:]
select.iplot()

Compare with India, which in recent years has had a level of
agricultural output close to that of the US:



In [15]:
select = df.xs('IND',level='WDI Code').dropna(how='any').droplevel(['Level','Country'])

select = select/select.loc[1961,:]
select.iplot()

And now look at ratios of inputs & outputs in India to the same in the
US:



In [21]:
USA = df.xs('USA',level='WDI Code').dropna(how='any').droplevel(['Level','Country'])

IND = df.xs('IND',level='WDI Code').dropna(how='any').droplevel(['Level','Country'])

select = (IND/USA)
select = select/select.loc[1961,:]


select.iplot(title='Indian inputs & output relative to US (1961 = 1)')
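
One thing to keep in mind with `IND/USA`: pandas aligns the two frames on their row index and column labels before dividing, so any year missing from either frame yields `NaN`. The toy sketch below uses made-up numbers and hypothetical frames `a` and `b` to show the alignment, and normalizes by the first year present in both frames rather than by a fixed year.

```python
import pandas as pd

# Made-up index values; frame b has no row for 1961 and a has none for 1964.
a = pd.DataFrame({'Output': [50.0, 55.0, 60.0]}, index=[1961, 1962, 1963])
b = pd.DataFrame({'Output': [80.0, 88.0, 80.0]}, index=[1962, 1963, 1964])

# Division aligns on the union of years; unmatched years become NaN.
ratio = a / b

# Normalizing by a row that is NaN would turn the whole series into NaN,
# so this sketch normalizes by the first year present in both frames.
common = ratio.dropna(how='any')
normalized = common / common.iloc[0]
print(normalized)
```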