## Lecture III: How is agricultural production increasing?



We've seen that growth in food production is typically greater than
population growth.  But where is this growth coming from?  

Food (and crops in particular) is the classical example of
production.  The "classical" economists of the 18th century, when
most income came from agriculture, thought there were three main
"factors" of production:

-   Land
-   Labor
-   Capital

Thus one might write the technical relationship between "factors" (or
inputs) and output as
$$
    \text{Crop output} = F(\mbox{Land},\mbox{Labor},\mbox{Capital}).
 $$



### Functional forms



We have good reason to think that $F$ displays constant returns to
scale; i.e., is homogeneous of degree one.  Write it as
$F(x_1,x_2,\dots,x_n)$ (thus abstracting from the classical factors
of production).  

Observationally, it's also often the case that the cost shares of
different factors of production remain constant, even when prices
change.  If we combine these facts (linear homogeneity, constant
cost shares) with an assumption that the farmers operating this
production function are profit-maximizing price-takers and a
technical assumption that $F$ is continuously differentiable, then
one can prove that $F$ is "Cobb-Douglas", or
$$
      F(x_1,x_2,\dots,x_n) = A\prod_{i=1}^nx_i^{\alpha_i},
  $$
where $\sum_{i=1}^n\alpha_i=1$.  This is a result first established
by the economist Paul Douglas and the mathematician Charles Cobb in 1928.
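
The converse direction is easy to check directly. As a sketch, suppose output sells at price $p$ and input $i$ costs $w_i$ (this price notation is an assumption introduced here for illustration). A profit-maximizing price-taker equates the value of each input's marginal product to its price, so for a Cobb-Douglas $F$
$$
      p\frac{\partial F}{\partial x_i} = p\,\alpha_i\frac{F(x_1,x_2,\dots,x_n)}{x_i} = w_i
      \quad\Longrightarrow\quad
      \frac{w_ix_i}{p\,F(x_1,x_2,\dots,x_n)} = \alpha_i.
  $$
Spending on input $i$ is thus a fixed fraction $\alpha_i$ of revenue, whatever the prices.  Because $\sum_{i=1}^n\alpha_i=1$, these payments exhaust revenue, so total cost equals revenue and $\alpha_i$ is also the cost share of input $i$.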

If we observe output $y^j_t$ and inputs $x^j_{it}$ at time $t$ for country $j$, we can take the
logarithm of the Cobb-Douglas production function, obtaining
$$
     \log y^j_t = \log A^j_t + \sum_{i=1}^n\alpha^j_i\log(x^j_{it}).
  $$
Note that we've allowed the cost-share parameters $\alpha$ to vary
across both inputs and countries, but *not* over time.



### Total Factor Productivity



The term $A$ is sometimes called "Total Factor Productivity" (TFP),
because increases in $A$ increase productivity of all factors.  If
we take differences in log output over time we get
$$
      \Delta\log y^j_t = \Delta\log A^j_t + \sum_{i=1}^n\alpha^j_i\Delta\log(x^j_{it}).
   $$
Recall that changes in logs approximate percent changes or growth
rates, so we can use this equation to decompose output growth into
growth in input use and TFP growth.
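
To make the decomposition concrete, here is a minimal sketch using made-up numbers (the cost shares `alpha`, the TFP path `A`, and the input paths `x` are all hypothetical, not taken from the data used below). It builds output from a Cobb-Douglas function and then recovers TFP growth as the residual of output growth over share-weighted input growth.

```python
import numpy as np

# Hypothetical cost shares for two inputs; they sum to one.
alpha = np.array([0.3, 0.7])

# Made-up paths for TFP and for the two inputs over four periods.
A = np.array([1.00, 1.02, 1.05, 1.06])
x = np.array([[10.0, 20.0],
              [10.5, 20.2],
              [11.0, 21.0],
              [11.2, 21.5]])

# Cobb-Douglas output: y_t = A_t * prod_i x_it**alpha_i
y = A * np.prod(x**alpha, axis=1)

# Growth accounting in log differences
dlog_y = np.diff(np.log(y))                        # output growth
dlog_inputs = np.diff(np.log(x), axis=0) @ alpha   # share-weighted input growth
dlog_tfp = dlog_y - dlog_inputs                    # TFP growth as a residual

print(dlog_tfp)
```

In this constructed example the residual coincides with the log differences of `A`, since the data were generated from the same functional form.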



### Data on Food Production



What’s happened to food production over recent decades?
See
[https://www.ers.usda.gov/data-products/international-agricultural-productivity/](https://www.ers.usda.gov/data-products/international-agricultural-productivity/).

Data on TFP, output, factor use, and factor shares can be found at
[https://docs.google.com/spreadsheets/d/1DLn9owcS7ggojJGWlI9vKSz0hqozn6cbcqNGWgzMZ8k](https://docs.google.com/spreadsheets/d/1DLn9owcS7ggojJGWlI9vKSz0hqozn6cbcqNGWgzMZ8k),
which is publicly readable.



In [1]:
import gspread_pandas as gsp

#### See gspread_authentication for discussion of this authentication process.
!gpg --batch --passphrase "noodle octopus" -d ../students-9093fa174318.json.gpg > ../students-9093fa174318.json
user_config = gsp.conf.get_config(conf_dir='../',file_name='students-9093fa174318.json')
user_creds = gsp.conf.get_creds(config=user_config)
client = gsp.Client(creds=user_creds)
####

# Get output for panel of countries
spread = gsp.Spread('https://docs.google.com/spreadsheets/d/1DLn9owcS7ggojJGWlI9vKSz0hqozn6cbcqNGWgzMZ8k',client=client)

Try getting data on output:



In [1]:
y = spread.sheet_to_df(sheet='Output',
                       index=3, # Column 3 for Country code
                       start_row=3 # Elide two rows of formatting
                       )
y.head()

We really just want the columns that begin with years (expressed as
strings), so



In [1]:
y = y[['%d' % t for t in range(1961,2020)]]

y.head()

We still have the problem that the numbers are strings containing commas.  Fix:



In [1]:
y = y.replace({',':''},regex=True) # Get rid of commas in number strings
y = y.replace({'':'NaN'}) # Change empty cells to NaN strings
y = y.astype(float) # Convert to floats
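
The same cleaning steps can be tried on a small stand-in for the sheet. This is only a sketch: the frame `raw` and its country codes are invented for illustration, and it uses `pd.to_numeric` with `errors='coerce'` as an alternative to the replace-then-`astype` approach above, so that empty cells (or anything else unparseable) become NaN in one step.

```python
import pandas as pd

# Hypothetical toy frame mimicking the raw sheet: numbers stored as strings.
raw = pd.DataFrame({'1961': ['1,234', ''], '1962': ['2,000.5', '17']},
                   index=['AAA', 'BBB'])

clean = raw.replace({',': ''}, regex=True)            # strip thousands separators
clean = clean.apply(pd.to_numeric, errors='coerce')   # unparseable cells become NaN

print(clean)
```

One trade-off to keep in mind: coercion silently turns any malformed entry into NaN, whereas `astype(float)` raises an error, which can be preferable when unexpected values should be noticed.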

Let's automate this for the other series we want:



In [1]:
import pandas as pd

def get_international_ag_productivity_data(spread,sheet):
    series = spread.sheet_to_df(sheet=sheet,
                       index=3, # Column 3 for Country code
                       start_row=3 # Elide two rows of formatting
                       )

    series = series[['%d' % t for t in range(1961,2020)]]

    series = series.replace({',':''},regex=True) # Get rid of commas in number strings
    series = series.replace({'':'NaN'}) # Change empty cells to NaN strings
    series = series.astype(float) # Convert to floats

    series = series.stack() # Make multi-index of country-year

    return series.loc[~series.index.duplicated(keep='first')] # Drop repeated country-year entries

Data = ['Output','Ag TFP','Ag Land','Irrig','Pasture','Labor','Livestock',
        'Machinery','Fertilizer','Feed']

D = {}
for sheet in Data:
    D[sheet] = get_international_ag_productivity_data(spread,sheet)

df = pd.DataFrame(D)
df
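
The last two steps of the function (stacking to a country-year index, then dropping repeated entries) can be seen on a toy frame. The values and country codes below are hypothetical; the duplicated row for `AAA` stands in for a country that appears twice in a sheet.

```python
import pandas as pd

# Hypothetical wide table: one row per country code, one column per year.
wide = pd.DataFrame({'1961': [1.0, 2.0, 9.0], '1962': [1.5, 2.5, 9.5]},
                    index=pd.Index(['AAA', 'BBB', 'AAA'], name='WDI Code'))

tall = wide.stack()                                    # index is now (WDI Code, year)
tall = tall.loc[~tall.index.duplicated(keep='first')]  # keep first entry per country-year

print(tall)
```

Keeping the first occurrence is a choice rather than a necessity; without a unique index, building a DataFrame from several such series would fail or produce ambiguous alignments.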

### Visualizing data on ag production



Plot the value of output for all countries from 1961 on:



In [1]:
import cufflinks as cf
cf.go_offline()

df['Output'].unstack().T.iplot(title="Value of Agricultural Output",
                               yTitle='Thousands of 2005-06 Dollars',
                               xTitle='Year')

Compare world growth in outputs, inputs, and TFP:



In [1]:
import numpy as np

# Select the world aggregate ('WLD') as a year-by-series table; treat zeros as missing
world = df.stack().unstack('WDI Code')['WLD'].unstack(1).replace(0,np.nan).dropna(how='any')

# Put in log differences
dworld = np.log(world).diff()
dworld['Inputs'] = dworld['Output'] - dworld['Ag TFP']

dworld.mean()
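
A note on interpreting these means: the average of log differences telescopes, so it depends only on the first and last observations. A minimal sketch with an invented series `y` (not drawn from the data above):

```python
import numpy as np
import pandas as pd

# Hypothetical output series for four years.
y = pd.Series([100.0, 103.0, 101.0, 110.0],
              index=['1961', '1962', '1963', '1964'])

avg_growth = np.log(y).diff().mean()   # mean of annual log differences (first NaN skipped)
endpoint = (np.log(y.iloc[-1]) - np.log(y.iloc[0])) / (len(y) - 1)

print(avg_growth, endpoint)
```

The two numbers agree, which means year-to-year volatility in between does not affect the reported average growth rate.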

And a graph of growth rates:



In [1]:
dworld[['Output','Inputs','Ag TFP']].iplot(title="Growth rates of output, inputs, & TFP",
                                           xTitle="Year")

That's the overall picture for the world.  Now "drill down" and
consider just production in the US:



In [1]:
select = df[df.index.isin(['USA'],level=0)].dropna(how='any')

dselect = np.log(select).diff()
dselect['Inputs'] = dselect['Output'] - dselect['Ag TFP']

dselect.mean()

And here is a graph of growth in indices of inputs & outputs since 1961:



In [1]:
select = df[df.index.isin(['USA'],level=0)].dropna(how='any')
select = select/select.loc[('USA','1961'),:]
select.iplot()
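
The division above turns each series into an index equal to one in the base year. Selecting a single row with `.loc` returns a Series labelled by column name, and dividing a DataFrame by that Series matches on column labels. A sketch on invented numbers (the country code `AAA` and the values are hypothetical):

```python
import pandas as pd

# Hypothetical country-year panel with two series.
idx = pd.MultiIndex.from_product([['AAA'], ['1961', '1962', '1963']],
                                 names=['WDI Code', 'Year'])
toy = pd.DataFrame({'Output': [50.0, 60.0, 75.0],
                    'Labor': [10.0, 9.0, 8.0]}, index=idx)

base = toy.loc[('AAA', '1961'), :]   # Series indexed by column name
indexed = toy / base                 # each column divided by its base-year value

print(indexed)
```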

Compare with India, which in recent years has had a level of
agricultural output close to that of the US:



In [1]:
select = df[df.index.isin(['IND'],level=0)].dropna(how='any')
select = select/select.loc[('IND','1961'),:]
select.iplot()

And now look at ratios of inputs & outputs in India to the same in the
US:



In [1]:
USA = df[df.index.isin(['USA'],level=0)].dropna(how='any')
USA.index = USA.index.droplevel('WDI Code')
IND = df[df.index.isin(['IND'],level=0)].dropna(how='any')
IND.index = IND.index.droplevel('WDI Code')

select = IND/USA
select.iplot(title='Indian inputs & output as proportion of US')