## API Package: usda_com 

This notebook walks through a simple use case for the USDA Commodity Database. The `usda_com` package can be installed directly from GitHub with `pip`, as shown in the install cell further down.


In [3]:
import pandas as pd 

In [14]:
data = pd.read_csv('../usda_com/data/commodities.csv')

In [11]:
data = data.iloc[:, 1:]

In [15]:
data.head()

Unnamed: 0,Commodity Code,Commodity Name
0,577400,"Almonds, Shelled Basis"
1,11000,"Animal Numbers, Cattle"
2,13000,"Animal Numbers, Swine"
3,574000,"Apples, Fresh"
4,430000,Barley


In [13]:
data.to_csv('../usda_com/data/commodities.csv', index=False)

In [1]:
! pip install git+https://github.com/LightnerAndrew/usda_com --quiet

In [17]:
import os
os.chdir('..')

In [18]:
import usda_com

# create the query object. 
qry = usda_com.query()

### API Key

Before querying the API, create an account and generate an API key on the USDA PSD Online site: https://apps.fas.usda.gov/psdonline/app/index.html#/app/about

1. Create an account.
2. Generate an API key.

In [19]:
qry.api_key = 'YOUR-API-KEY'  # replace with the key generated for your account
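
Hard-coding a key in a shared notebook exposes it to anyone who can read the file. One alternative is to read the key from an environment variable. The following is a minimal sketch of that idea; the helper `load_api_key` and the variable name `USDA_API_KEY` are assumptions made for illustration and are not part of `usda_com`.

```python
import os


def load_api_key(env_var="USDA_API_KEY"):
    """Return the API key stored in the given environment variable.

    Both this helper and the default variable name are hypothetical;
    usda_com does not define them.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

With the variable set in the shell that launches the notebook, the key could then be assigned with `qry.api_key = load_api_key()`.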

### Explore commodity options

The `commodity_options` attribute holds commodity code information. 

In [20]:
qry.commodity_options.head()

Unnamed: 0,Commodity Code,Commodity Name
0,577400,"Almonds, Shelled Basis"
1,11000,"Animal Numbers, Cattle"
2,13000,"Animal Numbers, Swine"
3,574000,"Apples, Fresh"
4,430000,Barley


Although you could query the DataFrame directly with pandas, the `find_commodity_code()` method offers a convenient way to search the commodity names for those of interest.

For this example, we will access data for fresh apples. 

In [21]:
qry.find_commodity_code('Apples')

Unnamed: 0,Commodity Code,Commodity Name
3,574000,"Apples, Fresh"
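
For comparison, the same lookup can be done directly with pandas. This is a sketch only: it rebuilds the five rows shown earlier as a stand-in for `qry.commodity_options`, and it is not a claim about how `find_commodity_code()` works internally.

```python
import pandas as pd

# Stand-in for qry.commodity_options: the five rows shown earlier in this notebook.
commodities = pd.DataFrame(
    {
        "Commodity Code": [577400, 11000, 13000, 574000, 430000],
        "Commodity Name": [
            "Almonds, Shelled Basis",
            "Animal Numbers, Cattle",
            "Animal Numbers, Swine",
            "Apples, Fresh",
            "Barley",
        ],
    }
)

# Case-insensitive substring match on the name column.
mask = commodities["Commodity Name"].str.contains("apples", case=False, regex=False)
matches = commodities[mask]
print(matches)
```

The original row index (3) is preserved in `matches`, which mirrors the output of the method call above.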


In [22]:
qry.com_selection = ['574000']

### Year selection 

The USDA data runs from 1960 through the most recent available year (2018 at the time this notebook was created).

The `year_selection` attribute takes a string in one of two formats:
1. `{year0},{year1},...` selects only the years listed.
2. `{year0}:{year1}` selects all years in the given range.

In [23]:
qry.year_selection = '1990,2000'
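
To make the two string formats concrete, here is a minimal sketch of how such a string could be expanded into a list of years. `parse_year_selection` is a hypothetical helper, not the parser used by `usda_com`, and it assumes that the range form includes both endpoints.

```python
def parse_year_selection(selection):
    """Expand a year-selection string into a list of integer years.

    Hypothetical helper, not the usda_com implementation. Assumes the
    range form 'start:end' includes both endpoints.
    """
    selection = selection.strip()
    if ":" in selection:
        start, end = (int(part) for part in selection.split(":"))
        if start > end:
            raise ValueError("range start must not exceed range end")
        return list(range(start, end + 1))
    # Comma-separated form: keep only the years that are listed.
    return [int(part) for part in selection.split(",") if part.strip()]


print(parse_year_selection("1990,2000"))  # [1990, 2000]
print(parse_year_selection("1990:1993"))  # [1990, 1991, 1992, 1993]
```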

### Access Data!

In [24]:
raw_data = qry.run()

In [25]:
raw_data.head()

Unnamed: 0,AttributeDescription,AttributeId,CalendarYear,CommodityCode,CommodityDescription,CountryCode,CountryName,MarketYear,Month,UnitDescription,UnitId,Value
0,Area Planted,1,,574000,"Apples, Fresh",AF,Afghanistan,1990,12,(HA),12,0.0
1,Area Harvested,4,,574000,"Apples, Fresh",AF,Afghanistan,1990,12,(HA),12,0.0
2,Bearing Trees,17,,574000,"Apples, Fresh",AF,Afghanistan,1990,12,(1000 TREES),10,0.0
3,Non-Bearing Trees,19,,574000,"Apples, Fresh",AF,Afghanistan,1990,12,(1000 TREES),10,0.0
4,Total Trees,16,,574000,"Apples, Fresh",AF,Afghanistan,1990,12,(1000 TREES),10,0.0


In [14]:
raw_data.columns

Index(['AttributeDescription', 'AttributeId', 'CalendarYear', 'CommodityCode',
       'CommodityDescription', 'CountryCode', 'CountryName', 'MarketYear',
       'Month', 'UnitDescription', 'UnitId', 'Value'],
      dtype='object')

In [15]:
raw_data.MarketYear.unique()

array(['1990', '2000'], dtype=object)
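
The output above shows that `MarketYear` values come back as strings (`dtype=object`). A minimal sketch of converting them to integers before filtering or sorting numerically, using the two values shown above as a stand-in for the full column:

```python
import pandas as pd

# The unique MarketYear values shown above, stored as strings.
market_year = pd.Series(["1990", "2000"], name="MarketYear")

# Convert to integers so that numeric comparisons behave as expected.
market_year_int = market_year.astype(int)

print(market_year_int[market_year_int >= 1995].tolist())  # [2000]
```

On the full frame this would be `raw_data['MarketYear'] = raw_data['MarketYear'].astype(int)`, assuming every value is a plain year string.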