# EIA.gov

- https://www.eia.gov/opendata/
- https://www.eia.gov/opendata/qb.cfm

## Documentation
- https://www.eia.gov/opendata/commands.php


In [212]:
# load our requirements
import requests
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pprint import pprint
from datetime import datetime 
plt.style.use('ggplot')

### API key

In [213]:
API_KEY = '09d7859f71b9796ca3b1e0c50da15d5b'

---
# API Category Query

Gets the name and id of a single category, and also lists the names and ids of its child categories.

## `http://api.eia.gov/category/?api_key=YOUR_API_KEY_HERE[&category_id=nn][&out=xml|json]`

- **category_id**: Optional. A unique numerical id of the category to fetch. If missing, the API's root category is fetched.

In [214]:
def category(api_key=None, category_id=None):
    '''
    input: api_key, category_id
    return: dict of category
    '''
    url = 'http://api.eia.gov/category/'
    payload = {'api_key':api_key,
               'category_id':category_id}
    r = requests.get(url, params=payload)
    return r.json()

In [215]:
category_id = '371'
eia = category(API_KEY, category_id)

In [216]:
eia

{'request': {'category_id': 371, 'command': 'category'},
 'category': {'category_id': '371',
  'parent_category_id': None,
  'name': 'EIA Data Sets',
  'notes': '',
  'childcategories': [{'category_id': 0, 'name': 'Electricity'},
   {'category_id': 40203, 'name': 'State Energy Data System (SEDS)'},
   {'category_id': 711224, 'name': 'Total Energy'},
   {'category_id': 714755, 'name': 'Petroleum'},
   {'category_id': 714804, 'name': 'Natural Gas'},
   {'category_id': 717234, 'name': 'Coal'},
   {'category_id': 829714, 'name': 'Short-Term Energy Outlook'},
   {'category_id': 964164, 'name': 'Annual Energy Outlook'},
   {'category_id': 1292190, 'name': 'Crude Oil Imports'},
   {'category_id': 2123635, 'name': 'U.S. Electric System Operating Data'},
   {'category_id': 2134384, 'name': 'International Energy Data'},
   {'category_id': 2251604, 'name': 'CO2 Emissions'},
   {'category_id': 2631064, 'name': 'International Energy Outlook'},
   {'category_id': 2889994, 'name': 'U.S. Nuclear Outag
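
The `childcategories` list in a category response can be flattened into a name-to-id lookup, which makes it easier to walk down the tree one level at a time. The following is a minimal sketch, assuming the response has the shape shown in the output above; `child_categories` is a hypothetical helper name, and the sample dict is trimmed from that output rather than fetched live.

```python
def child_categories(response):
    """Return {name: category_id} for the children listed in a category response."""
    children = response.get('category', {}).get('childcategories', [])
    return {child['name']: child['category_id'] for child in children}


# Sample trimmed from the output above; a live call would pass the dict
# returned by the category() function instead.
sample = {
    'category': {
        'category_id': '371',
        'name': 'EIA Data Sets',
        'childcategories': [
            {'category_id': 0, 'name': 'Electricity'},
            {'category_id': 40203, 'name': 'State Energy Data System (SEDS)'},
            {'category_id': 714755, 'name': 'Petroleum'},
        ],
    }
}

children = child_categories(sample)
print(children['Petroleum'])
```

Passing one of the returned ids back in as `category_id` would then fetch the next level of the tree.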

---

# API Series Query


Returns the series ID followed by the series data as an array of date-value pairs. Dates are formatted as yyyy, yyyyQq, yyyymm, and yyyymmdd for annual, quarterly, monthly, and daily/weekly data respectively. Values are either numeric if valid, "null" if missing, "w" if withheld, or "*" if statistically insignificant. Additional codes may be defined in future releases.
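
The date formats and value codes described above can be normalized before analysis. This is a minimal sketch, not part of the EIA API: `parse_period` and `parse_value` are hypothetical helpers, each period is mapped to the first day it covers, and any value that cannot be read as a number (including codes added in future releases) is treated as missing.

```python
from datetime import date


def parse_period(period):
    """Convert an EIA date string to the first calendar day of that period."""
    year = int(period[:4])
    if len(period) == 4:                           # yyyy (annual)
        return date(year, 1, 1)
    if len(period) == 6 and period[4] in 'Qq':     # yyyyQq (quarterly)
        quarter = int(period[5])
        return date(year, 3 * (quarter - 1) + 1, 1)
    if len(period) == 6:                           # yyyymm (monthly)
        return date(year, int(period[4:6]), 1)
    if len(period) == 8:                           # yyyymmdd (daily/weekly)
        return date(year, int(period[4:6]), int(period[6:8]))
    raise ValueError('Unrecognized period format: %r' % period)


def parse_value(value):
    """Return a float for numeric values, or None for codes such as 'null', 'w', '*'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


print(parse_period('2019Q3'), parse_value('w'))
```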


## `http://api.eia.gov/series/?series_id=sssssss&api_key=YOUR_API_KEY_HERE[&num=][&out=xml|json]`

**series_id**: Required. The series id (also called source key) is a case-insensitive string consisting of letters, numbers, dashes ("-") and periods (".") that uniquely identifies an EIA series. Multiple series can be fetched in a single request by using a semicolon-separated list of series ids. The number of series in a single request is limited to 100.
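
Because a single request is limited to 100 series, a long list of ids has to be split into batches before it is sent. A minimal sketch of that batching follows; `batch_series_ids` is a hypothetical helper, and the ids in the usage lines are placeholders rather than real EIA series.

```python
def batch_series_ids(series_ids, batch_size=100):
    """Yield semicolon-joined strings holding at most `batch_size` series ids each."""
    ids = list(series_ids)
    for start in range(0, len(ids), batch_size):
        yield ';'.join(ids[start:start + batch_size])


# Placeholder ids for illustration only; they are not real EIA series.
example_ids = ['EXAMPLE.SERIES.%d.M' % i for i in range(250)]
batches = list(batch_series_ids(example_ids))
print(len(batches))
```

Each string in `batches` can be used as the `series_id` value of one request.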



In [217]:
def series(api_key=None, series_id=None):
    '''
    input: api_key, series_id
    return: dict of series, or None if no series_id is given
    '''
    # series_id is required by the API, so bail out early without it
    if series_id is None:
        print('A series_id value is required')
        return None
    url = 'http://api.eia.gov/series/'
    payload = {'api_key':api_key,
               'series_id':series_id}
    r = requests.get(url, params=payload)
    return r.json()

In [218]:
series_id = 'ELEC.GEN.ALL-TX-98.M'

s = series(API_KEY,series_id)

In [220]:
s.keys()

dict_keys(['request', 'series'])

In [221]:
s['series'][0].keys()

dict_keys(['series_id', 'name', 'units', 'f', 'description', 'copyright', 'source', 'iso3166', 'geography', 'start', 'end', 'updated', 'data'])

In [222]:
# use a new name here so the series() function defined above is not overwritten
series_data = s['series'][0]

In [224]:
series_data['name']

'Net generation : all fuels : Texas : electric power (total) : monthly'

### Get data

In [227]:
# series_data['data']
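
The `data` field holds the date-value pairs described in the series documentation. A minimal sketch of turning those pairs into records sorted oldest first is shown here, assuming each pair looks like `['201901', 1.0]`; `data_to_records` is a hypothetical helper, and the numbers in the sample are placeholder values, not real EIA data.

```python
def data_to_records(series_entry):
    """Return a series' observations as dicts, sorted oldest first."""
    records = [{'period': period, 'value': value}
               for period, value in series_entry.get('data', [])]
    # Periods within one series share a format, so string order is date order.
    return sorted(records, key=lambda record: record['period'])


# Placeholder values for illustration only; they are not real EIA data.
sample_series = {
    'series_id': 'ELEC.GEN.ALL-TX-98.M',
    'data': [['201903', 3.0], ['201902', 2.0], ['201901', 1.0]],
}

records = data_to_records(sample_series)
print(records[0])
```

The resulting list can be passed straight to `pd.DataFrame(records)` to get a two-column frame.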

---
# API Geoset Query

Gets a set of the series belonging to the geoset requested by the `geoset_id` input parameter and matching the list of regions defined in the `regions` input parameter. Only the series matching the requested regions are returned. The fields of each series returned are described in the series command documentation.

A geoset is a relational metadata structure that organizes time series into sets that can be mapped. The geoset command is used by the free EIA Visualization Library to create embeddable interactive maps. The API Browser contains code snippets and live examples of how to map each geoset contained in the EIA data API. EIA's entire State Energy Data System, Coal, Electricity, and International data sets are organized into geosets and can be mapped using the library. Follow the links for examples of how to use the EIA Visualization Library with no coding required.

Coders wanting to create their own visualization library can call the geoset command using the following parameters:

### `http://api.eia.gov/geoset/?geoset_id=sssssss&regions=region1,region2,region3,...&api_key=YOUR_API_KEY_HERE[&start=|&num=][&end=][&out=xml|json]`


- **geoset_id**: Required. The geoset id (also called source key) is a case-insensitive string consisting of letters, numbers, dashes ("-") and periods (".") that uniquely identifies an EIA geoset.

- **regions**: Required. A semicolon-separated list of region codes requested. Series whose geoset_id and region fields match will be returned.


In [228]:
# example
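
No example is filled in for the geoset command, so here is a minimal sketch of how the request URL could be assembled from the parameters listed above. `geoset_request_url` is a hypothetical helper, and the geoset id and region codes in the usage lines are illustrative assumptions, not values confirmed by this notebook. The separator defaults to a semicolon, as the parameter description states, but the URL template shows commas, so it is left configurable.

```python
from urllib.parse import urlencode

GEOSET_URL = 'http://api.eia.gov/geoset/'


def geoset_request_url(api_key, geoset_id, regions, sep=';', **optional):
    """Build a geoset request URL; `optional` may hold start, end, num, or out."""
    if not geoset_id or not regions:
        raise ValueError('geoset_id and regions are both required')
    params = {'api_key': api_key,
              'geoset_id': geoset_id,
              'regions': sep.join(regions)}
    # Only include optional parameters that were actually provided.
    params.update({key: value for key, value in optional.items()
                   if value is not None})
    return GEOSET_URL + '?' + urlencode(params)


# Illustrative geoset id and region codes (assumptions, not verified here).
url = geoset_request_url('YOUR_API_KEY_HERE', 'ELEC.GEN.ALL-99.A',
                         ['USA-CA', 'USA-TX'], num=12)
print(url)
```

The resulting string can be handed to `requests.get(url)` in the same way as the other queries in this notebook.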

---
# API Relation Query

Gets a set of the series belonging to the relation requested for the region requested. A relation is an EIA defined metadata structure that indicates breakdowns or details of summary statistics into composite statistics. Relations are defined between geosets, and therefore apply to all of the geoset's time series.

The relation command is used by the free EIA Visualization Library to create embeddable interactive maps displaying these breakdowns. The API Browser contains code snippets and live examples of how to create interactive visualizations of the relations contained in the EIA data API. Relations can be found, when applicable, in EIA's State Energy Data System, Coal, Electricity, and International data sets. Follow the links for examples of how to use the EIA Visualization Library to create interactive maps and charts using relationships (no coding required).

Coders wanting to create their own visualization library can call the relation command using the following parameters:

### `http://api.eia.gov/relation/?relation_id=rrrrrrr&region=region1&api_key=YOUR_API_KEY_HERE[&start=|&num=][&end=][&out=xml|json]`

- **relation_id**: Required. The relation id is a case-insensitive string consisting of letters, numbers, dashes ("-") and periods (".") that uniquely identifies an EIA relation.

- **region**: Required. The region code requested. Series whose relation_id and region fields match will be returned.

In [201]:
eia.keys()

dict_keys(['request', 'category'])

---
# API Updates Data Query

Many applications will need to maintain a copy of EIA data to drive heavy data-processing and republishing operations. The update query allows your application to efficiently stay current with EIA's data releases while staying within the Terms of Service agreement, which prohibits excessive server requests, such as repeatedly requesting all the data series in the EIA API. Currently, the EIA API contains 465,000 electricity series organized into 39,000 categories. As we add petroleum, natural gas, international, and state estimates over the coming months, this number will swell to over a million series. Continuously requesting all the series in the EIA API may lead to termination of your license key. The update query avoids this problem by allowing your application to find out whether anything has been updated (in electricity prices, for example) and to request data only for the series that have been updated, using the series/data query.

Returns a paginated list of series in descending order by the series' last updated date (i.e. most recent updates first). Only the series_id and the series updated date are returned. If a category_id is specified, only series belonging to that category are checked. If a start category is not specified, the query defaults to the API's root category. If the optional variable "deep" is set to true, the entire branch of the category tree is checked for updates; otherwise only series belonging to the specified category are checked.

### `http://api.eia.gov/updates/?api_key=YOUR_API_KEY_HERE[&category_id=X][&deep=true|false][&firstrow=nnnnn][&rows=nn][&out=xml|json]`

- **category_id**: Optional. A unique numerical id of the start category to fetch. If missing, the API's root category is fetched.

- **deep**: Optional. If true, include the series in all descendent categories. If missing or false, only series directly in the start category will be returned.

- **rows**: Optional. Determines the maximum number of rows returned for each request, up to 10,000. A missing or invalid value results in a default of 50 rows returned with each call.

- **firstrow**: Optional. Integer specifying the zero-based index of the first row to return, providing a means to page through the updated series. Note that it is possible to page through all of the API's series in this manner.

In [126]:
# example
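
No example is filled in for the updates command, so the paging logic implied by `firstrow` and `rows` is sketched here. This is a minimal sketch under stated assumptions: `iter_updates` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever function performs the actual request and returns one page of update records as a list. The layout of the live response is not shown in this notebook, so a fake in-memory fetcher is used for the demonstration.

```python
def iter_updates(fetch_page, rows=50, max_rows=10000):
    """Yield update records page by page until a short (or empty) page arrives."""
    rows = max(1, min(rows, max_rows))   # the API caps rows at 10,000
    firstrow = 0                         # zero-based index of the first row
    while True:
        page = fetch_page(firstrow=firstrow, rows=rows)
        yield from page
        if len(page) < rows:             # fewer rows than requested: last page
            break
        firstrow += rows


# Fake data and fetcher for illustration only; no request is made.
fake_updates = [{'series_id': 'EXAMPLE.%d' % i} for i in range(120)]
requested = []


def fake_fetch(firstrow, rows):
    requested.append(firstrow)
    return fake_updates[firstrow:firstrow + rows]


all_updates = list(iter_updates(fake_fetch, rows=50))
print(len(all_updates), requested)
```

In live use, `fetch_page` would wrap a `requests.get` call to the updates URL with `firstrow` and `rows` added to the payload.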

---
# API Search Data Query
Returns the series ID as an array followed by series facet data as an array. Additional codes may be defined in future releases.



In [162]:
def search(search_term=None, search_value=None):
    url = 'http://api.eia.gov/search/'
    payload = {'search_term':search_term,
               'search_value':search_value}
    r = requests.get(url, params=payload)
    return r.json()

In [182]:
search_term = 'series_id'
search_value = 'PET.MB'
s = search(search_term,search_value)
#s = search()

In [183]:
s.keys()

dict_keys(['responseHeader', 'response', 'facet_counts'])

In [184]:
s['response'].keys()

dict_keys(['numFound', 'start', 'docs'])

In [187]:
s['response']['numFound']

116

In [190]:
s['responseHeader'].keys()

dict_keys(['status', 'QTime', 'params'])

In [191]:
s['responseHeader']['params'].keys()

dict_keys(['q', 'facet.field', 'indent', 'fl', 'start', 'facet.mincount', 'sort', 'rows', 'version', 'wt', 'facet', 'facet.sort'])

In [192]:
s['responseHeader']['params']

{'q': '(series_id:PET.MB)',
 'facet.field': ['data_set',
  'frequency',
  'iso3166',
  'units',
  'region',
  'region_2',
  'last_updated'],
 'indent': 'on',
 'fl': 'series_id,name,units,frequency',
 'start': '0',
 'facet.mincount': '1',
 'sort': 'name_len asc,series_id asc',
 'rows': '10',
 'version': '2.2',
 'wt': 'json',
 'facet': 'true',
 'facet.sort': 'count'}

In [193]:
s['response'].keys()

dict_keys(['numFound', 'start', 'docs'])

In [135]:
s['facet_counts'].keys()

dict_keys(['facet_queries', 'facet_fields', 'facet_ranges', 'facet_intervals', 'facet_heatmaps'])
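
The pieces inspected above (`numFound`, `start`, `docs`) can be combined into a short summary of a search result. This is a minimal sketch, assuming each entry in `docs` carries the fields listed in `fl` (`series_id`, `name`, `units`, `frequency`); `summarize_search` is a hypothetical helper, and the documents in the sample are placeholders, not real search hits.

```python
def summarize_search(result):
    """Summarize how many series matched and which ids came back on this page."""
    response = result.get('response', {})
    docs = response.get('docs', [])
    found = int(response.get('numFound', 0))
    start = int(response.get('start', 0))
    return {
        'found': found,
        'returned': len(docs),
        'remaining': max(found - start - len(docs), 0),
        'series_ids': [doc.get('series_id') for doc in docs],
    }


# Placeholder documents for illustration only; they are not real search hits.
sample_result = {
    'response': {
        'numFound': 116,
        'start': 0,
        'docs': [
            {'series_id': 'PET.EXAMPLE_1.M', 'name': 'placeholder', 'frequency': 'M'},
            {'series_id': 'PET.EXAMPLE_2.M', 'name': 'placeholder', 'frequency': 'M'},
        ],
    }
}

summary = summarize_search(sample_result)
print(summary['found'], summary['returned'], summary['remaining'])
```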