## Configure PUDL
The `.pudl.yml` configuration file tells PUDL where to look for data. Uncomment the next cell and run it if you're on our 2i2c JupyterHub.

In [1]:
#!cp ~/shared/shared-pudl.yml ~/.pudl.yml

In [2]:
# import the necessary packages
%load_ext autoreload
%autoreload 2

import os
import sys

# You can ignore this. It suppresses an unimportant warning.
os.environ["USE_PYGEOS"] = '0'

import pandas as pd
import sqlalchemy as sa
import pudl

# Using the PUDL output layer
The PUDL database tables are a clean, [normalized](https://en.wikipedia.org/wiki/Database_normalization) version of US electricity data. Normalized tables are great for databases and storage, but for interactive use we often want a version of the data that includes plant and utility names and other associated information in a single dataframe. These are "denormalized" tables. In addition to the referenced names and attributes such as latitude, longitude, or state, the denormalized tables often contain frequently calculated derived values (for example, `total_fuel_cost` calculated from `total_heat_content_mmbtu` and `fuel_cost_per_mmbtu`). The Catalyst team developed a tool for accessing denormalized tables, which we call the PUDL output object.
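
As a rough illustration of what denormalization means here, the sketch below joins two toy normalized tables with pandas and adds a derived cost column. The table contents are invented for illustration (only the plant names echo values that appear later in this notebook), and this is not necessarily how PUDL builds its outputs internally.

```python
import pandas as pd

# Toy normalized tables: plant attributes live in one table,
# and fuel records refer to plants only by ID. All values are invented.
plants = pd.DataFrame(
    {
        "plant_id_eia": [3, 2],
        "plant_name_eia": ["Barry", "Bankhead Dam"],
        "state": ["AL", "AL"],
    }
)
fuel = pd.DataFrame(
    {
        "plant_id_eia": [3, 3],
        "report_date": pd.to_datetime(["2008-01-01", "2008-02-01"]),
        "total_heat_content_mmbtu": [100.0, 200.0],
        "fuel_cost_per_mmbtu": [2.0, 2.5],
    }
)

# Denormalize: attach the plant attributes to every fuel record.
denorm = fuel.merge(plants, on="plant_id_eia", how="left", validate="many_to_one")

# Add a commonly needed derived value.
denorm["total_fuel_cost"] = (
    denorm["total_heat_content_mmbtu"] * denorm["fuel_cost_per_mmbtu"]
)
print(denorm[["report_date", "plant_name_eia", "state", "total_fuel_cost"]])
```

The `validate="many_to_one"` argument makes the merge fail loudly if the plant table ever contains duplicate IDs, which is a cheap guard against accidentally multiplying rows.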

## What does the output layer provide?

Right now the output layer provides access to three different kinds of things:
 * denormalized tables
 * analytical outputs
 * partially integrated PUDL datasets that aren't in the database yet

## Why is the output layer useful?
Some benefits of using the output layer:
 * **Standardized denormalization:** You don't have to manually join the same tables together to get access to common attributes.
 * **Table caching:** Many analyses rely on using the same table multiple times. The PUDL output object caches the tables in memory as pandas dataframes so you don't have to read tables from the database over and over again.
 * **Time series aggregation:** Some tables are annual, some monthly, some hourly. When you create a PUDL output object you can tell it to aggregate the data to either monthly or annual resolution for analysis.
 * **Standardized filling-in of missing data:** There's a ton of missing or incomplete data. If requested, the output objects will use rolling averages and data from the EIA API to try to fill in some of that missing data.
 
## Output layer caveats
* Relying on this output layer means you need to install the whole PUDL Python package and all of its dependencies.
* Many of those dependencies are really only needed for producing the data, not using it.
* If you want to use SQL directly, or R, or any other set of tools to work with the data, the output layer is not helpful.
* In future releases we plan to load more of the derived data outputs directly into their own database tables for distribution.
* We will also build denormalized tables that live inside the database as stored queries (database "views") so that all of this infrastructure is available to everyone, without needing to rely on the Python software environment.
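
To make the idea of a denormalized table stored as a database view concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The table names, view name, and values are hypothetical and do not reflect the actual PUDL schema.

```python
import sqlite3

# In-memory database with two toy normalized tables (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE plants (plant_id INTEGER PRIMARY KEY, plant_name TEXT, state TEXT);
    CREATE TABLE generators (
        plant_id INTEGER REFERENCES plants(plant_id),
        generator_id TEXT,
        capacity_mw REAL
    );
    INSERT INTO plants VALUES (3, 'Barry', 'AL');
    INSERT INTO generators VALUES (3, '1', 153.1), (3, '2', 153.1);

    -- The view stores the join as a query, not as a copy of the data.
    CREATE VIEW denorm_generators AS
    SELECT g.plant_id, p.plant_name, p.state, g.generator_id, g.capacity_mw
    FROM generators AS g
    JOIN plants AS p ON g.plant_id = p.plant_id;
    """
)

# Any SQL client can now read the denormalized result like a table.
rows = conn.execute(
    "SELECT plant_name, state, generator_id, capacity_mw "
    "FROM denorm_generators ORDER BY generator_id"
).fetchall()
print(rows)
conn.close()
```

Because the view lives in the database file itself, it would be usable from R, a SQL shell, or any other tool that can open SQLite, with no Python environment required.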

# Instantiating Output Objects
* Grab the `pudl_settings`
* Create a connection to the PUDL Database
* Instantiate a `PudlTabl` object with that connection

In [3]:
pudl_settings = pudl.workspace.setup.get_defaults()
pudl_settings

{'pudl_in': '/Users/zane/code/catalyst/pudl-work',
 'data_dir': '/Users/zane/code/catalyst/pudl-work/data',
 'settings_dir': '/Users/zane/code/catalyst/pudl-work/settings',
 'pudl_out': '/Users/zane/code/catalyst/pudl-work',
 'sqlite_dir': '/Users/zane/code/catalyst/pudl-work/sqlite',
 'parquet_dir': '/Users/zane/code/catalyst/pudl-work/parquet',
 'ferc1_db': 'sqlite:////Users/zane/code/catalyst/pudl-work/sqlite/ferc1.sqlite',
 'ferc1_xbrl_db': 'sqlite:////Users/zane/code/catalyst/pudl-work/sqlite/ferc1_xbrl.sqlite',
 'ferc1_xbrl_datapackage': PosixPath('/Users/zane/code/catalyst/pudl-work/sqlite/ferc1_xbrl_datapackage.json'),
 'ferc1_xbrl_taxonomy_metadata': PosixPath('/Users/zane/code/catalyst/pudl-work/sqlite/ferc1_xbrl_taxonomy_metadata.json'),
 'ferc2_xbrl_db': 'sqlite:////Users/zane/code/catalyst/pudl-work/sqlite/ferc2_xbrl.sqlite',
 'ferc2_xbrl_datapackage': PosixPath('/Users/zane/code/catalyst/pudl-work/sqlite/ferc2_xbrl_datapackage.json'),
 'ferc2_xbrl_taxonomy_metadata': Posi

In [4]:
pudl_engine = sa.create_engine(pudl_settings["pudl_db"])
pudl_engine

Engine(sqlite:////Users/zane/code/catalyst/pudl-work/sqlite/pudl.sqlite)

In [5]:
# this configuration will return tables without aggregating by a time frequency... we'll explore that more below.
pudl_out = pudl.output.pudltabl.PudlTabl(pudl_engine=pudl_engine)

## List the output object methods
* There are dozens of different data access methods within the `PudlTabl` object.
* You can read more about them in the [PUDL API documentation](https://catalystcoop-pudl.readthedocs.io/en/latest/autoapi/pudl/output/pudltabl/index.html#pudl.output.pudltabl.PudlTabl)
* If you type `pudl_out.` and press `Tab`, you'll see a list of available methods as well. With the cursor on a method name, pressing `Shift` and `Tab` at the same time shows its signature and docstring.
* You can also see their docstrings by running:

```python
help(pudl_out)
```

In [6]:
#help(pudl_out)
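
If the full `help()` output is too long to scan, a shorter listing can be built with plain Python introspection. The sketch below uses a small stand-in class (`ToyOutput` and the `public_methods` helper are hypothetical names, not part of PUDL); calling the helper on `pudl_out` instead should behave similarly, though that is an assumption rather than something tested here.

```python
class ToyOutput:
    """Stand-in for an output object; NOT the real PudlTabl class."""

    def gens_eia860(self):
        """Return a denormalized generators table (toy placeholder)."""
        return []

    def plants_eia860(self):
        """Return a denormalized plants table (toy placeholder)."""
        return []

    def _private_helper(self):
        return None


def public_methods(obj):
    """Map each public callable attribute name to the first line of its docstring."""
    summary = {}
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and dunder attributes
        attr = getattr(obj, name)
        if callable(attr):
            doc = (attr.__doc__ or "").strip()
            summary[name] = doc.splitlines()[0] if doc else ""
    return summary


methods = public_methods(ToyOutput())
for name, doc in methods.items():
    print(f"{name}: {doc}")
```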

## Basic Functionality

### Read a denormalized table
* Each of the output object methods returns a pandas DataFrame.
* Most of them correspond to a single database table, and will select all the data in that table, and automatically join it with some other useful information.
* Many of the access methods use an abbreviated name for the database table. E.g. the following reads all the data out of the `generators_eia860` table.
* Some of the methods fill in missing data and may produce logging output (depending on what parameters are used to instantiate `pudl_out`)

In [7]:
%%time
pudl_out.gens_eia860()

2022-12-22 01:22:03 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:22:05 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:22:06 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:22:07 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:22:07 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:22:07 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:22:07 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 36.3 s, sys: 2.85 s, total: 39.2 s
Wall time: 39.5 s


Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,generator_id,associated_combined_heat_power,bga_source,bypass_heat_recovery,capacity_mw,carbon_capture,city,cofire_fuels,county,current_planned_operating_date,data_maturity,deliver_power_transgrid,distributed_generation,duct_burners,energy_source_1_transport_1,energy_source_1_transport_2,energy_source_1_transport_3,energy_source_2_transport_1,energy_source_2_transport_2,energy_source_2_transport_3,energy_source_code_1,energy_source_code_2,energy_source_code_3,energy_source_code_4,energy_source_code_5,energy_source_code_6,energy_storage_capacity_mwh,ferc_qualifying_facility,fluidized_bed_tech,fuel_type_code_pudl,fuel_type_count,latitude,longitude,minimum_load_mw,multiple_fuels,nameplate_power_factor,net_capacity_mwdc,operating_date,operating_switch,operational_status,operational_status_code,original_planned_operating_date,other_combustion_tech,other_modifications_date,other_planned_modifications,owned_by_non_utility,ownership_code,planned_derate_date,planned_energy_source_code_1,planned_modifications,planned_net_summer_capacity_derate_mw,planned_net_summer_capacity_uprate_mw,planned_net_winter_capacity_derate_mw,planned_net_winter_capacity_uprate_mw,planned_new_capacity_mw,planned_new_prime_mover_code,planned_repower_date,planned_retirement_date,planned_uprate_date,previously_canceled,prime_mover_code,pulverized_coal_tech,reactive_power_output_mvar,retirement_date,rto_iso_lmp_node_id,rto_iso_location_wholesale_reporting_id,solid_fuel_gasification,startup_source_code_1,startup_source_code_2,startup_source_code_3,startup_source_code_4,state,stoker_tech,street_address,subcritical_tech,summer_capacity_estimate,summer_capacity_mw,summer_estimated_capability_mw,supercritical_tech,switch_oil_gas,syncronized_transmission_grid,technology_description,time_cold_shutdown_full_load_code,timezone,topping_bottoming_code,turbines_inverters_hydrokinetics,turbines_num,ultrasupercritical_tech,unit_id_pudl,uprate_derate_completed_date,uprate_derate_during_year,winter_capacity_estimate,winter_capacity_mw,winter_estimated_capability_mw,zip_code
523342,2001-01-01,2,848,Bankhead Dam,195,18,Alabama Power Co,1,False,,False,45.0,,Northport,,Tuscaloosa,NaT,final,,False,False,,,,,,,WAT,,,,,,,False,,hydro,1,33.458665,-87.35682,,,,,1963-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,HY,,,NaT,,,,,,,,AL,,19001 Lock 17 Road,,,56.0,,,,,Conventional Hydroelectric,,America/Chicago,X,,,,,NaT,,,56.0,,35476
523341,2001-01-01,3,32,Barry,195,18,Alabama Power Co,1,False,,False,153.1,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1954-02-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,138.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,138.0,,36512
523340,2001-01-01,3,32,Barry,195,18,Alabama Power Co,2,False,,False,153.1,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1954-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,139.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,139.0,,36512
523339,2001-01-01,3,32,Barry,195,18,Alabama Power Co,3,False,,False,272.0,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1959-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,251.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,251.0,,36512
523338,2001-01-01,3,32,Barry,195,18,Alabama Power Co,4,False,,False,403.7,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1969-12-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,362.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,362.0,,36512
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4,2022-01-01,65721,16835,RT405 Westerlo Solar 2,64985,14123,"RT405 Westerlo Solar 2, LLC",7104,False,,False,5.0,,Albany,,Albany,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,42.445060,-74.01609,,,,7.0,2020-03-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,6844 Route 32,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,12083
3,2022-01-01,65722,16852,Webster Solar,64882,14134,"Webster Solar, LLC",2538,False,,False,1.0,,Webster,,Worcester,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,42.028100,-71.85031,,,,1.0,2017-12-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,MA,,338 Thompson Rd,,,1.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,0.8,,01570
2,2022-01-01,65723,16836,LR Wheatfield Solar 1,64986,14122,"LR Wheatfield Solar 1, LLC",15557,False,,False,5.0,,Wheatfield,,Niagara,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,43.121630,-78.91409,,,,6.3,2020-10-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,2469 Lockport Road,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,14132
1,2022-01-01,65724,16837,Pendleton Solar 1,64987,14121,"Pendleton Solar 1, LLC",15889,False,,False,5.0,,Lockport,,Niagara,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,43.097230,-78.75382,,,,6.0,2020-10-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,6707 Bear Ridge Road,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,14094


In [8]:
pudl_out.gens_eia860().info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 523343 entries, 523342 to 0
Columns: 102 entries, report_date to zip_code
dtypes: Int64(7), boolean(26), datetime64[ns](11), float64(17), int64(1), string(40)
memory usage: 336.9 MB


### Automatic dataframe caching
The `generators_eia860` table is quite long, and the cell above took about 40 seconds to read more than half a million records with roughly 100 columns each, creating a DataFrame of over 300 MB. If you run the same output method again, it will complete almost instantly because that dataframe is already stored inside `pudl_out`. This is memory intensive, but it saves time in calculations that use the same tables several times.

In [9]:
%%time
pudl_out.gens_eia860()

CPU times: user 3 µs, sys: 1 µs, total: 4 µs
Wall time: 5.25 µs


Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,generator_id,associated_combined_heat_power,bga_source,bypass_heat_recovery,capacity_mw,carbon_capture,city,cofire_fuels,county,current_planned_operating_date,data_maturity,deliver_power_transgrid,distributed_generation,duct_burners,energy_source_1_transport_1,energy_source_1_transport_2,energy_source_1_transport_3,energy_source_2_transport_1,energy_source_2_transport_2,energy_source_2_transport_3,energy_source_code_1,energy_source_code_2,energy_source_code_3,energy_source_code_4,energy_source_code_5,energy_source_code_6,energy_storage_capacity_mwh,ferc_qualifying_facility,fluidized_bed_tech,fuel_type_code_pudl,fuel_type_count,latitude,longitude,minimum_load_mw,multiple_fuels,nameplate_power_factor,net_capacity_mwdc,operating_date,operating_switch,operational_status,operational_status_code,original_planned_operating_date,other_combustion_tech,other_modifications_date,other_planned_modifications,owned_by_non_utility,ownership_code,planned_derate_date,planned_energy_source_code_1,planned_modifications,planned_net_summer_capacity_derate_mw,planned_net_summer_capacity_uprate_mw,planned_net_winter_capacity_derate_mw,planned_net_winter_capacity_uprate_mw,planned_new_capacity_mw,planned_new_prime_mover_code,planned_repower_date,planned_retirement_date,planned_uprate_date,previously_canceled,prime_mover_code,pulverized_coal_tech,reactive_power_output_mvar,retirement_date,rto_iso_lmp_node_id,rto_iso_location_wholesale_reporting_id,solid_fuel_gasification,startup_source_code_1,startup_source_code_2,startup_source_code_3,startup_source_code_4,state,stoker_tech,street_address,subcritical_tech,summer_capacity_estimate,summer_capacity_mw,summer_estimated_capability_mw,supercritical_tech,switch_oil_gas,syncronized_transmission_grid,technology_description,time_cold_shutdown_full_load_code,timezone,topping_bottoming_code,turbines_inverters_hydrokinetics,turbines_num,ultrasupercritical_tech,unit_id_pudl,uprate_derate_completed_date,uprate_derate_during_year,winter_capacity_estimate,winter_capacity_mw,winter_estimated_capability_mw,zip_code
523342,2001-01-01,2,848,Bankhead Dam,195,18,Alabama Power Co,1,False,,False,45.0,,Northport,,Tuscaloosa,NaT,final,,False,False,,,,,,,WAT,,,,,,,False,,hydro,1,33.458665,-87.35682,,,,,1963-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,HY,,,NaT,,,,,,,,AL,,19001 Lock 17 Road,,,56.0,,,,,Conventional Hydroelectric,,America/Chicago,X,,,,,NaT,,,56.0,,35476
523341,2001-01-01,3,32,Barry,195,18,Alabama Power Co,1,False,,False,153.1,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1954-02-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,138.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,138.0,,36512
523340,2001-01-01,3,32,Barry,195,18,Alabama Power Co,2,False,,False,153.1,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1954-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,139.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,139.0,,36512
523339,2001-01-01,3,32,Barry,195,18,Alabama Power Co,3,False,,False,272.0,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1959-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,251.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,251.0,,36512
523338,2001-01-01,3,32,Barry,195,18,Alabama Power Co,4,False,,False,403.7,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1969-12-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,362.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,362.0,,36512
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4,2022-01-01,65721,16835,RT405 Westerlo Solar 2,64985,14123,"RT405 Westerlo Solar 2, LLC",7104,False,,False,5.0,,Albany,,Albany,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,42.445060,-74.01609,,,,7.0,2020-03-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,6844 Route 32,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,12083
3,2022-01-01,65722,16852,Webster Solar,64882,14134,"Webster Solar, LLC",2538,False,,False,1.0,,Webster,,Worcester,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,42.028100,-71.85031,,,,1.0,2017-12-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,MA,,338 Thompson Rd,,,1.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,0.8,,01570
2,2022-01-01,65723,16836,LR Wheatfield Solar 1,64986,14122,"LR Wheatfield Solar 1, LLC",15557,False,,False,5.0,,Wheatfield,,Niagara,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,43.121630,-78.91409,,,,6.3,2020-10-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,2469 Lockport Road,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,14132
1,2022-01-01,65724,16837,Pendleton Solar 1,64987,14121,"Pendleton Solar 1, LLC",15889,False,,False,5.0,,Lockport,,Niagara,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,43.097230,-78.75382,,,,6.0,2020-10-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,6707 Bear Ridge Road,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,14094
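
The caching pattern behind this speed-up can be sketched in a few lines of plain Python. This is a toy illustration under assumed names (`CachedTables`, `slow_loader`, and the `update` flag are hypothetical), not the actual `PudlTabl` implementation.

```python
class CachedTables:
    """Toy illustration of per-table caching; not the real PudlTabl code."""

    def __init__(self, loader):
        self._loader = loader   # function that performs the slow read
        self._cache = {}        # table name -> previously loaded result
        self.load_count = 0     # how many slow reads actually happened

    def get(self, table_name, update=False):
        """Return the cached table, loading it only on first use or on update."""
        if update or table_name not in self._cache:
            self._cache[table_name] = self._loader(table_name)
            self.load_count += 1
        return self._cache[table_name]


def slow_loader(table_name):
    # Placeholder for an expensive database read.
    return [{"table": table_name, "row": i} for i in range(3)]


tables = CachedTables(slow_loader)
first = tables.get("generators_eia860")
second = tables.get("generators_eia860")  # served from the cache
print(first is second, tables.load_count)
```

In this toy version the cached object itself is returned, so modifying it in place would affect later calls. Working on a copy avoids that kind of surprise.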


## Exploring `pudl_out` Arguments
Below, we'll explore the main arguments that are used to customize the PUDL output object. You can mix and match these options.

By default, the output object will read data from all available years, do no time aggregation, and not attempt to fill in missing values.

In [10]:
# here are the default arguments for the pudl_out object
pudl_out = pudl.output.pudltabl.PudlTabl(
    pudl_engine=pudl_engine, # we always need a pudl_engine
    freq=None,               # Desired time grouping to aggregate PUDL tables to.
    start_date=None,         # Beginning date for data to pull from the PUDL DB.
    end_date=None,           # End date for data to pull from the PUDL DB.
    fill_fuel_cost=False,    # Whether to fill in missing fuel costs with EIA monthly state-level averages.
    roll_fuel_cost=False,    # Whether to fill in monthly missing fuel costs with a 12-month rolling average.
)

### Time series aggregation
The PUDL output object can aggregate data on a monthly or annual basis, if you set the `freq` argument to `AS` (annual starting at the beginning of the calendar year) or `MS` (monthly starting at the beginning of the month) or [other equivalent frequency abbreviations](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases).

**NOTE:** Not all columns can be aggregated, so you may lose access to some kinds of information in aggregated outputs. If you need to retain information that gets lost in the default aggregation / groupby process, you may need to pull the unaggregated data and do your own aggregation.
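
For the do-it-yourself case mentioned in the note, a minimal pandas sketch of frequency-based aggregation might look like the following. The records are invented, and daily input is used only to make the grouping visible; the grouping PUDL performs internally may differ.

```python
import pandas as pd

# Invented daily generation records for a single generator.
daily = pd.DataFrame(
    {
        "report_date": pd.to_datetime(
            ["2021-01-05", "2021-01-20", "2021-02-10", "2021-02-25"]
        ),
        "generator_id": ["1", "1", "1", "1"],
        "net_generation_mwh": [10.0, 15.0, 20.0, 5.0],
        "operational_status": ["existing", "existing", "existing", "retired"],
    }
)

# Group by generator and by month start ("MS"), then sum the numeric column.
# Columns that can't be summed meaningfully (operational_status) are left out,
# which is the kind of information loss the note above warns about.
monthly = (
    daily.groupby(["generator_id", pd.Grouper(key="report_date", freq="MS")])[
        "net_generation_mwh"
    ]
    .sum()
    .reset_index()
)
print(monthly)
```

To keep a non-numeric column, you would need to choose an explicit rule for it, such as taking the last reported value in each period.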

In [11]:
pudl_out_as = pudl.output.pudltabl.PudlTabl(
    pudl_engine=pudl_engine, # we always need a pudl_engine
    freq='AS',               # Aggregate tables annually
)

In [12]:
pudl_out_as.gen_eia923()

2022-12-22 01:22:32 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:22:34 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:22:35 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:22:35 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:22:36 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:22:36 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:22:36 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,generator_id,net_generation_mwh,unit_id_pudl
0,2008-01-01,3,32,Barry,195,18,Alabama Power Co,1,873997.0,
1,2009-01-01,3,32,Barry,195,18,Alabama Power Co,1,221908.0,1
2,2010-01-01,3,32,Barry,195,18,Alabama Power Co,1,435334.0,1
3,2011-01-01,3,32,Barry,195,18,Alabama Power Co,1,312130.0,1
4,2012-01-01,3,32,Barry,195,18,Alabama Power Co,1,152102.0,1
...,...,...,...,...,...,...,...,...,...,...
50443,2020-01-01,64020,14650,West Riverside Energy Center,20856,364,Wisconsin Power & Light Co,STG2,,1
50444,2021-01-01,64020,14650,West Riverside Energy Center,20856,364,Wisconsin Power & Light Co,STG2,713710.0,1
50445,2020-01-01,64408,15419,Georges River Energy,16191,3018,Robbins Lumber Inc,WEG,,1
50446,2021-01-01,64408,15419,Georges River Energy,16191,3018,Robbins Lumber Inc,WEG,,1


In [13]:
pudl_out_ms = pudl.output.pudltabl.PudlTabl(
    pudl_engine=pudl_engine, # we always need a pudl_engine
    freq='MS',               # Aggregate tables monthly
)

In [14]:
pudl_out_ms.gen_eia923()

2022-12-22 01:23:00 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:23:02 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:23:04 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:23:04 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:23:04 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:23:04 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:23:04 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,generator_id,net_generation_mwh,unit_id_pudl
0,2008-01-01,3,32,Barry,195,18,Alabama Power Co,1,96021.0,
1,2008-02-01,3,32,Barry,195,18,Alabama Power Co,1,79256.0,
2,2008-03-01,3,32,Barry,195,18,Alabama Power Co,1,91687.0,
3,2008-04-01,3,32,Barry,195,18,Alabama Power Co,1,73693.0,
4,2008-05-01,3,32,Barry,195,18,Alabama Power Co,1,68161.0,
...,...,...,...,...,...,...,...,...,...,...
604589,2021-08-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,,
604590,2021-09-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,,
604591,2021-10-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,,
604592,2021-11-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,,


### Filling in Missing Fuel Costs
 * The original EIA data is often incomplete.
 * Many utilities withhold information about their fuel costs.
 * We have a couple of ways of estimating missing values, if you need complete data.

To fill in missing fuel costs, we can pull monthly state-level average fuel costs from EIA, and we can use rolling averages to fill in short gaps in the data. The output object created in the next cell uses both methods.
* Set `fill_fuel_cost=True` when creating an output object to use average monthly fuel costs pre-downloaded in bulk from the EIA API.
* Set `roll_fuel_cost=True` when creating an output object to use a 12-month rolling average based on available data to fill in gaps.
* These options can be used together to fill in as many gaps as possible.
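
The rolling-average idea can be sketched with pandas on a toy monthly series. The values are invented, a 3-month window is used to keep the example small, and centering the window is an assumption of this sketch; the exact procedure PUDL applies is not shown in this notebook.

```python
import numpy as np
import pandas as pd

# Invented monthly fuel costs with two gaps.
costs = pd.DataFrame(
    {
        "report_date": pd.date_range("2021-01-01", periods=6, freq="MS"),
        "fuel_cost_per_mmbtu": [2.0, np.nan, 4.0, 3.0, np.nan, 5.0],
    }
)

# Centered rolling mean over the available (non-missing) values.
# window=3 keeps the toy small; the text above describes a 12-month window.
rolling_mean = (
    costs["fuel_cost_per_mmbtu"].rolling(window=3, min_periods=1, center=True).mean()
)

# Only fill the gaps; reported values are left untouched.
costs["was_filled"] = costs["fuel_cost_per_mmbtu"].isna()
costs["fuel_cost_filled"] = costs["fuel_cost_per_mmbtu"].fillna(rolling_mean)
print(costs)
```

Keeping a flag column such as `was_filled` makes it easy to separate reported values from estimates later, much like the `fuel_cost_from_eiaapi` column in the PUDL output does for API-based fills.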

In [15]:
pudl_out_fill = pudl.output.pudltabl.PudlTabl(
    pudl_engine=pudl_engine, # we always need a pudl_engine
    freq='MS',               # Aggregate tables monthly
    fill_fuel_cost=True,     # Fill in missing fuel cost records with state-level averages from EIA's API
    roll_fuel_cost=True,     # Fill in missing fuel cost records with a 12-month rolling average.
)

In [16]:
%%time
pudl_out_fill.frc_eia923()

2022-12-22 01:23:13 [    INFO] catalystcoop.pudl.output.eia923:288 filling in fuel cost NaNs
2022-12-22 01:23:15 [    INFO] catalystcoop.pudl.output.eia923:298 filling in fuel cost NaNs with rolling averages
2022-12-22 01:24:14 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:24:17 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:24:18 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:24:18 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:24:18 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:24:18 [    INFO] catalystcoop.pudl.output.eia860:22

CPU times: user 1min 10s, sys: 1.52 s, total: 1min 12s
Wall time: 1min 12s


Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,ash_content_pct,chlorine_content_ppm,fuel_consumed_mmbtu,fuel_cost_from_eiaapi,fuel_cost_per_mmbtu,fuel_mmbtu_per_unit,fuel_received_units,fuel_type_code_pudl,mercury_content_ppm,moisture_content_pct,sulfur_content_pct,total_fuel_cost
0,2008-01-01,3,32,Barry,195,18,Alabama Power Co,5.450288,,7183512.000,False,2.131684,23.049712,311653.0,coal,,,0.488324,1.531298e+07
1,2008-02-01,3,32,Barry,195,18,Alabama Power Co,5.593900,,5679395.265,False,2.143524,22.995086,246983.0,coal,,,0.502347,1.217392e+07
2,2008-03-01,3,32,Barry,195,18,Alabama Power Co,5.510000,,6720962.130,False,2.574383,22.987393,292376.0,coal,,,0.506358,1.730233e+07
3,2008-04-01,3,32,Barry,195,18,Alabama Power Co,5.586936,,8092480.028,False,2.787388,22.919484,353083.0,coal,,,0.500435,2.255688e+07
4,2008-05-01,3,32,Barry,195,18,Alabama Power Co,5.309342,,7715891.226,False,2.788092,22.886312,337140.0,coal,,,0.528132,2.151261e+07
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
246432,2021-07-01,64020,14650,West Riverside Energy Center,20856,364,Wisconsin Power & Light Co,0.000000,,2559615.346,False,3.539000,1.046000,2447051.0,gas,0.0,,0.000000,9.058479e+06
246433,2021-08-01,64020,14650,West Riverside Energy Center,20856,364,Wisconsin Power & Light Co,0.000000,,2329902.072,False,3.892000,1.048000,2223189.0,gas,0.0,,0.000000,9.067979e+06
246434,2021-09-01,64020,14650,West Riverside Energy Center,20856,364,Wisconsin Power & Light Co,0.000000,,2013847.500,False,4.687000,1.068000,1885625.0,gas,0.0,,0.000000,9.438903e+06
246435,2021-10-01,64020,14650,West Riverside Energy Center,20856,364,Wisconsin Power & Light Co,0.000000,,2065165.968,False,4.942000,1.068000,1933676.0,gas,0.0,,0.000000,1.020605e+07


## Free Memory
Because we use this notebook on our JupyterHub, which has limited memory, we need to delete the cached dataframes when we're done with them.

In [17]:
del pudl_out
del pudl_out_ms
del pudl_out_as
del pudl_out_fill
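
Note that `del` only removes a name; the memory is released once nothing else refers to the object. The small sketch below shows this with a stand-in class (`BigTable` is a hypothetical name) and a weak reference used as a probe.

```python
import gc
import weakref


class BigTable:
    """Stand-in for an object holding large cached dataframes."""

    def __init__(self):
        self.data = list(range(100_000))


table = BigTable()
probe = weakref.ref(table)  # lets us check whether the object still exists

del table     # drop the only strong reference
gc.collect()  # also clean up any reference cycles

print("freed:", probe() is None)
```

If another variable, or the notebook's own output history (for example `_` or `Out`), still refers to a dataframe, `del` alone won't free it.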

# Denormalized Output Tables
* Below, we'll extract and show a sample of several of the available denormalized PUDL output tables.
* You can see the full list of available output methods in the [PUDL API docs](https://catalystcoop-pudl.readthedocs.io/en/latest/autoapi/pudl/output/pudltabl/index.html#pudl.output.pudltabl.PudlTabl)

In [18]:
pudl_out = pudl.output.pudltabl.PudlTabl(pudl_engine=pudl_engine)

### EIA 860 Plants

In [19]:
%%time
pudl_out.plants_eia860()

2022-12-22 01:24:25 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:24:27 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:24:28 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:24:28 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:24:28 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:24:29 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:24:29 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 7.36 s, sys: 439 ms, total: 7.8 s
Wall time: 7.91 s


Unnamed: 0,plant_id_eia,plant_name_eia,city,county,latitude,longitude,state,street_address,zip_code,timezone,report_date,ash_impoundment,ash_impoundment_lined,ash_impoundment_status,balancing_authority_code_eia,balancing_authority_name_eia,datum,energy_storage,ferc_cogen_docket_no,ferc_cogen_status,ferc_exempt_wholesale_generator_docket_no,ferc_exempt_wholesale_generator,ferc_small_power_producer_docket_no,ferc_small_power_producer,ferc_qualifying_facility_docket_no,grid_voltage_1_kv,grid_voltage_2_kv,grid_voltage_3_kv,iso_rto_code,liquefied_natural_gas_storage,natural_gas_local_distribution_company,natural_gas_storage,natural_gas_pipeline_name_1,natural_gas_pipeline_name_2,natural_gas_pipeline_name_3,nerc_region,net_metering,pipeline_notes,primary_purpose_id_naics,regulatory_status_code,reporting_frequency_code,sector_id_eia,sector_name_eia,service_area,transmission_distribution_owner_id,transmission_distribution_owner_name,transmission_distribution_owner_state,utility_id_eia,water_source,data_maturity,plant_id_pudl,utility_name_eia,utility_id_pudl,balancing_authority_code_eia_consistent_rate
0,1,Sand Point,Sand Point,Aleutians East,55.339722,-160.497222,AK,100 Power Plant Way,99661,America/Anchorage,2022-01-01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,63560,,monthly_update,14527,"TDX Sand Point Generating, LLC",6409,
1,1,Sand Point,Sand Point,Aleutians East,55.339722,-160.497222,AK,100 Power Plant Way,99661,America/Anchorage,2021-01-01,False,False,,,,,False,,False,,False,,False,,0.48,,,,False,,False,,,,UNK,,,22,RE,A,1,Electric Utility,,63560,"TDX Sand Point Generating, LLC",AK,63560,,final,14527,"TDX Sand Point Generating, LLC",6409,
2,1,Sand Point,Sand Point,Aleutians East,55.339722,-160.497222,AK,100 Power Plant Way,99661,America/Anchorage,2020-01-01,False,False,,,,,False,,False,,False,,False,,0.48,,,,False,,False,,,,UNK,,,22,NR,A,2,IPP Non-CHP,,63560,"TDX Sand Point Generating, LLC",AK,63560,,final,14527,"TDX Sand Point Generating, LLC",6409,
3,1,Sand Point,Sand Point,Aleutians East,55.339722,-160.497222,AK,100 Power Plant Way,99661,America/Anchorage,2019-01-01,False,False,,,,,False,,False,,False,,False,,0.48,,,,False,,False,,,,UNK,,,22,NR,A,2,IPP Non-CHP,,63560,"TDX Sand Point Generating, LLC",AK,63560,,final,14527,"TDX Sand Point Generating, LLC",6409,
4,2,Bankhead Dam,Northport,Tuscaloosa,33.458665,-87.356820,AL,19001 Lock 17 Road,35476,America/Chicago,2022-01-01,,,,SOCO,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,195,,monthly_update,848,Alabama Power Co,18,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
185352,65959,WMATA - Cheverly Metro,Cheverly,Prince Georges,38.915643,-76.917372,MD,5501 Columbia Park Rd.,20785,America/New_York,2021-01-01,,,,PJM,"PJM Interconnection, LLC",,False,,False,,False,pending,True,,13.00,,,,,,False,,,,RFC,,,22,NR,,2,IPP Non-CHP,,15270,Potomac Electric Power Co,MD,61944,,final,17198,GSRP,6113,1.0
185353,65960,WMATA - Naylor Rd. Metro,Temple Hills,Prince Georges,38.850340,-76.957300,MD,3101 Branch Ave,20748,America/New_York,2021-01-01,,,,PJM,"PJM Interconnection, LLC",,False,,False,,False,pending,True,,13.00,,,,,,False,,,,RFC,,,22,NR,,2,IPP Non-CHP,,15270,Potomac Electric Power Co,MD,61944,,final,17199,GSRP,6113,1.0
185354,65961,WMATA - S. Ave. Carport (East),Hillcrest Heights,Prince Georges,38.840660,-76.975200,MD,1411 Southern Ave SE,20748,America/New_York,2021-01-01,,,,PJM,"PJM Interconnection, LLC",,False,,False,,False,pending,True,,13.00,,,,,,False,,,,RFC,,,22,NR,,2,IPP Non-CHP,,15270,Potomac Electric Power Co,MD,61944,,final,17200,GSRP,6113,1.0
185355,65962,"Mesquite Solar 4, LLC",Tonopah,Maricopa,33.326467,-112.918950,AZ,39904 W. Elliot Road,85354,America/Phoenix,2021-01-01,,,,CISO,California Independent System Operator,,True,,False,EG22-218-000,True,,False,,230.00,,,,,,False,,,,WECC,,,22,NR,,2,IPP Non-CHP,,16609,San Diego Gas & Electric Co,CA,56769,,final,17201,Consolidated Edison Development Inc.,1318,1.0


### EIA 860 Generators

In [20]:
%%time
pudl_out.gens_eia860()

2022-12-22 01:24:47 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:24:49 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:24:50 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:24:50 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:24:50 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:24:50 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:24:50 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 35.7 s, sys: 2.7 s, total: 38.4 s
Wall time: 39.4 s


Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,generator_id,associated_combined_heat_power,bga_source,bypass_heat_recovery,capacity_mw,carbon_capture,city,cofire_fuels,county,current_planned_operating_date,data_maturity,deliver_power_transgrid,distributed_generation,duct_burners,energy_source_1_transport_1,energy_source_1_transport_2,energy_source_1_transport_3,energy_source_2_transport_1,energy_source_2_transport_2,energy_source_2_transport_3,energy_source_code_1,energy_source_code_2,energy_source_code_3,energy_source_code_4,energy_source_code_5,energy_source_code_6,energy_storage_capacity_mwh,ferc_qualifying_facility,fluidized_bed_tech,fuel_type_code_pudl,fuel_type_count,latitude,longitude,minimum_load_mw,multiple_fuels,nameplate_power_factor,net_capacity_mwdc,operating_date,operating_switch,operational_status,operational_status_code,original_planned_operating_date,other_combustion_tech,other_modifications_date,other_planned_modifications,owned_by_non_utility,ownership_code,planned_derate_date,planned_energy_source_code_1,planned_modifications,planned_net_summer_capacity_derate_mw,planned_net_summer_capacity_uprate_mw,planned_net_winter_capacity_derate_mw,planned_net_winter_capacity_uprate_mw,planned_new_capacity_mw,planned_new_prime_mover_code,planned_repower_date,planned_retirement_date,planned_uprate_date,previously_canceled,prime_mover_code,pulverized_coal_tech,reactive_power_output_mvar,retirement_date,rto_iso_lmp_node_id,rto_iso_location_wholesale_reporting_id,solid_fuel_gasification,startup_source_code_1,startup_source_code_2,startup_source_code_3,startup_source_code_4,state,stoker_tech,street_address,subcritical_tech,summer_capacity_estimate,summer_capacity_mw,summer_estimated_capability_mw,supercritical_tech,switch_oil_gas,syncronized_transmission_grid,technology_description,time_cold_shutdown_full_load_code,timezone,topping_bottoming_code,turbines_inverters_hydrokinetics,turbines_num,ultrasupercritical_tech,unit_id_pudl,uprate_derate_completed_date,uprate_derate_during_year,winter_capacity_estimate,winter_capacity_mw,winter_estimated_capability_mw,zip_code
523342,2001-01-01,2,848,Bankhead Dam,195,18,Alabama Power Co,1,False,,False,45.0,,Northport,,Tuscaloosa,NaT,final,,False,False,,,,,,,WAT,,,,,,,False,,hydro,1,33.458665,-87.35682,,,,,1963-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,HY,,,NaT,,,,,,,,AL,,19001 Lock 17 Road,,,56.0,,,,,Conventional Hydroelectric,,America/Chicago,X,,,,,NaT,,,56.0,,35476
523341,2001-01-01,3,32,Barry,195,18,Alabama Power Co,1,False,,False,153.1,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1954-02-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,138.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,138.0,,36512
523340,2001-01-01,3,32,Barry,195,18,Alabama Power Co,2,False,,False,153.1,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1954-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,139.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,139.0,,36512
523339,2001-01-01,3,32,Barry,195,18,Alabama Power Co,3,False,,False,272.0,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1959-07-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,251.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,251.0,,36512
523338,2001-01-01,3,32,Barry,195,18,Alabama Power Co,4,False,,False,403.7,,Bucks,,Mobile,NaT,final,,False,False,WT,,,PL,,,BIT,NG,,,,,,False,,coal,3,31.006900,-88.01030,,,,,1969-12-01,,existing,OP,NaT,,NaT,,,S,NaT,,,,,,,,,NaT,NaT,NaT,,ST,True,,NaT,,,,,,,,AL,,North Highway 43,True,,362.0,,,,,Conventional Steam Coal,,America/Chicago,X,,,,,NaT,,,362.0,,36512
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4,2022-01-01,65721,16835,RT405 Westerlo Solar 2,64985,14123,"RT405 Westerlo Solar 2, LLC",7104,False,,False,5.0,,Albany,,Albany,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,42.445060,-74.01609,,,,7.0,2020-03-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,6844 Route 32,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,12083
3,2022-01-01,65722,16852,Webster Solar,64882,14134,"Webster Solar, LLC",2538,False,,False,1.0,,Webster,,Worcester,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,42.028100,-71.85031,,,,1.0,2017-12-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,MA,,338 Thompson Rd,,,1.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,0.8,,01570
2,2022-01-01,65723,16836,LR Wheatfield Solar 1,64986,14122,"LR Wheatfield Solar 1, LLC",15557,False,,False,5.0,,Wheatfield,,Niagara,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,43.121630,-78.91409,,,,6.3,2020-10-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,2469 Lockport Road,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,14132
1,2022-01-01,65724,16837,Pendleton Solar 1,64987,14121,"Pendleton Solar 1, LLC",15889,False,,False,5.0,,Lockport,,Niagara,NaT,monthly_update,,,False,,,,,,,SUN,,,,,,,,,solar,1,43.097230,-78.75382,,,,6.0,2020-10-01,,existing,OP,NaT,,NaT,,,,NaT,,,,,,,,,NaT,NaT,NaT,,PV,,,NaT,,,,,,,,NY,,6707 Bear Ridge Road,,,5.0,,,,,Solar Photovoltaic,,America/New_York,X,,,,,NaT,,,4.8,,14094


### EIA 923 Generation and Fuel Consumption

In [21]:
%%time
pudl_out.gf_eia923()

2022-12-22 01:25:24 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:25:26 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:25:27 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:25:27 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:25:28 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:25:28 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:25:28 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 35.6 s, sys: 2.43 s, total: 38 s
Wall time: 40.6 s


Unnamed: 0,report_date,plant_id_eia,prime_mover_code,energy_source_code,plant_id_pudl,fuel_type_code_aer,plant_name_eia,utility_id_pudl,utility_name_eia,fuel_type_code_pudl,utility_id_eia,data_maturity,fuel_consumed_for_electricity_mmbtu,fuel_consumed_for_electricity_units,fuel_consumed_mmbtu,fuel_consumed_units,net_generation_mwh,fuel_mmbtu_per_unit
0,2001-01-01,2,HY,WAT,848,HYC,Bankhead Dam,18,Alabama Power Co,hydro,195,final,195479.69,0.0,195479.69,0.0,18918.000,0.000
1,2001-01-01,3,UNK,BIT,32,COL,Barry,18,Alabama Power Co,coal,195,final,8275496.00,348330.0,8275496.00,348330.0,852306.000,23.760
2,2001-01-01,3,UNK,DFO,32,DFO,Barry,18,Alabama Power Co,oil,195,final,0.00,0.0,0.00,0.0,0.000,0.000
3,2001-01-01,3,UNK,NG,32,NG,Barry,18,Alabama Power Co,gas,195,final,2230976.00,2140642.0,2230976.00,2140642.0,306338.000,1.040
4,2001-01-01,4,HY,WAT,847,HYC,Walter Bouldin Dam,18,Alabama Power Co,hydro,195,final,535693.72,0.0,535693.72,0.0,51843.000,0.000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2687316,2021-12-01,65731,PV,SUN,16978,SUN,Rockford CS 1,14145,"Rockford CS 1, LLC",solar,64990,final,1161.00,0.0,1161.00,0.0,131.248,0.000
2687317,2021-12-01,65732,PV,SUN,16979,SUN,Rockford CS 2,14143,"Rockford CS 2, LLC",solar,64988,final,1180.00,0.0,1180.00,0.0,133.418,0.000
2687318,2021-12-01,65740,PV,SUN,16986,SUN,"Canal Energy S23, LLC",14142,"Canal Energy S23, LLC",solar,64948,final,0.00,0.0,0.00,0.0,0.000,0.000
2687319,2021-12-01,65767,IC,DFO,17013,DFO,Gustavus,432,Alaska Power Co,oil,219,final,599.00,103.0,599.00,103.0,53.905,5.820


### EIA 923 Fuel Receipts and Costs

In [22]:
%%time
pudl_out.frc_eia923()

2022-12-22 01:25:59 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:26:01 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:26:02 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:26:02 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:26:03 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:26:03 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:26:03 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 16.3 s, sys: 865 ms, total: 17.2 s
Wall time: 17.5 s


Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,ash_content_pct,chlorine_content_ppm,coalmine_county_id_fips,contract_expiration_date,contract_type_code,data_maturity,energy_source_code,fuel_consumed_mmbtu,fuel_cost_from_eiaapi,fuel_cost_per_mmbtu,fuel_group_code,fuel_mmbtu_per_unit,fuel_received_units,fuel_type_code_pudl,mercury_content_ppm,mine_id_msha,mine_name,mine_state,mine_type_code,moisture_content_pct,natural_gas_delivery_contract_type_code,natural_gas_transport_code,primary_transportation_mode_code,secondary_transportation_mode_code,sulfur_content_pct,supplier_name,total_fuel_cost
0,2008-01-01,3,32,Barry,195,18,Alabama Power Co,5.4,,,2008-04-01,C,final,BIT,5992417.200,False,2.135,coal,23.100,259412.0,coal,,,mina pribbenow,COL,SU,,,firm,RV,,0.49,interocean coal,1.279381e+07
1,2008-01-01,3,32,Barry,195,18,Alabama Power Co,5.7,,,2008-04-01,C,final,BIT,1191094.800,False,2.115,coal,22.800,52241.0,coal,,,mina pribbenow,COL,SU,,,firm,RV,,0.48,interocean coal,2.519166e+06
2,2008-01-01,3,32,Barry,195,18,Alabama Power Co,0.0,,,NaT,C,final,NG,2892180.141,False,8.631,natural_gas,1.039,2783619.0,gas,,,,,,,,firm,PL,,0.00,bay gas pipeline,2.496241e+07
3,2008-01-01,7,207,Gadsden,195,18,Alabama Power Co,14.7,,01007,2015-12-01,C,final,BIT,625020.170,False,2.776,coal,24.610,25397.0,coal,,,alabama coal,AL,SU,,,firm,TR,,1.69,alabama coal,1.735056e+06
4,2008-01-01,7,207,Gadsden,195,18,Alabama Power Co,15.5,,01145,2008-11-01,S,final,BIT,18676.744,False,3.381,coal,24.446,764.0,coal,,,flat rock 2,AL,S,,,firm,TR,,0.84,d & e mining,6.314607e+04
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
608560,2021-12-01,62115,12284,AES Alamitos Energy Center,61669,7013,"AES Alamitos Energy, LLC",0.0,,,NaT,T,final,NG,1470609.790,False,,natural_gas,1.033,1423630.0,gas,0.0,,,,,,firm,firm,PL,,0.00,various (natural gas spot purchases only),
608561,2021-12-01,62565,12709,"Hill Top Energy Center, LLC",62064,7022,"Hill Top Energy Center, LLC",0.0,,,2040-05-01,C,final,NG,2882999.532,False,,natural_gas,1.041,2769452.0,gas,0.0,,,,,,firm,firm,PL,,0.00,eqt,
608562,2021-12-01,63335,13618,HO Clarke Generating,63082,6832,ProEnergy Services,0.0,,,2027-11-01,C,final,NG,46249.860,False,,natural_gas,1.020,45343.0,gas,0.0,,,,,,firm,firm,PL,,0.00,kinder morgan,
608563,2021-12-01,63688,13638,Topaz Generating,63082,6832,ProEnergy Services,0.0,,,2029-06-01,C,final,NG,84151.020,False,,natural_gas,1.020,82501.0,gas,0.0,,,,,,firm,firm,PL,,0.00,morgan stanley,


### EIA 923 Net Generation by Generator

In [23]:
%%time
pudl_out.gen_eia923()

2022-12-22 01:26:12 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:26:15 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:26:16 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:26:16 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:26:16 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:26:16 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:26:16 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 12.7 s, sys: 626 ms, total: 13.3 s
Wall time: 13.8 s


Unnamed: 0,report_date,plant_id_eia,plant_id_pudl,plant_name_eia,utility_id_eia,utility_id_pudl,utility_name_eia,generator_id,data_maturity,net_generation_mwh,unit_id_pudl
0,2008-01-01,3,32,Barry,195,18,Alabama Power Co,1,final,96021.0,
1,2008-02-01,3,32,Barry,195,18,Alabama Power Co,1,final,79256.0,
2,2008-03-01,3,32,Barry,195,18,Alabama Power Co,1,final,91687.0,
3,2008-04-01,3,32,Barry,195,18,Alabama Power Co,1,final,73693.0,
4,2008-05-01,3,32,Barry,195,18,Alabama Power Co,1,final,68161.0,
...,...,...,...,...,...,...,...,...,...,...,...
604589,2021-08-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,final,,
604590,2021-09-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,final,,
604591,2021-10-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,final,,
604592,2021-11-01,65367,16674,POET Bioprocessing- Mitchell,64697,14042,POET Bioprocessing- Mitchell,1,final,,


## FERC Form 1
* Only a small subset of the 100+ tables that exist in the original FERC Form 1 have been cleaned and included in the PUDL DB.
* For tables not included here, you'll need to access the cloned multi-year FERC 1 DB that we produce. See the first tutorial notebook for more information.

### FERC 1 Large Steam Plants
The large steam plants report detailed operating expenses in this table, as well as operational characteristics.

In [24]:
%%time
pudl_out.plants_steam_ferc1()

CPU times: user 335 ms, sys: 18.7 ms, total: 354 ms
Wall time: 369 ms


Unnamed: 0,report_year,utility_id_ferc1,utility_id_pudl,utility_name_ferc1,plant_id_pudl,plant_id_ferc1,plant_name_ferc1,asset_retirement_cost,avg_num_employees,capacity_factor,capacity_mw,capex_equipment,capex_land,capex_per_mw,capex_structures,capex_total,construction_type,construction_year,installation_year,net_generation_mwh,not_water_limited_capacity_mw,opex_allowances,opex_boiler,opex_coolants,opex_electric,opex_engineering,opex_fuel,opex_fuel_per_mwh,opex_misc_power,opex_misc_steam,opex_nonfuel_per_mwh,opex_operations,opex_per_mwh,opex_plants,opex_production_total,opex_rents,opex_steam,opex_steam_other,opex_structures,opex_total_nonfuel,opex_transfer,peak_demand_mw,plant_capability_mw,plant_hours_connected_while_generating,plant_type,record_id,water_limited_capacity_mw
0,1994,342,7,AEP Generating Company,526,1076,rockport unit 1,,,0.819843,650.00,4.906841e+08,6395551.0,894688.3,84467746.0,5.815474e+08,conventional,1984.0,1984.0,4.668184e+06,650.0,,3185935.0,,353599.0,427906.0,51694529.0,11.073799,1040610.0,781181.0,1.778100,1032559.0,12.9,631598.0,59995027.0,7559.0,442763.0,,396788.0,8300498.0,,650.0,,,steam,f1_steam_1994_12_1_0_1,
1,1994,342,7,AEP Generating Company,526,1077,rockport unit 2,,,0.781755,650.00,3.933937e+07,74411.0,67173.7,4249136.0,4.366292e+07,conventional,1989.0,1989.0,4.451312e+06,650.0,,3374827.0,,384283.0,427747.0,48990225.0,11.005794,1028788.0,255391.0,16.850051,1026248.0,27.9,518870.0,123995060.0,67311927.0,446454.0,,230300.0,75004835.0,,650.0,,,steam,f1_steam_1994_12_1_0_2,
2,1994,342,7,AEP Generating Company,526,2228,rockport,,,0.800799,1300.00,5.300235e+08,6469962.0,480931.0,88716882.0,6.252103e+08,conventional,1984.0,1989.0,9.119496e+06,1300.0,,6560762.0,,737882.0,855653.0,100684754.0,11.040605,2069398.0,1036572.0,9.134862,2058807.0,20.2,1150468.0,183990087.0,67319486.0,889217.0,,627088.0,83305333.0,,1300.0,,,steam,f1_steam_1994_12_1_0_3,
3,1994,342,7,AEP Generating Company,526,1110,rockport total plant,,462.0,0.781224,2600.00,1.049180e+09,12969249.0,476006.1,175466216.0,1.237616e+09,conventional,1984.0,1989.0,1.779316e+07,2600.0,,13121517.0,,1475766.0,1711307.0,196297854.0,11.032210,4138807.0,2073142.0,9.377555,4117640.0,20.4,2300937.0,363154178.0,134884608.0,1778431.0,,1254169.0,166856324.0,,2600.0,,,steam,f1_steam_1994_12_1_0_4,
4,1994,294,18,ALABAMA POWER COMPANY,231,1,gorgas,,438.0,0.597150,1417.00,3.273578e+08,312098.0,276264.0,63796151.0,3.914661e+08,conventional,1929.0,1972.0,7.412375e+06,1302.0,,17760784.0,,1391099.0,2276025.0,118304925.0,15.960461,7506206.0,645822.0,5.766998,3065839.0,21.7,5957567.0,161052079.0,,2692720.0,,1451092.0,42747154.0,,1294.0,,8760.0,steam,f1_steam_1994_12_2_0_1,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
30696,2021,187,365,Wisconsin Public Service Corporation,500,1536,pulliam 31,0.0,3.0,0.113172,90.95,3.454014e+07,0.0,397509.6,1613363.0,3.615350e+07,conventional,2003.0,2003.0,9.016683e+04,106.0,0.0,0.0,0.0,37506.0,10998.0,6028340.0,66.857622,211659.0,0.0,6.678154,11172.0,73.5,326289.0,6630488.0,0.0,0.0,0.0,4524.0,602148.0,0.0,95.0,82.0,1617.0,combustion_turbine,steam_electric_generating_plant_statistics_lar...,79.0
30697,2021,187,365,Wisconsin Public Service Corporation,13753,1994,two creeks,9813386.0,1.0,0.218392,100.00,1.315771e+08,437227.0,1465371.0,4709377.0,1.465371e+08,conventional,2020.0,2020.0,1.913110e+05,,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,91117.0,0.0,8.757284,0.0,8.8,1252982.0,1675365.0,331266.0,0.0,0.0,0.0,1675365.0,0.0,100.0,100.0,4158.0,photovoltaic,steam_electric_generating_plant_statistics_lar...,
30698,2021,187,365,Wisconsin Public Service Corporation,343,995,west marinette,0.0,3.0,0.040480,187.20,2.919127e+07,267961.0,201735.9,8305732.0,3.776496e+07,conventional,1971.0,1993.0,6.638229e+04,204.0,0.0,0.0,0.0,143226.0,17673.0,5399094.0,81.333345,49892.0,0.0,9.088643,80666.0,90.4,301806.0,6002419.0,0.0,0.0,0.0,10062.0,603325.0,0.0,151.0,154.0,1234.0,combustion_turbine,steam_electric_generating_plant_statistics_lar...,154.0
30699,2021,187,365,Wisconsin Public Service Corporation,470,6947,"weston w31, w32",0.0,0.0,0.019257,76.34,8.270261e+06,0.0,111533.0,244169.0,8.514430e+06,conventional,1969.0,1973.0,1.287767e+04,88.0,0.0,0.0,0.0,0.0,4554.0,1160318.0,90.103073,14553.0,0.0,22.830986,4632.0,112.9,258877.0,1454328.0,0.0,0.0,0.0,11394.0,294010.0,0.0,57.0,69.0,422.0,combustion_turbine,steam_electric_generating_plant_statistics_lar...,63.0


### FERC 1 Fuel
Fuel consumption by the large steam plants, broken down by plant and fuel type.

In [25]:
%%time
pudl_out.fuel_ferc1()

CPU times: user 448 ms, sys: 11.8 ms, total: 460 ms
Wall time: 467 ms


Unnamed: 0,report_year,utility_id_ferc1,utility_id_pudl,utility_name_ferc1,plant_id_pudl,plant_name_ferc1,fuel_consumed_mmbtu,fuel_consumed_total_cost,fuel_consumed_units,fuel_cost_per_mmbtu,fuel_cost_per_unit_burned,fuel_cost_per_unit_delivered,fuel_mmbtu_per_unit,fuel_type_code_pudl,fuel_units,record_id
0,1994,342,7,AEP Generating Company,526,rockport,8.921254e+07,9.996752e+07,5377489.0,1.121,18.59000,18.530,16.590000,coal,ton,f1_fuel_1994_12_1_0_7
1,1994,342,7,AEP Generating Company,526,rockport,1.867951e+05,7.239458e+05,32345.0,,22.38200,21.910,5.775084,oil,bbl,f1_fuel_1994_12_1_0_8
2,1994,342,7,AEP Generating Company,526,rockport total plant,1.739994e+08,1.948474e+08,10486945.0,1.120,18.58000,18.530,16.592000,coal,ton,f1_fuel_1994_12_1_0_10
3,1994,342,7,AEP Generating Company,526,rockport total plant,3.642256e+05,1.403263e+06,63068.0,,22.25000,21.910,5.775126,oil,bbl,f1_fuel_1994_12_1_0_11
4,1996,342,7,AEP Generating Company,526,rockport total plant,1.687690e+08,1.841852e+08,10115617.0,1.091,18.20800,18.645,16.684000,coal,ton,f1_fuel_1996_12_1_0_10
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
48816,2021,187,365,Wisconsin Public Service Corporation,1089,columbia,1.693591e+04,2.264141e+05,2922.0,13.369,77.48600,,5.796000,oil,bbl,steam_electric_generating_plant_statistics_lar...
48817,2021,187,365,Wisconsin Public Service Corporation,156,de pere energy center,4.140083e+04,6.756207e+05,7143.0,16.319,94.58500,,5.796000,oil,bbl,steam_electric_generating_plant_statistics_lar...
48818,2021,187,365,Wisconsin Public Service Corporation,156,de pere energy center,2.018031e+06,1.042841e+07,1921934.0,5.168,5.42600,,1.050000,gas,mcf,steam_electric_generating_plant_statistics_lar...
48819,2021,187,365,Wisconsin Public Service Corporation,343,west marinette,9.571515e+05,5.397782e+06,920338.0,5.640,5.86500,,1.040000,gas,mcf,steam_electric_generating_plant_statistics_lar...


### FERC 1 Fuel by Plant
Wide-form aggregated fuel totals by plant and year, identifying the relative cost and heat content proportions of different fuels, as well as the primary fuel for the plant.
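
As a rough illustration of how a wide-form summary like this can be derived from long-form fuel records, the sketch below pivots a toy table, computes each fuel's share of total heat content, and picks the primary fuel. The numbers and plant names are made up, and this is not PUDL's actual implementation; only the column names are borrowed from the `fuel_ferc1()` output.

```python
import pandas as pd

# Toy long-form fuel records (made-up values; column names follow fuel_ferc1()).
fuel = pd.DataFrame(
    {
        "report_year": [2020, 2020, 2020],
        "plant_name_ferc1": ["plant a", "plant a", "plant b"],
        "fuel_type_code_pudl": ["coal", "oil", "gas"],
        "fuel_consumed_mmbtu": [900.0, 100.0, 500.0],
    }
)

keys = ["report_year", "plant_name_ferc1"]

# One row per plant-year, one column per fuel type.
mmbtu_wide = fuel.pivot_table(
    index=keys,
    columns="fuel_type_code_pudl",
    values="fuel_consumed_mmbtu",
    aggfunc="sum",
)

# Each fuel's share of the plant's total heat content (NaN where a fuel is unused).
fractions = mmbtu_wide.div(mmbtu_wide.sum(axis=1), axis=0)

# The fuel with the largest share of heat content for each plant-year.
primary_fuel_by_mmbtu = fractions.idxmax(axis=1)
```

The same pattern applied to a cost column would give cost fractions and a primary fuel by cost.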

In [26]:
%%time
pudl_out.fbp_ferc1()

CPU times: user 270 ms, sys: 14.3 ms, total: 285 ms
Wall time: 284 ms


Unnamed: 0,report_year,utility_id_ferc1,utility_id_pudl,utility_name_ferc1,plant_id_pudl,plant_name_ferc1,coal_fraction_cost,coal_fraction_mmbtu,fuel_cost,fuel_mmbtu,gas_fraction_cost,gas_fraction_mmbtu,nuclear_fraction_cost,nuclear_fraction_mmbtu,oil_fraction_cost,oil_fraction_mmbtu,primary_fuel_by_cost,primary_fuel_by_mmbtu,waste_fraction_cost,waste_fraction_mmbtu
0,1997,3,45,Boston Edison Company,649,* w. f. wyman 4,,,1.462814e+06,2.000961e+05,,,,,1.000000,1.000000,oil,oil,,
1,1994,3,45,Boston Edison Company,7767,edgar,,,5.310480e+04,1.356998e+04,,,,,1.000000,1.000000,oil,oil,,
2,1995,3,45,Boston Edison Company,7767,edgar,,,7.661832e+04,2.096124e+04,,,,,1.000000,1.000000,oil,oil,,
3,1996,3,45,Boston Edison Company,7767,edgar,,,3.835280e+04,8.892612e+03,,,,,1.000000,1.000000,oil,oil,,
4,1997,3,45,Boston Edison Company,7767,edgar,,,3.985240e+04,8.758441e+03,,,,,1.000000,1.000000,oil,oil,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
25411,2012,411,107,"Entergy Louisiana, LLC",615,waterford 3,,,6.324345e+07,8.543625e+07,,,1.0,1.0,,,nuclear,nuclear,,
25412,2013,411,107,"Entergy Louisiana, LLC",615,waterford 3,,,8.904868e+07,1.024225e+08,,,1.0,1.0,,,nuclear,nuclear,,
25413,2014,411,107,"Entergy Louisiana, LLC",615,waterford 3,,,9.339549e+09,9.961602e+07,,,1.0,1.0,,,nuclear,nuclear,,
25414,2015,411,107,"Entergy Louisiana, LLC",615,waterford 3,,,6.219582e+07,8.101124e+07,,,1.0,1.0,,,nuclear,nuclear,,


### FERC 1 Plant in Service
An accounting of how much electric plant infrastructure exists in each of the many FERC accounts. This is a very wide form table.
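
One quick sanity check on this table is whether the reported balances reconcile. The sketch below uses the two fully populated rows from the output displayed for this cell (FERC accounts 311 and 312, AEP Generating Company, 1994). It assumes the ending balance should equal the starting balance plus additions, retirements, transfers, and adjustments, with retirements stored as negative numbers; that relationship holds for these two rows but is an assumption here, not something the PUDL documentation is quoted as guaranteeing.

```python
import pandas as pd

# Values copied from the displayed plant_in_service_ferc1() output.
pis = pd.DataFrame(
    {
        "ferc_account": ["311", "312"],
        "starting_balance": [84820923.0, 376005734.0],
        "additions": [3900025.0, -1368103.0],
        "retirements": [-2691.0, -862007.0],
        "transfers": [-1375.0, 5019.0],
        "adjustments": [float("nan"), float("nan")],
        "ending_balance": [88716882.0, 373780643.0],
    }
)

# Treat missing changes as zero, then compare the implied and reported balances.
change_cols = ["additions", "retirements", "transfers", "adjustments"]
implied_ending = pis["starting_balance"] + pis[change_cols].fillna(0).sum(axis=1)
pis["balance_gap"] = pis["ending_balance"] - implied_ending
```

A non-zero `balance_gap` would flag rows worth a closer look.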

In [27]:
%%time
pudl_out.plant_in_service_ferc1()

CPU times: user 1.31 s, sys: 88.3 ms, total: 1.4 s
Wall time: 1.77 s


Unnamed: 0,report_year,utility_id_ferc1,utility_id_pudl,utility_name_ferc1,record_id,additions,adjustments,ending_balance,ferc_account,ferc_account_label,retirements,row_type_xbrl,starting_balance,transfers
0,1994,342,7,AEP Generating Company,f1_plant_in_srvce_1994_12_1_0_2,,,64475.0,301,organization,,ferc_account,64475.0,
1,1994,342,7,AEP Generating Company,f1_plant_in_srvce_1994_12_1_0_5,,,64475.0,,intangible_plant,,calculated,64475.0,
2,1994,342,7,AEP Generating Company,f1_plant_in_srvce_1994_12_1_0_8,,,6469962.0,310,land_and_land_rights_steam_production,,ferc_account,6469962.0,
3,1994,342,7,AEP Generating Company,f1_plant_in_srvce_1994_12_1_0_9,3900025.0,,88716882.0,311,structures_and_improvements_steam_production,-2691.0,ferc_account,84820923.0,-1375.0
4,1994,342,7,AEP Generating Company,f1_plant_in_srvce_1994_12_1_0_10,-1368103.0,,373780643.0,312,boiler_plant_equipment_steam_production,-862007.0,ferc_account,376005734.0,5019.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
311789,2021,444,14138,indiana-kentucky electric corporation,electric_plant_in_service_204_2021_c011304,,,,367,underground_conductors_and_devices_distributio...,,ferc_account,,
311790,2021,444,14138,indiana-kentucky electric corporation,electric_plant_in_service_204_2021_c011304,,,,358,underground_conductors_and_devices_transmissio...,,ferc_account,,
311791,2021,444,14138,indiana-kentucky electric corporation,electric_plant_in_service_204_2021_c011304,,,,366,underground_conduit_distribution_plant,,ferc_account,,
311792,2021,444,14138,indiana-kentucky electric corporation,electric_plant_in_service_204_2021_c011304,,,,357,underground_conduit_transmission_plant,,ferc_account,,


## Free Memory
Again, because we may be on a JupyterHub with limited RAM per user, we need to delete the cached dataframes we've just created.

In [28]:
del pudl_out

# Analysis Outputs
* The PUDL Database is mainly meant to standardize the structure of data that's been reported in different ways over different years, so that it can all be used together.
* We typically don't include calculated values or big modifications to the original data.
* We're compiling a growing library of stock analyses in the `pudl.analysis` subpackage, which operate on data stored in the database.
* Some of these analytical outputs are built into the output object so that they can take advantage of the dataframe caching, and for convenient access.

## The Marginal Cost of Electricity (MCOE)
* One of our first analysis modules calculates fuel costs, heat rates, and capacity factors on a generator-by-generator basis.
* The long-term goal is for it to provide a comprehensive marginal cost of electricity production (MCOE).
* The integration of operating costs from FERC Form 1 is still a work in progress, and hasn't been added in here yet.

### MCOE Requires Aggregation
* Fuel costs and other data need to be aggregated by month or year to calculate MCOE.
* This means we need an output object that aggregates by month or year.
* Because a single `NA` value can wipe out a whole aggregated category, you'll get more information with a monthly aggregation, but it currently takes more memory than the JupyterHub has access to.
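
To see why one `NA` can cost more at annual resolution, consider the toy series below: twelve months of made-up fuel costs with one missing month. The sketch assumes an NA-propagating sum (here expressed with `min_count=12`), which is one way this wipe-out can arise; it is not necessarily how PUDL aggregates internally.

```python
import pandas as pd

# Twelve months of made-up fuel costs with a single missing value in June.
months = pd.date_range("2020-01-01", periods=12, freq="MS")
fuel_cost = pd.Series(100.0, index=months, name="total_fuel_cost")
fuel_cost.iloc[5] = float("nan")

# At monthly resolution, 11 of the 12 records are still usable.
monthly_usable = int(fuel_cost.notna().sum())

# A strict annual sum requires all 12 months, so the whole year becomes NaN.
annual_strict = fuel_cost.groupby(fuel_cost.index.year).sum(min_count=12)

# pandas' default sum skips the NaN instead, silently undercounting the year.
annual_lenient = fuel_cost.groupby(fuel_cost.index.year).sum()
```

Neither annual option is ideal: the strict sum loses the year entirely, while the lenient one reports a total that is missing a month.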

In [29]:
pudl_out_annual = pudl.output.pudltabl.PudlTabl(
    pudl_engine=pudl_engine,
    freq="AS",
    fill_fuel_cost=True,
    roll_fuel_cost=True,
)

### Heat Rate by Generation Unit (MMBTU/MWh)
* A "Generation Unit" (identified by `unit_id_pudl` here) is a group of "boilers" (where fuel is consumed) and "generators" (where electricity is made) which are connected to each other.
* Because the fuel inputs and electricity outputs are commingled, this is the most granular level at which a direct heat rate calculation can be done.
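
The heat rate itself is just fuel energy in divided by electricity out. As a minimal sketch, the values below are copied from the first two rows of the `hr_by_unit()` output displayed for this cell, and dividing `fuel_consumed_mmbtu` by `net_generation_mwh` reproduces the reported `heat_rate_mmbtu_mwh` to within rounding. This mirrors the arithmetic only; the real method also has to aggregate boilers and generators up to the unit first.

```python
import pandas as pd

# First two rows of the displayed hr_by_unit() output (plant 7, units 1 and 2).
units = pd.DataFrame(
    {
        "plant_id_eia": [7, 7],
        "unit_id_pudl": [1, 2],
        "net_generation_mwh": [279559.0, 319739.0],
        "fuel_consumed_mmbtu": [3.748396e06, 4.208735e06],
    }
)

# Heat rate: MMBtu of fuel burned per MWh of net generation.
units["heat_rate_mmbtu_mwh"] = (
    units["fuel_consumed_mmbtu"] / units["net_generation_mwh"]
)
```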

In [30]:
%%time
pudl_out_annual.hr_by_unit()

2022-12-22 01:26:31 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:26:33 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:26:34 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:26:34 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:26:35 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:26:35 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:26:35 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 39.7 s, sys: 1.46 s, total: 41.2 s
Wall time: 42.5 s


Unnamed: 0,report_date,plant_id_eia,unit_id_pudl,net_generation_mwh,fuel_consumed_mmbtu,heat_rate_mmbtu_mwh
0,2008-01-01,7,1,279559.0,3.748396e+06,13.408247
1,2008-01-01,7,2,319739.0,4.208735e+06,13.163034
2,2008-01-01,8,1,4481336.0,4.370694e+07,9.753106
3,2008-01-01,8,2,501760.0,5.915611e+06,11.789722
4,2008-01-01,8,3,591564.0,6.806666e+06,11.506222
...,...,...,...,...,...,...
30454,2021-01-01,63922,1,,,
30455,2021-01-01,63923,1,,,
30456,2021-01-01,63924,1,,,
30457,2021-01-01,63927,1,,,


### Heat Rate by Generator (MMBtu/MWh)
* Heat rates can only be calculated directly at the generation unit level, but we still need per-generator heat rates to estimate per-generator fuel costs.
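
One simple way to get from units to generators is to let every generator inherit the heat rate of the unit it belongs to. The sketch below illustrates that idea only; the dictionaries are hypothetical stand-ins, and the actual PUDL logic may differ.

```python
# Unit-level heat rates keyed by (plant_id_eia, unit_id_pudl).
unit_heat_rates = {(59093, 1): 7.0033}

# Illustrative association of generators with their generation unit.
generators = [
    {"plant_id_eia": 59093, "unit_id_pudl": 1, "generator_id": "11"},
    {"plant_id_eia": 59093, "unit_id_pudl": 1, "generator_id": "12"},
]

# Each generator takes on the heat rate of its unit (None if the unit is unknown).
hr_by_gen = [
    {
        **gen,
        "heat_rate_mmbtu_mwh": unit_heat_rates.get(
            (gen["plant_id_eia"], gen["unit_id_pudl"])
        ),
    }
    for gen in generators
]

for row in hr_by_gen:
    print(row["generator_id"], row["heat_rate_mmbtu_mwh"])
```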

In [31]:
%%time
pudl_out_annual.hr_by_gen()

2022-12-22 01:27:25 [    INFO] catalystcoop.pudl.transform.eia861:456 Started with 81265 missing BA Codes out of 185357 records (43.84%)
2022-12-22 01:27:27 [    INFO] catalystcoop.pudl.transform.eia861:480 Ended with 14923 missing BA Codes out of 185357 records (8.05%)
2022-12-22 01:27:29 [    INFO] catalystcoop.pudl.output.eia860:177 91.6% of plant records have consistently reported BA Codes
2022-12-22 01:27:29 [    INFO] catalystcoop.pudl.output.eia860:227 Before any filling treatment has been applied. 43.8% of records have no BA codes
2022-12-22 01:27:29 [    INFO] catalystcoop.pudl.output.eia860:227 Backfilling and consistent value is the same. Filled w/ most consistent BA code. 10.9% of records have no BA codes
2022-12-22 01:27:29 [    INFO] catalystcoop.pudl.output.eia860:227 SWPP is most consistent value. Filled w/ oldest BA code. 8.3% of records have no BA codes
2022-12-22 01:27:29 [    INFO] catalystcoop.pudl.output.eia860:227 NWMT is most consistent value. Filled w/ oldest B

CPU times: user 37.3 s, sys: 2.68 s, total: 40 s
Wall time: 41 s


Unnamed: 0,report_date,plant_id_eia,unit_id_pudl,generator_id,heat_rate_mmbtu_mwh,fuel_type_code_pudl,fuel_type_count
0,2009-01-01,3,1,1,10.284149,coal,2
1,2009-01-01,3,2,2,10.271086,coal,2
2,2009-01-01,3,3,3,10.157073,coal,2
3,2009-01-01,3,4,4,9.935606,coal,2
4,2009-01-01,3,5,5,9.906513,coal,2
...,...,...,...,...,...,...,...
46530,2021-01-01,59093,1,11,7.003300,gas,1
46531,2021-01-01,59093,1,12,7.003300,gas,1
46532,2021-01-01,62289,1,GTG-1,,gas,1
46533,2021-01-01,62289,1,GTG-2,,gas,1


### Per-generator Fuel Costs
* Calculate per-generator fuel costs based on heat rates and fuel deliveries.
* Because we created the output object with `fill_fuel_cost=True`, this will request monthly average fuel cost data by date from the EIA API to fill in missing values. It might take a minute.
* This also means you'll need to have set your EIA API key at the top of the notebook.
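
The arithmetic linking heat rates to fuel costs can be sketched as follows. All numbers are hypothetical, and the variable names are illustrative rather than a guaranteed match for the columns returned by `fuel_cost()`.

```python
# Hypothetical inputs for one generator in one reporting period.
heat_rate_mmbtu_mwh = 10.0    # fuel energy needed per MWh generated
fuel_cost_per_mmbtu = 2.5     # delivered fuel price, USD per MMBtu
net_generation_mwh = 1000.0   # electricity generated in the period

# Fuel cost per unit of electricity, USD per MWh.
fuel_cost_per_mwh = heat_rate_mmbtu_mwh * fuel_cost_per_mmbtu

# Total fuel cost for the period, USD.
total_fuel_cost = fuel_cost_per_mwh * net_generation_mwh

print(fuel_cost_per_mwh, total_fuel_cost)
```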

In [None]:
%%time
pudl_out_annual.fuel_cost()

### Per-generator Capacity Factor

In [None]:
%%time
pudl_out_annual.capacity_factor()
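
For reference, capacity factor is conventionally net generation divided by the generation that would result from running at full capacity for the whole period. The helper below is a hypothetical sketch of that definition for annual data, not the PUDL implementation.

```python
import calendar


def capacity_factor(net_generation_mwh, capacity_mw, year):
    """Return annual capacity factor as a fraction between 0 and 1."""
    hours_in_year = (366 if calendar.isleap(year) else 365) * 24
    return net_generation_mwh / (capacity_mw * hours_in_year)


# Hypothetical 100 MW generator that produced 438,000 MWh in 2021.
cf = capacity_factor(net_generation_mwh=438_000.0, capacity_mw=100.0, year=2021)
print(cf)
```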

### Per-generator MCOE
* This function uses the cached dataframes that were generated above to produce a huge table of per-generator statistics.
* If you just called this function alone, all of those other dataframes would be automatically generated, and available within the output object.
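
The caching behavior described above can be sketched with a small stand-in class. `CachedOutputs` and its methods are hypothetical and only illustrate the pattern; they are not the `PudlTabl` implementation.

```python
class CachedOutputs:
    """Hypothetical output object that builds each table once and caches it."""

    def __init__(self):
        self._cache = {}
        self.build_count = {}  # how many times each table was actually built

    def _get(self, name, builder):
        # Build the table only if it is not already cached.
        if name not in self._cache:
            self.build_count[name] = self.build_count.get(name, 0) + 1
            self._cache[name] = builder()
        return self._cache[name]

    def heat_rate(self):
        return self._get("heat_rate", lambda: {"heat_rate_mmbtu_mwh": 10.0})

    def fuel_cost(self):
        return self._get("fuel_cost", lambda: {"fuel_cost_per_mmbtu": 2.5})

    def mcoe(self):
        def build():
            # Requesting the prerequisites builds and caches them as a side effect.
            hr = self.heat_rate()["heat_rate_mmbtu_mwh"]
            fc = self.fuel_cost()["fuel_cost_per_mmbtu"]
            return {"fuel_cost_per_mwh": hr * fc}

        return self._get("mcoe", build)


out = CachedOutputs()
out.mcoe()       # builds heat_rate and fuel_cost along the way
out.mcoe()       # served from the cache
out.heat_rate()  # also served from the cache
print(out.build_count)
```

Each table is built exactly once, no matter how many times it is requested or whether it was requested directly or as a prerequisite.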

In [None]:
%%time
pudl_out_annual.mcoe()

## Free Memory

In [None]:
del pudl_out_annual

# Interim Output Tables 
* Integrating a new dataset into the PUDL database requires many steps (datastore, extract, transform, load, outputs).
* Sometimes we need to use tables from new datasets as soon as possible for analysis.
* The interim extract and transform steps can be hacked into the output object to run on the fly, prior to DB integration.
* The data extraction and transformation can take a while though -- and it will need to be re-run from scratch every time you create a new output object.
* **WARNING:** None of this data has been fully validated, and the structure is likely to change. Some of it (especially the FERC 714) is still in a pretty raw state.

As of December 2022, we have preliminarily integrated EIA 861 and FERC 714 in this format.

## EIA Form 861
* The interim EIA 861 ETL is set up to automatically run in its entirety as soon as you request any EIA 861 table.
* This should take 2-5 minutes if you already have the raw input data available.
* If raw input data needs to be downloaded [from our Zenodo archives](https://zenodo.org/record/4127029) first (which should happen automatically), it will take longer.

In [None]:
pudl_out = pudl.output.pudltabl.PudlTabl(pudl_engine=pudl_engine)

In [None]:
# here are all of the EIA 861 tables
methods_eia861 = [t for t in dir(pudl_out) if t.endswith('_eia861')]
methods_eia861

### EIA 861 Balancing Authorities

In [None]:
%%time
pudl_out.balancing_authority_eia861()

### EIA 861 Sales
How much electricity did utilities report selling to different types of customers in each year by state?

In [None]:
%%time
pudl_out.sales_eia861()

### EIA 861 Service Territories
Which counties (with FIPS codes) did each utility report serving in each year?

In [None]:
%%time
pudl_out.service_territory_eia861()

### Free Memory

In [None]:
del pudl_out

## FERC Form 714
* **NOTE:** Most of the FERC Form 714 tables have not yet been fully processed.
* We have primarily been focused on the historical hourly demand reported by planning areas.
* As with the EIA 861, the full interim ETL will be run as soon as you ask for any FERC 714 table.
* Also as with the EIA 861, if you don't have the [raw FERC 714 input files](https://zenodo.org/record/4127101) cached locally already, they might take a minute to download.

In [None]:
pudl_out = pudl.output.pudltabl.PudlTabl(pudl_engine=pudl_engine)

In [None]:
# here are all of the FERC 714 tables
methods_ferc714 = [t for t in dir(pudl_out) if t.endswith('_ferc714')]
methods_ferc714

### FERC 714 Respondents
Currently the processing of the hourly planning area demand table exceeds the available memory on this JupyterHub, so the following cells are commented out.

In [None]:
%%time
pudl_out.respondent_id_ferc714()

### FERC 714 Hourly Demand by Planning Area

In [None]:
pudl_out.demand_hourly_pa_ferc714()