# Add Overrides to Train FERC-EIA Connector

The FERC-EIA record linkage process requires training data to work properly. Training matches also serve as overrides. This notebook helps you check whether the machine learning algorithm did a good job of matching FERC and EIA records. If you confirm a good match or correct a bad one, this process turns it into training data.

This notebook has two purposes: 

1) [**Output override tools to verify connection between EIA and FERC1**](#verify-tools)
2) [**Upload changes to training data**](#upload-overrides)

## Settings

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import pudl_rmi
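# Import only the two functions this notebook calls.
from pudl_rmi.create_override_spreadsheets import (
    generate_override_tools,
    validate_and_add_to_training,
)
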
import pudl
import sqlalchemy as sa
import logging
import sys

import warnings
warnings.filterwarnings('ignore')

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler(stream=sys.stdout)
formatter = logging.Formatter('%(message)s')
handler.setFormatter(formatter)
logger.handlers = [handler]

pudl_settings = pudl.workspace.setup.get_defaults()
pudl_engine = sa.create_engine(pudl_settings["pudl_db"])
pudl_out = pudl.output.pudltabl.PudlTabl(
    pudl_engine,
    freq='AS',
    fill_fuel_cost=True,
    roll_fuel_cost=True,
    fill_net_gen=True,
)
rmi_out = pudl_rmi.coordinate.Output(pudl_out)

In [12]:
rmi_out.grab_ferc1_to_eia(clobber=True)

FERC to EIA granular connection not found at /Users/aesharpe/Desktop/Work/Catalyst_Coop/Repos/rmi-ferc1-eia/outputs/ferc1_eia.pkl.gz... Generating a new output.
Reading the plant part list from /Users/aesharpe/Desktop/Work/Catalyst_Coop/Repos/rmi-ferc1-eia/outputs/plant_parts_eia.pkl.gz
Preparing the FERC1 tables.
loading steam table
loading small gens table
loading hydro table
loading pumped storage table
prepping steam table
prepping hydro tables
combining all tables
Generated 168541 all candidate features.


AssertionError: Not all training data is associated with EIA records.
record_id_ferc1's of bad training data records are: ['f1_gnrt_plant_2008_12_108_0_5', 'f1_gnrt_plant_2009_12_108_0_5', 'f1_gnrt_plant_2010_12_108_0_5', 'f1_gnrt_plant_2011_12_108_0_4', 'f1_gnrt_plant_2012_12_108_0_4', 'f1_gnrt_plant_2013_12_108_0_4', 'f1_gnrt_plant_2014_12_108_0_4', 'f1_gnrt_plant_2015_12_108_0_4', 'f1_gnrt_plant_2016_12_108_0_4', 'f1_gnrt_plant_2017_12_108_0_4', 'f1_gnrt_plant_2018_12_108_0_4', 'f1_gnrt_plant_2019_12_108_0_4', 'f1_gnrt_plant_2015_12_157_0_7', 'f1_gnrt_plant_2016_12_157_0_7', 'f1_gnrt_plant_2017_12_157_0_7', 'f1_hydro_2005_12_134_4_1', 'f1_gnrt_plant_2018_12_157_0_7', 'f1_gnrt_plant_2019_12_157_0_7', 'f1_steam_2005_12_134_3_4', 'f1_steam_2005_12_157_1_1', 'f1_steam_2005_12_157_0_4', 'f1_steam_2005_12_134_3_5', 'f1_steam_2005_12_108_1_3', 'f1_steam_2005_12_210_2_3', 'f1_gnrt_plant_2005_12_210_0_4', 'f1_hydro_2006_12_134_4_1', 'f1_gnrt_plant_2006_12_210_0_4', 'f1_gnrt_plant_2007_12_210_0_4', 'f1_steam_2006_12_134_3_4', 'f1_steam_2006_12_157_0_4', 'f1_steam_2007_12_157_0_4', 'f1_steam_2006_12_108_1_3', 'f1_steam_2006_12_210_2_3', 'f1_steam_2006_12_108_2_3', 'f1_gnrt_plant_2008_12_210_0_4', 'f1_gnrt_plant_2009_12_210_0_4', 'f1_gnrt_plant_2010_12_210_0_4', 'f1_gnrt_plant_2011_12_210_0_4', 'f1_gnrt_plant_2012_12_210_0_4', 'f1_gnrt_plant_2013_12_210_0_4', 'f1_gnrt_plant_2014_12_210_0_4', 'f1_gnrt_plant_2015_12_210_0_4', 'f1_gnrt_plant_2016_12_210_0_4', 'f1_gnrt_plant_2017_12_210_0_4', 'f1_gnrt_plant_2018_12_210_0_4', 'f1_gnrt_plant_2019_12_210_0_4', 'f1_gnrt_plant_2008_12_108_0_7', 'f1_gnrt_plant_2009_12_108_0_7', 'f1_gnrt_plant_2010_12_108_0_7', 'f1_gnrt_plant_2011_12_108_0_7', 'f1_gnrt_plant_2007_12_134_0_4', 'f1_gnrt_plant_2012_12_108_0_7', 'f1_gnrt_plant_2013_12_108_0_7', 'f1_hydro_2007_12_134_4_1', 'f1_gnrt_plant_2014_12_108_0_7', 'f1_gnrt_plant_2015_12_108_0_10', 'f1_steam_2007_12_134_3_4', 'f1_steam_2008_12_157_0_4', 'f1_steam_2009_12_157_0_4', 'f1_steam_2007_12_108_1_3', 'f1_steam_2007_12_108_2_3', 
'f1_gnrt_plant_2016_12_108_0_10', 'f1_gnrt_plant_2017_12_108_0_10', 'f1_gnrt_plant_2018_12_108_0_10', 'f1_gnrt_plant_2019_12_108_0_10', 'f1_gnrt_plant_2005_12_157_0_9', 'f1_gnrt_plant_2006_12_157_0_9', 'f1_gnrt_plant_2007_12_157_0_9', 'f1_gnrt_plant_2008_12_157_0_9', 'f1_gnrt_plant_2009_12_157_0_6', 'f1_gnrt_plant_2007_12_157_0_21', 'f1_gnrt_plant_2008_12_157_0_20', 'f1_gnrt_plant_2009_12_157_0_16', 'f1_gnrt_plant_2010_12_157_0_7', 'f1_gnrt_plant_2011_12_157_0_7', 'f1_gnrt_plant_2012_12_157_0_5', 'f1_gnrt_plant_2013_12_157_0_5', 'f1_gnrt_plant_2014_12_157_0_5', 'f1_gnrt_plant_2005_12_157_0_21', 'f1_gnrt_plant_2006_12_157_0_21', 'f1_gnrt_plant_2008_12_108_0_1', 'f1_gnrt_plant_2009_12_108_0_1', 'f1_gnrt_plant_2010_12_108_0_1', 'f1_gnrt_plant_2008_12_157_0_6', 'f1_gnrt_plant_2008_12_157_0_4', 'f1_gnrt_plant_2011_12_108_0_1', 'f1_hydro_2008_12_134_4_1', 'f1_gnrt_plant_2012_12_108_0_1', 'f1_gnrt_plant_2013_12_108_0_1', 'f1_steam_2008_12_134_3_4', 'f1_steam_2010_12_157_0_3', 'f1_steam_2016_12_157_0_3', 'f1_steam_2008_12_108_1_3', 'f1_steam_2008_12_108_2_3', 'f1_steam_2008_12_108_2_5', 'f1_gnrt_plant_2014_12_108_0_1', 'f1_gnrt_plant_2015_12_108_0_1', 'f1_gnrt_plant_2016_12_108_0_1', 'f1_gnrt_plant_2017_12_108_0_1', 'f1_gnrt_plant_2018_12_108_0_1', 'f1_gnrt_plant_2019_12_108_0_1', 'f1_gnrt_plant_2015_12_108_0_6', 'f1_gnrt_plant_2016_12_108_0_6', 'f1_gnrt_plant_2017_12_108_0_6', 'f1_gnrt_plant_2018_12_108_0_6', 'f1_gnrt_plant_2019_12_108_0_6', 'f1_gnrt_plant_2008_12_108_0_4', 'f1_gnrt_plant_2009_12_108_0_4', 'f1_gnrt_plant_2010_12_108_0_4', 'f1_gnrt_plant_2011_12_108_0_3', 'f1_gnrt_plant_2012_12_108_0_3', 'f1_gnrt_plant_2013_12_108_0_3', 'f1_gnrt_plant_2014_12_108_0_3', 'f1_gnrt_plant_2015_12_108_0_3', 'f1_gnrt_plant_2016_12_108_0_3', 'f1_gnrt_plant_2017_12_108_0_3', 'f1_gnrt_plant_2018_12_108_0_3', 'f1_gnrt_plant_2019_12_108_0_3', 'f1_gnrt_plant_2008_12_108_0_2', 'f1_gnrt_plant_2009_12_157_0_4', 'f1_hydro_2009_12_134_4_1', 'f1_gnrt_plant_2009_12_108_0_2', 
'f1_gnrt_plant_2010_12_108_0_2', 'f1_steam_2009_12_134_3_4', 'f1_gnrt_plant_2011_12_108_0_2', 'f1_steam_2017_12_157_0_3', 'f1_steam_2009_12_108_1_3', 'f1_steam_2009_12_108_2_3', 'f1_steam_2009_12_108_2_5', 'f1_gnrt_plant_2012_12_108_0_2', 'f1_gnrt_plant_2013_12_108_0_2', 'f1_gnrt_plant_2014_12_108_0_2', 'f1_gnrt_plant_2015_12_108_0_2', 'f1_gnrt_plant_2016_12_108_0_2', 'f1_gnrt_plant_2017_12_108_0_2', 'f1_gnrt_plant_2018_12_108_0_2', 'f1_gnrt_plant_2019_12_108_0_2', 'f1_gnrt_plant_2008_12_108_0_3', 'f1_gnrt_plant_2009_12_108_0_3', 'f1_gnrt_plant_2010_12_108_0_3', 'f1_gnrt_plant_2005_12_134_0_2', 'f1_gnrt_plant_2007_12_134_0_2', 'f1_gnrt_plant_2005_12_134_0_13', 'f1_gnrt_plant_2006_12_134_0_13', 'f1_gnrt_plant_2007_12_134_0_12', 'f1_gnrt_plant_2008_12_134_0_10', 'f1_gnrt_plant_2009_12_134_0_10', 'f1_gnrt_plant_2010_12_134_0_10', 'f1_gnrt_plant_2011_12_134_0_10', 'f1_gnrt_plant_2012_12_134_0_8', 'f1_gnrt_plant_2013_12_134_0_8', 'f1_gnrt_plant_2014_12_134_0_8', 'f1_hydro_2010_12_134_4_1', 'f1_gnrt_plant_2015_12_134_0_8', 'f1_gnrt_plant_2005_12_134_0_15', 'f1_steam_2010_12_134_3_4', 'f1_gnrt_plant_2006_12_134_0_15', 'f1_steam_2018_12_157_0_3', 'f1_steam_2010_12_108_1_3', 'f1_steam_2010_12_108_2_5', 'f1_gnrt_plant_2007_12_134_0_14', 'f1_gnrt_plant_2008_12_134_0_12', 'f1_gnrt_plant_2009_12_134_0_12', 'f1_gnrt_plant_2010_12_134_0_12', 'f1_gnrt_plant_2011_12_134_0_12', 'f1_gnrt_plant_2012_12_134_0_10', 'f1_gnrt_plant_2013_12_134_0_10', 'f1_gnrt_plant_2014_12_134_0_10', 'f1_gnrt_plant_2015_12_134_0_10', 'f1_gnrt_plant_2016_12_134_0_9', 'f1_gnrt_plant_2017_12_134_0_9', 'f1_gnrt_plant_2018_12_134_0_9', 'f1_gnrt_plant_2019_12_134_0_9', 'f1_gnrt_plant_2005_12_134_0_32', 'f1_gnrt_plant_2006_12_134_0_32', 'f1_gnrt_plant_2007_12_134_0_30', 'f1_gnrt_plant_2008_12_134_0_29', 'f1_gnrt_plant_2009_12_134_0_29', 'f1_gnrt_plant_2010_12_134_0_29', 'f1_gnrt_plant_2011_12_134_0_29', 'f1_gnrt_plant_2011_12_134_0_21', 'f1_gnrt_plant_2012_12_134_0_25', 'f1_gnrt_plant_2013_12_134_0_25', 
'f1_hydro_2011_12_134_4_1', 'f1_gnrt_plant_2014_12_134_0_24', 'f1_gnrt_plant_2015_12_134_0_24', 'f1_gnrt_plant_2016_12_134_0_23', 'f1_steam_2019_12_157_0_3', 'f1_steam_2011_12_108_1_3', 'f1_steam_2011_12_108_2_5', 'f1_gnrt_plant_2017_12_134_0_23', 'f1_gnrt_plant_2018_12_134_0_23', 'f1_gnrt_plant_2019_12_134_0_23', 'f1_gnrt_plant_2005_12_134_0_37', 'f1_gnrt_plant_2006_12_134_0_37', 'f1_gnrt_plant_2007_12_134_0_35', 'f1_gnrt_plant_2008_12_134_0_34', 'f1_gnrt_plant_2009_12_134_0_34', 'f1_gnrt_plant_2010_12_134_0_34', 'f1_gnrt_plant_2011_12_134_0_34', 'f1_gnrt_plant_2012_12_134_0_30', 'f1_gnrt_plant_2013_12_134_0_30', 'f1_gnrt_plant_2014_12_134_0_29', 'f1_gnrt_plant_2015_12_134_0_29', 'f1_gnrt_plant_2016_12_134_0_28', 'f1_gnrt_plant_2017_12_134_0_28', 'f1_gnrt_plant_2018_12_134_0_28', 'f1_gnrt_plant_2019_12_134_0_28', 'f1_gnrt_plant_2005_12_134_0_34', 'f1_gnrt_plant_2006_12_134_0_34', 'f1_gnrt_plant_2007_12_134_0_32', 'f1_gnrt_plant_2008_12_134_0_31', 'f1_hydro_2012_12_134_4_1', 'f1_gnrt_plant_2009_12_134_0_31', 'f1_gnrt_plant_2010_12_134_0_31', 'f1_gnrt_plant_2011_12_134_0_31', 'f1_steam_2011_12_157_0_3', 'f1_gnrt_plant_2012_12_134_0_27', 'f1_steam_2012_12_108_1_3', 'f1_steam_2012_12_108_2_5', 'f1_gnrt_plant_2013_12_134_0_27', 'f1_gnrt_plant_2014_12_134_0_26', 'f1_gnrt_plant_2015_12_134_0_26', 'f1_gnrt_plant_2016_12_134_0_25', 'f1_gnrt_plant_2017_12_134_0_25', 'f1_gnrt_plant_2018_12_134_0_25', 'f1_gnrt_plant_2019_12_134_0_25', 'f1_gnrt_plant_2005_12_134_0_17', 'f1_gnrt_plant_2006_12_134_0_17', 'f1_gnrt_plant_2007_12_134_0_16', 'f1_gnrt_plant_2008_12_134_0_14', 'f1_gnrt_plant_2009_12_134_0_14', 'f1_gnrt_plant_2010_12_134_0_14', 'f1_gnrt_plant_2011_12_134_0_14', 'f1_gnrt_plant_2012_12_134_0_12', 'f1_gnrt_plant_2013_12_134_0_12', 'f1_gnrt_plant_2014_12_134_0_12', 'f1_gnrt_plant_2015_12_134_0_12', 'f1_gnrt_plant_2016_12_134_0_11', 'f1_gnrt_plant_2017_12_134_0_11', 'f1_gnrt_plant_2018_12_134_0_11', 'f1_gnrt_plant_2019_12_134_0_11', 'f1_gnrt_plant_2005_12_134_0_23', 
'f1_hydro_2013_12_134_4_1', 'f1_gnrt_plant_2006_12_134_0_23', 'f1_gnrt_plant_2007_12_134_0_21', 'f1_steam_2012_12_157_0_3', 'f1_steam_2013_12_108_1_3', 'f1_steam_2013_12_108_2_5', 'f1_gnrt_plant_2008_12_134_0_20', 'f1_gnrt_plant_2009_12_134_0_20', 'f1_gnrt_plant_2010_12_134_0_20', 'f1_gnrt_plant_2011_12_134_0_20', 'f1_gnrt_plant_2012_12_134_0_17', 'f1_gnrt_plant_2013_12_134_0_17', 'f1_gnrt_plant_2014_12_134_0_17', 'f1_gnrt_plant_2015_12_134_0_17', 'f1_gnrt_plant_2016_12_134_0_16', 'f1_gnrt_plant_2017_12_134_0_16', 'f1_gnrt_plant_2018_12_134_0_16', 'f1_gnrt_plant_2019_12_134_0_16', 'f1_gnrt_plant_2005_12_134_0_26', 'f1_gnrt_plant_2006_12_134_0_26', 'f1_gnrt_plant_2007_12_134_0_24', 'f1_gnrt_plant_2008_12_134_0_23', 'f1_gnrt_plant_2009_12_134_0_23', 'f1_gnrt_plant_2010_12_134_0_23', 'f1_gnrt_plant_2011_12_134_0_23', 'f1_gnrt_plant_2012_12_134_0_19', 'f1_gnrt_plant_2013_12_134_0_19', 'f1_gnrt_plant_2005_12_134_0_33', 'f1_hydro_2014_12_134_4_1', 'f1_gnrt_plant_2006_12_134_0_33', 'f1_gnrt_plant_2007_12_134_0_31', 'f1_steam_2013_12_157_0_3', 'f1_steam_2014_12_108_1_2', 'f1_gnrt_plant_2008_12_134_0_30', 'f1_gnrt_plant_2009_12_134_0_30', 'f1_gnrt_plant_2010_12_134_0_30', 'f1_gnrt_plant_2011_12_134_0_30', 'f1_gnrt_plant_2012_12_134_0_26', 'f1_gnrt_plant_2013_12_134_0_26', 'f1_gnrt_plant_2014_12_134_0_25', 'f1_gnrt_plant_2015_12_134_0_25', 'f1_gnrt_plant_2016_12_134_0_24', 'f1_gnrt_plant_2017_12_134_0_24', 'f1_gnrt_plant_2018_12_134_0_24', 'f1_gnrt_plant_2019_12_134_0_24', 'f1_gnrt_plant_2005_12_134_0_27', 'f1_gnrt_plant_2006_12_134_0_27', 'f1_gnrt_plant_2007_12_134_0_25', 'f1_gnrt_plant_2008_12_134_0_24', 'f1_gnrt_plant_2009_12_134_0_24', 'f1_gnrt_plant_2010_12_134_0_24', 'f1_gnrt_plant_2011_12_134_0_24', 'f1_gnrt_plant_2012_12_134_0_20', 'f1_hydro_2015_12_134_4_1', 'f1_gnrt_plant_2013_12_134_0_20', 'f1_gnrt_plant_2014_12_134_0_19', 'f1_gnrt_plant_2015_12_134_0_19', 'f1_steam_2015_12_134_3_3', 'f1_gnrt_plant_2016_12_134_0_18', 'f1_steam_2014_12_157_0_3', 
'f1_steam_2015_12_108_1_2', 'f1_gnrt_plant_2017_12_134_0_18', 'f1_gnrt_plant_2018_12_134_0_18', 'f1_gnrt_plant_2019_12_134_0_18', 'f1_gnrt_plant_2005_12_134_0_28', 'f1_gnrt_plant_2006_12_134_0_28', 'f1_gnrt_plant_2007_12_134_0_26', 'f1_gnrt_plant_2008_12_134_0_25', 'f1_gnrt_plant_2009_12_134_0_25', 'f1_gnrt_plant_2010_12_134_0_25', 'f1_gnrt_plant_2011_12_134_0_25', 'f1_gnrt_plant_2012_12_134_0_21', 'f1_gnrt_plant_2013_12_134_0_21', 'f1_gnrt_plant_2014_12_134_0_20', 'f1_gnrt_plant_2015_12_134_0_20', 'f1_gnrt_plant_2016_12_134_0_19', 'f1_gnrt_plant_2017_12_134_0_19', 'f1_gnrt_plant_2018_12_134_0_19', 'f1_gnrt_plant_2019_12_134_0_19', 'f1_gnrt_plant_2005_12_134_0_31', 'f1_gnrt_plant_2006_12_134_0_31', 'f1_gnrt_plant_2007_12_134_0_29', 'f1_gnrt_plant_2008_12_134_0_28', 'f1_gnrt_plant_2009_12_134_0_28', 'f1_steam_2015_12_157_0_3', 'f1_steam_2016_12_108_1_2', 'f1_gnrt_plant_2010_12_134_0_28', 'f1_gnrt_plant_2011_12_134_0_28', 'f1_gnrt_plant_2012_12_134_0_24', 'f1_gnrt_plant_2013_12_134_0_24', 'f1_gnrt_plant_2014_12_134_0_23', 'f1_gnrt_plant_2015_12_134_0_23', 'f1_gnrt_plant_2016_12_134_0_22', 'f1_gnrt_plant_2017_12_134_0_22', 'f1_gnrt_plant_2018_12_134_0_22', 'f1_gnrt_plant_2019_12_134_0_22', 'f1_gnrt_plant_2005_12_157_0_18', 'f1_gnrt_plant_2006_12_157_0_18', 'f1_gnrt_plant_2007_12_157_0_18', 'f1_gnrt_plant_2008_12_157_0_18', 'f1_gnrt_plant_2009_12_157_0_19', 'f1_gnrt_plant_2010_12_157_0_10', 'f1_gnrt_plant_2011_12_157_0_10', 'f1_gnrt_plant_2012_12_157_0_8', 'f1_gnrt_plant_2013_12_157_0_8', 'f1_gnrt_plant_2014_12_157_0_8', 'f1_gnrt_plant_2015_12_157_0_6', 'f1_gnrt_plant_2016_12_157_0_6', 'f1_gnrt_plant_2017_12_157_0_6', 'f1_steam_2006_12_134_3_5', 'f1_steam_2017_12_108_1_2', 'f1_gnrt_plant_2018_12_157_0_6', 'f1_gnrt_plant_2019_12_157_0_6', 'f1_gnrt_plant_2007_12_157_0_7', 'f1_gnrt_plant_2008_12_157_0_7', 'f1_gnrt_plant_2007_12_157_0_19', 'f1_gnrt_plant_2008_12_157_0_22', 'f1_gnrt_plant_2009_12_157_0_23', 'f1_gnrt_plant_2010_12_157_0_14', 'f1_gnrt_plant_2011_12_157_0_14', 
'f1_gnrt_plant_2012_12_157_0_12', 'f1_gnrt_plant_2013_12_157_0_12', 'f1_gnrt_plant_2014_12_157_0_12', 'f1_gnrt_plant_2009_12_157_0_20', 'f1_gnrt_plant_2010_12_157_0_11', 'f1_gnrt_plant_2011_12_157_0_11', 'f1_gnrt_plant_2012_12_157_0_9', 'f1_gnrt_plant_2013_12_157_0_9', 'f1_gnrt_plant_2014_12_157_0_9', 'f1_gnrt_plant_2015_12_157_0_5', 'f1_gnrt_plant_2016_12_157_0_5', 'f1_gnrt_plant_2017_12_157_0_5', 'f1_gnrt_plant_2018_12_157_0_5', 'f1_gnrt_plant_2019_12_157_0_5', 'f1_steam_2007_12_134_3_5', 'f1_steam_2018_12_108_1_2', 'f1_gnrt_plant_2009_12_157_0_21', 'f1_gnrt_plant_2010_12_157_0_12', 'f1_gnrt_plant_2011_12_157_0_12', 'f1_gnrt_plant_2012_12_157_0_10', 'f1_gnrt_plant_2013_12_157_0_10', 'f1_gnrt_plant_2014_12_157_0_10', 'f1_gnrt_plant_2015_12_157_0_10', 'f1_gnrt_plant_2016_12_157_0_10', 'f1_gnrt_plant_2017_12_157_0_10', 'f1_gnrt_plant_2018_12_157_0_10', 'f1_gnrt_plant_2019_12_157_0_10', 'f1_gnrt_plant_2008_12_210_0_18', 'f1_gnrt_plant_2009_12_210_0_18', 'f1_gnrt_plant_2010_12_210_0_18', 'f1_gnrt_plant_2011_12_210_0_18', 'f1_gnrt_plant_2012_12_210_0_18', 'f1_gnrt_plant_2013_12_210_0_18', 'f1_gnrt_plant_2014_12_210_0_18', 'f1_gnrt_plant_2015_12_210_0_18', 'f1_gnrt_plant_2016_12_210_0_18', 'f1_gnrt_plant_2017_12_210_0_19', 'f1_gnrt_plant_2018_12_210_0_19', 'f1_gnrt_plant_2019_12_210_0_19', 'f1_gnrt_plant_2007_12_210_0_18', 'f1_steam_2008_12_134_3_5', 'f1_steam_2019_12_108_1_2']
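
A list this long is easier to act on once it is grouped. The sketch below tallies failing IDs by FERC table and report year. It assumes, based only on the IDs printed above, that each `record_id_ferc1` begins with a table name followed by a four-digit year. The helper name `summarize_bad_records` and the short sample list are illustrative and not part of `pudl_rmi`.

```python
import re
from collections import Counter

# Assumed ID layout, inferred from the IDs in the error message:
# <table name>_<4-digit year>_<remaining numeric fields>
RECORD_ID_PATTERN = re.compile(r"^(?P<table>[a-z0-9_]+?)_(?P<year>\d{4})_")


def summarize_bad_records(record_ids):
    """Count record IDs per (table, year).

    IDs that do not fit the assumed layout are counted under ("unparsed", 0).
    """
    counts = Counter()
    for record_id in record_ids:
        match = RECORD_ID_PATTERN.match(record_id)
        if match is None:
            counts[("unparsed", 0)] += 1
        else:
            counts[(match["table"], int(match["year"]))] += 1
    return counts


# A few IDs copied from the error message above.
sample_ids = [
    "f1_gnrt_plant_2008_12_108_0_5",
    "f1_gnrt_plant_2008_12_210_0_4",
    "f1_hydro_2005_12_134_4_1",
    "f1_steam_2005_12_134_3_4",
]
summary = summarize_bad_records(sample_ids)
for (table, year), n in sorted(summary.items()):
    print(table, year, n)
```

Passing the full list from the error message instead of `sample_ids` shows which tables and years account for most of the unmatched training records.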

## Specify Utilities & Years

In [3]:
# old

specified_utilities = {
    # 'Dominion': {'utility_id_pudl': [292, 293, 349],
    #              'utility_id_eia': [17539, 17554, 19876]},
    # 'Evergy': {'utility_id_pudl': [159, 160, 161, 1270, 13243],
    #            'utility_id_eia': [10000, 10005, 56211, 3702, 55329]}, # pudl/eia 359/22500 --> 13243/55329, 1270/3702 --> BAD
    # 'IDACORP': {'utility_id_pudl': [140],
    #             'utility_id_eia': [9191]},
    # 'Duke': {'utility_id_pudl': [90, 91, 92, 93, 96, 97],
    #          'utility_id_eia': [5416, 6455, 15470, 55729, 3542, 3046]},
    'BHE': {'utility_id_pudl': [185, 246, 204, 287],
            'utility_id_eia': [12341, 14354, 13407, 17166]},
    'Southern': {'utility_id_pudl': [123, 18, 190, 11830],
                 'utility_id_eia': [7140, 195, 12686, 17622]},
    # 'NextEra': {'utility_id_pudl': [121, 130],
    #             'utility_id_eia': [6452, 7801]},
    # 'AEP': {'utility_id_pudl': [29, 301, 144, 275, 162, 361, 7],
    #         'utility_id_eia': [733, 17698, 9324, 15474, 22053, 20521, 343]},
    # 'Entergy': {'utility_id_pudl': [107, 106, 311, 113, 110],
    #             'utility_id_eia': [11241, 814, 12465, 55937, 13478]},
    # 'Xcel': {'utility_id_pudl': [224, 302, 272, 11297],
    #          'utility_id_eia': [13781, 13780, 17718, 15466]}
}

In [3]:
specified_utilities = {
    'BHE': [12341, 14354, 13407, 17166],
    'Southern': [7140, 195, 12686, 17622]
}

specified_years = [
    2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 
    2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
] 
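
Typing the years out by hand is error-prone when the range changes. A minimal alternative, assuming the years of interest are always contiguous, is to build the same list with `range`. The names `first_year` and `last_year` are illustrative.

```python
# Assumed bounds; adjust to the reporting years you want to review.
first_year = 2005
last_year = 2020

# range() excludes its stop value, so add 1 to include last_year.
specified_years = list(range(first_year, last_year + 1))
print(specified_years[0], specified_years[-1], len(specified_years))
```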

<a id='verify-tools'></a>
## 1) Output Override Tools
Run the following function to create Excel files named `<UTILITY>_fix_FERC-EIA_overrides.xlsx` in the `outputs/overrides` directory, based on the utility and year inputs you specified above. Read the [Override Instructions](https://docs.google.com/document/d/1nJfmUtbSN-RT5U2Z3rJKfOIhWsRFUPNxs9NKTes0SRA/edit#) to learn how to begin fixing and verifying the FERC-EIA connections.

In [4]:
generate_override_tools(pudl_out, rmi_out, specified_utilities, specified_years)

Reading the FERC to EIA connection from /Users/aesharpe/Desktop/Work/Catalyst_Coop/Repos/rmi-ferc1-eia/outputs/ferc1_eia.pkl.gz
Reading the plant part list from /Users/aesharpe/Desktop/Work/Catalyst_Coop/Repos/rmi-ferc1-eia/outputs/plant_parts_eia.pkl.gz
Grabbing depreciation study output from /Users/aesharpe/Desktop/Work/Catalyst_Coop/Repos/rmi-ferc1-eia/outputs/deprish.pkl.gz

Developing outputs for BHE
Outputing table subsets to tabs

Developing outputs for Southern
Outputing table subsets to tabs
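
To confirm which workbooks were written, you can scan the output directory for the filename pattern described above. This is a sketch using only the standard library. The function name `list_override_workbooks` is hypothetical, and the relative path `outputs/overrides` is assumed to resolve from wherever your copy of the repository lives.

```python
from pathlib import Path

# Filename suffix taken from the pattern <UTILITY>_fix_FERC-EIA_overrides.xlsx
SUFFIX = "_fix_FERC-EIA_overrides.xlsx"


def list_override_workbooks(directory):
    """Map each utility name to its override workbook path in `directory`."""
    workbooks = {}
    for path in sorted(Path(directory).glob(f"*{SUFFIX}")):
        # Strip the fixed suffix to recover the utility name.
        utility = path.name[: -len(SUFFIX)]
        workbooks[utility] = path
    return workbooks


if __name__ == "__main__":
    # Assumed location; adjust to your checkout of the repository.
    for utility, path in list_override_workbooks("outputs/overrides").items():
        print(utility, path)
```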



<a id='upload-overrides'></a>
## 2) Upload changes to training data
When you've finished editing a `<UTILITY>_fix_FERC-EIA_overrides.xlsx` file and want to add your changes to the official override CSV, move the file to the `add_to_training` directory and then run the following function.

**Note:** If you have changed or marked TRUE any records that were already overridden and included in the training data, set `expect_override_overrides = True`. Otherwise, the function checks whether you have accidentally altered values that were already matched.

Right now, the module points to a COPY of the training data so it doesn't overwrite the official version. To update the official version, you'll need to change where the module points.

In [3]:
validate_and_add_to_training(
    pudl_out, rmi_out, expect_override_overrides=True
)

Reading the FERC to EIA connection from /Users/aesharpe/Desktop/Work/Catalyst_Coop/Repos/rmi-ferc1-eia/outputs/ferc1_eia.pkl.gz
Reading the plant part list from /Users/aesharpe/Desktop/Work/Catalyst_Coop/Repos/rmi-ferc1-eia/outputs/plant_parts_eia.pkl.gz
Processing fixes in BHE_fix_FERC-EIA_overrides.xlsx
Validating overrides
Checking eia record id consistency for values that don't exist
Checking ferc record id consistency for values that don't exist
Checking for duplicate override ids
Checking for mismatched utility ids
Checking that year in override id matches report year
Adding overrides to training data
Combining all new overrides with existing training data
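
The log above names several checks that run before overrides are merged. Two of them, duplicate override IDs and a year in the ID that disagrees with the report year, can be approximated on a small table as shown below. This is a sketch of the idea rather than the `pudl_rmi` implementation: the `report_year` column name, the toy rows, and the assumption that the year is the first four-digit field in `record_id_ferc1` are all illustrative.

```python
import pandas as pd

# Toy override rows. The column names are assumptions for illustration.
overrides = pd.DataFrame(
    {
        "record_id_ferc1": [
            "f1_steam_2005_12_134_3_4",
            "f1_steam_2006_12_134_3_4",
            "f1_steam_2006_12_134_3_4",
        ],
        "report_year": [2005, 2007, 2006],
    }
)

# Check 1: the same FERC record should not be overridden more than once.
duplicated = overrides[overrides["record_id_ferc1"].duplicated(keep=False)]

# Check 2: the year embedded in the ID should match the report year.
id_year = (
    overrides["record_id_ferc1"]
    .str.extract(r"_(\d{4})_", expand=False)
    .astype(int)
)
mismatched = overrides[id_year != overrides["report_year"]]

print(f"{len(duplicated)} duplicated rows, {len(mismatched)} year mismatches")
```

Running checks like these on a workbook before moving it into `add_to_training` can surface problems earlier than the full validation run.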
