# EIA 860M Update Inspection

Before running this notebook, refresh the changelog data by updating two parameters:
- In `.env`, set `PUDL_VERSION` to the latest release listed [here](https://github.com/catalyst-cooperative/pudl/releases).
- In `src/constants.py`, set `PUDL_LATEST_YEAR` to the latest year for which PUDL has complete data.

Then run `make all`.

In [2]:
import pandas as pd

from dbcp.helpers import get_sql_engine

engine = get_sql_engine()

with engine.connect() as con:
    eia860m = pd.read_sql_table("pudl_eia860m_changelog", con, schema="data_warehouse")
    pudl_eia860m_status_codes = pd.read_sql_table("pudl_eia860m_status_codes", con, schema="data_warehouse")

In [3]:
# Records with a report_date before this cutoff belong to the previous quarterly update.
PREVIOUS_QUARTER_DATE = "2024-10-01"


def grab_current_quarter(df):
    """Return the most recent record for each (generator_id, plant_id_eia) pair."""
    recent_quarter = df.loc[df.groupby(["generator_id", "plant_id_eia"])["report_date"].idxmax()]
    assert not recent_quarter.duplicated(subset=["generator_id", "plant_id_eia"]).any()
    return recent_quarter


def grab_previous_quarter(df):
    """Return the most recent record per generator reported before PREVIOUS_QUARTER_DATE."""
    previous_quarter = df[df.report_date < PREVIOUS_QUARTER_DATE]
    previous_quarter = previous_quarter.loc[previous_quarter.groupby(["generator_id", "plant_id_eia"])["report_date"].idxmax()]
    assert not previous_quarter.duplicated(subset=["generator_id", "plant_id_eia"]).any()
    return previous_quarter


def pct_change_mw_by_status(df):
    """Return the percent change in total capacity (MW) per status code between the two quarters."""
    recent_quarter = grab_current_quarter(df)
    recent_quarter_mw_by_status = recent_quarter.groupby("operational_status_code").capacity_mw.sum()

    previous_quarter = grab_previous_quarter(df)
    previous_quarter_mw_by_status = previous_quarter.groupby("operational_status_code").capacity_mw.sum()

    # Status codes present in only one of the two quarters come out as NaN.
    return ((recent_quarter_mw_by_status - previous_quarter_mw_by_status) / previous_quarter_mw_by_status) * 100

How many generators had a status change in this quarterly update? We expect only a small share of generators to change status between quarters.

Merge the two quarters on `generator_id` and `plant_id_eia`. Each quarter should have exactly one record per generator, so the merge should be one-to-one.
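
As a minimal sketch of what the one-to-one validation guards against, the toy example below uses made-up data whose column names mirror the changelog. It shows that `generator_id` alone is not unique across plants, and that a duplicated key makes `validate="1:1"` raise `MergeError`. The frames and the `current_with_dupe` name are illustrative only.

```python
import pandas as pd
from pandas.errors import MergeError

# Toy data: column names mirror the changelog, values are made up.
# Note that generator "A" exists at two different plants.
previous = pd.DataFrame({
    "plant_id_eia": [1, 1, 2],
    "generator_id": ["A", "B", "A"],
    "operational_status_code": [5, 4, 7],
})
current = pd.DataFrame({
    "plant_id_eia": [1, 1, 2],
    "generator_id": ["A", "B", "A"],
    "operational_status_code": [7, 4, 7],
})

keys = ["generator_id", "plant_id_eia"]
merged = previous.merge(current, on=keys, validate="1:1", suffixes=("_previous", "_current"))

# Count generators whose status differs between the two quarters.
n_changed = int(
    merged["operational_status_code_previous"]
    .ne(merged["operational_status_code_current"])
    .sum()
)

# A duplicated key on either side makes validate="1:1" raise MergeError.
current_with_dupe = pd.concat([current, current.iloc[[0]]], ignore_index=True)
try:
    previous.merge(current_with_dupe, on=keys, validate="1:1")
    raised = False
except MergeError:
    raised = True
```

Merging on both keys keeps generator "A" at plant 1 separate from generator "A" at plant 2, which is why the pair is used as the generator identifier.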


In [4]:
previous_quarter = grab_previous_quarter(eia860m)
current_quarter = grab_current_quarter(eia860m)

In [12]:
previous_quarter.report_date.max(), current_quarter.report_date.max()

(Timestamp('2024-09-01 00:00:00'), Timestamp('2024-12-01 00:00:00'))

In [5]:
merged_quarters = previous_quarter.merge(current_quarter, on=["generator_id", "plant_id_eia"], validate="1:1", suffixes=("_previous", "_current"))

different_status_codes = merged_quarters["operational_status_code_previous"].ne(merged_quarters["operational_status_code_current"])
different_status_codes.value_counts()


False    35817
True       525
dtype: int64

For the generators that have a different status in the new update, check whether the status change makes sense ("Operational to Retired", "Under Construction to Operational", etc.). A high-level check is to see whether the numeric status code stays the same or increases, since higher-numbered operational status codes represent more advanced stages in a generator's life cycle.
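
One caveat with a plain numeric comparison is that values such as 98 and 99 appear in the status code lookup and may not be ordinary life-cycle stages. The sketch below, on made-up transitions, separates those cases from genuine backward moves. Treating 98 and 99 as placeholders is an assumption, and `SENTINEL_CODES` is a hypothetical name, not something defined in this project.

```python
import pandas as pd

# Assumption: 98 and 99 are placeholder values rather than ordered life-cycle stages.
SENTINEL_CODES = {98, 99}

# Made-up transitions using the merged column names from this notebook.
transitions = pd.DataFrame({
    "operational_status_code_previous": [4, 5, 4, 1, 7],
    "operational_status_code_current": [5, 7, 3, 98, 7],
})

prev = transitions["operational_status_code_previous"]
curr = transitions["operational_status_code_current"]

# Transitions touching a placeholder code are set aside for manual review.
involves_sentinel = prev.isin(SENTINEL_CODES) | curr.isin(SENTINEL_CODES)

# A backward move is a strictly lower code among the ordinary stages.
moved_backward = curr.lt(prev) & ~involves_sentinel

regressions = transitions[moved_backward]
needs_manual_review = transitions[involves_sentinel]
```

Here only the 4 to 3 transition is flagged as a regression, while the 1 to 98 transition is routed to manual review instead of counting as an advance.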

In [6]:
new_status_code_is_greater = merged_quarters["operational_status_code_previous"].le(merged_quarters["operational_status_code_current"])

new_status_code_is_greater.value_counts()

True     36334
False        8
dtype: int64

In [7]:
merged_quarters[~new_status_code_is_greater][["raw_operational_status_code_previous", "raw_operational_status_code_current"]]

| | raw_operational_status_code_previous | raw_operational_status_code_current |
|---|---|---|
| 16411 | U | T |
| 17921 | T | L |
| 21373 | V | U |
| 22394 | RE | OP |
| 22740 | U | T |
| 22749 | U | L |
| 29871 | U | T |
| 36285 | T | L |


Only one of these generators came out of retirement (`RE` to `OP`); the other seven moved back to an earlier pre-operational stage. Let's dig into the status codes of the generators that have a new status in the updated data.
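
To make transitions easier to read, the raw codes can be mapped to their descriptions before counting. The sketch below uses a toy lookup with the same columns as `pudl_eia860m_status_codes` (`code`, `status`, `description`). The descriptions are abbreviated stand-ins, and the `labeled` and `transition_counts` names are illustrative.

```python
import pandas as pd

# Toy lookup with the same columns as the status code table; descriptions abbreviated.
status_codes = pd.DataFrame({
    "code": ["U", "V", "OP"],
    "status": [4, 5, 7],
    "description": [
        "Under construction, <= 50% complete",
        "Under construction, > 50% complete",
        "Operating",
    ],
})

# Made-up transitions using the merged column names from this notebook.
transitions = pd.DataFrame({
    "raw_operational_status_code_previous": ["V", "U", "V"],
    "raw_operational_status_code_current": ["OP", "V", "OP"],
})

# Build a code -> description Series and map both sides of each transition.
code_to_description = status_codes.set_index("code")["description"]
labeled = transitions.assign(
    previous_description=transitions["raw_operational_status_code_previous"].map(code_to_description),
    current_description=transitions["raw_operational_status_code_current"].map(code_to_description),
)

# Count each (previous, current) pair, most frequent first.
transition_counts = labeled.value_counts(["previous_description", "current_description"])
```

Codes missing from the lookup map to `NaN`, and `value_counts` drops those rows by default, so a lower total than expected is a hint that the lookup is incomplete.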

In [8]:
pd.set_option('display.max_colwidth', None)

pudl_eia860m_status_codes

| | code | status | description |
|---|---|---|---|
| 0 | OT | 99 | proposed |
| 1 | IP | 98 | Planned new indefinitely postponed, or no longer in resource plan |
| 2 | P | 1 | Planned for installation but regulatory approvals not initiated; Not under construction |
| 3 | L | 2 | Not under construction but site preparation could be underway |
| 4 | T | 3 | Regulatory approvals received. Not under construction but site preparation could be underway |
| 5 | U | 4 | Under construction, less than or equal to 50 percent complete (based on construction time to date of operation) |
| 6 | V | 5 | Under construction, more than 50 percent complete (based on construction time to date of operation) |
| 7 | TS | 6 | Construction complete, but not yet in commercial operation (including low power testing of nuclear units). Operating under test conditions. |
| 8 | OA | 7 | Was not used for some or all of the reporting period but is expected to be returned to service in the next calendar year. |
| 9 | OP | 7 | Operating (in commercial service or out of service within 365 days). For generators, this means in service (commercial operation) and producing some electricity. Includes peaking units that are run on an as needed (intermittent or seasonal) basis. |


In [13]:
merged_quarters[different_status_codes][["raw_operational_status_code_previous", "raw_operational_status_code_current"]].value_counts()

raw_operational_status_code_previous  raw_operational_status_code_current
V                                     OP                                     141
TS                                    OP                                      72
U                                     V                                       62
                                      OP                                      53
V                                     TS                                      27
T                                     U                                       22
OP                                    RE                                      19
U                                     TS                                      18
T                                     V                                       16
P                                     U                                       15
SB                                    RE                                      14
L                                  

Look at the percent change in total capacity (MW) for each status code between the previous and current quarter.

In [10]:
pct_change_mw_by_status(eia860m)

operational_status_code
1     27.461271
2     43.826788
3     11.220252
4      8.332664
5    -13.286307
6    -21.737968
7      1.346385
8      0.820580
99     0.000000
Name: capacity_mw, dtype: float64
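
To help read these numbers, here is a toy sketch of the same calculation on made-up data. It shows how a single generator moving from one status to another lowers one bucket and raises another, and how a status that appears in only one quarter yields `NaN`. The frames and values are illustrative, not taken from the changelog.

```python
import pandas as pd

# Made-up quarters: one 100 MW generator moves from status 1 to status 7,
# and a new 50 MW generator appears with status 6.
previous = pd.DataFrame({
    "operational_status_code": [1, 1, 7],
    "capacity_mw": [100.0, 100.0, 400.0],
})
current = pd.DataFrame({
    "operational_status_code": [1, 7, 7, 6],
    "capacity_mw": [100.0, 100.0, 400.0, 50.0],
})

prev_mw = previous.groupby("operational_status_code")["capacity_mw"].sum()
curr_mw = current.groupby("operational_status_code")["capacity_mw"].sum()

# Series arithmetic aligns on the status code index;
# codes missing from either side produce NaN.
pct_change = (curr_mw - prev_mw) / prev_mw * 100
```

In this toy case status 1 drops by 50 percent, status 7 grows by 25 percent, and status 6 is `NaN` because it has no capacity in the previous quarter. Because the denominators differ, the same megawatts leaving a small bucket show up as a much larger percentage than when they arrive in a large one.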

## Capacity by status by ISO

In [14]:
ISO_REGIONS = ("MISO", "PJM", "CISO", "ERCO", "ISNE", "NYIS", "SWPP") 

In [15]:
eia860_isos = eia860m[eia860m.balancing_authority_code_eia.isin(ISO_REGIONS)]


for region in ISO_REGIONS:
    print(region)
    pct_change = pct_change_mw_by_status(eia860_isos[eia860_isos["balancing_authority_code_eia"] == region])
    print(pct_change)
    print()

MISO
operational_status_code
1    17.693121
2    47.672149
3    17.450306
4    17.752466
5   -26.207009
6   -34.145841
7     1.227313
8     3.172118
Name: capacity_mw, dtype: float64

PJM
operational_status_code
1     19.266898
2     10.840279
3      3.272093
4     23.500602
5    -36.573563
6    -27.055703
7      0.934924
8      0.030071
99     0.000000
Name: capacity_mw, dtype: float64

CISO
operational_status_code
1      8.473018
2      1.471941
3     28.332629
4      0.939804
5      5.312942
6    -28.601950
7      2.593739
8      0.452473
99     0.000000
Name: capacity_mw, dtype: float64

ERCO
operational_status_code
1     56.982032
2     15.595118
3     -2.707401
4      8.463946
5     -4.039566
6    -18.209481
7      3.108712
8      0.000000
99     0.000000
Name: capacity_mw, dtype: float64

ISNE
operational_status_code
1      48.974943
2       7.106178
3     -12.460667
4     -10.943462
5       2.663775
6     295.867765
7       0.526600
8       0.000000
99      0.000000
Name: capac