# Water & Power Interconnection Sankey Flows (WPISF)
## Thermoelectric Cooling Detailed Analysis

#### USGS thermoelectric cooling data

USGS provides plant-level thermoelectric cooling water data for 2015. The data include water type, water source, withdrawal amount (MGD), and consumption amount (MGD).

Plants that have multiple types of generators (e.g., natural gas and coal) are simply labeled "COMPLEX" in the USGS data.

To split complex plants into their respective generators, the USGS file is combined with EIA data that lists each plant's generators along with their MWh output, size, and other information.

To determine how much of a complex plant's total withdrawal belongs to each of its sub-generators, the water intensity of each technology type is calculated from the USGS plants that are not listed as complex and then applied to the now-known generator types. For example, if the average water intensity of NGCC once-through plants in the USGS dataset is X MGD per MWh, X is multiplied by the MWh produced by that generator. The generation data comes from the EIA-923 dataset.
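
The calculation described above can be written compactly as follows. The symbols are introduced here only for illustration and do not appear in the USGS or EIA files: $I_{t,c}$ is the average withdrawal intensity (MGD per MWh) of technology $t$ with cooling type $c$ among non-complex plants, $E_g$ is the generation (MWh) of generator $g$, and $W_p$ is the USGS withdrawal reported for plant $p$.

$$
\hat{W}_g = I_{t,c} \, E_g
$$

Later in the notebook these estimates are rescaled so that the generators of a plant sum to the plant total reported by USGS:

$$
W_g = W_p \, \frac{\hat{W}_g}{\sum_{g' \in p} \hat{W}_{g'}}
$$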

Performing this analysis allows for a more complete accounting of specific technology types in the dataset as a whole. Many power plants are complex; without separating them into their respective generator types, it would be unclear which technology the water is going to.

In addition to technology type, missing water source and type values are filled in using a set of assumptions. For example, any generator with a river listed as its source and an unknown water quality is assumed to use fresh surface water. Similarly, wells are assumed to be fresh groundwater if no information is provided.

Water data in this notebook was obtained from USGS [1] and energy production data was obtained from EIA [2].

References:
1.  Dieter, C. A., Maupin, M. A., Caldwell, R. R., Harris, M. A., Ivahnenko, T. I., Lovelace, J. K., Barber, N. L., & Linsey, K. S. (2018). Estimated use of water in the United States in 2015.

2. U.S. Energy Information Administration (EIA). (2016). Form EIA-923 detailed data with previous form data (EIA-906/920). U.S. EIA. Retrieved January 15 from https://www.eia.gov/electricity/data/eia923/

In [1]:
#imports
import numpy as np
import pandas as pd

In [2]:
#read in USGS thermoelectric water data
USGS_TEC = pd.read_excel("data/2015_TE_Model_Estimates_USGS.xlsx")
#drop plants where withdrawal = 0
USGS_TEC = USGS_TEC[(USGS_TEC[['WITHDRAWAL']] > 0).all(axis=1)]

#selecting columns of interest
USGS_TEC = USGS_TEC[["EIA_PLANT_ID", "NAME_OF_WATER_SOURCE","WATER_SOURCE_CODE", 
                     "WATER_TYPE_CODE","WITHDRAWAL", "CONSUMPTION","GENERATION_TYPE","COOLING_TYPE","NET_GENERATION"]]
USGS_TEC = USGS_TEC.rename(columns={"EIA_PLANT_ID": "Plant Code"})

#read in and merge NERC_Interconnect datafile
Inter_NERC = pd.read_excel("data/NERC_Interconnect_List.xlsx")

#Read in NERC_Plant Code file
NERC = pd.read_excel("data/Plant_BalancingAuthority_NERC_List.xlsx")
Plant_inter = pd.merge(NERC, Inter_NERC, how="left", on=["NERC Region"]) #merge

Plant_inter = Plant_inter[["Plant Code", "Interconnect"]] #reduce to variables of interest

USGS_TEC = pd.merge(USGS_TEC, Plant_inter, how="left", on=["Plant Code"]) #bring it together with the TEC data

#### Corrections for water type and source
Code below simplifies some of the water source groupings and assigns water source codes.

In [None]:
#some general cleaning for municipality and city water 
USGS_TEC['WATER_SOURCE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "Municipality", 
                                                  USGS_TEC['WATER_SOURCE_CODE'], 
                                                  "UM")

USGS_TEC['WATER_TYPE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "Municipality", 
                                                  USGS_TEC['WATER_TYPE_CODE'], 
                                                  "FR")

USGS_TEC['WATER_SOURCE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "City Water", 
                                                  USGS_TEC['WATER_SOURCE_CODE'], 
                                                  "UM")

USGS_TEC['WATER_TYPE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "City Water", 
                                                  USGS_TEC['WATER_TYPE_CODE'], 
                                                  "FR")
#cleaning for wells
USGS_TEC['WATER_SOURCE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "Wells", 
                                                  USGS_TEC['WATER_SOURCE_CODE'], 
                                                  "GW")

USGS_TEC['WATER_TYPE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "Wells", 
                                                  USGS_TEC['WATER_TYPE_CODE'], 
                                                  "FR")
USGS_TEC['WATER_SOURCE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "Well", 
                                                  USGS_TEC['WATER_SOURCE_CODE'], 
                                                  "GW")

USGS_TEC['WATER_TYPE_CODE'] = np.where(USGS_TEC['NAME_OF_WATER_SOURCE'] != "Well", 
                                                  USGS_TEC['WATER_TYPE_CODE'], 
                                                  "FR")

Note that some plants in the USGS dataset had blank values for source and type. Values were filled in based on assumptions regarding the name of the water source. For example, if the water source name was a lake, it was assumed that the water type was fresh and the water source was surface water. These corrections can be found in the TEC_corrections.xlsx spreadsheet in the data folder.
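
As an illustration of the kind of name-based assumption described above, the sketch below fills blank source and type codes from keywords in the water source name. The keyword rules and the toy data are hypothetical; the notebook itself applies plant-by-plant fixes from the spreadsheet rather than keyword matching.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): two plants with blank codes, one already coded,
# and one whose name matches no rule.
plants = pd.DataFrame({
    "NAME_OF_WATER_SOURCE": ["Lake Erie", "Wells", "Ohio River", "Unknown Canal"],
    "WATER_SOURCE_CODE": [np.nan, np.nan, "SW", np.nan],
    "WATER_TYPE_CODE": [np.nan, np.nan, "FR", np.nan],
})

# Hypothetical keyword rules: (keyword in name, source code, type code)
rules = [("Lake", "SW", "FR"), ("River", "SW", "FR"), ("Well", "GW", "FR")]

for keyword, source, water_type in rules:
    match = plants["NAME_OF_WATER_SOURCE"].str.contains(keyword, regex=False, na=False)
    # Only fill values that are missing; existing codes are left untouched.
    plants.loc[match & plants["WATER_SOURCE_CODE"].isna(), "WATER_SOURCE_CODE"] = source
    plants.loc[match & plants["WATER_TYPE_CODE"].isna(), "WATER_TYPE_CODE"] = water_type

print(plants)
```

Rows whose names match no rule stay blank, which makes it easy to list the plants that still need a manual decision.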

In [None]:
#import file of manual fixes based on plant code
manual_fix = pd.read_excel("data/TEC_corrections.xlsx")
manual_fix["Manual Fix?"] = "Yes"
USGS_TEC = pd.merge(USGS_TEC, manual_fix, how="left", on=["Plant Code"]) #merge


USGS_TEC['SOURCE'] = np.where(USGS_TEC['Manual Fix?'] == "Yes",
                              USGS_TEC["Source"],
                              USGS_TEC['WATER_SOURCE_CODE'])

USGS_TEC['TYPE'] = np.where(USGS_TEC['Manual Fix?'] == "Yes", 
                                                  USGS_TEC["Type"], 
                                                  USGS_TEC['WATER_TYPE_CODE'])

#removes multiple source and types and replaces them with assumed singular
USGS_TEC.loc[USGS_TEC['SOURCE'] == "GW & SW", 'SOURCE'] = "GW"
USGS_TEC.loc[USGS_TEC['SOURCE'] == "GW & PD", 'SOURCE'] = "GW"
USGS_TEC.loc[USGS_TEC['TYPE'] == "FR & BE", 'TYPE'] = "FR"

#removing extra type identifier in cooling type information
USGS_TEC.loc[USGS_TEC['COOLING_TYPE'] == "ONCE-THROUGH FRESH", 'COOLING_TYPE'] = "ONCE-THROUGH"
USGS_TEC.loc[USGS_TEC['COOLING_TYPE'] == "ONCE-THROUGH SALINE", 'COOLING_TYPE'] = "ONCE-THROUGH"

#drop variables no longer needed
USGS_TEC = USGS_TEC.drop(["Source", "Type", "WATER_SOURCE_CODE", "WATER_TYPE_CODE","Manual Fix?"], axis=1)

#### Filling out COMPLEX Generator Types
The code below creates a copy of the USGS dataset that excludes COMPLEX generation types. This copy is used to calculate the water intensity rates that are later applied to the generator types revealed by the EIA data. Only USGS records with a known generation type are used for the water intensity calculations.

In [None]:
USGS_NC_gen = USGS_TEC[(USGS_TEC[['GENERATION_TYPE']] != "COMPLEX").all(axis=1)]

#### Withdrawal and Consumption Intensity Value Calculations:

Steps to calculate the withdrawal and consumption intensity for each generator/cooling-type combination:
* Calculate the water/MWh intensity for each USGS generation and cooling technology combination, for use with COMPLEX generator types
* Calculate the average water/MWh intensity for each USGS generation type across all cooling types, for use with COMPLEX cooling types
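
The two steps above can be sketched on toy data as follows. The numbers are made up for illustration, and only withdrawal is shown; consumption would follow the same pattern.

```python
import pandas as pd

# Hypothetical non-complex plants
plants = pd.DataFrame({
    "GENERATION_TYPE": ["COAL", "COAL", "COAL", "NGCC"],
    "COOLING_TYPE": ["ONCE-THROUGH", "ONCE-THROUGH",
                     "RECIRCULATING TOWER", "RECIRCULATING TOWER"],
    "WITHDRAWAL": [100.0, 300.0, 10.0, 4.0],               # MGD
    "NET_GENERATION": [1000.0, 1000.0, 1000.0, 2000.0],    # MWh
})

# Step 1: intensity per generation + cooling combination (total water / total MWh)
by_combo = plants.groupby(["GENERATION_TYPE", "COOLING_TYPE"], as_index=False)[
    ["WITHDRAWAL", "NET_GENERATION"]
].sum()
by_combo["With Intensity"] = by_combo["WITHDRAWAL"] / by_combo["NET_GENERATION"]

# Step 2: average across cooling types, used as a fallback for COMPLEX cooling
fallback = by_combo.groupby("GENERATION_TYPE", as_index=False)["With Intensity"].mean()
fallback["COOLING_TYPE"] = "COMPLEX"

# Stack both into one lookup table
intensity = pd.concat(
    [by_combo[["GENERATION_TYPE", "COOLING_TYPE", "With Intensity"]], fallback],
    ignore_index=True,
)
print(intensity)
```

Note that the fallback is an unweighted mean of the per-combination intensities, matching the approach in the cell below; a generation-weighted mean would be an alternative design choice.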

In [None]:
#group the non-complex USGS datafile by generator and cooling type
USGS_Int = USGS_NC_gen.groupby(["GENERATION_TYPE","COOLING_TYPE"], as_index=False).sum(numeric_only=True)

#calculate the MGD/MWH
USGS_Int["With Intensity"] = USGS_Int["WITHDRAWAL"]/USGS_Int["NET_GENERATION"]
USGS_Int["Con Intensity"] = USGS_Int["CONSUMPTION"]/USGS_Int["NET_GENERATION"]

#separate out to just the ones that don't have complex cooling types
USGS_Int = USGS_Int[(USGS_Int[['COOLING_TYPE']] != "COMPLEX").all(axis=1)]

#Calculating the average intensity across each generation type as a fill-in for the complex cooling systems
USGS_Int_avg = USGS_Int.groupby(["GENERATION_TYPE"], as_index=False).mean(numeric_only=True)

#creating identifier row
USGS_Int_avg["COOLING_TYPE"] = "COMPLEX"

#subselecting columns
USGS_Int_avg = USGS_Int_avg[["GENERATION_TYPE", "COOLING_TYPE", "With Intensity", "Con Intensity"]]

#creates a table of intensity rates for generation/cooling-type combinations
frames = [USGS_Int, USGS_Int_avg]
intensity = pd.concat(frames)

#variables of interest from intensity dataframe
intensity = intensity[["GENERATION_TYPE","COOLING_TYPE","With Intensity", "Con Intensity"]]
#rename
intensity = intensity.rename(columns={"GENERATION_TYPE": "Class"})

#### Use EIA data to split out complex plants into their actual generator types

* Use EIA 923 plant-level data to split out each COMPLEX plant in the USGS dataset into its sub-generator types and mover types
* Calculate the percent of total generation by each generator in a plant for EIA data as a proxy for percent of total water withdrawal.
* Conform the EIA data to the USGS data labeling for generators
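
The generation-share step can be sketched on toy data as shown below. The column name `Net Generation` and the values are simplified placeholders, and `groupby(...).transform("sum")` is used here as a compact alternative to the sum-then-merge approach in the cell that follows.

```python
import pandas as pd

# Hypothetical generator-level records: plant 10 has two generators, plant 20 has one
eia = pd.DataFrame({
    "Plant Code": [10, 10, 20],
    "Technology": ["NG", "COAL", "NUCLEAR"],
    "Net Generation": [300.0, 100.0, 500.0],  # MWh
})

# Plant total broadcast back onto every generator row
plant_total = eia.groupby("Plant Code")["Net Generation"].transform("sum")

# Share of plant generation contributed by each generator
eia["Percent_Gen"] = eia["Net Generation"] / plant_total
print(eia)
```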

In [None]:
#read in EIA data
EIA = pd.read_excel("data/EIA923_Schedules_2_3_4_5_M_12_2015_Final_Revision.xlsx",skiprows=5)

#grabbing relevant columns
EIA = EIA[["Plant Id", "AER\nFuel Type Code", "Reported\nPrime Mover", "Net Generation\n(Megawatthours)"] ]

#summing EIA net generation by plant code to get the total generation of each plant
EIA_sum = EIA.groupby(["Plant Id"], as_index=False).sum(numeric_only=True)

#renaming the summed values to be able to merge back
EIA_sum = EIA_sum.rename(columns={"Net Generation\n(Megawatthours)": "Plant Total MWh"})

#merging the total generator values to the generator level data
EIA = pd.merge(EIA, EIA_sum, how="left", on=["Plant Id"])

#calculating the percent of total generation each generator contributes to the overall plant number
EIA["Percent_Gen"] = EIA["Net Generation\n(Megawatthours)"] / EIA["Plant Total MWh"]

#replaces NAN with 0
EIA.fillna(0, inplace=True)

#rename plant id to prep for merging
EIA = EIA.rename(columns={"Plant Id": "Plant Code"})

In [None]:
#renames the EIA data to match the generator naming from USGS

AER_Key = pd.read_excel("data/AERFuel-Key.xlsx")
Mover_Key = pd.read_excel("data/AERFuel-Key.xlsx", sheet_name="Sheet2")

EIA = pd.merge(EIA, AER_Key, how="left", on=["AER\nFuel Type Code"])
EIA = pd.merge(EIA, Mover_Key, how="left", on=["Reported\nPrime Mover"])
EIA = EIA[["Plant Code", "Technology", "key", "Percent_Gen"]]

#### Preps the USGS file and separates out the COMPLEX generation

In [None]:
#splits out just the complex generators
USGS_Complex_gen = USGS_TEC[USGS_TEC['GENERATION_TYPE'] == "COMPLEX"]

#merge USGS full dataset with the new information from EIA on generator type
USGS_Complex_gen = pd.merge(USGS_Complex_gen, EIA, how="left", on=["Plant Code"])

In [None]:
#data cleaning
def conditions(s):
    if (s['Technology'] == "NG") and (s['key'] == "CC"):
        return "NGCC"
    if (s['Technology'] == "NG") and (s['key'] == "CT"):
        return "NGCC"
    if (s['Technology'] == "NG") and (s['key'] == "CA"):
        return "NGCC"
    if (s['Technology'] == "NG") and (s['key'] == "CS"):
        return "NGCC"
    if (s['Technology'] == "NG") and (s['key'] == "IC"):
        return "NGCC"
    if (s['Technology'] == "NG") and (s['key'] == "GT"):
        return "NGCC"
    if (s['Technology'] == "NG") and (s['key'] == "ST"):
        return "GAS STEAM"
    if (s['Technology'] == "Oil") and (s['key'] == "CC"):
        return "OIL CC"
    else:
        return "."
    
USGS_Complex_gen['Class'] = USGS_Complex_gen.apply(conditions, axis=1)

#if not one of the above conditions, fill in with previous value
USGS_Complex_gen['Class'] = np.where(USGS_Complex_gen['Class'] == ".", 
                                                  USGS_Complex_gen['Technology'], 
                                                  USGS_Complex_gen['Class'])

#Calculates the generation per generator of complex systems using USGS generation values based on percentages from eia
USGS_Complex_gen["Gen_MWh"] = USGS_Complex_gen["Percent_Gen"]* USGS_Complex_gen['NET_GENERATION']

#drops rows with no output. Assumes no water withdrawal if no output
#USGS_Complex_gen = USGS_Complex_gen[(USGS_Complex_gen[['Gen_MWh']] != 0).all(axis=1)]

#reduces USGS dataframe to only include plants that have water withdrawal
USGS_Complex_gen = USGS_Complex_gen[(USGS_Complex_gen[['WITHDRAWAL']] > 0).all(axis=1)]

#Drops generation types that do not apply
USGS_Complex_gen = USGS_Complex_gen[(USGS_Complex_gen[['Class']] != "OTHER").all(axis=1)]
USGS_Complex_gen = USGS_Complex_gen[(USGS_Complex_gen[['Class']] != "WIND").all(axis=1)]
USGS_Complex_gen = USGS_Complex_gen[(USGS_Complex_gen[['Class']] != "SOLAR").all(axis=1)]
USGS_Complex_gen = USGS_Complex_gen[(USGS_Complex_gen[['Class']] != "HYDRO").all(axis=1)]

#variables of interest from USGS file
USGS_Complex_gen = USGS_Complex_gen[["Plant Code","Interconnect", "SOURCE", "TYPE","NAME_OF_WATER_SOURCE", 
                     "WITHDRAWAL", "CONSUMPTION", "Class", "COOLING_TYPE", "Gen_MWh"]]

#### Apply Intensity Rates to Complex Generator Types
This step estimates each generator's share of the plant's withdrawals and applies that share to the plant's actual total withdrawal. Note that this is a simplification: a generator's share of water withdrawals will not necessarily match its share of plant energy generation on a 1:1 basis.

* Multiply the intensity rate for each generation type and cooling combination by the generation (MWh) of each generator
* Calculate each generator's percent of the plant's calculated water withdrawals
* Multiply these percentages by the total USGS withdrawals per plant
* Repeat for consumption intensity
* Stack on the non-COMPLEX table
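
A minimal sketch of this allocation for a single hypothetical plant is shown below. The intensities and generation values are invented for illustration; the point is that the allocated values always sum back to the reported plant total.

```python
import pandas as pd

# One hypothetical complex plant with two generators.
# WITHDRAWAL is the USGS plant total, repeated on each generator row.
gens = pd.DataFrame({
    "Plant Code": [1, 1],
    "Class": ["NGCC", "COAL"],
    "WITHDRAWAL": [90.0, 90.0],        # MGD, plant total
    "Gen_MWh": [1000.0, 500.0],
    "With Intensity": [0.01, 0.04],    # MGD per MWh
})

# Intensity-based estimate per generator
gens["Estimated"] = gens["With Intensity"] * gens["Gen_MWh"]

# Each generator's share (PW) of the plant's estimated total
gens["PW"] = gens["Estimated"] / gens.groupby("Plant Code")["Estimated"].transform("sum")

# Apply the share to the reported plant total
gens["Allocated"] = gens["PW"] * gens["WITHDRAWAL"]
print(gens[["Class", "Estimated", "PW", "Allocated"]])
```

Here the coal generator produces half the energy of the NGCC generator but receives two thirds of the water, which is the effect the intensity weighting is meant to capture.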

In [None]:
#merge USGS complex cooling data with intensity dataframe
USGS_Complex_gen = pd.merge(USGS_Complex_gen, intensity, how="left", on=["Class", "COOLING_TYPE"])

#multiplies intensity values by generator and cooling type in USGS dataset
USGS_Complex_gen["Withdrawal"] = USGS_Complex_gen["With Intensity"]*USGS_Complex_gen["Gen_MWh"]
USGS_Complex_gen["Consumption"] = USGS_Complex_gen["Con Intensity"]*USGS_Complex_gen["Gen_MWh"]

In [None]:
USGS_Complex_total = USGS_Complex_gen.groupby("Plant Code", as_index = False).sum(numeric_only=True)

USGS_Complex_total = USGS_Complex_total.rename(columns={"Withdrawal": "Total Estimated With"})
USGS_Complex_total = USGS_Complex_total.rename(columns={"Consumption": "Total Estimated Con"})

USGS_Complex_total = USGS_Complex_total[["Plant Code", "Total Estimated With","Total Estimated Con"]]
USGS_Complex_gen = pd.merge(USGS_Complex_gen, USGS_Complex_total, how="left", on= "Plant Code")

In [None]:
#takes the calculated withdrawal from the intensity rates and gets the fraction (PW) of total USGS estimated withdrawal
USGS_Complex_gen["PW"] = USGS_Complex_gen["Withdrawal"]/USGS_Complex_gen["Total Estimated With"]
#repeats for consumption
USGS_Complex_gen["PC"] = USGS_Complex_gen["Consumption"]/USGS_Complex_gen["Total Estimated Con"]

#replace USGS total plant withdrawal & consumption values with the generator-level contribution to total
USGS_Complex_gen["WITHDRAWAL"] = USGS_Complex_gen["PW"]*USGS_Complex_gen["WITHDRAWAL"]
USGS_Complex_gen["CONSUMPTION"] = USGS_Complex_gen["PC"]*USGS_Complex_gen["CONSUMPTION"]

#replace infinite values with 0
USGS_Complex_gen.replace([np.inf, -np.inf],  np.nan, inplace=True)
USGS_Complex_gen.fillna(0, inplace=True)

#### Combine the filled in complex dataframe to the non-complex dataframe

In [None]:
USGS_Complex_gen = USGS_Complex_gen.rename(columns={"Class": "GENERATION_TYPE"})

stack = [USGS_Complex_gen, USGS_NC_gen]
Thermo_df = pd.concat(stack)

#### Filling out COMPLEX Cooling Types
With complex generation resolved, complex cooling is handled next. The distribution of cooling types for each generation type within an interconnect is applied to the complex cooling rows to fill out this information. This effectively assumes that the cooling for complex cooling types follows the distribution of all similar generator types within a particular interconnect. That is, if 50% of coal plants in ERCOT have once-through cooling, it is assumed that this percentage applies to the ERCOT coal plants with complex cooling.<br>
Steps:
* Separate out the non-complex cooling plants from the Thermo dataframe
* Group by Generation type and cooling type and count the number of each
* Count the total number of plants by generation type
* Calculate the percentage of each cooling type for each generation type
* Apply the percentages to the generator + complex cooling separated out from the Thermo dataframe
* Recombine
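
The steps above can be sketched on toy data as follows. The plant records are hypothetical, and only withdrawal for one interconnect and generation type is shown.

```python
import pandas as pd

# Hypothetical ERCOT coal plants: three with known cooling, one COMPLEX
plants = pd.DataFrame({
    "Interconnect": ["ERCOT"] * 4,
    "GENERATION_TYPE": ["COAL"] * 4,
    "COOLING_TYPE": ["ONCE-THROUGH", "ONCE-THROUGH", "RECIRCULATING TOWER", "COMPLEX"],
    "WITHDRAWAL": [50.0, 70.0, 5.0, 30.0],
})
keys = ["Interconnect", "GENERATION_TYPE"]

# Count plants per cooling type among the non-complex rows
known = plants[plants["COOLING_TYPE"] != "COMPLEX"]
counts = known.groupby(keys + ["COOLING_TYPE"]).size().rename("Count").reset_index()

# Convert counts to shares within each interconnect + generation type
counts["Cool_percent"] = counts["Count"] / counts.groupby(keys)["Count"].transform("sum")

# One column per cooling type
shares = counts.pivot_table(index=keys, columns="COOLING_TYPE",
                            values="Cool_percent", fill_value=0.0).reset_index()
shares = shares.rename_axis(None, axis=1)

# Apply the shares to the COMPLEX rows
complex_rows = plants[plants["COOLING_TYPE"] == "COMPLEX"].merge(shares, how="left", on=keys)
complex_rows["ONCE-THROUGH_W"] = complex_rows["WITHDRAWAL"] * complex_rows["ONCE-THROUGH"]
complex_rows["RECIRCULATING TOWER_W"] = (
    complex_rows["WITHDRAWAL"] * complex_rows["RECIRCULATING TOWER"]
)
print(complex_rows)
```

Because the shares are based on plant counts rather than generation or water volume, a few large plants carry the same weight as many small ones; this mirrors the counting approach used in the notebook.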

#### Determine the percent of plants with a particular cooling type for each interconnect and generator type

In [None]:
#create a separate dataframe from the filled out one that does not have complex cooling type
Thermo_NC = Thermo_df[(Thermo_df[['COOLING_TYPE']] != "COMPLEX").all(axis=1)]

#including interconnect because different regions may have cooling requirements
Thermo_NC = Thermo_NC.groupby(["Interconnect","GENERATION_TYPE","COOLING_TYPE"], as_index=False).count()

#rename the column
Thermo_NC = Thermo_NC.rename(columns={"TYPE": "Count"})

#Calculate the total number of plants for each interconnect + generator type (across cooling types)
Thermo_NC_sum = Thermo_NC.groupby(["Interconnect","GENERATION_TYPE"], as_index=False).sum()
Thermo_NC_sum = Thermo_NC_sum.rename(columns={"Count": "Sum"})

#reduce to variables of interest
Thermo_NC_sum = Thermo_NC_sum[["Interconnect", "GENERATION_TYPE","Sum"]]

#merge totals back to full thermo_NC dataframe
Thermo_NC = pd.merge(Thermo_NC, Thermo_NC_sum, how="left", on=["Interconnect", "GENERATION_TYPE"])

In [None]:
#calculate the percentages of each cooling type within each generator type
Thermo_NC["Cool_percent"] = Thermo_NC["Count"] / Thermo_NC["Sum"]
Thermo_NC = Thermo_NC[["Interconnect", "GENERATION_TYPE", "COOLING_TYPE", "Cool_percent"]]

#create a pivot table of the percentages
Thermo_NC = pd.pivot_table(Thermo_NC, values='Cool_percent', index=['Interconnect', 'GENERATION_TYPE'],
                    columns=['COOLING_TYPE'], aggfunc=np.sum)
#fill blanks with zero
Thermo_NC.fillna(0, inplace=True)
#reset index
Thermo_NC = Thermo_NC.reset_index()
Thermo_NC = Thermo_NC.rename_axis(None, axis=1)                           #drop index name

#### Bring percentage table together with complex cooling dataframe

In [None]:
Thermo_complex = Thermo_df[(Thermo_df[['COOLING_TYPE']] == "COMPLEX").all(axis=1)]
Thermo_complex = pd.merge(Thermo_complex, Thermo_NC, how="left", on=["Interconnect", "GENERATION_TYPE"])

#### Calculate out Complex cooling types by percentages determined previously

In [None]:
Thermo_complex["ONCE-THROUGH_W"] = Thermo_complex["WITHDRAWAL"]*Thermo_complex["ONCE-THROUGH"]
Thermo_complex["ONCE-THROUGH_C"] = Thermo_complex["CONSUMPTION"]*Thermo_complex["ONCE-THROUGH"]

Thermo_complex["RECIRCULATING POND_W"] = Thermo_complex["WITHDRAWAL"]*Thermo_complex["RECIRCULATING POND"]
Thermo_complex["RECIRCULATING POND_C"] = Thermo_complex["CONSUMPTION"]*Thermo_complex["RECIRCULATING POND"]

Thermo_complex["RECIRCULATING TOWER_W"] = Thermo_complex["WITHDRAWAL"]*Thermo_complex["RECIRCULATING TOWER"]
Thermo_complex["RECIRCULATING TOWER_C"] = Thermo_complex["CONSUMPTION"]*Thermo_complex["RECIRCULATING TOWER"]

In [None]:
#split out once-through as a separate dataframe
Thermo_Complex_OT = Thermo_complex[["Plant Code", "Interconnect","GENERATION_TYPE", "SOURCE", "TYPE", 
                                    "NAME_OF_WATER_SOURCE", "ONCE-THROUGH_W", "ONCE-THROUGH_C"]].copy()
Thermo_Complex_OT["COOLING_TYPE"] = "ONCE-THROUGH"
Thermo_Complex_OT["Cool_Type"] = "OT"

#split out recirculating tower as a separate dataframe
Thermo_Complex_RT = Thermo_complex[["Plant Code", "Interconnect","GENERATION_TYPE", "SOURCE", "TYPE", 
                                    "NAME_OF_WATER_SOURCE", "RECIRCULATING TOWER_W", "RECIRCULATING TOWER_C"]].copy()
Thermo_Complex_RT["COOLING_TYPE"] = "RECIRCULATING TOWER"
Thermo_Complex_RT["Cool_Type"] = "RT"

#split out recirculating pond as a separate dataframe
Thermo_Complex_RP = Thermo_complex[["Plant Code", "Interconnect", "GENERATION_TYPE","SOURCE", 
                                    "TYPE", "NAME_OF_WATER_SOURCE", 
                                    "RECIRCULATING POND_W", "RECIRCULATING POND_C"]].copy()
Thermo_Complex_RP["COOLING_TYPE"] = "RECIRCULATING POND"
Thermo_Complex_RP["Cool_Type"] = "RP"

In [None]:
#bring the three separate cooling type dataframes together
join = [Thermo_Complex_OT, Thermo_Complex_RT,Thermo_Complex_RP]
Complex_df = pd.concat(join)
#Thermo_complex_con = pd.concat(join)

Complex_df.fillna(0, inplace=True)

In [None]:
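#each row holds values for only one cooling type (the other columns were filled with 0),
#so the row-wise max picks out that row's withdrawal and consumption
Complex_df["WITHDRAWAL"] = Complex_df[['ONCE-THROUGH_W','RECIRCULATING TOWER_W','RECIRCULATING POND_W']].max(axis=1)
Complex_df["CONSUMPTION"] = Complex_df[['ONCE-THROUGH_C','RECIRCULATING TOWER_C','RECIRCULATING POND_C']].max(axis=1)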

In [None]:
Complex_df = Complex_df[['Plant Code','Interconnect','GENERATION_TYPE','SOURCE',
                         'TYPE','NAME_OF_WATER_SOURCE','COOLING_TYPE','WITHDRAWAL','CONSUMPTION']]

#### Combine back with non-complex cooling dataframe

In [None]:
#copy of dataframe without complex cooling type
Not_complex_df = Thermo_df[(Thermo_df[['COOLING_TYPE']] != "COMPLEX").all(axis=1)]
Not_complex_df = Not_complex_df[['Plant Code','Interconnect','GENERATION_TYPE','SOURCE',
                         'TYPE','NAME_OF_WATER_SOURCE','COOLING_TYPE','WITHDRAWAL','CONSUMPTION']]

frame = [Not_complex_df, Complex_df]

#### Plant-level complete withdrawal and consumption dataset

In [None]:
#final dataframe with all complex split out into various types
Thermoelectric_df = pd.concat(frame)

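#changing to million gallons per year (MGY) from million gallons per day (MGD)
#only the water columns are scaled so that numeric identifiers such as Plant Code are left unchanged
Thermoelectric_df[["WITHDRAWAL", "CONSUMPTION"]] *= 365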

### Discharge Calculations

#### Calculating discharge to ocean and surface

The information on name of source water is used to determine discharge. It is assumed that all plants that receive their cooling water from the Pacific Ocean, the Atlantic Ocean, the Gulf of Mexico, or various harbors, channels, or sounds return it to the ocean.
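
A compact sketch of this classification on toy data is shown below. The plant names and values are hypothetical, and the keyword lists (including saline bays, which the code comments in the next cell also treat as ocean discharge) are an illustrative simplification. Note the parentheses around the `TYPE` comparison: in Python `&` binds more tightly than `==`, so they are required when combining boolean masks.

```python
import numpy as np
import pandas as pd

# Hypothetical plants
df = pd.DataFrame({
    "NAME_OF_WATER_SOURCE": ["Pacific Ocean", "Green Bay", "Tampa Bay", "Ohio River", None],
    "TYPE": ["SA", "FR", "SA", "FR", "FR"],
    "WITHDRAWAL": [100.0, 80.0, 60.0, 40.0, 20.0],
    "CONSUMPTION": [1.0, 2.0, 3.0, 4.0, 5.0],
})

name = df["NAME_OF_WATER_SOURCE"]

# Sources treated as ocean regardless of the water type code
always_ocean = name.str.contains("Ocean|Gulf", na=False)

# Sources treated as ocean only when the water is saline
saline_only = name.str.contains("Bay|Harbor|Channel|Sound", na=False) & (df["TYPE"] == "SA")

is_ocean = always_ocean | saline_only
returned = df["WITHDRAWAL"] - df["CONSUMPTION"]   # water not consumed is discharged

df["TEC_OD"] = np.where(is_ocean, returned, 0.0)  # ocean discharge
df["TEC_SD"] = np.where(is_ocean, 0.0, returned)  # surface discharge
print(df)
```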

In [None]:
#ocean discharge
Thermoelectric_df['TEC_OD'] = np.where(Thermoelectric_df['NAME_OF_WATER_SOURCE'].str.contains('Ocean', regex=False, na=False),
                                       Thermoelectric_df['WITHDRAWAL'] - Thermoelectric_df['CONSUMPTION'],
                                       0)
#gulf of mexico
Thermoelectric_df['TEC_OD'] = np.where(Thermoelectric_df['NAME_OF_WATER_SOURCE'].str.contains('Gulf', regex=False, na=False),
                                       Thermoelectric_df['WITHDRAWAL'] - Thermoelectric_df['CONSUMPTION'],
                                       Thermoelectric_df['TEC_OD'])

#only bays with saline water are ocean discharge (some bays are on lakes (e.g. Green Bay))
#note: the TYPE comparison is parenthesized because & binds more tightly than ==
Thermoelectric_df['TEC_OD'] = np.where(Thermoelectric_df['NAME_OF_WATER_SOURCE'].str.contains('Bay', regex=False, na=False) &
                                       (Thermoelectric_df['TYPE'] == "SA"),
                                       Thermoelectric_df['WITHDRAWAL'] - Thermoelectric_df['CONSUMPTION'],
                                       Thermoelectric_df['TEC_OD'])
#harbors
Thermoelectric_df['TEC_OD'] = np.where(Thermoelectric_df['NAME_OF_WATER_SOURCE'].str.contains('Harbor', regex=False, na=False) &
                                       (Thermoelectric_df['TYPE'] == "SA"),
                                       Thermoelectric_df['WITHDRAWAL'] - Thermoelectric_df['CONSUMPTION'],
                                       Thermoelectric_df['TEC_OD'])
#channels
Thermoelectric_df['TEC_OD'] = np.where(Thermoelectric_df['NAME_OF_WATER_SOURCE'].str.contains('Channel', regex=False, na=False) &
                                       (Thermoelectric_df['TYPE'] == "SA"),
                                       Thermoelectric_df['WITHDRAWAL'] - Thermoelectric_df['CONSUMPTION'],
                                       Thermoelectric_df['TEC_OD'])
#sounds
Thermoelectric_df['TEC_OD'] = np.where(Thermoelectric_df['NAME_OF_WATER_SOURCE'].str.contains('Sound', regex=False, na=False) &
                                       (Thermoelectric_df['TYPE'] == "SA"),
                                       Thermoelectric_df['WITHDRAWAL'] - Thermoelectric_df['CONSUMPTION'],
                                       Thermoelectric_df['TEC_OD'])

In [None]:
Thermoelectric_df['TEC_SD'] = np.where(Thermoelectric_df['TEC_OD'] == 0, 
                                                  Thermoelectric_df['WITHDRAWAL'] - Thermoelectric_df['CONSUMPTION'], 
                                                  0)

### Grouping Generator Sub-types for Simplification

In [None]:
Thermoelectric_df["GENERATION_TYPE"] = np.where(Thermoelectric_df["GENERATION_TYPE"]=="NGCC",
                                                      "NG",
                                                      Thermoelectric_df["GENERATION_TYPE"])

Thermoelectric_df["GENERATION_TYPE"]= np.where(Thermoelectric_df["GENERATION_TYPE"]=="GAS STEAM",
                                                      "NG",
                                                      Thermoelectric_df["GENERATION_TYPE"])

Thermoelectric_df["GENERATION_TYPE"]= np.where(Thermoelectric_df["GENERATION_TYPE"]=="OIL",
                                                      "PET",
                                                      Thermoelectric_df["GENERATION_TYPE"])

Thermoelectric_df["GENERATION_TYPE"] = np.where(Thermoelectric_df["GENERATION_TYPE"]=="OIL CC",
                                                      "PET",
                                                      Thermoelectric_df["GENERATION_TYPE"])
#dropping row where generation type was blank in dataset
Thermoelectric_df = Thermoelectric_df[Thermoelectric_df.GENERATION_TYPE != 0]

In [None]:
Thermoelectric_df["SOURCE"] = np.where(Thermoelectric_df["SOURCE"] == "UM", "OT",Thermoelectric_df["SOURCE"])

Thermoelectric_df["SOURCE"] = np.where(Thermoelectric_df["TYPE"] == "BE", "OT",Thermoelectric_df["SOURCE"])

In [None]:
Thermoelectric_df["GENERATION_TYPE"] = Thermoelectric_df["GENERATION_TYPE"].astype(str)

Thermoelectric_df["In_Flow"] = Thermoelectric_df["TYPE"] + Thermoelectric_df["SOURCE"] + "_" + Thermoelectric_df["GENERATION_TYPE"] + "_" + "PT"

### Aggregating Values by Interconnect

In [None]:
Thermoelectric_Interconnect = Thermoelectric_df.drop(["Plant Code"], axis=1)
Thermoelectric_Interconnect = Thermoelectric_Interconnect.groupby(["Interconnect",
                                                                   "In_Flow"], as_index=False).sum(numeric_only=True)

In [None]:
#Pivoting withdrawal dataframe (only rows with withdrawal > 0)
Thermoelectric_Interconnect_With = Thermoelectric_Interconnect[Thermoelectric_Interconnect.WITHDRAWAL > 0]
Thermoelectric_Interconnect_With = pd.pivot_table(Thermoelectric_Interconnect_With,
                                                  values='WITHDRAWAL', index=['Interconnect'],
                                                  columns=['In_Flow'], aggfunc=np.sum)

Thermoelectric_Interconnect_With = Thermoelectric_Interconnect_With.reset_index()
Thermoelectric_Interconnect_With = Thermoelectric_Interconnect_With.rename_axis(None, axis=1)

In [None]:
#reorganizing surface discharge
PT_SD = Thermoelectric_df.groupby(["Interconnect",'GENERATION_TYPE'], as_index=False).sum(numeric_only=True)
PT_SD = PT_SD[["Interconnect", 'GENERATION_TYPE', "TEC_SD"]]
PT_SD["Identifier"] = PT_SD['GENERATION_TYPE'] + "_" + "SD" + "_"+  "PT"
PT_SD = PT_SD[PT_SD.TEC_SD > 0]

PT_SD = pd.pivot_table(PT_SD, values='TEC_SD', index=['Interconnect'],
                    columns=['Identifier'], aggfunc=np.sum)
PT_SD = PT_SD.reset_index()
PT_SD = PT_SD.rename_axis(None, axis=1)

#reorganizing ocean discharge

PT_OD = Thermoelectric_df.groupby(["Interconnect",'GENERATION_TYPE'], as_index=False).sum(numeric_only=True)
PT_OD = PT_OD[["Interconnect", 'GENERATION_TYPE', "TEC_OD"]]
PT_OD["Identifier"] = PT_OD['GENERATION_TYPE'] + "_" + "OD" + "_"+  "PT"
PT_OD = PT_OD[PT_OD.TEC_OD > 0]

PT_OD = pd.pivot_table(PT_OD, values='TEC_OD', index=['Interconnect'],
                    columns=['Identifier'], aggfunc=np.sum)
PT_OD = PT_OD.reset_index()
PT_OD = PT_OD.rename_axis(None, axis=1)

#reorganizing consumptive use

PT_CU = Thermoelectric_df.groupby(["Interconnect",'GENERATION_TYPE'], as_index=False).sum(numeric_only=True)
PT_CU = PT_CU[["Interconnect", 'GENERATION_TYPE', "CONSUMPTION"]]
PT_CU["Identifier"] = PT_CU['GENERATION_TYPE'] + "_" + "CU" + "_"+  "PT"
PT_CU = PT_CU[PT_CU.CONSUMPTION > 0]

PT_CU = pd.pivot_table(PT_CU, values="CONSUMPTION", index=['Interconnect'],
                    columns=['Identifier'], aggfunc=np.sum)
PT_CU = PT_CU.reset_index()
PT_CU = PT_CU.rename_axis(None, axis=1)

In [None]:
Thermoelectric_discharge = pd.merge(PT_SD, PT_OD, how="left", on = "Interconnect")
Thermoelectric_discharge = pd.merge(Thermoelectric_discharge, PT_CU, how="left", on = "Interconnect")

#### Bringing together both the withdrawals by interconnect to each generator type and the discharge from each generator type

In [None]:
Thermoelectric_cooling = pd.merge(Thermoelectric_Interconnect_With, Thermoelectric_discharge, 
                                  how="left", on= "Interconnect")
Thermoelectric_cooling.fillna(0, inplace=True)

### Save to an Excel file that can be brought into the main notebook if desired

In [None]:
Thermoelectric_cooling.to_excel("data/Thermoelectric_Cooling_Detail.xlsx", index=False)