General naming principle: columns ending in _value hold the final quantities used in the cost test.

All components must cover the full measure life.

The real discount rate represents the time value of money and must be applied to the provided real-dollar avoided cost inputs.
Once the avoided costs have been discounted, those inputs are used for each year of every measure's lifetime.
Note that this is only for the first-year screening process.
Measure_Counts_to_Values.ipynb will need to pull the cost of each measure starting in different years (that process happens after the adoption model).
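
A minimal sketch of the discounting described above, assuming a 3% real discount rate and that the first year of the series is the base year (so it is left undiscounted). The function name discount_to_base_year is illustrative and is not part of the notebook.

```python
def discount_to_base_year(value, year, base_year, real_discount_rate=0.03):
    """Present value of a real-dollar amount that occurs in `year`.

    PV = FV / (1 + r) ** (year - base_year)
    """
    return value / (1 + real_discount_rate) ** (year - base_year)


# A constant real-dollar avoided cost of 0.06 USD/kWh in years 1 to 3
present_values = [discount_to_base_year(0.06, year, base_year=1) for year in (1, 2, 3)]
print([round(v, 6) for v in present_values])
```

Because the exponent is year - base_year, the base year itself keeps its real-dollar value and every later year shrinks by one more factor of (1 + r).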

In [64]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [65]:
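# Column cleaning function:
# lowercase the column names, replace spaces with underscores,
# change $ to usd, / to per, & to and, and drop periods
def clean_column_names(df):
    # regex=False so that '$' and '.' are treated as literal characters, not regex patterns
    df.columns = (
        df.columns.str.lower()
        .str.replace(' ', '_', regex=False)
        .str.replace('$', 'usd', regex=False)
        .str.replace('/', 'per', regex=False)
        .str.replace('&', 'and', regex=False)
        .str.replace('.', '', regex=False)
    )
    return df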

In [66]:
df = pd.read_excel("./010_input/assembled_measures_v1.xlsx")
#df = pd.read_excel("./010_input/am-deferred-credit-v4.xlsx")
clean_column_names(df)
df

Unnamed: 0,measure_name,sector,program,electric_utility,gas_utility,market,baseline_condition,efficient_condition,building_type,measure_life_(yrs),...,annual_electric_energy_saved_(kwh)_-_refrigeration,annual_natural_gas_energy_saved_(mmbtu)_-_refrigeration,annual_propane_energy_saved_(mmbtu)_-_refrigeration,annual_heating_oil_energy_saved_(mmbtu)_-_refrigeration,annual_electric_energy_saved_(kwh)_-_other,annual_natural_gas_energy_saved_(mmbtu)_-_other,annual_propane_energy_saved_(mmbtu)_-_other,annual_heating_oil_energy_saved_(mmbtu)_-_other,water_savings_(gallons),kw-kwh_ratio
0,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,700,,,,,,,,0.1,1
1,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family,10,...,700,,,,,,,,0.1,1
2,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family_li,10,...,700,,,,,,,,0.1,1
3,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family,10,...,700,,,,,,,,0.1,1
4,refrigerator_electricity_efficient_residential...,residential,LIRNC,test_utility_1,test_utility_2,NC,refrigerator_electricity_baseline_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,450,,,,,,,,0.1,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
123,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,ROB,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family,10,...,650,,,,,,,,0.1,1
124,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family_li,10,...,650,,,,,,,,0.1,1
125,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family,10,...,650,,,,,,,,0.1,1
126,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family_li,10,...,650,,,,,,,,0.1,1


In [67]:
# df['measure_incremental_cost'] = df['incremental_cost_(usd)']
df['measure_incremental_cost'] = df['incremental_installed_cost_year_1']

# to calculate this value for each building type, divide the flh by 8760 (hours in a year)
df['demand_ratio'] = 1/8760
#df['demand_ratio'] = df['kw-kwh_ratio']/8760
df['demand_ratio'] = df['demand_ratio'].astype(float)

#################################################
### Inputs From Measure Table and Adjustments ###
#################################################
# each of these is for a single installation of this measure (one widget)
# all costs and benefits are for just one year, but the tests are run on whole measure lives

df["deferred_replacement_credit_value"] = 1 #df["deferred_replacement_credit_(usd)"]

df['measure_water_savings'] = df['water_savings_(gallons)']


In [68]:
#########################################
### summing up by fuel energy savings ###
#########################################
# For each energy impact we have a column energy_impact_1 which is the value,
# energy_impact_1_fuel_type which tells us the fuel type, and
# energy_impact_1_units which tells us the units (kwh, therms, gallons, etc.).
# We also have energy_impact_1_end_use which tells us the end use (heating, cooling, lighting, etc.).
# We need to sum all the energy impacts by fuel type to get total savings by fuel and create a new column for each fuel type.

# Initialize columns for each fuel type
df['measure_electric_energy_savings'] = 0.0
df['measure_natural_gas_savings'] = 0.0
df['measure_fuel_oil_savings'] = 0.0
df['measure_propane_savings'] = 0.0
df['measure_gasoline_savings'] = 0.0
df['measure_diesel_savings'] = 0.0

# Find all energy_impact value columns (energy_impact_1, energy_impact_2, etc.)
energy_impact_numbers = []
for col in df.columns:
    if col.startswith('energy_impact_') and not any(x in col for x in ['fuel_type', 'units', 'end_use']):
        try:
            num = int(col.split('_')[-1])
            energy_impact_numbers.append(num)
        except ValueError:
            pass

# Use pandas apply with axis=1 to process each row (simple, but not vectorized)
def sum_energy_by_fuel(row):
    """Sum energy impacts by fuel type for a single row"""
    fuel_totals = {
        'electricity': 0.0,
        'natural_gas': 0.0,
        'fuel_oil': 0.0,
        'propane': 0.0,
        'gasoline': 0.0,
        'diesel': 0.0
    }
    
    for num in energy_impact_numbers:
        value = row.get(f'energy_impact_{num}')
        fuel_type = row.get(f'energy_impact_{num}_fuel_type')
        
        if pd.notna(value) and pd.notna(fuel_type) and fuel_type in fuel_totals:
            fuel_totals[fuel_type] += value
    
    return pd.Series([
        fuel_totals['electricity'],
        fuel_totals['natural_gas'],
        fuel_totals['fuel_oil'],
        fuel_totals['propane'],
        fuel_totals['gasoline'],
        fuel_totals['diesel']
    ])

# Apply the function to all rows and assign to the new columns
df[['measure_electric_energy_savings', 'measure_natural_gas_savings', 
    'measure_fuel_oil_savings', 'measure_propane_savings',
    'measure_gasoline_savings', 'measure_diesel_savings']] = df.apply(sum_energy_by_fuel, axis=1)
df

Unnamed: 0,measure_name,sector,program,electric_utility,gas_utility,market,baseline_condition,efficient_condition,building_type,measure_life_(yrs),...,measure_incremental_cost,demand_ratio,deferred_replacement_credit_value,measure_water_savings,measure_electric_energy_savings,measure_natural_gas_savings,measure_fuel_oil_savings,measure_propane_savings,measure_gasoline_savings,measure_diesel_savings
0,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,1200,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
1,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family,10,...,1200,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
2,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family_li,10,...,1200,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
3,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family,10,...,1200,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
4,refrigerator_electricity_efficient_residential...,residential,LIRNC,test_utility_1,test_utility_2,NC,refrigerator_electricity_baseline_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,400,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
123,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,ROB,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family,10,...,700,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
124,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family_li,10,...,700,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
125,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family,10,...,700,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
126,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family_li,10,...,700,0.000114,1,0.1,0.0,0.0,0.0,0.0,0.0,0.0
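
The per-fuel totals displayed above are all 0.0. That may simply reflect the input file, but if it is unexpected, one thing worth checking is whether the fuel_type labels match the dictionary keys exactly. The sketch below sums by fuel on a toy table and normalises the labels first; the toy data and the normalisation step are assumptions for illustration, not something confirmed by the source data.

```python
import pandas as pd

# Toy measure table using the energy_impact_N / energy_impact_N_fuel_type layout
toy = pd.DataFrame({
    'energy_impact_1': [700.0, 450.0],
    'energy_impact_1_fuel_type': ['electricity', 'Electricity '],
    'energy_impact_2': [2.5, None],
    'energy_impact_2_fuel_type': ['natural_gas', None],
})

fuel_types = ['electricity', 'natural_gas']
totals = pd.DataFrame(0.0, index=toy.index, columns=fuel_types)

for num in (1, 2):
    values = pd.to_numeric(toy[f'energy_impact_{num}'], errors='coerce').fillna(0.0)
    # Assumption: labels may differ in case or whitespace, so normalise before comparing
    labels = toy[f'energy_impact_{num}_fuel_type'].astype(str).str.strip().str.lower()
    for fuel in fuel_types:
        # Keep the value where the label matches this fuel, otherwise contribute 0
        totals[fuel] += values.where(labels == fuel, 0.0)

print(totals)
```

Without the strip and lower step, the second row's 'Electricity ' label would not match the 'electricity' key and its 450 kWh would be silently dropped.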


In [69]:
#Read in avoided costs data
avoided_costs = pd.read_excel("010_input/11_Avoided_Cost.xlsx")
clean_column_names(avoided_costs)

#########################
#### Feature Request ####
#########################
# Requested feature from Matt: have the real discount rate change based on the cost test used.
# That means all of these calculations must be done for every cost test. This will be accomplished by pulling from different avoided cost sheets that have different real discount rates applied.
# We need to convert all avoided costs (which are in real dollars of the start year) to discounted dollars to reflect the time value of money.
# Money today is worth more than the same amount of money in the future due to its potential earning capacity.
real_discount_rate = 0.03
# This value will need to be changeable based on the test being run.
# I think we do this calculation for each test and label each column with the name of the test it is used for.

##############################
#### Applying Discounting ####
##############################
# Present Value = Future Value / (1 + real_discount_rate)^(year - base_year)
# Get list of columns that contain dollar values (looking for 'usd' in column name)
dollar_columns = [col for col in avoided_costs.columns if 'usd' in col.lower()]
avoided_costs['base_year'] = avoided_costs['year'].min()
# Apply discount formula to each dollar column
for col in dollar_columns:
    avoided_costs[col] = avoided_costs[col] / ((1 + real_discount_rate) ** (avoided_costs['year'] - avoided_costs['base_year']))
avoided_costs

Unnamed: 0,year,summer_on-peak_usdperkwh,summer_off-peak_usdperkwh,winter_on-peak_usdperkwh,winter_off-peak_usdperkwh,shoulder_on-peak_usdperkwh,shoulder_off-peak_usdperkwh,summer_gener_capacity_usdperkw-yr,summer_td_usdperkw-yr,winter_gener_capacity_usdperkw-yr,...,natural_gas_usdpermmbtu_ch4,res_fuel_oil_usdpermmbtu_ch4,com_fuel_oil_usdpermmbtu_ch4,ind_fuel_oil_usdpermmbtu_ch4,res_propane_usdpermmbtu_ch4,com_propane_usdpermmbtu_ch4,ind_propane_usdpermmbtu_ch4,diesel_transportation_usdpermmbtu_ch4,gasoline_transportation_usdpermmbtu_ch4,base_year
0,1,0.06,0.04,0.06,0.04,0.04,0.04,89.69,18.6,0.0,...,0.002688,0.008063,0.008063,0.008063,0.008063,0.008063,0.008063,0.008063,0.008063,1
1,2,0.058252,0.048544,0.058252,0.048544,0.038835,0.038835,48.15534,17.786408,0.0,...,0.002703,0.008108,0.008108,0.008108,0.008108,0.008108,0.008108,0.008108,0.008108,1
2,3,0.056556,0.04713,0.056556,0.04713,0.037704,0.037704,39.928363,17.258931,0.0,...,0.002716,0.008147,0.008147,0.008147,0.008147,0.008147,0.008147,0.008147,0.008147,1
3,4,0.054908,0.045757,0.054908,0.045757,0.036606,0.036606,57.846104,16.747092,0.0,...,0.002724,0.008173,0.008173,0.008173,0.008173,0.008173,0.008173,0.008173,0.008173,1
4,5,0.062194,0.044424,0.062194,0.044424,0.035539,0.035539,51.13243,16.250428,0.0,...,0.002731,0.008194,0.008194,0.008194,0.008194,0.008194,0.008194,0.008194,0.008194,1
5,6,0.060383,0.04313,0.060383,0.04313,0.034504,0.034504,52.558753,15.785741,0.0,...,0.002748,0.008243,0.008243,0.008243,0.008243,0.008243,0.008243,0.008243,0.008243,1
6,7,0.050249,0.041874,0.050249,0.041874,0.033499,0.033499,62.191581,15.317587,0.0,...,0.002762,0.008286,0.008286,0.008286,0.008286,0.008286,0.008286,0.008286,0.008286,1
7,8,0.048785,0.040655,0.048785,0.040655,0.032524,0.032524,54.753582,14.863313,0.0,...,0.002773,0.008319,0.008319,0.008319,0.008319,0.008319,0.008319,0.008319,0.008319,1
8,9,0.047365,0.03947,0.047365,0.03947,0.031576,0.031576,59.024128,14.422507,0.0,...,0.002781,0.008344,0.008344,0.008344,0.008344,0.008344,0.008344,0.008344,0.008344,1
9,10,0.045985,0.038321,0.045985,0.038321,0.030657,0.030657,62.432307,13.987105,0.0,...,0.002787,0.00836,0.00836,0.00836,0.00836,0.00836,0.00836,0.00836,0.00836,1
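
One way the per-test feature request could be sketched: build one discounted table per cost test from the same real-dollar inputs. The test names, the rates, and the helper discount_avoided_costs are hypothetical placeholders; the notebook does not yet define them.

```python
import pandas as pd

# Hypothetical cost-test names and real discount rates (placeholders, not project values)
test_discount_rates = {'test_a': 0.03, 'test_b': 0.05}

# Toy avoided costs in real dollars of the base year
real_costs = pd.DataFrame({
    'year': [1, 2, 3],
    'summer_on-peak_usdperkwh': [0.06, 0.06, 0.06],
})


def discount_avoided_costs(costs, rate):
    """Return a copy with every 'usd' column discounted back to the first year."""
    out = costs.copy()
    years_from_base = out['year'] - out['year'].min()
    dollar_columns = [col for col in out.columns if 'usd' in col.lower()]
    for col in dollar_columns:
        out[col] = out[col] / (1 + rate) ** years_from_base
    return out


# One discounted table per cost test; the undiscounted input is left untouched
discounted_by_test = {
    name: discount_avoided_costs(real_costs, rate)
    for name, rate in test_discount_rates.items()
}
print(discounted_by_test['test_b'])
```

Working on a copy matters here: the cell above overwrites avoided_costs in place, so re-running it would discount the same values twice.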


In [70]:
avoided_costs.to_csv("./010_output/avoided_costs_discounted_2026_All.csv", index=False) # the year in the filename is the year the real discount rate was applied for; "All" covers all cost tests, but we need the 6 versions that could be changed

In [71]:
# read in loadshapes data
loadshapes = pd.read_excel("010_input/12_loadshapes.xlsx")
clean_column_names(loadshapes)

Unnamed: 0,condition_name,competition_group,subgroup,summer_on-peak,summer_off-peak,winter_on-peak,winter_off-peak,shoulder_on-peak,shoulder_off-peak,summer_gener_capacity,winter_gener_capacity,summer_tandd,winter_tandd
0,furnace_fuel_oil_existing_residential,heating_cooling,oil_furnace,0.007878,0.018346,0.281812,0.368809,0.129864,0.193292,0.0,0.381007,0.0,0.381007
1,furnace_natural_gas_baseline_residential,heating_cooling,gas_furnace,0.007878,0.018346,0.281812,0.368809,0.129864,0.193292,0.0,0.381007,0.0,0.381007
2,furnace_natural_gas_efficient_residential,heating_cooling,gas_furnace,0.007878,0.018346,0.281812,0.368809,0.129864,0.193292,0.0,0.381007,0.0,0.381007
3,room_ac_electricity_baseline_residential,heating_cooling,room_ac,0.490621,0.371577,0.020994,0.026759,0.0484,0.04165,0.372861,0.0,0.372861,0.0
4,room_ac_electricity_efficient_residential,heating_cooling,room_ac,0.490621,0.371577,0.020994,0.026759,0.0484,0.04165,0.372861,0.0,0.372861,0.0
5,air_conditioner_electricity_baseline_residential,heating_cooling,central_ac,0.490621,0.371577,0.020994,0.026759,0.0484,0.04165,0.372861,0.0,0.372861,0.0
6,air_conditioner_electricity_efficient_residential,heating_cooling,central_ac,0.490621,0.371577,0.020994,0.026759,0.0484,0.04165,0.372861,0.0,0.372861,0.0
7,cchp_electricity_efficient_residential,heating_cooling,cchp,0.490621,0.371577,0.020994,0.026759,0.0484,0.04165,0.372861,0.0,0.372861,0.0
8,refrigerator_electricity_existing_residential,refrigeration,full_size,0.220863,0.237399,0.095015,0.110339,0.164724,0.171661,1.186104,0.888264,1.186104,0.888264
9,refrigerator_electricity_baseline_residential,refrigeration,full_size,0.220863,0.237399,0.095015,0.110339,0.164724,0.171661,1.186104,0.888264,1.186104,0.888264


In [72]:
# this function allows us to grab certain groups of columns needed in the analysis

def weighted_column_sum(df, weights_row, columns=None, fill_value=0.0):
    """
    df: DataFrame with values to weight (e.g. loadshapes)
    weights_row: Series-like with weights indexed by column name (e.g. avoided_costs.loc[0])
    columns: optional list of columns to use; if None, intersection of df.columns and weights_row.index
    """
    if columns is None:
        columns = df.columns.intersection(weights_row.index)
    w = pd.Series(weights_row).reindex(columns).astype(float).fillna(fill_value)
    return df[columns].fillna(fill_value).dot(w)

#Excel QC complete
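
A small usage sketch for weighted_column_sum on toy data. The function body is repeated so the example runs on its own; the condition names and numbers are made up for illustration.

```python
import pandas as pd


def weighted_column_sum(df, weights_row, columns=None, fill_value=0.0):
    # Same logic as the notebook function, repeated so the example is self-contained
    if columns is None:
        columns = df.columns.intersection(weights_row.index)
    w = pd.Series(weights_row).reindex(columns).astype(float).fillna(fill_value)
    return df[columns].fillna(fill_value).dot(w)


# Toy loadshape fractions (rows = conditions) and one year of per-period prices
toy_shapes = pd.DataFrame(
    {'summer_on-peak': [0.5, 0.25], 'winter_on-peak': [0.5, 0.75]},
    index=['condition_a', 'condition_b'],
)
toy_prices = pd.Series({'summer_on-peak': 0.06, 'winter_on-peak': 0.04, 'year': 1})

# Only columns present in both inputs are used, so 'year' is ignored
blended = weighted_column_sum(toy_shapes, toy_prices)
print(blended)
```

Each result is the price a condition sees per unit of energy, weighted by how its use is spread across the periods.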

In [73]:
# Line Losses
line_losses = pd.read_excel("010_input/15_Line_Losses.xlsx")
clean_column_names(line_losses)

Unnamed: 0,sectors,summer_on-peak,summer_off-peak,winter_on-peak,winter_off-peak,shoulder_on-peak,shoulder_off-peak,summer_gener_capacity,winter_gener_capacity,summer_tandd,winter_tandd
0,residential,0.0943,0.0943,0.0943,0.0943,0.0943,0.0943,0.0943,0.0943,0.0943,0.0943
1,commercial,0.079,0.079,0.079,0.079,0.079,0.079,0.079,0.079,0.079,0.079
2,industrial,0.079,0.079,0.079,0.079,0.079,0.079,0.079,0.079,0.079,0.079


In [74]:
###################################################################
## Electric Energy Savings Calculation with line-loss adjustment ##
###################################################################

# Take the first seven columns (year plus the six energy-period avoided costs).
# .copy() avoids renaming columns on a slice of avoided_costs (SettingWithCopyWarning).
avoided_costs_electric_use = avoided_costs.iloc[:, :7].copy()

# Drop the unit suffix so the column names match the loadshape and line-loss tables
avoided_costs_electric_use.columns = [col.replace('_usdperkwh', '') for col in avoided_costs_electric_use.columns]

# Find columns common to all three tables (avoided_costs, loadshapes, line_losses)
common = loadshapes.columns.intersection(avoided_costs_electric_use.columns).intersection(line_losses.columns)

# Helper to map df sector text to loss table key
def sector_key_from_sector_text(sector_text):
    s = str(sector_text).lower() if sector_text is not None else ''
    if 'res' in s or 'resident' in s or 'house' in s:
        return 'res'
    if 'com' in s or 'commercial' in s:
        return 'com'
    if 'ind' in s or 'industrial' in s:
        return 'ind'
    # fallback: try to match exact short codes if present
    if sector_text in ['res','com','ind']:
        return sector_text
    return None

# Total energy cost calculation: sum costs across measure lifetime (avoided_costs already NPVed)
def compute_electric_energy_total_cost(row):
    """
    Compute total electric energy savings value across measure lifetime using sector-specific line losses.
    Looks up the loadshape based on efficient_condition from the row.
    Note: avoided_costs are already discounted, so no additional NPV calculation needed.
    """
    measure_life = int(row.get('measure_life_(yrs)', 1))
    elec_savings = row.get('measure_electric_energy_savings', 0)
    
    if pd.isna(elec_savings) or elec_savings == 0:
        return 0.0

    # Get the loadshape vector for this measure's efficient_condition
    efficient_condition = row.get('efficient_condition', None)
    if efficient_condition and 'condition_name' in loadshapes.columns:
        loadshape_match = loadshapes[loadshapes['condition_name'] == efficient_condition]
        if not loadshape_match.empty:
            loadshape_vector = loadshape_match[common].fillna(0).iloc[0].astype(float)
        else:
            # Fallback to first row if no match found
            loadshape_vector = loadshapes[common].fillna(0).iloc[0].astype(float)
    else:
        # Fallback to first row if no efficient_condition or condition_name column
        loadshape_vector = loadshapes[common].fillna(0).iloc[0].astype(float)

    # determine sector key for selecting line_losses row
    sector_k = sector_key_from_sector_text(row.get('sector', None))

    # Select line_losses row for this sector (try likely column names 'sector' or 'sectors').
    # The row does not depend on the year, so it is looked up once, before the loop.
    loss_row = None
    for lname in ['sector', 'sectors']:
        if lname in line_losses.columns:
            try:
                mask = line_losses[lname].astype(str).str.lower().str.contains(str(sector_k))
            except Exception:
                mask = pd.Series([False] * len(line_losses), index=line_losses.index)
            if mask.any():
                loss_row = line_losses.loc[mask, common].iloc[0].astype(float).fillna(0)
                break
    # Fallback to first row if no sector-matched row found # this could cause issues
    if loss_row is None:
        loss_row = line_losses.loc[0, common].astype(float).fillna(0)

    total_cost = 0.0
    base_year = avoided_costs_electric_use['year'].min()

    # For each year in the measure life
    for year_offset in range(1, measure_life + 1):
        target_year = base_year + year_offset - 1

        # Get avoided costs for this year (or use last available if beyond data)
        if target_year in avoided_costs_electric_use['year'].values:
            year_costs = avoided_costs_electric_use[avoided_costs_electric_use['year'] == target_year][common].iloc[0].astype(float).fillna(0)
        else:
            # Use last available year if beyond data range
            year_costs = avoided_costs_electric_use[common].iloc[-1].astype(float).fillna(0)

        # Compute weighted sum for this year (loadshape * (1 + line_loss) * avoided_cost)
        year_value = (loadshape_vector * (1 + loss_row)).dot(year_costs)

        # Sum up total cost (no discounting since avoided_costs are already NPVed)
        total_cost += elec_savings * year_value

    return total_cost

# Apply total cost calculation to each row
df['electric_energy_savings_value'] = df.apply(compute_electric_energy_total_cost, axis=1)
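
A worked single-year example of the formula used above (loadshape * (1 + line_loss) * avoided_cost), using the refrigerator_electricity_existing_residential loadshape row, the residential line-loss row, and the year-1 avoided costs as displayed earlier in this notebook. The 700 kWh savings figure is illustrative.

```python
import pandas as pd

periods = ['summer_on-peak', 'summer_off-peak', 'winter_on-peak',
           'winter_off-peak', 'shoulder_on-peak', 'shoulder_off-peak']

# Values copied from the tables displayed earlier in this notebook
loadshape = pd.Series([0.220863, 0.237399, 0.095015, 0.110339, 0.164724, 0.171661], index=periods)
line_loss = pd.Series([0.0943] * 6, index=periods)  # residential row
year_1_costs = pd.Series([0.06, 0.04, 0.06, 0.04, 0.04, 0.04], index=periods)

annual_kwh_saved = 700  # illustrative annual savings

# Value per kWh saved: loadshape share, grossed up for line losses, times avoided cost
value_per_kwh = (loadshape * (1 + line_loss)).dot(year_1_costs)
year_1_value = annual_kwh_saved * value_per_kwh
print(round(value_per_kwh, 6), round(year_1_value, 2))
```

The lifetime figure in the cell above repeats this calculation for each year of the measure life with that year's discounted costs and adds the results.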


In [75]:
###################################################################
# Electric Demand Total Cost Calculation (standalone cell)
# This cell contains the helper and per-row total cost function for demand-related avoided costs
###################################################################

# Helper to get loadshape values for a given condition
# (If you already have this defined elsewhere in the notebook, this re-definition is safe and will override.)
def get_loadshape_values(condition_name):
    """
    Fetch loadshape values for a given condition_name.
    Returns a dict with the period-specific loadshape values.
    If condition_name not found, fallback to first row.
    """
    if 'condition_name' in loadshapes.columns:
        matching = loadshapes[loadshapes['condition_name'] == condition_name]
        if not matching.empty:
            row = matching.iloc[0]
        else:
            # fallback to first row if condition not found
            row = loadshapes.iloc[0]
    else:
        # If no condition_name column, use first row
        row = loadshapes.iloc[0]
    
    return {
        'summer_gener_capacity': row.get('summer_gener_capacity', 0),
        'summer_tandd': row.get('summer_tandd', 0),
        'winter_gener_capacity': row.get('winter_gener_capacity', 0),
        'winter_tandd': row.get('winter_tandd', 0)
    }

# Function to compute total lifetime demand savings per row with condition-based loadshapes
def compute_electric_demand_total_cost(row):
    """
    Compute total lifetime electric demand savings value for this row using:
    - condition_name from row to select appropriate loadshape values
    - year-by-year avoided costs for demand (summer and winter capacity, T&D)
    - line losses (first row used here; consider making sector-aware)
    - Full measure lifetime summed (avoided_costs already NPVed, no additional discounting)
    """
    measure_life = int(row.get('measure_life_(yrs)', 1))
    demand_ratio = row.get('demand_ratio', 1.0 / 8760)
    elec_energy_savings = row.get('measure_electric_energy_savings', 0)
    
    if pd.isna(elec_energy_savings) or elec_energy_savings == 0:
        return 0.0
    
    # Get loadshape values for this row's condition
    condition = row.get('efficient_condition', None)
    ls_vals = get_loadshape_values(condition)
    
    # Get line loss values (first row for now)
    summer_gen_loss = line_losses.loc[0, 'summer_gener_capacity'] if 'summer_gener_capacity' in line_losses.columns else 0
    summer_td_loss = line_losses.loc[0, 'summer_tandd'] if 'summer_tandd' in line_losses.columns else 0
    winter_gen_loss = line_losses.loc[0, 'winter_gener_capacity'] if 'winter_gener_capacity' in line_losses.columns else 0
    winter_td_loss = line_losses.loc[0, 'winter_tandd'] if 'winter_tandd' in line_losses.columns else 0
    
    total_cost = 0.0
    base_year = avoided_costs['year'].min()
    
    # For each year in the measure life
    for year_offset in range(1, measure_life + 1):
        target_year = base_year + year_offset - 1
        
        # Get avoided costs for this year (or use last available if beyond data)
        if target_year in avoided_costs['year'].values:
            year_costs = avoided_costs[avoided_costs['year'] == target_year].iloc[0]
        else:
            # Use last available year if beyond data range
            year_costs = avoided_costs.iloc[-1]
        
        # Extract demand-related avoided cost values for this year
        summer_gen_cost = year_costs.get('summer_gener_capacity_usdperkw-yr', 0)
        summer_td_cost = year_costs.get('summer_td_usdperkw-yr', 0)
        winter_gen_cost = year_costs.get('winter_gener_capacity_usdperkw-yr', 0)
        winter_td_cost = year_costs.get('winter_td_usdperkw-yr', 0)
        
        # Compute weighted demand value for this year using condition-specific loadshapes
        year_demand_value = (
            (summer_gen_cost * (1 + summer_gen_loss) * ls_vals['summer_gener_capacity'])
            + (summer_td_cost * (1 + summer_td_loss) * ls_vals['summer_tandd'])
            + (winter_gen_cost * (1 + winter_gen_loss) * ls_vals['winter_gener_capacity'])
            + (winter_td_cost * (1 + winter_td_loss) * ls_vals['winter_tandd'])
        )
        
        # Sum up total cost (no discounting since avoided_costs are already NPVed)
        total_cost += demand_ratio * elec_energy_savings * year_demand_value
   
    return total_cost

# Apply per-row calculation (creates/overwrites df['electric_demand_savings_value'])
df['electric_demand_savings_value'] = df.apply(compute_electric_demand_total_cost, axis=1)

# Quick check (print a small sample)
print(df[['efficient_condition','measure_life_(yrs)','measure_electric_energy_savings','electric_demand_savings_value']])

                                efficient_condition  measure_life_(yrs)  \
0    refrigerator_electricity_efficient_residential                  10   
1    refrigerator_electricity_efficient_residential                  10   
2    refrigerator_electricity_efficient_residential                  10   
3    refrigerator_electricity_efficient_residential                  10   
4    refrigerator_electricity_efficient_residential                  10   
..                                              ...                 ...   
123      refrigerator_electricity_top10_residential                  10   
124      refrigerator_electricity_top10_residential                  10   
125      refrigerator_electricity_top10_residential                  10   
126      refrigerator_electricity_top10_residential                  10   
127      refrigerator_electricity_top10_residential                  10   

     measure_electric_energy_savings  electric_demand_savings_value  
0                            
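
The demand calculation above always reads line losses from the first row of the table, and its docstring suggests making that sector-aware. A minimal sketch of such a lookup follows, assuming the cleaned table has a 'sectors' column holding full sector names; the helper name line_loss_for_sector is hypothetical.

```python
import pandas as pd

# Toy line-loss table mirroring the layout shown for 15_Line_Losses.xlsx
line_losses_toy = pd.DataFrame({
    'sectors': ['residential', 'commercial', 'industrial'],
    'summer_gener_capacity': [0.0943, 0.079, 0.079],
    'summer_tandd': [0.0943, 0.079, 0.079],
})


def line_loss_for_sector(sector_text, loss_table, column):
    """Return the loss factor from the row whose 'sectors' value matches sector_text.

    Falls back to the first row when nothing matches, as the notebook does today.
    """
    wanted = str(sector_text).strip().lower()
    match = loss_table[loss_table['sectors'].str.lower() == wanted]
    row = match.iloc[0] if not match.empty else loss_table.iloc[0]
    return float(row[column])


print(line_loss_for_sector('Commercial', line_losses_toy, 'summer_tandd'))
```

With the current inputs this only changes results for commercial and industrial measures, since the first row is the residential one.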

In [76]:
###########################################################
## Natural Gas Savings (Full Measure Lifetime)         ####
###########################################################
# Calculate the lifetime savings over the full measure lifetime
# The avoided costs are already discounted, so we just sum them across the measure life

def calculate_lifetime_savings(measure_savings, avoided_cost_series, measure_life):
    """
    measure_savings: annual savings (scalar)
    avoided_cost_series: series of already-discounted avoided costs indexed by year
    measure_life: number of years the measure lasts
    """
    # Sum the discounted avoided costs across all years of measure life
    lifetime_savings = 0
    for year_offset in range(1, measure_life + 1):
        year = avoided_cost_series.index.min() + year_offset -1
        if year in avoided_cost_series.index:
            annual_cost = avoided_cost_series[year]
        else:
            # If year is beyond available data, use the last available year
            annual_cost = avoided_cost_series.iloc[-1]
        
        lifetime_savings += measure_savings * annual_cost
    
    return lifetime_savings

# Apply to natural gas savings for each row
df['natural_gas_savings_value'] = df.apply(
    lambda row: calculate_lifetime_savings(
        row['measure_natural_gas_savings'],
        avoided_costs.set_index('year')['natural_gas_usdpermmbtu'],
        int(row['measure_life_(yrs)'])
    ),
    axis=1
)

# Excel QC Complete
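
Because the avoided costs are already discounted, the lifetime value is the annual savings multiplied by the sum of the first N yearly costs. The sketch below shows that on toy numbers, including the case where the measure life runs past the available data. It assumes the series is sorted with one row per consecutive year; lifetime_value is an illustrative name, not a notebook function.

```python
import pandas as pd

# Toy series of already-discounted avoided costs, indexed by year
discounted_costs = pd.Series([2.0, 1.9, 1.8], index=[1, 2, 3])


def lifetime_value(annual_savings, cost_series, measure_life):
    """Annual savings times the summed avoided costs over the measure life.

    Years past the end of the series reuse the last available cost.
    """
    costs = list(cost_series.iloc[:measure_life])
    extra_years = measure_life - len(costs)
    if extra_years > 0:
        costs.extend([cost_series.iloc[-1]] * extra_years)
    return annual_savings * sum(costs)


print(lifetime_value(10.0, discounted_costs, 2))  # uses years 1 and 2
print(lifetime_value(10.0, discounted_costs, 5))  # years 4 and 5 reuse the year-3 cost
```

Note that reusing the last year's cost means the extra years are not discounted any further than the final year in the table.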

In [77]:
#################################
#### Other fuel Calculations ####
#################################
# These calculations are performed based on sector matching
# Sector prefixes: res = residential, com = commercial, ind = industrial
# Fuel types: fuel_oil, propane, diesel_transportation, gasoline_transportation

# Define fuel type mappings: (fuel_type, measure_column, avoided_cost_columns_by_sector)
fuel_configs = [
    ('fuel_oil', 'measure_fuel_oil_savings', {
        'res': 'res_fuel_oil_usdpermmbtu',
        'com': 'com_fuel_oil_usdpermmbtu',
        'ind': 'ind_fuel_oil_usdpermmbtu'
    }),
    ('propane', 'measure_propane_savings', {
        'res': 'res_propane_usdpermmbtu',
        'com': 'com_propane_usdpermmbtu',
        'ind': 'ind_propane_usdpermmbtu'
    }),
    # I just did this to make the loop more complete # there is probably a better way?
    ('diesel_transportation', 'measure_diesel_savings', {
        'res': 'diesel_transportation_usdpermmbtu',
        'com': 'diesel_transportation_usdpermmbtu',
        'ind': 'diesel_transportation_usdpermmbtu'
    }),
    ('gasoline_transportation', 'measure_gasoline_savings', {
        'res': 'gasoline_transportation_usdpermmbtu',
        'com': 'gasoline_transportation_usdpermmbtu',
        'ind': 'gasoline_transportation_usdpermmbtu'
    })
]

# Process each fuel type
for fuel_type, measure_col, sector_columns in fuel_configs:
    output_col = f'{fuel_type}_savings_value'
    
    def calculate_sector_fuel_savings(row):
        """Calculate lifetime savings based on sector and fuel type"""
        # Determine sector from the 'sector' column (case-insensitive)
        sector = None
        sector_lower = str(row.get('sector', '')).lower()
        
        if 'res' in sector_lower:
            sector = 'res'
        elif 'com' in sector_lower:
            sector = 'com'
        elif 'ind' in sector_lower:
            sector = 'ind'
        else:
            return 0  # No matching sector
        
        # Get the appropriate avoided cost column for this sector
        avoided_cost_col = sector_columns.get(sector)
        
        # Check if the column exists in avoided_costs
        if avoided_cost_col not in avoided_costs.columns:
            return 0
        
        # Check if measure_col exists in the row and get measure savings
        if measure_col not in row.index or pd.isna(row[measure_col]):
            return 0
        
        measure_savings = row[measure_col]
        
        # Calculate lifetime savings using the existing function
        try:
            return calculate_lifetime_savings(
                measure_savings,
                avoided_costs.set_index('year')[avoided_cost_col],
                int(row['measure_life_(yrs)'])
            )
        except Exception:
            # Treat any lookup or conversion failure as zero savings
            return 0
    
    # Apply the calculation for this fuel type
    df[output_col] = df.apply(calculate_sector_fuel_savings, axis=1)

# Excel QC Complete

In [78]:
df['other_fuel_savings_value'] = df['fuel_oil_savings_value'] + df['propane_savings_value'] + df['diesel_transportation_savings_value'] + df['gasoline_transportation_savings_value']


In [79]:
# O&M savings are already expressed in dollars for a single unit.
# The value may be characterized as the difference in O&M costs between the baseline and measure conditions.
# NPV of O&M savings over the measure lifetime:
# each year's O&M savings is discounted back to present value at the compounding real discount rate.
real_discount_rate = 0.03
# TODO: currently reads the annual_oandm_cost input; will be changed to the savings column below.
df['annual_oandm_savings'] = df['annual_oandm_cost']
# df['annual_oandm_savings'] = df['annual_oandm_savings_(usd)']

def calculate_oandm_npv(row):
    """
    Calculate NPV of O&M savings over measure lifetime.
    Applies compounding discount rate: PV = FV / (1 + r)^year
    """
    annual_oandm = row['annual_oandm_savings']
    measure_life = int(row['measure_life_(yrs)'])
    
    npv_total = 0.0
    for year in range(1, measure_life + 1):
        # Discount factor compounds each year: 1/(1+r)^year
        npv_total += annual_oandm / ((1 + real_discount_rate) ** year)
    
    return npv_total

df['o_and_m_savings_value'] = df.apply(calculate_oandm_npv, axis=1)
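
Because the annual O&M value is constant over the measure life, the year-by-year loop above should agree with the closed-form annuity factor, A * (1 - (1 + r)^-n) / r, which makes a quick QC check. A minimal sketch; `annuity_factor` and `npv_level_stream` are hypothetical helper names and the 100.0 input is illustrative only:

```python
def annuity_factor(rate, years):
    """Present value of 1 received at the end of each year for `years` years."""
    if rate == 0:
        return float(years)
    return (1 - (1 + rate) ** -years) / rate


def npv_level_stream(annual_value, rate, years):
    """NPV of a constant annual value, discounted from year 1 through `years`."""
    return annual_value * annuity_factor(rate, years)


# Compare against the same loop used in calculate_oandm_npv
loop_npv = sum(100.0 / ((1 + 0.03) ** year) for year in range(1, 11))
closed_npv = npv_level_stream(100.0, 0.03, 10)
print(round(loop_npv, 6), round(closed_npv, 6))
```

At a 3% real rate and a 10-year life the factor is about 8.5302, so each dollar of annual O&M savings is worth roughly 8.53 dollars over the lifetime.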

In [80]:
# Raw water savings from the measure table * water avoided cost from the avoided-costs table

df['water_savings_value'] = df.apply(
    lambda row: calculate_lifetime_savings(
        row['measure_water_savings'],
        avoided_costs.set_index('year')['water_usdpergallon'],
        int(row['measure_life_(yrs)'])
    ),
    axis=1
)

In [81]:
### Incentives
# Incentive values are a fraction of measure_incremental_cost that depends on program, building type, end use, and install type.
# Input table: 13_Incentives
# For now incentives are applied at the measure level; the percent tables will be kept in Excel so others can use them.
incentives = pd.read_excel("010_input/13_Incentives.xlsx")
clean_column_names(incentives)
# Need to make sure there is no double counting here.
# Because our initial counts are specific to the utility combination (test_utility_gas & test_utility_electric),
# each row should have an electric and a gas utility, and the incentives must account for this and not add up to more than the incremental cost.

# The market characterization file will add information on who is providing incentives.
# TODO: add a loop here that ensures incentives do not exceed 100% of the incremental cost and that the appropriate utility incentives are pulled in.
# For instance, if only the electric utility provides an incentive, only the electric utility percentage should be used (with the right name).
# Deferred replacement might also need to be handled here.

# Upgrades: currently this requires a "none" in the input data, which I think makes sense.
# Many checks should probably be integrated into the pipeline but aren't necessary for the code to work.
# --- Incentives: match by measure_name and utility and compute dollar incentives ---

# Normalize / detect key columns (some files may have slightly different names)
# Required: 'measure_name' should exist in incentives (if not, try common variants)
if 'measure_name' not in incentives.columns:
    for alt in ['measure', 'measureid', 'measure_id', 'measure_name_']:
        if alt in incentives.columns:
            incentives = incentives.rename(columns={alt: 'measure_name'})
            break

# Utility column may or may not exist; if it does, we'll match per-utility; otherwise fallback to per-measure values
utility_col_name = None
for c in ['utility', 'utility_name', 'program_utility', 'provider']:
    if c in incentives.columns:
        utility_col_name = c
        break

# Find the likely percentage columns (tolerant to small name differences)
def find_col(df, must_have):
    for col in df.columns:
        low = col.lower()
        if all(part in low for part in must_have):
            return col
    return None

electric_pct_col = find_col(incentives, ['electric', 'incent', 'percent']) or find_col(incentives, ['electric', 'incentive'])
gas_pct_col = find_col(incentives, ['natural', 'gas', 'incent', 'percent']) or find_col(incentives, ['gas', 'incent'])
nonutility_pct_col = find_col(incentives, ['non', 'util', 'incent', 'percent']) or find_col(incentives, ['nonutility', 'incent'])

# As a defensive fallback, try obvious exact names used previously
if electric_pct_col is None and 'electric_incentive_percentage' in incentives.columns:
    electric_pct_col = 'electric_incentive_percentage'
if gas_pct_col is None and 'natural_gas_incentive_percentage' in incentives.columns:
    gas_pct_col = 'natural_gas_incentive_percentage'
if nonutility_pct_col is None and 'nonutility_incentive_percentage' in incentives.columns:
    nonutility_pct_col = 'nonutility_incentive_percentage'

# If any of those are still missing, set to None and we'll treat missing as 0
# Helper to get percentage for a given measure_name and utility (with fallbacks)
def get_incentive_pct(measure_name, utility_name, pct_col):
    if pct_col is None:
        return 0.0
    # try exact measure+utility if utility column exists
    if utility_col_name is not None and utility_name is not None:
        match = incentives[(incentives['measure_name'] == measure_name) & (incentives[utility_col_name] == utility_name)]
        if not match.empty and pd.notna(match.iloc[0].get(pct_col)):
            return float(match.iloc[0][pct_col])
    # next fallback: any row with the measure_name (ignore utility)
    match2 = incentives[incentives['measure_name'] == measure_name]
    if not match2.empty:
        # prefer a row where the pct_col is not null
        non_null = match2[match2[pct_col].notna()]
        if not non_null.empty:
            return float(non_null.iloc[0][pct_col])
        # otherwise take first row even if NaN (will become 0)
        val = match2.iloc[0].get(pct_col)
        return float(val) if pd.notna(val) else 0.0
    # last fallback: try to use a global/default in the incentives table (first row)
    if pct_col in incentives.columns and not incentives.empty:
        val = incentives.iloc[0].get(pct_col)
        return float(val) if pd.notna(val) else 0.0
    return 0.0

# Determine what column in the main df holds the gas utility name
gas_utility_col = None
for c in ['gas_utility', 'natural_gas_utility', 'gasutility', 'gas_provider']:
    if c in df.columns:
        gas_utility_col = c
        break

# Determine what column in the main df holds the electric utility name
electric_utility_col = None
for c in ['electric_utility', 'elec_utility', 'electricutility', 'electric_provider']:
    if c in df.columns:
        electric_utility_col = c
        break

# Compute percentage columns on df (stored so you can QA)
df['electric_incentive_pct'] = df.apply(
    lambda r: get_incentive_pct(r.get('measure_name'), r.get(electric_utility_col) if electric_utility_col else None, electric_pct_col),
    axis=1
)
df['natural_gas_incentive_pct'] = df.apply(
    lambda r: get_incentive_pct(r.get('measure_name'), r.get(gas_utility_col) if gas_utility_col else None, gas_pct_col),
    axis=1
)
df['nonutility_incentive_pct'] = df.apply(
    lambda r: get_incentive_pct(r.get('measure_name'), None, nonutility_pct_col),
    axis=1
)

# If the incentive percentages in your incentives file are expressed as whole percent (e.g., 50 for 50%),
# convert to decimals when a value > 1 is detected.
for pct_col in ['electric_incentive_pct', 'natural_gas_incentive_pct', 'nonutility_incentive_pct']:
    if df[pct_col].abs().max() > 1:
        df[pct_col] = df[pct_col] / 100.0

# Finally compute dollar incentives by multiplying the incremental cost per measure
df['electric_utility_incentive_value'] = df['measure_incremental_cost'] * df['electric_incentive_pct']
df['natural_gas_utility_incentive_value'] = df['measure_incremental_cost'] * df['natural_gas_incentive_pct']
df['nonutility_incentive_value'] = df['measure_incremental_cost'] * df['nonutility_incentive_pct']

df['total_incentive_value'] = df['electric_utility_incentive_value'] + df['natural_gas_utility_incentive_value'] + df['nonutility_incentive_value']
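
The comments at the top of this cell call for a check that combined incentives never exceed 100% of the incremental cost. One way this could look is sketched below. Proportional scaling is an assumption made here for illustration (the notebook does not say how an excess should be allocated), and `cap_shares` is a hypothetical helper:

```python
def cap_shares(shares, cap=1.0):
    """Scale incentive shares proportionally so they sum to at most `cap`.

    `shares` maps a payer name to its fraction of incremental cost.
    Shares that already fit under the cap are returned unchanged.
    """
    total = sum(shares.values())
    if total <= cap or total == 0:
        return dict(shares)
    factor = cap / total
    return {name: value * factor for name, value in shares.items()}


# Illustrative shares only: 50% + 50% + 25% = 125% of incremental cost
capped = cap_shares({"electric": 0.5, "natural_gas": 0.5, "nonutility": 0.25})
print(capped)
```

Applied row by row to the three `*_incentive_pct` columns before the dollar values are computed, this would keep `total_incentive_value` at or below `measure_incremental_cost`.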

In [82]:
### Program Costs
# Input table: 14_Programs
programs = pd.read_excel("010_input/14_Programs.xlsx")
clean_column_names(programs)

# Each utility's program costs will also need adjusting so the fuel-type groupings do not overlap.
# Fuel-type groups are needed for the tests, but they work together: all are added or all are subtracted.

# Robust mapping: match main df rows to programs by program, utility, and fuel,
# then compute non-measure program costs = measure_incremental_cost * non-incentive percent.
# Note: the expected column name describes this percent as a share of incentive costs,
# while the calculation below applies it to measure_incremental_cost.
# Detect possible column-name variants in the programs table
prog_program_col = None
for c in ['program', 'program_name', 'programs']:
    if c in programs.columns:
        prog_program_col = c
        break
prog_utility_col = None
for c in ['utility', 'utility_name', 'provider']:
    if c in programs.columns:
        prog_utility_col = c
        break
prog_fuel_col = None
for c in ['fuel', 'utility_fuel', 'fuel_type']:
    if c in programs.columns:
        prog_fuel_col = c
        break
# percent column (allow for dashes / underscores / long names)
pct_col = None
for c in programs.columns:
    low = c.lower()
    if 'non' in low and 'incent' in low and 'percent' in low:
        pct_col = c
        break
# fallback: accept any column containing 'non' and 'incent', without requiring 'percent'
if pct_col is None:
    for c in programs.columns:
        low = c.lower()
        if 'non' in low and 'incent' in low:
            pct_col = c
            break
# If still None, try an exact expected name used historically
if pct_col is None and 'non-incentive_costs_as_a_percent_of_incentive_costs' in programs.columns:
    pct_col = 'non-incentive_costs_as_a_percent_of_incentive_costs'
# Defensive: if we still don't have a pct_col, set it to None and treat as zeros
if pct_col is None:
    print("Warning: could not find non-incentive percent column in programs; program-costs will be 0.")

# Helpers to normalize fuel names and detect match rows
def fuel_matches(val, desired):
    if pd.isna(val):
        return False
    v = str(val).lower()
    if desired == 'electric':
        return 'elec' in v  # also matches 'electric' and 'electricity'
    if desired == 'natural_gas':
        # 'nat' also covers 'natural'; a bare 'gas' label counts as natural gas
        return ('gas' in v and 'nat' in v) or v.strip() == 'gas'
    if desired == 'other':
        return 'oth' in v or 'non' in v  # 'oth' also covers 'other'
    return desired in v

# Determine which columns in main df hold program, electric utility and gas utility
df_program_col = None
for c in ['program', 'program_name', 'programs']:
    if c in df.columns:
        df_program_col = c
        break
electric_utility_col = None
for c in ['electric_utility', 'elec_utility', 'electricutility', 'electric_provider']:
    if c in df.columns:
        electric_utility_col = c
        break
gas_utility_col = None
for c in ['gas_utility', 'natural_gas_utility', 'gasutility', 'gas_provider']:
    if c in df.columns:
        gas_utility_col = c
        break

# Function to lookup percent from programs table with fallbacks
def lookup_program_pct(prog_name, util_name, desired_fuel):
    # If pct_col is missing, return 0
    if pct_col is None:
        return 0.0
    # Try exact match on program, utility, and fuel
    candidates = programs
    if prog_program_col is not None and prog_name is not None:
        candidates = candidates[candidates[prog_program_col] == prog_name]
    if prog_utility_col is not None and util_name is not None:
        matched = candidates[candidates[prog_utility_col] == util_name]
        # keep candidates as matched if not empty for next step
        if not matched.empty:
            candidates = matched
    # Filter by fuel using fuzzy match
    # prefer exact fuel match rows
    fuel_mask = candidates[prog_fuel_col].apply(lambda x: fuel_matches(x, desired_fuel)) if prog_fuel_col in candidates.columns else pd.Series([False]*len(candidates), index=candidates.index)
    if fuel_mask.any():
        candidates = candidates[fuel_mask]
    # If we have any candidate rows, pick the first non-null pct_col
    if not candidates.empty:
        non_null = candidates[candidates[pct_col].notna()]
        if not non_null.empty:
            return float(non_null.iloc[0][pct_col])
        val = candidates.iloc[0].get(pct_col)
        return float(val) if pd.notna(val) else 0.0
    # Fallbacks: try any row with the fuel type across the whole table
    if prog_fuel_col in programs.columns:
        fuel_rows = programs[programs[prog_fuel_col].apply(lambda x: fuel_matches(x, desired_fuel))]
        if not fuel_rows.empty:
            non_null = fuel_rows[fuel_rows[pct_col].notna()]
            if not non_null.empty:
                return float(non_null.iloc[0][pct_col])
            val = fuel_rows.iloc[0].get(pct_col)
            return float(val) if pd.notna(val) else 0.0
    # Last resort: use first row's pct_col if present
    if pct_col in programs.columns and not programs.empty:
        val = programs.iloc[0].get(pct_col)
        return float(val) if pd.notna(val) else 0.0
    return 0.0

# Compute percent columns for each row in df and then multiply by incremental cost
def row_prog_costs(row):
    prog_name = row.get(df_program_col) if df_program_col else None
    elec_util = row.get(electric_utility_col) if electric_utility_col else None
    gas_util = row.get(gas_utility_col) if gas_utility_col else None
    # lookup percents
    elec_pct = lookup_program_pct(prog_name, elec_util, 'electric')
    gas_pct = lookup_program_pct(prog_name, gas_util, 'natural_gas')
    other_pct = lookup_program_pct(prog_name, None, 'other')
    # convert whole-number percents (> 1) to fractions
    if elec_pct > 1:
        elec_pct = elec_pct / 100.0
    if gas_pct > 1:
        gas_pct = gas_pct / 100.0
    if other_pct > 1:
        other_pct = other_pct / 100.0
    cost = row.get('measure_incremental_cost', 0.0)
    return pd.Series({
        'electric_utility_nonmeasure_program_cost': cost * elec_pct,
        'natural_gas_utility_nonmeasure_program_costs': cost * gas_pct,
        'nonutility_nonmeasure_program_costs': cost * other_pct
    })

# Apply to df (creates the three columns)
prog_costs_df = df.apply(row_prog_costs, axis=1)
df = pd.concat([df, prog_costs_df], axis=1)

# If downstream code expects these exact column names to exist (even if zeros), ensure they do
for col in ['electric_utility_nonmeasure_program_cost', 'natural_gas_utility_nonmeasure_program_costs', 'nonutility_nonmeasure_program_costs']:
    if col not in df.columns:
        df[col] = 0.0
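
The percent handling in `row_prog_costs` works value by value: anything above 1 is read as a whole-number percent. A minimal sketch of that rule on its own; `as_fraction` and `nonmeasure_program_cost` are hypothetical names and the inputs are illustrative:

```python
def as_fraction(pct):
    """Read values above 1 as whole-number percents (30 -> 0.30); leave others as-is."""
    return pct / 100.0 if pct > 1 else pct


def nonmeasure_program_cost(incremental_cost, pct):
    """Non-measure program cost as a share of the incremental cost."""
    return incremental_cost * as_fraction(pct)


print(nonmeasure_program_cost(600.0, 20))   # 20 is read as 20%
print(nonmeasure_program_cost(600.0, 0.2))  # 0.2 is read as 20%
```

The rule cannot tell a true 0.5% apart from 50%, or 1% from 100%, so input tables should stick to one convention. The incentives cell differs slightly: it rescales a whole column as soon as any value in it exceeds 1.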


In [83]:
## Risk Discount
# Goes in the benefits section.
# The Risk Discount Factor is applied to the incremental installed cost, any operation and maintenance costs,
# and any deferred replacement credit (for early-retirement retrofits).

# Input from the Global Inputs sheet in the workpapers.

# Characterized as a benefit
risk_discount = 0.02
# Each of these is already the total for the measure over its EUL.
# Treat NAs as zero in the calculation.
df['risk_discount_value'] = (df['measure_incremental_cost'].fillna(0) + df['o_and_m_savings_value'].fillna(0) + df["deferred_replacement_credit_value"].fillna(0)) * risk_discount

# For each efficiency measure, risk discount costs are calculated by applying the Risk Discount Factor to the incremental installed cost,
# any operation and maintenance costs, and any deferred replacement credit (for early-retirement retrofits).
# The societal or total resource cost-effectiveness test costs are added to the societal or total resource benefits.
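
A worked example of the risk discount arithmetic, treating missing components as zero as the cell above does with `fillna(0)`. This is a sketch with illustrative dollar amounts; `risk_discount_value` is a hypothetical standalone function, not the notebook's column calculation:

```python
import math


def risk_discount_value(incremental_cost, oandm_value, deferred_credit, factor=0.02):
    """Apply the risk discount factor to the sum of the three lifetime components."""
    parts = (incremental_cost, oandm_value, deferred_credit)
    # Missing components (None or NaN) contribute zero
    total = sum(0.0 if p is None or math.isnan(p) else p for p in parts)
    return total * factor


# 1000 incremental cost + 200 lifetime O&M value, no deferred replacement credit
value = risk_discount_value(1000.0, 200.0, float("nan"))
print(round(value, 2))
```

Here (1000 + 200 + 0) * 0.02 gives a risk discount value of 24.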

In [84]:
def compute_ghg_lifetime_savings(row, ghg_type, suffix):
    """
    Compute total GHG savings value across measure lifetime.
    GHG costs are already NPVed in avoided_costs, so we sum across years.
    """
    measure_life = int(row.get('measure_life_(yrs)', 1))
    elec_savings = row.get('measure_electric_energy_savings', 0)
    
    if pd.isna(elec_savings) or elec_savings == 0:
        return 0.0
    
    # Get the loadshape vector for this measure's efficient_condition
    efficient_condition = row.get('efficient_condition', None)
    if efficient_condition and 'condition_name' in loadshapes.columns:
        loadshape_match = loadshapes[loadshapes['condition_name'] == efficient_condition]
        if not loadshape_match.empty:
            loadshape_row = loadshape_match.iloc[0]
        else:
            loadshape_row = loadshapes.iloc[0]
    else:
        loadshape_row = loadshapes.iloc[0]
    
    # Find columns for this gas type
    ghg_cols = [col for col in avoided_costs.columns if ghg_type in col.lower()]
    avoided_costs_ghg = avoided_costs[ghg_cols].copy()
    avoided_costs_ghg.columns = [col.replace(suffix, '') for col in avoided_costs_ghg.columns]
    
    # Find common columns
    common = list(loadshapes.columns.intersection(avoided_costs_ghg.columns).intersection(line_losses.columns))
    
    # Get loadshape values for common columns
    loadshape_vector = loadshape_row[common].astype(float).fillna(0)
    
    # Get line losses
    losses = line_losses.loc[0, common].astype(float).fillna(0)
    
    total_cost = 0.0
    base_year = avoided_costs['year'].min()
    
    # For each year in the measure life
    for year_offset in range(1, measure_life + 1):
        target_year = base_year + year_offset - 1
        
        # Get avoided costs for this year (or use last available if beyond data)
        if target_year in avoided_costs['year'].values:
            year_costs = avoided_costs_ghg[avoided_costs['year'] == target_year][common].iloc[0].astype(float).fillna(0)
        else:
            # Use last available year if beyond data range
            year_costs = avoided_costs_ghg[common].iloc[-1].astype(float).fillna(0)
        
        # Compute weighted sum for this year (loadshape * (1 + line_loss) * avoided_cost)
        adjusted_weights = year_costs * (1 + losses)
        year_value = loadshape_vector.dot(adjusted_weights)
        
        # Sum up total cost (no discounting since avoided_costs are already NPVed)
        total_cost += elec_savings * year_value
    
    return total_cost

# Calculate GHG savings for each gas type
for ghg, suffix in [('carbon', '_usdperkwh_carbon'), ('n2o', '_usdperkwh_n2o'), ('ch4', '_usdperkwh_ch4')]:
    df[f'electric_{ghg}_savings_value'] = df.apply(
        lambda row: compute_ghg_lifetime_savings(row, ghg, suffix),
        axis=1
    )
df

Unnamed: 0,measure_name,sector,program,electric_utility,gas_utility,market,baseline_condition,efficient_condition,building_type,measure_life_(yrs),...,natural_gas_utility_incentive_value,nonutility_incentive_value,total_incentive_value,electric_utility_nonmeasure_program_cost,natural_gas_utility_nonmeasure_program_costs,nonutility_nonmeasure_program_costs,risk_discount_value,electric_carbon_savings_value,electric_n2o_savings_value,electric_ch4_savings_value
0,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,900.0,600.0,2100.0,120.0,360.0,360.0,25.726041,0.0,0.0,0.0
1,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family,10,...,900.0,600.0,2100.0,120.0,360.0,360.0,25.726041,0.0,0.0,0.0
2,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family_li,10,...,900.0,600.0,2100.0,120.0,360.0,360.0,25.726041,0.0,0.0,0.0
3,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family,10,...,900.0,600.0,2100.0,120.0,360.0,360.0,25.726041,0.0,0.0,0.0
4,refrigerator_electricity_efficient_residential...,residential,LIRNC,test_utility_1,test_utility_2,NC,refrigerator_electricity_baseline_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,300.0,200.0,700.0,40.0,120.0,120.0,9.726041,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
123,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,ROB,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family,10,...,350.0,350.0,700.0,210.0,210.0,210.0,15.726041,0.0,0.0,0.0
124,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family_li,10,...,350.0,350.0,700.0,210.0,210.0,210.0,15.726041,0.0,0.0,0.0
125,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family,10,...,350.0,350.0,700.0,210.0,210.0,210.0,15.726041,0.0,0.0,0.0
126,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family_li,10,...,350.0,350.0,700.0,210.0,210.0,210.0,15.726041,0.0,0.0,0.0
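
The per-year weighting inside `compute_ghg_lifetime_savings` is loadshape * (1 + line loss) * avoided cost, summed over costing periods and then over the measure life, with the last available year reused once the data runs out. A pure-Python sketch of that calculation; the period names, years, and values are made up for illustration, and `ghg_lifetime_value` is a hypothetical name:

```python
def ghg_lifetime_value(kwh_saved, loadshape, losses, costs_by_year, measure_life):
    """Sum kWh * loadshape * (1 + loss) * avoided cost over periods and years.

    `costs_by_year` maps year -> {period: usd per kWh}; costs are assumed to be
    already discounted, so no further discounting is applied here.
    """
    years = sorted(costs_by_year)
    base_year = years[0]
    total = 0.0
    for offset in range(measure_life):
        year = base_year + offset
        # Reuse the last available year when the measure life outruns the data
        costs = costs_by_year.get(year, costs_by_year[years[-1]])
        weighted = sum(loadshape[p] * (1 + losses[p]) * costs[p] for p in loadshape)
        total += kwh_saved * weighted
    return total


loadshape = {"summer_peak": 0.25, "summer_offpeak": 0.75}
losses = {"summer_peak": 0.10, "summer_offpeak": 0.05}
costs_by_year = {
    2025: {"summer_peak": 0.02, "summer_offpeak": 0.01},
    2026: {"summer_peak": 0.03, "summer_offpeak": 0.02},
}
total_value = ghg_lifetime_value(1000.0, loadshape, losses, costs_by_year, 3)
print(round(total_value, 3))
```

With these inputs the yearly weights are 0.013375 for 2025 and 0.024 for 2026 (used twice), so 1000 kWh of savings is worth 61.375.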


In [85]:
# Externalities (GHG) calculations for all fuel types
# For each GHG type (carbon, n2o, ch4), process all fuel-type columns and compute lifetime NPV savings

# Fuel type configurations: (fuel_key, measure_column_name, sector_columns_map)
fuel_ghg_configs = [
    ('natural_gas', 'measure_natural_gas_savings', {
        'res': 'natural_gas',
        'com': 'natural_gas',
        'ind': 'natural_gas'
    }),
    ('fuel_oil', 'measure_fuel_oil_savings', {
        'res': 'res_fuel_oil',
        'com': 'com_fuel_oil',
        'ind': 'ind_fuel_oil'
    }),
    ('propane', 'measure_propane_savings', {
        'res': 'res_propane',
        'com': 'com_propane',
        'ind': 'ind_propane'
    }),
    ('diesel_transportation', 'measure_diesel_savings', {
        'res': 'diesel_transportation',
        'com': 'diesel_transportation',
        'ind': 'diesel_transportation'
    }),
    ('gasoline_transportation', 'measure_gasoline_savings', {
        'res': 'gasoline_transportation',
        'com': 'gasoline_transportation',
        'ind': 'gasoline_transportation'
    })
]

# Iterate over each GHG type (carbon, n2o, ch4)
for ghg, suffix in [('carbon', '_usdpermmbtu_carbon'), ('n2o', '_usdpermmbtu_n2o'), ('ch4', '_usdpermmbtu_ch4')]:
    # Find all columns for this GHG type in avoided_costs
    ghg_cols = [col for col in avoided_costs.columns if ghg in col.lower()]
    avoided_costs_ghg_fuel = avoided_costs[['year'] + ghg_cols].copy()
    # Remove suffix from column names to get fuel identifiers
    avoided_costs_ghg_fuel.columns = ['year'] + [col.replace(suffix, '') for col in ghg_cols]
    
    # Process each fuel type
    for fuel_type, measure_col, sector_columns in fuel_ghg_configs:
        output_col = f'{fuel_type}_{ghg}_savings_value'
        
        def calculate_fuel_ghg_savings(row):
            """Calculate lifetime NPV of GHG savings for a fuel type across sectors"""
            # Determine sector from the 'sector' column (case-insensitive)
            sector = None
            sector_lower = str(row.get('sector', '')).lower()
            
            if 'res' in sector_lower:
                sector = 'res'
            elif 'com' in sector_lower:
                sector = 'com'
            elif 'ind' in sector_lower:
                sector = 'ind'
            else:
                return 0  # No matching sector
            
            # Get the appropriate fuel column key for this sector
            fuel_col_key = sector_columns.get(sector)
            
            # Check if the fuel column exists in avoided_costs_ghg_fuel
            if fuel_col_key not in avoided_costs_ghg_fuel.columns:
                return 0
            
            # Check if measure savings column exists and has a non-zero value
            if measure_col not in row.index or pd.isna(row[measure_col]):
                return 0
            
            measure_savings = row[measure_col]
            if measure_savings == 0:
                return 0
            
            # Calculate lifetime NPV using the avoided cost series for this fuel and GHG
            try:
                avoided_cost_series = avoided_costs_ghg_fuel.set_index('year')[fuel_col_key]
                lifetime_npv = calculate_lifetime_savings(
                    measure_savings,
                    avoided_cost_series,
                    int(row['measure_life_(yrs)'])
                )
                return lifetime_npv
            except Exception as e:
                # Debug: print the error if needed
                # print(f"Error in {fuel_type}_{ghg}: {e}")
                return 0
        
        # Apply the calculation for this fuel+GHG combination
        df[output_col] = df.apply(calculate_fuel_ghg_savings, axis=1)

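# Aggregate all externalities into a single column (sum across all fuels and GHGs).
# Note: the suffix match also picks up the electric_*_savings_value columns computed in the previous cell.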
externality_cols = [col for col in df.columns if col.endswith(('_carbon_savings_value', '_n2o_savings_value', '_ch4_savings_value'))]
df["end_use_fuel_externalities"] = df[externality_cols].sum(axis=1) if externality_cols else 0
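
The suffix match used for `externality_cols` selects every column ending in one of the three GHG suffixes, whatever its prefix. A small sketch of that selection with an optional switch to leave out the electric columns; `externality_columns` and its `include_electric` flag are hypothetical and shown only to make the matching behavior explicit:

```python
GHG_SUFFIXES = ("_carbon_savings_value", "_n2o_savings_value", "_ch4_savings_value")


def externality_columns(columns, include_electric=True):
    """Return GHG savings columns, optionally leaving out the electric_* ones."""
    picked = [c for c in columns if c.endswith(GHG_SUFFIXES)]
    if not include_electric:
        picked = [c for c in picked if not c.startswith("electric_")]
    return picked


example_columns = [
    "electric_carbon_savings_value",
    "natural_gas_carbon_savings_value",
    "propane_ch4_savings_value",
    "water_savings_value",
]
print(externality_columns(example_columns))
print(externality_columns(example_columns, include_electric=False))
```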

In [86]:
# Other nonresource benefits: a generic adder (in Penn this was DRIPE).
# This is the fill-in for the MeasNonResource tab.
# Nonresource benefits are matched by utility and NPV'd over the measure lifetime at the real discount rate.

nonresource_benefits = pd.read_excel("010_input/16_Measnonresource.xlsx")
clean_column_names(nonresource_benefits)

# Detect utility column in nonresource_benefits
nrb_utility_col = None
for c in ['utility', 'utility_name', 'provider']:
    if c in nonresource_benefits.columns:
        nrb_utility_col = c
        break

# Find the benefit value column (look for 'nonresource' and 'benefit')
nrb_value_col = None
for c in nonresource_benefits.columns:
    low = c.lower()
    if 'nonresource' in low and 'benefit' in low:
        nrb_value_col = c
        break
# Fallback: look for 'benefit' together with 'other'
if nrb_value_col is None:
    for c in nonresource_benefits.columns:
        low = c.lower()
        if 'benefit' in low and 'other' in low:
            nrb_value_col = c
            break
# Fallback: try exact name
if nrb_value_col is None and 'other_nonresource_benefit' in nonresource_benefits.columns:
    nrb_value_col = 'other_nonresource_benefit'

# Detect utility columns in main df (electric_utility and gas_utility)
electric_utility_col = None
for c in ['electric_utility', 'elec_utility', 'electricutility', 'electric_provider']:
    if c in df.columns:
        electric_utility_col = c
        break
gas_utility_col = None
for c in ['gas_utility', 'natural_gas_utility', 'gasutility', 'gas_provider']:
    if c in df.columns:
        gas_utility_col = c
        break

# Helper function to lookup nonresource benefit for a given utility (with fallbacks)
def get_nonresource_benefit_value(utility_name):
    if nrb_value_col is None or nonresource_benefits.empty:
        return 0.0
    # Try exact match on utility if utility_col exists
    if nrb_utility_col is not None and utility_name is not None:
        match = nonresource_benefits[nonresource_benefits[nrb_utility_col] == utility_name]
        if not match.empty and pd.notna(match.iloc[0].get(nrb_value_col)):
            return float(match.iloc[0][nrb_value_col])
    # Fallback: use first row's value if available
    if not nonresource_benefits.empty:
        val = nonresource_benefits.iloc[0].get(nrb_value_col)
        return float(val) if pd.notna(val) else 0.0
    return 0.0

# Function to compute NPV of nonresource benefit across measure lifetime
def compute_npv_nonresource_benefit(row):
    """
    For each row in df:
    1. Look up nonresource benefit value for electric_utility
    2. Look up nonresource benefit value for gas_utility
    3. If they are different utilities, sum both; otherwise use single value
    4. Discount each year's benefit by (1 + 0.03)^year and sum across measure life
    """
    measure_life = int(row.get('measure_life_(yrs)', 1))
    elec_util = row.get(electric_utility_col) if electric_utility_col else None
    gas_util = row.get(gas_utility_col) if gas_utility_col else None
    
    # Lookup benefit values for each utility
    elec_benefit = get_nonresource_benefit_value(elec_util)
    gas_benefit = get_nonresource_benefit_value(gas_util)
    
    # If utilities differ, sum both benefits; otherwise just use one
    if elec_util != gas_util:
        total_annual_benefit = elec_benefit + gas_benefit
    else:
        # Same utility, use just one (avoid double-counting)
        total_annual_benefit = elec_benefit if elec_benefit != 0 else gas_benefit
    
    # NPV across measure lifetime with real_discount_rate = 0.03
    npv = sum(
        total_annual_benefit / ((1 + real_discount_rate) ** year)
        for year in range(1, measure_life + 1)
    )
    return npv

# Apply to df
df['other_nonresource_benefits'] = df.apply(compute_npv_nonresource_benefit, axis=1)
df

Unnamed: 0,measure_name,sector,program,electric_utility,gas_utility,market,baseline_condition,efficient_condition,building_type,measure_life_(yrs),...,propane_n2o_savings_value,diesel_transportation_n2o_savings_value,gasoline_transportation_n2o_savings_value,natural_gas_ch4_savings_value,fuel_oil_ch4_savings_value,propane_ch4_savings_value,diesel_transportation_ch4_savings_value,gasoline_transportation_ch4_savings_value,end_use_fuel_externalities,other_nonresource_benefits
0,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,0,0,0,0,0,0,0,0,0.0,8.746824
1,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family,10,...,0,0,0,0,0,0,0,0,0.0,8.746824
2,refrigerator_electricity_efficient_residential...,residential,LIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,multi_family_li,10,...,0,0,0,0,0,0,0,0,0.0,8.746824
3,refrigerator_electricity_efficient_residential...,residential,NLIRRet,test_utility_1,test_utility_2,RET_ER,refrigerator_electricity_existing_residential,refrigerator_electricity_efficient_residential,single_family,10,...,0,0,0,0,0,0,0,0,0.0,8.746824
4,refrigerator_electricity_efficient_residential...,residential,LIRNC,test_utility_1,test_utility_2,NC,refrigerator_electricity_baseline_residential,refrigerator_electricity_efficient_residential,single_family_li,10,...,0,0,0,0,0,0,0,0,0.0,8.746824
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
123,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,ROB,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family,10,...,0,0,0,0,0,0,0,0,0.0,17.060406
124,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,single_family_li,10,...,0,0,0,0,0,0,0,0,0.0,17.060406
125,refrigerator_electricity_top10_residential_ref...,residential,NLIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family,10,...,0,0,0,0,0,0,0,0,0.0,17.060406
126,refrigerator_electricity_top10_residential_ref...,residential,LIRRepl,none,test_utility_3,RENO,refrigerator_electricity_baseline_residential,refrigerator_electricity_top10_residential,multi_family_li,10,...,0,0,0,0,0,0,0,0,0.0,17.060406


In [87]:
retail_rates = pd.read_excel('010_input/17_retail_rates.xlsx')
clean_column_names(retail_rates)
# keep only retail price related columns and year
cols = [c for c in retail_rates.columns if ('retail_price' in c.lower()) or ('retail_prices' in c.lower()) or ('year' in c.lower())]
retail_rates = retail_rates[cols].copy()

dollar_columns = [col for col in retail_rates.columns if 'usd' in col.lower()]
retail_rates['base_year'] = retail_rates['year'].min()
# Apply discount formula to each dollar column
for col in dollar_columns:
    retail_rates[col] = retail_rates[col] / ((1 + real_discount_rate) ** (retail_rates['year'] - retail_rates['base_year']))
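
As a quick check on the discounting step above, here is a minimal sketch on a toy table. The rate of 0.03, the flat price of 0.20, and the names `example_rate` and `example` are assumptions made up for illustration; they are not values from 17_retail_rates.xlsx or the notebook's `real_discount_rate`.

```python
import pandas as pd

# Hypothetical inputs for illustration only
example_rate = 0.03  # assumed real discount rate
example = pd.DataFrame({
    "year": [2025, 2026, 2027],
    "electricity_retail_prices_res_usdperkwh": [0.20, 0.20, 0.20],
})

base_year = example["year"].min()
# The discount factor grows with the number of years past the base year
discount_factor = (1 + example_rate) ** (example["year"] - base_year)
example["discounted_price"] = (
    example["electricity_retail_prices_res_usdperkwh"] / discount_factor
)
print(example)
```

The base year is left unchanged (factor of 1), and each later year is divided by one more power of (1 + rate).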


In [88]:
# Retail rates: map sector and fuel to correct retail price columns and compute NPV of customer bill savings
# retail_rates contains columns like:
# electricity_retail_prices_res_usdperkwh, electricity_retail_prices_commercial_usdperkwh, electricity_retail_prices_industrial_usdperkwh,
# natural_gas_retail_prices_res_usdpermmbtu, natural_gas_retail_prices_commercial_usdpermmbtu, natural_gas_retail_prices_industrial_usdpermmbtu,
# heating_oil_retail_prices_res_usdpermmbtu, heating_oil_retail_prices_commercial_usdpermmbtu, heating_oil_retail_prices_industrial_usdpermmbtu, and year


# Helper maps for expected column patterns
prefix_map = {
    'electric': 'electricity_retail_prices',
    'natural_gas': 'natural_gas_retail_prices',
    'heating_oil': 'heating_oil_retail_prices'
}
# unit suffixes seen in the file (electric kWh, fuels per MMBtu)
suffix_map = { 'electric': '_usdperkwh', 'natural_gas': '_usdpermmbtu', 'heating_oil': '_usdpermmbtu' }

# Robust column finder: try exact constructed name then fallback to contains-based search
def find_retail_col(fuel_key, sector_key):
    # exact name first
    pref = prefix_map.get(fuel_key)
    suf = suffix_map.get(fuel_key)
    if pref is None or suf is None:
        return None
    exact = f'{pref}_{sector_key}{suf}'
    if exact in retail_rates.columns:
        return exact
    # try variants: some files use full 'commercial', others 'com' or 'res', etc.; search by parts
    for col in retail_rates.columns:
        low = col.lower()
        # cleaned column names keep their underscores, so match the prefix as-is
        if pref in low and sector_key in low:
            return col
    # fallback: any column that contains the prefix (ignores sector)
    for col in retail_rates.columns:
        if pref in col.lower():
            return col
    return None

def sector_key_from_sector_text(sector_text):
    s = str(sector_text).lower() if sector_text is not None else ''
    if 'res' in s or 'resident' in s or 'house' in s:
        return 'res'
    if 'com' in s or 'commercial' in s:
        return 'commercial'
    if 'ind' in s or 'industrial' in s:
        return 'industrial'
    # default to error if unknown
    return 'error'

# Build series cache to avoid repeated set_index operations
_retail_series_cache = {}
def get_retail_series(fuel_key, sector_key):
    cache_key = (fuel_key, sector_key)
    if cache_key in _retail_series_cache:
        return _retail_series_cache[cache_key]
    col = find_retail_col(fuel_key, sector_key)
    if col is None:
        _retail_series_cache[cache_key] = None
        return None
    try:
        series = retail_rates.set_index('year')[col]
    except Exception:
        _retail_series_cache[cache_key] = None
        return None
    _retail_series_cache[cache_key] = series
    return series

# For each row, compute lifetime (NPV-like) customer bill savings for electricity, natural gas, heating oil
def compute_retail_savings_row(row):
    life = int(row.get('measure_life_(yrs)', 1)) if pd.notna(row.get('measure_life_(yrs)')) else 1
    sector_txt = row.get('sector', '')
    sector_k = sector_key_from_sector_text(sector_txt)
    out = {'electric_customer_bill_savings': 0.0, 'natural_gas_customer_bill_savings': 0.0, 'heating_oil_customer_bill_savings': 0.0}
    # Electricity
    try:
        elec_sav = row.get('measure_electric_energy_savings', 0)
    except Exception:
        elec_sav = 0
    if pd.notna(elec_sav) and elec_sav != 0:
        series = get_retail_series('electric', sector_k)
        if series is not None:
            out['electric_customer_bill_savings'] = calculate_lifetime_savings(elec_sav, series, life)
    # Natural gas
    try:
        gas_sav = row.get('measure_natural_gas_savings', 0)
    except Exception:
        gas_sav = 0
    if pd.notna(gas_sav) and gas_sav != 0:
        series = get_retail_series('natural_gas', sector_k)
        if series is not None:
            out['natural_gas_customer_bill_savings'] = calculate_lifetime_savings(gas_sav, series, life)
    # Heating oil
    try:
        oil_sav = row.get('measure_fuel_oil_savings', 0)
    except Exception:
        oil_sav = 0
    if pd.notna(oil_sav) and oil_sav != 0:
        series = get_retail_series('heating_oil', sector_k)
        if series is not None:
            out['heating_oil_customer_bill_savings'] = calculate_lifetime_savings(oil_sav, series, life)
    return pd.Series(out)

# Apply and attach to df
retail_savings_df = df.apply(compute_retail_savings_row, axis=1)
df = pd.concat([df, retail_savings_df], axis=1)

# # Quick QA
# print('Retail customer bill savings sample:')
# print(df[['sector','measure_life_(yrs)','measure_electric_energy_savings','electric_customer_bill_savings','measure_natural_gas_savings','natural_gas_customer_bill_savings','measure_fuel_oil_savings','heating_oil_customer_bill_savings']].head())

# # If downstream code expects 'electric_customer_bill_savings' to exist, ensure column exists (even if all zeros)
# for col in ['electric_customer_bill_savings','natural_gas_customer_bill_savings','heating_oil_customer_bill_savings']:
#     if col not in df.columns:
#         df[col] = 0.0
# df
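
The cell above relies on `calculate_lifetime_savings`, which is defined earlier in the notebook and is not shown here. As a rough sketch of the kind of calculation involved, assuming the price series is already discounted and indexed by year, the hypothetical helper below multiplies the annual savings by the price in each of the first `life` years and sums the result. The name `lifetime_bill_savings_sketch` and the toy prices are invented for illustration and may not match the notebook's actual function.

```python
import pandas as pd

def lifetime_bill_savings_sketch(annual_savings, price_series, life):
    """Hypothetical stand-in: sum annual_savings * price over the first `life` years."""
    # Sort by year, then keep only the years within the measure life
    prices = price_series.sort_index().iloc[:life]
    return float((annual_savings * prices).sum())

# Toy discounted prices in usd per kWh, indexed by year (made-up values)
prices = pd.Series([0.20, 0.19, 0.18], index=[2025, 2026, 2027])
total = lifetime_bill_savings_sketch(700, prices, 2)
print(total)
```

If `life` is longer than the price series, this sketch simply uses the years that are available, which would understate the savings; the real function may handle that case differently.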

These are the equations for the four cost tests, plus the three versions of the utility cost test.

In [89]:
# electric utility only cost test
df["EUCT_cost"] = df["electric_utility_nonmeasure_program_cost"] + df["electric_utility_incentive_value"]

df["EUCT_benefit"] = df["electric_energy_savings_value"] + df["electric_demand_savings_value"] 

df["EUCT_BCR"] = df["EUCT_benefit"] / df["EUCT_cost"]

In [90]:
# Natural Gas utility only cost test
df["GUCT_cost"] = df["natural_gas_utility_nonmeasure_program_costs"] + df["natural_gas_utility_incentive_value"]

df["GUCT_benefit"] = df["natural_gas_savings_value"] 

df["GUCT_BCR"] = df["GUCT_benefit"] / df["GUCT_cost"]

In [91]:
# Electric and Natural Gas utility cost test
df["EGUCT_cost"] = df["natural_gas_utility_nonmeasure_program_costs"] + df["natural_gas_utility_incentive_value"] + df["electric_utility_nonmeasure_program_cost"] + df["electric_utility_incentive_value"]

df["EGUCT_benefit"] = df["natural_gas_savings_value"] + df["electric_energy_savings_value"] + df["electric_demand_savings_value"] 

df["EGUCT_BCR"] = df["EGUCT_benefit"] / df["EGUCT_cost"]

In [92]:
# TRC
# Incremental cost for ER is the full cost of the install; when the deferred replacement credit occurs, it balances that out.
# This way the retrofit is properly priced.
df["TRC_cost"] = df["measure_incremental_cost"] + df["electric_utility_nonmeasure_program_cost"] + df["natural_gas_utility_nonmeasure_program_costs"] + df["nonutility_incentive_value"] + df["nonutility_nonmeasure_program_costs"]

df["TRC_benefit"] = df["electric_energy_savings_value"] + df["electric_demand_savings_value"] + df["natural_gas_savings_value"] + df["other_fuel_savings_value"] + df["water_savings_value"] + df["o_and_m_savings_value"] + df["other_nonresource_benefits"] + df["deferred_replacement_credit_value"].fillna(0) + df["risk_discount_value"]
df["TRC_BCR"] = df["TRC_benefit"] / df["TRC_cost"]

In [93]:
# SCT
df["SCT_cost"] = df["measure_incremental_cost"] + df["electric_utility_nonmeasure_program_cost"] + df["natural_gas_utility_nonmeasure_program_costs"] + df["nonutility_incentive_value"] + df["nonutility_nonmeasure_program_costs"]

df["SCT_benefit"] = df["electric_energy_savings_value"] + df["electric_demand_savings_value"] + df["natural_gas_savings_value"] + df["other_fuel_savings_value"] + df["water_savings_value"] + df["o_and_m_savings_value"] + df["other_nonresource_benefits"] + df["deferred_replacement_credit_value"].fillna(0) + df["risk_discount_value"] + df["electric_carbon_savings_value"] + df["electric_n2o_savings_value"] + df["electric_ch4_savings_value"] + df["end_use_fuel_externalities"]
df["SCT_BCR"] = df["SCT_benefit"] / df["SCT_cost"]

In [94]:
# RIM
# Why is this just electric bill savings and not the other fuels?
df["RIM_cost"] = df["electric_utility_incentive_value"] + df["electric_utility_nonmeasure_program_cost"] + df["electric_customer_bill_savings"] + df["natural_gas_utility_incentive_value"] + df["natural_gas_utility_nonmeasure_program_costs"]

df["RIM_benefit"] = df["electric_energy_savings_value"] + df["electric_demand_savings_value"] + df["natural_gas_savings_value"] + df["other_fuel_savings_value"] + df["nonutility_incentive_value"] + df["nonutility_nonmeasure_program_costs"]
df["RIM_BCR"] = df["RIM_benefit"] / df["RIM_cost"]

In [95]:
# PCT
# Double-check the electric customer bill savings term: this probably needs bill savings for all fuels, or at least an option to change it.
df["PCT_cost"] = df["measure_incremental_cost"]
# currently deferred replacement value is a benefit
df["PCT_benefit"] = df["electric_utility_incentive_value"] + df["electric_customer_bill_savings"] + df["electric_energy_savings_value"] + df["electric_demand_savings_value"] + df["natural_gas_savings_value"] + df["natural_gas_utility_incentive_value"] + df["other_fuel_savings_value"] + df["nonutility_incentive_value"] + df["water_savings_value"] + df["o_and_m_savings_value"] + df["other_nonresource_benefits"] + df["deferred_replacement_credit_value"].fillna(0)
df["PCT_BCR"] = df["PCT_benefit"] / df["PCT_cost"]
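
Every ratio above is a plain benefit / cost division, so a row with zero cost yields inf (or NaN when the benefit is also zero). A minimal sketch of one way to guard against that is shown here; `safe_bcr` and the `demo` frame are hypothetical names for illustration and are not part of the notebook.

```python
import pandas as pd

def safe_bcr(benefit, cost):
    """Return benefit / cost, with NaN wherever cost is zero or missing (hypothetical helper)."""
    # Series.where keeps values where the condition holds and inserts NaN elsewhere
    nonzero_cost = cost.where(cost != 0)
    return benefit / nonzero_cost

# Toy rows: a normal case, a zero-cost case, and a zero-benefit, zero-cost case
demo = pd.DataFrame({"benefit": [10.0, 5.0, 0.0], "cost": [4.0, 0.0, 0.0]})
demo["bcr"] = safe_bcr(demo["benefit"], demo["cost"])
print(demo)
```

Whether a zero-cost measure should count as passing, failing, or undefined is a modelling choice; this sketch only marks those rows as NaN so they are easy to find.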

In [96]:
df.to_csv("010_output/measure_costs_benefits.csv", index=False)
df.to_pickle("010_output/measure_costs_benefits.plk")
df.to_csv("020_input/measure_costs_benefits.csv", index=False)
df.to_pickle("020_input/measure_costs_benefits.plk")

This is the output Mike needs to join to the measure database.
The key parts are a boolean pass/fail for each cost test and then the PCT ratio.
Everything will be provided to Mike.
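
A minimal sketch of how the pass/fail flags could be derived from the BCR columns follows. The pass threshold of 1.0, the `_pass` column suffix, and the toy `demo` frame are assumptions; the notebook does not state them.

```python
import pandas as pd

# Assumed threshold: a test passes when benefits are at least equal to costs
PASS_THRESHOLD = 1.0
bcr_columns = ["EUCT_BCR", "GUCT_BCR", "EGUCT_BCR", "TRC_BCR",
               "SCT_BCR", "RIM_BCR", "PCT_BCR"]

# Toy frame standing in for df (made-up values)
demo = pd.DataFrame({col: [1.2, 0.8] for col in bcr_columns})
demo["measure_name"] = ["measure_a", "measure_b"]

# Keep the identifier and the PCT ratio, then add one boolean per cost test
flags = demo[["measure_name", "PCT_BCR"]].copy()
for col in bcr_columns:
    test_name = col.replace("_BCR", "")
    flags[f"{test_name}_pass"] = demo[col] >= PASS_THRESHOLD
print(flags)
```

Note that a NaN ratio compares as False here, so undefined ratios would be reported as failing unless they are handled separately.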