## Planned Resources - Installed Capacity - CAISO


The installed capacity of the CAISO system comes from the following types of resources:
1. Fossil Steam
2. Hydro
3. Nuclear
4. Renewable
5. Storage
6. Combustion Turbine
7. Curtailable load

Of the above resources:
- Firm capacity is provided by Fossil Steam, Hydro, Nuclear, Storage and Combustion Turbine resources.
- Variable capacity is provided by Renewable sources (wind and solar), and it also includes the DSM programs and demand response covered by Curtailable load.
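
The split above can be written as a small lookup. This is a minimal sketch: the single-letter codes are the `Unit Type` codes decoded later in this notebook, while `UNIT_TYPE_NAMES`, `FIRM_TYPES` and `capacity_class` are illustrative names and not part of the dataset.

```python
# Unit Type codes as decoded later in this notebook.
UNIT_TYPE_NAMES = {
    "F": "Fossil Steam",
    "H": "Hydro",
    "N": "Nuclear",
    "R": "Renewable",
    "S": "Storage",
    "T": "Combustion Turbine",
    "C": "Curtailable load",
}

# Firm types per the list above; every other type is treated as variable.
FIRM_TYPES = {"Fossil Steam", "Hydro", "Nuclear", "Storage", "Combustion Turbine"}


def capacity_class(unit_type_code: str) -> str:
    """Return 'firm' or 'variable' for a single-letter Unit Type code."""
    name = UNIT_TYPE_NAMES[unit_type_code]
    return "firm" if name in FIRM_TYPES else "variable"


print(capacity_class("S"))  # firm
print(capacity_class("R"))  # variable
```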


This study analyses the CAISO resources that contribute to installed capacity from 2023 onwards, treating firm and variable capacity resources separately.

The CPUC website provides two datasets with information on these resources:
* [Dataset 1](https://files.cpuc.ca.gov/energy/modeling/2023PSP/SERVM_GeneratorList_20231005.xlsx)
* [Dataset 2](https://www.cpuc.ca.gov/-/media/cpuc-website/divisions/energy-division/documents/integrated-resource-plan-and-long-term-procurement-plan-irp-ltpp/2022-irp-cycle-events-and-materials/generatorlist_20230626.xlsx)


In [1]:
import pandas as pd

In [2]:
import altair as alt

In [3]:
data_gen = pd.read_excel('SERVM_GeneratorList_20231005.xlsx', header=1, sheet_name='GenList20231005')

In [4]:
data_gen['insvdt'] = pd.to_datetime(data_gen['insvdt'], format='mixed')
data_gen['retdt'] = pd.to_datetime(data_gen['retdt'], format='%Y%m%d')

The dataset covers both CAISO and non-CAISO resources, and flags each resource with one of the following statuses:
* Online = CAISO unit that is online
* Not online as of 1/2023 = CAISO project that is not yet online as of 1/2023 regardless of insvdt value
* Terminated = indicates a project has been terminated
* Non-CAISO = unit primarily serves non-CAISO region


In [5]:
data_gen = data_gen.assign(service_year = data_gen['insvdt'].dt.year)

In [6]:
data_gen = data_gen.assign(retirement_year = data_gen['retdt'].dt.year)

In [7]:
data_gen.replace({'Unit Type':{'F':'Fossil Steam',
                               'H':'Hydro',
                               'N':'Nuclear',
                               'R':'Renewable',
                               'S':'Storage',
                               'T':'Combustion Turbine',
                               'C':'Curtailable load'}}, inplace = True)

In [8]:
data_gen['Flag for CAISO-contracted and either Not online as of 1/2023 or terminated as of 1/2023'].unique()

array(['Non-CAISO', 'Not online as of 1/2023', 'Online', 'Terminated'],
      dtype=object)


The projects that are terminated are excluded from further analysis


In [9]:
data_gen_online = data_gen[(data_gen['Flag for CAISO-contracted and either Not online as of 1/2023 or terminated as of 1/2023'] != 'Terminated')]

To identify the existing and committed resources that are online in 2023, the dataset is filtered on the Online flag.

In [10]:
data_gen_online = data_gen[(data_gen['Flag for CAISO-contracted and either Not online as of 1/2023 or terminated as of 1/2023'] == 'Online')]

Bar charts of capacity by in-service year and by retirement year give more insight into the existing resources. They show that EV-DSM capacity with an in-service year of 2025 is included among the resources flagged as contributing to capacity in 2023.

In [11]:
alt.Chart(data_gen_online).mark_bar(width = 10).encode(
alt.X('service_year:N', title=''),
alt.Y('sum(Capmax)', title='Capacity')).properties(width=1000)

In [12]:
alt.Chart(data_gen_online).mark_bar(width = 10).encode(
alt.X('retirement_year:N', title=''),
alt.Y('sum(Capmax)', title='Capacity')).properties(width=1000)

Based on the bar charts above, the resources with an in-service year of 2025 are moved to the future resources section.

In [13]:
data_online_2025 = data_gen_online[(data_gen_online['service_year'] == 2025)]

In [14]:
data_gen_online = data_gen_online[(data_gen_online['service_year'] != 2025)]

The online resources are then split into firm and variable resources to obtain the dependable and variable capacities.

In [15]:
firm_resource_list = ['Fossil Steam','Hydro','Nuclear','Storage',
                               'Combustion Turbine']

In [16]:
variable_resource_list = ['Renewable','Curtailable load']

In [17]:
data_firm =  data_gen_online[data_gen_online['Unit Type'].isin(firm_resource_list)] 

In [18]:
data_variable = data_gen_online[data_gen_online['Unit Type'].isin(variable_resource_list)] 

In [19]:
data_firm_serv = data_firm.groupby('service_year').agg({'Capmax':'sum'}).reset_index()

In [20]:
data_firm_serv['Capmax'].sum()

45005.8154132

In [21]:
data_variable_serv = data_variable.groupby('service_year').agg({'Capmax':'sum'}).reset_index()

In [22]:
data_variable_serv['Capmax'].sum()

52672.009829129995

As of 2023, there is a firm installed capacity of 45005.81 MW and a variable installed capacity of 52672 MW available.

Planned resources are the resources flagged as not online as of 1/2023, plus the 3 resources with an in-service year of 2025 that were miscategorised as Online.

In [23]:
data_gen_planned = data_gen[(data_gen['Flag for CAISO-contracted and either Not online as of 1/2023 or terminated as of 1/2023'] == 'Not online as of 1/2023')]

In [24]:
data_gen_plan = pd.concat([data_gen_planned, data_online_2025], axis = 0)

In [25]:
data_gen_plan['CAISO-contracted'].unique()

array(['CAISO'], dtype=object)

As with the resources online as of 2023, the planned resources are separated into firm and variable resources.

In [26]:
data_plan_firm = data_gen_plan[data_gen_plan['Unit Type'].isin(firm_resource_list)] 

In [27]:
data_plan_firm['Capmax'].sum()

45986.62098317999

In [28]:
data_plan_variable = data_gen_plan[data_gen_plan['Unit Type'].isin(variable_resource_list)]

In [29]:
data_plan_variable['Capmax'].sum()

60855.80819920001

The initial analysis shows that the CAISO system plans additional firm capacity of 45986.6 MW and additional variable capacity of 60855.8 MW.

#### Dependable Capacity Analysis 

The dependable (firm) capacity analysis is carried out first.

As already seen, 45005.81 MW of dependable capacity is installed as of 2023. The planned firm capacity is analysed together with the retiring capacity to obtain the net capacity available each year from 2023 onwards.

The planned firm capacity is plotted by in-service year to get more insight.

In [30]:
alt.Chart(data_plan_firm).mark_bar(width = 10).encode(
alt.X('service_year:N', title=''),
alt.Y('sum(Capmax)', title='Capacity')).properties(width=800)

Although some planned resources have an in-service year of 2023 or earlier, they were not online as of 2023 and hence need further analysis. Many resources are also listed as coming into service in 2023, and their total installed capacity is high enough to warrant further investigation. The resources are therefore compared with another dataset that has information on their in-service dates.

In [31]:
data_plan_firm_upto_2022 = data_plan_firm[data_plan_firm['service_year']<2023]

In [32]:
alt.Chart(data_plan_firm_upto_2022).mark_bar(width = 10).encode(
alt.X('service_year:N', title=''),
alt.Y('sum(Capmax)', title='Capacity')).properties(width=800)

The chart above shows the planned capacity of resources whose in-service year was before 2023 but that are still not online as of 2023.

In [33]:
data_new_gen_info = pd.read_excel('GeneratorList_20230626.xlsx', header = 1)

The first dataset alone does not give complete information on the date or year in which some resources will come online, so a second dataset, with more information on the years in which planned resources contribute to annual capacity, is read for further analysis.

In [34]:
data_plan_firm_upto_2022.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='inner', indicator=True).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14 entries, 0 to 13
Data columns (total 67 columns):
 #   Column                                                                                   Non-Null Count  Dtype         
---  ------                                                                                   --------------  -----         
 0   Baseline/LSE Planned 25 MMT/LSE Planned 30 MMT                                           14 non-null     object        
 1   CAISO-contracted                                                                         14 non-null     object        
 2   Region_x                                                                                 14 non-null     object        
 3   Unit Name                                                                                14 non-null     object        
 4   Unit Type_x                                                                              14 non-null     object        
 5   Unit Category_x  


The merged dataset does not provide any new information on when the resources that were supposed to come online before 2023 will actually come online.

A similar merge was carried out for the resources that were supposed to come online in 2023. The dataframe information shows that additional in-service information is available for 46 of these resources: each has non-empty values for its contribution to annual capacity in 2024, 2026, 2030 and 2035.


In [35]:
data_firm_plan_2023 = data_plan_firm[data_plan_firm['service_year']==2023]

In [36]:
data_firm_plan_2023.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='inner', indicator=True).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 79 entries, 0 to 78
Data columns (total 67 columns):
 #   Column                                                                                   Non-Null Count  Dtype         
---  ------                                                                                   --------------  -----         
 0   Baseline/LSE Planned 25 MMT/LSE Planned 30 MMT                                           79 non-null     object        
 1   CAISO-contracted                                                                         79 non-null     object        
 2   Region_x                                                                                 79 non-null     object        
 3   Unit Name                                                                                79 non-null     object        
 4   Unit Type_x                                                                              79 non-null     object        
 5   Unit Category_x  

In [37]:
data_firm_plan_after_2023 = data_plan_firm[data_plan_firm['service_year'] > 2023]

In [38]:
data_firm_plan_after_2023.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='inner', indicator=True).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23 entries, 0 to 22
Data columns (total 67 columns):
 #   Column                                                                                   Non-Null Count  Dtype         
---  ------                                                                                   --------------  -----         
 0   Baseline/LSE Planned 25 MMT/LSE Planned 30 MMT                                           23 non-null     object        
 1   CAISO-contracted                                                                         23 non-null     object        
 2   Region_x                                                                                 23 non-null     object        
 3   Unit Name                                                                                23 non-null     object        
 4   Unit Type_x                                                                              23 non-null     object        
 5   Unit Category_x  


Merging the datasets for the resources whose in-service year is after 2023 shows that none of them has a value for annual capacity in 2024, 2026, 2030 or 2035.


In [39]:
data_firm_plan_update = data_plan_firm.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='left', indicator=True)

Based on these observations, two groups of planned resources are excluded, for now, from contributing to installed capacity: resources that were due online before 2023 but are still not online, and resources that were due online in 2023, are not yet online, and have no further information in the merged dataset.
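
The rule described above for planned firm resources can be sketched as a small predicate. This is an illustration only: `firm_plan_bucket` and its labels are hypothetical names, `None` stands in for an empty `2024 Annual` value, and the 120.0 MW figure is made up.

```python
from typing import Optional


def firm_plan_bucket(service_year: int, annual_2024: Optional[float]) -> str:
    """Classify one planned firm resource (illustrative helper)."""
    if service_year > 2023:
        return "dated"     # counted in its own in-service year
    if service_year == 2023 and annual_2024 is not None:
        return "staged"    # counted through the 2024/2026/2030/2035 annual columns
    return "excluded"      # no usable online date, set aside for now


print(firm_plan_bucket(2021, None))   # excluded
print(firm_plan_bucket(2023, None))   # excluded
print(firm_plan_bucket(2023, 120.0))  # staged
print(firm_plan_bucket(2026, None))   # dated
```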


As mentioned earlier, and as is evident from the plot, a large number of resources are listed as coming online in 2023. Because no further information is available for these planned resources, they are set aside for later investigation and are not counted as adding capacity to the system for now. Their distribution by unit category is shown in the bar chart below for future consideration.

In [40]:
mask_2023_no_info = data_firm_plan_update['2024 Annual'].isnull() & (data_firm_plan_update['insvdt_x'].dt.year == 2023)
data_plan_2023_exclude = data_firm_plan_update[mask_2023_no_info].groupby('Unit Category_x').agg({'Capmax_x':'sum'}).reset_index()


In [41]:
alt.Chart(data_plan_2023_exclude).mark_bar(width = 20).encode(
alt.X('Unit Category_x', title = 'Unit Category'),
alt.Y('Capmax_x', title = 'Capacity')).properties(width = 1000)

In [42]:
data_firm_plan_2023_update = data_firm_plan_update[data_firm_plan_update['insvdt_x'].dt.year > 2023]

In [43]:
data_firm_plan_2023 = data_firm_plan_update[(data_firm_plan_update['insvdt_x'].dt.year == 2023) & (~data_firm_plan_update['2024 Annual'].isnull())]

In [44]:
data_serv_plan_firm = data_firm_plan_2023_update.groupby('service_year').agg({'Capmax_x':'sum'}).reset_index()

In [45]:
data_serv_plan_firm

Unnamed: 0,service_year,Capmax_x
0,2024,989.7
1,2025,525.0
2,2026,374.8
3,2028,79.0
4,2030,211.43


In [46]:
alt.Chart(data_serv_plan_firm).mark_bar(width = 20).encode(
    alt.X('service_year:N', title = 'Service Year'), 
    alt.Y('Capmax_x', title = 'Capacity')).properties(width = 700) 

The retirement schedule of the firm capacity that is online as of 2023 is obtained from the data_firm dataset.

In [47]:
data_retd_firm = data_firm.groupby('retirement_year').agg({'Capmax':'sum'}).reset_index()

In [48]:
data_retd_firm_upto_2035 = data_retd_firm[data_retd_firm['retirement_year'] < 2036]

In [49]:
data_retd_plan_firm = data_firm_plan_2023_update.groupby('retirement_year').agg({'Capmax_x':'sum'}).reset_index()

In [50]:
data_retd_plan_firm['retirement_year'].unique()

array([2040, 2050, 2099])

The planned resources have retirement years later than 2035, so their retirement information is not used in the further analysis.

The dataframe data_retd_firm is renamed as data_net_cap_firm for further analysis

In [51]:
data_net_cap_firm = data_retd_firm

In [52]:
data_net_cap_firm = data_net_cap_firm.rename( columns={'retirement_year':'service_year'})


The planned resources whose insvdt_x falls in 2023 but that are not online in 2023 are the resources that come online in stages and contribute to annual capacity in 2024, 2026, 2030 and 2035.

The planned capacity added in these four stages is given below.

In [53]:
columns = [2024, 2026, 2030, 2035]
sum_columns = []
for column in columns:
    column_name = str(column) + ' Annual'
    column_sum = data_firm_plan_2023[column_name].sum()
    sum_columns.append(column_sum)
new_column = []
new_column.append(sum_columns[0])
for i in range(1, len(sum_columns)):
    new_value = sum_columns[i] - sum_columns[i-1]
    new_column.append(new_value)

In [54]:
d = {'service_year':columns,
    'capacity': new_column}
phased_capacity = pd.DataFrame(data = d)

In [55]:
phased_capacity

Unnamed: 0,service_year,capacity
0,2024,3833.28
1,2026,5693.82
2,2030,4831.93
3,2035,6024.97


Merging the three dataframes (retirements, additions by in-service year and staged additions) gives a way to calculate the planned firm capacity from 2023 onwards.

In [56]:
data_all_firm = data_net_cap_firm.merge(data_serv_plan_firm, left_on = 'service_year', right_on ='service_year', how='outer').merge(phased_capacity, left_on = 'service_year', right_on = 'service_year', how = 'outer')

In [57]:
data_all_firm

Unnamed: 0,service_year,Capmax,Capmax_x,capacity
0,2024,1150.0,989.7,3833.28
1,2025,1630.0,525.0,
2,2030,121.617963,211.43,4831.93
3,2031,225.1,,
4,2032,308.26,,
5,2033,93.37,,
6,2034,213.296091,,
7,2035,229.0,,6024.97
8,2036,160.64,,
9,2037,53.4,,


In [58]:
data_all_firm = data_all_firm.fillna(0.0)

No new firm resources are added in 2027 and 2029. For consistency, these years are added with 0 capacity. To set the starting point in 2023, the initial firm capacity calculated earlier is also added to the dataset.

In [59]:
data_all_firm.loc[17] = [2027, 0, 0, 0]
data_all_firm.loc[18] = [2029,0,0,0]
data_all_firm.loc[19] = [2023,0,0,data_firm_serv['Capmax'].sum()]
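
As an alternative to adding the missing years by hand, the gap filling can be sketched with a plain dictionary over the planning horizon. This is a hedged illustration: the values are the additions by in-service year printed for data_serv_plan_firm, and `additions_by_year`, `horizon` and `dense` are names chosen for this example.

```python
# New firm capacity by in-service year, from data_serv_plan_firm above.
additions_by_year = {2024: 989.7, 2025: 525.0, 2026: 374.8, 2028: 79.0, 2030: 211.43}

# Planning horizon used in this study: 2023 through 2035 inclusive.
horizon = range(2023, 2036)

# Years with no entry get an explicit 0.0 so that every year has a row.
dense = {year: additions_by_year.get(year, 0.0) for year in horizon}

print(dense[2027], dense[2029])  # 0.0 0.0
```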

In [60]:
data_all_update_firm = data_all_firm.sort_values('service_year').reset_index(drop=True)

As the planning horizon considered is up to 2035, the dataframe data_all_update_firm is filtered up to 2035.

In [61]:
data_all_update_2035_firm = data_all_update_firm[data_all_update_firm['service_year'] <= 2035]

Each year's capacity contribution is obtained by adding the new resource capacity for that year and subtracting the capacity that is retired in the same year.

In [62]:
data_all_update_2035_firm = data_all_update_2035_firm.assign(
    capacity_contribution = data_all_update_2035_firm['Capmax_x'] + data_all_update_2035_firm['capacity'] - data_all_update_2035_firm['Capmax'])

The net capacity at the end of each year is the net capacity from the previous year plus the capacity contribution of the same year.

In [63]:
data_all_update_2035_firm = data_all_update_2035_firm.reset_index(drop = True)
# Running total of yearly contributions gives the net capacity at the end of each year
data_all_update_2035_firm['net_capacity'] = data_all_update_2035_firm['capacity_contribution'].cumsum()


In [64]:
data_all_update_2035_firm

Unnamed: 0,service_year,Capmax,Capmax_x,capacity,capacity_contribution,net_capacity
0,2023.0,0.0,0.0,45005.815413,45005.815413,45005.815413
1,2024.0,1150.0,989.7,3833.28,3672.98,48678.795413
2,2025.0,1630.0,525.0,0.0,-1105.0,47573.795413
3,2026.0,0.0,374.8,5693.82,6068.62,53642.415413
4,2027.0,0.0,0.0,0.0,0.0,53642.415413
5,2028.0,0.0,79.0,0.0,79.0,53721.415413
6,2029.0,0.0,0.0,0.0,0.0,53721.415413
7,2030.0,121.617963,211.43,4831.93,4921.742037,58643.157451
8,2031.0,225.1,0.0,0.0,-225.1,58418.057451
9,2032.0,308.26,0.0,0.0,-308.26,58109.797451
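
The two steps (yearly contribution, then running total) can be verified by hand from the rows printed above. The sketch below copies those rows into plain Python; the tuple layout and variable names are choices made for this example.

```python
from itertools import accumulate

# (year, retired, additions by in-service year, staged additions or 2023 starting capacity)
# copied from the table above: columns Capmax, Capmax_x, capacity.
rows = [
    (2023, 0.0, 0.0, 45005.815413),
    (2024, 1150.0, 989.7, 3833.28),
    (2025, 1630.0, 525.0, 0.0),
    (2026, 0.0, 374.8, 5693.82),
    (2027, 0.0, 0.0, 0.0),
    (2028, 0.0, 79.0, 0.0),
    (2029, 0.0, 0.0, 0.0),
    (2030, 121.617963, 211.43, 4831.93),
    (2031, 225.1, 0.0, 0.0),
    (2032, 308.26, 0.0, 0.0),
]

# Contribution of each year: new capacity minus capacity retired that year.
contribution = [dated + staged - retired for _, retired, dated, staged in rows]

# Net capacity: running total of the yearly contributions.
net_capacity = list(accumulate(contribution))

for (year, *_), net in zip(rows, net_capacity):
    print(year, round(net, 2))
```

A negative contribution, as in 2025, means that retirements exceed the additions counted for that year.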


A bar plot visualises the planned firm capacity over the planning horizon.

In [65]:
alt.Chart(data_all_update_2035_firm).mark_bar(color= 'lightblue', width = 20).encode(
alt.X('service_year:N', title = 'Year'),
alt.Y('net_capacity', title = 'Capacity')).properties(width = 1000)

#### Variable Capacity Analysis

As already seen, 52672 MW of variable capacity is available as of 2023. The planned variable capacity is now analysed, together with retirements of the installed variable capacity, to obtain the capacity available each year from 2023 onwards.

The planned variable capacity is plotted by in-service year to get more insight.

In [66]:
alt.Chart(data_plan_variable).mark_bar(width = 10).encode(
alt.X('service_year:N', title=''),
alt.Y('sum(Capmax)', title='Capacity')).properties(width=800)

Although some planned variable resources have an in-service year of 2023 or earlier, they were not online as of 2023 and hence need further analysis. Many resources are also listed as coming into service in 2023, with a high total installed capacity, and need further investigation. As in the firm capacity analysis, this dataset's resource information is analysed together with the second dataset to learn more about the in-service dates.

In [67]:
data_plan_variable_upto_2022 = data_plan_variable[data_plan_variable['service_year']<2023]

In [68]:
alt.Chart(data_plan_variable_upto_2022).mark_bar(width = 10).encode(
alt.X('service_year:N', title=''),
alt.Y('sum(Capmax)', title='Capacity')).properties(width=800)

The chart above shows the planned capacity of resources whose in-service year was before 2023 but that are still not online as of 2023.


As in the firm capacity analysis, the additional data is used to get better insight into the resources that were supposed to be online in or before 2023 and are still not online as of 2023.

In [69]:
data_plan_variable_upto_2022.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='inner', indicator=True).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 67 columns):
 #   Column                                                                                   Non-Null Count  Dtype         
---  ------                                                                                   --------------  -----         
 0   Baseline/LSE Planned 25 MMT/LSE Planned 30 MMT                                           7 non-null      object        
 1   CAISO-contracted                                                                         7 non-null      object        
 2   Region_x                                                                                 7 non-null      object        
 3   Unit Name                                                                                7 non-null      object        
 4   Unit Type_x                                                                              7 non-null      object        
 5   Unit Category_x    


The merged dataset does not provide any new information on when the resources that were supposed to come online before 2023 will actually come online.

A similar merge was carried out for the resources that were supposed to come online in 2023. The dataframe information shows that additional in-service information is available for 34 of these resources: each has non-empty values for its contribution to annual capacity in 2024, 2026, 2030 and 2035.

In [70]:
data_variable_plan_2023 = data_plan_variable[data_plan_variable['service_year']==2023]

In [71]:
data_variable_plan_2023.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='inner', indicator=True).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 64 entries, 0 to 63
Data columns (total 67 columns):
 #   Column                                                                                   Non-Null Count  Dtype         
---  ------                                                                                   --------------  -----         
 0   Baseline/LSE Planned 25 MMT/LSE Planned 30 MMT                                           64 non-null     object        
 1   CAISO-contracted                                                                         64 non-null     object        
 2   Region_x                                                                                 64 non-null     object        
 3   Unit Name                                                                                64 non-null     object        
 4   Unit Type_x                                                                              64 non-null     object        
 5   Unit Category_x  

In [72]:
data_variable_plan_after_2023 = data_plan_variable[data_plan_variable['service_year'] > 2023]

In [73]:
data_variable_plan_after_2023.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='inner', indicator=True).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 27 entries, 0 to 26
Data columns (total 67 columns):
 #   Column                                                                                   Non-Null Count  Dtype         
---  ------                                                                                   --------------  -----         
 0   Baseline/LSE Planned 25 MMT/LSE Planned 30 MMT                                           27 non-null     object        
 1   CAISO-contracted                                                                         27 non-null     object        
 2   Region_x                                                                                 27 non-null     object        
 3   Unit Name                                                                                27 non-null     object        
 4   Unit Type_x                                                                              27 non-null     object        
 5   Unit Category_x  

Merging the datasets for the resources whose in-service year is after 2023 shows that 3 of them have values for annual capacity in 2024, 2026, 2030 and 2035.


In [74]:
data_variable_plan_update = data_plan_variable.merge(data_new_gen_info, left_on = 'Unit Name', right_on = 'Unit Name', how='left', indicator=True)

Based on these observations, two groups of planned resources are excluded, for now, from contributing to installed capacity: resources that were due online before 2023 but are still not online, and resources that were due online in 2023, are not yet online, and have no further information in the merged dataset.


In [75]:
data_variable_plan_2023_update = data_variable_plan_update[(data_variable_plan_update['insvdt_x'].dt.year > 2023) & (data_variable_plan_update['2024 Annual'].isnull())]

In [76]:
data_variable_plan_2023_staged = data_variable_plan_update[(~data_variable_plan_update['2024 Annual'].isnull())]

In [77]:
data_serv_plan_variable = data_variable_plan_2023_update.groupby('service_year').agg({'Capmax_x':'sum'}).reset_index()

In [78]:
data_serv_plan_variable

Unnamed: 0,service_year,Capmax_x
0,2025,136.35
1,2026,289.0
2,2027,2544.0
3,2029,93.0
4,2030,959.79
5,2032,40.0
6,2035,288.0


The retirement years of the variable capacity are obtained from the data_variable and data_plan_variable datasets.

In [79]:
data_retd_variable = data_variable.groupby('retirement_year').agg({'Capmax':'sum'}).reset_index()

In [80]:
data_retd_variable_plan = data_plan_variable.groupby('retirement_year').agg({'Capmax':'sum'}).reset_index()

In [81]:
data_retd_variable['retirement_year'].unique(), data_retd_variable_plan['retirement_year'].unique()

(array([2050, 2099]), array([2050, 2099]))

Both the online and the planned variable resources have retirement years later than 2035, so retirements do not need to be considered when calculating the capacity in the planning horizon up to 2035.


The planned resources whose insvdt_x falls in 2023 but that are not online in 2023 are the resources that come online in stages and contribute to annual capacity in 2024, 2026, 2030 and 2035. The planned capacity added in these four stages is given below.

In [82]:
columns = [2024, 2026, 2030, 2035]
sum_columns = []
for column in columns:
    column_name = str(column) + ' Annual'
    column_sum = data_variable_plan_2023_staged[column_name].sum()
    sum_columns.append(column_sum)
new_column = []
new_column.append(sum_columns[0])
for i in range(1, len(sum_columns)):
    new_value = sum_columns[i] - sum_columns[i-1]
    new_column.append(new_value)

In [83]:
d = {'service_year':columns,
    'capacity': new_column}
staged_capacity = pd.DataFrame(data = d)

In [84]:
staged_capacity

Unnamed: 0,service_year,capacity
0,2024,1568.19
1,2026,6343.92
2,2030,13786.499315
3,2035,10286.99


Merging the two dataframes gives a way to calculate the planned variable capacity from 2023 onwards.

In [85]:
data_all_variable = data_serv_plan_variable.merge(staged_capacity, left_on = 'service_year', right_on = 'service_year', how = 'outer')

In [86]:
data_all_variable

Unnamed: 0,service_year,Capmax_x,capacity
0,2025,136.35,
1,2026,289.0,6343.92
2,2027,2544.0,
3,2029,93.0,
4,2030,959.79,13786.499315
5,2032,40.0,
6,2035,288.0,10286.99
7,2024,,1568.19
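
What the outer merge on service_year does can be sketched with two dictionaries holding the values printed above. This is an illustration in plain Python rather than the pandas call itself; `dated`, `staged` and `joined` are names chosen for this example, and `None` plays the role of the missing values that pandas shows as NaN.

```python
# Additions by in-service year (data_serv_plan_variable) and staged additions
# (staged_capacity), keyed by service_year, with the values printed above.
dated = {2025: 136.35, 2026: 289.0, 2027: 2544.0, 2029: 93.0,
         2030: 959.79, 2032: 40.0, 2035: 288.0}
staged = {2024: 1568.19, 2026: 6343.92, 2030: 13786.499315, 2035: 10286.99}

# Outer join: keep every year present on either side.
all_years = sorted(set(dated) | set(staged))
joined = {year: (dated.get(year), staged.get(year)) for year in all_years}

for year, (dated_mw, staged_mw) in joined.items():
    print(year, dated_mw, staged_mw)
```

Years that appear on neither side, such as 2028, are absent from the result, which is why they have to be added separately with 0 capacity. Unlike the merged output above, this sketch sorts the years before printing.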


In [87]:
data_all_variable = data_all_variable.fillna(0.0)

No new variable resources are added in 2028, 2031, 2033 and 2034. For consistency, these years are added with 0 capacity. To set the starting point in 2023, the initial variable capacity calculated earlier is also added.

In [88]:
data_all_variable.loc[8] = [2028, 0, 0]
data_all_variable.loc[9] = [2031, 0, 0]
data_all_variable.loc[10] = [2033, 0, 0]
data_all_variable.loc[11] = [2034, 0, 0]
data_all_variable.loc[12] = [2023, 0, data_variable_serv['Capmax'].sum()]

In [89]:
data_all_update_variable = data_all_variable.sort_values('service_year').reset_index(drop = True)

In [90]:
data_all_update_variable

Unnamed: 0,service_year,Capmax_x,capacity
0,2023.0,0.0,52672.009829
1,2024.0,0.0,1568.19
2,2025.0,136.35,0.0
3,2026.0,289.0,6343.92
4,2027.0,2544.0,0.0
5,2028.0,0.0,0.0
6,2029.0,93.0,0.0
7,2030.0,959.79,13786.499315
8,2031.0,0.0,0.0
9,2032.0,40.0,0.0


Each year's capacity contribution is obtained by summing the new resource capacities added that year.

In [91]:
data_all_update_variable['capacity_contribution'] = data_all_update_variable['Capmax_x'] + data_all_update_variable['capacity']

The net capacity at the end of each year is the net capacity from the previous year plus the capacity contribution of the same year.

In [92]:
# Running total of yearly contributions gives the net capacity at the end of each year
data_all_update_variable['net_capacity'] = data_all_update_variable['capacity_contribution'].cumsum()


In [93]:
data_all_update_variable

Unnamed: 0,service_year,Capmax_x,capacity,capacity_contribution,net_capacity
0,2023.0,0.0,52672.009829,52672.009829,52672.009829
1,2024.0,0.0,1568.19,1568.19,54240.199829
2,2025.0,136.35,0.0,136.35,54376.549829
3,2026.0,289.0,6343.92,6632.92,61009.469829
4,2027.0,2544.0,0.0,2544.0,63553.469829
5,2028.0,0.0,0.0,0.0,63553.469829
6,2029.0,93.0,0.0,93.0,63646.469829
7,2030.0,959.79,13786.499315,14746.289315,78392.759144
8,2031.0,0.0,0.0,0.0,78392.759144
9,2032.0,40.0,0.0,40.0,78432.759144


In [94]:
alt.Chart(data_all_update_variable).mark_bar(color= 'lightgreen', width = 20).encode(
alt.X('service_year:N', title = 'Year'),
alt.Y('net_capacity', title = 'Capacity')).properties(width = 1000)

### Conclusion


This study examined the installed capacity of planned resources in the CAISO system that would contribute to system capacity from 2023 to 2035. The installed capacity up to 2035 was obtained under the following observations and assumptions:
- Some data inconsistencies were observed.
- Resources whose in-service year was before 2023 but that were still not online in 2023 were not counted towards installed capacity.
- Some resources whose in-service year was 2023 but that were still not online that year were analysed further using a second dataset with more conclusive installation/start years.
- Other resources whose start date was in 2023 and for which no further information was available were also excluded from installed capacity; these need further analysis to determine their contribution to capacity in the planning horizon.
