## Massachusetts RPS / PTS

The Commonwealth of Massachusetts publishes data on all renewable energy generation facilities, both commercial and residential, on its [Lists of Qualified Generation Units](https://www.mass.gov/service-details/lists-of-qualified-generation-units) page, which directs the user to the 7 programs listed below.

* [RPS 1](https://www.mass.gov/doc/eligible-class-i-renewable-generation-units-0)
* [SREC I](https://www.mass.gov/files/documents/2017/10/27/Solar%20Carve-Out%20Qualified%20Units%20080717_0.xlsx)
* [SREC II](https://www.mass.gov/doc/solar-carve-out-ii-qualified-renewable-generation-units)
* [SMART](https://www.mass.gov/doc/smart-qualified-units-list)
* [CLASS 2](https://www.mass.gov/doc/rps-class-ii-qualified-units-list-2/download)
* [WASTE](https://www.mass.gov/files/documents/2017/10/27/eligible-class2-waste-units.xls)
* [APS](https://www.mass.gov/doc/aps-qualified-units-list-6)

Verification of monthly production data is centralized through the Massachusetts Clean Energy Center's [Production Tracking System](https://www.masscec.com/about-pts) (PTS).  PTS data lags: as of November 2021 it runs through March 2021.  The overlap with the individual *Lists of Qualified Generation Units* is high, but the data items differ; see the details and the column-name normalization below.  The [monthly production data](http://files.masscec.com/innovate-clean-energy/prod-track-system/PTSSRECCapacityReport.xlsx) for a large fraction of all systems is available for the years 2010 - 2016 and shows an average capacity factor of 13.71%, that is, the proportion of the stated kW capacity that was actually produced.
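The capacity figure quoted above can be read as the energy actually produced divided by the energy a system would deliver if it ran at its stated capacity all year. A minimal sketch of that ratio, assuming a non-leap year of 8,760 hours; the function name and the sample system are illustrative and are not taken from the PTS report:

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year


def capacity_factor(annual_kwh: float, nameplate_kw: float,
                    hours: int = HOURS_PER_YEAR) -> float:
    """Fraction of the nameplate energy (kW x hours) that was actually produced."""
    if nameplate_kw <= 0:
        raise ValueError("nameplate_kw must be positive")
    return annual_kwh / (nameplate_kw * hours)


# Hypothetical 10 kW system that produced 12,000 kWh in one year
cf = capacity_factor(annual_kwh=12_000, nameplate_kw=10)
print(f"{cf:.2%}")
```

A value in the low teens, as in this made-up example, is in the same range as the yearly means computed from the report further down.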

### Issues

* no unique key
* no (lat,lon) coordinates / location indeterminate
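Because the lists share no unique key, records can only be matched approximately across datasets. One possible workaround, shown purely as a sketch, is to derive a surrogate key from fields that most of the lists have in common. The choice of fields (`city`, `zip`, `kW`, `effective_date`) and the helper name `surrogate_key` are assumptions, and collisions remain possible when two systems of the same size go live in the same ZIP code on the same day.

```python
import hashlib


def surrogate_key(city: str, zip_code: str, kw: float, effective_date: str) -> str:
    """Build a deterministic key from normalized fields (hypothetical helper)."""
    parts = [
        city.strip().upper(),
        zip_code.strip().zfill(5),     # ZIP 02474 can show up as 2474 when read as a number
        f"{float(kw):.2f}",
        effective_date.strip()[:10],   # keep YYYY-MM-DD only
    ]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()[:12]


# The same install written two different ways yields the same key
a = surrogate_key("Arlington", "2474", 10.14, "2016-02-01 00:00:00")
b = surrogate_key(" arlington ", "02474", 10.140, "2016-02-01")
print(a == b)  # True
```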



In [1]:
import pandas as pd
import numpy  as np

import matplotlib.pyplot as plt
import matplotlib.ticker as mtick  ##format charts for dollars ($) 

from datetime import datetime, timedelta
import locale  ##format currency for display

locale . setlocale ( locale.LC_ALL , 'en_US.UTF-8' )

'en_US.UTF-8'

In [2]:
## Monthly production data from 2010 - 2016 by installation; no key to PTS data
def get_monthly_production_data_2010_2016 ( ) :
    file  =  'http://files.masscec.com/innovate-clean-energy/prod-track-system/PTSSRECCapacityReport.xlsx'

    raw   =  pd . read_excel ( file , sheet_name = 2 )
    raw   =  raw .iloc [ 5 : , : ]  ## skip the first 5 rows of the sheet
    raw . columns = [ 'year' , 'kW' , 'Jan' , 'Feb' , 'Mar' , 'Apr' , 'May' , 'Jun' , 'Jul' , 'Aug' , 'Sep' , 'Oct' , 'Nov' , 'Dec' , 'total'  , 'cf' ]
    return raw

production_data = get_monthly_production_data_2010_2016 ( )

production_data . groupby ( 'year' ) . agg ( { "cf" : [ np . mean , len ] } )

|   year |   cf mean |   cf len |
|-------:|----------:|---------:|
|   2010 |  0.131844 |     1175 |
|   2011 |  0.127483 |     1640 |
|   2012 |  0.134607 |     3035 |
|   2013 |  0.132894 |     6177 |
|   2014 |  0.130688 |    10504 |
|   2015 |  0.131204 |    11137 |
|   2016 |  0.135889 |    12751 |


In [3]:
norms = {
    'http://files.masscec.com/uploads/attachments/PVinPTSwebsite.xlsx' : {
        'short_name' :  'pts' ,
        'skip_rows'  :  8 ,
        'skip_columns' :   0 ,
        'columns'    :  ['kW','effective_date','cost','grant','city','zip','county','program','type','installer',
                         'module_mfgr','inverter_mfgr','meter_mfgr','utility','owner','srec','est_annual_kWh']
    },
    '/data/energy/REC/MA/Solar Carve-Out Qualified Units 080717_0.xlsx' : {
        'short_name' : 'srec1' ,
        'skip_rows'  :  6 ,
        'skip_columns' : 1 ,
        'columns'    : ['rps_id','nepool_id','aggregation','applicant','name','type','city','zip','kW',
                        'effective_date','operation_date','sq_date','utility','installer','cost','perWatt']
    },
    '/data/energy/REC/MA/RPS_Solar_Carve-out_II_Renewable_Generation_Units_2021-09-01.xlsx' : {
        'short_name' : 'srec2' ,
        'skip_rows'  : 12 ,
        'skip_columns' :   0 ,
        'columns'    : ['rps_id','nepool_id','applicant','name','type','city','zip','kW','sector','subsector',
                        'srec_factor','effective_date','operation_date','qualification_date','distributer','installer',
                        'cost','perWatt']
    },
    '/data/energy/REC/MA/SMART_Solar_Tariff_Generation_Units.xlsx' : {
        'short_name' : 'smart' ,
        'skip_rows'  :  20 ,
        'skip_columns' : 0 ,
        'columns'    : ['project','status','capacity_block','expiration_date','operation_date','effective_date',
                        'distributor','applicant','installer','owner','ownership_type',
                        'type','city','zip','size','kW_ac','kW',
                        'location','location_tranche','off_taker','off_taker_tranche','tracking','tracking_tranche',
                        'pollinator','pollinator_tranche','storage','storage_tranche','storage_kVa','storage_duration',
                        'low_income','land_use','interconnection','meter_type','standalone','cost','perWatt']
    },    
    '/data/energy/REC/MA/RPS_Class_0I_Renewable_Generation_Units.xlsx' : {
        'short_name' : 'rps1' ,
        'skip_rows'  :  16 ,
        'skip_columns' :   0 ,
        'columns'    : ['type','rps_id','nepool_id','name','city','state','fuel','MW','output_percent','aggregator',
                        'verifier','effective_date','qualification_date']
    },
    '/data/energy/REC/MA/Eligible-Class2-Units_20210921.xlsx' : {
        'short_name'   : 'class2' ,
        'skip_rows'    :  17 ,
        'skip_columns' :   1 ,
        'columns'      : ['rps_id','nepool_id','name','city','state','fuel','MW','output_percent',
                        'verifier','effective_date','sq_date']
    },
    '/data/energy/REC/MA/eligible-class2-waste-units.xlsx' : {
        'short_name'   : 'waste' ,
        'skip_rows'    :  2 ,
        'skip_columns' :   1 ,
        'columns'      : ['rps_id','nepool_id','name','city','state','fuel','MW','sq_date','effective_date','output_percent']
    },
    '/data/energy/REC/MA/APS_QUL_1-13-21.xlsx' : {
        'short_name'   : 'aps' ,
        'skip_rows'    :  2 ,
        'skip_columns' :  1 ,
        'columns'    : ['rps_id','nepool_id','name','city','state','fuel','MW','effective_date','operation_date',
                        'rep','verifier']
    },
    
}

In [27]:
mask = pts.city=='Arlington'
print('PTS,{installs},{first},{last}'.format(installs=len(pts[mask]),first=pts[mask].effective_date.min(),last=pts[mask].effective_date.max()))

mask = rps.city=='Arlington'
print('RPS,{installs},{first},{last}'.format(installs=len(rps[mask]),first=rps[mask].effective_date.min(),last=rps[mask].effective_date.max()))


PTS,947,2003-08-18 00:00:00,2021-03-09 00:00:00
RPS,1060,2010-07-22 00:00:00,2021-08-27 00:00:00


In [28]:
mask = pts.effective_date<='2018-12-31'
print('PTS,{installs},{first},{last}'.format(installs=len(pts[mask]),first=pts[mask].effective_date.min(),last=pts[mask].effective_date.max()))

mask = rps.effective_date<='2018-12-31'
print('RPS,{installs},{first},{last}'.format(installs=len(rps[mask]),first=rps[mask].effective_date.min(),last=rps[mask].effective_date.max()))


PTS,90835,2002-12-17 00:00:00,2018-12-31 00:00:00
RPS,87617,2010-01-01 00:00:00,2018-12-31 00:00:00


In [4]:
def extract_massgov_rps ( file , spec ):
    
    raw  =  pd . read_excel ( file )

    rps  =  raw .iloc [ spec [ 'skip_rows' ] : , spec [ 'skip_columns' ] : ]


    ## assign column names, change the defaults to normalize across datasets
    ### cols =  raw . iloc [ skip_rows - 1 , : ] . tolist ( )
    rps . columns  =  spec [ 'columns' ]

    ## reset index after dropped rows
    rps  =  rps . reset_index ( drop = True )
    
    ## make effective date a pandas datetime index
    mask = rps [ 'effective_date' ] . astype ( str ) . str . contains ( 'Pending|Not Operational' )
    rps . loc [ mask , 'effective_date' ]  =  np.nan
    rps [ 'effective_date' ]  =  pd . to_datetime ( rps [ 'effective_date' ] )

    for col in [ 'kW' , 'cost' , 'est_annual_kWh' ] :
        if col in spec [ 'columns' ] :
            rps [ col ] =  rps [ col ] . astype ( np . float64 )
            
    ## MW to floats
    if 'MW' in spec [ 'columns' ] :
        
        if spec [ 'short_name' ] == 'rps1' :
            mask = rps . MW . str . contains ( 'capacity' ) == True
            rps .loc  [ mask, 'MW' ] = '0'
        elif spec [ 'short_name' ] == 'aps' :
            mask = rps . MW . str . contains ( ' thermal ') == True
            rps . loc [ mask , 'MW' ] = 0

        rps . MW = rps . MW . astype ( np . float64 ) 

    if 'state' in spec [ 'columns' ] :
        rps. loc [ : , 'state'] = rps . state . str . upper ( ) . str . replace ( ' ' , '' )        

    return rps    
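
The assignment `rps . columns = spec [ 'columns' ]` only works while the number of names in the spec equals the number of columns left after skipping `skip_columns`; if a published sheet gains or loses a column, pandas raises a length-mismatch error at that line. A small pre-check, sketched here as a hypothetical helper (`check_spec` is not part of the notebook), would report which list drifted:

```python
def check_spec(sheet_width: int, spec: dict) -> None:
    """Raise early if the spec no longer matches the sheet layout (hypothetical helper)."""
    expected = spec["skip_columns"] + len(spec["columns"])
    if sheet_width != expected:
        raise ValueError(
            f"{spec['short_name']}: sheet has {sheet_width} columns, "
            f"spec expects {expected}"
        )


# Toy spec in the same shape as the entries in `norms`
toy_spec = {
    "short_name": "waste",
    "skip_rows": 2,
    "skip_columns": 1,
    "columns": ["rps_id", "nepool_id", "name", "city", "state",
                "fuel", "MW", "sq_date", "effective_date", "output_percent"],
}

check_spec(11, toy_spec)      # passes: 1 skipped + 10 named columns
try:
    check_spec(12, toy_spec)  # simulates a column added upstream
except ValueError as err:
    print(err)
```

Inside `extract_massgov_rps` such a check could be called with `raw.shape[1]` right after the sheet is read.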

In [5]:
docs = {}
for file in norms . keys ( ) :   
    docs [ norms [ file ] [ 'short_name' ] ]  = extract_massgov_rps ( file , norms [ file ] )

## 2 installs before 2003, outliers
docs [ 'pts' ] = docs [ 'pts' ] [ docs [ 'pts' ] . effective_date >= '2002-12-17' ]

## remove footers
docs [ 'class2' ]  =  docs [ 'class2' ] . iloc [ : -7 , : ]
docs [ 'rps1'   ]  =  docs [ 'rps1'   ] . iloc [ : -5 , : ]
docs [ 'aps'    ]  =  docs [ 'aps'    ] . iloc [ : -6 , : ]
docs [ 'waste'  ]  =  docs [ 'waste'  ] . iloc [ : -2 , : ]

## next: combine the three PV program lists (SREC I, SREC II, SMART) into rps

In [6]:
docs['smart']['rps_id'] = docs['smart']['project']

## DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
rps = pd.concat([docs['smart'], docs['srec1'], docs['srec2']], ignore_index=True)

In [7]:
## Compare the number of rows in PTS to the sum of the three PV programs
## (SREC I, SREC II and the current SMART solar program),
## taking into consideration the reporting lag in the PTS system

rows  =  [ ]

for  program , df  in  docs . items ( ) :
    ## PV lists report capacity in kW, the commercial lists in MW
    MW  =  df . kW . sum ( ) / 1000  if  'kW' in df . columns  else  df . MW . sum ( )
    rows . append ( {
        'program' : program ,
        'systems' : len ( df ) ,
        'MW'      : MW ,
        'start'   : df [ 'effective_date' ] . min ( ) . strftime ( '%Y-%m-%d' ) ,
        'last'    : df [ 'effective_date' ] . max ( ) . strftime ( '%Y-%m-%d' )
    } )

## DataFrame.append was removed in pandas 2.0; build the frame from a list of rows
summary  =  pd . DataFrame ( rows )

print(summary.to_markdown())


mask = docs [ 'smart' ] . effective_date <= summary . loc [ summary . program == 'pts' , 'last' ] . values [ 0 ]

diff = summary . loc [ summary . program == 'pts' , 'systems' ] . values [ 0 ] - \
                      (len ( docs [ 'smart' ] [ mask ] ) + \
                         summary . loc [ summary . program == 'srec1' , 'systems' ] . values [ 0 ] + \
                         summary . loc [ summary . program == 'srec2' , 'systems' ] . values [ 0 ]
                      )

MW = round ( docs [ 'smart' ] [ mask ] . kW . sum ( ) / 1000 , 1 )

print ('\n\n' ,
       diff ,
       'more PTS installs compared to the combined SREC1 + SREC2 + SMART programs reducing SMART MW by' , 
       MW
      )

|    | program   |   systems |       MW | start      | last       |
|---:|:----------|----------:|---------:|:-----------|:-----------|
|  0 | pts       |    114552 | 2961.03  | 2002-12-17 | 2021-04-06 |
|  1 | srec1     |     11795 |  653.329 | 2010-01-01 | 2013-12-31 |
|  2 | srec2     |     75869 | 1753.68  | 2013-07-23 | 2020-10-01 |
|  3 | smart     |     45325 | 2169     | 2018-06-18 | 2021-12-16 |
|  4 | rps1      |      9261 | 6753.5   | 2002-01-01 | 2021-07-01 |
|  5 | class2    |       162 |  394.039 | 2009-01-01 | 2021-01-01 |
|  6 | waste     |         7 |  283.545 | 2009-01-01 | 2009-01-01 |
|  7 | aps       |       125 |  515.832 | 2009-04-01 | 2020-03-31 |


 261 more PTS installs compared to the combined SREC1 + SREC2 + SMART programs reducing SMART MW by 664.6


In [8]:
def exceptions(pts):

    total_installations   = len ( pts . kW )
    total_kW_installed    = pts . kW . sum ( )

    ##150 zero cost thru 4/2021 about 2.2% of total kW installed
    zero_cost = pts [ pts.cost == 0 ] . kW . sum ( )

    ##100 have no install dates, less than .03% of kW installed
    no_install_date = pts [ pd.isnull(pts.effective_date) ] . kW . sum ( )

    ##432 have costs per Watt much too high, 0.15% of kW installed
    high_Per_Watt = pts [ pts.perWatt > 10 ] . kW . sum ( )

    #      "{no_num} show no install date totaling {no_install_date}MW comprising {no_percent}% of total Watt capacity\n"
    print (
        ( "Exceptions in PTS dataset:\n\n" +\
          "{zc_num} show no cost         totaling {zero_cost}MW comprising {zc_percent}% of total Watt capacity\n"
          "{hi_num} show Per Watt > $10  totaling {high_Per_Watt}MW comprising {hi_percent}% of total Watt capacity\n"
        ) \
        . format (
            total_installations  =  total_installations,
            zc_num               =  len ( pts [ pts.cost == 0 ] ) ,
            zero_cost            =  round ( zero_cost / 1000 , 1 ) ,
            zc_percent           =  round ( 100 * zero_cost / total_kW_installed , 1 ) ,
    #         no_num               =  len ( pts [ pd.isnull(pts.effective_date) ] ) ,
    #         no_install_date      =  round ( no_install_date / 1000 , 1 ) ,
    #         no_percent           =  round ( 100 * no_install_date / total_kW_installed , 1 ) ,
            hi_num               =  len ( pts [ pts.perWatt > 10 ] ) ,
            high_Per_Watt        =  round ( high_Per_Watt / 1000 , 1 ) ,
            hi_percent           =  round ( 100 * high_Per_Watt / total_kW_installed , 1 ) ,
        )
    )


In [10]:
pts = docs [ 'pts' ] . copy ( )
pts [ 'perWatt' ] = pts . cost . div ( pts . kW ) / 1000

NM_RATE  = 0.215
RATE     = 0.11
url = 'http://files.masscec.com/uploads/attachments/PVinPTSwebsite.xlsx'

mask = pts [ 'type' ] . str . contains ( 'Residential' )

#     "and net metered rebates of {nmreturning}M (assuming a constant {nm_rate} rate)**.\n\n" +\
#     "with an estimated annual retail (net metered) value of {est_annual_value}M ({est_annual_nmval}M)."
print (
    ("PTS dataset: {url}\n\n" +\
     "There are {count} PV systems ({Rcount} residential) in Massachusetts\n" +\
     "installed between {min_date} and {max_date}\n"  +\
     "by {vendors} unique vendors (22 vendors account for 85% of all installs).\n\n" +\
     "A total of {kW}MW capacity has been installed of which {kW_res}MW is residential\n"    +\
     "at a cost of {invested}M\n" +\
     "generating an estimated {generated}TWh of electricity over the past {years} years\n" +\
     "with  a retail  value of {returning}M (assuming a constant {rate} rate)\n\n" +\
     "Expected energy output from all PV installations is {est_annual}TWh per year\n" +\
     "with an estimated annual retail value of {est_annual_value}M.\n\n"
    ) \
       . format (
    url      =  url ,
    count    =  len ( pts ) ,
    Rcount   =  len ( pts [ mask ] ) ,
    vendors  =  len ( pts . installer . unique ( ) ),
    kW       =  int ( round ( pts . kW . sum ( ) / 1000 , 0 ) ) ,   ## convert kW to MW
    kW_res   =  int ( round ( pts [ mask ] . kW . sum ( ) / 1000 , 0 ) ) ,   ## convert kW to MW
    invested =  locale . currency ( pts . cost . sum ( ) / 1e6 , grouping = True ) , ##convert to millions of $s
    generated = round ( ( ( ( datetime . today ( ) - pts . effective_date ) / timedelta ( days = 365 ) ) * pts.est_annual_kWh ) . sum ( ) / 1e9 , 2 ) , 
    returning   = locale . currency ( RATE * ( ( ( datetime . today ( ) - pts . effective_date ) / timedelta ( days = 365 ) ) * pts . est_annual_kWh ) . sum ( ) / 1e6 , grouping = True ), 
    nmreturning = locale . currency ( NM_RATE * ( ( ( datetime . today ( ) - pts [ mask ] . effective_date ) / timedelta ( days = 365 ) ) * pts [ mask ] . est_annual_kWh ) . sum ( ) / 1e6 , grouping = True ), 
    years = round ( ( pts . effective_date . max ( ) - pts . effective_date . min ( ) ) / timedelta ( days = 365 ) , 1 ) ,
    min_date =  pts . effective_date . min ( ) . strftime( "%Y-%m-%d" ) ,
    max_date =  pts . effective_date . max ( ) . strftime( "%Y-%m-%d" ) ,
    est_annual = round ( ( pts . est_annual_kWh ) . sum ( ) / 1e9 , 2 ) ,
    est_annual_value = locale . currency ( RATE * ( ( pts . est_annual_kWh ) . sum ( ) ) / 1e6 ) ,  ##in millions of $s
    est_annual_nmval = locale . currency ( NM_RATE * ( ( pts [ mask ] . est_annual_kWh ) . sum ( ) ) / 1e6 ) ,  ##in millions of $s
    rate = locale . currency ( RATE ),
    nm_rate = locale . currency ( NM_RATE )
) )

exceptions(pts)

PTS dataset: http://files.masscec.com/uploads/attachments/PVinPTSwebsite.xlsx

There are 114552 PV systems (107441 residential) in Massachusetts
installed between 2002-12-17 and 2021-04-06
by 991 unique vendors (22 vendors account for 85% of all installs).

A total of 2961MW capacity has been installed of which 803MW is residential
at a cost of $8,902.49M
generating an estimated 18.83TWh of electricity over the past 18.3 years
with  a retail  value of $2,070.79M (assuming a constant $0.11 rate)

Expected energy output from all PV installations is 3.62TWh per year
with an estimated annual retail value of $398.10M.


Exceptions in PTS dataset:

150 show no cost         totaling 65.4MW comprising 2.2% of total Watt capacity
427 show Per Watt > $10  totaling 4.7MW comprising 0.2% of total Watt capacity
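
The two exception classes reported above follow from simple rules: a recorded cost of zero, or a cost per Watt above $10, where cost per Watt is the cost divided by the capacity in Watts (kW × 1000). A minimal sketch of those rules on made-up records; the records and the function name `classify` are illustrative only:

```python
PER_WATT_LIMIT = 10.0  # dollars per Watt, the same threshold the notebook uses


def classify(cost: float, kw: float) -> str:
    """Label one install as 'zero_cost', 'high_per_watt' or 'ok' (assumes kw > 0)."""
    if cost == 0:
        return "zero_cost"
    per_watt = cost / (kw * 1000)  # kW -> W
    return "high_per_watt" if per_watt > PER_WATT_LIMIT else "ok"


# Made-up records: (cost in dollars, capacity in kW)
records = [(56784.0, 10.14), (0.0, 250.0), (90000.0, 5.0)]
labels = [classify(cost, kw) for cost, kw in records]
print(labels)  # ['ok', 'zero_cost', 'high_per_watt']
```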



In [22]:
mask = (pts.city=='Arlington') & (pts.effective_date<='20181231') & (pts.kW>10)
pts[mask].sort_values('kW')

Unnamed: 0,kW,effective_date,cost,grant,city,zip,county,program,type,installer,module_mfgr,inverter_mfgr,meter_mfgr,utility,owner,srec,est_annual_kWh,perWatt,install_yr
68465,10.14,2016-02-01,56784.0,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,SolarCity DBA Tesla Energy,REC Solar;SolarEdge Technologies;Continental C...,SolarEdge Technologies,Continental Control Systems /WattNode (SolarEdge),NSTAR (DBA EverSource),Y,Y,12387.0,5.6,2016
28984,10.15,2018-07-17,30450.0,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,Great Sky Solar,LG Electronics;Enphase Energy;Enphase Energy,Enphase Energy,Enphase Energy,NSTAR (DBA EverSource),N,Y,11163.0,3.0,2018
54112,10.34,2016-08-31,37958.8,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,"SolarFlair Energy, Inc.",Canadian Solar;Enphase Energy;Milbank,Enphase Energy,Milbank,NSTAR (DBA EverSource),N,Y,14073.0,3.671064,2016
52522,10.4,2016-10-03,51168.0,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,SolarCity DBA Tesla Energy,Trina Solar;Delta;Delta,Delta,Delta,NSTAR (DBA EverSource),Y,Y,10870.0,4.92,2016
98735,10.46,2014-07-03,55736.0,2000.0,Arlington,2476,Middlesex,Commonwealth Solar II;Commonwealth Solar SREC II,Residential (3 or fewer dwelling units per bui...,New England Clean Energy (formerly New England...,SunPower;Power-One;TBD,Power-One,TBD,NSTAR (DBA EverSource),N,Y,13734.0,5.328489,2014
92348,10.46,2015-01-05,57502.5,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,SolarCity DBA Tesla Energy,Canadian Solar;SolarEdge Technologies;SolarEdg...,SolarEdge Technologies;SolarEdge Technologies,Continental Control Systems /WattNode (SolarEdge),NSTAR (DBA EverSource),Y,Y,12601.0,5.497371,2015
73351,10.5,2015-12-04,49199.0,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,Boston Solar Company,SunPower;SunPower;Locus Energy,SunPower,Locus Energy,NSTAR (DBA EverSource),N,Y,12600.0,4.685619,2015
27023,10.56,2018-09-14,20267.0,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,Sunrun Inc.,LG Electronics;SolarEdge Technologies;Continen...,SolarEdge Technologies,Continental Control Systems /WattNode (SolarEdge),NSTAR (DBA EverSource),Y,Y,9011.0,1.919223,2018
73370,10.79,2015-12-04,48478.2,0.0,Arlington,2474,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,RevoluSun,SunPower;SolarEdge Technologies;Locus Energy,SolarEdge Technologies,Locus Energy,NSTAR (DBA EverSource),N,Y,11158.0,4.492882,2015
34616,10.85,2018-01-24,37975.0,0.0,Arlington,2476,Middlesex,Commonwealth Solar SREC II;Non-RET Funded Grants,Residential (3 or fewer dwelling units per bui...,Sunlight Solar Energy Inc.,LG Electronics;Enphase Energy;Enphase Energy,Enphase Energy,Enphase Energy,NSTAR (DBA EverSource),N,Y,13086.0,3.5,2018


In [11]:
PERCENT_OF_ALL_INSTALLS = 0.85
pts['perWatt'] = pts . cost . div ( pts . kW ) / 1000

mask = (pts['type'].str.contains('Residential')) &\
                                (pts['effective_date']>='2020-01-01') &\
                                (pts['effective_date']<='2021-12-31') &\
                                (pts['cost']>0)   &\
                                (pts['perWatt']<10)  ##note 150 installs have no cost## 432 installs with outlier costs



installers = pts[mask].groupby('installer').agg({
    "kW":[sum,np.median],
    "effective_date":len,
    "perWatt":[np.median,min,max]
})#.sort_values('date',ascending=False)
installers.columns = ['kW_total','kW_median','systems','costPerWatt_median','costPerWatt_min','costPerWatt_max']

for col in ['costPerWatt_median','costPerWatt_min','costPerWatt_max'] :
    installers[col] = installers[col].apply(locale.currency)
    
installers=installers.sort_values('systems',ascending=False)

mask = installers['systems'].cumsum() <= PERCENT_OF_ALL_INSTALLS*len(pts[mask])
print(installers[mask].to_markdown())

| installer                                 |   kW_total |   kW_median |   systems | costPerWatt_median   | costPerWatt_min   | costPerWatt_max   |
|:------------------------------------------|-----------:|------------:|----------:|:---------------------|:------------------|:------------------|
| Vivint Solar                              |   28758.1  |       7.8   |      3458 | $3.46                | $1.63             | $7.24             |
| Trinity Solar                             |   17474.2  |       6.93  |      2315 | $4.02                | $2.16             | $8.38             |
| Sunrun Inc.                               |    8588.69 |       8.23  |      1046 | $2.99                | $1.49             | $6.72             |
| Boston Solar Company                      |    3945.81 |      10.08  |       392 | $3.44                | $1.79             | $8.43             |
| SolarCity DBA Tesla Energy                |    3184.85 |       8.16  |       364 | $2.85                | $1.8
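
The installer filter in this cell keeps the largest installers for as long as their running system count stays within `PERCENT_OF_ALL_INSTALLS` of the filtered installs. The same rule, sketched in plain Python on made-up counts (the installer names and numbers are illustrative, and `top_installers` is a hypothetical helper):

```python
from itertools import accumulate

PERCENT_OF_ALL_INSTALLS = 0.85


def top_installers(counts: dict, share: float = PERCENT_OF_ALL_INSTALLS) -> list:
    """Installers, largest first, whose running total stays within `share` of all installs."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    limit = share * sum(counts.values())
    running = accumulate(n for _, n in ranked)
    return [name for (name, _), total in zip(ranked, running) if total <= limit]


# Made-up install counts per installer (100 installs in total)
counts = {"A": 50, "B": 30, "C": 10, "D": 6, "E": 4}
print(top_installers(counts))  # ['A', 'B']
```

Note that the installer whose count crosses the threshold is excluded, so the selected group can account for somewhat less than 85% of the installs.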

In [12]:
mask = (pts['type'].str.contains('Residential')) &\
                                (pts['effective_date']>='2020-01-01') &\
                                (pts['effective_date']<='2021-12-31')&\
                                (pts['county']=='Middlesex')&\
                                (pts['city']=='Arlington')
installers = pts[mask].groupby('installer').agg({
    "kW":[sum,np.median],
    "effective_date":len,
    "perWatt" : np.median
}).sort_values(("effective_date","len"),ascending=False)
installers.columns = ['total kW installed','median size (kW)','installs','cost_per_Watt']
installers['cost_per_Watt'] = installers['cost_per_Watt'].apply(locale.currency)
installers

| installer                  |   total kW installed |   median size (kW) |   installs | cost_per_Watt   |
|:---------------------------|---------------------:|-------------------:|-----------:|:----------------|
| Trinity Solar              |                89.04 |              5.83  |         14 | $4.20           |
| SunBug Solar               |               105.13 |              7.4   |         13 | $3.88           |
| Great Sky Solar            |                75.55 |              8.76  |          9 | $3.02           |
| Boston Solar Company       |                61.34 |              6.55  |          8 | $4.42           |
| Vivint Solar               |                47.38 |              4.55  |          7 | $3.46           |
| Sunrun Inc.                |                28.33 |              4.95  |          5 | $3.53           |
| SolarCity DBA Tesla Energy |                34.62 |              7.86  |          4 | $3.70           |
| SolarFlair Energy, Inc.    |                24.06 |              5.025 |          4 | $3.30           |
| NuWatt Energy LLC          |                11.05 |              3.58  |          3 | $3.85           |
| ReVision Energy            |                22.47 |              7.4   |          3 | $3.28           |


In [24]:
pts [ 'install_yr' ] = pts [ 'effective_date' ] . dt . strftime( '%Y' )

mask = pts [ 'type' ] . str . contains ( 'Residential|residential' )

evolution = pts [ mask ] . groupby ( 'install_yr' ) . agg (
    {
        "kW" : [ sum , np . median , min , max ] ,
        "install_yr" : len ,
        "perWatt" : [ np . median , min , max ] ,
        "est_annual_kWh" : sum
    }
)

evolution . columns = ['kW_sum'         ,
                       'kW_median'      ,
                       'kW_min'         ,
                       'kW_max'         ,
                       'installs'       ,
                       'perWatt_median' ,
                       'perWatt_min'    ,
                       'perWatt_max'    ,
                       'est_annual_kWh'
                      ]


%matplotlib widget

fig, ( ax1 ) = plt . subplots ( ncols = 1 )


x = np . array ( evolution [ 'est_annual_kWh' ] )

ax1 . plot ( 
    evolution [ 'est_annual_kWh' ] . cumsum ( ) / 1e6 , 
    label = 'Cumulative Energy Generated' , 
    lw = 3
)
ax1 . set_title  ( 'MA Residential Solar PV Systems\nAdded Capacity' )
ax1 . set_xlabel ( 'Year Installed' , fontsize = 12 )
ax1 . set_ylabel ( 'Annual GWh Generated (est.)' , fontsize = 12 )

ax = ( evolution [ 'kW_sum' ] / 1000 ) . plot . bar ( 
    secondary_y = True , 
    color = 'g' , 
    label = 'New Power Capacity' ,
    rot = 45
)

ax . set_ylabel ( 'MW', fontsize = 12 )
ax . set_xlabel ( 'Year Installed', fontsize = 12 )

h1, l1 = ax1 . get_legend_handles_labels ( )
h2, l2 = ax  . get_legend_handles_labels ( )


plt . legend ( h1 + h2 , l1 + l2, loc = 'center left' )
plt . show ( )

[Figure: MA Residential Solar PV Systems, Added Capacity. Line: cumulative estimated annual GWh; bars: new MW capacity; x-axis: year installed.]

In [23]:
%matplotlib widget

fig, ( ax1 ) = plt . subplots ( ncols = 1 )

##note 150 installs have no cost## 432 installs with outlier costs

mask = ( pts [ 'type' ] . str . contains ( 'Residential' ) ) &\
       ( pts [ 'effective_date' ] >= '2018-01-01' )           &\
       ( pts [ 'effective_date' ] <= '2021-12-31' )            &\
       ( pts [ 'cost' ] > 0 )                                   &\
       ( pts [ 'perWatt' ] < 10 ) 


x = np . array ( pts [ mask ] [ 'perWatt' ] )
# filtered = x [ ~is_outlier ( x , 3 ) ]

ax1 . hist ( x , bins = 30 )
ax1 . set_title  ( 'Residential Solar PV Systems\nCost Per Watt Installed\nSince 2018' )
ax1 . set_xlabel ( '' )
ax1 . set_ylabel ( 'Systems' )

fmt   =  '${x:,.2f}'
tick  =  mtick . StrMethodFormatter ( fmt )
ax1 . xaxis . set_major_formatter ( tick )

textstr = '\n'.join((
    r'$\mathrm{average}=\$%.2f$' %(x.mean(),),
    r'$\mathrm{median}=\$%.2f$' %(np.median(x),),
    r'$\sigma=\$%.2f$' %(x.std(),)
))

props = dict(boxstyle='round', facecolor='wheat', alpha = 0.5)
ax1.text(.485, .85, textstr, transform=ax1.transAxes, fontsize=14, verticalalignment='top', bbox = props)

plt.show()

[Figure: histogram of Residential Solar PV Systems, Cost Per Watt Installed, Since 2018. y-axis: systems.]

## Commercial Renewable Assets, out of state PVs

aps, rps1, class2 and waste are all commercial(-like) generators:

* aps = mostly cogeneration (CHP) facilities at universities and hospitals, plus some fuel cells, flywheels and waste incinerators
* class2 = mostly out-of-state hydroelectric, some landfill gas, a couple of wind
* rps1 = mostly out-of-state PV, PV farms and wind, plus some other fuels
* waste = municipal trash incinerators


In [14]:
def combine_commercial ( docs ) :

    frames = [ ]

    for program in ['aps' , 'rps1' , 'class2' , 'waste' ] :

    #     df = docs [ program ] . groupby ( [ 'state' , 'fuel' ] ) . agg ( { "MW" : [ len , sum ] } ) . reset_index ( )
        ## count systems and total MW per fuel type within each program
        df = docs [ program ] . groupby ( [ 'fuel' ] ) . agg ( { "MW" : [ len , sum ] } ) . reset_index ( )
        df . columns = [ 'fuel' , 'systems' , 'MW']
        df [ 'program' ] = program
        frames . append ( df )

    ## DataFrame.append was removed in pandas 2.0; concatenate once instead
    all_df = pd . concat ( frames )

    return all_df . sort_values ( [ 'MW' ] , ascending = False ) . reset_index ( drop = True )

commercial = combine_commercial ( docs )

In [15]:
commercial

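|    | fuel                  |   systems |          MW | program   |
|---:|:----------------------|----------:|------------:|:----------|
|  0 | Wind                  |       133 | 4349.9253   | rps1      |
|  1 | Photovoltaic          |      8971 | 1649.693935 | rps1      |
|  2 | Natural Gas/CHP       |        96 |  499.888    | aps       |
|  3 | Hydroelectric         |        57 |  436.3067   | rps1      |
|  4 | Hydroelectric         |       150 |  367.6787   | class2    |
|  5 | Waste to Energy       |         7 |  283.545    | waste     |
|  6 | Landfill Gas          |        62 |  258.783    | rps1      |
|  7 | Anaerobic Digester    |        30 |   55.875    | rps1      |
|  8 | Landfill Gas          |         8 |   20.165    | class2    |
|  9 | Natural Gas/Fuel Cell |        19 |    8.653863 | aps       |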

In [17]:
mask = docs['rps1'].fuel=='Photovoltaic'
#mask = docs['rps1'].fuel=='Wind'
foo=docs['rps1'][mask]
foo = foo[['state','type']].groupby('state').count().sort_values('type',ascending=False).reset_index()
foo.columns = ['state','PV systems in RPS 1']
foo

|    | state   |   PV systems in RPS 1 |
|---:|:--------|----------------------:|
|  0 | NH      |                  4415 |
|  1 | CT      |                  2441 |
|  2 | ME      |                   676 |
|  3 | VT      |                   528 |
|  4 | MA      |                   494 |
|  5 | RI      |                   410 |
|  6 | NY      |                     7 |
