# Gridded EPA Methane Inventory
## Category: 1B2ab Abandoned Oil and Gas Wells

***
#### Authors: 
Erin E. McDuffie, Bram Maasakkers
#### Date Last Updated: 
see Step 0
#### Notebook Purpose: 
This Notebook calculates and reports annual gridded (0.1°x0.1°) methane emission fluxes (molec./cm2/s) from Abandoned Oil and Gas Wells in the CONUS region between 2012-2018. 
#### Summary & Notes:
EPA GHGI emissions from Abandoned Oil and Gas wells (AOG) are read in at the national level. Unlike other sources in the GEPA, the national AOG emissions are first disaggregated into emissions at the regional level (non-Appalachia vs. Appalachia), as a function of well type and plugging status, using regional well counts, plugging statuses, and regional emission factors available in the GHGI workbook. Regional emissions are then allocated to the state level (as a function of well type and status) using state-level counts of abandoned wells from the national GHGI workbook (includes Enverus & historical ‘missing’ wells population). The resulting state-level emissions are then distributed onto a 0.1°x0.1° grid using a map of grid-level AOG well locations, based on the relative counts of total abandoned oil and gas wells in each state. Note that emissions from both plugged and unplugged wells are allocated using the same proxy (differentiated by oil vs. gas wells), as we do not have well-level time series information on the plugging status of each individual well. Emissions are converted to annual emission fluxes (molec./cm2/s) and written to final netCDFs in the ‘/code/Final_Gridded_Data/’ folder.
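
The allocation chain described above (region to state to grid) can be illustrated with a minimal sketch. All numbers and variable names here are hypothetical and cover a single year and a single well type/status; the notebook itself works with full state-by-year arrays.

```python
import numpy as np

# Hypothetical inputs for one region, one year, one well type/status
regional_emis = 10.0                           # regional emissions (arbitrary units)
state_wells = np.array([300.0, 100.0, 0.0])    # abandoned well counts in three states

# Region -> state: proportional to each state's share of the regional well count
state_emis = regional_emis * state_wells / state_wells.sum()

# State -> grid: proportional to each cell's share of that state's wells
# (toy 2x2 map of well counts for the first state)
grid_wells_state0 = np.array([[200.0, 50.0],
                              [50.0, 0.0]])
grid_emis_state0 = state_emis[0] * grid_wells_state0 / grid_wells_state0.sum()

print(state_emis)
print(grid_emis_state0)
```

Because each step only rescales by a share that sums to one, the regional total is preserved at the state level and each state total is preserved on the grid.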
***

-------
## Step 0. Set-Up Notebook Modules, Functions, and Local Parameters and Constants
_____

In [None]:
#Confirm working directory & print last update time
import os
import time
modtime = os.path.getmtime('./1B2ab_Abandoned_Oil_Gas.ipynb')
modificationTime = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(modtime))
print("This file was last modified on: ", modificationTime)
print('')
print("The directory we are working in is {}" .format(os.getcwd()))

In [None]:
## Include plots within notebook
%matplotlib inline

In [None]:
# Import base modules
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import re
import pyodbc
import PyPDF2 as pypdf
import tabula as tb
import shapefile as shp
from datetime import datetime
from copy import copy
from scipy.interpolate import interp1d

# Import additional modules
# Load plotting package Basemap 
# (must also specify project library path [unique to each user])
from mpl_toolkits.basemap import Basemap

# Load netCDF (for manipulating netCDF file types)
from netCDF4 import Dataset

# Set up ticker
#import matplotlib.ticker as ticker

#add path for the global function module (file)
import sys
module_path = os.path.abspath(os.path.join('../Global_Functions/'))
#print(module_path)
if module_path not in sys.path:
    sys.path.append(module_path)

# Load functions
import data_load_functions as data_load_fn
import data_functions as data_fn
import data_IO_functions as data_IO_fn
import data_plot_functions as data_plot_fn

In [None]:
#INPUT Files
# Assign global file names
global_filenames = data_load_fn.load_global_file_names()
State_ANSI_inputfile = global_filenames[0]
#County_ANSI_inputfile = global_filenames[1]
#pop_map_inputfile = global_filenames[2]
Grid_area01_inputfile = global_filenames[3]
Grid_area001_inputfile = global_filenames[4]
Grid_state001_ansi_inputfile = global_filenames[5]
#Grid_county001_ansi_inputfile = global_filenames[6]
globalinputlocation = global_filenames[0][0:20]
print(globalinputlocation)

# EPA Inventory Data
EPA_AOG_inputfile = globalinputlocation+'GHGI/Ch3_Energy/FINAL Abandoned Wells Supporting Calcs_2019-12-03.xlsx'

#proxy mapping file
AOG_Mapping_inputfile = './InputData/Abandoned_OGwells_ProxyMapping.xlsx'

#ERG Processed Well Count Notebook
#ERG_statewellcounts_inputfile = globalinputlocation+'Enverus DrillingInfo Processing - Well Counts_2021-03-17.xlsx'
#Activity Data - raw Enverus data (2019 data pull)
Enverus_AOG_well_inputfile = globalinputlocation+'Enverus/AOG/DIDSK_HEADERS_API10_2019_abandoned_wells.csv'

#NEI Data
NEI_grid_ref_inputfile = globalinputlocation+'Gridded/NEI_Reference_Grid_LCC_to_WGS84_latlon.shp'
ERG_NEI_inputloc = globalinputlocation+'NEI/ERG_ILINData/CONUS_SA_FILES_'
ERG_NEI_inputloc_2018 = globalinputlocation+'NEI/ERG_ILINData/IL_IN_ALLOCATED_WELL_LEVEL_DATA_2018_2019/IL_IN_WELL_LEVEL_DATA.accdb'

#OUTPUT FILES
gridded_outputfile = '../Final_Gridded_Data/EPA_v2_1B2ab_Abandoned_Oil_Gas.nc'
#gridded_monthly_outputfile = '../Final_Gridded_Data/EPA_v2_1B2b_Natural_Gas_Transmission_Month.nc'
netCDF_description = 'Gridded EPA Inventory - Abandoned Oil and Gas Well Emissions - IPCC Source Category 1B2ab'
title_str = "EPA methane emissions from abandoned oil and gas wells"
title_diff_str = "Emissions from abandoned oil and gas wells difference: 2018-2012"

#output gridded proxy data
grid_emi_outputfile = '../Final_Gridded_Data/Extension/v2_input_data/AOG_Grid_Emi.nc'

In [None]:
#SPECIFY RECALCS

# Re-Calculate = 1, Load from /IntermediateOutput folder = 0
ReCalc_NEI = 0
ReCalc_Env = 0

In [None]:
# Define local variables
start_year = 2012  #First year in emission timeseries
end_year = 2018    #Last year in emission timeseries
year_range = [*range(start_year, end_year+1,1)] #List of emission years
year_range_str=[str(i) for i in year_range]
num_years = len(year_range)
num_inv_years = len([*range(1990, end_year+1,1)]) #List of inventory years

# Define constants
Avogadro   = 6.02214129 * 10**(23)  #molecules/mol
Molarch4   = 16.04                  #g/mol
Res01      = 0.1                    # degrees
Res_01     = 0.01                   # degrees
hrs_to_yrs = 8760                   #number of hours in a year
g_to_mt    = 1*10**(-6)             # grams to metric ton

# Continental US Lat/Lon Limits (for netCDF files)
Lon_left = -130       #deg
Lon_right = -60       #deg
Lat_low  = 20         #deg
Lat_up  = 55          #deg
loc_dimensions = [Lat_low, Lat_up, Lon_left, Lon_right]

ilat_start = int((90+Lat_low)/Res01) #1100:1450 (continental US range)
ilat_end = int((90+Lat_up)/Res01)
ilon_start = abs(int((-180-Lon_left)/Res01)) #500:1200 (continental US range)
ilon_end = abs(int((-180-Lon_right)/Res01))

# Number of days in each month
month_day_leap  = [  31,  29,  31,  30,  31,  30,  31,  31,  30,  31,  30,  31]
month_day_nonleap = [  31,  28,  31,  30,  31,  30,  31,  31,  30,  31,  30,  31]
month_tag = ['01','02','03','04','05','06','07','08','09','10','11','12']
month_dict = {'January':1, 'February':2,'March':3,'April':4,'May':5,'June':6, 'July':7,'August':8,'September':9,'October':10,\
             'November':11,'December':12}

# Month arrays
month_range_str = ['January','February','March','April','May','June','July','August','September','October','November','December']
num_months = len(month_range_str)

num_regions = 2
appalachia_states = ['OH','PA','WV','NY','KY','TN']
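
As a rough illustration of how the constants above enter the final unit conversion, the sketch below converts an annual emission into a flux in molec./cm2/s. It assumes the emission is expressed in kt CH4 per year, which is not stated in this section; the function name `kt_per_year_to_flux` and the input values are hypothetical.

```python
AVOGADRO = 6.02214129e23   # molecules/mol (same value as Avogadro above)
MOLAR_CH4 = 16.04          # g/mol (same value as Molarch4 above)

def kt_per_year_to_flux(emis_kt, area_cm2, n_days=365):
    """Convert an annual CH4 emission (assumed kt/yr) to molec./cm2/s."""
    seconds = n_days * 24 * 60 * 60
    molecules = emis_kt * 1e9 / MOLAR_CH4 * AVOGADRO   # 1 kt = 1e9 g
    return molecules / seconds / area_cm2

# Hypothetical example: 1 kt/yr emitted from a 100 km2 grid cell (1 km2 = 1e10 cm2)
flux = kt_per_year_to_flux(1.0, 100.0 * 1e10)
print(f"{flux:.3e} molec./cm2/s")
```

Passing `n_days=366` would account for leap years in the same way the `month_day_leap` list does for monthly data.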

In [None]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999;
//prevent auto-scrolling

In [None]:
# Track run time
ct = datetime.now() 
it = ct.timestamp() 
print("current time:", ct) 

____
## Step 1. Load in State ANSI data, and Area Maps
_____

In [None]:
# State-level ANSI Data
#Read the state ANSI file array
State_ANSI, name_dict, abbr_dict = data_load_fn.load_state_ansi(State_ANSI_inputfile)[0:3]
#QA: number of states
print('Read input file: '+ f"{State_ANSI_inputfile}")
print('Total "States" found: ' + '%.0f' % len(State_ANSI))
print(' ')

# 0.01 x0.01 degree Data
# State ANSI IDs and grid cell area (m2) maps
state_ANSI_map = data_load_fn.load_state_ansi_map(Grid_state001_ansi_inputfile)
area_map, lat001, lon001 = data_load_fn.load_area_map_001(Grid_area001_inputfile)

# 0.1 x0.1 degree data
# grid cell area and state ANSI maps
Lat01, Lon01 = data_load_fn.load_area_map_01(Grid_area01_inputfile)[1:3]
#Select relevant Continental 0.1 x0.1 domain
Lat_01 = Lat01[ilat_start:ilat_end]
Lon_01 = Lon01[ilon_start:ilon_end]
area_matrix_01 = data_fn.regrid001_to_01(area_map, Lat_01, Lon_01)
area_matrix_01 *= 10000  #convert from m2 to cm2
#state_ANSI_map_01 = data_fn.regrid001_to_01(state_ANSI_map, Lat_01, Lon_01)
del area_map#, lat001, lon001, global_filenames

# Print time
ct = datetime.now() 
print("current time:", ct) 
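
The implementation of `data_fn.regrid001_to_01` is not shown in this notebook. For an additive quantity such as grid cell area, one plausible way to aggregate a 0.01° map to 0.1° is to sum non-overlapping 10x10 blocks. The sketch below shows that idea only; `block_sum` is a hypothetical helper and may differ from what the project function actually does.

```python
import numpy as np

def block_sum(fine, factor=10):
    """Sum non-overlapping factor x factor blocks of a 2D array."""
    nlat, nlon = fine.shape
    if nlat % factor or nlon % factor:
        raise ValueError("array dimensions must be multiples of factor")
    # Split each axis into (coarse index, position within block), then sum within blocks
    return fine.reshape(nlat // factor, factor, nlon // factor, factor).sum(axis=(1, 3))

# Toy 0.01-degree "area" map where every fine cell has area 1 m2
fine_area_m2 = np.ones((20, 30))
coarse_area_m2 = block_sum(fine_area_m2)
coarse_area_cm2 = coarse_area_m2 * 10000   # m2 -> cm2, as in the cell above
```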

-------------
## Step 2: Read-in and Format Proxy Data
-------------

### Step 2.1 Read In Proxy Mapping File & Make Proxy Arrays

#### Step 2.1.1 Format Proxy Group Arrays

In [None]:
#load GHGI Mapping Groups
names = pd.read_excel(AOG_Mapping_inputfile, sheet_name = "GHGI Map - AOG", usecols = "A:B",skiprows = 1, header = 0)
colnames = names.columns.values
ghgi_aog_map = pd.read_excel(AOG_Mapping_inputfile, sheet_name = "GHGI Map - AOG", usecols = "A:B", skiprows = 2, names = colnames)
#drop rows with no data, remove the parentheses and '+' characters
ghgi_aog_map = ghgi_aog_map[ghgi_aog_map['GHGI_Emi_Group'] != 'na']
ghgi_aog_map = ghgi_aog_map[ghgi_aog_map['GHGI_Emi_Group'].notna()]
ghgi_aog_map = ghgi_aog_map[ghgi_aog_map['GHGI_Emi_Group'] != '-']
#use literal (non-regex) replacement so the result does not depend on the pandas default for 'regex'
ghgi_aog_map['GHGI_Source']= ghgi_aog_map['GHGI_Source'].str.replace("(","", regex=False)
ghgi_aog_map['GHGI_Source']= ghgi_aog_map['GHGI_Source'].str.replace(")","", regex=False)
ghgi_aog_map['GHGI_Source']= ghgi_aog_map['GHGI_Source'].str.replace("+","", regex=False)
ghgi_aog_map.reset_index(inplace=True, drop=True)
display(ghgi_aog_map)

#load emission group - proxy map
names = pd.read_excel(AOG_Mapping_inputfile, sheet_name = "Proxy Map - AOG", usecols = "A:E",skiprows = 1, header = 0)
colnames = names.columns.values
proxy_aog_map = pd.read_excel(AOG_Mapping_inputfile, sheet_name = "Proxy Map - AOG", usecols = "A:E", skiprows = 1, names = colnames)
display((proxy_aog_map))

#create empty proxy and emission group arrays (for state and months, where needed)
for igroup in np.arange(0,len(proxy_aog_map)):
    if proxy_aog_map.loc[igroup, 'Grid_Month_Flag'] ==0:
        vars()[proxy_aog_map.loc[igroup,'Proxy_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years])
        vars()[proxy_aog_map.loc[igroup,'Proxy_Group']+'_nongrid'] = np.zeros([num_years])
    else:
        vars()[proxy_aog_map.loc[igroup,'Proxy_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years,num_months])
        vars()[proxy_aog_map.loc[igroup,'Proxy_Group']+'_nongrid'] = np.zeros([num_years,num_months])
        
    vars()[proxy_aog_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([num_years])
    
    if proxy_aog_map.loc[igroup,'State_Proxy_Group'] != '-':
        if proxy_aog_map.loc[igroup,'State_Month_Flag'] == 0:
            vars()[proxy_aog_map.loc[igroup,'State_Proxy_Group']] = np.zeros([len(State_ANSI),num_years])
        else:
            vars()[proxy_aog_map.loc[igroup,'State_Proxy_Group']] = np.zeros([len(State_ANSI),num_years,num_months])
    else:
        continue # do not make state proxy variable if no variable assigned in mapping file
        
emi_group_names = np.unique(ghgi_aog_map['GHGI_Emi_Group'])

print('QA/QC: Is the number of emission groups the same for the proxy and emissions tabs?')
if (len(emi_group_names) == len(np.unique(proxy_aog_map['GHGI_Emi_Group']))):
    print('PASS')
else:
    print('FAIL')
    print(emi_group_names)
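
The cell above creates one variable per proxy group through `vars()`. As an alternative sketch (not the approach used in this notebook), the same empty arrays could be held in a dictionary keyed by group name, which keeps them out of the global namespace. The group names and dimensions below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row excerpt of a proxy mapping table
proxy_map = pd.DataFrame({
    "Proxy_Group": ["Map_Abd_NG_Wells", "Map_Abd_Oil_Wells"],
    "Grid_Month_Flag": [0, 1],
})
nlat, nlon, num_years, num_months = 4, 5, 7, 12

proxy_arrays = {}
for row in proxy_map.itertuples(index=False):
    shape = (nlat, nlon, num_years)
    if row.Grid_Month_Flag != 0:
        shape += (num_months,)          # add a month axis only where flagged
    proxy_arrays[row.Proxy_Group] = np.zeros(shape)

for name, arr in proxy_arrays.items():
    print(name, arr.shape)
```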

## Read In Proxy Data

#### Step 2.2 Read in State-Level Well Counts from ERG Abandoned Wells Inventory Notebook

##### Step 2.2.1 Read in 1990 and 2015 State well count values

In [None]:
# Read In State-Level Well Counts from Workbook

# As of the 2020 GHGI... 
# State Level Well Counts by well type are only available for 1990 and 2015. 
# Interpolate for all other years and hold 2015 values constant to 2018. Then calculate the fraction in each
# state for each year.

EPA_AOG_state_wells_1990 = pd.read_excel(EPA_AOG_inputfile, sheet_name = "GHGI Method Dev - StateLevel AD", usecols = "A,W:X", skiprows = 6, nrows = 50)
EPA_AOG_state_wells_1990.rename(columns={EPA_AOG_state_wells_1990.columns[0]:'State'}, inplace=True)
EPA_AOG_state_wells_1990.rename(columns={EPA_AOG_state_wells_1990.columns[1]:'NG_wells'}, inplace=True)
EPA_AOG_state_wells_1990.rename(columns={EPA_AOG_state_wells_1990.columns[2]:'Petr_wells'}, inplace=True)

EPA_AOG_state_wells_2015 = pd.read_excel(EPA_AOG_inputfile, sheet_name = "GHGI Method Dev - StateLevel AD", usecols = "A,AF:AG", skiprows = 6, nrows = 50)
EPA_AOG_state_wells_2015.rename(columns={EPA_AOG_state_wells_2015.columns[0]:'State'}, inplace=True)
EPA_AOG_state_wells_2015.rename(columns={EPA_AOG_state_wells_2015.columns[1]:'NG_wells'}, inplace=True)
EPA_AOG_state_wells_2015.rename(columns={EPA_AOG_state_wells_2015.columns[2]:'Petr_wells'}, inplace=True)

display(EPA_AOG_state_wells_1990)
display(EPA_AOG_state_wells_2015)

##### Step 2.2.2 Make Timeseries Arrays of State-level NG and Petr well counts

In [None]:
# Interpolate for all other years and hold 2015 values constant to latest year. 
# These counts account for the historical adjustment factor (calculated at the state level in the GHGI workbook)
# as well as the adjustment factor for DI data (though the 1975 DI data are from a 2018 data pull)
# Note that these counts do not represent the national total counts in the GHGI, but represent the most recent 
# information we have on the relative amount of abandoned oil and gas wells in each state for each year. 
# To update this work, we need:
#     1) updated state level counts of abandoned oil and gas wells in 1975 from DI
#     2) updated state level counts of abandoned oil and gas wells across the entire timeseries from DI



state_ng_wells = np.zeros([len(State_ANSI), num_inv_years])
state_petr_wells = np.zeros([len(State_ANSI), num_inv_years])

start_year_idx = 0
idx_2015 = 2015-1990

for istate in np.arange(0,len(EPA_AOG_state_wells_1990)):
    match_state = np.where(State_ANSI['abbr'] == EPA_AOG_state_wells_1990['State'][istate])[0][0]
    state_ng_wells[match_state,0] = EPA_AOG_state_wells_1990['NG_wells'][istate]
    state_ng_wells[match_state,idx_2015] = EPA_AOG_state_wells_2015['NG_wells'][istate]
    state_petr_wells[match_state,0] = EPA_AOG_state_wells_1990['Petr_wells'][istate]
    state_petr_wells[match_state,idx_2015] = EPA_AOG_state_wells_2015['Petr_wells'][istate]

# Interpolate to fill missing years
for istate in np.arange(0,len(state_ng_wells)):
    ng_wells_temp = state_ng_wells[istate][:]  
    ng_wells_temp[idx_2015:] = ng_wells_temp[idx_2015]      #extend 2015 data to the most recent year 
    ng_wells_temp = pd.Series(ng_wells_temp)  
    ng_wells_temp.replace(0,np.nan, inplace=True)          #np.nan (the np.NaN alias was removed in NumPy 2.0)
    ng_wells_temp = ng_wells_temp.interpolate().values
    ng_wells_temp = np.nan_to_num(ng_wells_temp)
    state_ng_wells[istate][:] = ng_wells_temp
    
    petr_wells_temp = state_petr_wells[istate][:]     
    petr_wells_temp[idx_2015:] = petr_wells_temp[idx_2015]      #extend 2015 data to the most recent year 
    petr_wells_temp = pd.Series(petr_wells_temp)  
    petr_wells_temp.replace(0,np.nan, inplace=True)        #np.nan (the np.NaN alias was removed in NumPy 2.0)
    petr_wells_temp = petr_wells_temp.interpolate().values
    petr_wells_temp = np.nan_to_num(petr_wells_temp)
    state_petr_wells[istate][:] = petr_wells_temp
    

#Reduce arrays to relevant years
state_ng_wells = state_ng_wells[:,start_year-1990:end_year-1990+1]
state_petr_wells = state_petr_wells[:,start_year-1990:end_year-1990+1]
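
For a single state, the "interpolate between 1990 and 2015, then hold 2015 constant" rule can be sketched with `np.interp`, which holds the end values constant outside the supplied range. The well counts are hypothetical.

```python
import numpy as np

inv_years = np.arange(1990, 2019)          # inventory years 1990-2018
count_1990, count_2015 = 100.0, 350.0      # hypothetical state well counts

# Linear between 1990 and 2015; values after 2015 are held at the 2015 count
wells = np.interp(inv_years, [1990, 2015], [count_1990, count_2015])

# Keep only the emission years 2012-2018
wells_2012_2018 = wells[2012 - 1990:]
print(wells_2012_2018)
```

Note that the cell above treats a zero count as missing before interpolating, so the two approaches can differ for states with a zero count in 1990.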

#### Step 2.2.3. Read In State-Level Well Plugging Status Fractions

##### Step 2.2.3.1. Read In State-Level Fractions for 2016-2018

In [None]:
#For years with available DI Data (2016-2018), need to calculate fraction plugged vs. unplugged in each state
# Note: Assume that all wells are unplugged if there is no plugging status in Enverus for a particular state

# Since state level well counts by plugging type are only available for 2016-2019, interpolate for other years assuming
# zero percent plugged in 1950.
# This is how the current inventory also does this calculation
# Ideally, all wells by type, plugging status, and state would be read from the raw Enverus data

#Data are in units of number of wells of each different status type

state_plugged_frac_well = np.zeros([len(State_ANSI), num_inv_years])

start_year_idx = 0
idx_2016 = 2016-1990

#2016
EPA_AOG_state_well_status_2016 = pd.read_excel(EPA_AOG_inputfile, sheet_name = "Yr 2016 DI Status", usecols = "K:AV", nrows = 10)
EPA_AOG_state_well_status_2016 = EPA_AOG_state_well_status_2016.drop(columns = ['Townsend-Small','%','Cum %'])

for istate in np.arange(0,len(State_ANSI)):
    match_state_abbr = State_ANSI['abbr'][istate]
    col_names = EPA_AOG_state_well_status_2016.columns
    if match_state_abbr in col_names:
        temp_plugged = np.sum(EPA_AOG_state_well_status_2016.loc[EPA_AOG_state_well_status_2016['EPA'] == 'Plugged',match_state_abbr])
        temp_unplugged = np.sum(EPA_AOG_state_well_status_2016.loc[EPA_AOG_state_well_status_2016['EPA'] == 'Unplugged',match_state_abbr])
        if np.isnan(temp_plugged):
            temp_plugged =0
        if np.isnan(temp_unplugged):   #checked independently, so both counts are guarded
            temp_unplugged=1
        state_plugged_frac_well[istate,idx_2016] = temp_plugged/(temp_plugged+temp_unplugged)
        #state_unplugged_frac_well[istate,idx_2016] = temp_unplugged/ (temp_plugged+temp_unplugged)
    else:
        state_plugged_frac_well[istate,idx_2016] = 0
        #state_unplugged_frac_well[istate,idx_2016] = 1

#*********
#2017 data
idx_2017 = 2017-1990
    
EPA_AOG_state_well_status_2017 = pd.read_excel(EPA_AOG_inputfile, sheet_name = "Yr 2017 DI Status", usecols = "K:AV", nrows = 11)
EPA_AOG_state_well_status_2017 = EPA_AOG_state_well_status_2017.drop(columns = ['Townsend-Small','%','Cum %'])

for istate in np.arange(0,len(State_ANSI)):
    match_state_abbr = State_ANSI['abbr'][istate]
    col_names = EPA_AOG_state_well_status_2017.columns
    if match_state_abbr in col_names:
        temp_plugged = np.sum(EPA_AOG_state_well_status_2017.loc[EPA_AOG_state_well_status_2017['EPA'] == 'Plugged',match_state_abbr])
        temp_unplugged = np.sum(EPA_AOG_state_well_status_2017.loc[EPA_AOG_state_well_status_2017['EPA'] == 'Unplugged',match_state_abbr])
        state_plugged_frac_well[istate,idx_2017] = temp_plugged/(temp_plugged+temp_unplugged)
    else:
        state_plugged_frac_well[istate,idx_2017] = 0
    
#*****
#2018 data
idx_2018 = 2018-1990
    
EPA_AOG_state_well_status_2018 = pd.read_excel(EPA_AOG_inputfile, sheet_name = "Yr 2018 DI Status", usecols = "B:D", skiprows = 7)
EPA_AOG_state_well_status_2018.rename(columns={EPA_AOG_state_well_status_2018.columns[0]:'State'}, inplace=True)
EPA_AOG_state_well_status_2018.rename(columns={EPA_AOG_state_well_status_2018.columns[1]:'Count'}, inplace=True)
EPA_AOG_state_well_status_2018.rename(columns={EPA_AOG_state_well_status_2018.columns[2]:'EPA Status'}, inplace=True)

state_names = EPA_AOG_state_well_status_2018['State'].drop_duplicates().to_list()
for istate in np.arange(0,len(State_ANSI)):
    match_state_abbr = State_ANSI['abbr'][istate]
    if match_state_abbr in state_names:
        temp_plugged = np.sum(EPA_AOG_state_well_status_2018.loc[(EPA_AOG_state_well_status_2018['State'] == match_state_abbr) & (EPA_AOG_state_well_status_2018['EPA Status'] == 'Plugged'),'Count'])
        temp_unplugged = np.sum(EPA_AOG_state_well_status_2018.loc[(EPA_AOG_state_well_status_2018['State'] == match_state_abbr) & (EPA_AOG_state_well_status_2018['EPA Status'] == 'Unplugged'),'Count'])
        if temp_plugged ==0 and temp_unplugged ==0:
            state_plugged_frac_well[istate,idx_2018] = 0
        else:
            state_plugged_frac_well[istate,idx_2018] = temp_plugged/(temp_plugged+temp_unplugged)
    else:
        state_plugged_frac_well[istate,idx_2018] = 0    

#**********
#(for future work)
#2019 data
#idx_2019 = 2019-start_year
    
#EPA_AOG_state_well_status_2019 = pd.read_excel(EPA_AOG_inputfile, sheet_name = "YR 2020 DI Status", usecols = "B,D,F", skiprows = 0)
#EPA_AOG_state_well_status_2019.rename(columns={EPA_AOG_state_well_status_2019.columns[0]:'State'}, inplace=True)
#EPA_AOG_state_well_status_2019.rename(columns={EPA_AOG_state_well_status_2019.columns[1]:'Count'}, inplace=True)
#EPA_AOG_state_well_status_2019.rename(columns={EPA_AOG_state_well_status_2019.columns[2]:'EPA Status'}, inplace=True)

#state_names = EPA_AOG_state_well_status_2019['State'].drop_duplicates().to_list()
#for istate in np.arange(0,len(State_ANSI)):
#    match_state_abbr = State_ANSI['abbr'][istate]
#    if match_state_abbr in state_names:
#        temp_plugged = np.sum(EPA_AOG_state_well_status_2019.loc[(EPA_AOG_state_well_status_2019['State'] == match_state_abbr) & (EPA_AOG_state_well_status_2019['EPA Status'] == 'Plugged'),'Count'])
#        temp_unplugged = np.sum(EPA_AOG_state_well_status_2019.loc[(EPA_AOG_state_well_status_2019['State'] == match_state_abbr) & (EPA_AOG_state_well_status_2019['EPA Status'] == 'Unplugged'),'Count'])
#         #NOTE: for 2019 - add in the wells from 2019 and assume all are unplugged (this was a correction made by ERG at the national level
#        # to account for changes in reported plugging status in the DI/Prism dataset)
#        #print(temp_unplugged)
#        temp_unplugged = temp_unplugged + EPA_AOG_state_wells_1990[EPA_AOG_state_wells_1990['State']==match_state_abbr].sum(axis=1).values[0]
#        #print(temp_unplugged)
#        if temp_plugged ==0 and temp_unplugged ==0:
#            aog_state_plugged_frac_well[istate,idx_2019] = 0
#            aog_state_unplugged_frac_well[istate,idx_2019] = 0 
#        else:
#            aog_state_plugged_frac_well[istate,idx_2019] = temp_plugged/(temp_plugged+temp_unplugged)
#            aog_state_unplugged_frac_well[istate,idx_2019] = temp_unplugged/ (temp_plugged+temp_unplugged)
#        #if match_state_abbr == 'TX':
#            #print(aog_state_plugged_frac_well[istate,idx_2019])
#            #print(aog_state_unplugged_frac_well[istate,idx_2019])
#            #print(aog_state_plugged_frac_well[istate,idx_2018])
#            #print(aog_state_unplugged_frac_well[istate,idx_2018])
#    else:
#        aog_state_plugged_frac_well[istate,idx_2019] = 0
#        aog_state_unplugged_frac_well[istate,idx_2019] = 0     
#aog_state_plugged_frac_well[:,idx_2019] =  aog_state_plugged_frac_well[:,idx_2018]
#aog_state_unplugged_frac_well[:,idx_2019] =  aog_state_unplugged_frac_well[:,idx_2018]
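
The per-state loop used for the 2018 status table can also be expressed as a single pivot. The sketch below uses a small hypothetical table with the same three columns (`State`, `Count`, `EPA Status`) and returns a plugged fraction of zero where a state has no counted wells, matching the assumption stated at the top of the cell.

```python
import pandas as pd

# Hypothetical status records in the same layout as the 2018 sheet
status = pd.DataFrame({
    "State": ["TX", "TX", "TX", "OK", "PA"],
    "Count": [60, 20, 20, 10, 0],
    "EPA Status": ["Plugged", "Unplugged", "Unplugged", "Unplugged", "Plugged"],
})

# Sum counts by state and status; missing state/status pairs become 0
counts = status.pivot_table(index="State", columns="EPA Status",
                            values="Count", aggfunc="sum", fill_value=0)

total = counts["Plugged"] + counts["Unplugged"]
# Mask zero totals to avoid 0/0, then treat those states as 0% plugged
plugged_frac = (counts["Plugged"] / total.where(total > 0)).fillna(0.0)
print(plugged_frac)
```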

##### Step 2.2.3.2. Make Timeseries Arrays of state-level plugging status fractions

In [None]:
#Interpolate for years between 1950 and 2016, then remove all years prior to 'start year'
# Following National Inventory approach, assume 0% of wells plugged in 1950. 

# interpolate between 1950 and 2016 values (assuming 0% plugged in 1950)
x = [1950,2016]
#print(idx_2016)
xnew = np.linspace(1990, 2015, num=(2016-1990), endpoint=True)        #inventory years 1990-2015 (years before the 2016 DI data)
for istate in np.arange(0, len(State_ANSI)):
    frac_plugged_wells_temp = state_plugged_frac_well[istate,:] 
    y = [0,frac_plugged_wells_temp[idx_2016]]                         # the % in 1950 and 2016
    f = interp1d(x,y)
    frac_plugged_wells_temp[:idx_2016] = f(xnew)                      #evaluate the interpolation for each time series year 
    state_plugged_frac_well[istate,:] = frac_plugged_wells_temp

#Reduce arrays to relevant years
state_plugged_frac_well = state_plugged_frac_well[:,start_year-1990:end_year-1990+1]

#print(np.shape(state_unplugged_frac_well))
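
Written out, the linear ramp applied in the cell above gives the plugged fraction $f_s(y)$ for state $s$ in inventory year $y$, assuming 0% plugged in 1950 and using the state's 2016 value $f_s(2016)$ as the anchor:

$$
f_s(y) = f_s(2016)\,\frac{y - 1950}{2016 - 1950}, \qquad 1990 \le y \le 2015
$$

As a hypothetical example, a state with 66% of wells plugged in 2016 would be assigned $0.66 \times 62/66 = 0.62$ for 2012.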

In [None]:
display(state_plugged_frac_well[:,:])

##### Step 2.2.4 Make Timeseries of Wells by State, plugging status, type, and region

In [None]:
# Split counts for appalachia and non-appalachia states as well as for NG vs Petr and plugged vs unplugged wells

app_mask = np.zeros(len(State_ANSI))
nonapp_mask = np.ones(len(State_ANSI))
state_ng_plugged_app = np.zeros([len(State_ANSI),num_years])
state_ng_unplugged_app = np.zeros([len(State_ANSI),num_years])
state_petr_plugged_app = np.zeros([len(State_ANSI),num_years])
state_petr_unplugged_app = np.zeros([len(State_ANSI),num_years])
state_ng_plugged_nonapp = np.zeros([len(State_ANSI),num_years])
state_ng_unplugged_nonapp = np.zeros([len(State_ANSI),num_years])
state_petr_plugged_nonapp = np.zeros([len(State_ANSI),num_years])
state_petr_unplugged_nonapp = np.zeros([len(State_ANSI),num_years])

# make mask arrays for app and non-app states
for istate in np.arange(0,len(State_ANSI)):
    if State_ANSI['abbr'][istate] in appalachia_states:
        app_mask[istate]=1
        nonapp_mask[istate]=0

# make timeseries
for iyear in np.arange(0,num_years):
    state_ng_plugged_app[:,iyear] = state_ng_wells[:,iyear] * state_plugged_frac_well[:,iyear] * app_mask
    state_ng_plugged_nonapp[:,iyear] = state_ng_wells[:,iyear] * state_plugged_frac_well[:,iyear] * nonapp_mask
    state_ng_unplugged_app[:,iyear] = state_ng_wells[:,iyear] * (1-state_plugged_frac_well[:,iyear]) * app_mask
    state_ng_unplugged_nonapp[:,iyear] = state_ng_wells[:,iyear] * (1-state_plugged_frac_well[:,iyear]) * nonapp_mask
    state_petr_plugged_app[:,iyear] = state_petr_wells[:,iyear] * state_plugged_frac_well[:,iyear] *app_mask
    state_petr_plugged_nonapp[:,iyear] = state_petr_wells[:,iyear] * state_plugged_frac_well[:,iyear] * nonapp_mask
    state_petr_unplugged_app[:,iyear] = state_petr_wells[:,iyear] * (1-state_plugged_frac_well[:,iyear]) * app_mask
    state_petr_unplugged_nonapp[:,iyear] = state_petr_wells[:,iyear] * (1-state_plugged_frac_well[:,iyear]) * nonapp_mask
#display(np.shape(state_ng_plugged_app))
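
The year loop above multiplies three arrays element by element, so the same split can be sketched without a loop by broadcasting the state mask across the year axis. The arrays below are hypothetical (two states, two years) and only the gas-well case is shown.

```python
import numpy as np

wells = np.array([[100.0, 110.0],        # states x years
                  [40.0, 50.0]])
plugged_frac = np.array([[0.5, 0.6],
                         [0.25, 0.3]])
app_mask = np.array([1.0, 0.0])          # first state is in Appalachia
nonapp_mask = 1.0 - app_mask

# [:, None] turns the per-state mask into a column so it applies to every year
plugged_app = wells * plugged_frac * app_mask[:, None]
unplugged_app = wells * (1 - plugged_frac) * app_mask[:, None]
plugged_nonapp = wells * plugged_frac * nonapp_mask[:, None]
unplugged_nonapp = wells * (1 - plugged_frac) * nonapp_mask[:, None]

# The four categories partition the original well counts
total = plugged_app + unplugged_app + plugged_nonapp + unplugged_nonapp
```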

#### Step 2.3. Make Grid Proxy

##### Step 2.3.1. Read in Enverus Abandoned well data (from ERG)

In [None]:
#Read In Raw Enverus Data from ERG

if ReCalc_Env ==1:
    #DI data
    DI_data = pd.read_csv(Enverus_AOG_well_inputfile,low_memory=False)
    DI_data = DI_data.drop(columns=['API10','COUNTYPARISH','CUM_GAS','CUM_OIL','WELL_STATUS','PRODUCTION_TYPE','ERG_WELL_TYPE','GOR'])
    DI_data.rename({'SURFACE_HOLE_LATITUDE_WGS84':'LAT','SURFACE_HOLE_LONGITUDE_WGS84':'LON'},axis=1, inplace=True)
    DI_data['LAST_PROD_DATE'] = DI_data['LAST_PROD_DATE'].astype(str)
    DI_data['COMPLETION_DATE'] = DI_data['COMPLETION_DATE'].astype(str)
    DI_data['SPUD_DATE'] = DI_data['SPUD_DATE'].astype(str)
    DI_data['OFFSHORE'] = DI_data['OFFSHORE'].astype(str)
    DI_data = DI_data[DI_data['OFFSHORE'] == 'N']
    DI_data.reset_index(inplace=True, drop=True)
    # make columns to reflect the last production, completion, and spud years
    DI_data['prod_year'] = DI_data['LAST_PROD_DATE'].str[:4]
    DI_data['comp_year'] = DI_data['COMPLETION_DATE'].str[:4]
    DI_data['spud_year'] = DI_data['SPUD_DATE'].str[:4]
    DI_data['prod_year'] = DI_data['prod_year'].astype(float)
    DI_data['comp_year'] = DI_data['comp_year'].astype(float)
    DI_data['spud_year'] = DI_data['spud_year'].astype(float)
    display(DI_data)

##### Step 2.3.2. Calculate the state-level fraction of abandoned gas to oil wells

In [None]:
#Calculate the state-level fraction of abandoned gas to oil wells
# To be applied to the 'DRY' well counts in the next step - this approach is also used to allocate the 
# dry well population to either oil or gas types in the national GHGI
# e.g., gas wells = gas wells + dry wells * (gas wells / (gas wells + oil wells))

if ReCalc_Env ==1:
    state_abd_gas_to_oil_ratio = np.zeros([len(State_ANSI),num_years])

    for iyear in np.arange(0, num_years):
        #abandoned wells by the given year
            print('Year', iyear, 'of',num_years)
            temp = DI_data[(DI_data['prod_year'] < year_range[iyear]) | \
               (((DI_data['prod_year'].isna()) & (DI_data['comp_year'] < year_range[iyear]))) |\
              ((DI_data['prod_year'].isna()) & (DI_data['comp_year'].isna()) & (DI_data['spud_year'] < year_range[iyear]))]
            #total_well_count[iyear] = np.sum(temp['PRODUCING_ENTITY_COUNT'])
        
            # Calculate the fraction of abandoned wells in the given year that are gas
            for istate in np.arange(0,len(State_ANSI)):
                state_ag_count = 0
                state_ao_count = 0
                temp2 = temp[(temp['STATE'] == State_ANSI['abbr'][istate])]
                temp3 = temp2[(temp2['ABANDONED_WELL_TYPE'] =='GAS')]
                state_ag_count += np.sum(temp3['PRODUCING_ENTITY_COUNT'])
                temp3 = temp2[(temp2['ABANDONED_WELL_TYPE'] =='OIL')]
                state_ao_count += np.sum(temp3['PRODUCING_ENTITY_COUNT'])
                state_abd_gas_to_oil_ratio[istate,iyear] = data_fn.safe_div(state_ag_count,(state_ag_count+state_ao_count))
                #print(state_ag_count, state_ao_count,state_abd_gas_to_oil_ratio[istate,iyear]) 
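
The abandonment filter and the gas fraction used above can be sketched compactly with pandas. Taking the first available of last-production, completion, and spud year reproduces the three-part condition in the cell; the well records below are hypothetical and only reuse the column names of `DI_data`.

```python
import numpy as np
import pandas as pd

wells = pd.DataFrame({
    "STATE": ["TX", "TX", "TX", "OK", "OK"],
    "ABANDONED_WELL_TYPE": ["GAS", "OIL", "DRY", "OIL", "GAS"],
    "PRODUCING_ENTITY_COUNT": [3, 1, 5, 2, 4],
    "prod_year": [2005.0, np.nan, np.nan, 2015.0, np.nan],
    "comp_year": [2000.0, 1998.0, np.nan, 2010.0, np.nan],
    "spud_year": [1999.0, 1997.0, 1980.0, 2009.0, np.nan],
})
year = 2012

# First available of: last production year, completion year, spud year
last_activity = wells["prod_year"].fillna(wells["comp_year"]).fillna(wells["spud_year"])
abandoned = wells[last_activity < year]   # rows with no dates at all are excluded

# Gas and oil entity counts per state
counts = (
    abandoned[abandoned["ABANDONED_WELL_TYPE"].isin(["GAS", "OIL"])]
    .groupby(["STATE", "ABANDONED_WELL_TYPE"])["PRODUCING_ENTITY_COUNT"]
    .sum()
    .unstack(fill_value=0)
    .reindex(columns=["GAS", "OIL"], fill_value=0)
)
gas_frac = counts["GAS"] / (counts["GAS"] + counts["OIL"])

# Allocate a state's dry wells to gas and oil using that fraction
is_tx_dry = (abandoned["STATE"] == "TX") & (abandoned["ABANDONED_WELL_TYPE"] == "DRY")
dry_tx = abandoned.loc[is_tx_dry, "PRODUCING_ENTITY_COUNT"].sum()
dry_as_gas = dry_tx * gas_frac["TX"]
dry_as_oil = dry_tx * (1 - gas_frac["TX"])
```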


##### Step 2.3.3. Make CONUS Grid Array

In [None]:
#for each year, calculate the total number of wells that were abandoned by that year

# Logic:
# If the well is onshore, extract the year of last production, completion, or spud date 
# Following the same logic as the GHGI, count the well as being abandoned if its last
# production date was before the current year, or if the completion date is before the
# current year (if the last prod date is missing), or if the spud date is before the
# current year (if both the last prod date and completion date are missing)
# Then add the relevant number of wells to the map if the location of that well is in the CONUS region
# allocate dry wells based on the fraction of gas to oil wells in the given state

if ReCalc_Env ==1:
    map_agas_wells = np.zeros([len(lat001),len(lon001),num_years])
    map_aoil_wells = np.zeros([len(lat001),len(lon001),num_years])
    total_well_count = np.zeros(num_years)
    ongrid_well_count = np.zeros(num_years)
    nongrid_well_count = np.zeros(num_years)
    
    for iyear in np.arange(0,num_years):
        #abandoned wells by the given year
        print('Year', iyear, 'of',num_years)
        temp = DI_data[(DI_data['prod_year'] < year_range[iyear]) | \
               (((DI_data['prod_year'].isna()) & (DI_data['comp_year'] < year_range[iyear]))) |\
              ((DI_data['prod_year'].isna()) & (DI_data['comp_year'].isna()) & (DI_data['spud_year'] < year_range[iyear]))]
        total_well_count[iyear] = np.sum(temp['PRODUCING_ENTITY_COUNT'])
        #subset on the CONUS grid
        temp2 = temp[(temp['LON'] > Lon_left) & (temp['LON'] < Lon_right) & \
            (temp['LAT'] > Lat_low) & (temp['LAT'] < Lat_up)]
        temp2.reset_index(inplace=True, drop=True)
        #subset off the CONUS grid
        temp3 = temp[~((temp['LON'] > Lon_left) & (temp['LON'] < Lon_right) & \
            (temp['LAT'] > Lat_low) & (temp['LAT'] < Lat_up))]
        nongrid_well_count[iyear] = (np.sum(temp3['PRODUCING_ENTITY_COUNT']))
    
        for iwell in np.arange(0, len(temp2)):
            ilat = int((temp2['LAT'][iwell] - Lat_low)/Res_01)
            ilon = int((temp2['LON'][iwell] - Lon_left)/Res_01)
            istate = np.where(temp2['STATE'][iwell] == State_ANSI['abbr'])[0][0]
            if temp2.loc[iwell,'ABANDONED_WELL_TYPE'] =='GAS':
                map_agas_wells[ilat,ilon,iyear] += temp2.loc[iwell,'PRODUCING_ENTITY_COUNT']
            elif temp2.loc[iwell,'ABANDONED_WELL_TYPE'] =='OIL':
                map_aoil_wells[ilat,ilon,iyear] += temp2.loc[iwell,'PRODUCING_ENTITY_COUNT']
            elif temp2.loc[iwell,'ABANDONED_WELL_TYPE'] =='DRY':
                map_agas_wells[ilat,ilon,iyear] += temp2.loc[iwell,'PRODUCING_ENTITY_COUNT']*state_abd_gas_to_oil_ratio[istate,iyear]
                map_aoil_wells[ilat,ilon,iyear] += temp2.loc[iwell,'PRODUCING_ENTITY_COUNT']*(1-state_abd_gas_to_oil_ratio[istate,iyear])
            ongrid_well_count[iyear] += temp2.loc[iwell,'PRODUCING_ENTITY_COUNT']
            #print(iwell)
        print('Complete:',year_range[iyear],', Number of AOG wells:',total_well_count[iyear], np.sum(map_agas_wells[:,:,iyear])+np.sum(map_aoil_wells[:,:,iyear])+nongrid_well_count[iyear])

    np.save('./IntermediateOutputs/Enverus_agaswell_tempoutput', map_agas_wells)
    np.save('./IntermediateOutputs/Enverus_aoilwell_tempoutput', map_aoil_wells)
    del temp, DI_data, temp2, temp3
    
else:
    map_agas_wells = np.load('./IntermediateOutputs/Enverus_agaswell_tempoutput.npy')
    map_aoil_wells = np.load('./IntermediateOutputs/Enverus_aoilwell_tempoutput.npy')
    
    


# MS Access logic    
#            WHERE (((ABANDONED_WELLS_ENVERUS.last_prod_year)<1975)) OR \
#            (((ABANDONED_WELLS_ENVERUS.last_prod_year) Is Null) AND ((ABANDONED_WELLS_ENVERUS.completion_year)<1975)) OR \
#            (((ABANDONED_WELLS_ENVERUS.last_prod_year) Is Null) AND ((ABANDONED_WELLS_ENVERUS.completion_year) Is Null) AND ((ABANDONED_WELLS_ENVERUS.spud_date_year)<1975))
#GROUP BY ABANDONED_WELLS_ENVERUS.ABANDONED_WELL_TYPE;

# Print time
ct = datetime.now() 
print("current time:", ct) 
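
The per-well loop above converts each latitude/longitude to a grid index and accumulates the well count in that cell. A minimal vectorized sketch of the same binning idea is shown below. The grid origin, resolution, shape, and records are made-up illustration values (not the notebook's Step 0 constants), and `grid_well_counts` is a hypothetical helper, not part of the GEPA code.

```python
import numpy as np

def grid_well_counts(lat, lon, counts, lat_low, lon_left, res, shape):
    """Sum well counts into grid cells; return the grid and the off-grid total."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Same index arithmetic as the loop: offset from the grid corner / resolution
    ilat = np.floor((lat - lat_low) / res).astype(int)
    ilon = np.floor((lon - lon_left) / res).astype(int)
    on_grid = (ilat >= 0) & (ilat < shape[0]) & (ilon >= 0) & (ilon < shape[1])
    grid = np.zeros(shape)
    # np.add.at accumulates correctly when several records share a cell
    np.add.at(grid, (ilat[on_grid], ilon[on_grid]), counts[on_grid])
    return grid, counts[~on_grid].sum()

# Illustrative records: two wells share a cell and one falls off the grid
lat = [30.25, 30.27, 30.85, 45.00]
lon = [-99.55, -99.52, -99.15, -99.55]
counts = [1, 2, 4, 8]
grid, off_grid = grid_well_counts(lat, lon, counts, 30.0, -100.0, 0.1, (10, 10))
print(grid.sum(), off_grid)
```

Keeping the off-grid total alongside the gridded sum mirrors the on-grid plus non-grid count check printed at the end of each year in the loop above.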

##### Step 2.4.3. Correct IL & IN Data

In [None]:
# Extremely limited well information is available for IL and IN in the Enverus wells dataset. Therefore, all AOG
# wells for these states in the national GHGI come from the historic estimates (from state-specific databases).
# Because we have very limited information on where these wells are located, the AOG wells for IL/IN are allocated
# to the grid cell level assuming the same spatial distribution as currently active wells in these states.
# Other possible assumptions would be to spatially disaggregate based on production levels or data from IL/IN state
# databases, which could be implemented in a future version of the GEPA.

# General Process
## NA Read the GHGI well and production statistics from the GHGI (contain corrected IL and IN data)
# 1. Read in the relevant NEI data (from both file formats) and place onto GEPA grid (including reproj of NEI data)
## NA  Scale the NEI proxy maps to the corresponding state level values from Step 1.
## NA Calculate the lease condensate proxy for IL/IN using the same method as the Enverus data
# 2. Place the NEI grid data on the appropriate Enverus proxy grids. 


##### Step 2.4.3.1 Read in all data prior to 2018 (text file format)

In [None]:
#1 Read in relevant files by year (for all years before 2018 [2018 read from different file type])
# Data are in a text file format where each row of data contains the surrogate code, FIPS code, column and row location
# (on the NEI CONUS1 grid), and the absolute, fractional, and running sum of data (e.g., counts or production) in the
# given FIPS region. 
# The absolute data are placed onto the GEPA grid by using an NEI reference map shapefile to map the data location
# from the NEI CONUS grid cell indexes to the corresponding latitude and longitude values in the GEPA grid. 
### Note - the 2016 data from the NEI is on a non-standard grid where lat/lons are unknown. Can change later if needed, or
# can interpolate between years if more accurate

NEI_files = ['/USA_698_NOFILL.txt','/USA_695_NOFILL.txt']


# only recalc the data if required (set in Step 0)
if ReCalc_NEI ==1:
    
    map_NEI_agas_wells = np.zeros([len(lat001),len(lon001),num_years])
    map_NEI_aoil_wells = np.zeros([len(lat001),len(lon001),num_years])
    
    #read in the NEI grid reference shapefile (contains the lat/lons of each NEI coordinate)
    shape = shp.Reader(NEI_grid_ref_inputfile)

    #make the map arrays of absolute values (counts and mcf)
    for ivar in np.arange(0,len(NEI_files)):
        for iyear in np.arange(0,num_years):
            if year_range_str[iyear] == '2012':
                year = '2011'
            elif year_range_str[iyear] == '2013' or year_range_str[iyear] == '2014' or year_range_str[iyear] == '2015':
                year = '2014'
            #elif year_range_str[iyear] == '2015' or year_range_str[iyear] == '2016':
             #    year = '2016'  
            elif year_range_str[iyear] == '2016' or year_range_str[iyear] == '2017':
                year = '2017'
            elif year_range_str[iyear] == '2018':
                continue
            else:
                print('NEI DATA MISSING FOR YEAR ',year_range_str[iyear])
                continue  # no NEI file for this year; skip rather than reuse a stale 'year' value
            path = ERG_NEI_inputloc+year+NEI_files[ivar]
            data_temp = pd.read_csv(path, sep='\t', skiprows = 25)
            data_temp = data_temp.drop(["!"], axis=1)
            data_temp.columns = ['Code','FIPS','COL','ROW','Frac','Abs','FIPS_Total','FIPS_Running_Sum']
            data_temp['Lat'] = np.zeros([len(data_temp)])
            data_temp['Lon'] = np.zeros([len(data_temp)])
            colmin = 1332
            colmax=0
            rowmin = 1548
            rowmax=0
            counter =0
        
            #Create the bounding box (NEI col/row extent of the IL and IN records)
            for idx in np.arange(0,len(data_temp)):
                if str(data_temp['FIPS'][idx]).startswith('17') or str(data_temp['FIPS'][idx]).startswith('18'):
                    icol = data_temp['COL'][idx]
                    irow = data_temp['ROW'][idx]
                    if icol > colmax:
                        colmax =icol
                    if icol < colmin:
                        colmin = icol
                    if irow > rowmax:
                        rowmax = irow
                    if irow < rowmin:
                        rowmin  = irow
            
            #Extract the relevant indices from the NEI reference shapefile
            array_temp = np.zeros([4,((colmax+1-colmin)*(rowmax+1-rowmin))]) #make an array to save col, row, lat, lon
            idx=0
            for rec in shape.iterRecords():
                if (int(rec['cellid'][0:4]) <= colmax and int(rec['cellid'][0:4]) >= colmin) \
                    and (int(rec['cellid'][5:]) <= rowmax and int(rec['cellid'][5:]) >= rowmin):
                        array_temp[0,idx] = int(rec['cellid'][0:4])   #column index
                        array_temp[1,idx] = int(rec['cellid'][5:])    #row index
                        array_temp[2,idx] = rec['Latitude']           #latitude
                        array_temp[3,idx] = rec['Longitude']          #longitude
                        idx +=1
                        #print(idx,int(rec['cellid'][0:4]),int(rec['cellid'][5:]))
    
            #Use this array to locate and assign the lat lon values to the NEI datafile and then place onto grid
            for idx in np.arange(0,len(data_temp)):
                if str(data_temp['FIPS'][idx]).startswith('17') or str(data_temp['FIPS'][idx]).startswith('18'):
                    icol = data_temp['COL'][idx]
                    irow = data_temp['ROW'][idx]
                    match = np.where((icol == array_temp[0,:]) & (irow == array_temp[1,:]))[0][0]
                    #print(match)
                    data_temp.loc[idx,'Lat'] = array_temp[2,match]
                    data_temp.loc[idx,'Lon'] = array_temp[3,match]
                    ilat = int((data_temp['Lat'][idx] - Lat_low)/Res_01)
                    ilon = int((data_temp['Lon'][idx] - Lon_left)/Res_01)
                    #if str(data_temp['FIPS'][idx]).startswith('17'):
                    if ivar ==0:
                        map_NEI_agas_wells[ilat,ilon,iyear] += data_temp.loc[idx,'Abs']
                    elif ivar ==1:
                        map_NEI_aoil_wells[ilat,ilon,iyear] += data_temp.loc[idx,'Abs']
                        #print(ilat,ilon,data_temp.loc[idx,'Abs'], vars()[data_names[ivar]][0,ilat,ilon,iyear])
                    #else:
                    #    vars()[data_names[ivar]][1,ilat,ilon,iyear] += data_temp.loc[idx,'Abs']
                #else:
                #    data_temp.loc[idx,'Abs'] = 0 #zero out the non IL/IN data

    np.save('./IntermediateOutputs/NEI_agaswell_tempoutput', map_NEI_agas_wells)
    np.save('./IntermediateOutputs/NEI_aoilwell_tempoutput', map_NEI_aoil_wells)

else:
    map_NEI_agas_wells = np.load('./IntermediateOutputs/NEI_agaswell_tempoutput.npy')
    map_NEI_aoil_wells = np.load('./IntermediateOutputs/NEI_aoilwell_tempoutput.npy')
            
            
print('IL/IN NEI totals')
for iyear in np.arange(0,num_years):
    print('Year: ', year_range_str[iyear])
    print('Non Associated Gas (Conv + HF): ',np.sum(map_NEI_agas_wells[:,:,iyear]))
    print('Oil wells (Conv + HF):         ',np.sum(map_NEI_aoil_wells[:,:,iyear]))
    print(' ')
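
The cell above maps each inventory year to an available NEI data year and then looks up the latitude/longitude of every NEI column/row pair. Both steps can be sketched with plain dictionaries, as below. The records and the `cccc_rrrr` layout of `cellid` are assumptions inferred from the string slicing in the cell (`[0:4]` and `[5:]`), and `build_cell_lookup` is a hypothetical helper rather than part of the GEPA code.

```python
# Inventory year -> NEI data year, mirroring the if/elif chain in the cell above.
# 2018 is absent because it is read from a separate file format.
NEI_YEAR_FOR = {
    '2012': '2011',
    '2013': '2014', '2014': '2014', '2015': '2014',
    '2016': '2017', '2017': '2017',
}

def build_cell_lookup(cell_records):
    """Map (col, row) -> (lat, lon) from records shaped like the reference shapefile."""
    lookup = {}
    for rec in cell_records:
        col = int(rec['cellid'][0:4])   # first four characters: column index
        row = int(rec['cellid'][5:])    # characters after the separator: row index
        lookup[(col, row)] = (rec['Latitude'], rec['Longitude'])
    return lookup

# Hypothetical reference records for illustration only
records = [
    {'cellid': '0850_0600', 'Latitude': 39.10, 'Longitude': -89.20},
    {'cellid': '0851_0600', 'Latitude': 39.11, 'Longitude': -89.05},
]
lookup = build_cell_lookup(records)
lat, lon = lookup[(851, 600)]
print(NEI_YEAR_FOR['2015'], lat, lon)
```

A dictionary lookup is constant time per record, whereas the `np.where` match in the cell scans the whole reference array for every row; the two approaches should give the same coordinates.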

##### Step 2.4.3.2 Read in 2018 data (MS Access data)

In [None]:
# Read in 2018 NEI data from different datafile format
    
if ReCalc_NEI ==1:    
    #Read in the data
    driver_str = r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ='+ERG_NEI_inputloc_2018+';'
    conn = pyodbc.connect(driver_str)
    NEI_2018_ILIN_wells = pd.read_sql("SELECT * FROM 2018_IL_IN_WELLS", conn)
    conn.close()

    data_temp = NEI_2018_ILIN_wells[(NEI_2018_ILIN_wells['ACTIVE_WELL_FLAG'] ==1)].copy() #copy so in-place edits do not act on a view
    data_temp.reset_index(inplace=True, drop=True)
    data_temp.fillna("",inplace=True)

    #find 2018 index
    year_diff = [abs(x - 2018) for x in year_range]
    iyear = year_diff.index(min(year_diff))

    # place active well counts on the map (IL and IN records only)
    for iwell in np.arange(0,len(data_temp)):
        ilat = int((data_temp['LATITUDE'][iwell] - Lat_low)/Res_01)
        ilon = int((data_temp['LONGITUDE'][iwell] - Lon_left)/Res_01)
        if str(data_temp['FIPS_CODE'][iwell]).startswith('17') or str(data_temp['FIPS_CODE'][iwell]).startswith('18'):
            #istate = 0
        #else:
            #istate =1
            # read the well type from the filtered, re-indexed frame so that it matches iwell
            if data_temp['WELL_TYPE'][iwell] == 'GAS':
                map_NEI_agas_wells[ilat,ilon,iyear] += 1
            elif data_temp['WELL_TYPE'][iwell] == 'OIL':
                map_NEI_aoil_wells[ilat,ilon,iyear] += 1

    np.save('./IntermediateOutputs/NEI_agaswell_w2018_tempoutput', map_NEI_agas_wells)
    np.save('./IntermediateOutputs/NEI_aoilwell_w2018_tempoutput', map_NEI_aoil_wells)

else:
    map_NEI_agas_wells = np.load('./IntermediateOutputs/NEI_agaswell_w2018_tempoutput.npy')
    map_NEI_aoil_wells = np.load('./IntermediateOutputs/NEI_aoilwell_w2018_tempoutput.npy')
            
            
print('IL/IN NEI totals')
for iyear in np.arange(0,num_years):
    print('Year: ', year_range_str[iyear])
    print('Abandoned gas wells: ',np.sum(map_NEI_agas_wells[:,:,iyear]))
    print('Abandoned oil wells: ',np.sum(map_NEI_aoil_wells[:,:,iyear]))
    print(' ')

#display(data_temp)

##### Step 2.4.3.3 Add the NEI data to the relevant Enverus Proxy Maps

In [None]:
# Add maps to relevant Enverus maps
# add absolute values to the Enverus maps above 
# Note: since this proxy is used to allocate emissions from the state to grid cell level, we don't need to 
# scale the IL/IN data to the historical state counts. This method assumes that the relative geographical 
# distribution of wells within these states is the same between abandoned and active wells 
# Note: This block should not be run more than once
for iyear in np.arange(0,num_years):
    map_agas_wells[:,:,iyear] += map_NEI_agas_wells[:,:,iyear]
    map_aoil_wells[:,:,iyear] += map_NEI_aoil_wells[:,:,iyear]
del map_NEI_agas_wells,map_NEI_aoil_wells
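
The note above that the IL/IN counts need no scaling follows from how the proxy is used: only each cell's share of the state total matters. A small sketch with made-up counts (the 2.5 scaling factor is arbitrary and `normalized_weights` is a hypothetical helper) illustrates this.

```python
def normalized_weights(proxy):
    """Return each cell's share of the proxy total (all zeros if the total is zero)."""
    total = sum(proxy)
    if total <= 0:
        return [0.0 for _ in proxy]
    return [p / total for p in proxy]

# Made-up active well counts in three grid cells of one state
active_counts = [10.0, 30.0, 60.0]
# The same pattern rescaled, e.g. to match some historical state total
scaled_counts = [c * 2.5 for c in active_counts]

weights_raw = normalized_weights(active_counts)
weights_scaled = normalized_weights(scaled_counts)
print(weights_raw, weights_scaled)
```

Both sets of weights are the same to within floating-point rounding, so the state emissions would be distributed identically with either proxy.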

----------------
## Step 3. Read In EPA GHGI Data
---------------

#### Step 3.1. Read-In National AOG Emissions

In [None]:
# Read EPA AOG emissions data (units of metric tons)
# Read from the '2020 PR Time Series' tab (the sheet name used in the read_excel calls below). The data are weighted by regional emission factors following:
# Emissions = national well counts * (Appalachia EF * fraction wells Appalachia + Non-Appalachia EF * fraction wells non-Appalachia)
# Therefore, we need to first split/calculate national inventory emissions by region, well type, and plugging status
# even though the workbook only reports by well type and plugging status

#This is different than other sources in the GEPA since the Inventory data are not directly reported in Excel at the
#finest level of spatial resolution. This detail therefore needs to first be calculated in this Notebook. 

#e.g.,
# Emissions = national wells * fraction Appalachia *Appalachia EF + national wells * fraction non-Appalachia *Non-Appalachia EF

names = pd.read_excel(EPA_AOG_inputfile, sheet_name = "2020 PR Time Series", usecols = "A:AD", skiprows = 4, header = 0, nrows = 1)
colnames = names.columns.values
EPA_emi_AOG = pd.read_excel(EPA_AOG_inputfile, sheet_name = "2020 PR Time Series", usecols = "A:AD", skiprows = 27, names = colnames, nrows = 7)
EPA_emi_AOG.rename(columns={EPA_emi_AOG.columns[0]:'Source'}, inplace=True)
EPA_emi_AOG = EPA_emi_AOG.fillna('')
EPA_emi_AOG = EPA_emi_AOG.drop(columns = [*range(1990, start_year,1)])
EPA_emi_AOG.reset_index(inplace=True, drop=True)
display(EPA_emi_AOG)

#### Step 3.2 Read In Region Specific Data (well counts and EFs by well type and gas)

In [None]:
#Read in regional activity data (well counts and plugging status time series) and EFs (constant over time series)

# a) Read in National Well Counts and Plugging Status
EPA_well_counts = pd.read_excel(EPA_AOG_inputfile, sheet_name = "2020 PR Time Series", usecols = "A:AD", skiprows = 5, names = colnames, nrows = 7)
EPA_well_counts.rename(columns={EPA_well_counts.columns[0]:'Source'}, inplace=True)
EPA_well_counts = EPA_well_counts.fillna('')
EPA_well_counts = EPA_well_counts.drop(columns = [*range(1990, start_year,1)])
EPA_well_counts.reset_index(inplace=True, drop=True)
#print(EPA_well_counts)

start_year_idx = EPA_well_counts.columns.get_loc(start_year)

national_wells_ng_plugged = EPA_well_counts.iloc[EPA_well_counts.index[EPA_well_counts['Source'] == 'NG - Plugged'],start_year_idx:]
national_wells_ng_unplugged = EPA_well_counts.iloc[EPA_well_counts.index[EPA_well_counts['Source'] == 'NG - Unplugged'],start_year_idx:]
national_wells_petr_plugged = EPA_well_counts.iloc[EPA_well_counts.index[EPA_well_counts['Source'] == 'Petro - Plugged'],start_year_idx:]
national_wells_petr_unplugged = EPA_well_counts.iloc[EPA_well_counts.index[EPA_well_counts['Source'] == 'Petro - Unplugged'],start_year_idx:]


# b) Read in Activity Factors (fraction of wells in each region)
EPA_well_region_fractions = pd.read_excel(EPA_AOG_inputfile, sheet_name = "2020 PR Time Series", usecols = "A:AD", skiprows = 13, names = colnames, nrows = 4)
EPA_well_region_fractions.rename(columns={EPA_well_region_fractions.columns[0]:'Source'}, inplace=True)
EPA_well_region_fractions = EPA_well_region_fractions.fillna('')
EPA_well_region_fractions = EPA_well_region_fractions.drop(columns = [*range(1990, start_year,1)])
EPA_well_region_fractions.reset_index(inplace=True, drop=True)
#print(EPA_well_region_fractions)

start_year_idx = EPA_well_counts.columns.get_loc(start_year)

fraction_wells_ng_app = EPA_well_region_fractions.iloc[EPA_well_region_fractions.index[EPA_well_region_fractions['Source'] == 'NG - Appalachia'],start_year_idx:]
fraction_wells_ng_nonapp = EPA_well_region_fractions.iloc[EPA_well_region_fractions.index[EPA_well_region_fractions['Source'] == 'NG - NonAppalcahia'],start_year_idx:]
fraction_wells_petr_app = EPA_well_region_fractions.iloc[EPA_well_region_fractions.index[EPA_well_region_fractions['Source'] == 'Petro - Appalachia'],start_year_idx:]
fraction_wells_petr_nonapp = EPA_well_region_fractions.iloc[EPA_well_region_fractions.index[EPA_well_region_fractions['Source'] == 'Petro - NonAppalachia'],start_year_idx:]


#Read in regional emission factors (single value applied to the time series)
EPA_plugged_App_EF = pd.read_excel(EPA_AOG_inputfile, sheet_name = "EFs from Studies", usecols = "C", skiprows = 35, nrows = 1)
EPA_plugged_App_EF = EPA_plugged_App_EF.columns.values[0]
EPA_unplugged_App_EF = pd.read_excel(EPA_AOG_inputfile, sheet_name = "EFs from Studies", usecols = "C",skiprows = 38, nrows = 1)
EPA_unplugged_App_EF = EPA_unplugged_App_EF.columns.values[0]
EPA_plugged_NonApp_EF = pd.read_excel(EPA_AOG_inputfile, sheet_name = "EFs from Studies", usecols = "C",skiprows = 24, nrows = 1)
EPA_plugged_NonApp_EF = EPA_plugged_NonApp_EF.columns.values[0]
EPA_unplugged_NonApp_EF = pd.read_excel(EPA_AOG_inputfile, sheet_name = "EFs from Studies", usecols = "C", skiprows = 25, nrows = 1)
EPA_unplugged_NonApp_EF = EPA_unplugged_NonApp_EF.columns.values[0]

print('QA/QC: Check EF data against GHGI Workbook')
print('All units: g/hr/well')
print('Appalachia EF (plugged):      ',EPA_plugged_App_EF)
print('Appalachia EF (unplugged):    ', EPA_unplugged_App_EF)
print('Non-Appalachia EF (plugged):  ',EPA_plugged_NonApp_EF)
print('Non-Appalachia EF (unplugged):', EPA_unplugged_NonApp_EF)

#### Step 3.3. Calculate National EPA emissions as a function of region, well type, and status

In [None]:
# Split national emissions into regional Emissions using workbook data
# Calculated in units of metric tons

#emis_ng_plugged_app = national_ng_plugged * plugged_app_EF * fraction_wells_ng_app

NG_plugged_app = national_wells_ng_plugged.to_numpy() * (fraction_wells_ng_app.to_numpy() * EPA_plugged_App_EF) *hrs_to_yrs*g_to_mt #convert to MT
NG_plugged_app = NG_plugged_app[0,:]
NG_unplugged_app = national_wells_ng_unplugged.to_numpy() * (fraction_wells_ng_app.to_numpy() * EPA_unplugged_App_EF)*hrs_to_yrs*g_to_mt
NG_unplugged_app = NG_unplugged_app[0,:]
Petro_plugged_app = national_wells_petr_plugged.to_numpy() * (fraction_wells_petr_app.to_numpy() * EPA_plugged_App_EF)*hrs_to_yrs*g_to_mt
Petro_plugged_app = Petro_plugged_app[0,:]
Petro_unplugged_app = national_wells_petr_unplugged.to_numpy() * (fraction_wells_petr_app.to_numpy() * EPA_unplugged_App_EF)*hrs_to_yrs*g_to_mt
Petro_unplugged_app = Petro_unplugged_app[0,:]

NG_plugged_nonapp = national_wells_ng_plugged.to_numpy() * (fraction_wells_ng_nonapp.to_numpy() * EPA_plugged_NonApp_EF)*hrs_to_yrs*g_to_mt
NG_plugged_nonapp = NG_plugged_nonapp[0,:]
NG_unplugged_nonapp = national_wells_ng_unplugged.to_numpy() * (fraction_wells_ng_nonapp.to_numpy() * EPA_unplugged_NonApp_EF)*hrs_to_yrs*g_to_mt
NG_unplugged_nonapp = NG_unplugged_nonapp[0,:]
Petro_plugged_nonapp = national_wells_petr_plugged.to_numpy() * (fraction_wells_petr_nonapp.to_numpy() * EPA_plugged_NonApp_EF)*hrs_to_yrs*g_to_mt
Petro_plugged_nonapp = Petro_plugged_nonapp[0,:]
Petro_unplugged_nonapp = national_wells_petr_unplugged.to_numpy() * (fraction_wells_petr_nonapp.to_numpy() * EPA_unplugged_NonApp_EF)*hrs_to_yrs*g_to_mt
Petro_unplugged_nonapp = Petro_unplugged_nonapp[0,:]

#Make GHGI DataFrame
GHGI_Data = pd.DataFrame(columns = ['Source', *range(start_year, end_year+1,1)])
GHGI_Data.loc[0] = ['Petro_plugged_app'] + Petro_plugged_app.tolist()
GHGI_Data.loc[1] = ['Petro_plugged_nonapp'] + Petro_plugged_nonapp.tolist()
GHGI_Data.loc[2] = ['Petro_unplugged_app'] + Petro_unplugged_app.tolist()
GHGI_Data.loc[3] = ['Petro_unplugged_nonapp'] + Petro_unplugged_nonapp.tolist()
GHGI_Data.loc[4] = ['NG_plugged_app'] + NG_plugged_app.tolist()
GHGI_Data.loc[5] = ['NG_plugged_nonapp'] + NG_plugged_nonapp.tolist()
GHGI_Data.loc[6] = ['NG_unplugged_app'] + NG_unplugged_app.tolist()
GHGI_Data.loc[7] = ['NG_unplugged_nonapp'] + NG_unplugged_nonapp.tolist()
#display(GHGI_Data)
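
As a worked illustration of the split above, the sketch below applies the same formula (national wells × regional fraction × regional EF, then unit conversion) to made-up inputs. The two conversion constants are assumptions standing in for `hrs_to_yrs` and `g_to_mt`, which are set in Step 0, and none of the numbers are GHGI values.

```python
HOURS_PER_YEAR = 8760          # assumed stand-in for hrs_to_yrs
GRAMS_TO_METRIC_TONS = 1e-6    # assumed stand-in for g_to_mt

def regional_emissions_mt(national_wells, region_fraction, ef_g_per_hr_per_well):
    """Annual emissions (metric tons) for one region, well type, and plugging status."""
    return (national_wells * region_fraction * ef_g_per_hr_per_well
            * HOURS_PER_YEAR * GRAMS_TO_METRIC_TONS)

# Made-up inputs for illustration only
wells = 100_000                  # national count for one well type and status
frac_app = 0.25                  # fraction of those wells in Appalachia
ef_app, ef_nonapp = 10.0, 2.0    # g/hr/well

app = regional_emissions_mt(wells, frac_app, ef_app)
nonapp = regional_emissions_mt(wells, 1 - frac_app, ef_nonapp)
national = app + nonapp
print(app, nonapp, national)
```

Because the two regional fractions sum to one, the regional terms add back to the national total for that well type and status.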

#### Step 3.4. Split Emissions into Gridding Groups (each Group will have the same proxy applied during the gridding)

In [None]:
#Final units are converted from metric ton to kt

#sum_emi = np.zeros([num_years])
ghgi_aog_groups = ghgi_aog_map['GHGI_Emi_Group'].unique()
DEBUG=1

for igroup in np.arange(0,len(ghgi_aog_groups)): #loop through all groups, finding the GHGI sources in that group and summing emissions for that region, year
    vars()[ghgi_aog_groups[igroup]] = np.zeros([num_years])
    source_temp = ghgi_aog_map.loc[ghgi_aog_map['GHGI_Emi_Group'] == ghgi_aog_groups[igroup], 'GHGI_Source']
    pattern_temp  = '|'.join(source_temp)
    emi_temp = GHGI_Data[GHGI_Data['Source'].str.contains(pattern_temp)]
    vars()[ghgi_aog_groups[igroup]][:] = np.where(emi_temp.iloc[:,start_year_idx:] =='',[0],emi_temp.iloc[:,start_year_idx:]).sum(axis=0)/float(1000) #convert MT to kt
    
#Check against national totals
print('QA/QC: Check Emissions Sum against GHGI Summary Emissions')
for iyear in np.arange(0,num_years):
    sum_emi = 0
    for igroup in np.arange(0,len(ghgi_aog_groups)):
        sum_emi += vars()[ghgi_aog_groups[igroup]][iyear]
    summary_emi = EPA_emi_AOG.iloc[-1,iyear+1]/1e3
    diff1 = abs(sum_emi - summary_emi)/((sum_emi + summary_emi)/2)
    if DEBUG==1:
        print(summary_emi)
        print(sum_emi)
    if diff1 < 0.0001:
        print('Year ', year_range[iyear],': PASS, difference < 0.01%')
    else:
        print('Year ', year_range[iyear],': FAIL (check Production & summary tabs): ', diff1*100,'%') 
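
The QA/QC checks in this notebook all use the same symmetric relative difference with a 0.01% threshold. One way to factor that out is sketched below; `relative_difference` and `qaqc_message` are hypothetical helpers and the input values are illustrative.

```python
def relative_difference(a, b):
    """Symmetric relative difference: |a - b| divided by the mean of a and b."""
    mean = (a + b) / 2
    return abs(a - b) / mean if mean != 0 else 0.0

def qaqc_message(year, calculated, reported, tolerance=1e-4):
    """Return a PASS/FAIL line, reporting the difference as a percentage."""
    diff = relative_difference(calculated, reported)
    if diff < tolerance:
        return f'Year {year}: PASS, difference < {tolerance:.2%}'
    return f'Year {year}: FAIL, difference = {diff:.4%}'

print(qaqc_message(2012, 276.0, 276.0))
print(qaqc_message(2013, 100.0, 101.0))
```

Formatting with the `%` format specifier multiplies by 100 automatically, which avoids printing a fraction next to a percent sign.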


----------------
## Step 4. Grid Data (using spatial proxies)
---------------

#### Step 4.1. Allocate emissions

##### Step 4.1.1 Assign the Appropriate Proxy Variable Names (state & grid)

In [None]:
# The names on the *left* need to match the 'Abandoned_OilGasWells_ProxyMapping' 'State_Proxy_Group' names 
# (these are initialized in Step 2). 
# The names on the *right* are the variable names used to calculate the proxies in this code.
# Names on the right need to match those from the code in Step 2

    
#national --> state proxies (state x year)
State_Gas_Plugged_App = state_ng_plugged_app
State_Gas_Plugged_NonApp = state_ng_plugged_nonapp
State_Gas_Unplugged_App= state_ng_unplugged_app
State_Gas_Unplugged_NonApp= state_ng_unplugged_nonapp
State_Petr_Plugged_App = state_petr_plugged_app
State_Petr_Plugged_NonApp = state_petr_plugged_nonapp
State_Petr_Unplugged_App = state_petr_unplugged_app
State_Petr_Unplugged_NonApp = state_petr_unplugged_nonapp

#state --> grid proxies (0.01x0.01)
Map_Gas_AbdWells = map_agas_wells
Map_Oil_AbdWells = map_aoil_wells

In [None]:
del map_agas_wells, map_aoil_wells, state_ng_plugged_app, state_ng_plugged_nonapp, state_ng_unplugged_app
del state_petr_plugged_app,state_petr_plugged_nonapp, state_petr_unplugged_app,state_petr_unplugged_nonapp

##### Step 4.1.2 Allocate National EPA Emissions to the State-Level

In [None]:
# Calculate state-level emissions for AOG wells (as a function of type, status, and region)
# Emissions in kt
# State data = national GHGI emissions * state proxy/national total

DEBUG =1

# Make placeholder emission arrays for each group
for igroup in np.arange(0,len(proxy_aog_map)):
    #if proxy_stat_map.loc[igroup,'State_Month_Flag'] ==1:
    vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(State_ANSI),num_years])
    #else:
        #vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(State_ANSI),num_years])
    vars()['NonState_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([num_years])
        
#Loop over years
for iyear in np.arange(0,num_years):
    #Loop over states
    for istate in np.arange(0,len(State_ANSI)):
        for igroup in np.arange(0,len(proxy_aog_map)):    
            if proxy_aog_map.loc[igroup,'State_Proxy_Group'] != '-' and proxy_aog_map.loc[igroup,'GHGI_Emi_Group'] != 'Emi_not_mapped':
                #if emission group has a state-level proxy
                vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][istate,iyear] = \
                            vars()[proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][iyear] * \
                            data_fn.safe_div(vars()[proxy_aog_map.loc[igroup,'State_Proxy_Group']][istate,iyear], \
                                             np.sum(vars()[proxy_aog_map.loc[igroup,'State_Proxy_Group']][:,iyear]))
            else:
                #retain emissions without a state-level proxy for gridding later (not relevant here)
                vars()['NonState_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][iyear] = vars()[proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][iyear]
                
# Check sum of all gridded emissions + emissions not included in state allocation
print('QA/QC #1: Check weighted emissions against GHGI')   
for iyear in np.arange(0,num_years):
    summary_emi = EPA_emi_AOG.iloc[-1,iyear+1]/1e3 # convert MT to kt
    calc_emi = 0
    for igroup in np.arange(0,len(proxy_aog_map)):
        #print(np.sum(vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][:,iyear]))
        calc_emi +=  np.sum(vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][:,iyear])+\
            vars()['NonState_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][iyear] 
    if DEBUG ==1:
        print(summary_emi)
        print(calc_emi)
    diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if diff < 0.0001:
        print('Year ', year_range[iyear], ': PASS, difference < 0.01%')
    else:
        print('Year ', year_range[iyear], ': FAIL -- Difference = ', diff*100,'%')
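
The state allocation above amounts to national emissions × (state proxy / national proxy total), with a guarded division. The sketch below shows one way this could look with NumPy arrays. The `safe_div` here is a stand-in written for this example (the actual `data_fn.safe_div` is not shown in this notebook and may differ), and all numbers are illustrative.

```python
import numpy as np

def safe_div(numerator, denominator):
    """Element-wise division that returns 0 wherever the denominator is 0."""
    numerator = np.asarray(numerator, dtype=float)
    denominator = np.asarray(denominator, dtype=float)
    out = np.zeros(np.broadcast(numerator, denominator).shape)
    np.divide(numerator, denominator, out=out, where=denominator != 0)
    return out

# One emission group, two years (kt); values are made up
national_kt = np.array([8.0, 10.0])
# State-level proxy (well counts), shape = (state, year); the third state has no wells
state_wells = np.array([[30.0, 20.0],
                        [10.0, 20.0],
                        [0.0, 0.0]])

state_share = safe_div(state_wells, state_wells.sum(axis=0))
state_kt = national_kt * state_share
print(state_kt)
print(state_kt.sum(axis=0))
```

Summing the state emissions over the state axis recovers the national totals, which is the quantity the QA/QC check above compares against the GHGI.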

##### Step 4.1.3 Allocate emissions to the CONUS region (0.1x0.1)

In [None]:
# Allocate State-Level emissions (kt) onto a 0.1x0.1 grid using gridcell level 'Proxy_Groups'

#Define emission arrays
Emissions_array_001 = np.zeros([len(lat001),len(lon001),num_years])
Emissions_array_01 = np.zeros([len(Lat_01),len(Lon_01),num_years])
Emissions_nongrid = np.zeros([num_years])

DEBUG =1 

# For each year, (2a) distribute state-level emissions onto a grid using proxies defined above ....
# To speed up the code, masks are used rather than looping individually through each lat/lon. 
# In this case, a mask of 1's is made for the grid cells that match the ANSI values for a given state
# The masked values are set to zero, remaining values = 1. 
# AK and HI and territories are removed from the analysis at this stage. 
# The emissions allocated to each state are at 0.01x0.01 degree resolution, as required to calculate accurate 'mask'
# arrays for each state. 
# (2b - not applicable here) For emission groups that were not first allocated to states, national emissions for those groups are gridded
# based on the relevant gridded proxy arrays (0.1x0.1 resolution). These emissions are at 0.1x0.1 degrees resolution. 
# (2c - not applicable here) - record 'not mapped' emission groups in the 'non-grid' array


print('**QA/QC Check: Sum of national gridded emissions vs. GHGI national emissions')
running_sum = np.zeros([len(proxy_aog_map),num_years])

for igroup in np.arange(0,len(proxy_aog_map)):
    proxy_temp = vars()[proxy_aog_map.loc[igroup,'Proxy_Group']]
    proxy_temp_nongrid = vars()[proxy_aog_map.loc[igroup,'Proxy_Group']+'_nongrid']

    #2a. Step through each state (if group was previously allocated to state level)
    if proxy_aog_map.loc[igroup,'State_Proxy_Group'] != '-' and \
        proxy_aog_map.loc[igroup,'State_Proxy_Group'] != 'state_not_mapped':
        print('Group:',igroup,'of ',len(proxy_aog_map))
        vars()['Ext_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']+'_01'] = np.zeros([len(lat001),len(lon001),num_years])

        for istate in np.arange(0,len(State_ANSI)):
            #print(igroup,istate)
            
            if State_ANSI['abbr'][istate] not in {'AK','HI'} and istate < 51:
                mask_state = np.ma.ones(np.shape(state_ANSI_map))
                mask_state = np.ma.masked_where(state_ANSI_map != State_ANSI['ansi'][istate], mask_state)
                mask_state = np.ma.filled(mask_state,0) 
                for iyear in np.arange(0,num_years):
                    emi_temp = vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][istate,iyear]
                    #print(emi_temp)
                    if np.sum(mask_state*proxy_temp[:,:,iyear]) > 0 and emi_temp > 0: 
                    # if state is on grid and proxy for that state is non-zero
                        weighted_array = data_fn.safe_div(mask_state*proxy_temp[:,:,iyear], \
                                            np.sum(mask_state*proxy_temp[:,:,iyear]))
                        Emissions_array_001[:,:,iyear] += emi_temp*weighted_array#_01
                        vars()['Ext_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']+'_01'][:,:,iyear] += emi_temp*weighted_array
                        running_sum[igroup,iyear] += np.sum(emi_temp*weighted_array)
                    else:
                        Emissions_nongrid[iyear] += emi_temp
                        running_sum[igroup,iyear] += np.sum(emi_temp)                 

            else:
                for iyear in np.arange(0, num_years):
                    Emissions_nongrid[iyear] += np.sum(vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][istate,iyear])
                    running_sum[igroup,iyear] += np.sum(vars()['State_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][istate,iyear])    
    

for igroup in np.arange(0,len(proxy_aog_map)):
    vars()['Ext_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years])
    
for iyear in np.arange(0, num_years):    
    Emissions_array_01[:,:,iyear] = data_fn.regrid001_to_01(Emissions_array_001[:,:,iyear], Lat_01, Lon_01)
    #Emissions_array_01[:,:,iyear] += Emissions_array_01_temp[:,:,iyear]
    calc_emi = np.sum(Emissions_array_01[:,:,iyear]) + np.sum(Emissions_nongrid[iyear]) 
    calc_emi2 = 0
    for igroup in np.arange(0, len(proxy_aog_map)):
        if proxy_aog_map.loc[igroup,'State_Proxy_Group'] != '-' and proxy_aog_map.loc[igroup,'State_Proxy_Group'] != 'state_not_mapped':
            vars()['Ext_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear]= data_fn.regrid001_to_01(vars()['Ext_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']+'_01'][:,:,iyear], Lat_01, Lon_01)
            calc_emi2 += np.sum(vars()['Ext_'+proxy_aog_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear])
    calc_emi2 += np.sum(Emissions_nongrid[iyear]) 
    summary_emi = EPA_emi_AOG.iloc[-1,iyear+1]/1e3 #metric tons to kt
    emi_diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    #check two
    if DEBUG==1:
        print(calc_emi)
        print(calc_emi2)
        print(summary_emi)
    if abs(emi_diff) < 0.0001:
        print('Year '+ year_range_str[iyear]+': Difference < 0.01%: PASS')
    else: 
        print('Year '+ year_range_str[iyear]+': Difference > 0.01%: FAIL, diff: '+str(emi_diff))
        
ct = datetime.now() 
print("current time:", ct)
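
The masking and regridding logic above can be illustrated on a toy grid. In this sketch, state emissions are spread over the cells of a state in proportion to the proxy, emissions for a state with no proxy are returned as off-grid, and the fine grid is aggregated by summing blocks. The block-sum regridding is an assumption about what `data_fn.regrid001_to_01` does (its implementation is not shown here); both helper functions and all values are hypothetical.

```python
import numpy as np

def allocate_state_to_grid(state_emission, state_id, state_id_map, proxy):
    """Distribute one state's emissions over its cells in proportion to the proxy."""
    masked_proxy = np.where(state_id_map == state_id, proxy, 0.0)
    total = masked_proxy.sum()
    if total <= 0:
        # No proxy inside the state: nothing is gridded, all emissions are off-grid
        return np.zeros_like(proxy, dtype=float), state_emission
    return state_emission * masked_proxy / total, 0.0

def regrid_by_block_sum(fine, factor):
    """Aggregate a fine grid to a coarser one by summing factor x factor blocks."""
    n_lat, n_lon = fine.shape
    blocks = fine.reshape(n_lat // factor, factor, n_lon // factor, factor)
    return blocks.sum(axis=(1, 3))

# Toy 4x4 fine grid: state 1 occupies the left half, state 2 the right half
state_id_map = np.array([[1, 1, 2, 2]] * 4)
proxy = np.zeros((4, 4))
proxy[0, 0] = 1.0
proxy[1, 1] = 3.0

gridded_1, off_1 = allocate_state_to_grid(8.0, 1, state_id_map, proxy)
gridded_2, off_2 = allocate_state_to_grid(5.0, 2, state_id_map, proxy)
coarse = regrid_by_block_sum(gridded_1 + gridded_2, 2)
print(coarse, off_1 + off_2)
```

Summation (rather than averaging) is used in the sketch because the arrays hold emission masses, so the gridded plus off-grid total is conserved across the resolution change.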

##### Step 4.1.4 Save gridded emissions (kt)

In [None]:
#save gridded emissions for each gridding group - for extension

#Initialize file
data_IO_fn.initialize_netCDF(grid_emi_outputfile, netCDF_description, 0, year_range, loc_dimensions, Lat_01, Lon_01)

unique_groups = np.unique(proxy_aog_map['GHGI_Emi_Group'])
unique_groups = unique_groups[unique_groups != 'Emi_not_mapped']

nc_out = Dataset(grid_emi_outputfile, 'r+', format='NETCDF4')

for igroup in np.arange(0,len(unique_groups)):
    print('Ext_'+unique_groups[igroup])
    if len(np.shape(vars()['Ext_'+unique_groups[igroup]])) ==4:
        ghgi_temp = np.sum(vars()['Ext_'+unique_groups[igroup]],axis=3) #sum month data if data is monthly
    else:
        ghgi_temp = vars()['Ext_'+unique_groups[igroup]]

    # Write data to netCDF
    data_out = nc_out.createVariable('Ext_'+unique_groups[igroup], 'f8', ('lat', 'lon','year'), zlib=True)
    data_out[:,:,:] = ghgi_temp[:,:,:]

#save nongrid data to calculate non-grid fraction extension
data_out = nc_out.createVariable('Emissions_nongrid', 'f8', ('year',), zlib=True)  
data_out[:] = Emissions_nongrid[:]
nc_out.close()

#Confirm file location
print('** SUCCESS **')
print("Gridded emissions (kt) written to file: {}" .format(os.getcwd())+grid_emi_outputfile)
print(' ')

del data_out, ghgi_temp, nc_out

#### Step 4.2. Calculate Gridded Emission Fluxes (molec./cm2/s) (0.1x0.1)

In [None]:
#Convert emissions to emission flux
# conversion: kt emissions to molec/cm2/s flux

Flux_array_01_annual = np.zeros([len(Lat_01),len(Lon_01),num_years])
print('**QA/QC Check: Sum of national gridded emissions vs. GHGI national emissions')
  
for iyear in np.arange(0,num_years):
    calc_emi = 0
    if year_range[iyear]==2012 or year_range[iyear]==2016:
        year_days = np.sum(month_day_leap)
    else:
        year_days = np.sum(month_day_nonleap)

    conversion_factor_01 = 10**9 * Avogadro / float(Molarch4 *year_days * 24 * 60 *60) / area_matrix_01
    Flux_array_01_annual[:,:,iyear] = Emissions_array_01[:,:,iyear]*conversion_factor_01
    #convert back to mass to check
    conversion_factor_annual = 10**9 * Avogadro / float(Molarch4 *year_days * 24 * 60 *60) / area_matrix_01
    calc_emi = np.sum(Flux_array_01_annual[:,:,iyear]/conversion_factor_annual)+np.sum(Emissions_nongrid[iyear])
    summary_emi = EPA_emi_AOG.iloc[-1,iyear+1]/1e3 #metric tons to kt
    emi_diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if DEBUG==1:
        print(calc_emi)
        print(summary_emi)
    if abs(emi_diff) < 0.0001:
        print('Year '+ year_range_str[iyear]+': Difference < 0.01%: PASS')
    else: 
        print('Year '+ year_range_str[iyear]+': Difference > 0.01%: FAIL, diff: '+str(emi_diff))
        
Flux_Emissions_Total_annual = Flux_array_01_annual
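
The conversion above can be read as a chain of unit steps: kt to grams, grams to moles, moles to molecules, then division by the seconds in the year and the cell area in cm2. The sketch below spells that out for a single cell. The constants are stand-ins for the notebook's `Avogadro` and `Molarch4` (set in Step 0), the cell area is a made-up value, and `calendar.isleap` is used in place of the hard-coded leap-year list.

```python
import calendar

AVOGADRO = 6.02214076e23   # molecules per mole
MOLAR_MASS_CH4 = 16.04     # g/mol, approximate stand-in for Molarch4

def seconds_in_year(year):
    days = 366 if calendar.isleap(year) else 365
    return days * 24 * 60 * 60

def kt_to_flux(emission_kt, cell_area_cm2, year):
    """Convert annual emissions (kt) in one cell to a flux in molec./cm2/s."""
    grams = emission_kt * 1e9                      # 1 kt = 1e9 g
    molecules = grams / MOLAR_MASS_CH4 * AVOGADRO
    return molecules / seconds_in_year(year) / cell_area_cm2

def flux_to_kt(flux, cell_area_cm2, year):
    """Inverse conversion, useful for checking mass conservation."""
    molecules = flux * seconds_in_year(year) * cell_area_cm2
    return molecules / AVOGADRO * MOLAR_MASS_CH4 / 1e9

area = 1.0e12   # hypothetical cell area in cm2 (about 100 km2)
flux = kt_to_flux(0.001, area, 2016)
back = flux_to_kt(flux, area, 2016)
print(flux, back)
```

In the notebook the area is a per-cell array (`area_matrix_01`), so the same arithmetic applies element-wise across the grid.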

-------------
## Step 5. Write netCDF
------------

In [None]:
# yearly data
#Initialize file
data_IO_fn.initialize_netCDF(gridded_outputfile, netCDF_description, 0, year_range, loc_dimensions, Lat_01, Lon_01)

# Write data to netCDF
nc_out = Dataset(gridded_outputfile, 'r+', format='NETCDF4')
nc_out.variables['emi_ch4'][:,:,:] = Flux_Emissions_Total_annual
nc_out.close()
#Confirm file location
print('** SUCCESS **')
print("Gridded total abandoned oil and gas well fluxes written to file: {}" .format(os.getcwd())+gridded_outputfile)


----------
## Step 6. Plot Gridded Data
---------

#### Step 6.1. Plot Annual Emission Fluxes

In [None]:
#Plot Annual Data
scale_max = 0.5
save_flag = 0
save_fig = ''
data_plot_fn.plot_annual_emission_flux_map(Flux_Emissions_Total_annual, Lat_01, Lon_01, year_range, title_str,scale_max,save_flag,save_fig)

#### Step 6.2 Plot Difference between first and last inventory year

In [None]:
# Plot difference between last and first year
save_flag = 0
save_outfile = ''
data_plot_fn.plot_diff_emission_flux_map(Flux_Emissions_Total_annual, Lat_01, Lon_01, year_range, title_diff_str,save_flag,save_outfile)

In [None]:
ct = datetime.now() 
ft = ct.timestamp() 
time_elapsed = (ft-it)/(60*60)
print('Time to run: '+str(time_elapsed)+' hours')
print('** GEPA_1B2ab_Abandoned_Oil_Gas: COMPLETE **')