# Gridded EPA Methane Inventory
## Category: 1A Mobile Combustion

***
#### Authors: 
Joannes D. Maasakkers, Erin E. McDuffie
#### Date Last Updated: 
see Step 0
#### Notebook Purpose: 
This notebook calculates and reports annual gridded (0.1°x0.1°) methane emission fluxes (molec./cm2/s) from mobile combustion sources in the CONUS region for 2012-2018.
#### Summary & Notes:
EPA GHGI mobile combustion emissions from on-road and non-highway emission sources are read in at the national level from the GHGI (file via personal communication). For on-road sources, national emissions for each vehicle type (e.g., passenger, light, heavy duty, diesel) are allocated to states based on vehicle miles traveled, as a function of vehicle type, road type (e.g., primary, secondary, other), and whether the road types are in rural or urban regions. The vehicle miles traveled as a function of vehicle type, road type, and region are derived from state-level datasets from the U.S. Department of Transportation, Federal Highway Administration. State-level on-road emissions (as a function of vehicle type, road type, and region) are then allocated to a 0.01°x0.01° grid using high resolution maps of urban and rural roads, as a function of road type, derived from U.S. Census and U.S. DOT Highway Performance Monitoring System data. National-level non-highway emissions are allocated to the 0.1°x0.1° grid using gridded source-specific proxies, including maps of navigable waterways, railroads, mine locations, crop areas, and population. All emissions are re-gridded to a 0.1°x0.1° grid. Emissions are converted to emission flux. Annual emission fluxes (molec./cm2/s) are written to final netCDFs in the ‘/code/Final_Gridded_Data/’ folder. 
***
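
The state allocation described in the summary can be illustrated with a minimal sketch. It assumes that a national total for one vehicle type is split among states in proportion to each state's vehicle miles traveled (VMT). All names and numbers below are hypothetical and are not inventory data.

```python
import numpy as np

# Hypothetical inputs (illustrative values only, not inventory data)
national_emis_kt = 10.0                      # national emissions for one vehicle type
state_vmt = np.array([120.0, 30.0, 50.0])    # VMT for three made-up states

# Each state's share of the national VMT
state_share = state_vmt / state_vmt.sum()

# Allocate the national total in proportion to VMT
state_emis_kt = national_emis_kt * state_share

print(state_emis_kt)
print(state_emis_kt.sum())
```

Because the shares sum to one, the state values add back up to the national total, which is the property the allocation relies on.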

-------
## Step 0. Set-Up Notebook Modules, Functions, and Local Parameters and Constants
-------

In [None]:
#Confirm working directory
import os
import time
modtime = os.path.getmtime('./1A_Combustion_Mobile.ipynb')
modificationTime = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(modtime))
print("This file was last modified on: ", modificationTime)
print('')
print("The directory we are working in is {}" .format(os.getcwd()))

In [None]:
## Include plots within notebook
%matplotlib inline

In [None]:
# Import base modules
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import re
import datetime
from copy import copy

# Import additional modules
# Load plotting package Basemap 
from mpl_toolkits.basemap import Basemap

# Load netCDF (for manipulating netCDF file types)
from netCDF4 import Dataset

# Set up ticker
import matplotlib.ticker as ticker

#add path for the global function module (file)
import sys
module_path = os.path.abspath(os.path.join('../Global_Functions/'))
if module_path not in sys.path:
    sys.path.append(module_path)

# Load Tabula (for reading tables from PDFs)
import tabula as tb   
    
# Load user-defined global functions (modules)
import data_load_functions as data_load_fn
import data_functions as data_fn
import data_IO_functions as data_IO_fn
import data_plot_functions as data_plot_fn

In [None]:
#INPUT Files
# Assign global file names
global_filenames = data_load_fn.load_global_file_names()
State_ANSI_inputfile = global_filenames[0]
#County_ANSI_inputfile = global_filenames[1]
pop_map_inputfile = global_filenames[2]
Grid_area01_inputfile = global_filenames[3]
Grid_area001_inputfile = global_filenames[4]
Grid_state001_ansi_inputfile = global_filenames[5]
#Grid_county001_ansi_inputfile = global_filenames[6]
globalinputlocation = global_filenames[0][0:20]
print(globalinputlocation)

# Specify names of inputs files used in this notebook
#EPA Data
EPA_comb_inputfile = '../Global_InputData/GHGI/Ch3_Energy/Transport non-CO2.csv'

#Proxy Data file
MobComb_Mapping_inputfile = "./InputData/MobileCombustion_ProxyMapping.xlsx"


#Activity Data
#US DOT Federal Highway Statistics
State_vmt_file = "./InputData/vm2/vm2_"
State_vdf_file = "./InputData/vm4/vm4_"
Primary_roads_file = "./InputData/High_Resolution_Data/PrimaryRoads_"
Secondary_roads_file = "./InputData/High_Resolution_Data/PrimarySecondaryRoads_"
CONUS_rural_roads_file = "./InputData/High_Resolution_Data/CONUS_HPMS_Rural_Roads_001x001.csv"
CONUS_urban_roads_file = "./InputData/High_Resolution_Data/CONUS_HPMS_Urban_Roads_001x001.csv"
Urban_area_file = "./InputData/High_Resolution_Data/UrbanAreas_"

Waterways_file = './InputData/High_Resolution_Data/Navigable_Waterways_001x001.csv'
Mines_file = '../Global_InputData/MSHA/Mines.txt'
Railroad_file = './InputData/High_Resolution_Data/Railroads_'
Crop_file = globalinputlocation+'Gridded/AllCrops_'

#OUTPUT FILES
gridded_outputfile = '../Final_Gridded_Data/EPA_v2_1A_Combustion_Mobile.nc'
netCDF_description = 'Gridded EPA Inventory - Mobile Combustion Emissions - IPCC Source Category 1A'
title_str = "EPA methane emissions from mobile combustion"
title_diff_str = "Emissions from mobile combustion difference: 2018-2012"

#output gridded proxy data
grid_emi_outputfile = '../Final_Gridded_Data/Extension/v2_input_data/Combustion_Mobile_Grid_Emi.nc'

In [None]:
#SPECIFY RECALCS

#0 = don't recalculate, 1 = recalculate
ReCalc_Crop =0

In [None]:
# Define local variables
start_year = 2012  #First year in emission timeseries
end_year = 2018    #Last year in emission timeseries
year_range = [*range(start_year, end_year+1,1)] #List of emission years
year_range_str=[str(i) for i in year_range]
num_years = len(year_range)

# Define constants
Avogadro   = 6.02214129 * 10**(23)  #molecules/mol
Molarch4   = 16.04                  #g/mol
Res01      = 0.1                    # degrees (output grid resolution)
Res_01     = 0.01                   # degrees (high-resolution grid)
tg_scale   = 0.001                  #Tg scale number [New file allows for the exclusion of the territories] 

# Continental US Lat/Lon Limits (for netCDF files)
Lon_left = -130       #deg
Lon_right = -60       #deg
Lat_low  = 20         #deg
Lat_up  = 55          #deg
loc_dimensions = [Lat_low, Lat_up, Lon_left, Lon_right]

ilat_start = int((90+Lat_low)/Res01) #1100:1450 (continental US range)
ilat_end = int((90+Lat_up)/Res01)
ilon_start = abs(int((-180-Lon_left)/Res01)) #500:1200 (continental US range)
ilon_end = abs(int((-180-Lon_right)/Res01))

# Number of days in each month
month_day_leap  = [  31,  29,  31,  30,  31,  30,  31,  31,  30,  31,  30,  31]
month_day_nonleap = [  31,  28,  31,  30,  31,  30,  31,  31,  30,  31,  30,  31]

# Month arrays
month_range_str = ['January','February','March','April','May','June','July','August','September','October','November','December']
num_months = len(month_range_str)
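
The constants above support the final conversion from emitted mass to flux. The sketch below shows one way that conversion could look. It assumes emissions are expressed in Tg CH4 per year per grid cell and that the cell area is in cm2. The helper name `tg_per_year_to_flux` is hypothetical and is not a function from this notebook.

```python
import numpy as np

AVOGADRO = 6.02214129e23   # molecules/mol (same value as in the notebook)
MOLAR_CH4 = 16.04          # g/mol

def tg_per_year_to_flux(emis_tg, area_cm2, year):
    """Convert Tg CH4/yr in a grid cell to molec./cm2/s (illustrative helper)."""
    is_leap = (year % 4 == 0 and year % 100 != 0) or (year % 400 == 0)
    seconds = (366 if is_leap else 365) * 24 * 60 * 60
    # Tg -> g -> mol -> molecules
    molecules = emis_tg * 1e12 / MOLAR_CH4 * AVOGADRO
    return molecules / (area_cm2 * seconds)

# Toy example: 1e-6 Tg emitted over a 1e12 cm2 cell in a leap year
flux = tg_per_year_to_flux(np.array([1.0e-6]), np.array([1.0e12]), 2012)
print(flux)
```

The leap-year test changes only the number of seconds in the year, so the same annual mass gives a slightly lower flux in a leap year.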

In [None]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999;

In [None]:
# Track run time
ct = datetime.datetime.now() 
it = ct.timestamp() 
print("current time:", ct) 

____
## Step 1. Load in State ANSI data and Area Maps
_____

In [None]:
# State-level ANSI Data
#Read the state ANSI file array
State_ANSI, name_dict = data_load_fn.load_state_ansi(State_ANSI_inputfile)[0:2]
#QA: number of states
print('Read input file: '+ f"{State_ANSI_inputfile}")
print('Total "States" found: ' + '%.0f' % len(State_ANSI))
print(' ')

# 0.01 x0.01 degree Data
# State ANSI IDs and grid cell area (m2) maps
state_ANSI_map = data_load_fn.load_state_ansi_map(Grid_state001_ansi_inputfile)
state_ANSI_map = state_ANSI_map.astype('int32')
#county_ANSI_map = data_load_fn.load_county_ansi_map(Grid_county001_ansi_inputfile)
#county_ANSI_map = county_ANSI_map.astype('int32')
area_map, lat001, lon001 = data_load_fn.load_area_map_001(Grid_area001_inputfile)

# 0.1 x0.1 degree data
# grid cell area and state and county ANSI maps
area_map01, Lat01, Lon01 = data_load_fn.load_area_map_01(Grid_area01_inputfile)[0:3]
#Select relevant Continental 0.1 x0.1 domain
Lat_01 = Lat01[ilat_start:ilat_end]
Lon_01 = Lon01[ilon_start:ilon_end]
area_matrix_01 = data_fn.regrid001_to_01(area_map, Lat_01, Lon_01)
area_matrix_01 *= 10000  #convert from m2 to cm2

state_ANSI_map_01 = data_fn.regrid001_to_01(state_ANSI_map, Lat_01, Lon_01)

# Print time
ct = datetime.datetime.now() 
print("current time:", ct) 
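
The implementation of `data_fn.regrid001_to_01` is not shown in this notebook. As a rough illustration only, aggregating an extensive quantity such as cell area from a 0.01° grid to a 0.1° grid can be done by summing non-overlapping 10x10 blocks. The function `sum_regrid_001_to_01` below is a hypothetical stand-in with a simpler signature, not the notebook's function.

```python
import numpy as np

def sum_regrid_001_to_01(fine, factor=10):
    """Sum non-overlapping factor x factor blocks of a 2-D array."""
    nlat, nlon = fine.shape
    if nlat % factor or nlon % factor:
        raise ValueError("array shape must be divisible by the block factor")
    # Split each axis into (coarse index, index within block), then sum within blocks
    blocks = fine.reshape(nlat // factor, factor, nlon // factor, factor)
    return blocks.sum(axis=(1, 3))

fine_area = np.ones((20, 30))                  # toy 0.01-degree grid
coarse_area = sum_regrid_001_to_01(fine_area)  # toy 0.1-degree grid
print(coarse_area.shape)
```

This sketch covers summed quantities only. It makes no claim about how the notebook's function treats other kinds of maps.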

-------------
## Step 2: Read-in and Format Proxy Data
-------------

#### Step 2.1 Read In Proxy Mapping File & Make Proxy Arrays

In [None]:
##NOTE: Mobile combustion uses an additional road type, and urban/rural flag, which means that the proxy data
# have up to two added dimensions (road type and region)

#load GHGI Mapping Groups
names = pd.read_excel(MobComb_Mapping_inputfile, sheet_name = "GHGI Map - Mob. Comb.", usecols = "A:B",skiprows = 1, header = 0)
colnames = names.columns.values
ghgi_mob_map = pd.read_excel(MobComb_Mapping_inputfile, sheet_name = "GHGI Map - Mob. Comb.", usecols = "A:B", skiprows = 1, names = colnames)
#drop rows with no data, remove the parentheses and ""
ghgi_mob_map = ghgi_mob_map[ghgi_mob_map['GHGI_Emi_Group'] != 'na']
ghgi_mob_map = ghgi_mob_map[ghgi_mob_map['GHGI_Emi_Group'] != '-']
ghgi_mob_map = ghgi_mob_map[ghgi_mob_map['GHGI_Emi_Group'].notna()]
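#remove literal parentheses (regex=False gives the same result across pandas versions)
ghgi_mob_map['GHGI_Source']= ghgi_mob_map['GHGI_Source'].str.replace("(", "", regex=False)
ghgi_mob_map['GHGI_Source']= ghgi_mob_map['GHGI_Source'].str.replace(")", "", regex=False)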
ghgi_mob_map.reset_index(inplace=True, drop=True)
display(ghgi_mob_map)

#load emission group - proxy map
names = pd.read_excel(MobComb_Mapping_inputfile, sheet_name = "Proxy Map - Mob. Comb.", usecols = "A:F",skiprows = 1, header = 0)
colnames = names.columns.values
proxy_mob_map = pd.read_excel(MobComb_Mapping_inputfile, sheet_name = "Proxy Map - Mob. Comb.", usecols = "A:F", skiprows = 1, names = colnames)
display((proxy_mob_map))

#create empty proxy and emission group arrays (add months for proxy variables that have monthly data)
for igroup in np.arange(0,len(proxy_mob_map)):
    if proxy_mob_map.loc[igroup, 'Grid_Month_Flag'] ==0:
        if proxy_mob_map.loc[igroup, 'Grid_Urban_Rural_Flag'] >= 1:
            vars()[proxy_mob_map.loc[igroup,'Proxy_Group']] = np.zeros([2,len(Lat_01),len(Lon_01),num_years])
            vars()[proxy_mob_map.loc[igroup,'Proxy_Group']+'_nongrid'] = np.zeros([2,num_years])
        else:
            vars()[proxy_mob_map.loc[igroup,'Proxy_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years])
            vars()[proxy_mob_map.loc[igroup,'Proxy_Group']+'_nongrid'] = np.zeros([num_years])
    else:
        vars()[proxy_mob_map.loc[igroup,'Proxy_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years,num_months])
        vars()[proxy_mob_map.loc[igroup,'Proxy_Group']+'_nongrid'] = np.zeros([num_years,num_months])
        
    vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([num_years])
    
    if proxy_mob_map.loc[igroup,'State_Proxy_Group'] != '-':
        #if proxy_mob_map.loc[igroup,'State_Month_Flag'] == 0:
        if proxy_mob_map.loc[igroup, 'Urban_Rural_Flag'] >= 1:
            vars()[proxy_mob_map.loc[igroup,'State_Proxy_Group']] = np.zeros([2,len(State_ANSI),num_years])
        else:
            vars()[proxy_mob_map.loc[igroup,'State_Proxy_Group']] = np.zeros([len(State_ANSI),num_years])
        #else:
        #    vars()[proxy_mob_map.loc[igroup,'State_Proxy_Group']] = np.zeros([len(State_ANSI),num_years,num_months])
    else:
        continue # do not make state proxy variable if no variable assigned in mapping file
        
emi_group_names = np.unique(ghgi_mob_map['GHGI_Emi_Group'])

print('QA/QC: Is the number of emission groups the same for the proxy and emissions tabs?')
if (len(emi_group_names) == len(np.unique(proxy_mob_map['GHGI_Emi_Group']))):
    print('PASS')
else:
    print('FAIL')
    print(emi_group_names)
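
The cell above creates one variable per proxy group through `vars()`. A possible alternative, sketched here under simplified assumptions, is to hold the arrays in a dictionary keyed by group name. The table below is a toy stand-in that reuses only two of the mapping-file column names, and the group names `Map_A` and `Map_B` are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy mapping table with two of the column names used in the notebook
proxy_map = pd.DataFrame({
    "Proxy_Group": ["Map_A", "Map_B"],          # hypothetical group names
    "Grid_Urban_Rural_Flag": [1, 0],
})
n_lat, n_lon, n_years = 4, 5, 7                 # small illustrative dimensions

proxy_arrays = {}
for _, row in proxy_map.iterrows():
    if row["Grid_Urban_Rural_Flag"] >= 1:
        # leading dimension of 2 holds the urban/rural split
        shape = (2, n_lat, n_lon, n_years)
    else:
        shape = (n_lat, n_lon, n_years)
    proxy_arrays[row["Proxy_Group"]] = np.zeros(shape)

print({name: arr.shape for name, arr in proxy_arrays.items()})
```

A dictionary keeps the set of proxy arrays explicit and easy to loop over, at the cost of writing `proxy_arrays["Map_A"]` instead of a bare variable name.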

#### Step 2.2. Read in Federal Highway Administration Data (vehicle miles traveled by state, vehicle type, and road type)

##### Step 2.2.1 Read In Vehicle Miles Traveled, by State and Functional Type

In [None]:
#Read in state-level vehicle miles traveled by road type, from the Federal Highway Administration (e.g., road miles)

#Available roads [Urban / Rural]
#  - Interstate => INTERSTATE 
#  - Primary & Secondary => OTHER FREEWAYS AND EXPRESSWAYS / OTHER PRINCIPAL ARTERIAL / MINOR ARTERIAL
#  - Other =>  MAJOR COLLECTOR / MINOR COLLECTOR / LOCAL
    
#Map roads to road table (3 categories x state x year)
Miles_road_primary = np.zeros([2, len(State_ANSI), num_years])
Miles_road_secondary = np.zeros([2, len(State_ANSI), num_years])
Miles_road_other = np.zeros([2, len(State_ANSI), num_years])
total = np.zeros(num_years)
total2 = np.zeros(num_years)

for iyear in np.arange(0, num_years):
    names = pd.read_excel(State_vmt_file+year_range_str[iyear]+'.xls',  sheet_name = 'A', skiprows = 12, header = 0, nrows = 1)
    colnames = names.columns.values
    VMT_road = pd.read_excel(State_vmt_file+year_range_str[iyear]+'.xls', sheet_name = 'A', names = colnames, skiprows = 13, nrows = 51)
    #print(type(VMT_road))

    VMT_road.rename(columns = {'INTERSTATE':'RURAL - INTERSTATE', 'FREEWAYS  AND':'RURAL - FREEWAYS',\
                                     'PRINCIPAL':'RURAL - PRINCIPAL','MINOR':'RURAL - MINOR',\
                                     'MAJOR':'RURAL - MAJOR COLLECTOR','MINOR.1':'RURAL - MINOR COLLECTOR',\
                                     'LOCAL':'RURAL - LOCAL','TOTAL':'RURAL - TOTAL',\
                                     'INTERSTATE.1':'URBAN - INTERSTATE','FREEWAYS  AND.1':'URBAN - FREEWAYS',\
                                     'PRINCIPAL.1':'URBAN - PRINCIPAL','MINOR.2':'URBAN - MINOR',
                                     'MAJOR.1':'URBAN - MAJOR COLLECTOR','MINOR.3':'URBAN - MINOR COLLECTOR',
                                     'LOCAL.1':'URBAN - LOCAL','TOTAL.1':'URBAN - TOTAL',
                                     'TOTAL.2':'TOTAL'}, inplace = True)

    VMT_road['STATE'] = VMT_road['STATE'].str.replace("(2)", "", regex=False) #fix state names (drop footnote marker)
    VMT_road['STATE'] = VMT_road['STATE'].str.replace("Dist. of Columbia", "District of Columbia", regex=False) #fix state names
    #display(VMT_road)
    
    #Add state ID to dataframes
    VMT_road['ANSI'] = 0
    for idx in np.arange(len(VMT_road)):
        VMT_road.loc[idx,'ANSI'] = name_dict[VMT_road.loc[idx,'STATE'].strip()]
        istate = np.where(VMT_road.loc[idx,'ANSI'] == State_ANSI['ansi'])
        Miles_road_primary[0,istate,iyear] = VMT_road.loc[idx,'URBAN - INTERSTATE']
        Miles_road_primary[1,istate,iyear] = VMT_road.loc[idx,'RURAL - INTERSTATE']
        Miles_road_secondary[0,istate,iyear] = VMT_road.loc[idx,'URBAN - FREEWAYS']+VMT_road.loc[idx,'URBAN - PRINCIPAL']+VMT_road.loc[idx,'URBAN - MINOR']
        Miles_road_secondary[1,istate,iyear] = VMT_road.loc[idx,'RURAL - FREEWAYS']+VMT_road.loc[idx,'RURAL - PRINCIPAL']+VMT_road.loc[idx,'RURAL - MINOR']
        Miles_road_other[0,istate,iyear] = VMT_road.loc[idx,'URBAN - MAJOR COLLECTOR']+VMT_road.loc[idx,'URBAN - MINOR COLLECTOR']+VMT_road.loc[idx,'URBAN - LOCAL']
        Miles_road_other[1,istate,iyear] = VMT_road.loc[idx,'RURAL - MAJOR COLLECTOR']+VMT_road.loc[idx,'RURAL - MINOR COLLECTOR']+VMT_road.loc[idx,'RURAL - LOCAL']
        total[iyear] += np.sum(Miles_road_primary[:,istate,iyear])+np.sum(Miles_road_secondary[:,istate,iyear])+\
                        np.sum(Miles_road_other[:,istate,iyear])
        total2[iyear] += VMT_road.loc[idx,'TOTAL']
    
    #calc_emi += np.sum(Emissions_nongrid[iyear,:])
    #summary_emi = EPA_statcom_total.iloc[0,iyear+1] 
    abs_diff = abs(total[iyear]-total2[iyear])/((total[iyear]+total2[iyear])/2)
    #DEBUG## print(calc_emi)
    #DEBUG## print(summary_emi)
    if abs(abs_diff) < 0.0001:
        print('Year '+ year_range_str[iyear]+': Difference < 0.01%: PASS')
    else: 
        print('Year '+ year_range_str[iyear]+': Difference > 0.01%: FAIL, diff: '+str(abs_diff))
        print(total[iyear])
        print(total2[iyear])
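
The QA test above compares the summed road categories with the reported total using a symmetric relative difference and a 0.01% tolerance. A minimal sketch of that check as a reusable function is given below. The function names are hypothetical, and the zero-mean branch is an added guard that the cell above does not have.

```python
def relative_difference(a, b):
    """Symmetric relative difference: |a - b| divided by the mean of a and b."""
    mean = (a + b) / 2
    if mean == 0:
        # Added guard: avoid dividing by zero when the two values cancel or are both zero
        return 0.0 if a == b else float("inf")
    return abs(a - b) / abs(mean)

def qa_check(a, b, tol=1e-4):
    """Return True when the two totals agree to within tol (1e-4 = 0.01%)."""
    return relative_difference(a, b) < tol

print(qa_check(100.0, 100.001))   # totals agree closely
print(qa_check(100.0, 101.0))     # totals differ by about 1%
```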

##### Step 2.2.2 Read In Fraction of Vehicle Miles Traveled, by State, Functional Type, and Vehicle Type

In [None]:
#Read VMT per vehicle type & road-type (urban & rural)
#http://www.fhwa.dot.gov/policyinformation/statistics/2013/vm4.cfm

#Map percentages to emission categories
#Interstate / P&S / Other __ #Passenger cars / Light-Duty Trucks / Medium- and Heavy-Duty Trucks and buses
#0- urban, 1 - rural
# 0- primary roads, 1- secondary roads, 2- other roads
Per_vmt_mot = np.zeros([2,3, len(State_ANSI), num_years])
Per_vmt_pas = np.zeros([2,3, len(State_ANSI), num_years])
Per_vmt_lig = np.zeros([2,3, len(State_ANSI), num_years])
Per_vmt_hea = np.zeros([2,3, len(State_ANSI), num_years])
total_R = np.zeros(num_years)
total_U = np.zeros(num_years)
total = np.zeros(num_years)
total2_U = np.zeros(num_years)
total2 = np.zeros(num_years)
total2_R = np.zeros(num_years)

for iyear in np.arange(0,num_years):
    if year_range[iyear] ==2012 or year_range[iyear]==2016:
        continue #deal with missing data at the end
    else:
        #read in rural sheet
        names = pd.read_excel(State_vdf_file+year_range_str[iyear]+'.xls',  sheet_name = 'A', skiprows = 12, header = 0, nrows = 1)
        colnames = names.columns.values
        VMT_type_R = pd.read_excel(State_vdf_file+year_range_str[iyear]+'.xls', na_values=['-'],sheet_name = 'A', names = colnames, skiprows = 13, nrows = 51)

        VMT_type_R.rename(columns = {'MOTOR-':'INTERSTATE - MOTORCYCLES', 'PASSENGER':'INTERSTATE - PASSENGER CARS',\
                                     'LIGHT':'INTERSTATE - LIGHT TRUCKS','Unnamed: 4':'INTERSTATE - BUSES',\
                                     'SINGLE-UNIT':'INTERSTATE - SINGLE-UNIT TRUCKS','COMBINATION':'INTERSTATE - COMBINATION TRUCKS',\
                                     'Unnamed: 7':'INTERSTATE - TOTAL',
                                     'MOTOR-.1':'ARTERIALS - MOTORCYCLES', 'PASSENGER.1':'ARTERIALS - PASSENGER CARS',\
                                     'LIGHT.1':'ARTERIALS - LIGHT TRUCKS','Unnamed: 11':'ARTERIALS - BUSES',\
                                     'SINGLE-UNIT.1':'ARTERIALS - SINGLE-UNIT TRUCKS','COMBINATION.1':'ARTERIALS - COMBINATION TRUCKS',\
                                     'Unnamed: 14':'ARTERIALS - TOTAL',
                                     'MOTOR-.2':'OTHER - MOTORCYCLES', 'PASSENGER.2':'OTHER - PASSENGER CARS',\
                                     'LIGHT.2':'OTHER - LIGHT TRUCKS','Unnamed: 18':'OTHER - BUSES',\
                                     'SINGLE-UNIT.2':'OTHER - SINGLE-UNIT TRUCKS','COMBINATION.2':'OTHER - COMBINATION TRUCKS',\
                                     'Unnamed: 21':'OTHER - TOTAL'}, inplace = True)

        VMT_type_R['STATE'] = VMT_type_R['STATE'].str.replace("(2)", "", regex=False) #fix state names (drop footnote marker)
        VMT_type_R['STATE'] = VMT_type_R['STATE'].str.replace("Dist. of Columbia", "District of Columbia", regex=False) #fix state names
        VMT_type_R = VMT_type_R.fillna(0)
        
        #read in urban sheet
        names = pd.read_excel(State_vdf_file+year_range_str[iyear]+'.xls',  sheet_name = 'B', skiprows = 12, header = 0, nrows = 1)
        colnames = names.columns.values
        VMT_type_U = pd.read_excel(State_vdf_file+year_range_str[iyear]+'.xls', na_values=['-'],sheet_name = 'B', names = colnames, skiprows = 13, nrows = 51)


        VMT_type_U.rename(columns = {'MOTOR-':'INTERSTATE - MOTORCYCLES', 'PASSENGER':'INTERSTATE - PASSENGER CARS',\
                                     'LIGHT':'INTERSTATE - LIGHT TRUCKS','Unnamed: 4':'INTERSTATE - BUSES',\
                                     'SINGLE-UNIT':'INTERSTATE - SINGLE-UNIT TRUCKS','COMBINATION':'INTERSTATE - COMBINATION TRUCKS',\
                                     'Unnamed: 7':'INTERSTATE - TOTAL',
                                     'MOTOR-.1':'ARTERIALS - MOTORCYCLES', 'PASSENGER.1':'ARTERIALS - PASSENGER CARS',\
                                     'LIGHT.1':'ARTERIALS - LIGHT TRUCKS','Unnamed: 11':'ARTERIALS - BUSES',\
                                     'SINGLE-UNIT.1':'ARTERIALS - SINGLE-UNIT TRUCKS','COMBINATION.1':'ARTERIALS - COMBINATION TRUCKS',\
                                     'Unnamed: 14':'ARTERIALS - TOTAL',
                                     'MOTOR-.2':'OTHER - MOTORCYCLES', 'PASSENGER.2':'OTHER - PASSENGER CARS',\
                                     'LIGHT.2':'OTHER - LIGHT TRUCKS','Unnamed: 18':'OTHER - BUSES',\
                                     'SINGLE-UNIT.2':'OTHER - SINGLE-UNIT TRUCKS','COMBINATION.2':'OTHER - COMBINATION TRUCKS',\
                                     'Unnamed: 21':'OTHER - TOTAL'}, inplace = True)

        VMT_type_U['STATE'] = VMT_type_U['STATE'].str.replace("(2)", "", regex=False) #fix state names (drop footnote marker)
        VMT_type_U['STATE'] = VMT_type_U['STATE'].str.replace("Dist. of Columbia", "District of Columbia", regex=False) #fix state names
        VMT_type_U = VMT_type_U.fillna(0)
        #display(VMT_type_U)
        
        
        #Add state ID to dataframes
        VMT_type_R['ANSI'] = 0
        VMT_type_U['ANSI'] = 0
        for idx in np.arange(len(VMT_type_R)):
            VMT_type_R.loc[idx,'ANSI'] = name_dict[VMT_type_R.loc[idx,'STATE'].strip()]
            istate_R = np.where(VMT_type_R.loc[idx,'ANSI'] == State_ANSI['ansi'])
            Per_vmt_mot[1,0,istate_R,iyear] = VMT_type_R.loc[idx,'INTERSTATE - MOTORCYCLES']
            Per_vmt_mot[1,1,istate_R,iyear] = VMT_type_R.loc[idx,'ARTERIALS - MOTORCYCLES']
            Per_vmt_mot[1,2,istate_R,iyear] = VMT_type_R.loc[idx,'OTHER - MOTORCYCLES']
            Per_vmt_pas[1,0,istate_R,iyear] = VMT_type_R.loc[idx,'INTERSTATE - PASSENGER CARS']
            Per_vmt_pas[1,1,istate_R,iyear] = VMT_type_R.loc[idx,'ARTERIALS - PASSENGER CARS']
            Per_vmt_pas[1,2,istate_R,iyear] = VMT_type_R.loc[idx,'OTHER - PASSENGER CARS']
            Per_vmt_lig[1,0,istate_R,iyear] = VMT_type_R.loc[idx,'INTERSTATE - LIGHT TRUCKS']
            Per_vmt_lig[1,1,istate_R,iyear] = VMT_type_R.loc[idx,'ARTERIALS - LIGHT TRUCKS']
            Per_vmt_lig[1,2,istate_R,iyear] = VMT_type_R.loc[idx,'OTHER - LIGHT TRUCKS']
            Per_vmt_hea[1,0,istate_R,iyear] = VMT_type_R.loc[idx,'INTERSTATE - BUSES']+VMT_type_R.loc[idx,'INTERSTATE - SINGLE-UNIT TRUCKS']+VMT_type_R.loc[idx,'INTERSTATE - COMBINATION TRUCKS']
            Per_vmt_hea[1,1,istate_R,iyear] = VMT_type_R.loc[idx,'ARTERIALS - BUSES']+VMT_type_R.loc[idx,'ARTERIALS - SINGLE-UNIT TRUCKS']+VMT_type_R.loc[idx,'ARTERIALS - COMBINATION TRUCKS']
            Per_vmt_hea[1,2,istate_R,iyear] = VMT_type_R.loc[idx,'OTHER - BUSES']+VMT_type_R.loc[idx,'OTHER - SINGLE-UNIT TRUCKS']+VMT_type_R.loc[idx,'OTHER - COMBINATION TRUCKS']
            total_R[iyear] += np.sum(Per_vmt_mot[1,:,istate_R,iyear])+np.sum(Per_vmt_pas[1,:,istate_R,iyear])+\
                np.sum(Per_vmt_lig[1,:,istate_R,iyear])+np.sum(Per_vmt_hea[1,:,istate_R,iyear])
            total2_R[iyear] += VMT_type_R.loc[idx,'INTERSTATE - TOTAL']+VMT_type_R.loc[idx,'ARTERIALS - TOTAL']+VMT_type_R.loc[idx,'OTHER - TOTAL']

        for idx in np.arange(len(VMT_type_U)):
            VMT_type_U.loc[idx,'ANSI'] = name_dict[VMT_type_U.loc[idx,'STATE'].strip()]            
            istate_U = np.where(VMT_type_U.loc[idx,'ANSI'] == State_ANSI['ansi'])
            Per_vmt_mot[0,0,istate_U,iyear] = VMT_type_U.loc[idx,'INTERSTATE - MOTORCYCLES']
            Per_vmt_mot[0,1,istate_U,iyear] = VMT_type_U.loc[idx,'ARTERIALS - MOTORCYCLES']
            Per_vmt_mot[0,2,istate_U,iyear] = VMT_type_U.loc[idx,'OTHER - MOTORCYCLES']
            Per_vmt_pas[0,0,istate_U,iyear] = VMT_type_U.loc[idx,'INTERSTATE - PASSENGER CARS']
            Per_vmt_pas[0,1,istate_U,iyear] = VMT_type_U.loc[idx,'ARTERIALS - PASSENGER CARS']
            Per_vmt_pas[0,2,istate_U,iyear] = VMT_type_U.loc[idx,'OTHER - PASSENGER CARS']
            Per_vmt_lig[0,0,istate_U,iyear] = VMT_type_U.loc[idx,'INTERSTATE - LIGHT TRUCKS']
            Per_vmt_lig[0,1,istate_U,iyear] = VMT_type_U.loc[idx,'ARTERIALS - LIGHT TRUCKS']
            Per_vmt_lig[0,2,istate_U,iyear] = VMT_type_U.loc[idx,'OTHER - LIGHT TRUCKS']
            Per_vmt_hea[0,0,istate_U,iyear] = VMT_type_U.loc[idx,'INTERSTATE - BUSES']+VMT_type_U.loc[idx,'INTERSTATE - SINGLE-UNIT TRUCKS']+VMT_type_U.loc[idx,'INTERSTATE - COMBINATION TRUCKS']
            Per_vmt_hea[0,1,istate_U,iyear] = VMT_type_U.loc[idx,'ARTERIALS - BUSES']+VMT_type_U.loc[idx,'ARTERIALS - SINGLE-UNIT TRUCKS']+VMT_type_U.loc[idx,'ARTERIALS - COMBINATION TRUCKS']
            Per_vmt_hea[0,2,istate_U,iyear] = VMT_type_U.loc[idx,'OTHER - BUSES']+VMT_type_U.loc[idx,'OTHER - SINGLE-UNIT TRUCKS']+VMT_type_U.loc[idx,'OTHER - COMBINATION TRUCKS']
            total_U[iyear] += np.sum(Per_vmt_mot[0,:,istate_U,iyear])+np.sum(Per_vmt_pas[0,:,istate_U,iyear])+\
                np.sum(Per_vmt_lig[0,:,istate_U,iyear])+np.sum(Per_vmt_hea[0,:,istate_U,iyear])
            total2_U[iyear] += VMT_type_U.loc[idx,'INTERSTATE - TOTAL']+VMT_type_U.loc[idx,'ARTERIALS - TOTAL']+VMT_type_U.loc[idx,'OTHER - TOTAL']
   
        total[iyear] = total_U[iyear]+total_R[iyear]
        total2[iyear] = total2_R[iyear]+total2_U[iyear]
        abs_diff1 = abs(total[iyear]-total2[iyear])/((total[iyear]+total2[iyear])/2)
        #abs_diff2 = abs(total_R[iyear]-total2_R[iyear])/((total_R[iyear]+total2_R[iyear])/2)
        #DEBUG## print(calc_emi)
        #DEBUG## print(summary_emi)
        if abs(abs_diff1) < 0.0001:
            print('Year '+ year_range_str[iyear]+': Urban + Rural Difference < 0.01%: PASS')
        else: 
            print('Year '+ year_range_str[iyear]+': Urban + Rural Difference > 0.01%: FAIL, diff: '+str(abs_diff1))
            print(total[iyear])
            print(total2[iyear])

#Fill missing years: copy the 2013 values into 2012, and set 2016 to the average of 2015 and 2017
idx_2012 = (2012-start_year)
idx_2016 = (2016-start_year)
Per_vmt_mot[:,:,:,idx_2012] = Per_vmt_mot[:,:,:,idx_2012+1]
Per_vmt_pas[:,:,:,idx_2012] = Per_vmt_pas[:,:,:,idx_2012+1]
Per_vmt_lig[:,:,:,idx_2012] = Per_vmt_lig[:,:,:,idx_2012+1]
Per_vmt_hea[:,:,:,idx_2012] = Per_vmt_hea[:,:,:,idx_2012+1]

Per_vmt_mot[:,:,:,idx_2016] = 0.5*(Per_vmt_mot[:,:,:,idx_2016-1]+Per_vmt_mot[:,:,:,idx_2016+1])
Per_vmt_pas[:,:,:,idx_2016] = 0.5*(Per_vmt_pas[:,:,:,idx_2016-1]+Per_vmt_pas[:,:,:,idx_2016+1])
Per_vmt_lig[:,:,:,idx_2016] = 0.5*(Per_vmt_lig[:,:,:,idx_2016-1]+Per_vmt_lig[:,:,:,idx_2016+1])
Per_vmt_hea[:,:,:,idx_2016] = 0.5*(Per_vmt_hea[:,:,:,idx_2016-1]+Per_vmt_hea[:,:,:,idx_2016+1])
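
The gap filling above can be illustrated on a small array. The sketch below uses made-up values and a hypothetical array `frac` with the year on the last axis. It copies the following year into 2012 and averages the neighboring years for 2016, following the same rule as the cell above.

```python
import numpy as np

years = list(range(2012, 2019))
frac = np.zeros((2, len(years)))        # toy array: (category, year)

# Fill the years that have data with made-up values
for iyear, year in enumerate(years):
    if year not in (2012, 2016):
        frac[:, iyear] = [0.1 * iyear, 0.2 * iyear]

i2012 = years.index(2012)
i2016 = years.index(2016)

# 2012 has no data: reuse the 2013 values
frac[..., i2012] = frac[..., i2012 + 1]
# 2016 has no data: average the 2015 and 2017 values
frac[..., i2016] = 0.5 * (frac[..., i2016 - 1] + frac[..., i2016 + 1])

print(frac)
```

Indexing with `...` on the leading axes applies the same rule to an array of any rank, provided the year is the last axis.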

##### Step 2.2.3. Calculate State-Level Road Proxy - VMT per road/type, by state

In [None]:
# Calculate the vehicle miles traveled by state, year, vehicle type, and road type
# VMT = fraction of miles traveled as a function of vehicle type and state * road miles by road type

#initialize variables
# urban/rural x road type x state x year
#0 - urban, 1 - rural
# road types
# 0 - interstate, #1 - primary & secondary, #2 - other
vmt_pas = np.zeros([2,3,len(State_ANSI), num_years])
vmt_lig = np.zeros([2,3,len(State_ANSI), num_years])
vmt_hea = np.zeros([2,3,len(State_ANSI), num_years])
vmt_tot = np.zeros([2,len(State_ANSI), num_years])

#calculate the absolute number of VMT by region, road type, vehicle type, and state, for each year
# e.g. vmt_pas = VMT for passenger vehicles with dimensions = region (urban/rural), road type (primary, secondary,
# other), state, and year
# vmt_tot = region x state, year
# road mile variable dimensions (urban/rural, state, year)
# vmt percentage variable dimensions (urban/rural, road type, state, year)
for iyear in np.arange(0, num_years):
    vmt_pas[:,0,:,iyear] = Miles_road_primary[:,:,iyear] * Per_vmt_pas[:,0,:,iyear]
    vmt_pas[:,1,:,iyear] = Miles_road_secondary[:,:,iyear] * Per_vmt_pas[:,1,:,iyear]
    vmt_pas[:,2,:,iyear] = Miles_road_other[:,:,iyear] * Per_vmt_pas[:,2,:,iyear]
    
    vmt_lig[:,0,:,iyear] = Miles_road_primary[:,:,iyear] * Per_vmt_lig[:,0,:,iyear]
    vmt_lig[:,1,:,iyear] = Miles_road_secondary[:,:,iyear] * Per_vmt_lig[:,1,:,iyear]
    vmt_lig[:,2,:,iyear] = Miles_road_other[:,:,iyear] * Per_vmt_lig[:,2,:,iyear]
    
    vmt_hea[:,0,:,iyear] = Miles_road_primary[:,:,iyear] * Per_vmt_hea[:,0,:,iyear]
    vmt_hea[:,1,:,iyear] = Miles_road_secondary[:,:,iyear] * Per_vmt_hea[:,1,:,iyear]
    vmt_hea[:,2,:,iyear] = Miles_road_other[:,:,iyear] * Per_vmt_hea[:,2,:,iyear]
    
    vmt_tot[:,:,iyear] += Miles_road_primary[:,:,iyear]+Miles_road_secondary[:,:,iyear]+Miles_road_other[:,:,iyear]
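
The per-year, per-road-type products above can also be written as a single broadcast multiplication. The sketch below uses small random arrays with the same dimension order as the notebook (region, road type, state, year). The variable names are hypothetical, and the explicit loop is included only to confirm that the two forms agree.

```python
import numpy as np

n_states, n_years = 3, 2
rng = np.random.default_rng(0)

# Toy road-mile arrays: (urban/rural, state, year)
miles_primary = rng.random((2, n_states, n_years))
miles_secondary = rng.random((2, n_states, n_years))
miles_other = rng.random((2, n_states, n_years))
# Toy vehicle-type shares: (urban/rural, road type, state, year)
share_pas = rng.random((2, 3, n_states, n_years))

# Stack the road types on a new axis 1, giving (urban/rural, road type, state, year)
miles_by_road = np.stack([miles_primary, miles_secondary, miles_other], axis=1)
vmt_pas_vec = miles_by_road * share_pas

# Loop version, written like the cell above, for comparison
vmt_pas_loop = np.zeros_like(share_pas)
for iyear in range(n_years):
    vmt_pas_loop[:, 0, :, iyear] = miles_primary[:, :, iyear] * share_pas[:, 0, :, iyear]
    vmt_pas_loop[:, 1, :, iyear] = miles_secondary[:, :, iyear] * share_pas[:, 1, :, iyear]
    vmt_pas_loop[:, 2, :, iyear] = miles_other[:, :, iyear] * share_pas[:, 2, :, iyear]

print(np.allclose(vmt_pas_vec, vmt_pas_loop))
```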
 

#### Step 2.4. Read In and Format High Resolution Road Gridded Proxy
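
The gridding in this step assigns each road record to a 0.01° cell by integer division of its latitude and longitude offsets, then accumulates the road length in that cell. Records outside the domain are summed separately. A vectorized sketch of the same idea is shown below on a small toy domain. The function `grid_lengths` and its arguments are hypothetical, and the approach uses `np.add.at` rather than the row-by-row loop in the notebook.

```python
import numpy as np

def grid_lengths(lat, lon, length, lat_low, lat_up, lon_left, lon_right, res=0.01):
    """Accumulate lengths on a regular grid; return (grid, total outside the domain)."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    length = np.asarray(length, dtype=float)
    n_lat = int(round((lat_up - lat_low) / res))
    n_lon = int(round((lon_right - lon_left) / res))

    # Same strict inequalities as the loop-based domain test
    inside = (lon > lon_left) & (lon < lon_right) & (lat > lat_low) & (lat < lat_up)
    ilat = ((lat[inside] - lat_low) / res).astype(int)
    ilon = ((lon[inside] - lon_left) / res).astype(int)

    grid = np.zeros((n_lat, n_lon))
    # np.add.at accumulates correctly when several records fall in the same cell
    np.add.at(grid, (ilat, ilon), length[inside])
    return grid, length[~inside].sum()

# Toy 1 x 1 degree domain with four records; the last one lies outside the domain
grid, outside = grid_lengths(
    lat=[20.005, 20.005, 20.505, 30.0],
    lon=[-129.995, -129.995, -129.495, -100.0],
    length=[1.0, 2.0, 4.0, 8.0],
    lat_low=20.0, lat_up=21.0, lon_left=-130.0, lon_right=-129.0,
)
print(grid[0, 0], grid[50, 50], outside)
```

Plain fancy-index assignment (`grid[ilat, ilon] += ...`) would count repeated cells only once, which is why `np.add.at` is used here.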

In [None]:
#Make proxy maps for road types

# Maps of road length in each grid cell, by urban and rural region, and type of road (primary, secondary, other),
# are calculated as follows:
# 1) high resolution maps of all road types in urban and rural areas (in the CONUS region) are read in. These
#    files are only available for a single year and are used to calculate the 'other road' type later in the process
# 2) high resolution maps of primary and primary+secondary roads are read in. These files have year-specific
#    information and are used to calculate primary and secondary roads each year
# 3) split each of the three road types between urban and rural contributions. A map of urban area in the given year
#    is read in. This map is then used to calculate the urban roads as those in grid cells with urban area > 0.5.
#    Rural areas are then calculated as the total roads minus the urban roads


#Create gridded maps
# urban/rural -> 0 - urban, #1 - rural
# road type - > 0 - primary roads, 1 - secondary roads, 2 - other roads
map_roads = np.zeros([2,3,len(lat001),len(lon001),num_years])
map_roads_nongrid = np.zeros([2,3,num_years])
map_urban_area = np.zeros([len(lat001),len(lon001)]) 
    
# Step 1) Read in total urban and rural roads in the CONUS region and assign road lengths to grid  
roads_other_urb = pd.read_csv(CONUS_urban_roads_file, usecols=[2,3,4])
roads_other_rur = pd.read_csv(CONUS_rural_roads_file, usecols=[2,3,4])

for idx in np.arange(len(roads_other_urb)):
    if roads_other_urb['FIRST_Longitude'][idx] > Lon_left and roads_other_urb['FIRST_Longitude'][idx] < Lon_right and\
        roads_other_urb['FIRST_Latitude'][idx] > Lat_low and roads_other_urb['FIRST_Latitude'][idx] < Lat_up:
        #Set ilon and ilat
        ilat = int((roads_other_urb['FIRST_Latitude'][idx]  - Lat_low) /Res_01)
        ilon = int((roads_other_urb['FIRST_Longitude'][idx] - Lon_left)/Res_01)
        map_roads[0,2,ilat,ilon,0] += roads_other_urb['SUM_Shape_Length'][idx]        
    else:
        map_roads_nongrid[0,2,0] += roads_other_urb['SUM_Shape_Length'][idx]
print('Finished Processing All Urban Roads')
ct = datetime.datetime.now() 
print("current time:", ct) 

for idx in np.arange(len(roads_other_rur)): #set all to 'urban' for now, split urban vs rural for all roads later
    if roads_other_rur['FIRST_Longitude'][idx] > Lon_left and roads_other_rur['FIRST_Longitude'][idx] < Lon_right and \
        roads_other_rur['FIRST_Latitude'][idx] > Lat_low and roads_other_rur['FIRST_Latitude'][idx] < Lat_up:
        ilat = int((roads_other_rur['FIRST_Latitude'][idx]  - Lat_low) /Res_01)
        ilon = int((roads_other_rur['FIRST_Longitude'][idx] - Lon_left)/Res_01)
        map_roads[0,2,ilat,ilon,0] += roads_other_rur['SUM_Shape_Length'][idx]  
    else:
        map_roads_nongrid[0,2,0] += roads_other_rur['SUM_Shape_Length'][idx]  
del roads_other_urb, roads_other_rur
print('Finished Processing All Rural Roads')
ct = datetime.datetime.now() 
print("current time:", ct) 


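# Step 2) make yearly maps of primary, secondary, and other road types, split between urban and rural regions
# Keep an unmodified copy of the single-year all-roads map. The year-0 slice is overwritten inside the loop
# when primary and secondary roads are subtracted, so later years must not copy from it directly.
all_roads_base = map_roads[:,2,:,:,0].copy()
for iyear in np.arange(0, num_years):
    #Hold the all-roads map constant across years
    map_roads[0,2,:,:,iyear] = all_roads_base[0,:,:]
    map_roads[1,2,:,:,iyear] = all_roads_base[1,:,:]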
    
    #Read primary and secondary road maps
    roads_primary = pd.read_csv(Primary_roads_file+year_range_str[iyear]+'_001x001.csv', sep=',')
    roads_primsec = pd.read_csv(Secondary_roads_file+year_range_str[iyear]+'_001x001.csv', sep=',')

    for idx in np.arange(len(roads_primary)):
        if roads_primary['Longitude'][idx] > Lon_left and roads_primary['Longitude'][idx] < Lon_right and \
            roads_primary['Latitude'][idx] > Lat_low and roads_primary['Latitude'][idx] < Lat_up:
            #Set ilon and ilat
            ilat = int((roads_primary['Latitude'][idx]  - Lat_low) /Res_01)
            ilon = int((roads_primary['Longitude'][idx] - Lon_left)/Res_01)
            map_roads[0,0,ilat,ilon,iyear] += roads_primary['SUM_Shape_Length'][idx]
        else:
            map_roads_nongrid[0,0,iyear] += roads_primary['SUM_Shape_Length'][idx] 
    
    for idx in np.arange(len(roads_primsec)):
        if roads_primsec['Longitude'][idx] > Lon_left and roads_primsec['Longitude'][idx] < Lon_right and \
            roads_primsec['Latitude'][idx] > Lat_low and roads_primsec['Latitude'][idx] < Lat_up:
            #Set ilon and ilat
            ilat = int((roads_primsec['Latitude'][idx]  - Lat_low) /Res_01)
            ilon = int((roads_primsec['Longitude'][idx] - Lon_left)/Res_01)
            map_roads[0,1,ilat,ilon,iyear] += roads_primsec['SUM_Shape_Length'][idx]
        else:
            map_roads_nongrid[0,1,iyear] += roads_primsec['SUM_Shape_Length'][idx]
    
    # Calculate Secondary Road lengths
    # 1) secondary = secondary&primary - primary
    # 2) remove negative values
    # 3) repeat for non-grid arrays
    map_roads[0,1,:,:,iyear] = map_roads[0,1,:,:,iyear] - map_roads[0,0,:,:,iyear]
    map_roads[map_roads<0] = 0.0
    map_roads_nongrid[0,1,iyear] = map_roads_nongrid[0,1,iyear] - map_roads_nongrid[0,0,iyear]
    map_roads_nongrid[map_roads_nongrid<0] = 0.0

    # Calculate Other road lengths
    # 1) other = other - secondary - primary
    # 2) replace negatives with zeros
    map_roads[0,2,:,:,iyear] = map_roads[0,2,:,:,iyear] - map_roads[0,1,:,:,iyear] - map_roads[0,0,:,:,iyear]
    map_roads[map_roads < 0] = 0.0
    map_roads_nongrid[0,2,iyear] = map_roads_nongrid[0,2,iyear] - map_roads_nongrid[0,1,iyear] - map_roads_nongrid[0,0,iyear]
    map_roads_nongrid[map_roads_nongrid < 0] = 0.0
    
    print ('Gridded  primary roads length: ', np.sum(map_roads[0,0,:,:,iyear]))
    print ('Gridded  secondary roads length: ', np.sum(map_roads[0,1,:,:,iyear]))
    print ('Gridded  other roads length:   ', np.sum(map_roads[0,2,:,:,iyear]))

    # Step 3) Separate out urban vs rural portion
    # Read in the urban area and normalize to the grid cell area
    urban_area   = pd.read_csv(Urban_area_file+year_range_str[iyear]+'_001x001.csv', sep=',')
    urban_area['SUM_Area_Urbanized'] = urban_area['SUM_Area_Urbanized']/np.max(urban_area['SUM_Area_Urbanized'])*np.max(area_map)
    # create a map of urban areas
    for idx in np.arange(len(urban_area)):
        if urban_area['Longitude'][idx] > Lon_left and urban_area['Longitude'][idx] < Lon_right and\
            urban_area['Latitude'][idx] > Lat_low and urban_area['Latitude'][idx] < Lat_up:
            ilat = int((urban_area['Latitude'][idx]  - Lat_low) /Res_01)
            ilon = int((urban_area['Longitude'][idx] - Lon_left)/Res_01)
            map_urban_area[ilat,ilon] = urban_area['SUM_Area_Urbanized'][idx] / float(area_map[ilat,ilon])
    
    # Split urban vs rural contributions for all three road types
    # 1) make a temporary copy of all roads (for that road type) for the given year
    # 2) urban region = all roads with urban area > 0.5
    # 3) rural region = all roads - all urban roads
    # 4) re-assign urban fraction to map_roads array
    # 5) calculate the fraction of urban/rural roads and apply to off grid region
    # 6) repeat for all road types
    map_temp = map_roads[0,0,:,:,iyear].copy()
    map_temp[map_urban_area < 0.5] = 0.0
    map_roads[1,0,:,:,iyear] = map_roads[0,0,:,:,iyear] - map_temp[:,:]
    map_roads[0,0,:,:,iyear] = map_temp.copy()
    map_roads_nongrid[0,0,iyear] = np.sum(map_roads[0,0,:,:,iyear])/np.sum(map_roads[1,0,:,:,iyear]+map_roads[0,0,:,:,iyear])
    map_roads_nongrid[1,0,iyear] = np.sum(map_roads[1,0,:,:,iyear])/np.sum(map_roads[1,0,:,:,iyear]+map_roads[0,0,:,:,iyear])

    #secondary
    map_temp = map_roads[0,1,:,:,iyear].copy()
    map_temp[map_urban_area < 0.5] = 0.0
    map_roads[1,1,:,:,iyear] = map_roads[0,1,:,:,iyear] - map_temp[:,:]
    map_roads[0,1,:,:,iyear] = map_temp.copy()
    map_roads_nongrid[0,1,iyear] = np.sum(map_roads[0,1,:,:,iyear])/np.sum(map_roads[1,1,:,:,iyear]+map_roads[0,1,:,:,iyear])
    map_roads_nongrid[1,1,iyear] = np.sum(map_roads[1,1,:,:,iyear])/np.sum(map_roads[1,1,:,:,iyear]+map_roads[0,1,:,:,iyear])

    #other
    map_temp = map_roads[0,2,:,:,iyear].copy()
    map_temp[map_urban_area < 0.5] = 0.0
    map_roads[1,2,:,:,iyear] = map_roads[0,2,:,:,iyear] - map_temp[:,:]
    map_roads[0,2,:,:,iyear] = map_temp.copy()
    map_roads_nongrid[0,2,iyear] = np.sum(map_roads[0,2,:,:,iyear])/np.sum(map_roads[1,2,:,:,iyear]+map_roads[0,2,:,:,iyear])
    map_roads_nongrid[1,2,iyear] = np.sum(map_roads[1,2,:,:,iyear])/np.sum(map_roads[1,2,:,:,iyear]+map_roads[0,2,:,:,iyear])

    print('Year:', year_range_str[iyear])
    print ('Gridded  primary roads length: ', np.sum(map_roads[:,0,:,:,iyear]))
    print ('Gridded  primary roads urban:  ', np.sum(map_roads[0,0,:,:,iyear]))
    print ('Gridded  primary roads rural:  ', np.sum(map_roads[1,0,:,:,iyear]))
    print ('Gridded  primsec roads length: ', np.sum(map_roads[:,1,:,:,iyear]))
    print ('Gridded  primsec roads urban:  ', np.sum(map_roads[0,1,:,:,iyear]))
    print ('Gridded  primsec roads rural:  ', np.sum(map_roads[1,1,:,:,iyear]))
    print ('Gridded  other roads length:   ', np.sum(map_roads[:,2,:,:,iyear]))
    print ('Gridded  other roads urban:    ', np.sum(map_roads[0,2,:,:,iyear]))
    print ('Gridded  other roads rural:    ', np.sum(map_roads[1,2,:,:,iyear]))
    ct = datetime.datetime.now() 
    print("current time:", ct)
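
The row-by-row loops in this step place each record in a grid cell by integer division and accumulate its length, sending records outside the domain to a separate "non-grid" total. A minimal vectorized sketch of that binning follows. The function name `bin_lengths`, its arguments, and the assumption that the upper bounds equal `lat_low + nlat*res` and `lon_left + nlon*res` are illustrative and are not taken from this notebook.

```python
import numpy as np

def bin_lengths(lat, lon, length, lat_low, lon_left, res, nlat, nlon):
    """Accumulate lengths onto a (nlat, nlon) grid.

    Returns the gridded array and the summed length of off-grid records.
    Hypothetical helper; bounds are assumed to be lat_low + nlat*res etc.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    length = np.asarray(length, dtype=float)
    lat_up = lat_low + nlat * res
    lon_right = lon_left + nlon * res

    # Same strict inequalities as the loops in this notebook
    inside = (lon > lon_left) & (lon < lon_right) & (lat > lat_low) & (lat < lat_up)

    # Integer cell indices; np.minimum guards against floating-point overshoot
    ilat = np.minimum(((lat[inside] - lat_low) / res).astype(int), nlat - 1)
    ilon = np.minimum(((lon[inside] - lon_left) / res).astype(int), nlon - 1)

    grid = np.zeros((nlat, nlon))
    # np.add.at accumulates correctly when several records share a cell
    np.add.at(grid, (ilat, ilon), length[inside])
    return grid, length[~inside].sum()
```

`np.add.at` is used rather than `grid[ilat, ilon] += ...` because plain fancy-index assignment keeps only one contribution when indices repeat.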

#### Step 2.5. Make Maps of Waterways

In [None]:
# Load Waterways data and allocate to 0.1x0.1 degree map
# only include CONUS region

#Initialize 
map_waterways = np.zeros([len(Lat_01),len(Lon_01),num_years])
map_waterways_nongrid = np.zeros([num_years])

##Load waterways
waterways_loc = pd.read_csv(Waterways_file, sep=',')
#waterways_loc.head(1)

for idx in np.arange(len(waterways_loc)):
    if waterways_loc['Longitude'][idx] > Lon_left and waterways_loc['Longitude'][idx] < Lon_right and \
        waterways_loc['Latitude'][idx] > Lat_low and waterways_loc['Latitude'][idx] < Lat_up:
        ilat = int((waterways_loc['Latitude'][idx]  - Lat_low)/Res01)
        ilon = int((waterways_loc['Longitude'][idx] - Lon_left)/Res01)
        map_waterways[ilat,ilon,0] += waterways_loc['SUM_Shape_Length'][idx]
    else:
        map_waterways_nongrid[0] += waterways_loc['SUM_Shape_Length'][idx]

for iyear in np.arange(0,num_years):
    map_waterways[:,:,iyear] = map_waterways[:,:,0]
    print('Year:', year_range_str[iyear])
    print ('Database waterways length: ', np.sum(waterways_loc['SUM_Shape_Length']))
    print ('Gridded  waterways length: ', np.sum(map_waterways[:,:,iyear]))

#### Step 2.6. Make Maps of Mines

In [None]:
##Load mines and make gridded 0.1x0.1 degree map - only includes one year of data
# includes mines outside of CONUS

#Initialize map arrays
map_mines = np.zeros([len(Lat_01),len(Lon_01),num_years])
map_mines_nongrid = np.zeros(num_years)

mine_loc = pd.read_csv(Mines_file, sep='|', encoding= 'unicode_escape')
print ('Database mines: ', len(mine_loc))
mine_loc = mine_loc[mine_loc['LATITUDE']>0]
mine_loc = mine_loc[mine_loc['CURRENT_MINE_STATUS']=='Active']
mine_loc['LONGITUDE'] = 1*mine_loc['LONGITUDE']
mine_loc.reset_index(inplace=True, drop=True)
print ('Active mines with location: ', len(mine_loc))

for idx in np.arange(len(mine_loc)):
    if mine_loc['LONGITUDE'][idx] > Lon_left and mine_loc['LONGITUDE'][idx] < Lon_right and \
        mine_loc['LATITUDE'][idx] > Lat_low and mine_loc['LATITUDE'][idx] < Lat_up:
        ilat = int((mine_loc['LATITUDE'][idx] - Lat_low)/Res01)
        ilon = int((mine_loc['LONGITUDE'][idx] - Lon_left)/Res01)
        map_mines[ilat,ilon,0] += 1  #count mines in the first year only; copied to all years below
    else:
        map_mines_nongrid += 1
    
for iyear in np.arange(0,num_years):
    map_mines[:,:,iyear] = map_mines[:,:,0]
    print('Year:', year_range_str[iyear])
    print ('Gridded  mines: ', np.sum(map_mines[:,:,iyear]))

#### Step 2.7. Make Maps of Population

In [None]:
#Read population density map, convert to absolute population, regrid to 0.1 x0.1 degrees
# only includes CONUS region

pop_abs = np.zeros([len(Lat_01),len(Lon_01),num_years])
map_abs_nongrid = np.zeros(num_years)

pop_den_map = data_load_fn.load_pop_den_map(pop_map_inputfile)
pop_abs_001 = pop_den_map*area_map
pop_abs[:,:,0] = data_fn.regrid001_to_01(pop_abs_001, Lat_01, Lon_01)

for iyear in np.arange(0,num_years):
    pop_abs[:,:,iyear] = pop_abs[:,:,0]
    print('Year:', year_range_str[iyear])
    print ('Gridded  population: ', np.sum(pop_abs[:,:,iyear]))
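
The population map is built at 0.01° and then aggregated to 0.1° by `data_fn.regrid001_to_01`, whose implementation is not shown here. As a rough illustration of a total-conserving regrid, the sketch below sums each 10x10 block of fine cells into one coarse cell. `regrid_block_sum` is a hypothetical stand-in and may differ from the project helper.

```python
import numpy as np

def regrid_block_sum(arr, factor=10):
    """Sum non-overlapping factor x factor blocks of a 2-D array.

    Hypothetical stand-in for a 0.01-degree to 0.1-degree regrid that
    conserves the domain total (appropriate for extensive quantities
    such as absolute population or emissions, not for densities).
    """
    arr = np.asarray(arr, dtype=float)
    nlat, nlon = arr.shape
    if nlat % factor or nlon % factor:
        raise ValueError("array dimensions must be multiples of factor")
    # Split each axis into (coarse index, position within block), then sum the blocks
    blocks = arr.reshape(nlat // factor, factor, nlon // factor, factor)
    return blocks.sum(axis=(1, 3))
```

This is why the density map is multiplied by cell area before regridding: summing is only meaningful for absolute quantities.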


#### Step 2.8 Make Maps of Railroads

In [None]:
# read in railroad data and place on 0.1x0.1 degree map
#only includes CONUS region

map_rail = np.zeros([len(Lat_01),len(Lon_01),num_years])
map_rail_nongrid = np.zeros(num_years)

for iyear in np.arange(0,num_years):
    rail_loc = pd.read_csv(Railroad_file+year_range_str[iyear]+'_01x01.csv', sep=',')

    for idx in np.arange(len(rail_loc)):
        if rail_loc['Longitude'][idx] > Lon_left and rail_loc['Longitude'][idx] < Lon_right and \
            rail_loc['Latitude'][idx] > Lat_low and rail_loc['Latitude'][idx] < Lat_up:
            ilat = int((rail_loc['Latitude'][idx]  - Lat_low)/Res01)
            ilon = int((rail_loc['Longitude'][idx] - Lon_left)/Res01)
            map_rail[ilat,ilon,iyear] += rail_loc['SUM_Shape_Length'][idx]
        else:
            map_rail_nongrid[iyear] += rail_loc['SUM_Shape_Length'][idx]  #same length column as the gridded branch
    print('Year:', year_range_str[iyear])
    print ('Database rail length: ', np.sum(rail_loc['SUM_Shape_Length']))
    print ('Gridded  rail length: ', np.sum(map_rail[:,:,iyear]))

#### Step 2.9 Make Maps of Crops

In [None]:
#only includes CONUS region

if ReCalc_Crop ==1:
    map_crop = np.zeros([len(Lat_01),len(Lon_01),num_years])
    map_crop_nongrid = np.zeros(num_years)

    for iyear in np.arange(0,num_years):
        crop_loc = pd.read_csv(Crop_file+year_range_str[iyear]+'_001x001.csv', sep=',')
        for idx in np.arange(0,len(crop_loc)):
            if crop_loc['FIRST_Longitude'][idx] > Lon_left and crop_loc['FIRST_Longitude'][idx] < Lon_right and \
                crop_loc['FIRST_Latitude'][idx] > Lat_low and crop_loc['FIRST_Latitude'][idx] < Lat_up:
                ilat = int((crop_loc['FIRST_Latitude'][idx]  - Lat_low)/Res01)
                ilon = int((crop_loc['FIRST_Longitude'][idx] - Lon_left)/Res01)
                map_crop[ilat,ilon,iyear] += crop_loc['SUM_Area_AllCrops'][idx]
            else:
                map_crop_nongrid[iyear] += crop_loc['SUM_Area_AllCrops'][idx]
        print('Year:', year_range_str[iyear])
        print ('Database crop area: ', np.sum(crop_loc['SUM_Area_AllCrops']))
        print ('Gridded  crop area: ', np.sum(map_crop[:,:,iyear]))
        ct = datetime.datetime.now() 
        print("current time:", ct)
    np.save('./IntermediateOutputs/Crops_tempoutput', map_crop)
    np.save('./IntermediateOutputs/Crops_nongrid_tempoutput', map_crop_nongrid)
    
else:
    map_crop = np.load('./IntermediateOutputs/Crops_tempoutput.npy')
    map_crop_nongrid = np.load('./IntermediateOutputs/Crops_nongrid_tempoutput.npy')
    for iyear in np.arange(0,num_years):
        print('Year:', year_range_str[iyear])
        print ('Gridded  crop area: ', np.sum(map_crop[:,:,iyear]))
        ct = datetime.datetime.now() 
        print("current time:", ct)

-----------
## Step 3. Read in and Format US EPA GHGI Emissions
----------

#We only map emissions which are reported as being substantial in the inventory

In [None]:
# Read mobile combustion emissions (units = Gg (==kt))

names = pd.read_csv(EPA_comb_inputfile,  skiprows = 1, header = 0, nrows = 1)
colnames = names.columns.values
EPA_emi_mobcomb_CH4 = pd.read_csv(EPA_comb_inputfile, skiprows = 2, names = colnames, nrows = 18)
EPA_emi_mobcomb_CH4 = EPA_emi_mobcomb_CH4.fillna('')
EPA_emi_mobcomb_CH4 = EPA_emi_mobcomb_CH4.drop(columns = [str(n) for n in range(1990, start_year,1)])
EPA_emi_mobcomb_CH4.reset_index(inplace=True, drop=True)
#remove footnote asterisks; regex=False treats "*" as a literal character
EPA_emi_mobcomb_CH4['Fuel Type/Vehicle Type'] = EPA_emi_mobcomb_CH4['Fuel Type/Vehicle Type'].str.replace("*","", regex=False)
EPA_mobcomb_total = EPA_emi_mobcomb_CH4[EPA_emi_mobcomb_CH4['Fuel Type/Vehicle Type'] == 'Total']

display(EPA_mobcomb_total)
display(EPA_emi_mobcomb_CH4)

#### Step 3.2. Split Emissions into Gridding Groups (each group has the same proxy applied during state allocation and gridding)

In [None]:
#split GHG emissions into gridding groups, based on Mobile Combustion Proxy Mapping file

DEBUG =0
start_year_idx = EPA_emi_mobcomb_CH4.columns.get_loc(str(start_year))
end_year_idx = EPA_emi_mobcomb_CH4.columns.get_loc(str(end_year))+1
ghgi_mob_groups = ghgi_mob_map['GHGI_Emi_Group'].unique()
sum_emi = np.zeros([num_years])

for igroup in np.arange(0,len(ghgi_mob_groups)): #loop through all groups, find the GHGI sources in each group, and sum their emissions for each year
    ##DEBUG## print(ghgi_mob_groups[igroup])
    vars()[ghgi_mob_groups[igroup]] = np.zeros([num_years])
    source_temp = ghgi_mob_map.loc[ghgi_mob_map['GHGI_Emi_Group'] == ghgi_mob_groups[igroup], 'GHGI_Source']
    pattern_temp  = '|'.join(source_temp) 
   # print(pattern_temp) 
    emi_temp =EPA_emi_mobcomb_CH4[EPA_emi_mobcomb_CH4['Fuel Type/Vehicle Type'].str.contains(pattern_temp)]
    #display(emi_temp)
    if 'Light-Duty' in pattern_temp :
        vars()[ghgi_mob_groups[igroup]][:] = emi_temp.iloc[0,start_year_idx:] #only use the first value
    elif 'Passenger' in pattern_temp or 'Heavy' in pattern_temp:
        vars()[ghgi_mob_groups[igroup]][:] = emi_temp.iloc[0:2,start_year_idx:].sum()
    else:
        vars()[ghgi_mob_groups[igroup]][:] = emi_temp.iloc[:,start_year_idx:].sum()
    #display(vars()[ghgi_mob_groups[igroup]][:])
        
        
#Check against total summary emissions 
print('QA/QC #1: Check Processing Emission Sum against GHGI Summary Emissions')
for iyear in np.arange(0,num_years): 
    for igroup in np.arange(0,len(ghgi_mob_groups)):
        sum_emi[iyear] += vars()[ghgi_mob_groups[igroup]][iyear]
        
    summary_emi = EPA_mobcomb_total.iloc[0,iyear+1]  
    #Check 1 - make sure that the sum over all gridding groups equals the reported total
    diff1 = abs(sum_emi[iyear] - summary_emi)/((sum_emi[iyear] + summary_emi)/2)
    if DEBUG==1:
        print(summary_emi)
        print(sum_emi[iyear])
    if diff1 < 0.0001:
        print('Year ', year_range[iyear],': PASS, difference < 0.01%')
    else:
        print('Year ', year_range[iyear],': FAIL -- Difference = ', diff1*100,'%')
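
The QA/QC checks in this notebook compare a calculated total with the GHGI total using a symmetric relative difference, |a - b| / ((a + b)/2), and pass when it falls below a tolerance (1e-4, i.e. 0.01%, in most cells). A small sketch of that test is given below; the function names are illustrative and do not appear in the notebook.

```python
def relative_difference(calc, ref):
    """Symmetric relative difference |calc - ref| / mean(calc, ref), as a fraction."""
    mean = (calc + ref) / 2.0
    if mean == 0:
        # Both zero means a perfect match; otherwise the ratio is undefined
        return 0.0 if calc == ref else float("inf")
    return abs(calc - ref) / mean

def passes_qc(calc, ref, tol=1e-4):
    """True when the relative difference is below tol (1e-4 is 0.01%)."""
    return relative_difference(calc, ref) < tol

# Multiply by 100 before printing the value with a '%' sign
print("difference (%):", relative_difference(99.0, 101.0) * 100)
```

Keeping the fraction-to-percent conversion in one place avoids printing a fraction next to a '%' label.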

--------------
## Step 4. Grid Data
-------------

#### Step 4.1. Allocate emissions

##### Step 4.1.1 Assign the Appropriate Proxy Variable Names (state & grid)

In [None]:
# The names on the *left* need to match the 'ProxyMapping' 'State_Proxy_Group' names 
# (these are initialized in Step 2). 
# The names on the *right* are the variable names used to calculate the proxies in this code.
# Names on the right need to match those from the code in Step 2

#state proxies are in dimensions (urban/rural x road type x state x year)
State_Passenger = vmt_pas
State_Light = vmt_lig
State_Heavy = vmt_hea
State_AllRoads = vmt_tot


#state --> 0.01 proxies
Map_Roads = map_roads
Map_Waterways = map_waterways
Map_Railroads = map_rail
Map_Pop = pop_abs
Map_Farm = map_crop
Map_Mines = map_mines


##### Step 4.1.2. Allocate to the State level

In [None]:
# Calculate state-level emissions
# Emissions in kt
# State data = national GHGI emissions * state proxy/national total

# Note that national emissions are retained for groups that do not have state proxies (identified in the mapping file)
# and are gridded in the next step
DEBUG = 1

# Make placeholder emission arrays for each group
# Urban/Rural flag == 2 indicates that the proxy data have both road and region type information
# Urban/Rural flag == 1 indicates that the proxy data have region type information only
for igroup in np.arange(0,len(proxy_mob_map)):
    if proxy_mob_map.loc[igroup,'Urban_Rural_Flag'] ==2:
        vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([2,3,len(State_ANSI),num_years])
        vars()['NonState_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([2,3,num_years])
    elif proxy_mob_map.loc[igroup,'Urban_Rural_Flag'] ==1:
        vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([2,len(State_ANSI),num_years])
        vars()['NonState_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([2,num_years])
    else:
        vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(State_ANSI),num_years])
        vars()['NonState_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([num_years])
        
#Loop over years
for iyear in np.arange(num_years):
    #Loop over states
    for istate in np.arange(len(State_ANSI)):
        for igroup in np.arange(0,len(proxy_mob_map)):    
            if proxy_mob_map.loc[igroup,'State_Proxy_Group'] != '-' and proxy_mob_map.loc[igroup,'GHGI_Emi_Group'] != 'Emi_not_mapped':
                if proxy_mob_map.loc[igroup,'Urban_Rural_Flag'] ==2:
                    for iregion in np.arange(0,2):
                        for iroad in np.arange(0,3):
                            vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iregion,iroad,istate,iyear] = \
                                vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] * \
                        data_fn.safe_div(vars()[proxy_mob_map.loc[igroup,'State_Proxy_Group']][iregion,iroad,istate,iyear], \
                                         np.sum(vars()[proxy_mob_map.loc[igroup,'State_Proxy_Group']][:,:,:,iyear]))
                elif proxy_mob_map.loc[igroup,'Urban_Rural_Flag'] ==1:
                    for iregion in np.arange(0,2):
                        vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iregion,istate,iyear] = \
                                vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] * \
                        data_fn.safe_div(vars()[proxy_mob_map.loc[igroup,'State_Proxy_Group']][iregion,istate,iyear], \
                                         np.sum(vars()[proxy_mob_map.loc[igroup,'State_Proxy_Group']][:,:,iyear]))

            else:
                vars()['NonState_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] = vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear]
                
# Check sum of all gridded emissions + emissions not included in state allocation
print('QA/QC #1: Check weighted emissions against GHGI')   
for iyear in np.arange(0,num_years):
    summary_emi = EPA_mobcomb_total.iloc[0,iyear+1] 
    calc_emi = 0
    for igroup in np.arange(0,len(proxy_mob_map)):
        if proxy_mob_map.loc[igroup,'Urban_Rural_Flag'] ==2:
            calc_emi +=  np.sum(vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,:,iyear])+\
                        np.sum(vars()['NonState_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear])
        elif proxy_mob_map.loc[igroup,'Urban_Rural_Flag'] ==1:
            calc_emi +=  np.sum(vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear])+\
                        np.sum(vars()['NonState_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,iyear])
        else:
            calc_emi +=  np.sum(vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,iyear])
            calc_emi += vars()['NonState_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] #np.sum(Emissions[:,iyear]) + Emissions_nongrid[iyear] + Emissions_nonstate[iyear]
    if DEBUG ==1:
        print(summary_emi)
        print(calc_emi)
    diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if diff < 0.0002:
        print('Year ', year_range[iyear], ': PASS, difference < 0.02%')
    else:
        print('Year ', year_range[iyear], ': FAIL -- Difference = ', diff*100,'%')
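
The state allocation above multiplies each national group total by the state's share of the national proxy (state proxy divided by the sum over all states, regions, and road types). The sketch below shows that arithmetic on a small array. `safe_div` here is an assumed stand-in for `data_fn.safe_div` (returning zeros when the denominator is zero), and `allocate_to_states` is a hypothetical name.

```python
import numpy as np

def safe_div(x, y):
    """Divide x by scalar y, returning zeros if y is zero.

    Assumed behavior; the project's data_fn.safe_div may differ.
    """
    x = np.asarray(x, dtype=float)
    return x / y if y != 0 else np.zeros_like(x)

def allocate_to_states(national_emi, state_proxy):
    """Split a national total in proportion to a proxy array of any shape."""
    state_proxy = np.asarray(state_proxy, dtype=float)
    return national_emi * safe_div(state_proxy, state_proxy.sum())

# Example: proxy with dimensions (region, state), e.g. vehicle miles traveled
vmt = np.array([[1.0, 3.0],
                [2.0, 2.0]])
state_emi = allocate_to_states(16.0, vmt)
print(state_emi.sum())  # allocation conserves the national total
```

Because the shares sum to one whenever the proxy total is non-zero, the allocated emissions sum back to the national value, which is what the QA/QC check verifies.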

##### Step 4.1.3. Allocate emissions to the CONUS region (0.1x0.1)

In [None]:
# Allocate State-Level emissions (kt) onto a 0.1x0.1 grid using gridcell level 'Proxy_Groups'

DEBUG =1

# For each year, (2a) distribute state-level emissions onto a grid using proxies defined above ....
# To speed up the code, masks are used rather than looping individually through each lat/lon. 
# In this case, a mask of 1's is made for the grid cells that match the ANSI values for a given state
# The masked values are set to zero, remaining values = 1. 
# AK and HI and territories are removed from the analysis at this stage. 
# The emissions allocated to each state are at 0.01x0.01 degree resolution, as required to calculate accurate 'mask'
# arrays for each state. 
# (2b) For emission groups that were not first allocated to states, national emissions for those groups are gridded
# based on the relevant gridded proxy arrays (0.1x0.1 resolution). These emissions are at 0.1x0.1 degrees resolution. 
# (2c) - record 'not mapped' emission groups in the 'non-grid' array

print('**QA/QC Check: Sum of national gridded emissions vs. GHGI national emissions')
running_sum = np.zeros([len(proxy_mob_map),num_years])

for igroup in np.arange(0,len(proxy_mob_map)): #loop over all groups (starting at 0) so that every 'Ext_' array used below is created
    proxy_temp = vars()[proxy_mob_map.loc[igroup,'Proxy_Group']]
    proxy_temp_nongrid = vars()[proxy_mob_map.loc[igroup,'Proxy_Group']+'_nongrid']
    
    
    #2a. Step through each state (if group was previously allocated to state level)
    if proxy_mob_map.loc[igroup,'State_Proxy_Group'] != '-' and \
        proxy_mob_map.loc[igroup,'State_Proxy_Group'] != 'state_not_mapped':
        print('Group:',igroup,'of ',len(proxy_mob_map))
        vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_01'] = np.zeros([len(lat001),len(lon001),num_years])

        for istate in np.arange(0,len(State_ANSI)):
            #print(igroup,istate)
            
            if State_ANSI['abbr'][istate] not in {'AK','HI'} and istate < 51:
                mask_state = np.ma.ones(np.shape(state_ANSI_map))
                mask_state = np.ma.masked_where(state_ANSI_map != State_ANSI['ansi'][istate], mask_state)
                mask_state = np.ma.filled(mask_state,0) 
                if proxy_mob_map.loc[igroup, 'Grid_Urban_Rural_Flag'] ==2:
                    if proxy_mob_map.loc[igroup, 'Urban_Rural_Flag'] ==2:
                        for iyear in np.arange(0,num_years):
                            for iregion in np.arange(0,2):
                                for iroad in np.arange(0,3):
                                    emi_temp = vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iregion,iroad,istate,iyear]
                                    #print(emi_temp)
                                    if np.sum(mask_state*proxy_temp[iregion,iroad,:,:,iyear]) > 0 and emi_temp > 0: 
                                    # if state is on grid and proxy for that state is non-zero
                                        weighted_array = data_fn.safe_div(mask_state*proxy_temp[iregion,iroad,:,:,iyear], \
                                                                      np.sum(mask_state*proxy_temp[iregion,iroad,:,:,iyear]))
                                        #weighted_array_01 = data_fn.regrid001_to_01(weighted_array, Lat_01, Lon_01)
                                        #print(np.sum(weighted_array))
                                        Emissions_array_001[:,:,iyear] += emi_temp*weighted_array#_01
                                        vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_01'][:,:,iyear]+=emi_temp*weighted_array
                                        running_sum[igroup,iyear] += np.sum(emi_temp*weighted_array)
                                    else:
                                    #for imonth in np.arange(0,num_months):
                                        Emissions_nongrid[iyear] += emi_temp
                                        running_sum[igroup,iyear] += np.sum(emi_temp)
                        print(running_sum[igroup,iyear])
                
                    elif proxy_mob_map.loc[igroup, 'Urban_Rural_Flag'] ==1:
                        for iyear in np.arange(0, num_years):
                            for iregion in np.arange(0,2):
                                #for iroad in np.arange(0,3):
                                emi_temp = vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iregion,istate,iyear]
                                if np.sum(mask_state*np.sum(proxy_temp[iregion,:,:,:,iyear],axis=0)) > 0 and emi_temp > 0: 
                                    # if state is on grid and proxy for that state is non-zero
                                    weighted_array = data_fn.safe_div(mask_state*np.sum(proxy_temp[iregion,:,:,:,iyear],axis=0), \
                                                                  np.sum(mask_state*np.sum(proxy_temp[iregion,:,:,:,iyear],axis=0)))
                                    #weighted_array_01 = data_fn.regrid001_to_01(weighted_array, Lat_01, Lon_01)
                                    Emissions_array_001[:,:,iyear] += emi_temp*weighted_array#_01
                                    vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_01'][:,:,iyear]+=emi_temp*weighted_array
                                    #print(np.sum(weighted_array))
                                    running_sum[igroup,iyear] += np.sum(emi_temp*weighted_array)
                                    #print(emi_temp)
                                    #print(np.sum(emi_temp*weighted_array))
                                else:
                                    #for imonth in np.arange(0,num_months):
                                    Emissions_nongrid[iyear] += emi_temp
                                    running_sum[igroup,iyear] += np.sum(emi_temp)

            else:
                if proxy_mob_map.loc[igroup, 'Urban_Rural_Flag'] ==2:
                    for iyear in np.arange(0, num_years):
                        Emissions_nongrid[iyear] += np.sum(vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,istate,iyear])
                        running_sum[igroup,iyear] += np.sum(vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,istate,iyear])    

                elif proxy_mob_map.loc[igroup, 'Urban_Rural_Flag'] ==1:
                    for iyear in np.arange(0, num_years):
                        Emissions_nongrid[iyear] += np.sum(vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,istate,iyear])
                        running_sum[igroup,iyear] += np.sum(vars()['State_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,istate,iyear])    
         
    #2b. if emissions were not allocated to state, allocate national total to grid here (these are in 0.1x0.1 resolution)
    elif proxy_mob_map.loc[igroup,'State_Proxy_Group'] == '-':
        vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_temp'] = np.zeros([len(Lat_01),len(Lon_01),num_years])
        for iyear in np.arange(0,num_years):
            temp_sum = np.sum(vars()[proxy_mob_map.loc[igroup,'Proxy_Group']][:,:,iyear])+np.sum(vars()[proxy_mob_map.loc[igroup,'Proxy_Group']+'_nongrid'][iyear])
            emi_temp= vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] * \
                       data_fn.safe_div(vars()[proxy_mob_map.loc[igroup,'Proxy_Group']][:,:,iyear], temp_sum)
            Emissions_array_01_temp[:,:,iyear] += emi_temp
            vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_temp'][:,:,iyear]+= emi_temp
            #print(np.sum(vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_temp'][:,:,iyear]))
            Emissions_nongrid[iyear] += vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] *\
                        data_fn.safe_div(vars()[proxy_mob_map.loc[igroup,'Proxy_Group']+'_nongrid'][iyear], temp_sum)
            ##DEBUG## running_count += vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear]
            running_sum[igroup,iyear] += np.sum(emi_temp) + \
                        vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] *\
                        data_fn.safe_div(vars()[proxy_mob_map.loc[igroup,'Proxy_Group']+'_nongrid'][iyear], temp_sum)    
        print(running_sum[igroup,iyear])
    #2c. this is the case that GHGI emissions are not mapped (e.g., specified outside of CONUS in the GHGI)
    elif proxy_mob_map.loc[igroup,'Proxy_Group'] == 'Map_not_mapped':  
        for iyear in np.arange(0, num_years):
            Emissions_nongrid[iyear] += vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear]
            running_sum[igroup,iyear] += vars()[proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][iyear] 
        print(running_sum[igroup,iyear])

for igroup in np.arange(0, len(proxy_mob_map)):
    vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years])

for iyear in np.arange(0, num_years):    
    Emissions_array_01[:,:,iyear] = data_fn.regrid001_to_01(Emissions_array_001[:,:,iyear], Lat_01, Lon_01)
    Emissions_array_01[:,:,iyear] += Emissions_array_01_temp[:,:,iyear]
    calc_emi = np.sum(Emissions_array_01[:,:,iyear]) + np.sum(Emissions_nongrid[iyear]) 
    calc_emi2 = 0
    for igroup in np.arange(0, len(proxy_mob_map)):
        if proxy_mob_map.loc[igroup,'State_Proxy_Group'] != '-' and proxy_mob_map.loc[igroup,'State_Proxy_Group'] != 'state_not_mapped':
            vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear]= data_fn.regrid001_to_01(vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_01'][:,:,iyear], Lat_01, Lon_01)
            calc_emi2 += np.sum(vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear])
        else:
            #print(proxy_mob_map.loc[igroup,'GHGI_Emi_Group'])
            vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear] += vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']+'_temp'][:,:,iyear]
            calc_emi2 += np.sum(vars()['Ext_'+proxy_mob_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear])
    calc_emi2 += np.sum(Emissions_nongrid[iyear]) 

    summary_emi = EPA_mobcomb_total.iloc[0,iyear+1] 
    emi_diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if DEBUG==1:
        print(calc_emi)
        print(calc_emi2)
        print(summary_emi)
    if abs(emi_diff) < 0.0001:
        print('Year '+ year_range_str[iyear]+': Difference < 0.01%: PASS')
    else: 
        print('Year '+ year_range_str[iyear]+': Difference > 0.01%: FAIL, diff: '+str(emi_diff))
        
ct = datetime.datetime.now() 
print("current time:", ct)
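
Step 2a above distributes each state's emissions over the grid cells belonging to that state, weighted by the proxy within the state, and sends emissions to the non-grid total when the state has no proxy on the grid. A compact sketch of that logic follows. It uses a boolean comparison in place of the masked-array construction, and `grid_state_emissions` is a hypothetical name.

```python
import numpy as np

def grid_state_emissions(state_emi, state_id_map, proxy, state_id):
    """Spread one state's emissions over its grid cells in proportion to proxy.

    Returns (gridded emissions, off-grid emissions). Illustrative sketch only.
    """
    # 1 inside the state, 0 elsewhere (equivalent to the filled masked array)
    mask = (np.asarray(state_id_map) == state_id)
    weights = mask * np.asarray(proxy, dtype=float)
    total = weights.sum()
    if total > 0 and state_emi > 0:
        return state_emi * weights / total, 0.0
    # No proxy for this state on the grid: keep the mass in the non-grid total
    return np.zeros_like(weights), float(state_emi)

ansi_map = np.array([[1, 1],
                     [2, 2]])
roads = np.array([[1.0, 3.0],
                  [5.0, 0.0]])
gridded, nongrid = grid_state_emissions(4.0, ansi_map, roads, state_id=1)
missing, missing_nongrid = grid_state_emissions(4.0, ansi_map, roads, state_id=3)
```

Either branch conserves mass: the gridded array plus the off-grid value always equals the state's input emissions.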

##### Step 4.1.4. Save gridded emissions (kt)

In [None]:
#save gridded emissions for each gridding group - for extension

#Initialize file
data_IO_fn.initialize_netCDF(grid_emi_outputfile, netCDF_description, 0, year_range, loc_dimensions, Lat_01, Lon_01)

unique_groups = np.unique(proxy_mob_map['GHGI_Emi_Group'])
unique_groups = unique_groups[unique_groups != 'Emi_not_mapped']

nc_out = Dataset(grid_emi_outputfile, 'r+', format='NETCDF4')
#nc_out.createDimension('state', len(State_ANSI))

for igroup in np.arange(0,len(unique_groups)):
    print('Ext_'+unique_groups[igroup])
    if len(np.shape(vars()['Ext_'+unique_groups[igroup]])) ==4:
        ghgi_temp = np.sum(vars()['Ext_'+unique_groups[igroup]],axis=3) #sum over months if the data are monthly
    else:
        ghgi_temp = vars()['Ext_'+unique_groups[igroup]]

    # Write data to netCDF
    data_out = nc_out.createVariable('Ext_'+unique_groups[igroup], 'f8', ('lat', 'lon','year'), zlib=True)
    data_out[:,:,:] = ghgi_temp[:,:,:]

#save nongrid data to calculate non-grid fraction extension
data_out = nc_out.createVariable('Emissions_nongrid', 'f8', ('year'), zlib=True)  
data_out[:] = Emissions_nongrid[:]
nc_out.close()

#Confirm file location
print('** SUCCESS **')
print("Gridded emissions (kt) written to file: {}".format(os.getcwd()+grid_emi_outputfile))
print(' ')

del data_out, ghgi_temp, nc_out

#### Step 4.2 Calculate Gridded Emission Fluxes (molec./cm2/s) (0.1x0.1)

In [None]:
#Convert emissions to emission flux
# convert kt to molec/cm2/s

Flux_array_01 = np.zeros([len(Lat_01),len(Lon_01),num_years,num_months])
Flux_array_01_annual = np.zeros([len(Lat_01),len(Lon_01),num_years])
print('**QA/QC Check: Sum of national gridded emissions vs. GHGI national emissions')
  
for iyear in np.arange(0,num_years):
    calc_emi = 0
    #2012 and 2016 are the only leap years in the 2012-2018 inventory period
    if year_range[iyear]==2012 or year_range[iyear]==2016:
        year_days = np.sum(month_day_leap)
        #month_days = month_day_leap
    else:
        year_days = np.sum(month_day_nonleap)
        #month_days = month_day_nonleap
        
    #for imonth in np.arange(0,num_months):
    conversion_factor_01 = 10**9 * Avogadro / float(Molarch4 *year_days * 24 * 60 *60) / area_matrix_01
    Flux_array_01_annual[:,:,iyear] = Emissions_array_01[:,:,iyear]*conversion_factor_01
    #convert back to mass to check
    conversion_factor_annual = 10**9 * Avogadro / float(Molarch4 *year_days * 24 * 60 *60) / area_matrix_01
    calc_emi = np.sum(Flux_array_01_annual[:,:,iyear]/conversion_factor_annual)+np.sum(Emissions_nongrid[iyear])
    #calc_emi += np.sum(Emissions_nongrid[iyear,:])
    summary_emi = EPA_mobcomb_total.iloc[0,iyear+1] 
    emi_diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if DEBUG==1:
        print(calc_emi)
        print(summary_emi)
    if abs(emi_diff) < 0.0001:
        print('Year '+ year_range_str[iyear]+': Difference < 0.01%: PASS')
    else: 
        print('Year '+ year_range_str[iyear]+': Difference > 0.01%: FAIL, diff: '+str(emi_diff))
        
Flux_Emissions_Total_annual = Flux_array_01_annual
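
The conversion above turns annual emissions (kt) into a flux (molec./cm2/s) by converting kt to grams (10^9), grams to molecules (Avogadro's number divided by the molar mass of CH4), and then dividing by the seconds in the year and the grid-cell area in cm2. The sketch below works through this for a single grid cell. The constant values, the function names `kt_to_flux` and `flux_to_kt`, and the cell area are assumptions for illustration; the notebook defines its own `Avogadro`, `Molarch4`, and `area_matrix_01` elsewhere.

```python
import calendar

AVOGADRO = 6.02214076e23  # molec./mol (assumed value)
MOLAR_MASS_CH4 = 16.04    # g/mol (assumed value)


def seconds_in_year(year):
    """Number of seconds in a calendar year, accounting for leap years."""
    days = 366 if calendar.isleap(year) else 365
    return days * 24 * 60 * 60


def kt_to_flux(emi_kt, area_cm2, year):
    """Convert annual emissions (kt) to flux (molec./cm2/s)."""
    grams = emi_kt * 1e9                          # kt -> g
    molecules = grams / MOLAR_MASS_CH4 * AVOGADRO  # g -> mol -> molec.
    return molecules / seconds_in_year(year) / area_cm2


def flux_to_kt(flux, area_cm2, year):
    """Inverse conversion, used to check that mass is conserved."""
    molecules = flux * seconds_in_year(year) * area_cm2
    return molecules / AVOGADRO * MOLAR_MASS_CH4 / 1e9


# Hypothetical grid cell of 1e12 cm2 (100 km2) emitting 1 kt in 2012
area = 1e12
flux = kt_to_flux(1.0, area, 2012)
print(flux, flux_to_kt(flux, area, 2012))
```

Because the flux is an annual mean rate, the same mass spread over a leap year gives a slightly lower flux than over a non-leap year.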

-------------
## Step 5. Write netCDF
------------

In [None]:
# yearly data
#Initialize file
data_IO_fn.initialize_netCDF(gridded_outputfile, netCDF_description, 0, year_range, loc_dimensions, Lat_01, Lon_01)

# Write data to netCDF
nc_out = Dataset(gridded_outputfile, 'r+', format='NETCDF4')
nc_out.variables['emi_ch4'][:,:,:] = Flux_Emissions_Total_annual
nc_out.close()
#Confirm file location
print('** SUCCESS **')
print("Gridded mobile combustion fluxes written to file: {}".format(os.getcwd()+gridded_outputfile))

----------
## Step 6. Plot Gridded Data
---------

#### Step 6.1. Plot Annual Emission Fluxes

In [None]:
#Plot Annual Data
scale_max = 10
save_flag = 0
save_fig = ''
data_plot_fn.plot_annual_emission_flux_map(Flux_Emissions_Total_annual, Lat_01, Lon_01, year_range, title_str,scale_max,save_flag,save_fig)

#### Step 6.2 Plot Difference between first and last inventory year

In [None]:
# Plot difference between last and first year
save_flag = 0
save_outfile = ''
data_plot_fn.plot_diff_emission_flux_map(Flux_Emissions_Total_annual, Lat_01, Lon_01, year_range, title_diff_str,save_flag,save_outfile)

In [None]:
ct = datetime.datetime.now() 
ft = ct.timestamp() 
time_elapsed = (ft-it)/(60*60)
print('Time to run: '+str(time_elapsed)+' hours')
print('** GEPA_1A_Combustion_Mobile: COMPLETE **')