# Gridded EPA Methane Inventory
## Category: 3A Livestock Sector - Enteric Emissions

***
#### Authors: 
Joannes D. Maasakkers, Candice F. Z. Chen, Erin E. McDuffie
#### Date Last Updated: 
see Step 0
#### Notebook Purpose
This notebook calculates gridded (0.1° x 0.1°) annual emission fluxes of methane (molecules CH4/cm2/s) from enteric fermentation activities in the CONUS region for the years 2012-2018. Emission fluxes are reported at an annual time resolution. 
#### Summary & Notes 
The national EPA GHGI emissions data are read in from Table 5-2 of the public GHGI report. First, national emissions are allocated to each state and animal type using state-level EPA GHGI enteric fermentation emissions from the GHGI Inventory Enteric workbooks (for both cattle and other animals). Animal-specific emissions for 2018 are only available at the national level. National-level ratios of animal types from previous years are used to estimate state-level contributions for 2018. State-level emissions (as a function of animal type) are then allocated to the county level using USDA animal counts from the 2012 and 2017 Census. Animal counts for additional years are estimated through interpolation of census data. Resulting county-level emissions are then distributed onto a 0.1° x 0.1° grid (as a function of animal type) using a map of grid-level landcover probabilities from the USDA. Emissions as a function of animal type are then aggregated to total gridded enteric fermentation emissions. Total emissions are converted to annual emission fluxes (molec./cm2/s) and are written to final netCDFs in the '/code/Final_Gridded_Data/' folder. 
***


--------------
## Step 0. Set-Up Notebook Modules, Functions, and Local Parameters and Constants
_____

In [None]:
#Confirm working directory
import os
import time
modtime = os.path.getmtime('./3A_Livestock_Enteric.ipynb')
modificationTime = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(modtime))
print("This file was last modified on: ", modificationTime)
print('')
print("The directory we are working in is {}" .format(os.getcwd()))

In [None]:
# Include plots within notebook
%matplotlib inline

In [None]:
# Import base modules
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import re
import datetime
from copy import copy

# Import additional modules
from mpl_toolkits.basemap import Basemap

# Load netCDF (for manipulating netCDF file types)
from netCDF4 import Dataset

# Set up ticker
import matplotlib.ticker as ticker

#add path for the global function module (file)
import sys
module_path = os.path.abspath(os.path.join('../Global_Functions/'))
if module_path not in sys.path:
    sys.path.append(module_path)

# Load user-defined global functions (modules)
import data_load_functions as data_load_fn
import data_functions as data_fn
import data_IO_functions as data_IO_fn
import data_plot_functions as data_plot_fn

In [None]:
#INPUT Files
# Assign global file names
global_filenames = data_load_fn.load_global_file_names()
State_ANSI_inputfile = global_filenames[0]
County_ANSI_inputfile = global_filenames[1]
pop_map_inputfile = global_filenames[2]
Grid_area01_inputfile = global_filenames[3]
Grid_area001_inputfile = global_filenames[4]
Grid_state001_ansi_inputfile = global_filenames[5]
Grid_county001_ansi_inputfile = global_filenames[6]

# Specify names of inputs files used in this notebook
EPA_enteric_cattle_inputfile = "../Global_InputData/GHGI/Ch5_Agriculture/EntericOutputs_1990-2018-final.xlsx" #EPA_Enteric_Cattle.csv"
EPA_enteric_cattle2018_inputfile = "./InputData/EPA_Enteric_Cattle_2018.csv"
EPA_enteric_other_inputfile = "../Global_InputData/GHGI/Ch5_Agriculture/OtherEnteric18_master-Final.xlsx"
Census_12_inputloc =  "./InputData/USDA_Census/Census_12_"
Census_17_inputloc =  "./InputData/USDA_Census/Census_17_"
USDA_LUC_inputloc = "./InputData/Data_map/usda_luc_rank_"
EPA_AGR_inputfile = "../Global_InputData/GHGI/Ch5_Agriculture/Table 5-2.csv"

#Proxy Data file
Livestock_Mapping_inputfile = "./InputData/Livestock_Enteric_ProxyMapping.xlsx"

#Specify names of gridded output files
enteric_int_out = './IntermediateOutputs/Intermediate_EPA_v2_3A_Enteric_Fermentation.nc'

gridded_outputfile = '../Final_Gridded_Data/EPA_v2_3A_Enteric_Fermentation.nc'
netCDF_description = 'Gridded EPA Inventory - Enteric Fermentation Emissions - IPCC Source Category 3A'
title_str = "EPA methane emissions from enteric fermentation"
title_diff_str = "Emissions from enteric fermentation difference: 2018-2012"

#output gridded proxy data
grid_emi_outputfile = '../Final_Gridded_Data/Extension/v2_input_data/Livestock_Enteric_Grid_Emi.nc'

In [None]:
# Define local variables
start_year = 2012  #First year in emission timeseries
end_year = 2018    #Last year in emission timeseries
year_range = [*range(start_year, end_year+1,1)] #List of emission years
year_range_str=[str(i) for i in year_range]
num_years = len(year_range)

# Define constants
Avogadro   = 6.02214129 * 10**(23)  #molecules/mol
Molarch4   = 16.04                  #g/mol
Res01      = 0.1                    # degrees

# Continental US Lat/Lon Limits (for netCDF files)
Lon_left = -130       #deg
Lon_right = -60       #deg
Lat_low  = 20         #deg
Lat_up  = 55          #deg

loc_dimensions = [Lat_low, Lat_up, Lon_left, Lon_right]
ilat_start = int((90+Lat_low)/Res01) #1100:1450 (continental US range)
ilat_end = int((90+Lat_up)/Res01)
ilon_start = abs(int((-180-Lon_left)/Res01)) #500:1200 (continental US range)
ilon_end = abs(int((-180-Lon_right)/Res01))

# Number of days in each month
month_day_leap  = [  31,  29,  31,  30,  31,  30,  31,  31,  30,  31,  30,  31]
month_day_nonleap = [  31,  28,  31,  30,  31,  30,  31,  31,  30,  31,  30,  31]

# Month arrays
month_range_str = ['January','February','March','April','May','June','July','August','September','October','November','December']
num_months = len(month_range_str)
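
The constants above feed the final unit conversion described in the notebook summary (annual mass to molecules CH4/cm2/s). The following is a minimal sketch of that arithmetic, assuming emissions in kt/year and a grid-cell area in cm2. The function name `kt_per_year_to_flux` and the example inputs are hypothetical; the notebook's own conversion step is not shown in this section and may differ in detail, for example in how leap years are handled.

```python
import calendar

AVOGADRO = 6.02214129e23   # molecules/mol (same value as in the cell above)
MOLAR_CH4 = 16.04          # g/mol
G_PER_KT = 1e9             # grams per kilotonne

def kt_per_year_to_flux(emi_kt, area_cm2, year):
    """Convert an annual emission (kt CH4) to a flux (molecules CH4/cm2/s)."""
    days = 366 if calendar.isleap(year) else 365
    seconds = days * 24 * 60 * 60
    molecules = emi_kt * G_PER_KT / MOLAR_CH4 * AVOGADRO
    return molecules / seconds / area_cm2

# Hypothetical example: 1 kt/year spread over a 1e12 cm2 (100 km2) cell in 2012
example_flux = kt_per_year_to_flux(1.0, 1e12, 2012)
print(f"{example_flux:.3e} molecules CH4/cm2/s")
```

The conversion is linear in the emission total, so it can be applied to a whole gridded array at once by passing NumPy arrays for `emi_kt` and `area_cm2`.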

In [None]:
# Define animal type arrays
#Generate a vector with all the animal types in the county-level census data (use .csv naming convention):
#animal_array = ['Beef','Bison','Broilers','Cattle','Chickens','Dairy','Goats','Hogs','Horses',\
#                'Layers','OnFeed','Pullets','Roosters','Sheep','Turkeys']

#Generate a vector with the 9 animal types used to grid county-level emissions:
#Census data to LUC data mapping:
#Uniform --> animal
#Broilers + Turkeys --> brltrk
#On Feed --> ctlfed
#Beef + Bison +Cattle --> ctlinv
#Goats --> goat
#Hogs --> hogpig
#Horses --> hrspny
#Chickens + Layers + Pullets + Roosters --> lyrplt
#Dairy --> mlkcow
#Sheep --> shplmb
#luc_animal_array = ['animal','brltrk','ctlfed','ctlinv','goat','hogpig','hrspny','lyrplt','mlkcow','shplmb']

In [None]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999;
//prevent auto-scrolling

In [None]:
# Track run time
ct = datetime.datetime.now() 
it = ct.timestamp() 
print("current time:", ct) 

____
## Step 1. Load in State and County ANSI data and Area Maps
_____

In [None]:
# State-level ANSI Data
#Read the state ANSI file array
State_ANSI, name_dict, abbr_dict = data_load_fn.load_state_ansi(State_ANSI_inputfile)[0:3]
#QA: number of states
print('Read input file: '+ f"{State_ANSI_inputfile}")
print('Total "States" found: ' + '%.0f' % len(State_ANSI))
print(' ')



#County ANSI Data
#Includes State ANSI number, county ANSI number, county name, and county area (square miles)
County_ANSI = pd.read_csv(County_ANSI_inputfile,encoding='latin-1')

#QA: number of counties
print ('Read input file: ' + f"{County_ANSI_inputfile}")
print('Total "Counties" found (include PR): ' + '%.0f' % len(County_ANSI))
print(' ')

#Create a placeholder array for county data
county_array = np.zeros([len(County_ANSI),3])

#Populate array with State ANSI number (0), county ANSI number (1), and county area (2)
for icounty in np.arange(0,len(County_ANSI)):
    county_array[icounty,0] = int(County_ANSI.values[icounty,0])
    county_array[icounty,1] = int(County_ANSI.values[icounty,1])
    county_array[icounty,2] = County_ANSI.values[icounty,3]

# 0.01 x0.01 degree Data
# State ANSI IDs and grid cell area (m2) maps
state_ANSI_map = data_load_fn.load_state_ansi_map(Grid_state001_ansi_inputfile)
state_ANSI_map = state_ANSI_map.astype('int32')
county_ANSI_map = data_load_fn.load_county_ansi_map(Grid_county001_ansi_inputfile)
county_ANSI_map = county_ANSI_map.astype('int32')
area_map, lat001, lon001 = data_load_fn.load_area_map_001(Grid_area001_inputfile)

# 0.1 x0.1 degree data
# grid cell area and state and county ANSI maps
area_map01, Lat01, Lon01 = data_load_fn.load_area_map_01(Grid_area01_inputfile)[0:3]
#Select relevant Continental 0.1 x0.1 domain
Lat_01 = Lat01[ilat_start:ilat_end]
Lon_01 = Lon01[ilon_start:ilon_end]
area_matrix_01 = data_fn.regrid001_to_01(area_map, Lat_01, Lon_01)
area_matrix_01 *= 10000  #convert from m2 to cm2

state_ANSI_map_01 = data_fn.regrid001_to_01(state_ANSI_map, Lat_01, Lon_01)

# Print time
ct = datetime.datetime.now() 
print("current time:", ct) 
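
The cell above aggregates 0.01° grid-cell areas to the 0.1° grid with the user-defined `data_fn.regrid001_to_01`, whose implementation is not shown in this section. One common way to aggregate an extensive quantity such as area is to sum non-overlapping 10 x 10 blocks. The sketch below illustrates that idea only; `block_sum` is a hypothetical helper and is not claimed to match the notebook's function.

```python
import numpy as np

def block_sum(fine, factor=10):
    """Sum non-overlapping factor x factor blocks of a 2-D array."""
    ny, nx = fine.shape
    if ny % factor or nx % factor:
        raise ValueError("array dimensions must be multiples of the factor")
    return fine.reshape(ny // factor, factor, nx // factor, factor).sum(axis=(1, 3))

fine_area = np.ones((20, 30))        # hypothetical 0.01-degree cell areas
coarse_area = block_sum(fine_area)   # shape (2, 3); each coarse cell sums 100 fine cells
print(coarse_area.shape, coarse_area.sum())
```

Block summation conserves the total, which is the property that matters for areas and emissions. It is not appropriate for categorical maps such as ANSI codes, where a majority or dominant-value rule would be needed instead.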

---------------------------------------------
## Step 2. Read in and Format Proxy Data
--------------------------------

#### Step 2.1 Read In Proxy Mapping File & Make Proxy Arrays

In [None]:
#load GHGI Mapping Groups
names = pd.read_excel(Livestock_Mapping_inputfile, sheet_name = "GHGI Map - Livestock", usecols = "A:B",skiprows = 1, header = 0)
colnames = names.columns.values
ghgi_livestock_map = pd.read_excel(Livestock_Mapping_inputfile, sheet_name = "GHGI Map - Livestock", usecols = "A:B", skiprows = 1, names = colnames)
#drop rows with no data, remove the parentheses and ""
ghgi_livestock_map = ghgi_livestock_map[ghgi_livestock_map['GHGI_Emi_Group'] != 'na']
ghgi_livestock_map = ghgi_livestock_map[ghgi_livestock_map['GHGI_Emi_Group'].notna()]
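# Use literal (non-regex) replacement so the result does not depend on the pandas version's regex default
ghgi_livestock_map['GHGI_Source']= ghgi_livestock_map['GHGI_Source'].str.replace("(", "", regex=False)
ghgi_livestock_map['GHGI_Source']= ghgi_livestock_map['GHGI_Source'].str.replace(")", "", regex=False)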
ghgi_livestock_map.reset_index(inplace=True, drop=True)
display(ghgi_livestock_map)

#load emission group - proxy map
names = pd.read_excel(Livestock_Mapping_inputfile, sheet_name = "Proxy Map - Livestock", usecols = "A:G",skiprows = 1, header = 0)
colnames = names.columns.values
proxy_livestock_map = pd.read_excel(Livestock_Mapping_inputfile, sheet_name = "Proxy Map - Livestock", usecols = "A:G", skiprows = 1, names = colnames)
display((proxy_livestock_map))

#create empty proxy and emission group arrays (add months for proxy variables that have monthly data)
for igroup in np.arange(0,len(proxy_livestock_map)):
    if proxy_livestock_map.loc[igroup, 'Grid_Month_Flag'] ==0:
        vars()[proxy_livestock_map.loc[igroup,'Proxy_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years])
        vars()[proxy_livestock_map.loc[igroup,'Proxy_Group']+'_nongrid'] = np.zeros([num_years])
    else:
        vars()[proxy_livestock_map.loc[igroup,'Proxy_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years,num_months])
        vars()[proxy_livestock_map.loc[igroup,'Proxy_Group']+'_nongrid'] = np.zeros([num_years,num_months])
        
    vars()[proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([num_years])
    
    if proxy_livestock_map.loc[igroup,'State_Proxy_Group'] != '-':
        if proxy_livestock_map.loc[igroup,'State_Month_Flag'] == 0:
            vars()[proxy_livestock_map.loc[igroup,'State_Proxy_Group']] = np.zeros([len(State_ANSI),num_years])
        else:
            vars()[proxy_livestock_map.loc[igroup,'State_Proxy_Group']] = np.zeros([len(State_ANSI),num_years,num_months])
    else:
        continue # do not make state proxy variable if no variable assigned in mapping file
        
    if proxy_livestock_map.loc[igroup,'County_Proxy_Group'] != '-':
        if proxy_livestock_map.loc[igroup,'County_Month_Flag'] == 0:
            vars()[proxy_livestock_map.loc[igroup,'County_Proxy_Group']] = np.zeros([len(State_ANSI),len(County_ANSI),num_years])
        else:
            vars()[proxy_livestock_map.loc[igroup,'County_Proxy_Group']] = np.zeros([len(State_ANSI),len(County_ANSI),num_years,num_months])
    else:
        continue # do not make county proxy variable if no variable assigned in mapping file

        
emi_group_names = np.unique(ghgi_livestock_map['GHGI_Emi_Group'])

print('QA/QC: Is the number of emission groups the same for the proxy and emissions tabs?')
if (len(emi_group_names) == len(np.unique(proxy_livestock_map['GHGI_Emi_Group']))):
    print('PASS')
else:
    print('FAIL')
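
The cell above creates one array per proxy group by writing into `vars()`. A dictionary keyed by group name is a common alternative that keeps the arrays discoverable and avoids accidental name collisions. The following is a minimal sketch, assuming a small stand-in mapping table: the group names and grid dimensions are hypothetical, while the column names follow the mapping file.

```python
import numpy as np
import pandas as pd

num_years, num_months = 7, 12
n_lat, n_lon = 4, 5   # hypothetical grid size, for illustration only

# Hypothetical stand-in for proxy_livestock_map
proxy_map = pd.DataFrame({
    "Proxy_Group": ["Map_Beef", "Map_Dairy"],
    "Grid_Month_Flag": [0, 1],
})

proxy_arrays = {}
for row in proxy_map.itertuples(index=False):
    shape = [n_lat, n_lon, num_years]
    if row.Grid_Month_Flag != 0:
        shape.append(num_months)   # monthly proxies get an extra month axis
    proxy_arrays[row.Proxy_Group] = np.zeros(shape)

print({name: arr.shape for name, arr in proxy_arrays.items()})
```

With this layout, later steps can loop over `proxy_arrays.items()` rather than reconstructing variable names from strings.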

### Step 2.2 Read in the GHGI State Emissions Data

#### Step 2.2.1. Enteric Fermentation

##### Step 2.2.1.1 Read in 2012-2017 state emissions for beef, cattle, dairy, and onfeed

In [None]:
#Read in and format EPA enteric methane emissions (mt/yr) by state and animal type
# This includes all data from 2012-2017 (state-level values by animal type were not re-calculated
# for the year 2018 in the GHGI)

# Array dimensions:
# 10 animal types x state x year
emi_state_ent_animal = np.zeros([10,len(State_ANSI),num_years])

#Note: no county data for mules (set to 'uniform')

# Do cattle first

#Read and format EPA enteric methane emissions (metric tons/year) from cattle
#Loop over emission years and a) Read data, b) format data table, c) split into 4 animal categories, 
# then convert to kt, and insert annual data into final array
next_year_temp = np.zeros([50,num_years+1])

#first read in 2011 'next year emissions'
iyear=0
start_loc = (year_range[iyear]-1-1989)*34+3
num_rowvars = 27
Ent_Cat_2011 = pd.read_excel(EPA_enteric_cattle_inputfile,skiprows=(start_loc),nrows=num_rowvars, usecols='D:BB',sheet_name = 'Output Summary')
Ent_Cat_2011 = Ent_Cat_2011.set_index('Region').T.rename_axis('State').reset_index()#.rename_axis(None, 1).reset_index()
Ent_Cat_2011.fillna(0, inplace=True)
#these are added to the following year totals
next_year_temp[:,iyear] = Ent_Cat_2011['Steer <600 Next Yr.']+\
                                Ent_Cat_2011['Steer 600 to 700 Next Yr.']+\
                                Ent_Cat_2011['Steer 700 to 800 Next Yr.']+\
                                Ent_Cat_2011['Steer >800 Next Yr.']+\
                                Ent_Cat_2011['Heifer <600 Next Yr.']+\
                                Ent_Cat_2011['Heifer 600 to 700 Next Yr.']+\
                                Ent_Cat_2011['Heifer 700 to 800 Next Yr.']+\
                                Ent_Cat_2011['Heifer >800 Next Yr.']

for iyear in np.arange(0,num_years): 

    #a) Read data
    # Rows correspond to years (1989-2020), columns to state. For each year, 27 variables (rows) are reported
    # (34 rows total for each data block of years).Therefore, the first row that should be read for each year 
    # is the year index*34 + 3 header rows.
    start_loc = (year_range[iyear]-1989)*34+3
    num_rowvars = 27
    Ent_Cat_temp = pd.read_excel(EPA_enteric_cattle_inputfile,skiprows=(start_loc),nrows=num_rowvars, usecols='D:BB',sheet_name = 'Output Summary')


    #b) Format data table
    # Take transpose of the array to get State by Variable, fill NaN values with '0', rename columns, remove
    # extra rows, and select the first 50 rows (e.g., data for 50 states).
    #Ent_Cat_temp = Ent_Cat_temp.transpose()
    Ent_Cat_temp.fillna(0, inplace=True)
    Ent_Cat_temp = Ent_Cat_temp.set_index('Region').T.rename_axis('State').reset_index()#.rename_axis(None, 1).reset_index()
    
    #c) Separate cattle into their 4 categories (Beef, Cattle, Dairy, OnFeed)
    # For each, sum specified variables and remove the individual columns
    #Beef
    Ent_Cat_temp['Beef'] = Ent_Cat_temp['Bulls']+Ent_Cat_temp['Beef Calves']+\
                            Ent_Cat_temp['Beef Cows']+Ent_Cat_temp['Beef Repl. Heif. 0-12']+\
                            Ent_Cat_temp['Beef Repl. Heif. 12-24']
    Ent_Cat_temp.drop(['Bulls','Beef Cows','Beef Repl. Heif. 0-12','Beef Repl. Heif. 12-24',\
                       'Beef Calves'], axis=1, inplace=True)
    #Cattle
    Ent_Cat_temp['Cattle'] = Ent_Cat_temp['Steer Stockers']+Ent_Cat_temp['Heifer Stockers']
    Ent_Cat_temp.drop(['Steer Stockers','Heifer Stockers'], axis=1, inplace=True)
    #Dairy
    Ent_Cat_temp['Dairy'] = Ent_Cat_temp['Dairy Calves']+Ent_Cat_temp['Dairy Cows']+\
                            Ent_Cat_temp['Dairy Repl. Heif. 0-12']+\
                            Ent_Cat_temp['Dairy Repl. Heif. 12-24']
    Ent_Cat_temp.drop(['Dairy Calves','Dairy Cows','Dairy Repl. Heif. 0-12',\
                       'Dairy Repl. Heif. 12-24'], axis=1, inplace=True)
    #OnFeed
    Ent_Cat_temp['OnFeed'] = Ent_Cat_temp['Steer <600']+Ent_Cat_temp['Steer 600 to 700']+\
                                Ent_Cat_temp['Steer 700 to 800']+Ent_Cat_temp['Steer >800']+\
                                Ent_Cat_temp['Heifer <600']+Ent_Cat_temp['Heifer 600 to 700']+\
                                Ent_Cat_temp['Heifer 700 to 800']+Ent_Cat_temp['Heifer >800']
    #these are added to the following year totals (i.e., 2011 added to 2012, but won't use 2018 value here)
    next_year_temp[:,iyear+1] = Ent_Cat_temp['Steer <600 Next Yr.']+\
                                Ent_Cat_temp['Steer 600 to 700 Next Yr.']+\
                                Ent_Cat_temp['Steer 700 to 800 Next Yr.']+\
                                Ent_Cat_temp['Steer >800 Next Yr.']+\
                                Ent_Cat_temp['Heifer <600 Next Yr.']+\
                                Ent_Cat_temp['Heifer 600 to 700 Next Yr.']+\
                                Ent_Cat_temp['Heifer 700 to 800 Next Yr.']+\
                                Ent_Cat_temp['Heifer >800 Next Yr.']
    #display(Ent_Cat_temp)
    Ent_Cat_temp.drop(['Steer <600','Steer 600 to 700','Steer 700 to 800','Steer >800',\
                       'Heifer <600','Heifer 600 to 700','Heifer 700 to 800','Heifer >800',\
                       'Steer <600 Next Yr.','Steer 600 to 700 Next Yr.','Steer 700 to 800 Next Yr.',\
                       'Steer >800 Next Yr.','Heifer <600 Next Yr.','Heifer 600 to 700 Next Yr.',\
                       'Heifer 700 to 800 Next Yr.','Heifer >800 Next Yr.'], axis=1, inplace=True)
    
    #Build the timeseries array using data from each year
    for istate in np.arange(0, len(Ent_Cat_temp)):
        match_state = np.where(Ent_Cat_temp['State'][istate].rstrip() == State_ANSI['name'])[0][0]
        emi_state_ent_animal[0,match_state,iyear] = Ent_Cat_temp.loc[istate,'Beef']/1e3 #convert mt to kt
        emi_state_ent_animal[1,match_state,iyear] = Ent_Cat_temp.loc[istate,'Cattle']/1e3
        emi_state_ent_animal[2,match_state,iyear] = Ent_Cat_temp.loc[istate,'Dairy']/1e3
        emi_state_ent_animal[3,match_state,iyear] = (Ent_Cat_temp.loc[istate,'OnFeed']+next_year_temp[istate,iyear])/1e3
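
The row offsets used above follow from the layout described in the cell's comments: one 34-row block per year, counted from 1989, preceded by 3 header rows. The following is a small sketch of that arithmetic. The helper name `block_start_row` is hypothetical, and the layout values are taken from the comments rather than verified against the workbook.

```python
def block_start_row(year, first_year=1989, block_rows=34, header_rows=3):
    """Number of rows to skip so that reading starts at the block for `year`."""
    if year < first_year:
        raise ValueError("year precedes the first block in the sheet")
    return (year - first_year) * block_rows + header_rows

# Offsets for 2011 (needed for the 'Next Yr.' carry-over) through 2018
offsets = {year: block_start_row(year) for year in range(2011, 2019)}
print(offsets[2011], offsets[2012])
```

Keeping the offset in one function makes it easier to spot an off-by-one-block error if the workbook layout changes between inventory releases.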
        

##### Step 2.2.1.2 Read in 2018 GHGI state emissions for beef, cattle, dairy, and onfeed

In [None]:
# calculate 2018 state levels for cattle

# Read and format the EPA enteric methane emissions (kt/year) from cattle, for the year 2018.

# The 2018 data are only available as national totals. To calculate state contributions in 2018, 
# calculate the 2017 fractional contributions of dairy, beef, cattle, and on feed emissions in 
# each state relative to the national total emissions. Then apply these fractional state 
# contributions to the 2018 national total emissions to calculate 2018 state-level emissions. 

national_2018_ent = np.zeros([4])

#a) Read the 2018 national methane emissions data (kt/year). The Livestock type 
# data are on the first 29 rows. Format by dropping the first two empty rows, reset
# the index numbers, and convert to floating point numbers.
Ent_Cat_18 = pd.read_excel(EPA_enteric_cattle_inputfile,skiprows=20,nrows=32, usecols='A,E',sheet_name = '2018 Calculations')
Ent_Cat_18.dropna(axis=0,inplace=True)
Ent_Cat_18.reset_index(inplace=True,drop=True)
Ent_Cat_18['2018 Methane Emissions'] = pd.to_numeric(Ent_Cat_18['2018 Methane Emissions'])

#b) Calculate the national emissions (kt/year) for each aggregate animal type as the sum over the
# relevant livestock types in 2018.
#Dairy
select_variables = ['DAIRY ']
mask = Ent_Cat_18.loc[Ent_Cat_18['CEFM Livestock Type'].isin(select_variables)]
national_2018_ent[2] = float(sum(mask['2018 Methane Emissions'])) #kt
#Beef
select_variables = ['Bulls','Beef Calves','Beef Cows','Beef Replacements 7-11 months','Beef Replacements 12-23 months']
mask = Ent_Cat_18.loc[Ent_Cat_18['CEFM Livestock Type'].isin(select_variables)]
national_2018_ent[0] = float(sum(mask['2018 Methane Emissions'])) #kt
#Cattle
select_variables = ['Steer Stockers','Heifer Stockers']
mask = Ent_Cat_18.loc[Ent_Cat_18['CEFM Livestock Type'].isin(select_variables)]
national_2018_ent[1] = float(sum(mask['2018 Methane Emissions'])) #kt
#OnFeed
select_variables = ['Steer Feedlot','Heifer Feedlot']
mask = Ent_Cat_18.loc[Ent_Cat_18['CEFM Livestock Type'].isin(select_variables)]
national_2018_ent[3] = float(sum(mask['2018 Methane Emissions'])) #kt
#print(national_2018_ent)
#c) Calculate state-level emissions from national totals, using 2017 fractional state contributions
# for each aggregate animal type (beef, cattle, dairy, onfeed). Then put the calculated emissions for each 
# state into the final enteric emissions array for the year 2018. 
# 2018 = 2017 state fraction * 2018 national total
index_2017 = year_range.index(2017)
for ianimal in np.arange(0,4):
    statefractions_17_temp = emi_state_ent_animal[ianimal,:,index_2017]/np.sum(emi_state_ent_animal[ianimal,:,index_2017])
    #print(np.sum(statefractions_17_temp))
    emi_state_ent_animal[ianimal,:,index_2017+1] = national_2018_ent[ianimal] * statefractions_17_temp
    
    #beef, cattle, dairy, onfeed
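
The scaling in step c) above is a proportional allocation: each state's 2018 value is its 2017 share of the national total multiplied by the 2018 national total. The following is a minimal sketch with made-up numbers. The helper `scale_to_national` and the zero-total guard are illustrative additions, not part of the notebook.

```python
import numpy as np

def scale_to_national(state_prev, national_total):
    """Distribute a national total across states using the previous year's state shares."""
    total_prev = state_prev.sum()
    if total_prev == 0:
        # No prior-year emissions to derive shares from; return zeros instead of NaN
        return np.zeros_like(state_prev, dtype=float)
    return national_total * state_prev / total_prev

state_2017 = np.array([10.0, 30.0, 60.0])   # hypothetical state emissions (kt)
state_2018 = scale_to_national(state_2017, 110.0)
print(state_2018, state_2018.sum())
```

By construction the state values sum to the national total, so this step cannot introduce a mismatch with the national 2018 figures it was given.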

##### Step 2.2.1.3 Read in 2012-2018 GHGI state emissions for all other animal types

In [None]:
Ent_Other_temp = pd.read_excel(EPA_enteric_other_inputfile, sheet_name = 'ERG Pops by State')

emi_state_ent_animal[4:10,:,:] = 0 #make sure all values are set to zero before populating

#emissions (kt) = total counts of animals * emission factor / 1e6 (to get from kg/head to kt)
# emission factors from 'EPA_enteric_other_inputfile', 'Emissions' tab
# bison = 82.2 kg CH4/head/year
# goats = 5.0 kg CH4/head/year
# horses = 18.0 kg CH4/head/year
# mules = 10.0 kg CH4/head/year
# sheep = 8.0 kg CH4/head/year
# Swine = 1.5 kg CH4/head/year

for iyear in np.arange(0, num_years):
    temp_bis = Ent_Other_temp.loc[(Ent_Other_temp['year']==year_range[iyear]) & (Ent_Other_temp['animal']=='bison')]
    temp_bis.reset_index(inplace=True,drop=True)
    temp_goa = Ent_Other_temp.loc[(Ent_Other_temp['year']==year_range[iyear]) & (Ent_Other_temp['animal']=='goats')]
    temp_goa.reset_index(inplace=True,drop=True)
    temp_hor = Ent_Other_temp.loc[(Ent_Other_temp['year']==year_range[iyear]) & (Ent_Other_temp['animal']=='horses')]
    temp_hor.reset_index(inplace=True,drop=True)
    temp_mul = Ent_Other_temp.loc[(Ent_Other_temp['year']==year_range[iyear]) & (Ent_Other_temp['animal']=='mules')]
    temp_mul.reset_index(inplace=True,drop=True)
    temp_she = Ent_Other_temp.loc[(Ent_Other_temp['year']==year_range[iyear]) & (Ent_Other_temp['animal']=='sheep')]
    temp_she.reset_index(inplace=True,drop=True)
    temp_swi = Ent_Other_temp.loc[(Ent_Other_temp['year']==year_range[iyear]) & (Ent_Other_temp['animal'].str.contains('swine_120_179|swine_180|swine_50|swine_50_119|swine_breeding'))]
    temp_swi.reset_index(inplace=True,drop=True)
    
    #populate state array
    for istate in np.arange(0, len(temp_bis)):
        match_state = np.where(temp_bis['state'][istate].rstrip() == State_ANSI['abbr'])[0][0]
        emi_state_ent_animal[4,match_state,iyear] = temp_bis.loc[istate,'head']*82.2/1e6 #(kt)
    for istate in np.arange(0, len(temp_goa)):
        match_state = np.where(temp_goa['state'][istate].rstrip() == State_ANSI['abbr'])[0][0]
        emi_state_ent_animal[5,match_state,iyear] = temp_goa.loc[istate,'head']*5/1e6 #(kt)
    for istate in np.arange(0, len(temp_hor)):
        match_state = np.where(temp_hor['state'][istate].rstrip() == State_ANSI['abbr'])[0][0]
        emi_state_ent_animal[6,match_state,iyear] = temp_hor.loc[istate,'head']*18/1e6 #(kt)
    for istate in np.arange(0, len(temp_mul)):
        match_state = np.where(temp_mul['state'][istate].rstrip() == State_ANSI['abbr'])[0][0]
        emi_state_ent_animal[7,match_state,iyear] = temp_mul.loc[istate,'head']*10/1e6 #(kt)
    for istate in np.arange(0, len(temp_she)):
        match_state = np.where(temp_she['state'][istate].rstrip() == State_ANSI['abbr'])[0][0]
        emi_state_ent_animal[8,match_state,iyear] = temp_she.loc[istate,'head']*8/1e6 #(kt)
    for istate in np.arange(0, len(temp_swi)):
        #print(temp_swi['state'][istate])
        match_state = np.where(temp_swi['state'][istate].rstrip() == State_ANSI['abbr'])[0][0]
        # swine rows cover several size classes per state, so accumulate with +=
        emi_state_ent_animal[9,match_state,iyear] += temp_swi.loc[istate,'head']*1.5/1e6 #(kt)
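
The cell above computes emissions as head count times an emission factor, divided by 1e6 to go from kg to kt. The same arithmetic can be expressed with a lookup table and a group-by. The sketch below is illustrative only: the head counts are made up, the emission factors are copied from the comments in the cell, and the column names mirror those used there.

```python
import pandas as pd

# Emission factors (kg CH4/head/year) as listed in the comments above
EF_KG_PER_HEAD = {"bison": 82.2, "goats": 5.0, "horses": 18.0,
                  "mules": 10.0, "sheep": 8.0, "swine": 1.5}

# Hypothetical head counts, for illustration only
pops = pd.DataFrame({
    "state":  ["TX", "TX", "IA", "IA"],
    "animal": ["goats", "swine_50", "swine_breeding", "sheep"],
    "head":   [1_000_000, 2_000_000, 4_000_000, 500_000],
})

# Collapse the swine size classes into one category, then apply the factor
is_swine = pops["animal"].str.startswith("swine")
pops["category"] = pops["animal"].where(~is_swine, "swine")
pops["emi_kt"] = pops["head"] * pops["category"].map(EF_KG_PER_HEAD) / 1e6
emi_by_state = pops.groupby(["state", "category"])["emi_kt"].sum()
print(emi_by_state)
```

Summing within each state and category plays the same role as the `+=` accumulation used for the swine size classes.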


##### Step 2.2.1.4 Compare against national totals

In [None]:
# QA: Check the enteric emissions (summed across all animal types and states) compared to reported 
# national-level enteric emissions (kt/year) in the US GHGI. 

DEBUG=1

#Read in total EPA emissions from public report table 5.2 (in kt)
EPA_emi_agr_CH4 = pd.read_csv(EPA_AGR_inputfile, thousands=',', header=2,nrows = 10)
EPA_emi_agr_CH4 = EPA_emi_agr_CH4.drop(['Unnamed: 0'], axis=1)
EPA_emi_agr_CH4.rename(columns={EPA_emi_agr_CH4.columns[0]:'Source'}, inplace=True)
EPA_emi_agr_CH4 = EPA_emi_agr_CH4.drop(columns = [str(n) for n in range(1990, start_year,1)])
EPA_emi_ent_CH4 = EPA_emi_agr_CH4.loc[EPA_emi_agr_CH4['Source']=="Enteric Fermentation"]
EPA_emi_ent_CH4.reset_index(inplace=True, drop=True)

sum_emi = np.zeros([num_years])
    
print('QA/QC #1: Check State Emission Sum against GHGI Summary Emissions')
for iyear in np.arange(0,num_years): 
    sum_emi[iyear] = np.sum(emi_state_ent_animal[:,:,iyear])
        
    summary_emi = EPA_emi_ent_CH4.iloc[0,iyear+1]  
    # relative difference (as a fraction) between the state sum and the GHGI summary value
    diff1 = abs(sum_emi[iyear] - summary_emi)/((sum_emi[iyear] + summary_emi)/2)
    if DEBUG ==1:
        print(summary_emi)
        print(sum_emi[iyear])
    if diff1 < 0.0001:
        print('Year ', year_range[iyear],': PASS, difference < 0.01%')
    else:
        # multiply by 100 so the printed value matches the '%' label
        print('Year ', year_range[iyear],': FAIL (check): ', diff1*100,'%') 
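
The QA check above uses a symmetric relative difference, |a - b| divided by the mean of a and b, with a tolerance of 0.0001 (0.01%). The following is a minimal standalone sketch of the same test. The function name, the zero-mean guard, and the example values are illustrative assumptions.

```python
def relative_difference(a, b):
    """Symmetric relative difference: |a - b| divided by the mean of a and b."""
    mean = (a + b) / 2
    if mean == 0:
        return 0.0 if a == b else float("inf")
    return abs(a - b) / mean

TOLERANCE = 1e-4   # fraction, i.e. 0.01%

# Hypothetical (state sum, GHGI summary) pairs in kt
for label, state_sum, summary in [("close", 7000.2, 7000.0), ("off", 7100.0, 7000.0)]:
    diff = relative_difference(state_sum, summary)
    status = "PASS" if diff < TOLERANCE else "FAIL"
    print(f"{label}: {status} ({diff * 100:.4f}%)")
```

Because the tolerance is a fraction, any value printed with a percent sign needs to be multiplied by 100 first.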

### Step 2.3 Read and Format USDA Census Data

#### Step 2.3.1. State-Level USDA Animal Census Data

##### Step 2.3.1.1 First Census Year (2012)

In [None]:
#Initialize and fill arrays that will hold the state livestock numbers per animal type for the first emissions year

cen_animal_array = np.array(['Beef','Bison','Broilers','Cattle','Chickens','Dairy','Goats','Hogs','Horses',\
                'Layers','OnFeed','Pullets','Roosters','Sheep','Turkeys']) #these are the census categories

proxy_animal_array = np.array(['Beef','Cattle','Dairy','OnFeed','Bison','Goats','Horses',\
                'Mules','Sheep','Swine']) #these are the categories from the state GHGI

State_census_livestock_12 = np.zeros([len(cen_animal_array),len(State_ANSI)])
State_livestock_12 = np.zeros([len(proxy_animal_array),len(State_ANSI)])
    
# a) Read in 2012 state census data (one file per animal type), b) pull out the state ANSI numbers and animal
# counts data, c) for each state, reformat the state name and set the animal number to an integer, and 
# d) insert the state animal counts into the final livestock counts array as a function of state and animal type 
for ianimal in np.arange(0,len(cen_animal_array)):
    if cen_animal_array[ianimal] == 'Chickens': # No census file for chickens, skip for now
        continue
    #a)
    State_file = Census_12_inputloc + cen_animal_array[ianimal] + "_State.csv"
    print('Reading file: ' + State_file)
    State_temp = pd.read_csv(State_file)
    #b) 
    Census12_State = State_temp[['State ANSI','Value']].copy()
    Census12_State['Value']= Census12_State['Value'].str.replace("(D)","0",regex=False) #literal match of withheld values
    Census12_State['Value']= Census12_State['Value'].str.replace(",","").astype(float)
    Census12_State['State ANSI']= Census12_State['State ANSI'].astype(int)
    #c)
    for istate in np.arange(0,len(Census12_State)):
        match_state = np.where(Census12_State['State ANSI'][istate] == State_ANSI['ansi'])[0][0]
        State_census_livestock_12[ianimal,match_state] = Census12_State['Value'][istate]


#assign the census data to the correct animal type order (proxy animal order)
for ianimal in np.arange(0,len(proxy_animal_array)):
    if proxy_animal_array[ianimal] == 'Mules':
        continue #will grid by area later (no county data)
    elif proxy_animal_array[ianimal] == 'Chickens': 
        State_livestock_12[ianimal,:] = State_census_livestock_12[np.where(cen_animal_array=='Broilers')[0][0],:] \
                                    +State_census_livestock_12[np.where(cen_animal_array=='Layers')[0][0],:] \
                                    +State_census_livestock_12[np.where(cen_animal_array=='Pullets')[0][0],:]\
                                    +State_census_livestock_12[np.where(cen_animal_array=='Roosters')[0][0],:]
    elif proxy_animal_array[ianimal] == 'Swine': 
        State_livestock_12[ianimal,:] = State_census_livestock_12[np.where(cen_animal_array=='Hogs')[0][0],:]
    else:
        match_ani = np.where(proxy_animal_array[ianimal] == cen_animal_array)[0][0]
        State_livestock_12[ianimal,:] = State_census_livestock_12[match_ani,:]
        
display(np.shape(State_livestock_12))
display((State_livestock_12))
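
The census `Value` column is read as text because it contains thousands separators and the `(D)` flag, which USDA uses for values withheld to avoid disclosing data for individual operations. The notebook treats withheld values as zero. The following is a minimal sketch of that cleaning step on a made-up series, using literal (non-regex) replacements so the behavior does not depend on the pandas version's default for `regex`.

```python
import pandas as pd

# Hypothetical excerpt of a census 'Value' column
raw = pd.Series(["1,234", " (D)", "56", "12,345,678"])

cleaned = (
    raw.str.strip()
       .str.replace("(D)", "0", regex=False)   # withheld values become 0
       .str.replace(",", "", regex=False)      # drop thousands separators
       .astype(float)
)
print(cleaned.tolist())
```

Setting withheld counts to zero biases totals low for animal types with many suppressed counties, which is worth keeping in mind when comparing county sums against state totals.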

##### Step 2.3.1.2 Last Census Year (2017)

In [None]:
#Initialize and fill arrays that will hold the state livestock numbers per animal type for the last census year (2017)

State_census_livestock_17 = np.zeros([len(cen_animal_array),len(State_ANSI)])
State_livestock_17 = np.zeros([len(proxy_animal_array),len(State_ANSI)])

    
# a) Read in 2017 state census data (one file per animal type), b) pull out the state ANSI numbers and animal
# counts data, c) for each state, reformat the state name and set the animal number to an integer, and 
# d) insert the state animal counts into the final livestock counts array as a function of state and animal type 
for ianimal in np.arange(0,len(cen_animal_array)):
    if cen_animal_array[ianimal] == 'Chickens': # No census file for chickens, skip for now
        continue
    #a)
    State_file = Census_17_inputloc + cen_animal_array[ianimal] + "_State.csv"
    print('Reading file: ' + State_file)
    State_temp = pd.read_csv(State_file)
    #b) 
    Census17_State = State_temp[['State ANSI','Value']].copy()
    Census17_State['Value']= Census17_State['Value'].str.replace("(D)","0",regex=False) #literal match of withheld values
    Census17_State['Value']= Census17_State['Value'].str.replace(",","").astype(float)
    Census17_State['State ANSI']= Census17_State['State ANSI'].astype(int)
    #c)
    for istate in np.arange(0,len(Census17_State)):
        match_state = np.where(Census17_State['State ANSI'][istate] == State_ANSI['ansi'])[0][0]
        State_census_livestock_17[ianimal,match_state] = Census17_State['Value'][istate]


#assign the census data to the correct animal type order (proxy animal order)
for ianimal in np.arange(0,len(proxy_animal_array)):
    if proxy_animal_array[ianimal] == 'Mules':
        continue #will grid by area later (no county data)
    elif proxy_animal_array[ianimal] == 'Chickens': 
        State_livestock_17[ianimal,:] = State_census_livestock_17[np.where(cen_animal_array=='Broilers')[0][0],:] \
                                    +State_census_livestock_17[np.where(cen_animal_array=='Layers')[0][0],:] \
                                    +State_census_livestock_17[np.where(cen_animal_array=='Pullets')[0][0],:]\
                                    +State_census_livestock_17[np.where(cen_animal_array=='Roosters')[0][0],:]
    elif proxy_animal_array[ianimal] == 'Swine':
        State_livestock_17[ianimal,:] = State_census_livestock_17[np.where(cen_animal_array=='Hogs')[0][0],:]

    else:
        match_ani = np.where(proxy_animal_array[ianimal] == cen_animal_array)[0][0]
        State_livestock_17[ianimal,:] = State_census_livestock_17[match_ani,:]
        
display(np.shape(State_livestock_17))
display((State_livestock_17))
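
The census `Value` column mixes thousands separators with the `(D)` code that USDA uses for withheld values. As a minimal sketch of the cleaning step used in the cells above, the helper below does the same conversion on a standalone Series. The name `clean_census_values` and the example data are hypothetical and are not part of this notebook.

```python
import pandas as pd

def clean_census_values(values: pd.Series) -> pd.Series:
    """Convert USDA-style count strings to floats, treating '(D)' as zero."""
    cleaned = values.astype(str).str.strip()
    # Withheld values are set to zero; they are gap-filled from state totals later
    cleaned = cleaned.str.replace(r"\(D\)", "0", regex=True)
    # Drop thousands separators before the numeric conversion
    cleaned = cleaned.str.replace(",", "", regex=False)
    return cleaned.astype(float)

# Hypothetical example values
example_values = pd.Series(["1,234", " (D)", "56"])
cleaned_example = clean_census_values(example_values)
print(cleaned_example.tolist())
```

Passing `regex=True` explicitly keeps the `(D)` replacement working regardless of the pandas default for `str.replace`.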

#### Step 2.3.2 Read and Format County-Level USDA Animal Census Data 

##### Step 2.3.2.1. First Census Year (2012)

In [None]:
#Initialize and fill arrays that will hold the county livestock numbers per animal type for the first emissions year


County_census_livestock_12 = np.zeros([len(cen_animal_array),len(State_ANSI),len(County_ANSI)])
County_livestock_12 = np.zeros([len(proxy_animal_array),len(State_ANSI),len(County_ANSI)])


# a) Read in 2012 county census data (one file per animal type), b) pull out the state and county ANSI numbers and animal
# counts data, c) for each county, reformat the county name and set the animal number to an integer, and 
# d) insert the county animal counts into the final livestock counts array as a function of state, county, and animal type 
for ianimal in np.arange(0,len(cen_animal_array)):
    if cen_animal_array[ianimal] == 'Chickens':  # No census file for chickens, skip for now
        continue
    # a)
    County_file = Census_12_inputloc + cen_animal_array[ianimal] + "_County.csv"
    print('Reading file: ' + County_file)
    County_temp = pd.read_csv(County_file)
    # b)
    Census12_County = County_temp[['State ANSI','County ANSI','Value']].copy()
    Census12_County['Value']= Census12_County['Value'].str.replace(r"\(D\)","0", regex=True) # '(D)' = withheld value; regex=True is required in newer pandas
    Census12_County['Value']= Census12_County['Value'].str.replace(",","").astype(float)
    Census12_County['State ANSI']= Census12_County['State ANSI'].astype(int)
    Census12_County['County ANSI']= Census12_County['County ANSI'].astype(int)
    #c)
    for icounty in np.arange(0,len(Census12_County)):
        match_state = np.where(Census12_County['State ANSI'][icounty] == State_ANSI['ansi'])[0][0]
        match_county = np.where((Census12_County['County ANSI'][icounty] == County_ANSI['County'])&\
                               (Census12_County['State ANSI'][icounty] == County_ANSI['State']))[0][0]
        #print(match_state, match_county)
        County_census_livestock_12[ianimal,match_state,match_county] = Census12_County['Value'][icounty]

#assign the census data to the correct animal type order (proxy animal order)
for ianimal in np.arange(0,len(proxy_animal_array)):
    if proxy_animal_array[ianimal] == 'Mules':
        continue #will grid by area later (no county data)
    elif proxy_animal_array[ianimal] == 'Chickens': 
        County_livestock_12[ianimal,:] = County_census_livestock_12[np.where(cen_animal_array=='Broilers')[0][0],:] \
                                    +County_census_livestock_12[np.where(cen_animal_array=='Layers')[0][0],:] \
                                    +County_census_livestock_12[np.where(cen_animal_array=='Pullets')[0][0],:]\
                                    +County_census_livestock_12[np.where(cen_animal_array=='Roosters')[0][0],:]
    elif proxy_animal_array[ianimal] == 'Swine':
        County_livestock_12[ianimal,:] = County_census_livestock_12[np.where(cen_animal_array=='Hogs')[0][0],:]

    else:
        match_ani = np.where(proxy_animal_array[ianimal] == cen_animal_array)[0][0]
        County_livestock_12[ianimal,:] = County_census_livestock_12[match_ani,:]
        
display(np.shape(County_livestock_12))
display((County_livestock_12))

##### Step 2.3.2.2. Last Census Year (2017)

In [None]:
#Initialize and fill arrays that will hold the county livestock numbers per animal type for the last census year


County_census_livestock_17 = np.zeros([len(cen_animal_array),len(State_ANSI),len(County_ANSI)])
County_livestock_17 = np.zeros([len(proxy_animal_array),len(State_ANSI),len(County_ANSI)])

# a) Read in 2017 county census data (one file per animal type), b) pull out the state and county ANSI numbers and animal
# counts data, c) for each county, reformat the county name and set the animal number to an integer, and 
# d) insert the county animal counts into the final livestock counts array as a function of state, county, and animal type 
for ianimal in np.arange(0,len(cen_animal_array)):
    if cen_animal_array[ianimal] == 'Chickens':  # No census file for chickens, skip for now
        continue
    # a)
    County_file = Census_17_inputloc + cen_animal_array[ianimal] + "_County.csv"
    print('Reading file: ' + County_file)
    County_temp = pd.read_csv(County_file)
    # b)
    Census17_County = County_temp[['State ANSI','County ANSI','Value','County']].copy()
    Census17_County['Value']= Census17_County['Value'].str.replace(r"\(D\)","0", regex=True) # '(D)' = withheld value; regex=True is required in newer pandas
    Census17_County['Value']= Census17_County['Value'].str.replace(",","").astype(float)
    Census17_County['County ANSI'] = Census17_County['County ANSI'].fillna(0) # assign back instead of in-place fill on a column selection
    #display(Census17_County)
    Census17_County['State ANSI']= Census17_County['State ANSI'].astype(int)
    Census17_County['County ANSI']= Census17_County['County ANSI'].astype(int)
    #c)
    for icounty in np.arange(0,len(Census17_County)):
        if Census17_County.loc[icounty,'County'].upper()=='ALEUTIAN ISLANDS':
            Census17_County.loc[icounty,'County ANSI']=13
        #Map Oglala Lakota County to Shannon County (2015 name change)
        if Census17_County.loc[icounty,'State ANSI'] == 46 and \
            Census17_County.loc[icounty,'County ANSI'] == 102:
            Census17_County.loc[icounty,'County ANSI'] = 113
        #correct county value for kenai peninsula (note that AK counties are incorrect [not correcting here since AK emissions removed])
        if Census17_County.loc[icounty,'State ANSI'] == 2 and \
            Census17_County.loc[icounty,'County ANSI'] == 0:
            Census17_County.loc[icounty,'County ANSI'] = 122
        match_state = np.where(Census17_County['State ANSI'][icounty] == State_ANSI['ansi'])[0][0]
        match_county = np.where((Census17_County['County ANSI'][icounty] == County_ANSI['County'])&\
                               (Census17_County['State ANSI'][icounty] == County_ANSI['State']))[0][0]
        #print(match_state, match_county)
        County_census_livestock_17[ianimal,match_state,match_county] = Census17_County['Value'][icounty]


#assign the census data to the correct animal type order (proxy animal order)
for ianimal in np.arange(0,len(proxy_animal_array)):
    if proxy_animal_array[ianimal] == 'Mules':
        continue #will grid by area later (no county data)
    elif proxy_animal_array[ianimal] == 'Chickens': 
        County_livestock_17[ianimal,:] = County_census_livestock_17[np.where(cen_animal_array=='Broilers')[0][0],:] \
                                    +County_census_livestock_17[np.where(cen_animal_array=='Layers')[0][0],:] \
                                    +County_census_livestock_17[np.where(cen_animal_array=='Pullets')[0][0],:]\
                                    +County_census_livestock_17[np.where(cen_animal_array=='Roosters')[0][0],:]
    elif proxy_animal_array[ianimal] == 'Swine':
        County_livestock_17[ianimal,:] = County_census_livestock_17[np.where(cen_animal_array=='Hogs')[0][0],:]
    else:
        match_ani = np.where(proxy_animal_array[ianimal] == cen_animal_array)[0][0]
        County_livestock_17[ianimal,:] = County_census_livestock_17[match_ani,:]
        
display(np.shape(County_livestock_17))
display((County_livestock_17))

#### Step 2.3.3. Calculate Total State-Level Animal Counts from the County-Level Data

##### Step 2.3.3.1 First Census Year (2012)

In [None]:
            
Census_summary_State_12  = np.zeros([len(proxy_animal_array),len(State_ANSI)])
Census_summary_County_12  = np.zeros([len(proxy_animal_array),len(State_ANSI)])
Census_summary_Missing_Area_12  = np.zeros([len(proxy_animal_array),len(State_ANSI)])
Census_summary_per_area_12  = np.zeros([len(proxy_animal_array),len(State_ANSI)])

#save the state sum, the state sum (calc'd from counties), and for a county that has zero livestock data, save area for later
#the arrays were created to follow the index values of 'State_ANSI' and 'County_ANSI' arrays, so can just loop through these here
for ianimal in np.arange(0, len(proxy_animal_array)):
    for istate in np.arange(0, len(State_ANSI)):
        Census_summary_State_12[ianimal,istate] = State_livestock_12[ianimal,istate]
        Census_summary_County_12[ianimal,istate] = np.sum(County_livestock_12[ianimal,istate,:])
    
    for icounty in np.arange(0,len(County_ANSI)):
        match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
        if County_livestock_12[ianimal,match_state,icounty] ==0:
            Census_summary_Missing_Area_12[ianimal,match_state] += County_ANSI['Area'][icounty]
        

#if a state has some counties with no livestock data, then calculate missing animals per area in that state
# missing animals per unit area in state = (state animal sum - county animal sum)/ total area of counties with no data

for ianimal in np.arange(0,len(proxy_animal_array)):
    for istate in np.arange(0,len(State_ANSI)):
        if Census_summary_Missing_Area_12[ianimal,istate] > 0:
            Census_summary_per_area_12[ianimal,istate] = (Census_summary_State_12[ianimal,istate] - \
                                                          Census_summary_County_12[ianimal,istate]) / \
                                                            Census_summary_Missing_Area_12[ianimal,istate]
        if Census_summary_per_area_12[ianimal,istate] < 0:
            Census_summary_per_area_12[ianimal,istate] = 0.0


#Now that animals per area have been calculated for states with missing county data, fill in the
#zero-count counties using: county count = (state-level missing animals per area) x county area


#For each county, if it does not have livestock data, estimate the 'counts of animals per area' (in the given state) * area of that county
#note that this places livestock emissions in all counties (is this a good assumption?)
for ianimal in np.arange(0,len(proxy_animal_array)):
    #for istate in np.arange(0, len(State_ANSI)):
    for icounty in np.arange(0,len(County_ANSI)):
        match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
        if County_livestock_12[ianimal,match_state,icounty] == 0.0:
            County_livestock_12[ianimal,match_state,icounty] = Census_summary_per_area_12[ianimal,match_state] * \
                                                                County_ANSI['Area'][icounty]

#Next, calculate the total number of animals in each state from the number of animals in each county

#Calculate the area of each state from the sum of the county area data
State_total_Area_12 = np.zeros(len(State_ANSI))
for icounty in np.arange(0,len(County_ANSI)):
    match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
    State_total_Area_12[match_state] += County_ANSI['Area'][icounty]
    
#Recalculate the state animal counts from the corrected county data
State_total_animals_12 = np.zeros([len(proxy_animal_array),len(State_ANSI)])
for ianimal in np.arange(0,len(proxy_animal_array)):
    for icounty in np.arange(0,len(County_ANSI)):
        match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
        State_total_animals_12[ianimal,match_state] += County_livestock_12[ianimal,match_state,icounty]

for ianimal in np.arange(0, len(proxy_animal_array)):
    print('Orig. Sum',np.sum(Census_summary_County_12[ianimal,:]))
    print('Corrected Sum',np.sum(State_total_animals_12[ianimal,:]))
    print('State Sum',np.sum(Census_summary_State_12[ianimal,:]))
    print(' ')
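
The gap-filling logic above can be sketched for a single animal type with vectorized NumPy calls. This is a sketch under stated assumptions rather than the notebook's implementation: counties are held in flat arrays, `county_state` holds a zero-based state index per county, and the name `fill_missing_counties` and all example numbers are hypothetical.

```python
import numpy as np

def fill_missing_counties(county_counts, county_area, county_state, state_totals):
    """Spread each state's unreported animals over its zero-count counties by area."""
    filled = np.asarray(county_counts, dtype=float).copy()
    county_area = np.asarray(county_area, dtype=float)
    county_state = np.asarray(county_state, dtype=int)
    state_totals = np.asarray(state_totals, dtype=float)
    n_states = len(state_totals)
    missing = filled == 0
    # Reported county sum and total area of zero-count counties, per state
    county_sum = np.bincount(county_state, weights=filled, minlength=n_states)
    missing_area = np.bincount(county_state, weights=county_area * missing, minlength=n_states)
    # Missing animals per unit area (negative deficits are set to zero)
    deficit = np.clip(state_totals - county_sum, 0.0, None)
    per_area = np.divide(deficit, missing_area, out=np.zeros(n_states), where=missing_area > 0)
    # Fill only the zero-count counties
    filled[missing] = per_area[county_state[missing]] * county_area[missing]
    return filled

# Hypothetical example: state 0 reports 150 animals but its counties sum to 100
filled_example = fill_missing_counties(
    county_counts=[100.0, 0.0, 0.0, 40.0, 60.0],
    county_area=[10.0, 20.0, 30.0, 5.0, 5.0],
    county_state=[0, 0, 0, 1, 1],
    state_totals=[150.0, 90.0],
)
print(filled_example)
```

In the example, the 50 unreported animals in state 0 are split 20/30 between the two zero-count counties in proportion to their areas, while state 1 is left unchanged because it has no zero-count counties.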

##### Step 2.3.3.2 Last Census Year (2017)

In [None]:
#Initialize and calculate state-level animal count arrays from the county-level animal count data

Census_summary_State_17  = np.zeros([len(proxy_animal_array),len(State_ANSI)])
Census_summary_County_17  = np.zeros([len(proxy_animal_array),len(State_ANSI)])
Census_summary_Missing_Area_17  = np.zeros([len(proxy_animal_array),len(State_ANSI)])
Census_summary_per_area_17  = np.zeros([len(proxy_animal_array),len(State_ANSI)])

#save the state sum, the state sum (calc'd from counties), and for a county that has zero livestock data, save area for later
#the arrays were created to follow the index values of 'State_ANSI' and 'County_ANSI' arrays, so can just loop through these here
for ianimal in np.arange(0, len(proxy_animal_array)):
    for istate in np.arange(0, len(State_ANSI)):
        Census_summary_State_17[ianimal,istate] = State_livestock_17[ianimal,istate]
        Census_summary_County_17[ianimal,istate] = np.sum(County_livestock_17[ianimal,istate,:])
    
    for icounty in np.arange(0,len(County_ANSI)):
        match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
        if County_livestock_17[ianimal,match_state,icounty] ==0:
            Census_summary_Missing_Area_17[ianimal,match_state] += County_ANSI['Area'][icounty]

#if a state has some counties with no livestock data, then calculate missing animals per area in that state
# missing animals per unit area in state = (state animal sum - county animal sum)/ total area of counties with no data

for ianimal in np.arange(0,len(proxy_animal_array)):
    for istate in np.arange(0,len(State_ANSI)):
        if Census_summary_Missing_Area_17[ianimal,istate] > 0:
            Census_summary_per_area_17[ianimal,istate] = (Census_summary_State_17[ianimal,istate] - \
                                                          Census_summary_County_17[ianimal,istate]) / \
                                                            Census_summary_Missing_Area_17[ianimal,istate]
        if Census_summary_per_area_17[ianimal,istate] < 0:
            Census_summary_per_area_17[ianimal,istate] = 0.0


#For each county, if it does not have livestock data, estimate the 'counts of animals per area' (in the given state) * area of that county
#note that this places livestock emissions in all counties (is this a good assumption?)
for ianimal in np.arange(0,len(proxy_animal_array)):
    for icounty in np.arange(0,len(County_ANSI)):
        match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
        if County_livestock_17[ianimal,match_state,icounty] == 0.0:
            County_livestock_17[ianimal,match_state,icounty] = Census_summary_per_area_17[ianimal,match_state] * \
                                                                County_ANSI['Area'][icounty]

#Next, calculate the total number of animals in each state from the number of animals in each county

#Calculate the area of each state from the sum of the county area data
State_total_Area_17 = np.zeros(len(State_ANSI))
for icounty in np.arange(0,len(County_ANSI)):
    match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
    State_total_Area_17[match_state] += County_ANSI['Area'][icounty]
    
#Recalculate the state animal counts from the corrected county data
State_total_animals_17 = np.zeros([len(proxy_animal_array),len(State_ANSI)])
for ianimal in np.arange(0,len(proxy_animal_array)):
    for icounty in np.arange(0,len(County_ANSI)):
        match_state = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
        State_total_animals_17[ianimal,match_state] += County_livestock_17[ianimal,match_state,icounty]

for ianimal in np.arange(0, len(proxy_animal_array)):
    print('Orig. Sum',np.sum(Census_summary_County_17[ianimal,:]))
    print('Corrected Sum',np.sum(State_total_animals_17[ianimal,:]))
    print('State Sum',np.sum(Census_summary_State_17[ianimal,:]))
    print('')

#### Step 2.3.4. Calculate State and County-Level Animal Counts Across Entire Timeseries (e.g., 2012-2018)

In [None]:
# Using the previous first and latest available state-level census animal counts data, find the
# change in the number of animals in each state for each animal type between the first
# and last available census years (e.g., 2012 and 2017). Then use this relationship to 
# calculate the animal counts at the state-level across all inventory years (e.g., 2012-2018)

animal_state_trend = (State_total_animals_17-State_total_animals_12)/5

state_animal_counts = np.zeros([len(proxy_animal_array),len(State_ANSI),num_years])
#Use slope (e.g., animal number / year) and year to calculate the animal counts for each year in the inventory
for iyear in np.arange(0,num_years):
    for ianimal in np.arange(0,len(proxy_animal_array)):
        state_animal_counts[ianimal,:,iyear] = animal_state_trend[ianimal,:]*iyear + State_total_animals_12[ianimal,:]

#Make any negative animal counts zero
state_animal_counts[state_animal_counts < 0] = 0

animal_county_trend = (County_livestock_17-County_livestock_12)/5

#Use slope (e.g., animal number / year) and year to calculate the animal counts for each year in the inventory
county_animal_counts = np.zeros([len(proxy_animal_array),len(State_ANSI),len(County_ANSI),num_years])
for iyear in np.arange(0,num_years):
    for ianimal in np.arange(0,len(proxy_animal_array)):
        county_animal_counts[ianimal,:,:,iyear] = animal_county_trend[ianimal,:,:]*iyear+County_livestock_12[ianimal,:,:]
county_animal_counts[county_animal_counts < 0] = 0
        
# Print time
ct = datetime.datetime.now() 
print("current time:", ct) 
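
The linear trend used above can be written as a single broadcast expression. The sketch below assumes, as the cell does, that the first inventory year coincides with the first census year. The function name `interpolate_counts` and the example counts are hypothetical.

```python
import numpy as np

def interpolate_counts(counts_first, counts_last, first_year, last_year, years):
    """Linearly inter/extrapolate animal counts from two census years, clipped at zero."""
    counts_first = np.asarray(counts_first, dtype=float)
    counts_last = np.asarray(counts_last, dtype=float)
    # Slope in animals per year between the two census years
    slope = (counts_last - counts_first) / (last_year - first_year)
    offsets = np.asarray(years) - first_year
    # Add a trailing year axis through broadcasting
    counts = counts_first[..., np.newaxis] + slope[..., np.newaxis] * offsets
    return np.clip(counts, 0.0, None)

# Hypothetical counts for two animal types in one state
counts_example = interpolate_counts(
    counts_first=[100.0, 10.0],
    counts_last=[150.0, 0.0],
    first_year=2012,
    last_year=2017,
    years=np.arange(2012, 2019),
)
print(counts_example)
```

Years after the last census (2018 here) are extrapolated along the same slope, so a declining population can go negative; the final clip plays the same role as setting negative counts to zero in the cell above.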

### Step 2.4 Read in and Format Grid-level data

In [None]:
#Read in and format the gridded maps of land use data, also convert from rank into a probability
# These data are held constant over all years

#Generate a vector with the 8 land-use (LUC) animal types used to grid county-level emissions
#Census data to LUC data mapping (brltrk and lyrplt are listed for reference but are not read in below):
#Uniform --> animal
#Broilers + Turkeys --> brltrk
#On Feed --> ctlfed
#Beef + Bison +Cattle --> ctlinv
#Goats --> goat
#Hogs --> hogpig
#Horses --> hrspny
#Chickens + Layers + Pullets + Roosters --> lyrplt
#Dairy --> mlkcow
#Sheep --> shplmb
luc_animal_array = np.array(['animal','ctlfed','ctlinv','goat','hogpig','hrspny','mlkcow','shplmb'])

map_luc_rank = np.zeros([len(luc_animal_array),len(lat001),len(lon001)])

for ianimal in np.arange(len(luc_animal_array)):
    file_temp = Dataset(USDA_LUC_inputloc+luc_animal_array[ianimal]+'_001x001.nc')
    temp_data = np.array(file_temp.variables['rank_'+luc_animal_array[ianimal]])
    map_luc_rank[ianimal,:,:] = np.flipud(temp_data)
    map_luc_rank[ianimal,:,:] = map_luc_rank[ianimal,:,:].astype(float)
    file_temp.close()
    
map_luc_rank[map_luc_rank >  6.5]=   0.0
map_luc_rank[map_luc_rank == 1]  =   0.0001
map_luc_rank[map_luc_rank == 2]  =   0.0200
map_luc_rank[map_luc_rank == 3]  =   0.0500
map_luc_rank[map_luc_rank == 4]  =   0.1000
map_luc_rank[map_luc_rank == 5]  =   0.3300
map_luc_rank[map_luc_rank == 6]  =   0.4999
   


#Calculate the total product of the land area and animal rankings (probability * area) for each county and state
map_cm_rank_temp = np.zeros([len(luc_animal_array),len(lat001),len(lon001)])
for ianimal in np.arange(0, len(luc_animal_array)):
    map_cm_rank_temp[ianimal,:,:] = map_luc_rank[ianimal,:,:]*area_map[:,:]
    
#re-order the maps and create a proxy with the correct number of animal types (ctlinv applied to beef, cattle, bison)
#i.e., assign the land use data to the correct animal type order (proxy animal order)
#data are the same for each year
map_cm_rank = np.zeros([len(proxy_animal_array),len(lat001),len(lon001)])

#for iyear in np.arange(0, num_years):
for ianimal in np.arange(0,len(proxy_animal_array)):
    if proxy_animal_array[ianimal] == 'Mules':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='animal')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Beef':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='ctlinv')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Cattle':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='ctlinv')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Bison':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='ctlinv')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Dairy':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='mlkcow')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'OnFeed':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='ctlfed')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Goats':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='goat')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Horses':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='hrspny')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Sheep':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='shplmb')[0][0],:,:]
    elif proxy_animal_array[ianimal] == 'Swine':
        map_cm_rank[ianimal,:,:] = map_cm_rank_temp[np.where(luc_animal_array=='hogpig')[0][0],:,:]
    
del map_luc_rank, temp_data, map_cm_rank_temp

display(np.shape(map_cm_rank))
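
The rank-to-probability conversion above can also be expressed as a lookup table. This is a sketch only: the probabilities are the ones assigned in the cell above, but the helper name `rank_to_probability` is hypothetical, and unlike the notebook (which leaves non-integer values at or below 6.5 unchanged) it maps every value outside the integer ranks 0 to 6 to zero.

```python
import numpy as np

# Index = rank (0..6); values are the probabilities assigned in the cell above
RANK_TO_PROB = np.array([0.0, 0.0001, 0.02, 0.05, 0.10, 0.33, 0.4999])

def rank_to_probability(rank_map):
    """Map integer ranks 1..6 to probabilities; anything else becomes zero."""
    ranks = np.asarray(rank_map, dtype=float)
    valid = (ranks >= 0) & (ranks <= 6) & (ranks == np.round(ranks))
    # Use index 0 as a placeholder for invalid cells so the lookup never goes out of range
    idx = np.where(valid, ranks, 0).astype(int)
    return np.where(valid, RANK_TO_PROB[idx], 0.0)

# Hypothetical 2 x 3 rank map, including fill values (7 and 255)
prob_example = rank_to_probability([[1, 6, 7], [0, 3, 255]])
print(prob_example)
```

A single lookup avoids any dependence on the order of the in-place assignments, since no cell is visited twice.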

-----------------------
## Step 3. Read in and Format US EPA GHGI Data
-------------------------------

In [None]:
#Read in total EPA emissions from public report table 5.2 (in kt)
EPA_emi_agr_CH4 = pd.read_csv(EPA_AGR_inputfile, thousands=',', header=2,nrows = 10)
EPA_emi_agr_CH4 = EPA_emi_agr_CH4.drop(['Unnamed: 0'], axis=1)
EPA_emi_agr_CH4.rename(columns={EPA_emi_agr_CH4.columns[0]:'Source'}, inplace=True)
EPA_emi_agr_CH4 = EPA_emi_agr_CH4.drop(columns = [str(n) for n in range(1990, start_year,1)])
EPA_emi_ent_CH4 = EPA_emi_agr_CH4.loc[EPA_emi_agr_CH4['Source']=="Enteric Fermentation"]
EPA_emi_man_CH4 = EPA_emi_agr_CH4.loc[EPA_emi_agr_CH4['Source']=="Manure Management"]
EPA_emi_ent_CH4.reset_index(inplace=True, drop=True)
EPA_emi_man_CH4.reset_index(inplace=True, drop=True)
print('EPA GHGI National Enteric CH4 Emissions (kt):')
display(EPA_emi_ent_CH4)
print('EPA GHGI National Manure CH4 Emissions (kt):')
display(EPA_emi_man_CH4)


#### 3.2. Split Emissions into Gridding Groups

In [None]:
#split GHG emissions into gridding groups, based on the livestock proxy mapping file (ghgi_livestock_map)

DEBUG =1
start_year_idx = EPA_emi_ent_CH4.columns.get_loc(str(start_year))
end_year_idx = EPA_emi_ent_CH4.columns.get_loc(str(end_year))+1
ghgi_livestock_groups = ghgi_livestock_map['GHGI_Emi_Group'].unique()
sum_emi = np.zeros([num_years])

for igroup in np.arange(0,len(EPA_emi_ent_CH4)): #loop through all groups, finding the GHGI sources in that group and summing emissions for each year
    ##DEBUG## print(ghgi_livestock_groups[igroup])
    vars()[ghgi_livestock_groups[igroup]] = np.zeros([num_years])
    source_temp = ghgi_livestock_map.loc[ghgi_livestock_map['GHGI_Emi_Group'] == ghgi_livestock_groups[igroup], 'GHGI_Source']
    pattern_temp  = '|'.join(source_temp) 
    #print(pattern_temp) 
    emi_temp =EPA_emi_ent_CH4[EPA_emi_ent_CH4['Source'].str.contains(pattern_temp)]
    #display(emi_temp)
    vars()[ghgi_livestock_groups[igroup]][:] = emi_temp.iloc[:,start_year_idx:end_year_idx].sum() #only sum the inventory years
        
        
#Check against total summary emissions 
print('QA/QC #1: Check Processing Emission Sum against GHGI Summary Emissions')
for iyear in np.arange(0,num_years): 
    for igroup in np.arange(0,len(EPA_emi_ent_CH4)):
        if iyear ==0:
            vars()[ghgi_livestock_groups[igroup]][iyear] -= 0.5  ##NOTE: correct rounding error so sum of emissions = reported total emissions
        sum_emi[iyear] += vars()[ghgi_livestock_groups[igroup]][iyear]
        
    summary_emi = EPA_emi_ent_CH4.iloc[0,iyear+1]  
    #Check 1 - make sure that the sums from all the regions equal the totals reported
    diff1 = abs(sum_emi[iyear] - summary_emi)/((sum_emi[iyear] + summary_emi)/2)
    if DEBUG==1:
        print(summary_emi)
        print(sum_emi[iyear])
    if diff1 < 0.0001:
        print('Year ', year_range[iyear],': PASS, difference < 0.01%')
    else:
        print('Year ', year_range[iyear],': FAIL (check Production & summary tabs): ', diff1*100,'%') 
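
The QA/QC checks in this notebook compare two totals using a symmetric relative difference, |a - b| / ((a + b) / 2), and pass when it is below 0.0001 (0.01%). A minimal sketch of that test as a reusable helper follows. The names `relative_difference` and `passes_qc` and the example totals are hypothetical, and the helper assumes non-negative inputs.

```python
def relative_difference(a, b):
    """Symmetric relative difference between two non-negative totals (as a fraction)."""
    if a == b:
        return 0.0  # also covers the case where both totals are zero
    return abs(a - b) / ((a + b) / 2)

def passes_qc(a, b, tolerance=0.0001):
    """True when the two totals agree to within the tolerance (default 0.01%)."""
    return relative_difference(a, b) < tolerance

# Hypothetical totals in kt
print(relative_difference(99.0, 101.0) * 100, '%')
print(passes_qc(6500.0, 6500.5))
```

The helper returns a fraction, so it has to be multiplied by 100 before being printed next to a `%` sign.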

--------------
## Step 4. Grid Data
-------------

#### Step 4.1. Allocate emissions

##### Step 4.1.1 Assign the Appropriate Proxy Variable Names (state & grid)

In [None]:
# The names on the *left* need to match the 'State_Proxy_Group' (and 'County_Proxy_Group') names 
# in the livestock proxy mapping file (these are initialized in Step 2). 
# The names on the *right* are the variable names used to calculate the proxies in this code.
# Names on the right need to match those from the code in Step 2

#national --> state proxies (animal x state x year [X month])
State_ent_emi_animal = emi_state_ent_animal

#state --> county proxies (animal x state x county x year [x month])?
County_animal_counts = county_animal_counts

#county --> grid proxies (animal x0.01x0.01)
Map_animal_area_rank = map_cm_rank
Map_animal_area_rank_nongrid = 0 #rank does not include non-CONUS

# remove variables to clear space for larger arrays 


##### Step 4.1.2 Allocate National EPA Emissions to the State-Level

In [None]:
# Calculate state-level emissions 
# Emissions in kt
# State data = national GHGI emissions * state proxy/national total

DEBUG = 1

# Note that national emissions are retained for groups that do not have state proxies (identified in the mapping file)
# and are gridded in the next step

# Make placeholder emission arrays for each group
for igroup in np.arange(0,len(proxy_livestock_map)):
    vars()['State_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(proxy_animal_array),len(State_ANSI),num_years])
    vars()['NonState_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(proxy_animal_array),num_years])
        
#Loop over years
for iyear in np.arange(num_years):
    #Loop over states
    for istate in np.arange(len(State_ANSI)):
        for igroup in np.arange(0,len(proxy_livestock_map)):    
            if proxy_livestock_map.loc[igroup,'State_Proxy_Group'] != '-' and proxy_livestock_map.loc[igroup,'GHGI_Emi_Group'] != 'Emi_not_mapped':
                for ianimal in np.arange(0,len(proxy_animal_array)):
                    vars()['State_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][ianimal,istate,iyear] = \
                        vars()[proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][iyear]* \
                        data_fn.safe_div(vars()[proxy_livestock_map.loc[igroup,'State_Proxy_Group']][ianimal,istate,iyear], \
                                     np.sum(vars()[proxy_livestock_map.loc[igroup,'State_Proxy_Group']][:,:,iyear]))   
            else:
                vars()['NonState_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][iyear] = vars()[proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][iyear]
                
# Check sum of all gridded emissions + emissions not included in state allocation
print('QA/QC #1: Check weighted emissions against GHGI')   
for iyear in np.arange(0,num_years):
    summary_emi = EPA_emi_ent_CH4.iloc[0,iyear+1] 
    calc_emi = 0
    for igroup in np.arange(0,len(proxy_livestock_map)):
        calc_emi +=  np.sum(vars()['State_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear])+\
            np.sum(vars()['NonState_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][:,iyear]) #np.sum(Emissions[:,iyear]) + Emissions_nongrid[iyear] + Emissions_nonstate[iyear]
    if DEBUG ==1:
        print(summary_emi)
        print(calc_emi)
    diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if diff < 0.0001:
        print('Year ', year_range[iyear], ': PASS, difference < 0.01%')
    else:
        print('Year ', year_range[iyear], ': FAIL -- Difference = ', diff*100,'%')
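
The state-level allocation above applies one rule: each (animal, state) entry receives the national total multiplied by its share of the proxy summed over all animals and states. A minimal sketch for one year follows. The name `allocate_by_proxy` and the example numbers are hypothetical, and the explicit zero-sum guard is an assumption that stands in for `data_fn.safe_div`, whose implementation is not shown in this section.

```python
import numpy as np

def allocate_by_proxy(total, proxy):
    """Distribute a scalar total over an array in proportion to the proxy values."""
    proxy = np.asarray(proxy, dtype=float)
    proxy_sum = proxy.sum()
    if proxy_sum == 0:
        # No proxy information: nothing can be allocated (assumed guard behaviour)
        return np.zeros_like(proxy)
    return total * proxy / proxy_sum

# Hypothetical proxy (animal x state) and national total in kt
state_emi_example = allocate_by_proxy(200.0, [[30.0, 10.0], [40.0, 20.0]])
print(state_emi_example)
```

Because the weights sum to one, the allocated array sums back to the national total, which is the property the QA/QC check above verifies.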

##### Step 4.1.3 Allocate Emissions to the County Level (Mules are allocated by area; no census data)

In [None]:
# Calculate county-level emissions (kt)
# County data (by animal) = state emissions (by animal) * county proxy (by animal)/state total (by animal)

# If there are emissions in a state but no proxy data available in the entire state, 
# emissions are allocated within that state by relative county areas (this will be true for mules)

DEBUG = 1

# Note that national emissions are retained for groups that do not have state proxies (identified in the mapping file)
# and are gridded in the next step

# Make placeholder emission arrays for each group
for igroup in np.arange(0,len(proxy_livestock_map)):
    vars()['County_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']] = \
            np.zeros([len(proxy_animal_array),len(State_ANSI),len(County_ANSI),num_years])
    vars()['NonCounty_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(proxy_animal_array),num_years])
        
#Loop over years
for iyear in np.arange(0,num_years):
    for icounty in np.arange(0,len(County_ANSI)):
        istate = np.where(State_ANSI['ansi']==County_ANSI['State'][icounty])[0][0]
        state_ansi = State_ANSI['ansi'][istate]
        #print(icounty, istate)
        for igroup in np.arange(0,len(proxy_livestock_map)): 
            for ianimal in np.arange(0,len(proxy_animal_array)):
                emi_temp = vars()['State_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][ianimal,istate,iyear]
                frac_temp = data_fn.safe_div(vars()[proxy_livestock_map.loc[igroup,'County_Proxy_Group']][ianimal,istate,icounty,iyear], \
                            np.sum(vars()[proxy_livestock_map.loc[igroup,'County_Proxy_Group']][ianimal,istate,:,iyear]))
                if emi_temp > 0 and frac_temp > 0:
                    vars()['County_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][ianimal,istate,icounty,iyear] = emi_temp * frac_temp
                elif emi_temp > 0 and np.sum(vars()[proxy_livestock_map.loc[igroup,'County_Proxy_Group']][ianimal,istate,:,iyear]) == 0:                
                    #if state emissions >0 and no proxy data in that state, allocate based on relative county areas
                    frac_temp = data_fn.safe_div(County_ANSI.loc[icounty,'Area'],np.sum(County_ANSI['Area'][County_ANSI['State'] == state_ansi]))
                    vars()['County_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][ianimal,istate,icounty,iyear] = emi_temp * frac_temp  
                else: 
                    # if there are no state emissions OR if there are state emissions, 
                    # there is proxy data in the state, but no proxy data in that county, skip that county and move to next
                    continue

# Check sum of all county-level emissions + emissions not included in county allocation
print('QA/QC #2: Check weighted emissions against GHGI')   
for iyear in np.arange(0,num_years):
    summary_emi = EPA_emi_ent_CH4.iloc[0,iyear+1] 
    calc_emi = 0
    for igroup in np.arange(0,len(proxy_livestock_map)):
        calc_emi +=  np.sum(vars()['County_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][:,:,:,iyear])+\
            np.sum(vars()['NonCounty_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][:,iyear])
    if DEBUG ==1:
        print(summary_emi)
        print(calc_emi)
    diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if diff < 0.0001:
        print('Year ', year_range[iyear], ': PASS, difference < 0.01%')
    else:
        print('Year ', year_range[iyear], ': FAIL -- Difference = ', diff*100,'%')
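
The county-level rule above has two branches: allocate by animal counts where the state has any proxy data, and fall back to relative county area where it has none (as for mules). A minimal sketch for one animal type in one state and year follows. The name `allocate_state_to_counties` and the example values are hypothetical.

```python
import numpy as np

def allocate_state_to_counties(state_emi, county_proxy, county_area):
    """Split one state's emissions across its counties by proxy, or by area if no proxy exists."""
    county_proxy = np.asarray(county_proxy, dtype=float)
    county_area = np.asarray(county_area, dtype=float)
    if state_emi <= 0:
        return np.zeros_like(county_proxy)
    proxy_sum = county_proxy.sum()
    if proxy_sum > 0:
        weights = county_proxy / proxy_sum          # counties with zero proxy get nothing
    else:
        weights = county_area / county_area.sum()   # area fallback (e.g., mules)
    return state_emi * weights

# Hypothetical state with 10 kt of emissions and two counties
with_proxy = allocate_state_to_counties(10.0, [30.0, 0.0], [1.0, 3.0])
area_fallback = allocate_state_to_counties(10.0, [0.0, 0.0], [1.0, 3.0])
print(with_proxy, area_fallback)
```

Both branches use weights that sum to one within the state, so state totals are conserved in either case.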

##### 4.1.4 Allocate emissions to the CONUS region (0.1x0.1)
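
The allocation in this step gives each grid cell a share of its county's emissions equal to the cell's proxy value divided by the county's proxy total, and falls back to the cell's share of county area when the county has no proxy data. A minimal sketch on toy data, assuming a hypothetical `cells` list and `allocate_county_emissions` helper (neither is part of the notebook, which works on full lat/lon arrays):

```python
from collections import defaultdict


def allocate_county_emissions(cells, county_emi):
    """Distribute county emissions onto grid cells.

    cells: list of dicts with keys 'county', 'proxy', 'area' (hypothetical toy layout).
    county_emi: dict mapping county id -> county emissions (kt).
    Returns per-cell emissions in the same order as `cells`.
    """
    # First pass: county totals of the proxy and of the area
    proxy_sum = defaultdict(float)
    area_sum = defaultdict(float)
    for cell in cells:
        proxy_sum[cell['county']] += cell['proxy']
        area_sum[cell['county']] += cell['area']

    # Second pass: weight each cell by proxy share, or by area share as a fallback
    gridded = []
    for cell in cells:
        county = cell['county']
        emi = county_emi.get(county, 0.0)
        if proxy_sum[county] > 0:
            frac = cell['proxy'] / proxy_sum[county]
        elif area_sum[county] > 0:
            frac = cell['area'] / area_sum[county]
        else:
            frac = 0.0
        gridded.append(emi * frac)
    return gridded


cells = [
    {'county': 'A', 'proxy': 3.0, 'area': 1.0},
    {'county': 'A', 'proxy': 1.0, 'area': 1.0},
    {'county': 'B', 'proxy': 0.0, 'area': 2.0},  # county B has no proxy data
    {'county': 'B', 'proxy': 0.0, 'area': 6.0},
]
gridded = allocate_county_emissions(cells, {'A': 8.0, 'B': 4.0})
```

Because the weights within each county sum to one, the gridded total equals the county total, which is the property the QA/QC checks rely on.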

In [None]:
# To speed up the code, this notebook does not loop through each county, but instead loops through
# each lat/lon value in the CONUS region. Emissions are allocated based on the fraction of 
# the proxy that is in each grid cell relative to the total in that county. 
# Since the code is not using county masks, the sum of each proxy for each county/state pair
# must first be calculated. 
# This chunk calculates the county totals for each animal for the area-weighted probability
# map and the county area map. 


Map_animal_area_rank_sum = np.zeros([len(proxy_animal_array),len(State_ANSI),len(County_ANSI)])
Area_sum = np.zeros([len(State_ANSI),len(County_ANSI)])

#For each grid box that falls within the continental US geographic bounds, keep a running sum to calculate 
# the total cm_rank for each animal type within each state and county. 
# Also keep a running sum of the total area within each state and county.
for ilat in np.arange(0, len(lat001)):
    for ilon in np.arange(0, len(lon001)):
        if state_ANSI_map[ilat,ilon] > 0: #only includes CONUS region
            istate = np.where(State_ANSI['ansi']==state_ANSI_map[ilat,ilon])[0][0]
            icounty = np.where((County_ANSI['State']==state_ANSI_map[ilat,ilon]) & \
                                    (County_ANSI['County']==county_ANSI_map[ilat,ilon]))[0][0]
            Area_sum[istate,icounty] += area_map[ilat,ilon]
            for ianimal in np.arange(0, len(proxy_animal_array)):                                          
                Map_animal_area_rank_sum[ianimal,istate,icounty] += Map_animal_area_rank[ianimal,ilat,ilon]

In [None]:
#will need to save yearly emissions as intermediate output and read back in due to memory limits
data_IO_fn.initialize_netCDF001(enteric_int_out, netCDF_description, 0, year_range, loc_dimensions, lat001, lon001)


In [None]:
# Enteric Gridding
# Loop through each lat/lon value on the CONUS grid. County emissions are allocated based on the
# fraction of proxy data in each grid cell relative to the sum of all proxy data in the gridcells
# within the relevant county. 
# If the county does not have animal probability data, then the county emissions are allocated by area
# Because this code takes a long time to run, the data are saved to a netCDF file after each calculated year
# Current computational speed is ~ 2.5-3 hours computation time per year of emissions processing

Emissions_array_01 = np.zeros([len(Lat_01),len(Lon_01),num_years])
Emissions_array_001 = np.zeros([len(lat001),len(lon001),num_years])
Emissions_nongrid = np.zeros([num_years])

print('**QA/QC Check: Sum of national gridded emissions vs. GHGI national emissions')
running_sum = np.zeros([len(proxy_livestock_map),num_years])
running_sum2 = np.zeros([len(proxy_livestock_map),num_years])

#for iyear in np.arange(0,num_years):
for igroup in np.arange(0,len(proxy_livestock_map)):
    #define the proxy and area arrays (there is only one gridding group)
    # this code needs to be manually changed if more gridding groups are added in the future
    proxy_temp = Map_animal_area_rank
    proxy_temp_nongrid = Map_animal_area_rank_nongrid
    proxy_temp_sum = Map_animal_area_rank_sum
    area_map_sum = Area_sum
    vars()['Ext_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']+'_01'] = np.zeros([len(lat001),len(lon001),num_years])
            
    for ilat in np.arange(0,len(lat001)):
        for ilon in np.arange(0,len(lon001)):
            if state_ANSI_map[ilat,ilon] > 0:
                istate = np.where(State_ANSI['ansi']==state_ANSI_map[ilat,ilon])[0][0]
                icounty = np.where((County_ANSI['State']==state_ANSI_map[ilat,ilon]) & \
                                    (County_ANSI['County']==county_ANSI_map[ilat,ilon]))[0][0]

                for ianimal in np.arange(0,len(proxy_animal_array)):
                    county_temp = vars()['County_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][ianimal,istate,icounty,:]
                    if np.sum(county_temp) > 0:
                        if proxy_temp_sum[ianimal,istate,icounty] >0: # if there is animal count data in the county, allocate by animal counts in grid cell relative to county sum
                            weighted_array = data_fn.safe_div(proxy_temp[ianimal,ilat,ilon],\
                                                      proxy_temp_sum[ianimal,istate,icounty]) #counts at grid cell/counts in county
                            for iyear in np.arange(0, num_years):
                                Emissions_array_001[ilat,ilon,iyear] += county_temp[iyear]*weighted_array
                                running_sum[igroup,iyear] += weighted_array*county_temp[iyear]
                                vars()['Ext_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']+'_01'][ilat,ilon,iyear] += county_temp[iyear]*weighted_array
                        elif proxy_temp_sum[ianimal,istate,icounty] == 0: # if no animal count data in county, use relative area as proxy
                            #weight by county area
                            weighted_array = data_fn.safe_div(area_map[ilat,ilon],\
                                                          area_map_sum[istate,icounty]) #area of grid cell/area of county
                            for iyear in np.arange(0, num_years):
                                Emissions_array_001[ilat,ilon,iyear] += county_temp[iyear]*weighted_array
                                running_sum2[igroup,iyear] += weighted_array*county_temp[iyear]
                                vars()['Ext_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']+'_01'][ilat,ilon,iyear] += county_temp[iyear]*weighted_array

        print(ilat,running_sum[igroup,0])
        print(ilat,running_sum2[igroup,0])
    
#non-CONUS regions are already filtered from the state_ANSI_map. Therefore, non-grid emissions
# have to be calculated as the difference between national and CONUS emissions (not ideal as 
# this is not an independent calculation of non-grid emissions)
for iyear in np.arange(0, num_years):
    county_sum = 0
    for igroup in np.arange(0,len(proxy_livestock_map)):
        county_sum += np.sum(vars()['County_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][:,:,:,iyear])
    Emissions_nongrid[iyear] = county_sum -np.sum(Emissions_array_001[:,:,iyear])
    print(Emissions_nongrid[iyear])
    
for igroup in np.arange(0, len(proxy_livestock_map)):
    vars()['Ext_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']] = np.zeros([len(Lat_01),len(Lon_01),num_years])
    
for iyear in np.arange(0, num_years):  
    Emissions_array_01[:,:,iyear] = data_fn.regrid001_to_01(Emissions_array_001[:,:,iyear], Lat_01, Lon_01)
    
    calc_emi = np.sum(Emissions_array_01[:,:,iyear]) + np.sum(Emissions_nongrid[iyear]) 
    calc_emi2 = 0
    for igroup in np.arange(0, len(proxy_livestock_map)):
        vars()['Ext_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear] = data_fn.regrid001_to_01(vars()['Ext_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']+'_01'][:,:,iyear], Lat_01, Lon_01)
        calc_emi2 += np.sum(vars()['Ext_'+proxy_livestock_map.loc[igroup,'GHGI_Emi_Group']][:,:,iyear])
    calc_emi2 += np.sum(Emissions_nongrid[iyear]) 
    summary_emi = EPA_emi_ent_CH4.iloc[0,iyear+1] 
    if DEBUG ==1:
        print(summary_emi)
        print(calc_emi)
        print(calc_emi2)
    diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if diff < 0.0001:
        print('Year ', year_range[iyear], ': PASS, difference < 0.01%')
    else:
        print('Year ', year_range[iyear], ': FAIL -- Difference = ', diff*100,'%')

#### Step 4.1.5 Save gridded emissions (kt)

In [None]:
#save gridded emissions for each gridding group - for extension

#Initialize file
data_IO_fn.initialize_netCDF(grid_emi_outputfile, netCDF_description, 0, year_range, loc_dimensions, Lat_01, Lon_01)

unique_groups = np.unique(proxy_livestock_map['GHGI_Emi_Group'])
unique_groups = unique_groups[unique_groups != 'Emi_not_mapped']

nc_out = Dataset(grid_emi_outputfile, 'r+', format='NETCDF4')

for igroup in np.arange(0,len(unique_groups)):
    print('Ext_'+unique_groups[igroup])
    if len(np.shape(vars()['Ext_'+unique_groups[igroup]])) ==4:
        ghgi_temp = np.sum(vars()['Ext_'+unique_groups[igroup]],axis=3) #sum month data if data is monthly
    else:
        ghgi_temp = vars()['Ext_'+unique_groups[igroup]]

    # Write data to netCDF
    data_out = nc_out.createVariable('Ext_'+unique_groups[igroup], 'f8', ('lat', 'lon','year'), zlib=True)
    data_out[:,:,:] = ghgi_temp[:,:,:]

#save nongrid data to calculate non-grid fraction extension
data_out = nc_out.createVariable('Emissions_nongrid', 'f8', ('year',), zlib=True)  
data_out[:] = Emissions_nongrid[:]
nc_out.close()

#Confirm file location
print('** SUCCESS **')
print("Gridded emissions (kt) written to file: {}" .format(os.getcwd())+grid_emi_outputfile)
print(' ')

del data_out, ghgi_temp, nc_out

#### 4.2. Calculate Gridded Emission Fluxes (molec./cm2/s) (0.1x0.1)
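
The conversion from an annual mass (kt) to a flux (molec./cm2/s) multiplies by 10^9 g/kt and Avogadro's number, then divides by the molar mass of CH4, the number of seconds in the year, and the grid-cell area. A minimal scalar sketch follows. The function names are hypothetical, the molar mass is an assumed approximate value (the notebook's `Molarch4` and `Avogadro` are defined elsewhere), and the area is assumed to be in cm2.

```python
import calendar

AVOGADRO = 6.02214076e23  # molecules per mole
MOLAR_MASS_CH4 = 16.04    # g/mol, assumed approximate value


def seconds_in_year(year):
    """Number of seconds in a calendar year, accounting for leap years."""
    days = 366 if calendar.isleap(year) else 365
    return days * 24 * 60 * 60


def kt_to_flux(emi_kt, area_cm2, year):
    """Convert annual emissions (kt) in one grid cell to molec./cm2/s."""
    molecules = emi_kt * 1e9 * AVOGADRO / MOLAR_MASS_CH4
    return molecules / seconds_in_year(year) / area_cm2


def flux_to_kt(flux, area_cm2, year):
    """Inverse conversion, used to check the flux against the mass total."""
    molecules = flux * seconds_in_year(year) * area_cm2
    return molecules * MOLAR_MASS_CH4 / (1e9 * AVOGADRO)


# Round trip for an illustrative cell of 1e12 cm2 emitting 1 kt in 2016
flux_2016 = kt_to_flux(1.0, 1e12, 2016)
mass_back = flux_to_kt(flux_2016, 1e12, 2016)
```

The same mass gives a slightly lower flux in a leap year, because it is spread over one more day.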

In [None]:
#Convert emissions to emission flux
# conversion: kt emissions to molec/cm2/s flux

DEBUG = 1


Flux_array_01_annual = np.zeros([len(Lat_01),len(Lon_01),num_years])
print('**QA/QC Check: Sum of national gridded emissions vs. GHGI national emissions')
  
for iyear in np.arange(0,num_years):
    if year_range[iyear] % 4 == 0: #leap year (2012, 2016; rule valid for 1901-2099)
        year_days = np.sum(month_day_leap)
        #month_days = month_day_leap
    else:
        year_days = np.sum(month_day_nonleap)
        #month_days = month_day_nonleap
        
    #for imonth in np.arange(0,num_months):
    conversion_factor_01 = 10**9 * Avogadro / float(Molarch4 *year_days * 24 * 60 *60) / area_matrix_01
    Flux_array_01_annual[:,:,iyear] += Emissions_array_01[:,:,iyear]*conversion_factor_01
    
    #convert back to mass to check
    #conversion_factor_annual = 10**9 * Avogadro / float(Molarch4 *year_days * 24 * 60 *60) / area_matrix_01
    calc_emi = np.sum(Flux_array_01_annual[:,:,iyear]/conversion_factor_01)+np.sum(Emissions_nongrid[iyear])
    summary_emi = EPA_emi_ent_CH4.iloc[0,iyear+1] 
    emi_diff = abs(summary_emi-calc_emi)/((summary_emi+calc_emi)/2)
    if DEBUG ==1:
        print(calc_emi)
        print(summary_emi)
    if abs(emi_diff) < 0.0001:
        print('Year '+ year_range_str[iyear]+': Difference < 0.01%: PASS')
    else: 
        print('Year '+ year_range_str[iyear]+': Difference > 0.01%: FAIL, diff: '+str(emi_diff*100)+'%')
        
Flux_Emissions_Total_annual = Flux_array_01_annual

-------------
## Step 5. Write netCDF
------------

In [None]:
# yearly data
#Initialize file
data_IO_fn.initialize_netCDF(gridded_outputfile, netCDF_description, 0, year_range, loc_dimensions, Lat_01, Lon_01)

# Write data to netCDF
nc_out = Dataset(gridded_outputfile, 'r+', format='NETCDF4')
nc_out.variables['emi_ch4'][:,:,:] = Flux_Emissions_Total_annual
nc_out.close()
#Confirm file location
print('** SUCCESS **')
print("Gridded enteric fermentation fluxes written to file: {}" .format(os.getcwd())+gridded_outputfile)

----------
## Step 6. Plot Gridded Data
---------

#### Step 6.1. Plot Annual Emission Fluxes

In [None]:
#Plot Annual Data
scale_max = 10
save_flag =0
save_outfile = ''
data_plot_fn.plot_annual_emission_flux_map(Flux_Emissions_Total_annual, Lat_01, Lon_01, year_range, title_str,scale_max,save_flag,save_outfile)

#### Step 6.2 Plot Difference between first and last inventory year

In [None]:
# Plot difference between last and first year
save_flag =0
save_outfile = ''
data_plot_fn.plot_diff_emission_flux_map(Flux_Emissions_Total_annual, Lat_01, Lon_01, year_range, title_diff_str,save_flag, save_outfile)

In [None]:
ct = datetime.datetime.now() 
ft = ct.timestamp() 
time_elapsed = (ft-it)/(60*60)
print('Time to run: '+str(time_elapsed)+' hours')
print('** GEPA_3A_Livestock_Enteric: COMPLETE **')