# Process Measurement Station meter data

- Code Author: Sahar H El Abbadi
- Date Created: 2023-03-29
- Last Edited: 2023-03-29

Code to extract selected Kinder Morgan pipeline data and calculate the associated uncertainty.


In [1]:
# Imports
import pandas as pd
import pathlib
import datetime

# Load data
km_df = pd.read_csv(pathlib.PurePath('02_meter_data', 'km_gas_comp_all_days.csv'), parse_dates=['datetime'])

# Load gas composition spreadsheet that contains refill dates and the su_raw and su_normalized values
import_cols = ['refill_no', 'rawhide_refill_date', 'start_utc', 'end_utc', 'notes', 'su_notes', 'km_composition_date', 'su_raw', 'su_normalized']
gas_comp_summary = pd.read_csv(pathlib.PurePath('02_meter_data', 'gas_comp_clean_su.csv'),
                               nrows=15, usecols=import_cols, parse_dates=['start_utc', 'end_utc'])

## KM gas composition: mean and standard deviation

#### Mean Calculations
KM provided data for two taps that fed into the filling location. As a conservative estimate, and to account for latency in the gas refill system (i.e., if gas is stored onsite for several days prior to use), we calculate a five-day average for mean gas composition. First, we calculate a daily average across the two taps. Next, we average the daily value across five days, with the refill date as the final day. Thus, for a gas refill on October 5th, we average gas composition from October 1st through October 5th.

#### Standard Deviation Calculations
We calculate the standard deviation (1-sigma) over the same five-day window of tap-averaged values used for the mean.
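
To make the averaging window concrete, here is a minimal sketch on made-up data. The readings, the `toy` DataFrame, and the `five_day_stats` helper are illustrative assumptions and not part of the original analysis; only the window logic (refill date and the four days prior) follows the description above.

```python
import pandas as pd

# Hypothetical daily methane readings (percent) for the two taps; values are made up
toy = pd.DataFrame({
    'datetime': pd.date_range('2022-09-29', '2022-10-06', freq='D'),
    'tap_1': [94.0, 94.2, 94.1, 94.3, 94.5, 94.4, 94.6, 94.8],
    'tap_2': [94.2, 94.4, 94.3, 94.5, 94.7, 94.6, 94.8, 95.0],
})

# Step 1: daily average across the two taps
toy['tap_mean'] = toy[['tap_1', 'tap_2']].mean(axis=1)


def five_day_stats(df, refill_date):
    """Return window bounds, row count, mean and sample std for the five-day window."""
    end_t = pd.Timestamp(refill_date)
    start_t = end_t - pd.Timedelta(days=4)  # four days prior plus the refill day
    mask = (df['datetime'] >= start_t) & (df['datetime'] <= end_t)
    window = df.loc[mask, 'tap_mean']
    return start_t, end_t, len(window), window.mean(), window.std()


start_t, end_t, n_days, mean_5d, sigma_5d = five_day_stats(toy, '2022-10-05')
print(start_t.date(), end_t.date(), n_days, round(mean_5d, 2))
```

For a refill on October 5th the window starts on October 1st and contains five daily values. Note that `Series.std()` in pandas returns the sample standard deviation (`ddof=1`) by default.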

In [2]:
# Calculate average gas composition across the two taps
km_df['tap_mean'] = km_df[['tap_1', 'tap_2']].mean(axis=1)

# Calculate the average gas composition over five days: the gas refill day, and the four days prior
km_gas_comp_mean = {}
for index, row in gas_comp_summary.iterrows():
    refill_no = row['refill_no']
    km_composition_date = row['km_composition_date']
    # km_composition_date is the end time for averaging
    end_t = datetime.datetime.strptime(km_composition_date, '%Y-%m-%d')
    delta_t = pd.Timedelta(days=4)
    start_t = end_t - delta_t

    # Select averaging period
    time_mask = (km_df['datetime'] >= start_t) & (km_df['datetime'] <= end_t)
    ave_period = km_df.loc[time_mask].copy()

    # Calculate mean and sigma values
    ave_period_mean = ave_period['tap_mean'].mean()
    ave_period_sigma = ave_period['tap_mean'].std()

    # Save in dictionary
    km_gas_comp_mean[refill_no] = [refill_no, ave_period_mean, ave_period_sigma]

# Merge the KM mean and sigma values into gas_comp_summary (saved later as gas_comp_clean_su_km.csv)
km_gas_comp_mean = pd.DataFrame.from_dict(km_gas_comp_mean, orient='index',
                                          columns=['refill_no', 'km', 'km_sigma'])
gas_comp_summary = gas_comp_summary.merge(km_gas_comp_mean, on='refill_no')

## SU gas composition: mean and standard deviation

The Stanford team collected one or two gas samples from each truck refill. Samples were collected using canisters provided by the Eurofins AirToxics laboratory in Folsom, CA, where laboratory compositional analysis was conducted. We excluded results where canisters arrived at the laboratory with a vacuum level exceeding 15 inches of mercury. Where two samples were available, we take their average. Where samples were missing, we take the average of the samples taken immediately before and after the missing sample date.
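
The averaging and gap-filling rules above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to build `gas_comp_clean_su.csv` (which is loaded pre-computed): the `samples` table, its column names, and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical methane percentages; NaN marks a missing or excluded canister
samples = pd.DataFrame({
    'refill_no': [1, 2, 3, 4],
    'sample_a': [93.0, 95.0, np.nan, 96.0],
    'sample_b': [95.0, np.nan, np.nan, 98.0],
})

# Average whatever samples exist for each refill (NaN values are skipped by default)
refill_mean = samples[['sample_a', 'sample_b']].mean(axis=1)

# Where a refill has no valid sample, average the nearest valid refills before and after
neighbour_mean = (refill_mean.ffill() + refill_mean.bfill()) / 2
filled = refill_mean.fillna(neighbour_mean)

print(filled.tolist())
```

In this sketch a gap at the very first or last refill stays `NaN`, because there is no neighbour on one side; how such cases were handled is not stated in the text.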

Eurofins AirToxics Laboratory used Modified ASTM D-1945 to analyze samples using a gas chromatograph with two detectors, a flame ionization detector (FID) and a thermal conductivity detector (TCD). Quantification was conducted on the following: oxygen, nitrogen, carbon monoxide, methane, carbon dioxide, ethane, ethene, acetylene, propane, isobutane, butane, neopentane, isopentane, pentane.

In the report provided by Eurofins AirToxics Laboratory, the sum of the individual constituents of the gas samples was greater than 100%. Additionally, several samples reported a methane content of 100%, which does not align with typical values observed for compressed natural gas. Furthermore, percent recovery for quality control standards run with experimental samples ranged from 95% to 101%, potentially indicating variability associated with sample loss during measurement.


##### SU Raw:
Here, we use raw values reported by Eurofins AirToxics Laboratory for gas composition, with averages as described above.

##### SU Normalized:
Here, we normalize the gas compositional values such that the sum of all components is 100%.
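
As a minimal sketch of this normalization, assuming a simple proportional rescaling of every component (the component values below are made up and do not come from the Eurofins report):

```python
import pandas as pd

# Hypothetical raw report in percent; the components sum to more than 100%
raw = pd.Series({
    'methane': 96.0,
    'ethane': 4.0,
    'propane': 1.0,
    'nitrogen': 1.5,
    'carbon dioxide': 0.5,
})

raw_total = raw.sum()

# Rescale each component by the same factor so the total is exactly 100%
normalized = raw / raw_total * 100

print(raw_total, round(normalized['methane'], 2))
```

Because every component is scaled by the same factor, the ratios between components are unchanged; only the methane percentage that feeds into later calculations is reduced.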

## Uncertainty in Eurofins Analysis

Eurofins AirToxics Laboratory provided quality control data for all internal standards analyzed on the gas chromatographs used for analyzing Stanford's samples, over the time period most aligned with when Stanford sample analysis was being conducted.

Eurofins AirToxics uses samples of known composition as quality control standards, and measures the percent recovery. Samples at Stanford were analyzed from October 28th, 2022 through December 9th, 2022. Eurofins provided quality control data for Q4 of 2022, which includes all standards analyzed from November 21, 2022 through February 21, 2023. During this period, the 95% confidence interval for methane percent recovery was 92% to 101%, corresponding to a 2-sigma value of 4.5%.
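
The 2-sigma value follows from the half-width of the confidence interval. The short calculation below reproduces it, assuming the interval is symmetric and that a 95% interval corresponds to roughly plus or minus two standard deviations; the variable names are illustrative.

```python
# 95% confidence interval for methane percent recovery, taken from the text above
ci_low, ci_high = 92.0, 101.0

# Half-width of the interval, treated as approximately 2 sigma
two_sigma_pct = (ci_high - ci_low) / 2

# 1-sigma value under the same assumption
one_sigma_pct = two_sigma_pct / 2

# Midpoint of the interval; a midpoint below 100% suggests a low bias in recovery
center_pct = (ci_high + ci_low) / 2

print(two_sigma_pct, one_sigma_pct, center_pct)
```

The interval is centered on 96.5% rather than 100%, which is the systematic bias referred to in the next paragraph.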

Eurofins did have information to allow us to account for the systematic bias observed in measurements associated with percent recoveries of standards not equal to 100%.

Due to the high level of uncertainty associated with the Eurofins measurements, we use the Kinder Morgan analysis for the default gas composition values.

In [3]:
# Add uncertainty associated with su_raw and su_normalized values

two_sigma = 0.045
gas_comp_summary['su_raw_sigma'] = two_sigma/2
gas_comp_summary['su_normalized_sigma'] = two_sigma/2

In [4]:
# Save values

gas_comp_summary.to_csv(pathlib.PurePath('02_meter_data', 'gas_comp_clean_su_km.csv'))

In [30]:
import pandas as pd
# from methods_source import select_methane_fraction, calc_meter_uncertainty
# all_days = ['10_10', '10_11', '10_12', '10_13', '10_14', '10_17', '10_18', '10_19', '10_24', '10_25', '10_26', '10_27', '10_28', '10_29', '10_30', '10_31', '11_01', '11_02', '11_03', '11_04', '11_07', '11_08', '11_10', '11_11', '11_14', '11_15', '11_16', '11_17', '11_18', '11_21', '11_22', '11_23', '11_28', '11_29', '11_30']
#
# for day in all_days:
#     # Load file for the selected day
#     month = day[0:2]
#     date = day[3:5]
#     col_names = ['datetime_utc', 'whole_gas_kgh', 'meter', 'data_qc']
#     daily_file = pd.read_excel(pathlib.PurePath('02_meter_data', 'daily_meter_data', 'whole_gas_raw', f'{month}_{date}.xlsx'), names=col_names, parse_dates = [0])
#
#     # Abbreviate meter names in raw meter file to be compatible with other functions
#     names = ['Baby Coriolis', 'Mama Coriolis', 'Papa Coriolis']
#     nicknames = ['bc', 'mc', 'pc']
#     for meter_name, meter_nickname in zip(names, nicknames):
#         daily_file.loc[daily_file['meter'] == meter_name, 'meter'] = meter_nickname
#
#     # Create storage array for new rows
#     daily_log = []
#
#     for index, row in daily_file.iterrows():
#         meter = row['meter']
#         whole_gas_kgh = row['whole_gas_kgh']
#         datetime_utc = row['datetime_utc']
#         fraction_methane, fraction_methane_sigma = select_methane_fraction(datetime_utc, gas_comp_source='km')
#         data_qc = row['data_qc']
#
#         # when meters aren't reading, set everything to pd.NA
#         if (pd.isna(meter)) or (pd.isna(whole_gas_kgh)):
#             meter = pd.NA
#             meter_sigma = pd.NA
#             whole_gas_kgh = pd.NA
#             fraction_methane = pd.NA
#             fraction_methane_sigma = pd.NA
#             methane_kgh = pd.NA
#             meter_percent_uncertainty = pd.NA
#
#         # for zero readings, set everything to 0
#         elif whole_gas_kgh == 0:
#             meter = 'None'
#             meter_sigma = 0
#             whole_gas_kgh = 0
#             fraction_methane_sigma = 0
#             methane_kgh = 0
#             meter_percent_uncertainty = 0
#
#         # for non-zero and non-NAN values, calculate all values:
#         else:
#             methane_kgh = whole_gas_kgh * fraction_methane
#             meter_percent_uncertainty = calc_meter_uncertainty(meter, whole_gas_kgh)
#             meter_sigma = meter_percent_uncertainty / 100 / 1.96 * whole_gas_kgh
#
#         new_row = {
#             'datetime_utc': datetime_utc,
#             'meter': meter,
#             'meter_percent_uncertainty': meter_percent_uncertainty,
#             'meter_sigma': meter_sigma,
#             'whole_gas_kgh': whole_gas_kgh,
#             'fraction_methane': fraction_methane,
#             'fraction_methane_sigma': fraction_methane_sigma,
#             'methane_kgh': methane_kgh,
#             'data_qc': data_qc,
#         }
#
#         daily_log.append(new_row)
#     daily_meter_file = pd.DataFrame(daily_log)
#     daily_meter_file.to_csv(pathlib.PurePath('02_meter_data', 'daily_meter_data', 'whole_gas_clean', f'{day}.csv'), na_rep='NA')

In [2]:
import pandas as pd
import datetime
from methods_source import calc_average_release
start_t = datetime.datetime(2022, 10, 10, 17, 13, 15)
stop_t = datetime.datetime(2022, 10, 10, 17, 14, 15)
calc_average_release(start_t, stop_t)

# def calc_average_release(start_t, stop_t):
#     """ Calculate the average flow rate and associated uncertainty given a start and stop time.
#     Inputs:
#       - start_t, stop_t are datetime objects
#
#     Outputs:
#       - Dictionary containing keys for "ch4_kgh_mean" and "ch4_kgh_sigma"
#       - ch4_kgh_mean is the mean methane release rate over the period
#       - ch4_kgh_sigma is the sigma value that combines uncertainty associated with: 1) variability of flow rate over the time period being analyzed 2) meter reading 3) variability in gas composition"""
#
#     if start_t.date() != stop_t.date():
#         print('Do not attempt to calculate average flow across multiple dates. Please consider a new start or end time.')
#     else:
#         # Load data
#         file_name = start_t.strftime('%m_%d')
#         file_path = pathlib.PurePath('02_meter_data', 'daily_meter_data', 'whole_gas_clean', f'{file_name}.csv')
#
#         # Select data for averaging
#         meter_data = pd.read_csv(file_path, index_col=0, parse_dates=['datetime_utc'])
#         time_ave_mask = (meter_data['datetime_utc'] > start_t) & (meter_data['datetime_utc'] <= stop_t)
#         average_period = meter_data.loc[time_ave_mask].copy()
#         length_before_drop_na = len(average_period)
#
#         # Drop rows with NA values
#         average_period.dropna(axis='index', inplace=True, thresh=7)
#         length_after_drop_na = len(average_period)
#         print(f'Number of rows that were NA in the average period: {length_before_drop_na - length_after_drop_na}')
#
#         # Calculate mean and standard deviation for gas flow rate and ch4 flow rate
#         ch4_kgh_mean = average_period['methane_kgh'].mean()
#         gas_kgh_mean = average_period['whole_gas_kgh'].mean()
#         gas_kgh_sigma = average_period['whole_gas_kgh'].std()
#         print(f'gas std: {gas_kgh_sigma}')
#
#         # Calculate the mean methane fraction in case average_period straddles a period when the truck was changed
#         methane_fraction_mean = average_period['fraction_methane'].mean()
#         methane_fraction_sigma = average_period['fraction_methane_sigma'].mean()
#         print(f'methane fraction sigma: {methane_fraction_sigma}')
#
#         # Calculate the meter reading uncertainty for the mean gas flow rate
#         meter_sigma = average_period['meter_sigma'].mean()
#         print(f'meter sigma gas: {meter_sigma}')
#
#         # Combine sigma values
#
#         if ch4_kgh_mean == 0:
#             ch4_kgh_sigma = 0
#         else:
#             # Combine uncertainties associated with variability in gas flow rate and meter uncertainty
#             gas_sigma = sum_of_quadrature(gas_kgh_sigma, meter_sigma)
#
#             # Combine uncertainty associated with gas flow rate (flow variability and meter reading) with the uncertainty associated with variability in gas composition.
#             # Because we multiply gas composition by gas flow rate, use sum of quadrature on the relative uncertainty values:
#
#             relative_gas_sigma = gas_sigma / gas_kgh_mean
#             relative_gas_comp_sigma = methane_fraction_sigma / methane_fraction_mean
#
#             ch4_kgh_sigma = sum_of_quadrature(relative_gas_sigma, relative_gas_comp_sigma) * ch4_kgh_mean
#
#     results_summary = {
#         'ch4_kgh_mean': ch4_kgh_mean,
#         'ch4_kgh_sigma': ch4_kgh_sigma,
#     }
#     return results_summary

# Return a dictionary with keys: ch4_kgh_mean, ch4_kgh_sigma

Number of rows that were NA in the average period: 0
gas std: 0.06896714533943814
methane fraction sigma: 0.0016174275550494
meter sigma gas: 0.039565234708740606


{'ch4_kgh_mean': 6.612948731013749, 'ch4_kgh_sigma': 0.07532125880511012}
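
The commented-out function above calls a `sum_of_quadrature` helper whose implementation is not shown in this notebook. The sketch below assumes it is a plain root-sum-of-squares of independent uncertainties, and walks through the same propagation steps on made-up inputs. The `propagate_ch4_sigma` function and all numbers are hypothetical. Unlike the original, which takes the mean of the `methane_kgh` column, this sketch simply multiplies the mean flow by the mean methane fraction.

```python
import math


def sum_of_quadrature(*sigmas):
    """Root-sum-of-squares of independent uncertainties (assumed behavior)."""
    return math.sqrt(sum(s ** 2 for s in sigmas))


def propagate_ch4_sigma(gas_mean, gas_std, meter_sigma, frac_mean, frac_sigma):
    """Combine flow variability, meter uncertainty and gas composition uncertainty."""
    # Absolute uncertainties on the same quantity (whole gas flow) add in quadrature
    gas_sigma = sum_of_quadrature(gas_std, meter_sigma)

    # For a product (flow x methane fraction), relative uncertainties add in quadrature
    relative_gas_sigma = gas_sigma / gas_mean
    relative_frac_sigma = frac_sigma / frac_mean

    ch4_mean = gas_mean * frac_mean
    ch4_sigma = sum_of_quadrature(relative_gas_sigma, relative_frac_sigma) * ch4_mean
    return ch4_mean, ch4_sigma


# Made-up inputs: 10 kg/h whole gas, 90% methane
ch4_mean, ch4_sigma = propagate_ch4_sigma(
    gas_mean=10.0, gas_std=0.3, meter_sigma=0.4, frac_mean=0.9, frac_sigma=0.009
)
print(ch4_mean, ch4_sigma)
```

With these inputs the flow terms combine to a gas sigma of 0.5 kg/h (5% relative), and the 1% relative composition uncertainty adds only slightly to the final methane sigma. The printed output above shows the same pattern: flow variability and meter uncertainty are much larger than the methane fraction sigma.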