# He_line_calc: a notebook for reducing He line data

## Introduction

This notebook reduces data produced by a Pfeiffer PrismaPlus220 in the Helium Analysis Laboratory (HAL) at the University of Illinois. The HAL uses h6t and Pychron software for data reporting and measures masses 1-5 on line blanks, hot blanks, line gas standards, and samples. We use an isotope dilution approach with a $^3$He spike and a $^4$He reference gas of known volume, so we measure 4/3 gas ratios and correct them for H, D, and HD. This notebook reads the raw data files from either h6t or Pychron and reports He amounts in pmol.

Instructions for using this notebook are provided before each code cell and should be followed step by step. Only one notebook is needed for a complete set of analyses, which may span multiple days (referred to throughout this notebook as a "session"). Some cells are run once at the very beginning of data collection, whereas others are run repeatedly as new data are collected. __Pay attention to which cells need to be run once and which need to be run repeatedly throughout data collection!__ If you are unfamiliar with Jupyter Notebooks, the "run" button is at the top of the notebook; to run a cell, select it with the mouse and then click the "run" button.

### Step-by-step instructions

First, save a copy of this notebook to the relevant folder on the HAL desktop (`C:\Users\lab-admin\Desktop\Line_summary_sheets\20XX\XXX20XX\'lastname'`, where the Xs are specific to the date and 'lastname' is the user's last name), and make sure that all data files for a given session are saved to the same folder.

The first cell imports the packages and sets the constants used throughout the notebook. It needs to be __run only once__.

In [25]:
import pandas as pd
import numpy as np
import math
import matplotlib.pyplot as plt
%matplotlib inline

initial_tank_4He = 6

The next cell reads an individual run file produced by h6t. You will need to __run this cell every time you collect new data from the PrismaPlus__. In the cell below, set the `file_name` variable to the name of the data file you saved to the folder. __Type the name between the quotation marks and include the file extension__. The name should be that of the sample, line blank, hot blank, or line standard. Line blanks have the style 'lb_mmddyyyy', where mm = the month, dd = the day, and yyyy = the year. Hot blanks have the style 'hb_mmddyyyy'. Line standards have the style 'stdXXXX', where XXXX is the shot number from the $^4$He pipette as recorded in the notebook.
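
As an optional illustration of the naming styles above (it is not one of the cells you need to run), the sketch below classifies a run by its file name. The function name `classify_run`, the example file names, the assumption that the shot number consists only of digits, and the 'sample' fallback for any other name are all assumptions made for this example; they are not part of h6t or Pychron.

```python
import os
import re

def classify_run(file_name):
    """Return 'line blank', 'hot blank', 'line standard', or 'sample'."""
    # Drop any folder and the file extension before matching the name.
    stem = os.path.splitext(os.path.basename(file_name))[0]
    if re.fullmatch(r'lb_\d{8}', stem):   # lb_mmddyyyy
        return 'line blank'
    if re.fullmatch(r'hb_\d{8}', stem):   # hb_mmddyyyy
        return 'hot blank'
    if re.fullmatch(r'std\d+', stem):     # stdXXXX, digits assumed
        return 'line standard'
    # Anything else is treated as a sample name (an assumption).
    return 'sample'

# Hypothetical file names, for illustration only.
for name in ['lb_01312024.xlh', 'hb_01312024.xlh', 'std0412.xlh']:
    print(name, '->', classify_run(name))
```

A check like this can catch a mistyped blank or standard name before the file is reduced, but the naming styles described above remain the reference.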

In [33]:
file_name = 'MID.xlh'
Prisma_data_list = []

with open(file_name, mode='r') as in_file:
    #skip the header: advance to the line that starts the mass intensity data
    line = in_file.readline()
    while line and not line.startswith('cycle'):
        line = in_file.readline()

    #stop with a clear message instead of looping forever if the header line is missing
    if not line:
        raise ValueError("no line starting with 'cycle' found in " + file_name)

    #read the data lines into a list of lists (the number of cycles varies from run to run)
    for line in in_file:
        fields = line.split()
        if fields:  #skip blank lines
            Prisma_data_list.append([float(i) for i in fields])

#convert data_list to an array for easier indexing
Prisma_data_array=np.array(Prisma_data_list)

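#create time list (x-values, in seconds since the first cycle) and corrected 4He/3He list (y-values)
#column layout assumed from the indexing below: 1 = time in days, 2 = mass 1, 4 = mass 3, 5 = mass 4, 6 = mass 5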
t_list = [(Prisma_data_array[i,1]-Prisma_data_array[0,1])*24*60*60 for i in range(len(Prisma_data_array))]
He_ratio_list = [(Prisma_data_array[i,5]-Prisma_data_array[i,6])/(Prisma_data_array[i,4]-Prisma_data_array[i,6]-0.005*Prisma_data_array[i,2]) for i in range(len(Prisma_data_array))]

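#unweighted least-squares line through the corrected 4He/3He vs. time: slope, intercept, and their standard errors
#(sum_slope_err accumulates the squared residuals about the fitted line)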
sum_t_y = 0
sum_t2 = 0
sum_slope_err = 0

for i in range(len(t_list)):
    sum_t_y = sum_t_y + t_list[i]*He_ratio_list[i]
    sum_t2 = sum_t2 + t_list[i]**2

slope = (len(t_list)*sum_t_y - sum(t_list)*sum(He_ratio_list))/(len(t_list)*sum_t2 - sum(t_list)**2)
intercept = (sum(He_ratio_list) - slope*sum(t_list))/len(t_list)

for i in range(len(t_list)):
    sum_slope_err = sum_slope_err + (He_ratio_list[i] - intercept - slope*t_list[i])**2

del_slope = math.sqrt(sum_slope_err/(len(t_list) - 2)) * math.sqrt(len(t_list)/(len(t_list)*sum_t2 - sum(t_list)**2))
del_intercept = math.sqrt(sum_slope_err/(len(t_list) - 2)) * math.sqrt(sum_t2/(len(t_list)*sum_t2 - sum(t_list)**2))

mean_4He_3He = np.mean(He_ratio_list)
std_4He_3He = np.std(He_ratio_list)

print('The int and err for this sample is ',intercept,' +/- ',del_intercept)
print('and the mean and std dev is ',mean_4He_3He,' +/- ',std_4He_3He)


The int and err for this sample is  1.0293228342467817  +/-  0.00281305514384124
and the mean and std dev is  1.032146706034711  +/-  0.004645595132030236
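
The same reduction can be written with NumPy array operations, which makes it easier to check the fit against an independent routine. The sketch below is an illustration under stated assumptions, not a replacement for the cell above: the function names `corrected_he_ratio` and `linear_fit_with_errors` are hypothetical, and the column layout (masses 1, 3, 4, and 5 in columns 2, 4, 5, and 6) is inferred from the indexing used in that cell.

```python
import numpy as np

def corrected_he_ratio(data):
    """Corrected 4He/3He per cycle, using the same expression as the cell above."""
    data = np.asarray(data, dtype=float)
    # Column layout is an assumption inferred from the notebook's indexing.
    mass1, mass3, mass4, mass5 = data[:, 2], data[:, 4], data[:, 5], data[:, 6]
    return (mass4 - mass5) / (mass3 - mass5 - 0.005 * mass1)

def linear_fit_with_errors(t, y):
    """Unweighted least-squares line y = intercept + slope*t with standard errors."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    denom = n * np.sum(t**2) - np.sum(t)**2
    slope = (n * np.sum(t * y) - np.sum(t) * np.sum(y)) / denom
    intercept = (np.sum(y) - slope * np.sum(t)) / n
    # Scatter of the residuals about the line, with n - 2 degrees of freedom.
    resid = y - intercept - slope * t
    s = np.sqrt(np.sum(resid**2) / (n - 2))
    del_slope = s * np.sqrt(n / denom)
    del_intercept = s * np.sqrt(np.sum(t**2) / denom)
    return slope, intercept, del_slope, del_intercept
```

With the variables defined in the cell above, `linear_fit_with_errors(t_list, He_ratio_list)` should reproduce the intercept and error that the cell prints, and `np.polyfit(t_list, He_ratio_list, 1)`, which returns the slope followed by the intercept, offers an independent check of those two values.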
