### Example code for data processing 
#### *Author: Ben Stottrup*
- Data is read from an Excel file into a pandas DataFrame
- Examine the data received to see what it looks like
- Convert the DataFrame to a numpy array for further processing
- Process the data to find the pauses in the compression
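
The steps listed above can be sketched end to end on a small in-memory table, so the flow can be tried without the Excel file. This is only a sketch: the column names and values below are made up for illustration and are not taken from the experiment's spreadsheet.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet; column names and values are assumptions.
frame = pd.DataFrame({
    "time_s": [0.0, 1.0, 2.0, 3.0],
    "pressure": [0.5, 1.2, 2.4, 2.3],
})

print(frame.head())       # examine the data to see what it looks like

data = frame.to_numpy()   # convert the DataFrame to a 2-D numpy array
time = data[:, 0]         # first column as a 1-D vector
pressure = data[:, 1]     # second column as a 1-D vector
print(data.shape, data.dtype)
```

With purely numeric columns the resulting array has a floating-point dtype. If the real spreadsheet mixes text and numbers, the array may come out with dtype `object` instead, which is why the notebook prints the dtypes after the conversion.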

In [1]:
#   Python uses a bunch of different packages: libraries of pre-built functions.
#   To use them we need to load them into memory.

import numpy as np  # Loads numpy under the short name np.  This is the Python workhorse tool for math
import pandas as pd # Loads pandas.  A workhorse tool for data and data manipulation
import matplotlib.pyplot as plt # Loads the standard graphing package, similar to Matlab
%matplotlib inline

In [2]:
#   This is a chunk of code to condition the data to be manipulated.
#   There are probably better ways to do it, but this keeps things simple
#   and consistent with how a student might conceptualize things from Excel.

Isotherm_Data = pd.read_excel("../../DATA/Exp2.xls") # Loads an Excel file
Isotherm_Data.head()

FileNotFoundError: [Errno 2] No such file or directory: '../../DATA/Exp2.xls'
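
The error above means the relative path did not resolve to an existing file from the notebook's working directory. One way to diagnose this, sketched below, is to print the working directory and the fully resolved path before calling `pd.read_excel`. The helper name `describe_path` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def describe_path(path_text):
    """Return the absolute form of a path and whether it exists (hypothetical helper)."""
    path = Path(path_text).expanduser().resolve()
    return path, path.exists()


resolved, found = describe_path("../../DATA/Exp2.xls")
print("Working directory:", Path.cwd())
print("Resolved path:", resolved)
print("File exists:", found)
```

If `found` is `False`, either the notebook was started from a different folder than expected or the data file is not where the relative path points.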

In [None]:
DATA = np.array(Isotherm_Data)   # Converts the DataFrame into a 2-D numpy array for manipulation
Time = DATA[:,0]          # Slicing DATA up into single column vectors.
Trough_Area = DATA[:,1]
Mol_Area = DATA[:,2]
Pressure = DATA[:,5]      # Column 5 holds the pressure.
print(DATA.dtype, Time.dtype, Trough_Area.dtype, Mol_Area.dtype, Pressure.dtype)

In [None]:
#   Now slice off all of the data recorded after the maximum pressure is reached.
#   This might not be "pythonic", but it gives clear names to the data we are working with.
#   For these experiments we keep 40 extra indices past the maximum to capture a final movie.

index_max = np.argmax(Pressure)+40  
Time = Time[0:index_max]
Trough_Area = Trough_Area[0:index_max]
Mol_Area = Mol_Area[0:index_max]
Pressure = Pressure[0:index_max]
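
The truncation step can be checked on a toy array. The sketch below uses invented pressure values and a small margin of 2 (the notebook uses 40) so the result is easy to read. It also shows that a slice end beyond the array length is harmless in numpy: the slice simply stops at the last element.

```python
import numpy as np

# Invented pressure trace: rises to a peak at index 3, then decays.
pressure = np.array([0.0, 1.0, 3.0, 7.0, 6.5, 6.0, 5.5, 5.0])

margin = 2                                # hypothetical small margin; the notebook uses 40
cut = np.argmax(pressure) + margin        # index of the peak plus the margin
trimmed = pressure[0:cut]                 # everything up to, but not including, index cut
print(cut, trimmed)

# A slice end past the end of the array does not raise an error.
too_far = pressure[0:np.argmax(pressure) + 40]
print(len(too_far))
```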

In [None]:
#   In the routine below we try to find the times when the trough was stopped.
#   We use the differences between adjacent molecular-area values and adjacent pressure values.

Diff_MA = np.diff(Mol_Area)
Diff_Pres = np.diff(Pressure)
#Diff_ind = np.zeros(len(Diff_MA))    !!!
Pres_val = np.zeros(len(Diff_MA))   #  Defining this array using the np.zeros() function.
Area_val = np.zeros(len(Diff_MA))

count = 0
while (count < len(Diff_MA)):
#    if Diff_MA[count] > .15 and Diff_Pres[count] < 0.1 and Pressure[count] > 1:  !!!
    if Diff_Pres[count] < 0.1 and Pressure[count] > 1 and Diff_MA[count] > .15 and Pressure[count] > Pressure[count+1]:
#        Diff_ind[count] = count    !!!
        Pres_val[count] = Pressure[count]
        Area_val[count] = Time[count]   # Despite its name, Area_val stores the time of the stop
    count += 1
       
Stop_Pres = Pres_val[Pres_val != 0]
Stop_Area = Area_val[Area_val != 0]   #   This command clears out the zeros (place holders for compressing troughs)
Stop_Pres_Diff = np.diff(Stop_Pres)   #   Finds the differences between our "stopping pressures" to clean these values

count = 0   #  Resets a counter
Ave_Pres = []    # Because I don't know how big this array should be, I will treat it as a list.
Stopping_Pressure = []   # same.
Ave_Area = []
Stopping_Area = []

while (count < len(Stop_Pres_Diff)):
    if np.absolute(Stop_Pres_Diff[count]) < .5:   #  Testing if true stop
        Ave_Pres.append(Stop_Pres[count])    # appending a value to the list
        Ave_Area.append(Stop_Area[count])
        count += 1
    else:
        SP = np.mean(np.array(Ave_Pres))   # We have left the stop, so we calculate the mean pressure over it
        SA = np.mean(np.array(Ave_Area))
        Ave_Pres = []     #  we clear the list for the next iteration
        Ave_Area = []
        Stopping_Pressure.append(SP)    # We append the complete average of our stopping pressure
        Stopping_Area.append(SA)
        count += 1
            
SP = np.mean(np.array(Ave_Pres))  # A final append for the last iteration
Stopping_Pressure.append(SP)
SA = np.mean(np.array(Ave_Area))
Stopping_Area.append(SA)   
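
For comparison, the first `while` loop in the cell above can also be written as a boolean mask over whole arrays. The sketch below uses the same four thresholds as the loop but runs on invented data, so it illustrates the technique rather than replacing the notebook's routine. The lowercase variable names are chosen here for illustration.

```python
import numpy as np

# Invented data: two short pressure drops that coincide with a rise in molecular area.
pressure = np.array([0.5, 2.0, 1.9, 3.0, 2.8])
mol_area = np.array([50.0, 48.0, 48.2, 46.0, 46.3])

diff_ma = np.diff(mol_area)      # element i holds mol_area[i+1] - mol_area[i]
diff_pres = np.diff(pressure)    # element i holds pressure[i+1] - pressure[i]

# Same four conditions as the loop, evaluated for every index at once.
mask = (
    (diff_pres < 0.1)
    & (pressure[:-1] > 1)
    & (diff_ma > 0.15)
    & (pressure[:-1] > pressure[1:])
)

# Keep the pressure where the mask is True, and a zero placeholder elsewhere.
pres_val = np.where(mask, pressure[:-1], 0.0)
print(mask)
print(pres_val)
```

The explicit loop may be easier to follow when coming from Excel, while the mask form tends to be shorter and faster on long recordings.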

In [None]:
#  We have lists, and can't do algebra on them as easily.
#  So we convert them to numpy arrays.

Stopping_Pressure = np.array(Stopping_Pressure)
Final_Pressures = Stopping_Pressure[Stopping_Pressure > 0]
Stopping_Time = np.array(Stopping_Area)
Final_Times = Stopping_Time[Stopping_Time > 0]
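
The `> 0` filters above also remove entries that come from averaging an empty list. `np.mean` of an empty array returns `nan` (with a `RuntimeWarning`), and any comparison with `nan` is `False`, so those entries are dropped. A small sketch with invented numbers:

```python
import warnings

import numpy as np

# Averaging an empty array gives nan; silence the warning numpy emits here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    empty_mean = np.mean(np.array([]))

# Invented values: two real averages, one nan from an empty group, one zero.
values = np.array([12.5, empty_mean, 0.0, 30.1])
kept = values[values > 0]     # nan > 0 is False, so nan is filtered out along with 0
print(empty_mean, kept)
```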

In [None]:
#   Creating a plot to check the results.
#   This plots the values that were ultimately extracted for pauses
#   in the compression.
plt.plot(Time, Pressure)
plt.plot(Time[:-1], Pres_val )
plt.plot(Final_Times, Final_Pressures, 'g*', markersize=10)   # property names are lowercase in current matplotlib