In [1]:
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema

In [40]:
# read in dataset
xl = pd.ExcelFile("data/130N_Cycles_1-47.xlsx")
df = xl.parse("Specimen_RawData_1")
df

Unnamed: 0,Time,Load
0,0.0000,0.06729
1,0.0018,0.07128
2,0.1000,0.05453
3,0.2000,0.00621
4,0.3000,0.00352
5,0.4000,0.02063
6,0.5000,-0.00168
7,0.6000,-0.01183
8,0.7000,-0.00167
9,0.8000,-0.00656


This is what the dataset currently looks like: it has 170,101 rows and two columns.<br><br>
The dataset contains data from 47 cycles of an experiment. The output of the experiment forms the two columns:<br>
- time (seconds)
- load (exerted force, in newtons)
<br><br>
My task is to predict the load for cycles 48, 49, and 50.<br><br>
I will derive:
- the duration of each cycle
- the heating time for each cycle
- the cooling time for each cycle
- the number of cycles (I can do this by finding the local maxima and minima throughout the data)


In [41]:
# copy the time column into a list
time = df["Time"].tolist()

# copy the load column into a list
load = df["Load"].tolist()

In [42]:
# convert the time list to a NumPy array for further processing
np_time = np.array(time)

In [43]:
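# for local maxima (named strict_max to avoid shadowing the built-in max)
strict_max = argrelextrema(np_time, np.greater)
print("local maxima array for time is:", strict_max, "\n")

# for local minima (named strict_min to avoid shadowing the built-in min)
strict_min = argrelextrema(np_time, np.less)
print("local minima array for time is:", strict_min)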


local maxima array for time is: (array([], dtype=int64),) 

local minima array for time is: (array([], dtype=int64),)


The arrays above are empty.<br><br>
After further research into SciPy's signal module, I realized that argrelextrema with np.greater or np.less does NOT consider repeated values to be relative extrema (https://github.com/scipy/scipy/issues/3749).<br><br>
A point counts as an extremum only if the strict inequality holds against its neighbors on both sides.
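The difference between the strict and non-strict comparators can be seen on a small made-up array. This is only an illustrative sketch: the `signal` values below are invented and are not taken from the dataset.

```python
import numpy as np
from scipy.signal import argrelextrema

# Toy signal (not from the dataset): a strict peak at index 1
# and a two-sample plateau at indices 3 and 4.
signal = np.array([0.0, 2.0, 0.0, 3.0, 3.0, 1.0])

# Strict comparison: a point must be greater than both neighbors.
strict_maxima = argrelextrema(signal, np.greater)[0]

# Non-strict comparison: ties with a neighbor are accepted.
loose_maxima = argrelextrema(signal, np.greater_equal)[0]

print(strict_maxima)  # [1]      -> the plateau is skipped
print(loose_maxima)   # [1 3 4]  -> both plateau samples are reported
```

With np.greater the plateau is ignored entirely, while np.greater_equal reports every sample on it.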

In [44]:
# for local maxima
max_ = argrelextrema(np_time, np.greater_equal)
print("local maxima array for time is:", max_, "\n")

# for local minima
min_ = argrelextrema(np_time, np.less_equal)
print("local minima array for time is:", min_)

local maxima array for time is: (array([ 31934,  47151,  55544,  58516,  61698,  84166,  87735,  91038,
       104427, 120421, 129657, 133953, 155582, 164994, 170100]),) 

local minima array for time is: (array([     0,  31935,  47152,  55545,  58517,  61699,  84167,  87736,
        91039, 104428, 120422, 129658, 133954, 155583, 164995]),)


I switched the comparators to np.greater_equal and np.less_equal and noticed that no index is repeated within either array, which is a good sign so far.
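One way to interpret these indices is to check what the non-strict comparators return on time stamps that only increase but contain a repeated value. The sketch below uses a hypothetical array `t`, not the real data, so it only suggests a possible explanation: in the output above, every maximum index is immediately followed by a minimum index, which is the pattern a repeated time stamp produces here.

```python
import numpy as np
from scipy.signal import argrelextrema

# Hypothetical time stamps (not from the dataset): increasing,
# except that the value 0.2 is recorded twice.
t = np.array([0.0, 0.1, 0.2, 0.2, 0.3, 0.4])

# Indices i where t[i] == t[i + 1], i.e. where a repeat starts.
repeat_starts = np.flatnonzero(np.diff(t) == 0)

loose_max = argrelextrema(t, np.greater_equal)[0]
loose_min = argrelextrema(t, np.less_equal)[0]

print("repeat starts:", repeat_starts)  # [2]
print("loose maxima: ", loose_max)      # [2 5] -> repeat start and last sample
print("loose minima: ", loose_min)      # [0 3] -> first sample and sample after the repeat
```

If the same holds for the real `np_time`, running `np.flatnonzero(np.diff(np_time) == 0)` would return the maxima indices listed above (except the final one). That is an assumption to verify, not a confirmed result.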

In [45]:
print("The length of the max array for time is:", np.size(max_), "\n")
print("The length of the min array for time is:", np.size(min_), "\n")

The length of the max array for time is: 15 

The length of the min array for time is: 15 



However, it is odd that each array contains only 15 indices, considering there are 47 cycles present in the dataset. So, I will try another method instead, based on numpy.r_ (https://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html).

In [46]:
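# flag each point that is strictly smaller than both of its neighbors
# (the first and last points only have one neighbor to compare against)
row_wise_merging = np.r_[True, np_time[1:] < np_time[:-1]] & np.r_[np_time[:-1] < np_time[1:], True]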

In [47]:
# print size of row_wise_merging
size = np.size(row_wise_merging)
print(size)
print(row_wise_merging)

170101
[ True False False ..., False False False]


In [48]:
# count how many elements of the boolean mask are True
count = np.count_nonzero(row_wise_merging)
print(count)

1


The output of the numpy.r_ comparison does not prove helpful either: only one element is flagged.<br><br>
Since I am not familiar with how to compute local maxima and minima manually, I will need to do further research on that.
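As a starting point for that research, here is a minimal sketch of finding local extrema by comparing each interior point with its two neighbors. The function name `local_extrema` and the sample arrays are hypothetical. The sketch also assumes that the cycles show up as oscillations in the load values rather than in the time values, which has not been verified on this dataset.

```python
import numpy as np

def local_extrema(values):
    """Return (maxima, minima) index arrays using strict neighbor comparison."""
    values = np.asarray(values, dtype=float)
    interior = values[1:-1]
    left, right = values[:-2], values[2:]
    # +1 converts positions in the interior slice back to original indices.
    maxima = np.flatnonzero((interior > left) & (interior > right)) + 1
    minima = np.flatnonzero((interior < left) & (interior < right)) + 1
    return maxima, minima

# Made-up stand-ins for the two columns: evenly spaced time stamps
# and a triangular load that rises and falls three times.
sample_time = np.arange(13) * 0.1
sample_load = np.array([0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2, 1, 0], dtype=float)

time_max, time_min = local_extrema(sample_time)
load_max, load_min = local_extrema(sample_load)

print("time maxima:", time_max)  # empty: time only increases
print("load maxima:", load_max)  # [ 2  6 10] -> one peak per cycle
print("load minima:", load_min)  # [4 8]
```

A steadily increasing array has no interior extrema, while the oscillating array yields one maximum per cycle. Real measurements are noisy, so some smoothing or a minimum peak distance would likely be needed before counting peaks this way.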

For now, I plan to proceed with the other calculations using the array returned from argrelextrema. Once I retrieve the right array for time, I believe I can adapt the algorithm presented below.

In [49]:
# place the indices returned for the local maxima into a list of plain integers
local_max_indices = max_[0].tolist()
print(local_max_indices)

[31934, 47151, 55544, 58516, 61698, 84166, 87735, 91038, 104427, 120421, 129657, 133953, 155582, 164994, 170100]


In [50]:
# for each index in local_max_indices, sum all time values and all
# load values that come before that index
concat_data = []
for idx in local_max_indices:
    concat_data.append((sum(time[:idx]), sum(load[:idx])))

for cycle, (total_time, total_load) in enumerate(concat_data):
    print("Cycle", cycle)
    print("Time:", total_time)
    print("Load:", total_load)
    print("\n")

Cycle 0
Time: 50090330.3853
Load: 886827.10324


Cycle 1
Time: 109224942.15
Load: 1336267.88375


Cycle 2
Time: 151578795.284
Load: 1544432.4858


Cycle 3
Time: 168237161.8
Load: 1634193.77493


Cycle 4
Time: 187035884.677
Load: 1731646.36961


Cycle 5
Time: 348122651.618
Load: 2427467.38261


Cycle 6
Time: 378284414.766
Load: 2573697.19744


Cycle 7
Time: 407315820.051
Load: 2683299.27422


Cycle 8
Time: 536007570.556
Load: 3108964.80223


Cycle 9
Time: 712921430.822
Load: 3680416.99285


Cycle 10
Time: 826598329.026
Load: 4024847.81762


Cycle 11
Time: 882348975.542
Load: 4191389.98264


Cycle 12
Time: 1190793342.17
Load: 5124183.32977


Cycle 13
Time: 1339483600.32
Load: 5513159.5854


Cycle 14
Time: 1423820416.46
Load: 5761623.5985




As mentioned before, the results above are unrealistic, since we know the dataset contains 47 cycles rather than the 15 that were output. Once the local maxima function returns the correct values, I can modify my code to use that array.
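Once correct cycle boundaries are available, the per-cycle quantities could be computed from slices between consecutive boundaries instead of from running sums. The sketch below is an assumption about how that might look: the arrays, the `boundaries` indices, and the mapping of "rise" to heating and "fall" to cooling are all hypothetical.

```python
import numpy as np

# Hypothetical data: three cycles of a triangular load sampled every 0.1 s.
sample_time = np.arange(13) * 0.1
sample_load = np.array([0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2, 1, 0], dtype=float)

# Assumed cycle boundaries: indices of the load minima, including both ends.
boundaries = np.array([0, 4, 8, 12])

cycles = []
for start, end in zip(boundaries[:-1], boundaries[1:]):
    # Index of the highest load inside this cycle.
    peak = start + int(np.argmax(sample_load[start:end + 1]))
    cycles.append({
        "duration": sample_time[end] - sample_time[start],
        "rise_time": sample_time[peak] - sample_time[start],  # assumed to be heating
        "fall_time": sample_time[end] - sample_time[peak],    # assumed to be cooling
        "peak_load": sample_load[peak],
    })

for number, cycle in enumerate(cycles, start=1):
    print("Cycle", number, cycle)
```

Each cycle is described by elapsed time between boundaries, so the values stay on the scale of seconds rather than growing with every cycle.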

My next step is to implement an algorithm that makes the actual predictions.