# Java Flight Recorder vs. GC log analysis
### Ellis Brown  (6/7/2021)

#### This notebook highlights the timing discrepancies between the GC pause times shown for a Java Flight Recorder (JFR) recording in Zulu Mission Control and the pause times reported in the GC log.

> Important files:
> - zulu_output_process: parses a CSV file with a regex to extract the GC pause times of a JFR recording inspected in Zulu Mission Control
> - process_log: handles parsing and analysis of a GC log file
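
The comparison cell in this notebook notes that Flight Recorder reports a concurrent pause/recycle pair as one pause, while the log parser reports two, and suggests combining adjacent entries as one possible fix. Below is a minimal sketch of that idea, not the author's implementation. It assumes that `pause_table[-1]` is a list of labels parallel to the durations in `pause_table[-2]`, and that the labels `1` and `2` mark the two halves of a pair. The function name `merge_paired_pauses` and its parameters are hypothetical.

```python
def merge_paired_pauses(durations, kinds, pair=(1, 2)):
    """Sum adjacent pauses whose labels are the two members of `pair`.

    Hypothetical helper: `durations` and `kinds` are assumed to be
    parallel lists, and `pair` holds the two labels that Flight
    Recorder is assumed to report as a single pause.
    """
    merged = []
    idx = 0
    while idx < len(durations):
        # A pair is two neighbouring entries labelled 1 then 2, or 2 then 1.
        is_pair = (
            idx + 1 < len(durations)
            and {kinds[idx], kinds[idx + 1]} == set(pair)
        )
        if is_pair:
            merged.append(float(durations[idx]) + float(durations[idx + 1]))
            idx += 2  # both halves consumed
        else:
            merged.append(float(durations[idx]))
            idx += 1
    return merged
```

Under those assumptions, `mine = merge_paired_pauses(pause_table[-2], pause_table[-1])` would produce a list that can be compared index by index with the Flight Recorder values. Pairing is greedy from the left, so a run such as 1, 2, 1 merges the first two entries and leaves the third alone.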




In [2]:
# Compare the pause times parsed from the GC log with those reported by Java/Zulu Flight Recorder
from scripts import zulu_output_process as zul
from scripts import process_log as pl
import math 
pl.setLogPath("datasets/long_amzn_workload.log") # set the file to analyze
pl.setLogSchema(0)                               # set the log schema (default 0)
# helper function: strip the trailing newline and convert the value to float
# (strip() also handles a final line that has no newline)
def remove_last_char(item):
    return float(item.strip())

# Access the pause information
pause_table = pl.getYoungPauses(create_csv = False)

# Extract the column that has the pause duration
mine = pause_table[-2] # TODO: remove dependency on index information

# Access the pause information from flight recorder
with open("datasets/zulu_pauses_jfr.csv") as reader:
    zulu = reader.readlines()

# strip the trailing '\n' from each line and convert the value to float
zulu = list(map(remove_last_char, zulu))

# calculate differences
## IMPORTANT NOTE: Zulu Flight Recorder counts a concurrent pause/recycle as ONE pause; process_log counts it as TWO.
## Handle this case before drawing conclusions from the data.
# Possible solution 1: if pause_table[-1] shows the pattern 1 then 2, or 2 then 1, combine those values in 'mine'
# Possible solution 2: temporarily modify "process_log.py" to stop collecting one of the two pauses
table = []
sum_zulu = 0
sum_mine = 0
for idx in range(min(len(zulu),len(mine))):
    difference = float(zulu[idx]) - float(mine[idx])
    if (difference > 1):
        print("Difference: ", difference, end="")
        print("\tIndex : ", idx)
    table.append([mine[idx], zulu[idx], difference ])
    sum_zulu += zulu[idx]
    sum_mine += mine[idx]
    
print("Sum zulu: \t", round(sum_zulu, 4), "\n")
print("Sum mine: \t", round(sum_mine, 4), "\n")
print("Difference : \t", round(sum_zulu - sum_mine, 4), "\n")
print("(Difference / zulu) * 100: \t", round(100 * ((sum_zulu - sum_mine) / sum_zulu), 4), "\n")
print("Number of pauses", len(zulu),"\n")

ma = [val[2] for val in table]
print("Max difference in recorded times: ", round(max(ma), 4), "\n")

# mean of the absolute differences over all compared pauses
print("Average time difference (absolute value)", round(sum(abs(val) for val in ma) / len(ma), 4))

    
print("\n\n| Mine       | Zulu | Difference (ms)")

for line in table:
    print(str(line[0]) +" ", "\t", line[1], "\t", line[2])




Difference:  1.3130000000000024	Index :  43
Difference:  79.749	Index :  141
Difference:  22.35	Index :  145
Difference:  17.736000000000004	Index :  147
Difference:  14.058999999999997	Index :  148
Difference:  6.736999999999995	Index :  149
Difference:  1.8850000000000051	Index :  150
Difference:  10.563999999999993	Index :  151
Difference:  9.272999999999996	Index :  152
Difference:  2.9609999999999985	Index :  153
Difference:  1.9969999999999999	Index :  155
Difference:  4.462999999999994	Index :  156
Difference:  8.254000000000005	Index :  157
Difference:  4.670999999999992	Index :  162
Difference:  2.0090000000000003	Index :  163
Difference:  15.837000000000003	Index :  166
Difference:  6.456999999999994	Index :  173
Difference:  6.051000000000002	Index :  175
Difference:  4.022000000000006	Index :  177
Difference:  2.010000000000005	Index :  180
Difference:  4.612000000000009	Index :  182
Difference:  7.984999999999999	Index :  184
Difference:  2.8880000000000052	Index :  186
Di