# Iron-Python Notebook for Processing Raw Hardware Results

This notebook is an all-in-one package for processing the archived results of workload executions stored in specific directories of this repository.
It assumes that all results were captured using the WorkloadExec-v2.py script, so that everything is archived in the expected layout.


## About the fields and their interpretation

Before explaining the fields, it is essential to understand how the data is gathered and where it comes from.

<img src="report/phase-2/rsrc/Experiment-setup.drawio.png" width="50%">

As shown in the image above, there are two data sources in this experiment setup:
1. **ODroid-XU4 hardware:** runs the workload sent by the development PC.
	- This provides two sets of data:
		1. **Perf-stat data**: captured while the workload executes, with a sampling interval of 100 ms. The features (perf counters) being sampled are discussed later.
		1. **Polled data**: captures data from the ODroid board that is not available in perf-list, such as thermal and CPU usage statistics.
1. **SmartPower3 Power Monitor:** provides the total power drawn by the ODroid-XU4 hardware.

### Time Stamp information

Each record from the sources and sampling methods above carries a UTC timestamp, which is used to collate the data.

|     Field Name    |               Description                                                                     |
|-------------------|-----------------------------------------------------------------------------------------------|
|  utctime          |  UTC time field - output of merging the datasets mentioned above.                             |
|  utctime_x        |  UTC Time from perf-stat data added for reference only and is an output of table merging.     |
|  utctime_y        |  UTC Time from power monitor data added for reference only and is an output of table merging. |
|  ts_utc           |  UTC Time from polled data added for reference only and is an output of table merging.        |
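
As a rough illustration of where the `utctime_x` / `utctime_y` columns come from, the sketch below merges two tiny, made-up frames on their timestamp index with `pandas.merge_asof`, the same call this notebook uses further down. The sample values and the 50 ms tolerance are illustrative assumptions, not measurements.

```python
import pandas

# Hypothetical perf-stat samples (left) and power-monitor samples (right).
perf = pandas.DataFrame({
    "utctime": pandas.to_datetime(["2023-03-11 14:58:03.100",
                                   "2023-03-11 14:58:03.200"]),
    "S0-D0-C0_instructions": [808323, 424251],
})
power = pandas.DataFrame({
    "utctime": pandas.to_datetime(["2023-03-11 14:58:03.110",
                                   "2023-03-11 14:58:03.400"]),
    "dev_ippwr-ch1-watt_mW": [2500, 2600],
})

# Index each frame by its timestamp but keep the column, as the notebook does.
perf.index = perf["utctime"]
power.index = power["utctime"]

merged = pandas.merge_asof(
    left=perf, right=power,
    left_index=True, right_index=True,
    direction="nearest",
    tolerance=pandas.Timedelta("50 milliseconds"),
)

# Both inputs carried a 'utctime' column, so pandas suffixes them with _x / _y;
# the index (named 'utctime') holds the timestamps of the left-hand frame.
print(merged.columns.tolist())
print(merged["dev_ippwr-ch1-watt_mW"].tolist())
```

A left-hand row with no right-hand sample inside the tolerance window keeps its own values and gets missing values for the right-hand columns.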



### Perf-stat data 

The columns derived from the perf-stat output of a workload look like the ones listed below.

```console
...
S0-D0-C0_branch-instructions
S0-D0-C0_branch-misses
S0-D0-C0_branch-load-misses
S0-D0-C0_branch-loads
S0-D0-C0_bus-cycles
S0-D0-C0_cpu-cycles
S0-D0-C0_instructions
S0-D0-C0_cache-misses
S0-D0-C0_cache-references
S0-D0-C0_cpu-clock
S0-D0-C0_L1-dcache-load-misses
S0-D0-C0_L1-dcache-loads
S0-D0-C0_L1-dcache-store-misses
S0-D0-C0_L1-dcache-stores
S0-D0-C0_L1-icache-load-misses
S0-D0-C0_L1-icache-loads
S0-D0-C0_LLC-load-misses
S0-D0-C0_LLC-loads
S0-D0-C0_LLC-store-misses
S0-D0-C0_LLC-stores
...
...
```

These follow a naming template like so:
```
<Silicon-#>-<Die-#>-<Core-#>_<perf-field-name>
```
So a field-name prefix of S0-D0-C0 refers to the first core (C0) of the first cluster (S0). Similarly, you will find
- S0-D0-C0 ~ S0-D0-C3
- S1-D0-C0 ~ S1-D0-C3

where S0 and S1 correspond to the Little and Big clusters respectively, each with cores C0~C3.
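
A minimal sketch of how such a column name could be split back into its parts. The helper name `parse_perf_column` and the `CLUSTER_NAMES` table are hypothetical and not part of this notebook; the cluster mapping simply restates the S0/S1 description above.

```python
import re

# Pattern for '<Silicon-#>-<Die-#>-<Core-#>_<perf-field-name>'.
COLUMN_RE = re.compile(r"^S(\d+)-D(\d+)-C(\d+)_(.+)$")

# Cluster mapping as described above: S0 is the Little cluster, S1 the Big one.
CLUSTER_NAMES = {0: "Little", 1: "Big"}

def parse_perf_column(column: str) -> dict:
    """Split a perf-stat column name into its prefix parts and event name."""
    match = COLUMN_RE.match(column)
    if match is None:
        raise ValueError(f"Not a per-core perf-stat column: {column!r}")
    silicon, die, core, event = match.groups()
    return {
        "silicon": int(silicon),
        "die": int(die),
        "core": int(core),
        "cluster": CLUSTER_NAMES.get(int(silicon), "unknown"),
        "event": event,
    }

print(parse_perf_column("S1-D0-C2_L1-dcache-load-misses"))
```

Columns that do not follow the template, such as `utctime`, raise a `ValueError`, which makes it easy to filter them out.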

More info: https://man7.org/linux/man-pages/man1/perf-stat.1.html#:~:text=%2D%2Dper%2Dcore%20Aggregate%20counts,(system%2Dwide). (search for --per-core)


Now on to the field names that follow the prefix:

|        Field Name          |               Description                             |
|----------------------------|-------------------------------------------------------|
|  branch-instructions       |  Number of branch instructions executed               |
|  branch-misses             |  Number of branches mispredicted                      |
|  branch-load-misses        |  Number of branch load misses                         |
|  branch-loads              |  Number of branch loads                               |
|  bus-cycles                |  Number of bus cycles counted                         |
|  cpu-cycles                |  Number of CPU cycles counted                         |
|  instructions              |  Number of instructions executed                      |
|  cache-misses              |  Number of cache misses                               |
|  cache-references          |  Number of Cache references                           |
|  cpu-clock                 |  CPU time used (fractional value, typically in msec)  |
|  L1-dcache-load-misses     |  Number of L1 Data Cache load misses                  |
|  L1-dcache-loads           |  Number of L1 Data cache loads                        |
|  L1-dcache-store-misses    |  Number of L1 Data Cache store misses                 |
|  L1-dcache-stores          |  Number of L1 Data cache stores                       |
|  L1-icache-load-misses     |  Number of L1 instruction cache load misses           |
|  L1-icache-loads           |  Number of L1 instruction cache loads                 |
|  LLC-load-misses           |  Number of last-level cache (L2) load misses          |
|  LLC-loads                 |  Number of last-level cache (L2) loads                |
|  LLC-store-misses          |  Number of last-level cache (L2) store misses         |
|  LLC-stores                |  Number of last-level cache (L2) stores               |
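
Raw counters like these are often easier to compare as ratios. The sketch below derives instructions-per-cycle and a cache miss rate for one core from a single row. The row values and the `safe_ratio` helper are hypothetical; this notebook itself does not compute these metrics.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

# One hypothetical merged row, restricted to a single core prefix.
sample = {
    "S0-D0-C0_instructions": 808323,
    "S0-D0-C0_cpu-cycles": 1616646,
    "S0-D0-C0_cache-misses": 1200,
    "S0-D0-C0_cache-references": 48000,
}

prefix = "S0-D0-C0"
ipc = safe_ratio(sample[f"{prefix}_instructions"], sample[f"{prefix}_cpu-cycles"])
cache_miss_rate = safe_ratio(sample[f"{prefix}_cache-misses"],
                             sample[f"{prefix}_cache-references"])

print(f"IPC: {ipc:.3f}, cache miss rate: {cache_miss_rate:.3%}")
```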


### Polled data 

These fields are captured by polling the sysfs files (/sys/devices/virtual/thermal/thermal_zone[0~3]/temp) at a regular interval while the workload is executing.

|        Field Name    |               Description                             |
|----------------------|-------------------------------------------------------|
|  therm_cpu0          |  Thermal reading reported by CPU thermal region#0     |
|  therm_cpu1          |  Thermal reading reported by CPU thermal region#1     |
|  therm_cpu2          |  Thermal reading reported by CPU thermal region#2     |
|  therm_cpu4          |  Thermal reading reported by CPU thermal region#3     |

**Note/Correction**: therm_cpu4 is mislabeled; it actually holds the values of thermal_zone3.
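
The polling of the sysfs files described in this section could look roughly like the sketch below. The function name `read_thermal_zones` is hypothetical and this is not the capture script used for the experiments; it also labels the zones 0 to 3 directly, so the `therm_cpu4` naming quirk does not appear. The Linux thermal sysfs interface reports `temp` in millidegrees Celsius, which the sketch converts to degrees; whether the archived polled data applies the same conversion is not stated here.

```python
from pathlib import Path

# Default location taken from the text above; other boards may differ.
THERMAL_ROOT = Path("/sys/devices/virtual/thermal")

def read_thermal_zones(root: Path = THERMAL_ROOT, zones=range(4)) -> dict:
    """Return {'therm_cpu<N>': temperature in degrees Celsius} per readable zone."""
    readings = {}
    for zone in zones:
        temp_file = root / f"thermal_zone{zone}" / "temp"
        try:
            raw = temp_file.read_text().strip()
            # The kernel reports the value in millidegrees Celsius.
            readings[f"therm_cpu{zone}"] = int(raw) / 1000.0
        except (OSError, ValueError):
            # Zone missing or unreadable (e.g. when run off the board): skip it.
            continue
    return readings

print(read_thermal_zones())
```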


### Power Monitor data 

|        Field Name          |               Description                                                                        |
|----------------------------|--------------------------------------------------------------------------------------------------|
|  dev_ippwr-ch1-volts_mV    |  Voltage (mV) supplied on channel-1 of the SmartPower3 (SM3), to which the ODroid-XU4 is connected. |
|  dev_ippwr-ch1-ampere_mA   |  Current (mA) drawn by the connected board from channel-1.                                       |
|  dev_ippwr-ch1-watt_mW     |  Power (mW) drawn by the connected board from channel-1.                                         |
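
Since the three fields are related by P = V x I, a quick consistency check is possible. The sketch below is only an illustration: the row values, the helper names and the 5% tolerance are assumptions, and the monitor's own rounding may call for a different threshold.

```python
def expected_power_mw(volts_mv: float, ampere_ma: float) -> float:
    """Power in mW from voltage in mV and current in mA."""
    # mV * mA gives microwatts; divide by 1000 to get milliwatts.
    return volts_mv * ampere_ma / 1000.0

def power_is_consistent(row: dict, rel_tol: float = 0.05) -> bool:
    """Check the reported power against V*I within a relative tolerance."""
    expected = expected_power_mw(row["dev_ippwr-ch1-volts_mV"],
                                 row["dev_ippwr-ch1-ampere_mA"])
    reported = row["dev_ippwr-ch1-watt_mW"]
    return abs(reported - expected) <= rel_tol * max(abs(expected), 1.0)

# Hypothetical power-monitor record.
row = {
    "dev_ippwr-ch1-volts_mV": 5000,
    "dev_ippwr-ch1-ampere_mA": 500,
    "dev_ippwr-ch1-watt_mW": 2500,
}
print(expected_power_mw(5000, 500), power_is_consistent(row))
```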



## Imports

In [1]:

# General Packages
import os
import sys
import glob
import io
import shutil
## For handling BZ archive of result files
import requests
import tempfile
import tarfile

# For data processing 
import datetime
import re
import csv
import pandas


## Utility Functions

In [2]:
def extract_data(data_archive: str, tmp_working_dir) -> str:
    # Accept either a tempfile.TemporaryDirectory object or a plain path string;
    # TemporaryDirectory exposes its path through the .name attribute, so there
    # is no need to parse its string representation.
    tmpfilePath = getattr(tmp_working_dir, 'name', tmp_working_dir)
    # print('Temporary directory used: '+tmpfilePath+'/')

    # Extract files
    with tarfile.open(data_archive, 'r:bz2') as archive:
        # Assumption: the archive stores everything under a single top-level
        # directory rather than bare files, so the first member is that directory.
        # e.g.: <TarInfo '11-03-2023_14-58-03_BigCore-10itr-50msSmplg-CPUFreq-2.0GHz' at 0x7fbc676ecd00>
        dirname = archive.getmembers()[0].name
        archive.extractall(tmpfilePath)
    return os.path.join(tmpfilePath, dirname)

## Function to process Perf-stat output
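
To make the regular expressions defined in this section easier to follow, the sketch below applies copies of the record and count patterns to one sample line. The sample line is a hypothetical approximation of an interval record from `perf stat --per-core`; the exact spacing and trailing annotation in real output may differ.

```python
import re

# Copies of the record-level and count-level patterns from this notebook.
record_re = re.compile(r"""^\s+
    (\d+\.\d+)\s+        # Timestamp (seconds since perf started)
    (\S+)\s+(\d+)\s+     # CPU-ID (e.g. S0-D0-C0), number of CPUs
    (.*)                 # counts, [unit], event, event-info
    """, re.X)
count_re = re.compile(r"""
    (\d+[.\d]*)\s+       # counts
    (.*)                 # units or event-name
    [#]+(.*)             # Event extra info
    """, re.X)

# Hypothetical sample record.
sample = ("     0.100123456 S0-D0-C0           1             808323"
          "      instructions              #    0.45  insn per cycle")

record = record_re.match(sample)
timestamp, cpu_id, cpu_count, rest = record.groups()

sub = count_re.match(rest)
count = sub.group(1)
event = sub.group(2).split()[-1]   # last token before '#' is the event name

print(timestamp, cpu_id, cpu_count, count, event)
```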

In [3]:
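# Define regular expressions to match the lines with data.
# The bracketed numbers used below refer to the kinds of lines in a perf-stat output file:
#   [1] '# started on ...' timestamp line     [2] '#' column-header line
#   [3] per-interval sample records           [4] summary header at the end of the file
# Regular expression for [1] & [2], and associated result variable(s)
re_1_2_firstlevel = re.compile(r"""
                                ^    # Line start
                                [#]+ # Immediate occurrence of '#'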
                                .*   # Anything following that
                            """, re.X)
re_1_timestamp_info = re.compile(r"""
                                ^[#]+                 # Line Start with # character
                                \s+started\s+on\s+    # This string will be present before timestamp
                                (.*)                  # Time of perf start
                                """, re.X)


# Regular expression for [4]
re_4_summary_header = re.compile(r"""^\s+
                                Performance\scounter\sstats\sfor\s
                                \'(.*)\'\s+              # Type of monitoring in perf
                                \((\d+)\sruns\):
                                """,re.X)

# Regular expression for [3]
re_3_perf_stat_record = re.compile(r"""^\s+
                                (\d+\.\d+)\s+          # Timestamp
                                (\S+)\s+(\d+)\s+       # CPU-ID, CPUs
                                (.*)                   # counts, [unit], events, event-info
                                """, re.X)
re_3_subrec_countfield_nc = re.compile(r"""
                                    <not\scounted>\s+     # Is it a <not counted field> ?
                                    (.*)
                                    """, re.X)
re_3_subrec_countfield = re.compile(r"""
                                    (\d+[.\d]*)\s+        # counts
                                    (.*)              # units or event-name
                                    [#]+(.*)                  # Event extra info
                                    """, re.X)

In [4]:
def Process_ProfFile(proffile:str, outcsvfile:str) -> None:
    with io.open(proffile, 'rt') as  proffile_entry:
        
        stats_gatherer = {}
        perf_stat_starttime = ''
        perf_stat_starttime_dateobj = None
        perf_stat_currtime_dateobj = None
        perf_stat_columns = []
        perf_stat_runtype=''
        perf_stat_runcount=0


        records_ended = False
        record_counter = 0
        prev_timestamp =''
        rec_timestamp = rec_cpuid = rec_cpucnt = rec_counter = rec_eventname = ''
        csvout = open(outcsvfile, 'w', newline='')  # newline='' as recommended by the csv module
        csvwriter = None
        #Decoding the prof/perf-stat file data
        for line in proffile_entry.readlines():
            # print (str(len(line))+':'+line)

            ## Okay so, its a line starting with # character
            if re_1_2_firstlevel.match(line):
                # is it a time stamp info line??
                re_1_match = re_1_timestamp_info.match(line)
                if (re_1_match):
                    perf_stat_starttime = re_1_match.group(1)
                    perf_stat_starttime_dateobj = datetime.datetime.strptime(perf_stat_starttime,'%a %b %d %H:%M:%S %Y')
                    perf_stat_starttime_pddateobj = pandas.Timestamp(perf_stat_starttime_dateobj, unit='ns')
                    perf_stat_currtime_dateobj = datetime.datetime.strptime(perf_stat_starttime,'%a %b %d %H:%M:%S %Y')
                    perf_stat_currtime_pddateobj = pandas.Timestamp(perf_stat_currtime_dateobj)
                    # print('Start Time:'+str(perf_stat_currtime_pddateobj))
                else:
                    # Then it must be a column header info line
                    perf_stat_columns = line[1:].split()
            else:
                # So its lines other than meta info ones ([1] & [2])
                # just to skip the last few records as we only need the perf-stat samples, not the summary at the end
                if (records_ended == False):
                    re_4_match = re_4_summary_header.match (line)
                    re_3_match = re_3_perf_stat_record.match(line)
                
                    if (re_4_match):
                        perf_stat_runtype  = re_4_match.group(1)
                        perf_stat_runcount = int(re_4_match.group(2))
                        records_ended = True
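                        # We have reached the summary section, so write the last record
                        record_counter += 1
                        if csvwriter is None:
                            # Only one sample was seen, so the header has not been written yet
                            csvwriter = csv.DictWriter(csvout, stats_gatherer.keys())
                            csvwriter.writeheader()
                        csvwriter.writerow(stats_gatherer)
                        stats_gatherer.clear()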
                        
                    elif (re_3_match):
                        rec_timestamp = re_3_match.group(1)
                        rec_cpuid     = re_3_match.group(2)
                        rec_cpucnt    = re_3_match.group(3)


                        if (prev_timestamp==''):
                            prev_timestamp = rec_timestamp
                            ns = perf_stat_starttime_pddateobj+pandas.Timedelta(seconds=float(rec_timestamp))
                            # print('PDTime:'+str(ns)+' subsec: '+rec_timestamp)
                            stats_gatherer['utctime'] = ns

                        if (prev_timestamp != rec_timestamp):
                            ## Okay, we are encountering a new record, so dump the old one
                            record_counter += 1
                            if (record_counter == 1):
                                csvwriter =  csv.DictWriter(csvout, stats_gatherer.keys())
                                csvwriter.writeheader()
                            csvwriter.writerow(stats_gatherer)
                            stats_gatherer.clear()
                            prev_timestamp = rec_timestamp

                            ns = perf_stat_starttime_pddateobj+pandas.Timedelta(seconds=float(rec_timestamp))
                            # print('PDTime:'+str(ns)+' subsec: '+rec_timestamp)
                            stats_gatherer['utctime'] = ns

                        re3__ncsubrec_match = re_3_subrec_countfield_nc.match(re_3_match.group(4))

                        if (re3__ncsubrec_match):
                            trec = re3__ncsubrec_match.group(1)
                            # print (rec_cpuid + ' NC - '+trec)
                            stats_gatherer[rec_cpuid+'_'+trec] = 'NaN'
                            pass
                        else:

                            re_3_subrec_match = re_3_subrec_countfield.match(re_3_match.group(4))
                            if (re_3_subrec_match):
                                split_str =re_3_subrec_match.group(2).split() 
                                # print('event: '+re_3_subrec_match.group(1)+'-'+ split_str[len(split_str)-1] )
                                rec_counter = re_3_subrec_match.group(1)
                                rec_eventname = split_str[len(split_str)-1]
                            else:
                                ## Special cases, some instructions stats count seems to be having no '#' separator
                                ## Handle  it separately. e.g:
                                ##     808323      instructions                                                         (3.64%)
                                ##     52451588      instructions                                                         (8.98%)
                                ##     424251      instructions                                                         (8.92%)
                                ##     394060      instructions                                                         (8.86%)
                                # Split into tokens: <count> <event-name> [<percentage>]
                                tstr = re_3_match.group(4).split()
                                assert len(tstr) >= 2,'Some invalid records structure found, please check!!!'
                                rec_counter   = tstr[0]
                                rec_eventname = tstr[1]
                                # print(re_3_match.group(4))

                            if (rec_eventname == 'cpu-clock'):
                                stats_gatherer[rec_cpuid+'_'+rec_eventname] = float(rec_counter)
                            else:
                                stats_gatherer[rec_cpuid+'_'+rec_eventname] = int(rec_counter)
                                
                else:
                    ## For now, we are only going to process the individual records, not the summary
                    ## So we can very well break out of the loop
                    break

        # Flush and close the intermediate CSV file before it is read back
        csvout.close()


        # print ('Start time: '+ perf_stat_starttime)
        # print ('Columns: '+ str(perf_stat_columns))
        # print ('runs type: '+ perf_stat_runtype)
        # print ('runs count: '+ str(perf_stat_runcount) )
        # print ('Total Records: '+ str(record_counter) )
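
The `utctime` value of each record is reconstructed by adding the per-sample offset (seconds since perf started) to the start time taken from the `# started on ...` header line. A minimal sketch of that arithmetic, using a made-up start time and offset:

```python
import datetime
import pandas

# Hypothetical header value and sample offset.
start_text = "Sat Mar 11 14:58:03 2023"
offset_text = "0.250000000"

# Same format string as used in Process_ProfFile.
start = pandas.Timestamp(
    datetime.datetime.strptime(start_text, "%a %b %d %H:%M:%S %Y"))

sample_time = start + pandas.Timedelta(seconds=float(offset_text))
print(sample_time)
```

Note that the header only has one-second resolution, so every reconstructed timestamp inherits that uncertainty in its absolute value.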



## Generating the combined dataset per test result

In [108]:
def ProcessTestResults( data_perfstat_file:str,
                        data_power_file:str,
                        data_poll_file:str,
                        outputcsvfile:str,      # Absolute/Full path to output CSV file
                        tolerance_perf_to_power = pandas.Timedelta('50 milliseconds'),   # Tolerance value in UTC-time to merge perf-stat and SmartPower3 data
                        tolerance_perf_to_poll  = pandas.Timedelta('5 seconds'),
                        with_prof_results: bool = True
                       ) -> None:
    # Forming file names
    # data_prof = os.path.join(resultsdir,testnameprefix+'.prof')
    # data_power    = os.path.join(resultsdir,testnameprefix+'.powdata')
    # data_polldata = os.path.join(resultsdir,testnameprefix+'.polldata')
    if with_prof_results:
        temp_data_prof_csv  = data_perfstat_file+'.csv'

        # Process the perf-stat file to generate intermediate CSV file for further processing
        Process_ProfFile(data_perfstat_file, temp_data_prof_csv)

        # Now read all the CSV files
        prof_df = pandas.read_csv(temp_data_prof_csv)

    power_df = pandas.read_csv(data_power_file)
    poll_df = pandas.read_csv(data_poll_file)

    # Drop unnecessary columns that do not influence the current test scenarios
    power_df.drop([
                    'sm_mstime', 'localtime',
                    'ps_ippwr-volts_mV', 'ps_ippwr-ampere_mA', 'ps_ippwr-watt_mW', 'ps_ippwr-status_b',
                    'dev_ippwr-ch0-volts_mV', 'dev_ippwr-ch0-ampere_mA', 'dev_ippwr-ch0-watt_mW', 'dev_ippwr-ch0-status_b', 'dev_ippwr-ch0-interrupts',
                    'dev_ippwr-ch1-status_b', 'dev_ippwr-ch1-interrupts',
                    'crc8-2sc','crc8-xor'
                    ],
                axis=1, 
                inplace=True)
    poll_df.drop( [
                    'ts_local', 
                    'stat_cpuid', 'stat_user', 'stat_nice', 'stat_system', 'stat_idle', 'stat_iowait',
                    'stat_irq', 'stat_softirq', 'stat_steal', 'stat_guest', 'stat_guest_nice'
                ],
                axis=1, 
                inplace=True)

    # print ('Perf/Profile Data:')
    # display(prof_df)

    # print ('Power data:')
    # display(power_df)

    # print('Polled data')
    # display(poll_df)
    if with_prof_results:
        prof_df["utctime"] = pandas.to_datetime(prof_df['utctime'])
        prof_df.index = prof_df["utctime"]

    power_df["utctime"] = pandas.to_datetime(power_df['utctime'])
    power_df.index = power_df["utctime"]

    poll_df["ts_utc"] = pandas.to_datetime(poll_df['ts_utc'])
    poll_df.index = poll_df["ts_utc"]

    if with_prof_results:
        df = pandas.merge_asof(left=prof_df, right=power_df,right_index=True,left_index=True,direction='nearest',tolerance=tolerance_perf_to_power)
        df = pandas.merge_asof(left=df, right=poll_df,right_index=True,left_index=True,direction='nearest',tolerance=tolerance_perf_to_poll)
    else:
        df = pandas.merge_asof(left=power_df, right=poll_df,right_index=True,left_index=True,direction='nearest',tolerance=tolerance_perf_to_poll)


    pandas.set_option('display.max_columns', None)
    pandas.set_option('display.expand_frame_repr', False)

    # print('Merged data')
    # display(df)

    print('Saving Merged data to CSV location: '+ outputcsvfile)
    df.to_csv(outputcsvfile,header=True)


## Main body to Iterate and process each results set
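
The loop in this section derives each output sub-directory from the archive file name using `regex_filename`. The sketch below shows what the three capture groups hold for one archive name taken from the run log in this notebook.

```python
import re

# Same pattern as used in the main loop.
regex_filename = re.compile(r'([\d-]+)_([\d-]+)_(.*GHz)\.tar\.bz2')

name = ('11-14-2023_08-30-00_Idleworkload-MaxFan-LittleCore-60sidle-'
        'CPUFreq-0.7GHz.tar.bz2')

match = regex_filename.match(name)
date_part, time_part, test_name = match.groups()

print(date_part)   # capture date
print(time_part)   # capture time
print(test_name)   # used as the output sub-directory name
```

An archive whose name does not end in `GHz.tar.bz2` makes `match` return `None`, so indexing the result as the loop does would raise a `TypeError`.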

In [109]:
# Directories to process (relative to results/); they are converted to absolute paths below
__results_archive_dirs__ = [
        '01-Simple-Idling/MaxFan',
        '01-Simple-Idling/NoFan',
        # '02-Idling-PerfSleep/MaxFan',
        # '02-Idling-PerfSleep/NoFan',
        # '03-Workloads'

    ]
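# One flag per directory listed above, in the same order (including the commented-out ones):
# True when the archives in that directory contain perf-stat (.prof) files
__results_archive_dirs_profflags__ = [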
        False,
        False,
        True,
        True,
        True,
    ]


__output_subdir__ = 'combined_dataset'
__output_dirs__ = []
regex_filename = re.compile(r'([\d-]+)_([\d-]+)_(.*GHz)\.tar\.bz2')
results_archive_dirs = []
for dir in __results_archive_dirs__:
    results_archive_dirs.append(os.path.abspath('results/'+dir))
    __output_dirs__.append(os.path.abspath(__output_subdir__+'/'+dir))

# Now iteratively process the archives and create combined data set for each
total_dirs = len(results_archive_dirs)
curr_dir = 1
for curr_archive_dir in results_archive_dirs:
    print ('Processing ('+str(curr_dir)+'/'+str(total_dirs)+'): '+ curr_archive_dir)
    output_dir = __output_dirs__[curr_dir-1]
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    

    __res_archives = glob.glob(curr_archive_dir+'/*.tar.bz2')
    for archives in __res_archives:
        print ('Archive: '+ archives)
        output_subdir = os.path.join(output_dir,regex_filename.match(os.path.basename(archives))[3])
        print ('Output to be placed in : '+ output_subdir)
        if not os.path.exists(output_subdir):
            os.makedirs(output_subdir)


        #Temporary files/directories for handling data
        tmpdir = tempfile.TemporaryDirectory()
        # print(tmpdir)
        
        data_dir = extract_data(archives,tmpdir)
        # print ('Temporary extraction to : '+data_dir)

        if (__results_archive_dirs_profflags__[curr_dir-1]):
            __raw_results = glob.glob(data_dir+'/*.prof')
        else:
            __raw_results = glob.glob(data_dir+'/*.powdata')
        filecount = len(__raw_results)
        tctr = 1
        print('Number of results to process: '+str(filecount))
        for raw_result in __raw_results:
            __prefix_ = os.path.splitext(raw_result)[0]
            __basename__ = os.path.basename(raw_result)
            __outfile__ = os.path.join(output_subdir,__basename__+'.csv')

            pow_data_file = __prefix_+'.powdata'
            poll_data_file = __prefix_+'.polldata'
            # try:
            print ('Processing  File ('+str(tctr)+'/'+str(filecount)+'): '+ __basename__)
            # print (raw_result, pow_data_file, tctr)
            ProcessTestResults(data_perfstat_file=raw_result,
                            data_power_file=pow_data_file,
                            data_poll_file=poll_data_file,
                            outputcsvfile=__outfile__,
                            with_prof_results=__results_archive_dirs_profflags__[curr_dir-1]
                            )
            # except ValueError as e:
            #     print ('Invalid data entry found in one of the files for '+str(raw_result))
            tctr += 1
    curr_dir += 1


Processing (1/2): /home/vaisakh/developer/modeling/E0240_ModSim-PowerMon/results/01-Simple-Idling/MaxFan
Archive: /home/vaisakh/developer/modeling/E0240_ModSim-PowerMon/results/01-Simple-Idling/MaxFan/11-14-2023_08-30-00_Idleworkload-MaxFan-LittleCore-60sidle-CPUFreq-0.7GHz.tar.bz2
Output to be placed in : /home/vaisakh/developer/modeling/E0240_ModSim-PowerMon/combined_dataset/01-Simple-Idling/MaxFan/Idleworkload-MaxFan-LittleCore-60sidle-CPUFreq-0.7GHz
Number of results to process: 1
Processing  File (1/1): Idling.powdata
Saving Merged data to CSV location: /home/vaisakh/developer/modeling/E0240_ModSim-PowerMon/combined_dataset/01-Simple-Idling/MaxFan/Idleworkload-MaxFan-LittleCore-60sidle-CPUFreq-0.7GHz/Idling.powdata.csv
Archive: /home/vaisakh/developer/modeling/E0240_ModSim-PowerMon/results/01-Simple-Idling/MaxFan/11-13-2023_10-25-19_Idleworkload-MaxFan-BigCore-60sidle-CPUFreq-1.8GHz.tar.bz2
Output to be placed in : /home/vaisakh/developer/modeling/E0240_ModSim-PowerMon/combined_da

In [None]:
# Compress the combined_dataset
source_dir = 'combined_dataset'
with tarfile.open('combined_dataset.tar.bz2', "w:bz2") as tar:
    tar.add(source_dir, arcname=os.path.basename(source_dir))
#shutil.rmtree(os.path.abspath(__output_subdir__))