# Import Log and License Data

**Contains:**
- *Import Log Data*:  
Import log data into a pandas DataFrame. Time data is converted to the pandas Timestamp format.<br><br>     

- *Import License Info*:  
Import license data, which links license IDs to associated toolboxes.<br><br>      

- *Example: Access data in dataframe* 

In [34]:
# Import libraries
import pandas as pd  # Provides pandas "DataFrame"
import os            # OS specific commands for file import
import pickle        # Save and load data

## Import Log Data
Imports every data file with the configured extension from a folder, collects the contents in a single pandas DataFrame, and saves that DataFrame as a pickle file.

*Load dataframe*:  
`dataset = pd.read_pickle(saveName)`
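As a minimal sketch of the save/load round trip, the following writes a small DataFrame to a pickle file and reads it back. The file name and the sample row are illustrative only. The point is that pickling keeps the Timestamp column as a datetime column, so no conversion is needed after loading.

```python
import os
import tempfile

import pandas as pd

# Illustrative sample row; values chosen to resemble a log entry
sample = pd.DataFrame({
    'License Number': [102],
    'Start time': pd.to_datetime(['25.05.2020 08:30'], format='%d.%m.%Y %H:%M'),
    'Toolbox': ['Matlab'],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.pkl')  # hypothetical file name
    sample.to_pickle(path)
    restored = pd.read_pickle(path)

# The restored frame matches the original, including the datetime dtype
print(restored.equals(sample))
print(restored['Start time'].dtype)
```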

In [40]:
# =====================
# PARAMETER SETTING
# =====================

# Directory containing datafiles
directory = os.path.join(os.getcwd(), "data")

# File extension of datafiles
fileExtension = ".csv"

# Storage name (load with pd.read_pickle(saveName))
saveName = "myDataFrame.pkl"

In [41]:
# -------------
# Import data
# -------------

# Column names of the log files (the files have no header row)
keyNames = ['License Number', 'Version', 'Start time', 'End time', 'Toolbox']

# Read every log file in the directory
frames = []
for file in os.listdir(directory):
    if file.endswith(fileExtension):
        frames.append(pd.read_csv(os.path.join(directory, file), names=keyNames, header=None))

# DataFrame.append was removed in pandas 2.0; concatenate once instead
dataset = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame(columns=keyNames)

# Convert times to Pandas Timestamps        
dataset['Start time'] = pd.to_datetime(dataset['Start time'], format='%d.%m.%Y %H:%M')
dataset['End time'] = pd.to_datetime(dataset['End time'], format='%d.%m.%Y %H:%M')        
        
# Save data
dataset.to_pickle(saveName)

# Show first 10 entries of dataframe
dataset.head(10)

,License Number,Version,Start time,End time,Toolbox
0,102,1.0,2020-05-25 08:30:00,2020-05-25 09:30:00,Matlab
1,102,1.0,2020-05-25 10:15:00,2020-05-25 10:30:00,Matlab
2,102,1.0,2020-05-27 07:00:00,2020-05-27 17:30:00,Matlab
3,101,1.0,2020-05-24 10:15:00,2020-05-24 10:30:00,Matlab
4,101,1.0,2020-05-24 17:15:00,2020-05-24 18:00:00,Matlab
5,101,1.0,2020-05-25 11:00:00,2020-05-25 15:00:00,Matlab
6,101,1.0,2020-05-25 11:15:00,2020-05-25 12:03:00,Simulink
7,101,1.0,2020-05-28 09:07:00,2020-05-29 11:23:00,Matlab
8,105,1.0,2020-05-24 10:08:00,2020-05-24 17:02:00,Matlab
9,105,1.0,2020-05-26 14:05:00,2020-05-26 15:44:00,Matlab
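The time conversion can be checked in isolation. The sketch below assumes a raw log line with day-first timestamps such as `25.05.2020 08:30`, which is inferred from the format string `'%d.%m.%Y %H:%M'`. The actual layout of the log files is not shown here, so the sample line is hypothetical.

```python
import io

import pandas as pd

keyNames = ['License Number', 'Version', 'Start time', 'End time', 'Toolbox']

# Hypothetical raw log content (assumed layout: no header, day-first times)
raw = io.StringIO("102,1.0,25.05.2020 08:30,25.05.2020 09:30,Matlab\n")
log = pd.read_csv(raw, names=keyNames, header=None)

# read_csv leaves the time columns as strings; convert them explicitly
for col in ['Start time', 'End time']:
    log[col] = pd.to_datetime(log[col], format='%d.%m.%Y %H:%M')

print(log.loc[0, 'Start time'])
print(log.dtypes)
```

Passing an explicit `format` avoids any ambiguity between day-first and month-first dates, which matters for values such as `05.06.2020`.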


## Import License Info

Imports the license info file, which maps each license number to the toolboxes available under that license, and saves the mapping as a dictionary.

Format: `<License Number: [Toolbox names]>` 

Load dictionary:  
`import pickle`  
`with open('licenseInfo.pkl', 'rb') as f:`  
`    lInfo = pickle.load(f)`

In [35]:
# Create license info dictionary
licenseInfo = {}
with open('licenses.txt') as readFile:
    for line in readFile:
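        # Remove surrounding whitespace, including the trailing newline
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split "<number>:<toolbox>,<toolbox>,..." into key and value list
        number, toolboxes = line.split(':', 1)
        licenseInfo[int(number)] = toolboxes.split(',')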
        
print(licenseInfo)

# Save license info (load with pickle.load on a file opened in 'rb' mode)
with open('licenseInfo.pkl', 'wb') as f:
    pickle.dump(licenseInfo, f, pickle.HIGHEST_PROTOCOL)

{101: ['Matlab', 'Simulink'], 102: ['Matlab'], 105: ['Matlab'], 113: ['Matlab', 'Simulink']}
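The parsing step can also be written as a small helper and tried on in-memory lines. This is a sketch that assumes each line of `licenses.txt` has the form `<number>:<toolbox>,<toolbox>,...`, which is inferred from the parsing code and the printed dictionary. The name `parse_license_lines` is hypothetical and not part of the notebook.

```python
def parse_license_lines(lines):
    """Map each license number to its list of toolbox names."""
    info = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        number, toolboxes = line.split(':', 1)
        info[int(number)] = [name.strip() for name in toolboxes.split(',')]
    return info


# Assumed file content, consistent with the dictionary printed above
sample_lines = ["101:Matlab,Simulink\n", "102:Matlab\n", "\n"]
licenses = parse_license_lines(sample_lines)

# Look up whether a license covers a given toolbox
print('Simulink' in licenses[101])
print('Simulink' in licenses[102])
```

Keeping the parser separate from the file handling makes it possible to test the format without touching the file system.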


## Example: Access data in dataframe

In [42]:
print("======================")
print("Only specific columns:")
print("======================")
dataView1 = dataset[['License Number', 'Toolbox']].head(5)
display(dataView1)

print("==================================")
print("Specific columns AND lines 3-10:")
print("==================================")
dataView2 = dataset.loc[3:10,['Toolbox', 'Version']]
display(dataView2)


print("============================")
print("Only data of license 105:")
print("============================")
dataView3 = dataset[ dataset['License Number'] == 105]
display(dataView3)

print("============================================")
print("Only entries with start time > 2020-05-25:")
print("============================================")
minTime_ts = pd.to_datetime('25.05.2020 00:00', format='%d.%m.%Y %H:%M')
dataView4 = dataset[ dataset['Start time'] > minTime_ts ]
display(dataView4)

print("==============")
print("Get entry:")
print("==============")
dataset.loc[3, 'Toolbox']

Only specific columns:


,License Number,Toolbox
0,102,Matlab
1,102,Matlab
2,102,Matlab
3,101,Matlab
4,101,Matlab


Specific columns AND lines 3-10:


,Toolbox,Version
3,Matlab,1.0
4,Matlab,1.0
5,Matlab,1.0
6,Simulink,1.0
7,Matlab,1.0
8,Matlab,1.0
9,Matlab,1.0
10,Matlab,1.0


Only data of license 105:


,License Number,Version,Start time,End time,Toolbox
8,105,1.0,2020-05-24 10:08:00,2020-05-24 17:02:00,Matlab
9,105,1.0,2020-05-26 14:05:00,2020-05-26 15:44:00,Matlab
10,105,1.0,2020-05-27 11:00:00,2020-05-27 11:30:00,Matlab


Only entries with start time > 2020-05-25:


,License Number,Version,Start time,End time,Toolbox
0,102,1.0,2020-05-25 08:30:00,2020-05-25 09:30:00,Matlab
1,102,1.0,2020-05-25 10:15:00,2020-05-25 10:30:00,Matlab
2,102,1.0,2020-05-27 07:00:00,2020-05-27 17:30:00,Matlab
5,101,1.0,2020-05-25 11:00:00,2020-05-25 15:00:00,Matlab
6,101,1.0,2020-05-25 11:15:00,2020-05-25 12:03:00,Simulink
7,101,1.0,2020-05-28 09:07:00,2020-05-29 11:23:00,Matlab
9,105,1.0,2020-05-26 14:05:00,2020-05-26 15:44:00,Matlab
10,105,1.0,2020-05-27 11:00:00,2020-05-27 11:30:00,Matlab


Get entry:


'Matlab'
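
Two details of these selections are easy to miss. First, `.loc` slices by label and includes both endpoints, which is why `3:10` returned rows 3 through 10, while `.iloc` slices by position and excludes the end. Second, several conditions can be combined with `&` as long as each one is parenthesized. A minimal sketch on a few rows copied from the output above (re-indexed from 0, with only some of the columns):

```python
import pandas as pd

# A few rows copied from the example output (subset of columns)
df = pd.DataFrame({
    'License Number': [102, 101, 101, 105],
    'Start time': pd.to_datetime(
        ['25.05.2020 08:30', '24.05.2020 10:15',
         '25.05.2020 11:15', '26.05.2020 14:05'],
        format='%d.%m.%Y %H:%M'),
    'Toolbox': ['Matlab', 'Matlab', 'Simulink', 'Matlab'],
})

# Label-based slice: both endpoints included -> rows 1, 2, 3
by_label = df.loc[1:3, ['Toolbox']]

# Position-based slice: end excluded -> rows 1, 2
by_position = df.iloc[1:3]

# Combine conditions with & (each condition in parentheses)
min_time = pd.Timestamp('2020-05-25')
recent_101 = df[(df['License Number'] == 101) & (df['Start time'] > min_time)]

print(len(by_label), len(by_position), len(recent_101))  # 3 2 1
```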