# Problematic Reports Regarding Performance

---

>You will know how to answer the following questions at the end of this notebook.
>- [x] What is a problematic report regarding performance?

In [35]:
import pandas as pd

### Loading active reports and lightning performance

Active Reports

In [36]:
active_reports = pd.read_csv("../../datasets/active_reports.csv", low_memory=False)

In [37]:
active_reports.shape

(38285, 51)

Lightning Performance

In [38]:
lightning_performance = pd.read_csv("../../data/Salesforce/ELF/LightningPerformance/2022-06-04_LightningPerformance.csv", low_memory=False)

In [39]:
list(lightning_performance.columns)

['EVENT_TYPE',
 'TIMESTAMP',
 'REQUEST_ID',
 'ORGANIZATION_ID',
 'USER_ID',
 'CLIENT_ID',
 'SESSION_KEY',
 'LOGIN_KEY',
 'USER_TYPE',
 'APP_NAME',
 'DEVICE_PLATFORM',
 'SDK_APP_VERSION',
 'OS_NAME',
 'OS_VERSION',
 'USER_AGENT',
 'BROWSER_NAME',
 'BROWSER_VERSION',
 'SDK_VERSION',
 'DEVICE_MODEL',
 'DEVICE_ID',
 'SDK_APP_TYPE',
 'CLIENT_GEO',
 'CONNECTION_TYPE',
 'UI_EVENT_ID',
 'UI_EVENT_TYPE',
 'UI_EVENT_SOURCE',
 'UI_EVENT_TIMESTAMP',
 'PAGE_START_TIME',
 'DURATION',
 'DEVICE_SESSION_ID',
 'TIMESTAMP_DERIVED',
 'USER_ID_DERIVED',
 'CLIENT_IP']

### Merge active reports with lightning performance

To merge the active reports with the lightning performance data, a set of [common columns](https://github.com/dell-splab/lightning-analysis/blob/research-questions/Data%20Discovery%20-%20MUST%20READ.ipynb) shared by both datasets was defined.

In [40]:
common_columns = ['USER_ID_DERIVED', 'SESSION_KEY', 'LOGIN_KEY', 'ORGANIZATION_ID', 'CLIENT_IP']

Remove unused columns before the merge.

In [55]:
lightning_performance.drop(['EVENT_TYPE','TIMESTAMP','REQUEST_ID','USER_ID','USER_TYPE','TIMESTAMP_DERIVED'], inplace=True, axis=1)
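As an alternative to dropping columns after loading, the unused columns can be skipped while the CSV is parsed, which avoids holding them in memory at all. The sketch below is illustrative only: the in-memory `csv_text` is a made-up stand-in for the real log file, and `UNUSED` and `slim` are hypothetical names. It relies on the `usecols` parameter of `pd.read_csv`, which accepts a callable evaluated against each column name.

```python
import io

import pandas as pd

# Same column names that the notebook drops before the merge.
UNUSED = {"EVENT_TYPE", "TIMESTAMP", "REQUEST_ID", "USER_ID", "USER_TYPE", "TIMESTAMP_DERIVED"}

# Made-up two-line CSV standing in for the real LightningPerformance file.
csv_text = (
    "EVENT_TYPE,TIMESTAMP,USER_ID_DERIVED,DURATION\n"
    "LightningPerformance,20220604,u1,1500\n"
)

# The callable keeps a column only when its name is not in UNUSED.
slim = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name not in UNUSED)

print(list(slim.columns))
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path; the rest stays the same.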

In [56]:
ltng_full_reports_performance = pd.merge(left=active_reports, right=lightning_performance, on=common_columns)

In [57]:
ltng_full_reports_performance.shape

(7176291, 73)
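The merged frame has far more rows than `active_reports` (7,176,291 versus 38,285). That growth is consistent with a many-to-many join: `pd.merge` defaults to an inner join, and each report row is paired with every performance event that shares the same key values. The sketch below uses toy frames with made-up values and a hypothetical helper, `expected_inner_rows`, to show how the row count of such a merge can be predicted from per-key counts before running it.

```python
import pandas as pd

# Toy data with made-up values; only the column names mirror the notebook.
keys = ["USER_ID_DERIVED", "SESSION_KEY"]
reports = pd.DataFrame({
    "USER_ID_DERIVED": ["u1", "u1", "u2"],
    "SESSION_KEY": ["s1", "s1", "s2"],
    "REPORT_ID_DERIVED": ["r1", "r2", "r3"],
})
events = pd.DataFrame({
    "USER_ID_DERIVED": ["u1", "u1", "u1", "u2"],
    "SESSION_KEY": ["s1", "s1", "s1", "s2"],
    "DURATION": [100, 200, 300, 400],
})


def expected_inner_rows(left, right, on):
    """Predict the row count of an inner merge from per-key group sizes.

    Hypothetical helper: each key contributes (rows on the left) * (rows on
    the right). Rows whose keys are missing are ignored by groupby here.
    """
    left_counts = left.groupby(on).size()
    right_counts = right.groupby(on).size()
    # Keys present on only one side are filled with 0 and contribute nothing.
    return int(left_counts.mul(right_counts, fill_value=0).sum())


predicted = expected_inner_rows(reports, events, keys)
merged = pd.merge(left=reports, right=events, on=keys)

print(predicted, len(merged))
```

In the toy data, the key `(u1, s1)` appears twice on the left and three times on the right, so it alone yields six merged rows.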

In [58]:
list(ltng_full_reports_performance.columns)

['CreatedById',
 'CreatedDate',
 'Description',
 'DeveloperName',
 'FolderName',
 'Format',
 'Id',
 'IsDeleted',
 'LastModifiedById',
 'LastModifiedDate',
 'LastReferencedDate',
 'LastRunDate',
 'LastViewedDate',
 'Name',
 'NamespacePrefix',
 'OwnerId',
 'ReportTypeApiName',
 'SystemModstamp',
 'EVENT_TYPE',
 'TIMESTAMP',
 'REQUEST_ID',
 'ORGANIZATION_ID',
 'USER_ID',
 'RUN_TIME',
 'CPU_TIME',
 'URI',
 'SESSION_KEY',
 'LOGIN_KEY',
 'USER_TYPE',
 'REQUEST_STATUS',
 'DB_TOTAL_TIME',
 'ENTITY_NAME',
 'DISPLAY_TYPE',
 'RENDERING_TYPE',
 'REPORT_ID',
 'ROW_COUNT',
 'NUMBER_EXCEPTION_FILTERS',
 'NUMBER_COLUMNS',
 'AVERAGE_ROW_SIZE',
 'SORT',
 'DB_BLOCKS',
 'DB_CPU_TIME',
 'NUMBER_BUCKETS',
 'TIMESTAMP_DERIVED',
 'USER_ID_DERIVED',
 'CLIENT_IP',
 'URI_ID_DERIVED',
 'REPORT_ID_DERIVED',
 'ORIGIN',
 'IsActiveSinceCreation',
 'IsActiveSinceLastModification',
 'CLIENT_ID',
 'APP_NAME',
 'DEVICE_PLATFORM',
 'SDK_APP_VERSION',
 'OS_NAME',
 'OS_VERSION',
 'USER_AGENT',
 'BROWSER_NAME',
 'BROWSER_VER

### Problematic reports

To determine which reports are problematic, a threshold is applied to DURATION (the duration in milliseconds since the page start time, as per this [reference](https://developer.salesforce.com/docs/atlas.en-us.238.0.object_reference.meta/object_reference/sforce_api_objects_eventlogfile_lightningperformance.htm?q=performance)): a row is treated as problematic when its DURATION is at least 120 seconds (120,000 ms).

In [59]:
threshold = 120000

In [60]:
ltng_full_reports_performance = ltng_full_reports_performance[ltng_full_reports_performance.DURATION >= threshold]

In [61]:
ltng_full_reports_performance.shape

(86196, 73)
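The shape above counts merged rows, not distinct reports, so the same report can appear many times. A minimal sketch of how the filtered rows could be summarised per report is shown below. It assumes that `REPORT_ID_DERIVED` identifies a report; the toy values and the names `THRESHOLD_MS`, `rows`, `slow` and `per_report` are illustrative and not part of the original notebook.

```python
import pandas as pd

THRESHOLD_MS = 120_000  # 120 seconds, same value as `threshold` in the notebook

# Toy merged rows with made-up values.
rows = pd.DataFrame({
    "REPORT_ID_DERIVED": ["r1", "r1", "r2", "r3"],
    "DURATION": [130_000, 250_000, 90_000, 120_000],
})

# Same filter as the notebook: keep rows at or above the threshold.
slow = rows[rows["DURATION"] >= THRESHOLD_MS]

# One line per report: number of slow rows and the worst duration observed.
per_report = (
    slow.groupby("REPORT_ID_DERIVED")["DURATION"]
    .agg(slow_events="count", max_duration_ms="max")
    .sort_values("max_duration_ms", ascending=False)
)

print(per_report)
print("distinct problematic reports:", len(per_report))
```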

Storing a dataset with only active and problematic reports.

In [63]:
ltng_full_reports_performance.to_csv("datasets/active_and_problematic_reports_performance.csv", index=False)