# Profiling Reports: Node Performance lib

This notebook analyses the performance records created using Node's Performance library.

- Status: WIP

---

> Currently, `src/profiling/verify-presentations.ts` must be updated manually to profile a particular implementation (e.g., `SolidVCActorFactory`, `DidVCActorFactory`).

Workflow

- Assumption: the factory in `src/profiling/verify-presentations.ts` (see `main()`) is set to `solidVCActorFactory`. Yes, this should be parametrizable.
- `npm run sandbox:profile:verify-presentations`
- Update the factory in `src/profiling/verify-presentations.ts` (see `main()`) by setting it to `didVCActorFactory`. Yes, this should be parametrizable.
- `npm run sandbox:profile:verify-presentations`
- Run this notebook
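
Since the workflow above relies on two manual profiling runs, a small pre-flight check can catch a forgotten run before the analysis starts. The following is a minimal sketch, assuming the profiling script writes its output into the notebook's working directory under the file names loaded below; `missing_entry_files` and `EXPECTED_FACTORIES` are hypothetical helper names, not part of the project.

```python
from pathlib import Path

# Factory names as they appear in the file names this notebook loads.
EXPECTED_FACTORIES = ["SolidVCActorFactory", "DidVCActorFactory"]


def missing_entry_files(directory=".", factories=EXPECTED_FACTORIES):
    """Return the expected performance-entries files that are absent from `directory`."""
    expected = [f"performance-entries.{name}.json" for name in factories]
    return [fname for fname in expected if not (Path(directory) / fname).is_file()]


missing = missing_entry_files()
if missing:
    print("Missing profiling output, re-run the workflow for:", missing)
```

An empty list means both runs left their output behind and the notebook can be executed as is.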

In [1]:
import pandas as pd
import json
from glob import glob
import os
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
fpaths = glob('performance-entries.*.json')
fpaths

['performance-entries.SolidVCActorFactory.json',
 'performance-entries.DidVCActorFactory.json']

In [3]:
dfs = pd.concat(
    dict(map(lambda x: (x.split('.')[1], pd.read_json(x)), fpaths))
    ,axis=0)
dfs.index.set_names(['factory','record_idx'],inplace=True)
dfs.reset_index(inplace=True)
display(dfs.head(3))
display(dfs.factory.value_counts())

Unnamed: 0,factory,record_idx,name,entryType,startTime,duration,initiatorType,workerStart,redirectStart,redirectEnd,fetchStart,requestStart,responseStart,responseEnd,transferSize,encodedBodySize,decodedBodySize,detail
0,SolidVCActorFactory,0,http://localhost:3000/alice/profile/card,resource,3289.530792,67.407625,fetch,0.0,0.0,0.0,3289.530792,0.0,0.0,3356.938417,300.0,0.0,0.0,
1,SolidVCActorFactory,1,http://localhost:3000/alice/profile/card,resource,3362.861042,40.259167,fetch,0.0,0.0,0.0,3362.861042,0.0,0.0,3403.120209,300.0,0.0,0.0,
2,SolidVCActorFactory,2,http://localhost:3000/alice/profile/card,resource,3403.969084,13.209083,fetch,0.0,0.0,0.0,3403.969084,0.0,0.0,3417.178167,300.0,0.0,0.0,


factory
SolidVCActorFactory    307
DidVCActorFactory      288
Name: count, dtype: int64

**Mark type**

Group `mark`-entryTypes

In [4]:
dfs[dfs.entryType == 'mark'].groupby('factory').name.value_counts()

factory              name
DidVCActorFactory    https://w3id.org/security/bbs/v1          72
                     https://www.w3.org/2018/credentials/v1    60
                     https://w3id.org/citizenship/v1

**Resource type**

Group `resource`-entryTypes

In [5]:
display(dfs[dfs.entryType == 'resource'].groupby('factory').name.value_counts())

factory              name                                             
DidVCActorFactory    https://w3id.org/security/v1                         12
                     https://w3id.org/security/v2                         12
SolidVCActorFactory  https://w3id.org/security/v1                         12
                     https://w3id.org/security/v2                         12
                     http://localhost:3000/alice/profile/card              5
                     http://localhost:3001/pseudo/profile/card             5
                     http://localhost:3002/recruiter/profile/card          5
                     http://localhost:3003/university/profile/card         5
                     http://localhost:3004/government/profile/card         5
                     http://localhost:3000/alice/profile/card#key          2
                     http://localhost:3000/alice/profile/card#me           2
                     http://localhost:3003/university/profile/card#key     1
     

---


In [6]:
dfs

Unnamed: 0,factory,record_idx,name,entryType,startTime,duration,initiatorType,workerStart,redirectStart,redirectEnd,fetchStart,requestStart,responseStart,responseEnd,transferSize,encodedBodySize,decodedBodySize,detail
0,SolidVCActorFactory,0,http://localhost:3000/alice/profile/card,resource,3289.530792,67.407625,fetch,0.0,0.0,0.0,3289.530792,0.0,0.0,3356.938417,300.0,0.0,0.0,
1,SolidVCActorFactory,1,http://localhost:3000/alice/profile/card,resource,3362.861042,40.259167,fetch,0.0,0.0,0.0,3362.861042,0.0,0.0,3403.120209,300.0,0.0,0.0,
2,SolidVCActorFactory,2,http://localhost:3000/alice/profile/card,resource,3403.969084,13.209083,fetch,0.0,0.0,0.0,3403.969084,0.0,0.0,3417.178167,300.0,0.0,0.0,
3,SolidVCActorFactory,3,http://localhost:3000/alice/profile/card,resource,3418.321167,11.834000,fetch,0.0,0.0,0.0,3418.321167,0.0,0.0,3430.155167,300.0,0.0,0.0,
4,SolidVCActorFactory,4,http://localhost:3000/alice/profile/card,resource,3431.123667,17.124208,fetch,0.0,0.0,0.0,3431.123667,0.0,0.0,3448.247875,300.0,0.0,0.0,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
590,DidVCActorFactory,283,did:key:zUC79nD3kTq71aFonkDrcbR1UuN8ZD7Txy4X8E...,mark,7433.099166,0.000000,,,,,,,,,,,,start
591,DidVCActorFactory,284,did:key:zUC79nD3kTq71aFonkDrcbR1UuN8ZD7Txy4X8E...,mark,7433.342166,0.000000,,,,,,,,,,,,end
592,DidVCActorFactory,285,https://www.w3.org/ns/did/v1,mark,7433.352583,0.000000,,,,,,,,,,,,start
593,DidVCActorFactory,286,https://www.w3.org/ns/did/v1,mark,7433.353958,0.000000,,,,,,,,,,,,end


Create a dataframe with only the `mark` records. The other records, those with entryType `resource`, are registered automatically by Node for each `fetch`.

In [7]:
dfs_marks = dfs[dfs.entryType == 'mark']
dfs_marks

Unnamed: 0,factory,record_idx,name,entryType,startTime,duration,initiatorType,workerStart,redirectStart,redirectEnd,fetchStart,requestStart,responseStart,responseEnd,transferSize,encodedBodySize,decodedBodySize,detail
25,SolidVCActorFactory,25,https://www.w3.org/2018/credentials/v1,mark,7037.287792,0.0,,,,,,,,,,,,start
26,SolidVCActorFactory,26,https://www.w3.org/2018/credentials/v1,mark,7037.771834,0.0,,,,,,,,,,,,end
27,SolidVCActorFactory,27,https://w3id.org/security/bbs/v1,mark,7040.181792,0.0,,,,,,,,,,,,start
28,SolidVCActorFactory,28,https://w3id.org/security/bbs/v1,mark,7040.215792,0.0,,,,,,,,,,,,end
29,SolidVCActorFactory,29,https://w3id.org/citizenship/v1,mark,7041.157917,0.0,,,,,,,,,,,,start
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
590,DidVCActorFactory,283,did:key:zUC79nD3kTq71aFonkDrcbR1UuN8ZD7Txy4X8E...,mark,7433.099166,0.0,,,,,,,,,,,,start
591,DidVCActorFactory,284,did:key:zUC79nD3kTq71aFonkDrcbR1UuN8ZD7Txy4X8E...,mark,7433.342166,0.0,,,,,,,,,,,,end
592,DidVCActorFactory,285,https://www.w3.org/ns/did/v1,mark,7433.352583,0.0,,,,,,,,,,,,start
593,DidVCActorFactory,286,https://www.w3.org/ns/did/v1,mark,7433.353958,0.0,,,,,,,,,,,,end


Obtain the record indices that mark the start and end of *verify presentation*, so that we only consider the performance records registered during that period.

In [8]:
verify_dataframes = {}
for x in ['presentation01','presentation02']:
    a = f'start verify {x}'
    b = f'end verify {x}'
    idx_start = dfs[dfs.name.str.contains(a)].set_index('factory').record_idx.to_dict()
    idx_end = dfs[dfs.name.str.contains(b)].set_index('factory').record_idx.to_dict()
    fac_df_verify = {}
    for fac in ['DidVCActorFactory', 'SolidVCActorFactory']:
        
        i0,i1 = idx_start[fac], idx_end[fac]
        assert i0 < i1
        # Keep the marks strictly between the start and end verify-presentation records
        # (the start and end records themselves are excluded).
        fac_df_verify[fac] = dfs_marks.set_index(['factory','record_idx'])\
            .loc[fac]\
            .loc[i0+1:i1-1]
    df_verify = pd.concat(fac_df_verify,names=['factory'])
    df_verify.dropna(how='all',axis=1,inplace=True) # clean up nan columns
    df_verify = df_verify.reset_index().sort_values(['startTime'],ascending=[True])
    # Compute a pair-wise incremental group index: consecutive records (0,1), (2,3), ... share an index.
    # This lets us unstack on the `detail` column and get one row holding both the start and end times.
    df_verify['group_idx'] =  df_verify.groupby(['factory']).cumcount() // 2
    df_verify.reset_index(inplace=True)
    # Unstack by detail so that each row contains the start and end times
    df_verify_unstacked = df_verify.set_index(['factory','name','detail','group_idx']).startTime.unstack('detail').sort_values('start')
    df_verify_unstacked['duration'] = df_verify_unstacked.end - df_verify_unstacked.start
    verify_dataframes[x] = df_verify_unstacked
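
The pairing step in the cell above can be easier to follow on a few rows. The sketch below applies the same `cumcount() // 2` and `unstack('detail')` combination to toy data; the factory name, mark names, and timestamps are made up for illustration.

```python
import pandas as pd

# Toy mark records (made-up names and timestamps), already sorted by startTime.
marks = pd.DataFrame(
    {
        "factory": ["A", "A", "A", "A"],
        "name": ["ctx-1", "ctx-1", "ctx-2", "ctx-2"],
        "detail": ["start", "end", "start", "end"],
        "startTime": [10.0, 10.5, 11.0, 13.0],
    }
)

# Consecutive records (0,1), (2,3), ... of a factory share one group index.
marks["group_idx"] = marks.groupby("factory").cumcount() // 2

# One row per (factory, name, group_idx), with `start` and `end` as columns.
paired = (
    marks.set_index(["factory", "name", "detail", "group_idx"])
    .startTime.unstack("detail")
    .sort_values("start")
)
paired["duration"] = paired.end - paired.start
print(paired)
```

The pairing relies on start and end marks strictly alternating within a factory. If marks were interleaved (two starts before the first end), the start and end of one name would land in different rows with missing values, so a quick check for NaNs in `start` and `end` is a reasonable safeguard.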

In [9]:
dfs_v = pd.concat(verify_dataframes, names=['presentation'])
dfs_v

presentation,factory,name,group_idx,end,start,duration
presentation01,DidVCActorFactory,https://www.w3.org/2018/credentials/v1,0,5159.416416,5159.4135,0.002916
presentation01,DidVCActorFactory,https://w3id.org/security/bbs/v1,1,5159.462125,5159.46025,0.001875
presentation01,DidVCActorFactory,https://w3id.org/citizenship/v1,2,5159.982125,5159.978958,0.003167
presentation01,DidVCActorFactory,https://w3id.org/security/suites/jws-2020/v1,3,5163.956833,5163.950083,0.00675
presentation01,DidVCActorFactory,did:key:zUC79nD3kTq71aFonkDrcbR1UuN8ZD7Txy4X8EHUQRDmVUpxUqCLQkRGMPCbTWpjsx56qHVWvvZMTg5chbYMCEKBfvxnucEEmz4RvFysZscnav2vu1PjvXo9a2ACGFWmnAku7ZD#zUC79nD3kTq71aFonkDrcbR1UuN8ZD7Txy4X8EHUQRDmVUpxUqCLQkRGMPCbTWpjsx56qHVWvvZMTg5chbYMCEKBfvxnucEEmz4RvFysZscnav2vu1PjvXo9a2ACGFWmnAku7ZD,4,5164.258666,5163.988083,0.270583
presentation01,DidVCActorFactory,https://www.w3.org/ns/did/v1,5,5164.277916,5164.276291,0.001625
presentation01,DidVCActorFactory,did:key:zUC79nD3kTq71aFonkDrcbR1UuN8ZD7Txy4X8EHUQRDmVUpxUqCLQkRGMPCbTWpjsx56qHVWvvZMTg5chbYMCEKBfvxnucEEmz4RvFysZscnav2vu1PjvXo9a2ACGFWmnAku7ZD,6,5202.403916,5202.125541,0.278375
presentation01,DidVCActorFactory,https://www.w3.org/ns/did/v1,7,5202.484458,5202.482333,0.002125
presentation01,SolidVCActorFactory,https://www.w3.org/2018/credentials/v1,0,9976.933584,9976.928459,0.005125
presentation01,SolidVCActorFactory,https://w3id.org/security/bbs/v1,1,9977.010125,9977.007334,0.002791


In [10]:
dfs_v.reset_index().groupby(['presentation','factory']).duration.sum()

presentation    factory            
presentation01  DidVCActorFactory       0.567416
                SolidVCActorFactory    91.801957
presentation02  DidVCActorFactory       0.665124
                SolidVCActorFactory    34.891377
Name: duration, dtype: float64
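
To compare the two factories side by side, the totals can be pivoted so that each presentation becomes one row. This is a minimal sketch that re-enters the four totals printed above by hand; the `solid_to_did_ratio` column is a hypothetical addition for illustration. Node reports `startTime` in milliseconds, so the totals should be in milliseconds as well.

```python
import pandas as pd

# Totals copied from the output above (summed mark durations).
totals = pd.Series(
    {
        ("presentation01", "DidVCActorFactory"): 0.567416,
        ("presentation01", "SolidVCActorFactory"): 91.801957,
        ("presentation02", "DidVCActorFactory"): 0.665124,
        ("presentation02", "SolidVCActorFactory"): 34.891377,
    },
    name="duration",
)
totals.index.names = ["presentation", "factory"]

# One row per presentation, one column per factory.
wide = totals.unstack("factory")
# Hypothetical derived column: how many times larger the Solid total is.
wide["solid_to_did_ratio"] = wide["SolidVCActorFactory"] / wide["DidVCActorFactory"]
print(wide.round(3))
```

In the notebook itself, the same table should be obtainable by calling `.unstack('factory')` directly on the result of the cell above instead of retyping the values.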

In [472]:
dfs_v.reset_index().groupby(['presentation','factory']).duration.sum()

presentation    factory            
presentation01  DidVCActorFactory      135.854959
                SolidVCActorFactory    234.984584
presentation02  DidVCActorFactory      107.536875
                SolidVCActorFactory    179.459501
Name: duration, dtype: float64

In [473]:
dfs_v.to_excel('dfs_v_performance_marks_agg-preload-did-v1-context.xlsx')

---