LANL Unified Host and Network Data Set analysis (Turcotte, Kent and Hash, 2018) @ https://arxiv.org/pdf/1708.07518.pdf

Event Types

| EventID | Description |
| :--- | :--- |
| 4768 | Kerberos authentication ticket was requested (TGT) |
| 4769 | Kerberos service ticket was requested (TGS) |
| 4770 | Kerberos service ticket was renewed |
| 4774 | An account was mapped for logon |
| 4776 | The domain controller attempted to validate credentials |
| 4624 | An account was successfully logged on, see Logon Types |
| 4625 | An account failed to log on, see Logon Types |
| 4634 | An account was logged off, see Logon Types |
| 4647 | User initiated logoff |
| 4648 | A logon was attempted using explicit credentials |
| 4672 | Special privileges assigned to a new logon |
| 4800 | The workstation was locked |
| 4801 | The workstation was unlocked |
| 4802 | The screensaver was invoked |
| 4803 | The screensaver was dismissed |
| 4688 | Process start |
| 4689 | Process end |
| 4608 | Windows is starting up |
| 4609 | Windows is shutting down |
| 1100 | Event logging service has shut down (often recorded instead of EventID 4609) |

Logon Types

| Type | Description |
| - | - |
| 2 | Interactive |
| 5 | Service |
| 9 | NewCredentials |
| 3 | Network |
| 7 | Unlock |
| 10 | RemoteInteractive |
| 4 | Batch |
| 8 | NetworkClearText |
| 11 | CachedInteractive |
| 12 | CachedRemoteInteractive |
| 0 | Used only by the system account |

For a full description of Windows Logging Service (WLS) event types see: https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/default.aspx

Event Fields (Note: all events contain the EventID, LogHost and Time fields, plus other fields that depend on the event type):

| Field | Description |
| :--- | :--- |
| Time | The epoch time of the event in seconds |
| EventID | Four digit integer corresponding to the event id of the record |
| LogHost | The hostname of the computer that the event was recorded on. In the case of directed authentication events, the LogHost will correspond to the computer that the authentication event is terminating at (destination computer) |
| LogonType | Integer corresponding to the type of logon, see Logon Types |
| LogonTypeDescription | Description of the LogonType, see Logon Types |
| UserName | The user account initiating the event.  If the user ends in $, then it corresponds to a computer account for the specified computer |
| DomainName | Domain name of UserName |
| LogonID | A semi-unique (unique between current sessions and LogHost) number that identifies the logon session just initiated. Any events logged subsequently during this logon session should report the same LogonID through to the logoff event |
| SubjectUserName | For authentication mapping events, the user account specified by this field is mapping to the user account in UserName | 
| SubjectDomainName | Domain name of SubjectUserName |
| SubjectLogonID | See LogonID |
| Status | Status of the authentication request. “0x0” means success, otherwise failure; see R. F. Smith for failure codes for the appropriate EventID |
| Source | For authentication events, this will correspond to the computer where the authentication originated (source computer). If it is a local logon event then this will be the same as the LogHost |
| ServiceName | The account name of the computer or service the user is requesting the ticket for |
| Destination | This is the server the mapped credential is accessing.  This may indicate the local computer when starting another process with new account credentials on a local computer |
| AuthenticationPackage | The type of authentication occurring, including Negotiate, Kerberos, NTLM plus a few more |
| FailureReason | The reason for a failed logon |
| ProcessName | The process executable name. For authentication events this is the process that processed the authentication event. ProcessNames may include the file type extensions (e.g. exe) |
| ProcessID | A semi-unique (unique between currently running processes AND LogHost) value that identifies the process. ProcessID allows you to correlate other events logged in association with the same process through to the process end |
| ParentProcessName | The process executable that started the new process. ParentProcessNames often do not have file extensions like ProcessName but can be compared by removing file extensions from the name |
| ParentProcessID | Identifies the exact process that started the new process. Look for a preceding event 4688 with a ProcessID that matches this ParentProcessID |
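
The ParentProcessID description above suggests linking each process start to the earlier 4688 event on the same LogHost whose ProcessID matches. The following is a minimal sketch of that lookup with pandas, not a definitive implementation. The toy rows, the function name `link_parents` and the output columns `ParentExe` and `ParentStartTime` are illustrative assumptions and do not come from the data set.

```python
import pandas as pd

# Toy process-start events; the values are made up for illustration only
events = pd.DataFrame(
    {
        "Time": [1, 2, 3, 5],
        "EventID": [4688, 4688, 4688, 4688],
        "LogHost": ["CompA", "CompA", "CompB", "CompA"],
        "ProcessName": ["services.exe", "svchost.exe", "svchost.exe", "dllhost.exe"],
        "ProcessID": ["0x2e8", "0x390", "0x390", "0x2228"],
        "ParentProcessID": [None, "0x2e8", "0x2e8", "0x390"],
    }
)


def link_parents(df: pd.DataFrame) -> pd.DataFrame:
    """For each 4688 event, find the most recent earlier 4688 event on the
    same LogHost whose ProcessID equals this event's ParentProcessID."""
    starts = df[df["EventID"] == 4688]
    children = starts.assign(ChildRow=starts.index)

    # Candidate parents: every process start, keyed by (LogHost, ProcessID)
    parents = starts[["LogHost", "ProcessID", "ProcessName", "Time"]].rename(
        columns={
            "ProcessID": "ParentProcessID",
            "ProcessName": "ParentExe",          # hypothetical column name
            "Time": "ParentStartTime",           # hypothetical column name
        }
    )
    merged = children.merge(parents, on=["LogHost", "ParentProcessID"], how="left")

    # Blank out candidates that started after the child (ProcessIDs get reused)
    valid = merged["ParentStartTime"] <= merged["Time"]
    merged["ParentExe"] = merged["ParentExe"].where(valid)
    merged["ParentStartTime"] = merged["ParentStartTime"].where(valid)

    # Keep one row per child: the latest valid candidate, if any
    merged = merged.sort_values(
        ["ChildRow", "ParentStartTime"], na_position="first"
    ).drop_duplicates("ChildRow", keep="last")
    return merged.set_index("ChildRow").sort_index()


linked = link_parents(events)
print(linked[["LogHost", "ProcessName", "ParentExe"]])
```

Rows whose parent started before the sample began, or on a host with no matching 4688 record, are left with a missing `ParentExe`. Because ProcessID is only semi-unique, a match found this way is a candidate parent rather than a guaranteed one.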

In [106]:
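# Create a sample of the data

filePath = "D://LANL//2017//"
inputFileName = "wls_day-01"

sampleFileName = "sample"

# The with statement closes both files on exit, so no explicit close() is needed
with open(filePath + inputFileName, "r") as fIn, open(filePath + sampleFileName, "w") as fOut:
    # The first line is read and discarded, so the sample starts at the second event
    line = fIn.readline()
    # Copy the next 100000 lines to the sample file
    for n in range(100000):
        line = fIn.readline()
        fOut.write(line)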
       

In [118]:
# Read the sample data into a pandas dataframe and display the start and end rows

import pandas as pd

df = pd.read_json(filePath + sampleFileName, orient='records', lines=True)
df


Unnamed: 0,UserName,EventID,LogHost,LogonID,DomainName,ParentProcessName,ParentProcessID,ProcessName,Time,ProcessID,...,AuthenticationPackage,LogonType,Source,Destination,SubjectUserName,SubjectLogonID,SubjectDomainName,Status,ServiceName,FailureReason
0,Comp991643$,4688,Comp991643,0x3e7,Domain001,services,0x334,rundll32.exe,1,0xc0c,...,,,,,,,,,,
1,Comp736087$,4688,Comp736087,0x3e7,Domain001,services,0x2e8,svchost.exe,1,0x2074,...,,,,,,,,,,
2,Comp093128$,4688,Comp093128,0x3e7,Domain001,services,0x2d4,vssvc.exe,1,0x2200,...,,,,,,,,,,
3,Comp006850$,4688,Comp006850,0x3e7,Domain001,services,0x278,svchost.exe,1,0x498,...,,,,,,,,,,
4,system,4624,Comp828729,0x3e7,nt authority,,,services.exe,1,0x29c,...,Negotiate,5.0,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
99995,Comp343436$,4688,Comp343436,0x3e7,Domain001,svchost,0x138,dllhost.exe,109,0x4fd4,...,,,,,,,,,,
99996,Comp832971$,4688,Comp832971,0x3e7,Domain001,Proc443607,0x1508,searchfilterhost.exe,109,0x2a14,...,,,,,,,,,,
99997,system,4624,Comp560289,0x3e7,nt authority,,,services.exe,109,0x2b4,...,Negotiate,5.0,,,,,,,,
99998,system,4672,Comp560289,0x3e7,nt authority,,,,109,,...,,,,,,,,,,


In [108]:
# Show the row counts for populated fields

df.count()

UserName                  99999
EventID                  100000
LogHost                  100000
LogonID                   91631
DomainName                99999
ParentProcessName         40371
ParentProcessID           40371
ProcessName               50691
Time                     100000
ProcessID                 50691
LogonTypeDescription      36615
AuthenticationPackage     28648
LogonType                 36615
Source                    18885
Destination                1295
SubjectUserName            1304
SubjectLogonID             1306
SubjectDomainName          1304
Status                     8217
ServiceName                1730
FailureReason               371
dtype: int64

In [109]:
# Histogram showing EventID counts in ascending order (for comparison with Turcotte et al.)

df.EventID.value_counts(ascending=True).to_frame().style.bar()

Unnamed: 0,EventID
4609,1
4800,3
4689,3
4625,371
4768,575
4648,1295
4769,1730
4776,5912
4672,13495
4634,13879


In [110]:
# Histogram showing LogonTypes in ascending order (for comparison with Turcotte et al.)

df.LogonType.value_counts(ascending=True).to_frame().style.bar()

Unnamed: 0,LogonType
9.0,11
8.0,69
2.0,95
4.0,170
5.0,7723
3.0,28547


In [113]:
# Histogram of LogHosts, UserNames, Sources and ProcessNames for comparison with Turcotte et al.
# Note it is not entirely clear what comp_accounts refers to!

df[['LogHost', 'UserName', 'Source', 'ProcessName']].nunique().sort_values().to_frame().style.bar()

Unnamed: 0,0
ProcessName,270
Source,2016
LogHost,6756
UserName,8108
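
One possible reading of comp_accounts, based on the UserName field description (a name ending in $ is a computer account), is the number of distinct computer accounts. This is an assumption and is not confirmed by the paper. A minimal sketch of that count on a toy series, where the account names are illustrative only:

```python
import pandas as pd

# Toy UserName values; "User123456" is a made-up example of a non-computer account
users = pd.Series(
    ["Comp991643$", "Comp736087$", "system", "User123456", "Comp991643$", None],
    name="UserName",
)

# Treat names ending in "$" as computer accounts; missing names count as neither
is_computer = users.str.endswith("$", na=False)

n_computer_accounts = users[is_computer].nunique()
n_other_accounts = users[~is_computer].nunique()  # nunique() ignores missing values

print(n_computer_accounts, n_other_accounts)
```

On the notebook's dataframe the same mask would be built from `df['UserName']`.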


In [117]:
# Filter all Process Start Events

df.loc[df['EventID'] == 4688 ].dropna(axis='columns', how='all')

Unnamed: 0,UserName,EventID,LogHost,LogonID,DomainName,ParentProcessName,ParentProcessID,ProcessName,Time,ProcessID
0,Comp991643$,4688,Comp991643,0x3e7,Domain001,services,0x334,rundll32.exe,1,0xc0c
1,Comp736087$,4688,Comp736087,0x3e7,Domain001,services,0x2e8,svchost.exe,1,0x2074
2,Comp093128$,4688,Comp093128,0x3e7,Domain001,services,0x2d4,vssvc.exe,1,0x2200
3,Comp006850$,4688,Comp006850,0x3e7,Domain001,services,0x278,svchost.exe,1,0x498
7,Comp466209$,4688,Comp466209,0x3e7,Domain001,services,0x354,vssvc.exe,1,0x2d20
...,...,...,...,...,...,...,...,...,...,...
99993,Comp748369$,4688,Comp748369,0x3e7,Domain001,svchost,0x390,dllhost.exe,109,0x2228
99994,Comp832971$,4688,Comp832971,0x3e7,Domain001,Proc443607,0x1508,searchprotocolhost.exe,109,0x2f70
99995,Comp343436$,4688,Comp343436,0x3e7,Domain001,svchost,0x138,dllhost.exe,109,0x4fd4
99996,Comp832971$,4688,Comp832971,0x3e7,Domain001,Proc443607,0x1508,searchfilterhost.exe,109,0x2a14
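
In the output above ProcessName carries a file extension (svchost.exe) while ParentProcessName does not (svchost), as noted in the field descriptions. A minimal sketch of how the two might be made comparable by stripping a trailing extension follows. The helper name `strip_extension` and the toy values are assumptions for illustration.

```python
import pandas as pd


def strip_extension(names: pd.Series) -> pd.Series:
    """Drop a trailing file extension such as '.exe'; names without one are unchanged."""
    return names.str.replace(r"\.[A-Za-z0-9]+$", "", regex=True)


# Toy values in the style of the ProcessName and ParentProcessName columns
process_names = pd.Series(["svchost.exe", "services.exe", "Proc443607", None])
parent_names = pd.Series(["svchost", "services", "Proc443607", "svchost"])

normalised = strip_extension(process_names)

# Names that appear both as a started process and as a parent process
common = set(normalised.dropna()) & set(parent_names.dropna())
print(sorted(common))
```

The regular expression removes only the last dot-separated suffix, so a name containing several dots keeps everything before the final one.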
