# Add a volatility3 timeline to timesketch
This notebook walks through how to add a Volatility 3 timeline to Timesketch.

Import the libraries we need.

In [70]:
import os
import codecs
import pandas as pd
from timesketch_api_client import config
from timesketch_import_client import helper
from timesketch_import_client import importer
from timesketch_api_client import client

Get volatility3 from GitHub.

In [1]:
!git clone https://github.com/volatilityfoundation/volatility3.git

Cloning into 'volatility3'...
remote: Enumerating objects: 238, done.
remote: Counting objects: 100% (238/238), done.
remote: Compressing objects: 100% (175/175), done.
remote: Total 21563 (delta 145), reused 107 (delta 63), pack-reused 21325
Receiving objects: 100% (21563/21563), 4.05 MiB | 9.26 MiB/s, done.
Resolving deltas: 100% (16153/16153), done.


Test out vol.py

In [10]:
!/home/ubuntu/jupyter/volatility3/vol.py -h

Volatility 3 Framework 2.0.0-beta.1
usage: volatility [-h] [-c CONFIG] [--parallelism [{processes,threads,off}]]
                  [-e EXTEND] [-p PLUGIN_DIRS] [-s SYMBOL_DIRS] [-v] [-l LOG]
                  [-o OUTPUT_DIR] [-q] [-r RENDERER] [-f FILE]
                  [--write-config] [--clear-cache]
                  [--single-location SINGLE_LOCATION]
                  [--stackers [STACKERS [STACKERS ...]]]
                  [--single-swap-locations [SINGLE_SWAP_LOCATIONS [SINGLE_SWAP_LOCATIONS ...]]]
                  plugin ...

An open-source memory forensics framework

optional arguments:
  -h, --help            Show this help message and exit, for specific plugin
                        options use 'volatility <pluginname> --help'
  -c CONFIG, --config CONFIG
                        Load the configuration from a json file
  --parallelism [{processes,threads,off}]
                        Enables parallelism (defaults to processes if no
                        argument given)
 

Run the timeliner.Timeliner plugin on the memory image to create a CSV file.


In [4]:
PATH_TO_FOLDER = '/home/ubuntu/jupyter/cases/stolen_szechuan_sauce/'
MEMORY_IMAGE = 'citadeldc01.mem'
OUTPUT_FILE = 'citadeldc01_mem_timeline.csv'
PATH_TO_MEMORY_IMAGE = os.path.join(PATH_TO_FOLDER, MEMORY_IMAGE)
PATH_TO_CSV = os.path.join(PATH_TO_FOLDER, OUTPUT_FILE)
print(PATH_TO_CSV)

/home/ubuntu/jupyter/cases/stolen_szechuan_sauce/citadeldc01_mem_timeline.csv


This seems to run faster when run outside of Jupyter.

!/home/ubuntu/jupyter/volatility3/vol.py -r csv -f /home/ubuntu/jupyter/cases/stolen_szechuan_sauce/citadeldc01.mem timeliner.Timeliner > /home/ubuntu/jupyter/cases/stolen_szechuan_sauce/citadeldc01_mem_timeline.csv
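If you would rather start the same command from a plain Python script than from a notebook cell, something like the sketch below might work. The helper names `build_timeliner_command` and `run_timeliner` are made up for illustration. The flags and paths are the ones used in the cell above, and the sketch assumes vol.py is executable as it is in this notebook.

```python
import os
import subprocess


def build_timeliner_command(vol_py, memory_image):
    """Return the argument list for timeliner.Timeliner with CSV rendering.

    Hypothetical helper: it only mirrors the flags used in this notebook.
    """
    return [vol_py, "-r", "csv", "-f", memory_image, "timeliner.Timeliner"]


def run_timeliner(vol_py, memory_image, output_csv):
    """Run the plugin and write its standard output to output_csv."""
    command = build_timeliner_command(vol_py, memory_image)
    with open(output_csv, "w", encoding="utf-8") as out:
        # check=True raises CalledProcessError if vol.py exits with an error.
        subprocess.run(command, stdout=out, check=True)


folder = "/home/ubuntu/jupyter/cases/stolen_szechuan_sauce/"
command = build_timeliner_command(
    "/home/ubuntu/jupyter/volatility3/vol.py",
    os.path.join(folder, "citadeldc01.mem"),
)
# Only print the command here; call run_timeliner(...) to actually execute it.
print(" ".join(command))
```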


Read the timeline CSV file into pandas.

In [5]:
df = None
with codecs.open(PATH_TO_CSV, 'r', encoding='utf-8', errors='replace') as fh:
  df = pd.read_csv(fh, error_bad_lines=False)

print(df.shape)

(21923, 7)
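A note on `error_bad_lines=False`: that argument was deprecated in pandas 1.3 and removed in pandas 2.0. On a newer pandas, `on_bad_lines="skip"` should do the same job. Here is a minimal sketch on a made-up three-column CSV (not the real Timeliner output) where one row has too many fields:

```python
import io

import pandas as pd

# Made-up sample data: the last row has extra fields and cannot be parsed.
sample = io.StringIO(
    "TreeDepth,Plugin,Description\n"
    "0,PsList,Process: 4 System\n"
    "0,PsScan,Process: 92 Registry\n"
    "0,PsScan,broken,row,with,extra,fields\n"
)

# on_bad_lines="skip" drops rows with too many fields instead of raising.
df_sample = pd.read_csv(sample, on_bad_lines="skip")
print(df_sample.shape)
```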


Looks like we got 21,923 rows with 7 columns. Let's look at the first few rows.

In [6]:
df.head()

Unnamed: 0,TreeDepth,Plugin,Description,Created Date,Modified Date,Accessed Date,Changed Date
0,0,DllList,DLL Load: Process 400 ServerManager. Loaded <v...,1762-05-05 04:53:46.000000,,,
1,0,SymlinkScan,Symlink: DosDevices -> \??,2020-09-19 01:22:34.000000,,,
2,0,SymlinkScan,Symlink: Global -> \GLOBAL??,2020-09-19 01:22:34.000000,,,
3,0,SymlinkScan,Symlink: SystemRoot -> \Device\BootDevice\Windows,2020-09-19 01:22:34.000000,,,
4,0,SymlinkScan,Symlink: RESOURCE_HUB -> \Device\RESOURCE_HUB,2020-09-19 01:22:34.000000,,,


Let's look at the info for the dataframe.

In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21923 entries, 0 to 21922
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   TreeDepth      21923 non-null  int64  
 1   Plugin         21923 non-null  object 
 2   Description    21923 non-null  object 
 3   Created Date   21923 non-null  object 
 4   Modified Date  3 non-null      object 
 5   Accessed Date  0 non-null      float64
 6   Changed Date   0 non-null      float64
dtypes: float64(2), int64(1), object(4)
memory usage: 1.2+ MB


It looks like every entry has a Created Date, but none of them have an Accessed Date or a Changed Date, and only 3 have a Modified Date. Let's use Created Date as a first pass to try and get this data into Timesketch.

The first thing is to create a datetime field that contains the timestamp. We will use the built-in conversion in pandas (borrowed this from the Googlers).


In [65]:
df['datetime'] = pd.to_datetime(df['Created Date'])
df['datetime'][55]

Timestamp('2020-09-19 01:22:36')

The next thing is to add a few fields that Timesketch expects:

In [68]:
df['data_type'] = 'volatility3:timeline:Timeliner:record'
df['timestamp_desc'] = 'Created Date'
df['message'] = 'Volatility3: [' + df['Plugin'] + ' - ' + df['Description'] + ']'

df.head(3)

Unnamed: 0,TreeDepth,Plugin,Description,Created Date,Modified Date,Accessed Date,Changed Date,datetime,data_type,timestamp_desc,message
0,0,DllList,DLL Load: Process 400 ServerManager. Loaded <v...,1762-05-05 04:53:46.000000,,,,1762-05-05 04:53:46,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [DllList - DLL Load: Process 400 ...
1,0,SymlinkScan,Symlink: DosDevices -> \??,2020-09-19 01:22:34.000000,,,,2020-09-19 01:22:34,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [SymlinkScan - Symlink: DosDevice...
2,0,SymlinkScan,Symlink: Global -> \GLOBAL??,2020-09-19 01:22:34.000000,,,,2020-09-19 01:22:34,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [SymlinkScan - Symlink: Global ->...
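Since the same steps get repeated for every memory image, they could be wrapped in a small function. The sketch below is one possible way to do that. The name `prepare_for_timesketch` is made up, and using `errors="coerce"` and dropping unparseable timestamps is a choice made for this sketch rather than something Timesketch requires. The sample rows are invented for illustration.

```python
import pandas as pd


def prepare_for_timesketch(frame, date_column="Created Date"):
    """Return a new frame with the datetime, data_type, timestamp_desc and message columns.

    Hypothetical helper: rows whose timestamp cannot be parsed are dropped.
    """
    out = frame.copy()
    # Timestamps that cannot be parsed become NaT instead of raising.
    out["datetime"] = pd.to_datetime(out[date_column], errors="coerce")
    out["data_type"] = "volatility3:timeline:Timeliner:record"
    out["timestamp_desc"] = date_column
    out["message"] = "Volatility3: [" + out["Plugin"] + " - " + out["Description"] + "]"
    # Keep only the rows that ended up with a usable timestamp.
    return out[out["datetime"].notna()].reset_index(drop=True)


sample = pd.DataFrame({
    "Plugin": ["SymlinkScan", "PsList"],
    "Description": ["Symlink: DosDevices -> \\??", "Process: 4 System"],
    "Created Date": ["2020-09-19 01:22:34.000000", "not a timestamp"],
})
prepared = prepare_for_timesketch(sample)
print(prepared[["datetime", "timestamp_desc", "message"]])
```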




Get a copy of the Timesketch client. This will prompt for information about where to find Timesketch.

In [126]:
ts_client = config.get_client(confirm_choices=True)


Want to change the value for "host_uri" [http://ec2-18-191-32-121.us-east-2.compute.amazonaws.com/] [y/N]: N
Want to change the value for "client_id" [] [y/N]: N
Want to change the value for "client_secret" [] [y/N]: N
Want to change the value for "auth_mode" [userpass] [y/N]: N
Want to change credentials? [y/N]: N


List sketches

In [72]:
for sketch in ts_client.list_sketches():
    print(f"sketch id:{sketch.id} sketch name:{sketch.name} sketch description:{sketch.description}")

sketch id:1 sketch name:szechuan sketch description:


In [73]:
# @markdown This needs to be changed to reflect the correct sketch.

SKETCH_ID = 1 # @param {type: "integer"}

We will use the importer client to import the data as a data frame. For that we need to set up an import streamer:

In [74]:
sketch = ts_client.get_sketch(SKETCH_ID)
import_helper = helper.ImportHelper() 

with importer.ImportStreamer() as streamer:
  streamer.set_sketch(sketch)
  streamer.set_config_helper(import_helper) 

  streamer.set_timeline_name('volatility3_server_citadeldc01_memory')

  streamer.add_data_frame(df)

Let's add the other memory timeline that we have.

In [78]:
PATH_TO_FOLDER = '/home/ubuntu/jupyter/cases/stolen_szechuan_sauce/'
MEMORY_IMAGE = 'DESKTOP-SDN1RPT.mem'
OUTPUT_FILE = 'DESKTOP-SDN1RPT_mem_timeline.csv'
PATH_TO_MEMORY_IMAGE = os.path.join(PATH_TO_FOLDER, MEMORY_IMAGE)
PATH_TO_CSV = os.path.join(PATH_TO_FOLDER, OUTPUT_FILE)
print(PATH_TO_CSV)
print(PATH_TO_MEMORY_IMAGE)

/home/ubuntu/jupyter/cases/stolen_szechuan_sauce/DESKTOP-SDN1RPT_mem_timeline.csv
/home/ubuntu/jupyter/cases/stolen_szechuan_sauce/DESKTOP-SDN1RPT.mem


This seems to run faster when run outside of Jupyter.

!/home/ubuntu/jupyter/volatility3/vol.py -r csv -f /home/ubuntu/jupyter/cases/stolen_szechuan_sauce/DESKTOP-SDN1RPT.mem timeliner.Timeliner > /home/ubuntu/jupyter/cases/stolen_szechuan_sauce/DESKTOP-SDN1RPT_mem_timeline.csv

In [115]:
df = None
with codecs.open(PATH_TO_CSV, 'r', encoding='utf-8', errors='replace') as fh:
  df = pd.read_csv(fh, error_bad_lines=False)

print(df.shape)

(432, 7)


In [116]:
df.head()

Unnamed: 0,TreeDepth,Plugin,Description,Created Date,Modified Date,Accessed Date,Changed Date
0,0,PsList,Process: 4096 (209519118647424),1601-01-01 00:00:35.000000,1601-01-01 00:02:34.000000,,
1,0,PsList,Process: 4096 (209519118647424),1601-01-01 00:00:35.000000,,,
2,0,PsList,Process: 4096 (209519074865280),1601-01-01 00:00:53.000000,1601-01-01 00:01:25.000000,,
3,0,PsList,Process: 4096 (209519074865280),1601-01-01 00:00:53.000000,,,
4,0,PsList,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,


In [117]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 432 entries, 0 to 431
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   TreeDepth      432 non-null    int64  
 1   Plugin         432 non-null    object 
 2   Description    432 non-null    object 
 3   Created Date   432 non-null    object 
 4   Modified Date  13 non-null     object 
 5   Accessed Date  0 non-null      float64
 6   Changed Date   0 non-null      float64
dtypes: float64(2), int64(1), object(4)
memory usage: 23.8+ KB


In [83]:
df['datetime'] = pd.to_datetime(df['Created Date'])


OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1601-01-01 00:00:35

In [87]:
df[df.stack().str.contains("1601-01-01").any(level=0)]

Unnamed: 0,TreeDepth,Plugin,Description,Created Date,Modified Date,Accessed Date,Changed Date
0,0,PsList,Process: 4096 (209519118647424),1601-01-01 00:00:35.000000,1601-01-01 00:02:34.000000,,
1,0,PsList,Process: 4096 (209519118647424),1601-01-01 00:00:35.000000,,,
2,0,PsList,Process: 4096 (209519074865280),1601-01-01 00:00:53.000000,1601-01-01 00:01:25.000000,,
3,0,PsList,Process: 4096 (209519074865280),1601-01-01 00:00:53.000000,,,


In [92]:
df_filtered = df[df["CreatedDate"] != 1601-01-01 00:00:53.000000]

SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers (<ipython-input-92-72ff143f586a>, line 1)
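The SyntaxError comes from the timestamp not being quoted, so Python tries to read it as arithmetic on numbers. The column is also called "Created Date", with a space. Because the column holds strings at this point, a string comparison should work. A small sketch on made-up rows:

```python
import pandas as pd

# Made-up rows that mimic the "Created Date" strings in the Timeliner CSV.
sample = pd.DataFrame({
    "Plugin": ["PsList", "PsList", "PsScan"],
    "Created Date": [
        "1601-01-01 00:00:35.000000",
        "1601-01-01 00:00:53.000000",
        "2020-09-19 01:24:04.000000",
    ],
})

# Exact match: the timestamp has to be a quoted string.
exact = sample[sample["Created Date"] != "1601-01-01 00:00:53.000000"]

# Prefix match: drop every row whose timestamp starts with 1601-01-01.
no_1601 = sample[~sample["Created Date"].str.startswith("1601-01-01")]

print(len(exact), len(no_1601))
```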

After some searching on Stack Overflow I found this: https://stackoverflow.com/questions/32888124/pandas-out-of-bounds-nanosecond-timestamp-after-offset-rollforward-plus-adding-a. It looks like you can pass errors='coerce' so that out-of-bounds times are changed to NaT, which might be easier to filter out. The example from Stack Overflow looks like this: datetime_variable = pd.to_datetime(datetime_variable, errors = 'coerce')
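Some background on why this happens: a nanosecond-resolution pandas Timestamp can only represent dates from roughly 1677 to 2262, and 1601-01-01 is the zero point of Windows FILETIME values, so these look like near-empty timestamps. The sketch below prints the bounds and tries the coerce option on two made-up values. On pandas versions that parse to nanosecond resolution (the behaviour seen in this notebook), the 1601 value should come back as NaT; newer versions may behave differently.

```python
import pandas as pd

# Range that a nanosecond-resolution Timestamp can represent.
print(pd.Timestamp.min)
print(pd.Timestamp.max)

# Made-up values: one near the FILETIME zero point, one ordinary timestamp.
created = pd.Series([
    "1601-01-01 00:00:35.000000",
    "2020-09-19 01:24:04.000000",
])

# errors="coerce" turns values that cannot be represented into NaT
# instead of raising OutOfBoundsDatetime.
parsed = pd.to_datetime(created, errors="coerce")
print(parsed)
```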

In [119]:
df['datetime'] = pd.to_datetime(df['Created Date'], errors = 'coerce')

Looks like it worked! Now let's see if we can filter by NaT. The "not" in the filter is represented by "~", followed by isna(). So it is saying: keep the rows where the datetime field is not null or NaT.

In [120]:
df[~df["datetime"].isna()]


Unnamed: 0,TreeDepth,Plugin,Description,Created Date,Modified Date,Accessed Date,Changed Date,datetime
4,0,PsList,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04
5,0,PsList,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04
6,0,PsScan,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04
7,0,PsScan,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04
8,0,PsList,Process: 4 System (209518991011904),2020-09-19 01:24:07.000000,,,,2020-09-19 01:24:07
...,...,...,...,...,...,...,...,...
427,0,PsScan,Process: 6544 FTK Imager.exe (209519090503808),2020-09-19 05:09:56.000000,,,,2020-09-19 05:09:56
428,0,PsScan,Process: 4868 backgroundTask (209519119593600),2020-09-19 05:10:08.000000,,,,2020-09-19 05:10:08
429,0,PsScan,Process: 4868 backgroundTask (209519119593600),2020-09-19 05:10:08.000000,,,,2020-09-19 05:10:08
430,0,PsScan,Process: 3696 RuntimeBroker. (209519118319744),2020-09-19 05:10:09.000000,,,,2020-09-19 05:10:09
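For what it's worth, there are a couple of other spellings of the same filter. A quick sketch on a made-up frame showing that they select the same rows:

```python
import pandas as pd

# Made-up frame with one missing timestamp.
sample = pd.DataFrame({
    "Plugin": ["PsList", "PsScan", "PsScan"],
    "datetime": [
        pd.NaT,
        pd.Timestamp("2020-09-19 01:24:04"),
        pd.Timestamp("2020-09-19 05:10:09"),
    ],
})

with_tilde = sample[~sample["datetime"].isna()]     # "not isna"
with_notna = sample[sample["datetime"].notna()]     # same thing, no negation
with_dropna = sample.dropna(subset=["datetime"])    # drop rows missing a datetime

print(with_tilde.equals(with_notna), with_tilde.equals(with_dropna))
```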


When we searched for 1601 we got back 4 hits. The dataframe has 432 rows in total, so removing our 4 bad timestamps leaves us with 428, which looks right.

In [121]:
df_filtered = df[~df["datetime"].isna()]


In [114]:
df['data_type'] = 'volatility3:timeline:Timeliner:record'
df['timestamp_desc'] = 'Created Date'
df['message'] = 'Volatility3: [' + df['Plugin'] + ' - ' + df['Description'] + ']'

df.head(3)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['data_type'] = 'volatility3:timeline:Timeliner:record'
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['timestamp_desc'] = 'Created Date'
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['message'] = 'Volatility3: [' + df['Plugin'] + ' - ' + df['Description'] + ']'


Unnamed: 0,TreeDepth,Plugin,Description,Created Date,Modified Date,Accessed Date,Changed Date,datetime,data_type,timestamp_desc,message
4,0,PsList,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [PsList - Process: 92 Registry (2...
5,0,PsList,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [PsList - Process: 92 Registry (2...
6,0,PsScan,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [PsScan - Process: 92 Registry (2...


So, some weird warnings, and I don't really know what they mean. Going to plan B: drop the entries in place. Going to rerun some previous cells to put things back first. For reference: df.drop(df[df['Age'] < 25].index, inplace = True), from here: https://www.geeksforgeeks.org/drop-rows-from-the-dataframe-based-on-certain-condition-applied-on-a-column/

In [124]:
df.drop(df[df['datetime'].isna()].index, inplace = True)
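An alternative to dropping rows in place: the messages shown earlier are pandas' SettingWithCopyWarning, which shows up when new columns are assigned to a frame that was produced by filtering another frame (presumably df had been pointed at the filtered frame at that stage). Taking an explicit copy of the filtered rows is one common way to avoid it. A sketch on made-up data:

```python
import pandas as pd

# Made-up frame with one bad (NaT) timestamp.
sample = pd.DataFrame({
    "Plugin": ["PsList", "PsScan"],
    "Description": ["Process: 4096", "Process: 92 Registry"],
    "datetime": [pd.NaT, pd.Timestamp("2020-09-19 01:24:04")],
})

# .copy() makes the filtered frame independent of the original,
# so adding columns to it is unambiguous.
filtered = sample[sample["datetime"].notna()].copy()
filtered["data_type"] = "volatility3:timeline:Timeliner:record"
filtered["message"] = (
    "Volatility3: [" + filtered["Plugin"] + " - " + filtered["Description"] + "]"
)

print(filtered)
# The original frame is left untouched.
print("data_type" in sample.columns)
```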

In [125]:
df['data_type'] = 'volatility3:timeline:Timeliner:record'
df['timestamp_desc'] = 'Created Date'
df['message'] = 'Volatility3: [' + df['Plugin'] + ' - ' + df['Description'] + ']'

df.head(3)

Unnamed: 0,TreeDepth,Plugin,Description,Created Date,Modified Date,Accessed Date,Changed Date,datetime,data_type,timestamp_desc,message
4,0,PsList,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [PsList - Process: 92 Registry (2...
5,0,PsList,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [PsList - Process: 92 Registry (2...
6,0,PsScan,Process: 92 Registry (209518991138944),2020-09-19 01:24:04.000000,,,,2020-09-19 01:24:04,volatility3:timeline:Timeliner:record,Created Date,Volatility3: [PsScan - Process: 92 Registry (2...


That seemed to work. No more warnings.

In [127]:
for sketch in ts_client.list_sketches():
    print(f"sketch id:{sketch.id} sketch name:{sketch.name} sketch description:{sketch.description}")

sketch id:1 sketch name:szechuan sketch description:


In [128]:
# @markdown This needs to be changed to reflect the correct sketch.

SKETCH_ID = 1 # @param {type: "integer"}

In [129]:
sketch = ts_client.get_sketch(SKETCH_ID)
import_helper = helper.ImportHelper() 

with importer.ImportStreamer() as streamer:
  streamer.set_sketch(sketch)
  streamer.set_config_helper(import_helper) 

  streamer.set_timeline_name('volatility3_desktop_SDN1RPT_memory')

  streamer.add_data_frame(df)