# MemProcFS memory forensics using Jupyter notebook

The [MemProcFS](https://github.com/ufrisk/MemProcFS) Jupyter example notebook showcases how the [MemProcFS Python API](https://github.com/ufrisk/MemProcFS/wiki/API_Python) can be used to perform fast and efficient memory analysis and forensics on memory dump files. The [example notebook](https://github.com/ufrisk/MemProcFS/wiki/API_Python_Jupyter) is not a production-ready notebook.

MemProcFS is available from PyPI via pip for both Windows and Linux. It analyzes Windows memory, and it can also analyze live memory from PCIe FPGA devices, drivers and virtual machines. For the purposes of this notebook, however, a full memory dump file is recommended.

If you decide to publish your own notebook using MemProcFS, or if you have suggestions for improvement, please let me know on GitHub or [contact me](https://github.com/ufrisk/MemProcFS#Links).

## Install and Import
First, let's install and import the dependencies.

In [None]:
#%%capture
%pip install --upgrade memprocfs colorama matplotlib networkx pandas tqdm pyvis

In [None]:
import memprocfs
import time
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from tqdm import tqdm
from io import StringIO
from pyvis.network import Network
from IPython.display import display, HTML
from colorama import Fore
# Do not truncate outputs from pandas:
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

## Configure Memory Dump file
**Set the full path to the memory dump file you wish to analyze.** Optionally set the path to the page files as well (not required but will improve analysis quality).

In [None]:
memory_image_path_file = "C:/Dumps/warren.mem"
memory_image_path_pagefile = ""
memory_image_path_swapfile = ""

## Initialize MemProcFS
Initialize MemProcFS and run forensic mode. Forensic mode takes a short while to process the entire memory dump file: it runs multiple analysis tasks and generates CSV files, which pandas then uses for data analytics. Let's wait for forensic mode to complete.

In [None]:
# Process arguments
args = ['-device', memory_image_path_file, '-forensic', '1', '-waitinitialize', '-vm']
# list.append() returns None, so extend the list in place instead of chaining calls.
if len(memory_image_path_pagefile) > 0:
    args += ['-pagefile0', memory_image_path_pagefile]
# The swap file is passed with index 1 (index 0 is the page file).
if len(memory_image_path_swapfile) > 0:
    args += ['-pagefile1', memory_image_path_swapfile]

# Initialize MemProcFS
vmm = memprocfs.Vmm(args)

# Wait for forensic mode to complete...
with tqdm(desc="MemProcFS forensic analysis progress",total=100) as pbar:
    while True:
        pbar.n = int(vmm.vfs.read('/forensic/progress_percent.txt'))
        pbar.update(0)
        time.sleep(0.5)
        if pbar.n == 100:
            pbar.disable = True
            break
print(Fore.GREEN + "[✓] MemProcFS forensic mode completed.")
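
The argument handling above can be sketched as a small helper, which makes it easy to check the resulting list before it is passed to `memprocfs.Vmm()`. The function name `build_args` and the example page file path are hypothetical and not part of the MemProcFS API. The sketch assumes the page file is passed as `-pagefile0` and the swap file as `-pagefile1`.

```python
def build_args(image_path, pagefile_path="", swapfile_path=""):
    """Build a MemProcFS argument list (hypothetical helper, not part of the API)."""
    args = ['-device', image_path, '-forensic', '1', '-waitinitialize', '-vm']
    # Lists are extended in place; list.append() returns None and cannot be chained.
    if pagefile_path:
        args += ['-pagefile0', pagefile_path]
    if swapfile_path:
        args += ['-pagefile1', swapfile_path]
    return args

# Example path is an assumption for illustration only.
print(build_args("C:/Dumps/warren.mem", "C:/Dumps/pagefile.sys"))
```

The returned list can be passed directly as `memprocfs.Vmm(build_args(...))`.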

## Ingest Data
Ingest CSV data from the MemProcFS virtual file system (VFS) path `/forensic/csv/` into pandas for data analytics. Even if the MemProcFS virtual file system isn't mounted as a virtual drive it is possible to use the API to retrieve these files.

In [None]:
def get_csv(f):
    # by default vmm.vfs.read(str) will read 1MB, since the csv files may be larger allow reads up to 256MB.
    return StringIO(vmm.vfs.read('/forensic/csv/' + f, 0x10000000).decode("utf-8"))

dfdevice = pd.read_csv(get_csv('devices.csv'))
dfdriver = pd.read_csv(get_csv('drivers.csv'))
dfhandle = pd.read_csv(get_csv('handles.csv'))
dfmodule = pd.read_csv(get_csv('modules.csv'))
dfprocess = pd.read_csv(get_csv('process.csv'))
dfservice = pd.read_csv(get_csv('services.csv'))
dfthread = pd.read_csv(get_csv('threads.csv'))
dfunloadedmodule = pd.read_csv(get_csv('unloaded_modules.csv'))
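
The conversion performed by `get_csv()` can be tried without a memory dump. The following is a minimal sketch in which `fake_vfs_read` is a hypothetical stand-in for `vmm.vfs.read()` that returns a few bytes of made-up CSV; the column names are illustrative only.

```python
from io import StringIO
import pandas as pd

def fake_vfs_read(path, size=0x100000):
    # Hypothetical stand-in for vmm.vfs.read(): returns raw bytes, capped at `size`.
    data = b"PID,PPID,Name\n4,0,System\n512,4,smss.exe\n"
    return data[:size]

def csv_to_frame(read, name, max_bytes=0x10000000):
    # Read up to max_bytes, decode the bytes to text, then let pandas parse the text.
    raw = read('/forensic/csv/' + name, max_bytes)
    return pd.read_csv(StringIO(raw.decode('utf-8')))

df = csv_to_frame(fake_vfs_read, 'process.csv')
print(df)
```

Passing the read function as a parameter keeps the parsing step testable on its own; in the notebook, `vmm.vfs.read` would take its place.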

## Suspicious Drivers

The kernel drivers of the analyzed system are already loaded into a dataframe. First, let's show the driver dataframe as-is.

Then let's go hunting for suspicious drivers. We assume that a suspicious driver does not have `SystemRoot` or `System32` in its path.

NB! If a driver called ad_driver.sys is found, it is indeed loaded from a suspicious path, but it belongs to FTK, which was probably the tool used to dump the memory.

In [None]:
dfdriver

In [None]:
mask = ~dfdriver['DriverPath'].str.contains('SystemRoot|System32', case=False).fillna(True)
suspicious = dfdriver[mask][['Name', 'DriverName', 'DriverPath']]

if len(suspicious) == 0:
    print(Fore.GREEN + "[✓] No suspicious drivers.")
else:
    display(HTML(suspicious.to_html()))
    print(Fore.RED + "[!] Suspicious drivers found!")
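
To see how the mask treats each kind of row, here is a small sketch on made-up data. The driver names and paths are invented for illustration. It uses the `na=True` argument of `str.contains` as an alternative to `.fillna(True)`; both treat a missing path as "not suspicious".

```python
import pandas as pd

# Invented rows: one normal driver, one loaded from a temp folder, one without a path.
drivers = pd.DataFrame({
    'Name': ['ACPI', 'evil', 'nopath'],
    'DriverPath': [r'\SystemRoot\System32\drivers\ACPI.sys',
                   r'\??\C:\Temp\evil.sys',
                   None],
})

# True where the path looks normal; missing paths count as normal via na=True.
trusted = drivers['DriverPath'].str.contains('SystemRoot|System32', case=False, na=True)
suspicious = drivers[~trusted]
print(suspicious['Name'].tolist())
```

Drivers without a recorded path are silently skipped by this rule, so they may deserve a separate look.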

## Open file handles in Word

Word may have open file handles to interesting files such as autosave files. These files may be recovered from memory.

Join the process and handle dataframes on PID. Then filter on the Word process and on file handles whose description contains the text AppData.

These files may be downloaded from the MemProcFS path `/<pid>/files/handles/<object-address>-<filename>`, using either the `vmm.vfs.read()` function shown above in the CSV import or the mounted MemProcFS file system.

In [None]:
df_word = pd.merge(dfprocess, dfhandle, on="PID")[['PID', 'Name', 'Object', 'Type', 'Description']].dropna()
df_word = df_word[ df_word['Name'].str.contains('word', case=False) ]
df_word = df_word[ df_word['Type'].str.match('File') ]
df_word = df_word[ df_word['Description'].str.contains('AppData', case=False) ]

if len(df_word) == 0:
    print(Fore.GREEN + "[✓] No Word process with open handles in AppData found.")
else:
    display(HTML(df_word.to_html()))
    print(Fore.YELLOW + "[!] Interesting open files in the Word process found!")


## Process Relationships #1

The MemProcFS process CSV file does not contain the parent process name, only the PPID. Let's perform a left outer join to add the ParentName.

Let's display a simplified version of the process relationship table with some additional information.

In [None]:
df_proc2_ppid = dfprocess[['PID', 'Name']].rename(columns={'PID': 'P_PID', 'Name': 'ParentName'})
df_proc2 = dfprocess.merge(df_proc2_ppid, left_on='PPID', right_on='P_PID', how='left')
df_proc2 = df_proc2[['PID', 'Name', 'PPID', 'ParentName', 'User', 'UserPath', 'KernelPath']].fillna('---')

df_proc2

## Process Relationships #2

Process relationships may be visualized in a directed graph. Let's visualize the process relationships in a pyvis graph. Here we'll just use the normal process dataframe.

NB! this does not take reused PIDs into account!

In [None]:
net = Network(directed=True, cdn_resources='remote', notebook=True, height=800, width=800)
for index, row in dfprocess.iterrows():
    net.add_node(row['PID'], row['Name'] + '(' + str(row['PID']) + ')')
for index, row in dfprocess.iterrows():
    net.add_node(row['PPID'], '(' + str(row['PPID']) + ')')
for index, row in dfprocess.iterrows():
    try:
        net.add_edge(row['PID'], row['PPID'])
    except Exception:
        # Skip edges that pyvis rejects rather than aborting the whole graph.
        pass
net.show('_pyvis.html')
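
When a browser view is not available, the same PID/PPID relationships can be printed as an indented text tree. This is a minimal sketch on invented rows; `format_tree` is a hypothetical helper. Like the graph above, it does not account for reused PIDs, apart from a guard that prevents endless loops.

```python
from collections import defaultdict

# Invented (PID, PPID, Name) rows; the last parent (4321) is not in the table.
rows = [(4, 0, 'System'), (512, 4, 'smss.exe'),
        (700, 512, 'csrss.exe'), (900, 4321, 'orphan.exe')]

def format_tree(rows):
    names = {pid: name for pid, _, name in rows}
    children = defaultdict(list)
    for pid, ppid, _ in rows:
        if ppid != pid:
            children[ppid].append(pid)
    # A root is a process whose parent is unknown or is the process itself.
    roots = [pid for pid, ppid, _ in rows if ppid not in names or ppid == pid]
    lines, seen = [], set()

    def walk(pid, depth):
        if pid in seen:  # guard against cycles caused by reused PIDs
            return
        seen.add(pid)
        lines.append('  ' * depth + '%s (%d)' % (names[pid], pid))
        for child in sorted(children[pid]):
            walk(child, depth + 1)

    # Roots first, then anything left over (for example a PID cycle).
    for pid in sorted(roots) + sorted(names):
        walk(pid, 0)
    return lines

tree_lines = format_tree(rows)
print('\n'.join(tree_lines))
```

In the notebook, the rows could be taken from `dfprocess[['PID', 'PPID', 'Name']].itertuples(index=False)`.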

## Loaded Modules / DLLs

Show loaded DLLs for which the company name can be retrieved from the PE VersionInfo. Microsoft modules are excluded, so only non-Microsoft loaded modules are shown.

In [None]:
df_noms = dfmodule[['PID', 'Process', 'Name', 'VerCompanyName', 'Path']]

df_noms = df_noms[ ~df_noms['VerCompanyName'].str.contains('microsoft', case=False).fillna(True) ]
df_noms

## API: Physical Memory Map

The physical memory map describes at which physical memory address ranges the actual physical RAM available to the operating system is located. MemProcFS retrieves it from the kernel. Check out the guide for info about [vmm.maps.memmap()](https://github.com/ufrisk/MemProcFS/wiki/API_Python_Base).

In [None]:
memmap = vmm.maps.memmap()
for memrange in memmap:
    print("%10x -> %10x" % (memrange[0], memrange[0] +  memrange[1] - 1))
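
The map can also be summarized, for example to compare the amount of RAM with the size of the dump file. The sketch below uses invented `[base, size]` pairs in the same shape that the loop above indexes (`memrange[0]`, `memrange[1]`), and it assumes the ranges do not overlap. `summarize` is a hypothetical helper.

```python
# Invented [base, size] pairs, shaped like the entries indexed in the loop above.
memmap = [[0x1000, 0x9f000], [0x100000, 0x3ff00000]]

def summarize(memmap):
    # Total bytes of RAM described by the map.
    total = sum(size for _, size in memmap)
    # First address past the highest range.
    top = max(base + size for base, size in memmap)
    # Address space below `top` that is not backed by RAM (assumes no overlaps).
    holes = top - total
    return total, top, holes

total, top, holes = summarize(memmap)
print("RAM: %d MiB, top address: %x, holes: %x" % (total // (1024 * 1024), top, holes))
```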

## API: Read 0x100 bytes from kernel32 PE header (in explorer.exe)

This should be the PE file header. Display it as hex. Use the MemProcFS API to achieve this. Check out the guide about the [process](https://github.com/ufrisk/MemProcFS/wiki/API_Python_Process#VmmProcess) and [module](https://github.com/ufrisk/MemProcFS/wiki/API_Python_Process#VmmModule) objects.

In [None]:
explorer = vmm.process('explorer.exe')
kernel32 = explorer.module('kernel32.dll')
memory_peheader = explorer.memory.read(kernel32.base, 0x100)
print('PE header of explorer.exe!kernel32.dll:')
print(vmm.hex(memory_peheader))
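
Beyond the hex dump, the bytes can be checked programmatically. A PE file starts with the DOS signature `MZ`, and the 32-bit value at offset 0x3C (`e_lfanew`) gives the offset of the `PE\0\0` signature, which is followed by the 16-bit machine type. The following sketch validates a hand-built buffer; `describe_pe_header` is a hypothetical helper, and the sample bytes are synthetic, not taken from a real module.

```python
import struct

def describe_pe_header(buf):
    if len(buf) < 0x40 or buf[:2] != b'MZ':
        return 'no MZ signature'
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from('<I', buf, 0x3C)
    if e_lfanew + 6 > len(buf):
        return 'MZ ok, PE header at 0x%x lies outside the buffer' % e_lfanew
    if buf[e_lfanew:e_lfanew + 4] != b'PE\0\0':
        return 'MZ ok, PE signature missing'
    # The machine type is the first field after the 4-byte PE signature.
    (machine,) = struct.unpack_from('<H', buf, e_lfanew + 4)
    return 'PE image, machine=0x%04x' % machine

# Synthetic 0x100-byte header: MZ, e_lfanew=0x80, PE signature, machine 0x8664 (x64).
sample = bytearray(0x100)
sample[0:2] = b'MZ'
struct.pack_into('<I', sample, 0x3C, 0x80)
sample[0x80:0x84] = b'PE\0\0'
struct.pack_into('<H', sample, 0x84, 0x8664)

print(describe_pe_header(bytes(sample)))
```

In the notebook, `describe_pe_header(memory_peheader)` would apply the same checks to the bytes read from kernel32. If the PE header starts beyond the first 0x100 bytes, the function reports that instead of failing.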



## API: List exported functions from kernel32.dll (in explorer.exe)

Show only the first 20 entries.

In [None]:
explorer = vmm.process('explorer.exe')
kernel32 = explorer.module('kernel32.dll')
exports = kernel32.maps.eat()
print("ordinal  address  name                                  forward")
print("===============================================================")
# Print at most 20 entries.
for count, e in enumerate(exports['e']):
    if count == 20: break
    print("%2i  %12x  %-36s  %s" % (e['ord'], e['va'], e['fn'], e['fwdfn']))

## API: Network Connections

The network connections can be retrieved using the [MemProcFS API](https://github.com/ufrisk/MemProcFS/wiki/API_Python_Base).

In [None]:
netmap = vmm.maps.net()
for e in netmap:
    print("%04i: %32s -> %32s" % (e['pid'], e['src-ip'], e['dst-ip']))
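
A common next step is to narrow the list down to connections that leave the local network. The sketch below uses the standard `ipaddress` module on invented entries that carry the same keys as the loop above (`pid`, `src-ip`, `dst-ip`). It assumes the address fields are plain IPv4 or IPv6 strings; `is_external` is a hypothetical helper.

```python
import ipaddress

def is_external(ip_text):
    # Treat anything that does not parse as an address as "not external".
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False
    return not (ip.is_private or ip.is_loopback or ip.is_unspecified
                or ip.is_link_local or ip.is_multicast)

# Invented entries using the same keys as the network map loop above.
netmap = [
    {'pid': 4,    'src-ip': '0.0.0.0',  'dst-ip': '0.0.0.0'},
    {'pid': 1234, 'src-ip': '10.0.0.5', 'dst-ip': '8.8.8.8'},
    {'pid': 1234, 'src-ip': '10.0.0.5', 'dst-ip': '10.0.0.1'},
    {'pid': 880,  'src-ip': '::',       'dst-ip': '::'},
]

external = [e for e in netmap if is_external(e['dst-ip'])]
for e in external:
    print("%04i: %s -> %s" % (e['pid'], e['src-ip'], e['dst-ip']))
```

The resulting PIDs can then be looked up in the process dataframe to see which programs talk to outside hosts.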