# Malware Simulation Demo / Happy Holidays

. | .
- | - 
![alt](gift.png) | ![alt](Element-AI-Symbol-RGB.jpg)

This notebook shows how to run a simple ITSim simulation and collect its telemetry data from a datastore server. It then gives a quick overview of the generated data by displaying the telemetry records and plotting the events chronologically.


## Step 1:  Launch a Datastore Server to collect simulation data

This server collects ITSim telemetry and logs over a REST API and archives them in a SQLite database. It keeps running for the duration of the simulations. It could also be used to collect data from simulations running simultaneously on multiple machines.

In [1]:
import os

DB_FILE = "malware_simulation_1.sqlite"
HOSTNAME = "localhost"
PORT = "5000"

os.system(f'python ../bin/itsim_serve_datastore.py --sqlite_file {DB_FILE} --host {HOSTNAME} --port {PORT} &')

0
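
Because `os.system(... &)` returns as soon as the process is launched, the server may not be accepting connections yet when the next cell runs. A minimal sketch of a readiness check follows. It assumes only that the server listens on a TCP port; `wait_for_port` is a hypothetical helper, not part of ITSim.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening on the port,
    not that the datastore API itself is healthy.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is accepting connections.
            with socket.create_connection((host, int(port)), timeout=interval):
                return True
        except OSError:
            # Nothing listening yet (or connection refused): wait and retry.
            time.sleep(interval)
    return False
```

Calling `wait_for_port(HOSTNAME, PORT)` before Step 2 would delay the simulations until the port is open, or return `False` if the server never came up.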

## Step 2:  Run ITSim Simulations

### Define Simulation Parameters

In [4]:
import ipywidgets as widgets
from IPython.display import display
from ipywidgets import interact, interactive, fixed, interact_manual

def set_simulation_parameters(sim_runs, network_topology,nb_endpoints, backdoor_random_start):
    return {'sim_runs':sim_runs,
            'network_topology':network_topology,
            'nb_endpoints':nb_endpoints,
            'backdoor_random_start':backdoor_random_start}
            
style={'description_width': 'initial'}

sim_runs_widget = widgets.IntSlider(value=4,
                                    min=1,
                                    max=10,
                                    description='Simulations:',
                                    style=style)

network_topology_widget = widgets.Dropdown(options=['Flat + Internet'],
                                           value='Flat + Internet',
                                           description='Topology:',
                                           style=style)

nb_endpoints_widget = widgets.IntSlider(value=20,
                                        min=2,
                                        max=65000,
                                        description='Endpoints:',
                                        style=style)

backdoor_random_start_widget = widgets.IntRangeSlider(value=[15, 45],
                                                   min=1,
                                                   max=100,
                                                   description='Backdoor Uni(a,b):',
                                                   style=style)

sim = interactive(set_simulation_parameters, 
                sim_runs=sim_runs_widget, 
                network_topology=network_topology_widget,
                nb_endpoints=nb_endpoints_widget,
                backdoor_random_start = backdoor_random_start_widget)
display(sim)


interactive(children=(IntSlider(value=4, description='Simulations:', max=10, min=1, style=SliderStyle(descript…
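
The `Backdoor Uni(a,b)` slider selects the bounds of a uniform distribution. The placeholder `run_simulations` function below only prints this range, so the following is a sketch of how one value per simulation could be drawn from it. The helper name `draw_backdoor_starts` and the `seed` argument are assumptions for illustration, and the unit of the drawn values is not specified in this notebook.

```python
import random


def draw_backdoor_starts(sim_runs, backdoor_random_start, seed=None):
    """Draw one value per simulation from Uniform(a, b). Hypothetical helper."""
    a, b = backdoor_random_start
    rng = random.Random(seed)  # dedicated generator so a seed gives repeatable draws
    return [rng.uniform(a, b) for _ in range(sim_runs)]


# Same shape as the dictionary returned by set_simulation_parameters
params = {'sim_runs': 4, 'backdoor_random_start': (15, 45)}
starts = draw_backdoor_starts(params['sim_runs'], params['backdoor_random_start'], seed=0)
print(starts)
```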

### Run Simulations

In [5]:
import itsim
import random
from uuid import uuid4
from itsim.schemas.items import create_json_node, create_json_network_event
from itsim.datastore.datastore import DatastoreRestClient
from itsim.time import now_iso8601
from itsim.simulator import Simulator
from time import sleep

In [6]:
# MEANT TO BE REPLACED BY LATEST MALWARE SIMULATION
def run_simulations(sim):
    print(f"Running {sim.result['sim_runs']} simulations based on the following configuration:")
    print(f"\t- Network Topology: \t\t\t\t\t{sim.result['network_topology']}")
    print(f"\t- Number of servers: \t\t\t\t\t1")
    print(f"\t- Number of Endpoints: \t\t\t\t\t{sim.result['nb_endpoints']}")
    print(f"\t- Random backdoor callback based on distribution: \tUniform({sim.result['backdoor_random_start'][0]},{sim.result['backdoor_random_start'][1]})")

    bar = widgets.IntProgress(value=0,
                              min=0,
                              max=sim.result['sim_runs'],
                              step=1,
                              description='Progress:',
                              bar_style='info', 
                              orientation='horizontal')
    display(bar)
    for _ in range(sim.result['sim_runs']):
        sim_uuid = uuid4()
        network_events = []
        nb_events = 2
        event_uuids = [uuid4() for _ in range(nb_events)]
        node_uuids = [uuid4() for _ in range(nb_events)]

        for uuid, uuid_node, network_event_type, src, dst in [
            (event_uuids[0], node_uuids[0], 'open', ['192.168.1.1', 64], ['192.168.11.200', 72]),
            (event_uuids[1], node_uuids[1], 'open', ['192.168.1.111', 64], ['192.168.11.20', 72]),
            (event_uuids[0], node_uuids[0], 'close', ['192.168.1.1', 64], ['192.168.11.200', 72]),
            (event_uuids[1], node_uuids[1], 'close', ['192.168.1.111', 64], ['192.168.11.20', 72])
        ]:
            network_events.append(
                create_json_network_event(
                    sim_uuid=sim_uuid,
                    timestamp=now_iso8601(),
                    uuid=uuid,
                    uuid_node=uuid_node,
                    network_event_type=network_event_type,
                    protocol='UDP',
                    pid=32145,
                    src=src,
                    dst=dst))
            sleep(random.uniform(0, 1))

        datastore = DatastoreRestClient(sim_uuid=sim_uuid)

        for event in network_events:
            datastore.store_item(event)

        item_type = 'network_event'

        for i in range(nb_events):
            event = datastore.load_item(item_type, event_uuids[i])
            assert event.uuid_node == str(node_uuids[i])
    
        bar.value+=1 



In [7]:
run_simulations(sim)

Running 4 simulations based on the following configuration:
	- Network Topology: 					Flat + Internet
	- Number of servers: 					1
	- Number of Endpoints: 					20
	- Random backdoor callback based on distribution: 	Uniform(15,45)


IntProgress(value=0, bar_style='info', description='Progress:', max=4)

http://0.0.0.0:5000/isrunning/6348c4d3-e71e-4148-bef8-e1b426d0493c
http://0.0.0.0:5000/isrunning/1c466610-66c8-40da-b388-65a35dce0989
http://0.0.0.0:5000/isrunning/e2c30828-3cf5-4e42-82b1-9f18e81dea83
http://0.0.0.0:5000/isrunning/9fdc41e4-f995-4096-ac2d-7010b393d191


## Step 3: Shutting down the Datastore Server
We're done with the datastore server, so we can shut it down.

In [8]:
import requests

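try:
    response = requests.post(f'http://{HOSTNAME}:{PORT}/stop')
    # Report success only if the server answered with a status code below 400
    if response.ok:
        print("Server properly shutdown")
    else:
        print(f"Server refused to shut down (HTTP {response.status_code})")
except requests.exceptions.RequestException:
    print("Can't reach the server to shut it down")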

Server properly shutdown


## Step 4: Retrieve Simulation Telemetry

Now that the simulations have run to completion, we can access the data collected by the datastore (SQLite database).

The datastore stores telemetry events as JSON strings in a SQLite database (convenient for quick prototyping). Here is a quick preview of the data using pandas.

In [9]:
import sqlite3
import pandas as pd
import json

conn = sqlite3.connect(DB_FILE)
df = pd.read_sql_query("SELECT * FROM network_event;", conn)
df

. | uuid | timestamp | sim_uuid | json
- | - | - | - | -
0 | f1a272f6-3002-4ae0-899b-6b90204ab66b | 2018-12-21T03:20:22.713225 | 3ed5f6bc-7949-4f6b-baad-c34fa1e45001 | {"sim_uuid": "3ed5f6bc-7949-4f6b-baad-c34fa1e4...
1 | 328cec62-0c42-4a76-a435-f7eb8877c79a | 2018-12-21T03:20:22.905655 | 3ed5f6bc-7949-4f6b-baad-c34fa1e45001 | {"sim_uuid": "3ed5f6bc-7949-4f6b-baad-c34fa1e4...
2 | f1a272f6-3002-4ae0-899b-6b90204ab66b | 2018-12-21T03:20:23.269817 | 3ed5f6bc-7949-4f6b-baad-c34fa1e45001 | {"sim_uuid": "3ed5f6bc-7949-4f6b-baad-c34fa1e4...
3 | 328cec62-0c42-4a76-a435-f7eb8877c79a | 2018-12-21T03:20:24.037552 | 3ed5f6bc-7949-4f6b-baad-c34fa1e45001 | {"sim_uuid": "3ed5f6bc-7949-4f6b-baad-c34fa1e4...
4 | 34eb2eb3-7a57-4cd3-8555-0b172869adac | 2018-12-21T03:35:06.879077 | 3f33471d-d825-4f7c-8b6a-c9f51b1e6995 | {"sim_uuid": "3f33471d-d825-4f7c-8b6a-c9f51b1e...
5 | 938cd78e-b261-4040-a0d1-45cf5456c576 | 2018-12-21T03:35:07.835870 | 3f33471d-d825-4f7c-8b6a-c9f51b1e6995 | {"sim_uuid": "3f33471d-d825-4f7c-8b6a-c9f51b1e...
6 | 34eb2eb3-7a57-4cd3-8555-0b172869adac | 2018-12-21T03:35:07.861082 | 3f33471d-d825-4f7c-8b6a-c9f51b1e6995 | {"sim_uuid": "3f33471d-d825-4f7c-8b6a-c9f51b1e...
7 | 938cd78e-b261-4040-a0d1-45cf5456c576 | 2018-12-21T03:35:08.502685 | 3f33471d-d825-4f7c-8b6a-c9f51b1e6995 | {"sim_uuid": "3f33471d-d825-4f7c-8b6a-c9f51b1e...
8 | b9e53a8f-024c-4701-9e23-8042cbabf71f | 2018-12-21T03:47:44.598694 | 5283d0a8-cf01-4e33-ae3e-fab64c537bfc | {"sim_uuid": "5283d0a8-cf01-4e33-ae3e-fab64c53...
9 | 7ec46ca3-1b8b-44c9-a4fb-a76fbf1d11d3 | 2018-12-21T03:47:45.407743 | 5283d0a8-cf01-4e33-ae3e-fab64c537bfc | {"sim_uuid": "5283d0a8-cf01-4e33-ae3e-fab64c53...
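
The storage layout shown in this preview (one row per event, with the full record kept as a JSON string) can be reproduced in memory to see the round trip. This is a sketch under assumptions: the column names come from the preview, but the column types and the `CREATE TABLE` statement are guesses, and the real datastore schema may differ. The sample record copies the fields of the first telemetry event of this notebook.

```python
import json
import sqlite3

event = {
    "sim_uuid": "3ed5f6bc-7949-4f6b-baad-c34fa1e45001",
    "timestamp": "2018-12-21T03:20:22.713225",
    "type": "network_event",
    "uuid": "f1a272f6-3002-4ae0-899b-6b90204ab66b",
    "uuid_node": "d3316a63-3c3f-437a-8639-9bd2e2091797",
    "network_event_type": "open",
    "protocol": "UDP",
    "pid": 32145,
    "src": ["192.168.1.1", 64],
    "dst": ["192.168.11.200", 72],
}

demo_conn = sqlite3.connect(":memory:")
# Assumed schema: column names taken from the preview, all stored as TEXT.
demo_conn.execute(
    "CREATE TABLE network_event (uuid TEXT, timestamp TEXT, sim_uuid TEXT, json TEXT)"
)
# The whole record is serialized into the json column; the other columns
# duplicate a few fields so that rows can be filtered without parsing JSON.
demo_conn.execute(
    "INSERT INTO network_event VALUES (?, ?, ?, ?)",
    (event["uuid"], event["timestamp"], event["sim_uuid"], json.dumps(event)),
)

row = demo_conn.execute(
    "SELECT json FROM network_event WHERE sim_uuid = ?", (event["sim_uuid"],)
).fetchone()
decoded = json.loads(row[0])
print(decoded["network_event_type"], decoded["src"])
demo_conn.close()
```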


From this dataframe, we can load the JSON data and list the fields of each telemetry event:

In [10]:
simulations = df.sim_uuid.unique()
print(f"Telemetry events for {len(simulations)} simulations")

# Decode the JSON payload of every event, grouped by simulation.
# The loop variable is not named `sim`, so that it does not overwrite
# the widget object defined in Step 2.
simulations_data = {}
for sim_id in simulations:
    rows = df.loc[df['sim_uuid'] == sim_id]
    simulations_data[sim_id] = [json.loads(payload) for payload in rows['json']]

for sim_uuid, telemetry_list in simulations_data.items():
    print(f"\nSimulation {sim_uuid}\n")
    for telemetry in telemetry_list:
        print(f"Telemetry {telemetry['uuid']}")
        for key, value in telemetry.items():
            print(f"\t{key}: {value}")


Telemetry events for 29 simulations

Simulation 3ed5f6bc-7949-4f6b-baad-c34fa1e45001

Telemetry f1a272f6-3002-4ae0-899b-6b90204ab66b
	sim_uuid: 3ed5f6bc-7949-4f6b-baad-c34fa1e45001
	timestamp: 2018-12-21T03:20:22.713225
	type: network_event
	uuid: f1a272f6-3002-4ae0-899b-6b90204ab66b
	uuid_node: d3316a63-3c3f-437a-8639-9bd2e2091797
	network_event_type: open
	protocol: UDP
	pid: 32145
	src: ['192.168.1.1', 64]
	dst: ['192.168.11.200', 72]
Telemetry 328cec62-0c42-4a76-a435-f7eb8877c79a
	sim_uuid: 3ed5f6bc-7949-4f6b-baad-c34fa1e45001
	timestamp: 2018-12-21T03:20:22.905655
	type: network_event
	uuid: 328cec62-0c42-4a76-a435-f7eb8877c79a
	uuid_node: 575c83eb-536d-4e0a-9c69-ab4fb3ac608f
	network_event_type: open
	protocol: UDP
	pid: 32145
	src: ['192.168.1.111', 64]
	dst: ['192.168.11.20', 72]
Telemetry f1a272f6-3002-4ae0-899b-6b90204ab66b
	sim_uuid: 3ed5f6bc-7949-4f6b-baad-c34fa1e45001
	timestamp: 2018-12-21T03:20:23.269817
	type: network_event
	uuid: f1a272f6-3002-4ae0-899b-6b90204ab66b
	u
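
Printing every field of every event gets long once the database holds many simulations. A more compact view, sketched here, counts event types per simulation. `sample_data` is a stand-in with the same shape as `simulations_data`, holding two abbreviated records from the listing above, and `summarize` is a hypothetical helper.

```python
from collections import Counter

# Stand-in for simulations_data: {sim_uuid: [decoded telemetry dicts]}
sample_data = {
    "3ed5f6bc-7949-4f6b-baad-c34fa1e45001": [
        {"uuid": "f1a272f6-3002-4ae0-899b-6b90204ab66b",
         "network_event_type": "open", "protocol": "UDP"},
        {"uuid": "328cec62-0c42-4a76-a435-f7eb8877c79a",
         "network_event_type": "open", "protocol": "UDP"},
    ],
}


def summarize(simulations_data):
    """Map each simulation UUID to a Counter of its network_event_type values."""
    return {
        sim_uuid: Counter(t["network_event_type"] for t in telemetry_list)
        for sim_uuid, telemetry_list in simulations_data.items()
    }


summary = summarize(sample_data)
for sim_uuid, counts in summary.items():
    print(sim_uuid, dict(counts))
```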

## Step 5: Plot Telemetry Data
Here is a quick-and-dirty approach to plotting telemetry events chronologically:

In [11]:
import json
import random
from plotly import __version__
from plotly.offline import init_notebook_mode, iplot
import plotly.figure_factory as ff
print(__version__) # requires version >= 1.9.0

def plot_simulation_telemetry_events(conn, sim_uuid):
    # Column positions in the network_event table: uuid, timestamp, sim_uuid, json
    node_timestamp_idx = 1
    node_json_idx = 3

    print(f"Simulation: {sim_uuid}")
    with conn:
        cursor = conn.cursor()
        # Bind sim_uuid as a query parameter instead of formatting it into the SQL string
        cursor.execute('SELECT * FROM network_event WHERE sim_uuid = ?', (str(sim_uuid),))
        all_entries = cursor.fetchall()

    # For each node, the first event seen sets the start of the connection
    # and every later event overwrites its stop time.
    connections = {}
    for network_event in all_entries:
        # The payload is JSON, so parse it with json.loads rather than ast.literal_eval
        data = json.loads(network_event[node_json_idx])
        event_dict = connections.setdefault(data["uuid_node"], {"start": "", "stop": ""})
        if event_dict["start"] == "":
            event_dict["start"] = network_event[node_timestamp_idx]
        else:
            event_dict["stop"] = network_event[node_timestamp_idx]

    df = []
    colors = []
    for uuid_node, event_dict in connections.items():
        # Compare by value (!=); `is not ""` tests identity and is unreliable for strings
        if event_dict["start"] != "" and event_dict["stop"] != "":
            colors.append((random.uniform(0, 1), random.uniform(0, 1), random.uniform(0, 1)))
            df.append(dict(Task=uuid_node, Start=event_dict["start"], Finish=event_dict["stop"], Event_Name="connection"))
    init_notebook_mode(connected=True)
    fig = ff.create_gantt(df, title="ITsim Network Events", colors=colors, index_col='Event_Name', showgrid_x=True, showgrid_y=True, show_colorbar=True, group_tasks=True)
    iplot(fig)


3.4.2
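
The pairing rule used by the plotting function (first event for a key is the start, any later event is the stop) can be checked without Plotly by computing connection durations. This is a sketch: `connection_durations` is a hypothetical helper, and the sample rows are keyed by the event `uuid` column of the dataframe preview, whereas the plotting function keys by `uuid_node` from the JSON payload.

```python
from datetime import datetime

# (key, timestamp) pairs copied from the first four rows of the dataframe preview
sample_rows = [
    ("f1a272f6-3002-4ae0-899b-6b90204ab66b", "2018-12-21T03:20:22.713225"),
    ("328cec62-0c42-4a76-a435-f7eb8877c79a", "2018-12-21T03:20:22.905655"),
    ("f1a272f6-3002-4ae0-899b-6b90204ab66b", "2018-12-21T03:20:23.269817"),
    ("328cec62-0c42-4a76-a435-f7eb8877c79a", "2018-12-21T03:20:24.037552"),
]


def connection_durations(rows):
    """Return {key: seconds between the first and the last event seen for that key}.

    Keys with a single event have no stop time and are skipped.
    """
    spans = {}
    for key, timestamp in rows:
        ts = datetime.fromisoformat(timestamp)
        if key not in spans:
            spans[key] = [ts, None]   # first event: start
        else:
            spans[key][1] = ts        # later event: overwrite stop
    return {
        key: (stop - start).total_seconds()
        for key, (start, stop) in spans.items()
        if stop is not None
    }


durations = connection_durations(sample_rows)
for key, seconds in durations.items():
    print(key, seconds)
```

Note that this rule relies on the rows arriving in chronological order, which the sample rows satisfy.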


We now call this function for every simulation that ran:

In [38]:
for sim_uuid in simulations:
    plot_simulation_telemetry_events(conn, sim_uuid)

Simulation: 3ed5f6bc-7949-4f6b-baad-c34fa1e45001


Simulation: 3f33471d-d825-4f7c-8b6a-c9f51b1e6995


Simulation: 5283d0a8-cf01-4e33-ae3e-fab64c537bfc


Simulation: e635d951-b168-4c3f-9ca3-eb160858353d


Simulation: c17fc957-57ee-4fe6-a669-a0b04cd46413


Simulation: 61d7696e-b11b-4c99-ae1b-a08cd63c5b88


Simulation: 86373129-bb41-4a65-8a96-fb9855bc771c


Simulation: 5f3d4aef-dee9-4646-a1db-1e838d77fd47


Simulation: 036b157e-a2ea-4c25-8c3d-c3426c9cd947


Simulation: 38b5f4ab-9d26-447b-93cb-788b73a8b4b4


Simulation: 44c5d023-0296-42ea-aac8-f4da26ea51ea


Simulation: 4908c305-bb78-43bc-ba8d-3c9235d50a59


Simulation: 5ee3290d-192f-491d-86d1-4b7b20b2e4ef


Simulation: 223135d7-6dea-4f00-b84a-f815f67004b7


Simulation: 8581a7e7-8f23-4d64-b01b-b3193a25b7e6


Simulation: 1e88f33d-65fb-4f6d-93c9-8fd622ec251a


Simulation: 5027b3ab-840c-4445-9cbe-e8cee03aee46


Simulation: 16e77512-4f34-4652-a838-0a620aef8125


Simulation: 1be48a4f-fb4f-4fd6-a7eb-8cd6985a029b


Simulation: ec76af68-4145-45f2-8e46-941b0f917cbe


Simulation: 9a67a7a2-0366-4a96-b638-9299905ca36c


Simulation: dbc3e7c2-04fa-4292-ba7a-f3551d50aba1


Simulation: e3530951-b82c-42a4-8492-5822a411248d


Simulation: 3e11480c-e13e-439e-863c-9ace1653342f


Simulation: 7f9a7d25-4e9c-42de-b6a0-6f800b172f8b


# THANKS!!
