# Timesketch and Colab

This small colab demonstrates how to interact with Timesketch from colab in order to explore the data further.

Colab can greatly complement investigations by giving the analyst the power of Python to manipulate the data stored in Timesketch. It also lets developers research the data in order to speed up the development of analyzers, aggregators and graphing. The purpose of this colab is simply to introduce the powers of colab to analysts and developers, in the hope of inspiring more people to take advantage of this powerful platform.

Each code cell (denoted by the [] and grey color) can be run by hitting "shift + enter" inside it. The first code that you execute will automatically connect you to a public colab runtime, and the colab connects to the publicly open Timesketch demo. You can easily add new code cells, or modify the code that is already there, to experiment.

## README

If you want your own copy of the colab to make changes or experiment, select "File / Save a Copy in Drive" and start editing your copy.

If you want to connect colab to your own Timesketch instance (that is, one that is not publicly reachable) you can build your own colab runtime: hit the small triangle right next to the "**Connect**" button in the upper right corner and select "Connect to local runtime". Instructions on how to set up your local runtime are shown there.

Once you have your local runtime set up you should be able to reach your local Timesketch instance.


## Installation

Let's start by installing the TS API client. All commands that start with `!` are executed in the shell, so if you are missing Python packages you can install them with pip.

This colab uses Python 2 as the underlying Python binary.

In [0]:
!pip install timesketch-api-client

Then we need to import some libraries that we'll use in this colab.

In [0]:
import altair as alt # For graphing.
import numpy as np   # Never know when this will come in handy.
import pandas as pd  # We will be using pandas quite heavily.

from timesketch_api_client import client

## Connect to TS

And now we can create a Timesketch client. The client is the object used to connect to the TS server, and it provides the API to interact with it.

This will connect to the public demo of Timesketch. You may want to change these parameters to connect to your own TS instance.

In [0]:
ts_client = client.TimesketchApi('https://demo.timesketch.org', 'demo', 'demo')

### Let's Explore
And now we can start to explore. The first thing is to get all the sketches that are available. Most of the operations you want to do with TS are available in the sketch API.

In [0]:
sketches = ts_client.list_sketches()

Now that we've got a list of all available sketches, let's print out the name of each sketch together with its index into the list, so that we can more easily choose a sketch that interests us.

In [0]:
for i, sketch in enumerate(sketches):
  print('[{0:d}] {1:s}'.format(i, sketch.name))

Let's now take a closer look at some of the data we've got in the "Greendale" investigation.

In [0]:
gd_sketch = sketches[1]

Another way is to create a dictionary where the keys are the names of the sketches and the values are the sketch objects.

In [0]:
sketch_dict = {x.name: x for x in sketches}

In [0]:
sketch_dict
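
With a name-keyed dictionary it becomes possible to pick a sketch by (part of) its name instead of relying on its position in the list. The following is a minimal sketch of that idea. `FakeSketch`, the sample names and the helper `find_sketch` are all hypothetical stand-ins so that the example runs without a Timesketch server; with a live client you would pass in the real `sketch_dict`.

```python
from collections import namedtuple

# Hypothetical stand-in for the sketch objects returned by list_sketches();
# only the `id` and `name` attributes are modelled here.
FakeSketch = namedtuple('FakeSketch', ['id', 'name'])

sample_sketches = [
    FakeSketch(1, 'Example incident'),
    FakeSketch(2, 'The Greendale investigation'),
]

# Same construction as the cell above: sketch name -> sketch object.
sample_dict = {sketch.name: sketch for sketch in sample_sketches}


def find_sketch(sketches_by_name, fragment):
    """Return the first sketch (in sorted name order) whose name contains fragment.

    The comparison is case-insensitive. Returns None when nothing matches.
    """
    fragment = fragment.lower()
    for name in sorted(sketches_by_name):
        if fragment in name.lower():
            return sketches_by_name[name]
    return None


found = find_sketch(sample_dict, 'greendale')
```

Looking sketches up by name is a little more robust than `sketches[1]`, since the order of the list can change when sketches are added or removed on the server.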

Now that we've connected to a sketch we can do all sorts of things.

Try doing: `gd_sketch.<TAB>`

In colab you can use TAB completion to get a list of all attributes of the object you are working with. See a function you may want to call? Try typing `gd_sketch.function_name?` and hit enter. Let's look at an example:



In [0]:
gd_sketch.explore?

This way you'll get a list of all the parameters you may want or need to use. You can also use tab completion as soon as you type, `gd_sketch.e<TAB>` will give you all options that start with an `e`, etc.

You can also type `gd_sketch.explore(<TAB>)` and get a pop-up with a list of what parameters this function provides.

But for now, let's look at what views are available to use here:

In [0]:
views = gd_sketch.list_views()

for index, view in enumerate(views):
  print('[{0:d}] {1:s}'.format(index, view.name))

You can then start to query the API to get back results from these views. Let's try one of them...

A word of caution: try to limit your search so that you don't get too many results back. The API will happily return as many results as you ask for, but the more records you request, the longer the API call will take (10k events per API call).

In [0]:
# You can change this number if you would like to test out another view.
view_number = 1


view = views[view_number]
print('Fetching data from : {0:s}'.format(view.name))
# Fall back to the DSL query when the view has no query string.
print('        Query used : {0:s}'.format(view.query_string or view.query_dsl))


If you want to issue this query, then you can run the cell below, otherwise you can change the view_number to try another one.

In [0]:
greendale_frame = gd_sketch.explore(view=views[view_number], as_pandas=True)

Did you notice the `as_pandas=True` parameter that got passed to the `explore` function? It means that the data we get back is a pandas DataFrame that we can now start exploring.

Let's start with seeing how many entries we got back.

In [0]:
greendale_frame.shape

This tells us that the view returned 3,038 events with 9 columns. Let's explore the first few entries, just so that we can wrap our heads around what we got back.

In [0]:
greendale_frame.head(5)

Let's look at what columns we got back... and maybe create a slice that contains fewer columns.

In [0]:
greendale_frame.columns

In [0]:
greendale_frame['datetime'] = pd.to_datetime(greendale_frame['datetime'])

greendale_slice = greendale_frame[['datetime', 'timestamp_desc', 'tag', 'message', 'label']]

greendale_slice.head(4)

We can now start to manipulate these events a bit and start graphing.

Since we don't yet have the analyzers, let's extract some values...

In [0]:
greendale_frame['account_name'] = greendale_frame.message.str.extract(r'Account Name:.+Account Name:\\t\\t([^\\]+)\\n', expand=False)
greendale_frame['account_domain'] = greendale_frame.message.str.extract(r'Account Domain:.+Account Domain:\\t\\t([^\\]+)\\n', expand=False)
greendale_frame['process_name'] = greendale_frame.message.str.extract(r'Process Name:.+Process Name:\\t\\t([^\\]+)\\n', expand=False)
greendale_frame['date'] = pd.to_datetime(greendale_frame.timestamp/1e6, unit='s')
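
To see what these patterns and the timestamp conversion do, it can help to apply them to a single value outside of pandas. The following is a minimal sketch using only the standard library. The sample message and timestamp are made up for illustration, and the message is assumed to contain `\t` and `\n` as literal two-character sequences, which is what the doubled backslashes in the patterns above match.

```python
import re
from datetime import datetime, timedelta

# Hypothetical message. Raw strings keep \t and \n as literal backslash
# sequences, matching the assumption stated above.
sample_message = (
    r'Subject:\n\tAccount Name:\t\tSYSTEM\n\tAccount Domain:\t\tNT AUTHORITY\n'
    r'New Logon:\n\tAccount Name:\t\tjeff\n\tAccount Domain:\t\tGREENDALE\n')

# Same pattern as the account_name extraction above: skip ahead to the last
# "Account Name:" label and capture everything up to the next backslash.
ACCOUNT_RE = re.compile(r'Account Name:.+Account Name:\\t\\t([^\\]+)\\n')

match = ACCOUNT_RE.search(sample_message)
account_name = match.group(1) if match else None

# The timestamp column is divided by 1e6 above, so it is assumed to hold
# microseconds since the Unix epoch.
sample_timestamp = 1440842400000000
event_date = datetime(1970, 1, 1) + timedelta(microseconds=sample_timestamp)
```

Because the pattern requires two "Account Name:" labels, the greedy `.+` makes it capture the second account (the one logging on) rather than the first one listed. Messages with fewer than two labels do not match at all and end up as missing values in the DataFrame.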

Which accounts have logged in?

In [0]:
greendale_frame.account_name.value_counts()
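
Events whose message did not match the pattern have no account name, and `value_counts()` silently leaves them out by default. The following is a small sketch, on made-up data, of how to check how many events were skipped; the frame and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical frame: the last event had no account name extracted.
sample_frame = pd.DataFrame(
    {'account_name': ['jeff', 'jeff', 'britta', None]})

# Default behaviour: missing values are dropped from the counts.
counts = sample_frame.account_name.value_counts()

# dropna=False adds an entry for the missing values as well.
counts_with_missing = sample_frame.account_name.value_counts(dropna=False)

# Number of events for which no account name was extracted.
unmatched = int(sample_frame.account_name.isna().sum())
```

Comparing `unmatched` with the total number of rows gives a quick sense of how much of the view the extraction actually covers.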

Let's graph the logins. You can then interact with the graph: try zooming in, etc.

In [0]:
alt.Chart(greendale_frame, height=200).mark_point().encode(
  x='date', y='account_name'
).properties(
  title='Accounts Logged In'
).interactive()

Or we can do this as a bar graph...

In [0]:
alt.Chart(greendale_frame, width=400).mark_bar().encode(
  x='account_name', y='count()',
  tooltip=['account_name', 'count()']
).properties(
  title='Accounts Logged In'
).interactive()
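
The `count()` aggregation above is computed by the charting library. For larger views it may be preferable to aggregate in pandas first and chart the much smaller result (Altair, by default, refuses to render data sets above a few thousand rows). The following is a minimal sketch of such a pre-aggregation on made-up data; the column name `day` and the sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical events with the two columns used in the charts above.
sample_frame = pd.DataFrame({
    'date': pd.to_datetime([
        '2015-08-29 10:00', '2015-08-29 11:30',
        '2015-08-30 09:15', '2015-08-30 09:20']),
    'account_name': ['jeff', 'jeff', 'jeff', 'britta'],
})

# Truncate each timestamp to midnight so that events group per day.
sample_frame['day'] = sample_frame['date'].dt.floor('D')

# One row per (day, account) with the number of logins.
logins_per_day = (
    sample_frame.groupby(['day', 'account_name'])
    .size()
    .reset_index(name='logins')
)
```

The resulting frame can be passed to `alt.Chart` in the same way as `greendale_frame`, for example with `x='day'`, `y='logins'` and `color='account_name'`.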