# Downloading raw PCAP

* https://github.com/aol/moloch/wiki/API#sessionspcap

We can download raw PCAP data, as opposed to indexed metadata, via the `sessions.pcap` endpoint. This is useful when you want to extract capture data for closer investigation in Wireshark. Start by setting up variables, as always.

In [25]:
import requests
from requests.auth import HTTPDigestAuth
user="vagrant"
passwd="vagrant"
auth=HTTPDigestAuth(user, passwd)

Then extract all DNS sessions for `berylia.org` with a Moloch query. Note the `stream=True` parameter for our GET request. **This is very important, as you do not want your script to pull all PCAP data into memory before writing out the file**.

In [26]:
query = {
    "expression": "protocols == dns && host.dns == berylia.org",
    "date": 1,
}
resp = requests.get("http://192.168.10.13:8005/sessions.pcap", params=query, auth=auth, stream=True)

Stream the response data into a newly created file. Open the file in Wireshark to verify the output.

In [27]:
with open("/vagrant/dns-berylia.pcap", 'wb') as f:
    for chunk in resp.iter_content(chunk_size=8192):
        if chunk: # filter out keep-alive new chunks
            f.write(chunk)
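
The download pattern above can be wrapped in a small helper. This is a minimal sketch, not part of the Moloch API: `save_stream` is a hypothetical name, and it only assumes that the response object offers `raise_for_status()` and `iter_content()`, as a `requests` response does. Checking the status first avoids saving an HTTP error body to disk as if it were PCAP data.

```python
def save_stream(resp, path, chunk_size=8192):
    """Write a streamed HTTP response body to `path`; return bytes written.

    Hypothetical helper: `resp` is assumed to behave like requests.Response.
    """
    # Fail early on 4xx/5xx instead of writing an error page into a .pcap file.
    resp.raise_for_status()
    written = 0
    with open(path, "wb") as f:
        for chunk in resp.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                written += len(chunk)
    return written
```

With the request from the previous cell, this would be called as `save_stream(resp, "/vagrant/dns-berylia.pcap")`. A return value of zero is a quick hint that the query matched no sessions.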

Note that multiple sessions get clumped into a single PCAP stream when relying on Moloch expressions. Alternatively, the `ids` parameter can be specified to download specific sessions one by one and write each session into a distinct output file. For example, we can extract a list of session IDs via the CSV endpoint.

In [28]:
import datetime as dt
# strftime("%s") is a platform-specific extension; timestamp() is portable
end = int(dt.datetime.now().timestamp())
start = end - 5*60
r = requests.get("http://192.168.10.13:8005/sessions.csv", params={
    "startTime": start,
    "stopTime": end,
    "date": 1,
    "expression": "host.dns == berylia.org",
    "fields": ",".join([
        "_id"
    ])
}, auth=auth)
ids = r.text.split("\r\n")
# Drop csv header
ids = ids[1:]
# Get rid of empty element from last newline
ids = [i for i in ids if len(i) > 0]
print(ids)

['190522-tAIbGZrl6xpCVLZj2bl_S1lC', '190522-tAJRqThOVJhLtbrhKDTeq0Bt', '190522-tALAA6vM5_VCk7L3XxVwCA2m', '190522-tAIPovzkOupJuL7UTkfdn_Vy', '190522-tAKtD5JXy2xNXpNl5h0qSlQR', '190522-tAKl1A73clNLvqWs15y1wZIj', '190522-tAJV2FBAL8FDbY7c-y0MbSVN', '190522-tALVEPKxwH9EaLaeV7PYq1XV', '190522-tAIAze0zGC5IX4AOOziGUjZr', '190522-tALa6KLjgwtM1LPM3dZVA_wz', '190522-tAJBMxxLhaxPOaVjeBUwZvMg', '190522-tAIdrT8OC11Fg7cFuYf_xZ3w', '190522-tAIsjQzSRhFCG4DAkEOW0CgZ', '190522-tAICmKIWPgZIe4AWScJl0eEx']
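
Splitting on `\r\n` works for a single `_id` column, but it would break on quoted fields or a different line ending. A slightly more defensive sketch uses the standard `csv` module. `parse_ids` is a hypothetical helper name, and it assumes that the first row is a header and that the session ID is the first column, as in the output above.

```python
import csv
import io

def parse_ids(csv_text):
    """Return first-column values from CSV text, skipping header and blank rows.

    Assumption: row 1 is the header and column 1 holds the session ID.
    """
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # drop the CSV header row, if there is one
    return [row[0] for row in rows if row and row[0]]
```

In the cell above, `ids = parse_ids(r.text)` would replace the manual split, header drop, and empty-element filter.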


In [29]:
for i in ids:
    query = {
        "ids": i,
        "date": 1,
    }
    resp = requests.get("http://192.168.10.13:8005/sessions.pcap", params=query, auth=auth, stream=True)
    with open("/vagrant/{}.pcap".format(i), 'wb') as f:
        for chunk in resp.iter_content(chunk_size=8192):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

# Tasks

See the [Suricata eve.json parsing example](https://github.com/ccdcoe/CDMCS/blob/master/Suricata/indexing/001-load-eve.ipynb).
* Load `community_id` values from `alert` events in `/var/log/suricata/eve.json`. Write the raw PCAP data for each `community_id` into a distinct PCAP file.