# Splunk setup using Docker

This notebook contains scripts for setting up a Splunk Docker container with security apps for analysing the botsv2 dataset.

In [2]:
# The rest of the notebook assumes that the current working directory is ~/splunk
!mkdir -p ~/splunk
%cd ~/splunk

/home/ole/splunk


## Basic container setup

In [69]:
!sudo docker pull splunk/splunk:latest
!sudo docker network create --driver bridge --attachable soc1

latest: Pulling from splunk/splunk
Digest: sha256:8b62363063d91138f8eceaaaec66c024ade9da0baa851fff679e25849585cdbb
Status: Image is up to date for splunk/splunk:latest
Error response from daemon: network with name soc1 already exists


## Copy default apps from container to host

Splunk ships with a set of apps inside the container. In addition to the default apps, we will download apps from [Splunkbase](https://splunkbase.splunk.com/).

In this Docker setup, all apps live in a folder on the host and the container accesses them through a shared folder. The first step is to start the container, copy the default apps to the host, and stop the container again.
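The start, copy, stop and remove sequence could also be collected in one place. The following is a minimal sketch rather than the method used in the cells of this notebook: the helper names `default_app_copy_commands` and `run_all` are hypothetical, the container is started detached with `-d` instead of attached with `-t`, and it is an assumption that the default apps are already present under `/opt/splunk/etc/apps` right after the container starts.

```python
import subprocess


def default_app_copy_commands(name="soc1", image="splunk/splunk:latest",
                              dest="apps_splunk_default/"):
    """Return the docker commands (as argv lists) for copying the default apps."""
    return [
        # Start a throwaway container in the background (-d is an assumption;
        # the notebook cell runs the container attached with -t).
        ["sudo", "docker", "run", "-d", "--name", name,
         "-e", "SPLUNK_PASSWORD=changeme",
         "-e", "SPLUNK_START_ARGS=--accept-license", image],
        # Copy the contents of the apps folder from the container to the host.
        ["sudo", "docker", "cp", f"{name}:/opt/splunk/etc/apps/.", dest],
        ["sudo", "docker", "stop", name],
        ["sudo", "docker", "rm", name],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands instead of running them, so nothing is started by accident.
for argv in default_app_copy_commands():
    print(" ".join(argv))
```

To actually execute the sequence, create the destination folder first and then call `run_all(default_app_copy_commands())`.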


In [70]:
!sudo docker run --network soc1 --name soc1 --hostname soc1 -p 8000:8000 -p 8089:8089 -p 8070:8070 -e "SPLUNK_PASSWORD=changeme" -e "SPLUNK_START_ARGS=--accept-license" -t splunk/splunk:latest


PLAY [Run default Splunk provisioning] *****************************************
Wednesday 03 July 2019  12:19:41 +0000 (0:00:00.021)       0:00:00.021 ******** 

TASK [Gathering Facts] *********************************************************
ok: [localhost]
Wednesday 03 July 2019  12:19:43 +0000 (0:00:01.347)       0:00:01.369 ******** 
Wednesday 03 July 2019  12:19:43 +0000 (0:00:00.038)       0:00:01.408 ******** 
Wednesday 03 July 2019  12:19:43 +0000 (0:00:00.036)       0:00:01.444 ******** 
Wednesday 03 July 2019  12:19:43 +0000 (0:00:00.109)       0:00:01.554 ******** 
included: /opt/ansible/roles/splunk_common/tasks/get_facts.yml for localhost
Wednesday 03 July 2019  12:19:43 +0000 (0:00:00.074)       0:00:01.628 ******** 

TASK [splunk_common : Set privilege escalation user] ***************************
ok: [localhost]
Wednesday 03 July 2019  12:19:43 +0000 (0:00:00.037)       0:00:01.666 ******** 

TASK [splunk_common : Check for existing ins

Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.036)       0:00:16.422 ******** 
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.034)       0:00:16.456 ******** 
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.039)       0:00:16.496 ******** 
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.041)       0:00:16.538 ******** 
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.037)       0:00:16.575 ******** 
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.040)       0:00:16.615 ******** 
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.036)       0:00:16.652 ******** 
included: /opt/ansible/roles/splunk_standalone/tasks/../../splunk_common/tasks/check_for_required_restarts.yml for localhost
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.064)       0:00:16.717 ******** 

TASK [splunk_standalone : Check for required restarts] *************************
ok: [localhost]
Wednesday 03 July 2019  12:19:58 +0000 (0:00:00.157)       0:00:16.874 ******** 

PLAY RECAP **************

In [72]:
!sudo docker ps

CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS                             PORTS                                                                                                                      NAMES
37a1b5ba9ce3        splunk/splunk:latest   "/sbin/entrypoint.sh…"   27 seconds ago      Up 25 seconds (health: starting)   0.0.0.0:8000->8000/tcp, 8065/tcp, 0.0.0.0:8070->8070/tcp, 8088/tcp, 8191/tcp, 9887/tcp, 0.0.0.0:8089->8089/tcp, 9997/tcp   soc1
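
The `docker ps` output above reports `health: starting`, so the container is not yet ready when the next cells run. A small polling helper could wait for it. This is a sketch under assumptions: the function names `docker_health` and `wait_until_healthy` are hypothetical, the timeout values are arbitrary, and it relies on the image defining a health check (which the status column suggests). It also assumes the container was started in a way that leaves the notebook free to run further cells.

```python
import subprocess
import time


def docker_health(name):
    """Ask docker for the container's health status, e.g. 'starting' or 'healthy'."""
    result = subprocess.run(
        ["sudo", "docker", "inspect", "--format", "{{.State.Health.Status}}", name],
        capture_output=True, text=True, check=True)
    return result.stdout.strip()


def wait_until_healthy(name, timeout=300, poll=5, inspect=docker_health, sleep=time.sleep):
    """Poll until the status is 'healthy'; return False if the timeout expires."""
    waited = 0
    while waited <= timeout:
        if inspect(name) == "healthy":
            return True
        sleep(poll)
        waited += poll
    return False
```

Calling `wait_until_healthy("soc1")` before `docker cp` would then block until Splunk has finished provisioning or the timeout is reached. The `inspect` and `sleep` parameters exist only so the loop can be exercised without a running container.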


In [73]:
dockerid=!sudo docker ps -aqf "name=soc1"
dockerid=dockerid[0]
dockerid

'37a1b5ba9ce3'

In [74]:
!mkdir -p apps_splunk_default
!sudo docker cp $dockerid:/opt/splunk/etc/apps/. apps_splunk_default/

In [75]:
!sudo docker stop $dockerid

37a1b5ba9ce3


In [76]:
!sudo docker rm $dockerid

37a1b5ba9ce3


## Download apps from splunkbase

Most security apps for Splunk are hosted on Splunkbase. Splunkbase appears to be designed for manual use only, which is inconvenient when we need many apps
for a Docker container that we want to be able to rebuild automatically.

The main problem is that Splunkbase uses reCAPTCHA for logins. The best workaround is to log in with a browser and export the cookies, for example with https://addons.mozilla.org/en-US/firefox/addon/export-cookies-txt/
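
Exported cookies eventually expire, and a stale file is a likely reason for a failed download. The sketch below checks an exported `cookies.txt` for expired entries using the standard library's `http.cookiejar.MozillaCookieJar`. The function name `expired_cookie_names` is hypothetical, and which cookies Splunkbase actually needs for a valid session is not verified here.

```python
import http.cookiejar
import time


def expired_cookie_names(cookiefile="cookies.txt", now=None):
    """Return the names of cookies in a Netscape-format file that have expired."""
    jar = http.cookiejar.MozillaCookieJar(cookiefile)
    # Load everything, including session and expired cookies, so they can be inspected.
    jar.load(ignore_discard=True, ignore_expires=True)
    now = time.time() if now is None else now
    return [cookie.name for cookie in jar if cookie.is_expired(now)]
```

If `expired_cookie_names()` returns a non-empty list, logging in again with the browser and re-exporting the cookies is probably the first thing to try.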

With the cookies available, curl can download the apps. For example:
```
curl -L --cookie cookies.txt https://splunkbase.splunk.com/app/3749/release/2.0.0/download/ -o 3749.tgz
```

With curl it is difficult to find the latest version of an app automatically. It is also difficult to do proper error checking. I therefore ended up using the Python script below for downloading Splunk apps.
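
One piece of that error checking can be shown in isolation: reading the file name from a `Content-Disposition` response header without failing when the header is missing. This is a sketch, not part of the download script; the function name `filename_from_headers` is hypothetical, and it uses the standard library's `email.message.Message` to parse the header instead of a regular expression.

```python
import os
from email.message import Message


def filename_from_headers(headers):
    """Return the file name announced in Content-Disposition, or None if absent."""
    # Header names are case-insensitive, so compare them in lower case.
    value = next((v for k, v in headers.items() if k.lower() == "content-disposition"), None)
    if not value:
        return None
    msg = Message()
    msg["Content-Disposition"] = value
    filename = msg.get_filename()
    # Keep only the final path component so the server cannot choose the directory.
    return os.path.basename(filename) if filename else None


print(filename_from_headers({"Content-Disposition": 'attachment; filename="base64_11.tgz"'}))
print(filename_from_headers({"Content-Type": "text/html"}))
```

Returning `None` lets the caller decide whether a missing file name means "not logged in", "no release available", or something else.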

Use these scripts at your own risk. 

In [3]:
!mkdir -p ~/splunk/download
%cd ~/splunk/download

/home/ole/splunk/download


In [78]:
## Functions for downloading Splunk apps
##
## You need to log in with a browser and export the cookies.
## By default the script expects cookies.txt to be located in the current working directory.
## Use for example https://addons.mozilla.org/en-US/firefox/addon/export-cookies-txt/
##

import os
import re
import requests
import http.cookiejar
from collections import namedtuple
from tqdm import tqdm
import pickle


class LoginError(Exception): pass
class NoFileToDownload(Exception): pass
SplunkApp=namedtuple("SplunkApp",["id","name","filename","baseurl","downloadurl","version"])

def download_splunk_app(appid,cookiefile="cookies.txt"):
    
    #Load cookies from browser, site uses recaptcha and we can't automatically login
    cookies=http.cookiejar.MozillaCookieJar(cookiefile)
    cookies.load()
    
    #Download html
    baseurl=f"https://splunkbase.splunk.com/app/{appid}/"
    r = requests.get(baseurl, cookies=cookies)
    
    #Check if login success
    if "LOGIN TO DOWNLOAD" in r.text:
        raise LoginError("Old or missing cookies?")
    
    #Extract data from html
    version=re.search(r'sb-target="(.*?)"',r.text)[1]
    name=re.search(r'<title>([^|]*)',r.text)[1]
    downloadurl=f"{baseurl}release/{version}/download/"
    
    #Download headers from real download url
    header = requests.head(downloadurl, allow_redirects=True,cookies=cookies)
    print(header.headers.get('content-type'))
    
    #Check if download is a real tgz
    if header.headers.get('content-type') != 'application/x-tar':
        raise NoFileToDownload(f"Appid:{appid} Name:{name}  No file to download")
    
    #Get filename from header
    disposition=header.headers.get("content-disposition")
    filename=re.search('filename="(.*)"',disposition)[1]
    
    #Download if the file does not already exist
    if os.path.isfile(filename):
        print (f"{filename} File already exist. Skipping")
    else:
        r = requests.get(downloadurl, cookies=cookies)
        #Fail loudly on HTTP errors instead of saving an error page as a .tgz
        r.raise_for_status()
        #Close the file handle explicitly once the content is written
        with open(filename, 'wb') as f:
            f.write(r.content)
    return SplunkApp(appid,name,filename,baseurl,downloadurl,version)

def download_splunk_apps(appids,cookiefile="cookies.txt"):
    apps=[]
    for appid in tqdm(appids):
        try:
            app=download_splunk_app(appid,cookiefile)
            apps.append(app)
        except LoginError as err:
            print (err)
            break
        except NoFileToDownload as err:
            print (err)
            continue
    return apps
#app = download_splunk_app("4305")
    

In [79]:
#Set up apps
#Remember to have cookies.txt in the current working directory

appids=[3749, 1922, 2734, 3435, 3626, 1621, 3186, 2757, 2772, 2760, 1914, 3278, 3172, 1493, 3185, 833, 742, 3110, 1809,3540,3449,4305,1914,3129,3767,3112,1724]
apps=download_splunk_apps(appids)

  4%|▎         | 1/27 [00:04<02:07,  4.92s/it]

application/x-tar
sa-investigator-for-enterprise-security_200.tgz File already exist. Skipping


  7%|▋         | 2/27 [00:08<01:49,  4.38s/it]

application/x-tar
base64_11.tgz File already exist. Skipping


 11%|█         | 3/27 [00:10<01:32,  3.84s/it]

application/x-tar
url-toolbox_16.tgz File already exist. Skipping


 15%|█▍        | 4/27 [00:15<01:33,  4.05s/it]

application/x-tar
splunk-security-essentials_242.tgz File already exist. Skipping


 19%|█▊        | 5/27 [00:17<01:17,  3.51s/it]

application/x-tar
jellyfisher_010.tgz File already exist. Skipping


 22%|██▏       | 6/27 [00:20<01:09,  3.32s/it]

application/x-tar
splunk-common-information-model-cim_4130.tgz File already exist. Skipping


 26%|██▌       | 7/27 [00:22<00:59,  2.97s/it]

application/x-tar
splunk-add-on-for-apache-web-server_100.tgz File already exist. Skipping


 30%|██▉       | 8/27 [00:25<00:55,  2.94s/it]

application/x-tar
palo-alto-networks-add-on-for-splunk_611.tgz File already exist. Skipping


 33%|███▎      | 9/27 [00:27<00:49,  2.77s/it]

application/x-tar
splunk-add-on-for-symantec-endpoint-protection_230.tgz File already exist. Skipping


 37%|███▋      | 10/27 [00:29<00:44,  2.60s/it]

application/x-tar
splunk-ta-for-suricata_233.tgz File already exist. Skipping


 41%|████      | 11/27 [00:33<00:47,  3.00s/it]

application/x-tar
add-on-for-microsoft-sysmon_810.tgz File already exist. Skipping


 44%|████▍     | 12/27 [00:36<00:43,  2.88s/it]

application/x-tar
splunk-app-for-osquery_10.tgz File already exist. Skipping


 48%|████▊     | 13/27 [00:38<00:37,  2.71s/it]

application/x-tar
ssl-certificate-checker_32.tgz File already exist. Skipping


 52%|█████▏    | 14/27 [00:43<00:43,  3.33s/it]

application/x-tar
website-monitoring_274.tgz File already exist. Skipping


 56%|█████▌    | 15/27 [00:46<00:37,  3.10s/it]

application/x-tar
splunk-add-on-for-microsoft-iis_101.tgz File already exist. Skipping


 59%|█████▉    | 16/27 [00:49<00:34,  3.09s/it]

application/x-tar
splunk-add-on-for-unix-and-linux_602.tgz File already exist. Skipping


 63%|██████▎   | 17/27 [00:52<00:30,  3.03s/it]

application/x-tar
splunk-add-on-for-microsoft-windows_600.tgz File already exist. Skipping


 67%|██████▋   | 18/27 [00:54<00:26,  2.92s/it]

application/x-tar
splunk-add-on-for-microsoft-cloud-services_310.tgz File already exist. Skipping


 70%|███████   | 19/27 [00:57<00:23,  2.96s/it]

application/x-tar
splunk-stream_713.tgz File already exist. Skipping


 74%|███████▍  | 20/27 [01:00<00:19,  2.79s/it]

application/x-tar
json-tools_013.tgz File already exist. Skipping


 78%|███████▊  | 21/27 [01:05<00:20,  3.45s/it]

application/x-tar
splunk-es-content-update_1040.tgz File already exist. Skipping


 81%|████████▏ | 22/27 [01:10<00:19,  3.90s/it]

application/x-tar
threathunting_134.tgz File already exist. Skipping


 85%|████████▌ | 23/27 [01:13<00:14,  3.70s/it]

application/x-tar
add-on-for-microsoft-sysmon_810.tgz File already exist. Skipping


 89%|████████▉ | 24/27 [01:15<00:09,  3.32s/it]

application/x-tar
punchcard-custom-visualization_130.tgz File already exist. Skipping


 93%|█████████▎| 25/27 [01:19<00:06,  3.47s/it]

application/x-tar
force-directed-app-for-splunk_301.tgz File already exist. Skipping


 96%|█████████▋| 26/27 [01:22<00:03,  3.24s/it]

application/x-tar
sankey-diagram-custom-visualization_130.tgz File already exist. Skipping


100%|██████████| 27/27 [01:26<00:00,  3.68s/it]

application/x-tar
lookup-file-editor_332.tgz File already exist. Skipping





In [80]:
#Checkpoint 
#Makes it possible to skip downloading every time
with open("apps.pickle","wb") as f:
    pickle.dump(apps,f)

## Install apps

Now that both the default apps and the Splunkbase apps are downloaded, we can extract them into a common folder and run the install steps.
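
Because every archive is unpacked into the same folder, two archives that contain the same top-level app folder silently overwrite each other. The sketch below lists the top-level folders per archive and reports overlaps before anything is extracted. The function names `top_level_names` and `find_collisions` are hypothetical, and the sketch assumes the downloads are gzip-compressed tar files.

```python
import tarfile


def top_level_names(archive_path):
    """Return the sorted top-level entries (app folders) of a .tgz archive."""
    names = set()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getnames():
            # Ignore empty and "." components such as a leading "./".
            parts = [p for p in member.split("/") if p not in ("", ".")]
            if parts:
                names.add(parts[0])
    return sorted(names)


def find_collisions(archive_paths):
    """Map each top-level folder to the archives containing it, keeping only overlaps."""
    owners = {}
    for path in archive_paths:
        for name in top_level_names(path):
            owners.setdefault(name, []).append(path)
    return {name: paths for name, paths in owners.items() if len(paths) > 1}
```

With the checkpointed `apps` list, a call such as `find_collisions([f"download/{app.filename}" for app in apps])` would also flag an archive that appears twice in the list.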

In [81]:
#Load from checkpoint
%cd ~/splunk/
with open("download/apps.pickle","rb") as f:
    apps=pickle.load(f)

/home/ole/splunk


In [82]:
#Copy default apps
!mkdir -p apps
!sudo cp -r apps_splunk_default/* apps/

In [83]:
# Install splunkbase apps
for app in apps:
    print(app.filename)
    !sudo tar -zxf download/$app.filename -C apps/

sa-investigator-for-enterprise-security_200.tgz
base64_11.tgz
url-toolbox_16.tgz
splunk-security-essentials_242.tgz
jellyfisher_010.tgz
splunk-common-information-model-cim_4130.tgz
splunk-add-on-for-apache-web-server_100.tgz
palo-alto-networks-add-on-for-splunk_611.tgz
splunk-add-on-for-symantec-endpoint-protection_230.tgz
splunk-ta-for-suricata_233.tgz
add-on-for-microsoft-sysmon_810.tgz
splunk-app-for-osquery_10.tgz
ssl-certificate-checker_32.tgz
website-monitoring_274.tgz
splunk-add-on-for-microsoft-iis_101.tgz
splunk-add-on-for-unix-and-linux_602.tgz
splunk-add-on-for-microsoft-windows_600.tgz
splunk-add-on-for-microsoft-cloud-services_310.tgz
splunk-stream_713.tgz
json-tools_013.tgz
splunk-es-content-update_1040.tgz
threathunting_134.tgz
add-on-for-microsoft-sysmon_810.tgz
punchcard-custom-visualization_130.tgz
force-directed-app-for-splunk_301.tgz
sankey-diagram-custom-visualization_130.tgz
lookup-file-editor_332.tgz


In [84]:
#Download lookup files for threat hunting app
if not os.path.isfile("ThreatHunting.tar.gz"):
    !curl -L https://github.com/olafhartong/ThreatHunting/raw/master/files/ThreatHunting.tar.gz --output ThreatHunting.tar.gz

In [85]:
!sudo tar -zxf ThreatHunting.tar.gz -C apps/

In [86]:
#Download botsv2 dataset
#https://github.com/splunk/botsv2
if not os.path.isfile("botsv2_data_set_attack_only.tgz"):
    !curl -O https://s3.amazonaws.com/botsdataset/botsv2/botsv2_data_set_attack_only.tgz

In [87]:
!sudo tar -zxf botsv2_data_set_attack_only.tgz  -C apps/

## Start the Splunk container with a shared folder

In [88]:
splunkapps=%pwd 
splunkapps+= "/apps"

In [89]:
!sudo docker run -v $splunkapps:/opt/splunk/etc/apps --network soc1 --name soc1 --hostname soc1 -p 8000:8000 -p 8089:8089 -p 8070:8070 -e "SPLUNK_PASSWORD=changeme" -e "SPLUNK_START_ARGS=--accept-license" -t splunk/splunk:latest


PLAY [Run default Splunk provisioning] *****************************************
Wednesday 03 July 2019  12:22:50 +0000 (0:00:00.020)       0:00:00.020 ******** 

TASK [Gathering Facts] *********************************************************
ok: [localhost]
Wednesday 03 July 2019  12:22:51 +0000 (0:00:01.344)       0:00:01.365 ******** 
Wednesday 03 July 2019  12:22:52 +0000 (0:00:00.038)       0:00:01.404 ******** 
Wednesday 03 July 2019  12:22:52 +0000 (0:00:00.036)       0:00:01.440 ******** 
Wednesday 03 July 2019  12:22:52 +0000 (0:00:00.102)       0:00:01.542 ******** 
included: /opt/ansible/roles/splunk_common/tasks/get_facts.yml for localhost
Wednesday 03 July 2019  12:22:52 +0000 (0:00:00.072)       0:00:01.615 ******** 

TASK [splunk_common : Set privilege escalation user] ***************************
ok: [localhost]
Wednesday 03 July 2019  12:22:52 +0000 (0:00:00.037)       0:00:01.652 ******** 

TASK [splunk_common : Check for existing ins

Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.036)       0:00:54.050 ******** 
Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.033)       0:00:54.083 ******** 
Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.036)       0:00:54.120 ******** 
Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.043)       0:00:54.163 ******** 
Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.036)       0:00:54.199 ******** 
Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.042)       0:00:54.242 ******** 
Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.038)       0:00:54.280 ******** 
included: /opt/ansible/roles/splunk_standalone/tasks/../../splunk_common/tasks/check_for_required_restarts.yml for localhost
Wednesday 03 July 2019  12:23:44 +0000 (0:00:00.065)       0:00:54.346 ******** 

TASK [splunk_standalone : Check for required restarts] *************************
ok: [localhost]
Wednesday 03 July 2019  12:23:45 +0000 (0:00:00.156)       0:00:54.502 ******** 

PLAY RECAP **************

In [90]:
dockerid=!sudo docker ps -aqf "name=soc1"
dockerid=dockerid[0]
dockerid

'ab2f4667fdfe'