This project lets you stand up a quick-and-dirty Splunk instance in Docker.

Paste either of these into the command line:

```shell
bash <(curl -s https://raw.githubusercontent.com/dmuth/splunk-lab/master/go.sh)
bash <(curl -Ls https://bit.ly/splunklab)
```

...and the script will print the directory it will ingest logs from, your password, etc. Follow the on-screen instructions for setting environment variables and you'll be up and running in no time! Whatever logs you had sitting in your `logs/` directory will be searchable in Splunk with the search `index=main`.
If you want to see neat things you can do in Splunk Lab, check out the Cookbook section.
- https://localhost:8000/ - Default port to log into the local instance. Username is `admin`, password is whatever was set when starting Splunk Lab.
- Splunk Dashboard Examples - Wanna see what you can do with Splunk? Here are some example dashboards.
- App dashboards can be stored on the local filesystem (they don't disappear when the container exits)
- Ingested data can be stored in the local filesystem
- Multiple REST and RSS endpoints "built in" to provide sources of data ingestion
- Integration with REST API Modular Input
- Splunk Machine Learning Toolkit included
- `/etc/hosts` can be appended to with local IP/hostname entries
- Ships with Eventgen to populate your index with fake webserver events for testing.
These are screenshots with actual data from production apps which I built on top of Splunk Lab:
![](/duanshuaimin/splunk-lab/raw/main/img/bella-italia.png)
![](/duanshuaimin/splunk-lab/raw/main/img/facebook-glassdoor.png)
![](/duanshuaimin/splunk-lab/raw/main/img/pa-furry-stats.jpg)
![](/duanshuaimin/splunk-lab/raw/main/img/network-huge-outage.png)
![](/duanshuaimin/splunk-lab/raw/main/img/fitbit-sleep-dashboard.png)
![](/duanshuaimin/splunk-lab/raw/main/img/snepchat-tag-cloud.jpg)
What can you do with Splunk Lab? Here are a few examples:
- Drop your logs into the `logs/` directory.
- Run `bash <(curl -Ls https://bit.ly/splunklab)`
- Go to https://localhost:8000/
- Ingested data will be written to `data/`, which will persist between runs.

To skip persisting ingested data, run:

```shell
SPLUNK_DATA=no bash <(curl -Ls https://bit.ly/splunklab)
```

- Note that `data/` will not be written to, and launching a new container will cause `logs/` to be indexed again.
- This will increase the ingestion rate on Docker for OS/X, as there are some issues with the filesystem driver in OS/X Docker.
To populate your index with fake webserver events, run:

```shell
SPLUNK_EVENTGEN=1 bash <(curl -Ls https://bit.ly/splunklab)
```

- Fake webserver logs will be written every 10 seconds and can be viewed with the query `index=main sourcetype=nginx`. The logs are based on actual HTTP requests which have come into the webserver hosting my blog.
- Edit a local hosts file, then run:

```shell
ETC_HOSTS=./hosts bash <(curl -Ls https://bit.ly/splunklab)
```

- This can be used in conjunction with something like Splunk Network Monitor to ping hosts that don't have DNS names, such as your home's webcam. :-)
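The hosts file uses the standard `/etc/hosts` format — one IP address and hostname per line. For example (the addresses and names below are made up for illustration):

```shell
# Create a local hosts file with entries for devices that have no DNS names
cat > ./hosts <<'EOF'
192.168.1.50   webcam
192.168.1.1    router
EOF
cat ./hosts
```

Pass it in with `ETC_HOSTS=./hosts` and those entries will be visible inside the container.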
- Run any of the above with `PRINT_DOCKER_CMD=1` set, and the Docker command line that's used will be written to stdout.
This would normally be done with the script `./bin/devel.sh` when running from the repo, but if you're running Splunk Lab just with the Docker image, here's how to do it:

```shell
docker run -p 8000:8000 -e SPLUNK_PASSWORD=password1 \
    -v $(pwd)/data:/data -v $(pwd)/logs:/logs \
    --name splunk-lab --rm -it -v $(pwd):/mnt \
    -e SPLUNK_DEVEL=1 dmuth1/splunk-lab bash
```

This is useful mainly if you want to poke around in Splunk Lab while it's running. Note that you could always just run `docker exec -it splunk-lab bash` instead of doing all of the above. :-)
The following Splunk apps are included in this Docker image:
- REST API Modular Input (requires registration)
- Wordcloud Custom Visualization
- Slack Notification Alert
- Splunk Machine Learning Toolkit
All apps are covered under their own license. Please check the Apps page for more info.
Splunk has its own license. Please abide by it.
I put together this curated list of free sources of data which can be pulled into Splunk via one of the included apps:
- RSS
- REST (you will need to set `$REST_KEY` when starting Splunk Lab)
  - Non-streaming
  - Streaming
Since building Splunk Lab, I have used it as the basis for building other projects:
- SEPTA Stats
- Website with real-time stats on Philadelphia Regional Rail.
- Pulled down over 60 million train data points over 4 years using Splunk.
- Splunk Twint
- Splunk dashboards for Twitter timelines downloaded by Twint. This is now a part of the TWINT Project.
- Splunk Yelp Reviews
- This project lets you pull down Yelp reviews for venues and view visualizations and wordclouds of positive/negative reviews in a Splunk dashboard.
- Splunk Glassdoor Reviews
- Similar to Splunk Yelp Reviews, this project lets you pull down company reviews from Glassdoor and Splunk them.
- Splunk Telegram
- This app lets you run Splunk against messages from Telegram groups and generate graphs and word clouds based on the activity in them.
- Splunk Network Health Check
- Pings 1 or more hosts and graphs the results in Splunk so you can monitor network connectivity over time.
- Splunk Fitbit
- Analyzes data from your Fitbit
- Splunk for AWS S3 Server Access Logs
- App to analyze AWS S3 Server Access Logs
Here's all of the above, presented as a graph:
A sample app (and instructions on how to use it) is in the `sample-app/` directory. Feel free to expand on it for your own apps.
HTTPS is turned on by default. Passwords such as `password` and `12345` are not permitted. Please use a strong password if you are deploying this on a public-facing machine.
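If you need one, here's one way to generate a random 20-character password straight from the shell (just `/dev/urandom` and `tr`; any strong password works):

```shell
# Generate a random 20-character alphanumeric password for SPLUNK_PASSWORD
head -c 1000 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | head -c 20; echo
```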
Yes, you can! First, install mkcert, then run:

```shell
mkcert -install && mkcert localhost 127.0.0.1 ::1
```

...to generate a local CA and a cert/key combo for localhost. Then, when you run Splunk Lab, set the environment variables `SSL_KEY` and `SSL_CERT`, and those files will be pulled into Splunk Lab. Example:

```shell
SSL_KEY=./localhost.key SSL_CERT=./localhost.pem ./go.sh
```
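If you'd rather not install mkcert, a plain self-signed cert also works — your browser will still warn, since there's no locally trusted CA. A sketch using stock `openssl` (filenames chosen to match the example above):

```shell
# Generate a self-signed cert/key pair for localhost, valid for one year
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=localhost" \
    -keyout localhost.key -out localhost.pem
```

Then point `SSL_KEY` and `SSL_CERT` at the two files as shown in the example above.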
Sure does! I built this on a Mac. :-)
I wrote a series of helper scripts in `bin/` to make the process easier:
- `./bin/build.sh` - Build the containers.
  - Note that this downloads packages from an AWS S3 bucket that I created. This bucket is set to "requestor pays", so you'll need to have the `aws` CLI app set up.
- `./bin/download.sh` - Download tarballs of various apps and split some of them into chunks.
- `./bin/upload-file-to-s3.sh` - Upload a specific file to S3. For rolling out new versions of apps.
- `./bin/push.sh` - Tag and push the container.
- `./bin/devel.sh` - Build and tag the container, then start it with an interactive bash shell.
  - This is a wrapper for the above-mentioned `go.sh` script. Any environment variables that work there will work here.
  - To force rebuilding a container during development, touch the associated Dockerfile in `docker/`. E.g. `touch docker/1-splunk-lab` to rebuild the contents of that container.
- `./bin/create-1-million-events.py` - Create 1 million events in the file `1-million-events.txt` in the current directory.
  - If not in `logs/` but reachable from the Docker container, the file can then be oneshotted into Splunk with the following command: `/opt/splunk/bin/splunk add oneshot ./1-million-events.txt -index main -sourcetype oneshot-0001`
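I haven't reproduced `create-1-million-events.py` here, but generating similar test events from the shell is easy; a minimal sketch (the filename and event format below are made up for illustration, not the script's actual output):

```shell
# Write 1,000 fake events, one per line, with a timestamp and a counter
TS=$(date -u +%Y-%m-%dT%H:%M:%SZ)
seq 1 1000 | awk -v ts="$TS" \
    '{ printf "%s event_number=%d message=\"test event\"\n", ts, $1 }' \
    > test-events.txt
wc -l < test-events.txt
```

The resulting file can be oneshotted into Splunk exactly as described above.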
- `./bin/kill.sh` - Kill a running `splunk-lab` container.
- `./bin/attach.sh` - Attach to a running `splunk-lab` container.
- `./bin/clean.sh` - Remove the `logs/` and/or `data/` directories.
- `./bin/tarsplit` - Local copy of my package from https://github.com/dmuth/tarsplit
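`tarsplit` splits tarballs at file boundaries; for a rough idea of the split-and-reassemble workflow, plain `split` does something similar at the byte level (a sketch of the general idea, not how `tarsplit` actually works):

```shell
# Create a small tarball, split it into 1 KB chunks, then reassemble and verify
mkdir -p demo
echo "hello, splunk lab" > demo/file.txt
tar -cf demo.tar demo
split -b 1024 demo.tar demo.tar.part-
cat demo.tar.part-* > demo-reassembled.tar
cmp demo.tar demo-reassembled.tar && echo "tarballs match"
```

The glob `demo.tar.part-*` expands in lexicographic order, which is exactly the order `split` names its chunks, so `cat` reassembles them correctly.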
- Here's the layout of the `cache/` directory:
  - `cache/` - Where tarballs for Splunk and its apps hang out. These are downloaded when `bin/download.sh` is run for the first time.
  - `cache/deploy/` - When creating a specific Docker image, files are copied here so the Dockerfile can ingest them. (Or rather, hardlinked to the files in the parent directory.)
  - `cache/build/` - 0-byte files are written here when a specific container is built, and on future builds, the age of that file is checked against the Dockerfile. If the Dockerfile is newer, then the container is (re-)built. Otherwise, it is skipped. This shortens a run of `bin/devel.sh` where no containers need to be built from 12 seconds (on my 2020 iMac) to 0.2 seconds.
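That timestamp check boils down to a `-nt` ("newer than") comparison; a minimal standalone sketch of the idea (paths made up for illustration):

```shell
# Simulate the marker-file check: rebuild only if the Dockerfile is newer than the marker
mkdir -p demo-cache/build demo-docker
touch demo-docker/Dockerfile      # the Dockerfile
touch demo-cache/build/marker     # 0-byte marker written after the last successful build
if [ demo-docker/Dockerfile -nt demo-cache/build/marker ]; then
    echo "rebuild"
else
    echo "skip"                   # prints "skip": the marker is at least as new as the Dockerfile
fi
```

Touching the Dockerfile afterwards would make it newer than the marker, flipping the result to "rebuild" — which is exactly why `touch docker/1-splunk-lab` forces a rebuild.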
I had to struggle with this for awhile, so I'm mostly documenting it here.

When in devel mode, `/opt/splunk/etc/apps/splunk-lab/` is mounted to `./splunk-lab-app/` via `go.sh`, and the entrypoint script inside the container symlinks `local/` to `default/`.
This way, any changes that are made to dashboards will be propagated outside of
the container and can be checked in to Git.
When in production mode (e.g. running `./go.sh` directly), no symlink is created; instead, `local/` is mounted from whatever `$SPLUNK_APP` is pointing to, so that any changes made by the user will show up on their host, with Splunk Lab's `default/` directory being untouched.
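The symlink dance can be hard to picture, so here's a tiny standalone demonstration of what the devel-mode entrypoint effectively does (directory and file names are made up for illustration):

```shell
# In devel mode, local/ is just a symlink to default/, so dashboard edits
# land in default/ and show up in the bind-mounted app directory on the host.
APP=demo-splunk-lab-app
mkdir -p "$APP/default"
ln -sfn default "$APP/local"
readlink "$APP/local"                  # prints "default"
touch "$APP/local/new-dashboard.xml"   # "save" a dashboard through local/
ls "$APP/default"                      # ...and it appears under default/
```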
- The Docker containers are dmuth1/splunk-lab and dmuth1/splunk-lab-ml. The latter has all of the Machine Learning apps built in to the image. Feel free to extend those for your own projects.
- If I run `./bin/create-test-logfiles.sh 10000` and then start Splunk Lab on a Mac, all of the files will be indexed without any major issues, but then the CPU will spin, and not from Splunk.
  - The root cause is that the filesystem code for Docker volume mappings in OS/X's Docker implementation is VERY inefficient in terms of both CPU and memory usage, especially when there are 10,000 files involved. The overhead is just crazy. When reading events from a directory mounted through Docker, I see about 100 events/sec. When the directory is local to the container, I see about 1,000 events/sec — a 10x difference.
- The HTTPS cert is self-signed with Splunk's own CA. If you're tired of seeing a certificate error every time you connect to Splunk, you can follow the instructions at https://stackoverflow.com/a/31900210/196073 to allow self-signed certificates for `localhost` in Google Chrome.
  - Please understand the implications before you do this.
- Splunk N' Box - Splunk N' Box is used to create entire Splunk clusters in Docker. It was the first actual use of Splunk I saw in Docker, and gave me the idea that hey, maybe I could run a stand-alone Splunk instance in Docker for ad-hoc data analysis!
- Splunk, for having such a fantastic product which is also a great example of Operational Excellence!
- Eventgen is a super cool way of generating simulated real-looking data that can be used to build dashboards for testing and training purposes.
- This text to ASCII art generator, for the logo I used in the script.
- The logo was made over at https://www.freelogodesign.org/
- Splunk is copyright by Splunk, Inc. Please stay within the confines of the 500 MB/day free license when using Splunk Lab, unless you brought your own license along.
- The various apps are copyright by the creators of those apps.
My email is doug.muth@gmail.com. I am also @dmuth on Twitter and Facebook!