Splunk Telegram

This app lets you run Splunk against messages from Telegram groups and generate graphs based on the activity in them.

Splunk Telegram includes a Natural Language Processing (NLP) module which lets you extract features such as sentiment and named entities.

This app is based on my other app, Splunk Lab, which is a general-purpose Splunk platform built for ingesting data on an ad-hoc basis. You should check it out!

Requirements

  • Docker
  • HTML exports from a Telegram conversation, channel, or group.
    • Exporting is explained further below

Usage

  • First step is to convert Telegram's HTML into JSON that Splunk can understand:
    • bash <(curl -s https://raw.githubusercontent.com/dmuth/splunk-telegram/master/1-telegram-html-to-json.sh) path/to/telegram-export/messages\*.html > logs/Group-Name.json
  • Then, run Splunk:
    • SPLUNK_START_ARGS=--accept-license bash <(curl -s https://raw.githubusercontent.com/dmuth/splunk-telegram/master/2-start-splunk.sh)
    • You'll be presented with a list of options to confirm. To change any of them, set the corresponding environment variables and re-run; otherwise, press ENTER to launch Splunk.
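
Before running the first step, it can help to confirm that the export directory actually contains the files the converter will be given. The following is a minimal sketch and not part of this repo: `count_export_files` is a hypothetical helper, and it assumes the export's message files match `messages*.html`, as in the command above.

```shell
# Hypothetical pre-flight helper (not part of splunk-telegram).
# Prints how many files matching messages*.html sit directly inside
# the directory given as the first argument.
# The body runs in a subshell, so its variables do not leak to the caller.
count_export_files() (
    dir="$1"
    count=0
    for f in "$dir"/messages*.html; do
        # If nothing matches, the unexpanded pattern is left in $f,
        # so check that the path really exists before counting it.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
)
```

For example, `count_export_files path/to/telegram-export` should print a number greater than zero before you convert. The redirect in the first command also fails if `logs/` does not exist yet, so running `mkdir -p logs` beforehand is a cheap precaution.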

By default, Splunk will be listening at https://localhost:8000/.
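
Splunk can take a little while to come up after launch. As a rough sketch, one way to wait for it is to poll the URL until something answers. `wait_for_url` is a hypothetical helper, not part of this repo. The `-k` flag is used on the assumption that the local instance serves a self-signed certificate, and the default attempt count and delay are arbitrary.

```shell
# Hypothetical helper (not part of splunk-telegram): poll a URL until it
# answers or the attempts run out. Returns 0 once curl gets any HTTP
# response, 1 if every attempt fails.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_url() (
    url="$1"
    attempts="${2:-30}"
    delay="${3:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -k skips certificate verification (assumption: self-signed cert),
        # -s silences progress output, -o /dev/null discards the body.
        if curl -k -s -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
)
```

A possible use would be `wait_for_url https://localhost:8000/ && echo "Splunk is up"`.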

Exporting Data From Telegram

Telegram has a blog post which explains how to export data. However, if you follow those instructions, everything will be exported, a process which will take hours. Instead, we recommend that you export a single channel, group, or conversation at a time. This can be done in the Telegram Desktop app by going into the conversation or group and manually exporting it.

This will save the conversation in Telegram's own HTML format, which we can then parse to extract messages.
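
As a rough sanity check on an export, you can count the message blocks in the HTML before converting it. This sketch assumes each regular message is a `div` whose `class` attribute starts with `message default`. That is an assumption about Telegram's export markup, which may differ between versions. `count_messages` is a hypothetical helper and is not how this app's converter parses the files.

```shell
# Hypothetical helper (not part of splunk-telegram).
# Prints the number of regular message blocks found in the given HTML files.
# Assumption: regular messages are marked with class="message default ...".
count_messages() (
    # grep -o prints each match on its own line, wc -l counts those lines,
    # and tr strips the padding some wc implementations add.
    cat "$@" | grep -o 'class="message default' | wc -l | tr -d ' '
)
```

For instance, `count_messages path/to/telegram-export/messages*.html` gives a figure you can compare loosely against what ends up in Splunk, keeping in mind the unsupported message types listed under TODO/Bugs.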

Licensing

Splunk has its own license. Please abide by it.

The Docker image ships with the NLP Text Analytics app, which is licensed under the MIT License.

TODO/Bugs

  • Only regular messages are supported at this time. If a photo or sticker is found, a note will be made that it was a photo of a specified size.
  • No other media types (including stickers) are supported at this time.
  • Forwarded messages are not counted/supported at this time.

Contact

My email is doug.muth@gmail.com. I am also @dmuth on Twitter and Facebook!
