Why waste time writing meaningful instrumentation like a sucker when you can make your users do your monitoring for you? They say that the best alerting is done from the persepctive of the user, so let's harness the fact that people are going to jump on Twitter to complain the moment something goes wrong.
export GOOS="linux"
export GOARCH="amd64"
VERSION=$(git describe --always --dirty)
SHA1=$(git rev-parse --short --verify HEAD)
BUILD_DATE=$(date -u +%F-%T)
go build -ldflags "-extldflags -static -X main.VERSION=${VERSION} -X main.COMMIT_SHA1=${SHA1} -X main.BUILD_DATE=${BUILD_DATE}"
You'll need to Generate a twitter access token pair. The exporter doesn't need to post to Twitter, so you should set the newly created application's permissions model to "Read only".
Once that's done, export the four keys as environment variables.
export TWITTER_ACCESS_TOKEN="..."
export TWITTER_ACCESS_SECRET="..."
export TWITTER_CONSUMER_KEY="..."
export TWITTER_CONSUMER_SECRET="..."
Then run the exporter.
twitter_stream_exporter -twitter.track 'akeyword,anotherkeyword'
The value provided to -twitter.track
should be a comma-separated list of phrases to use in filtering
tweets. See Twitter's API documentation
for details on supported syntax, and continue reading for caveats.
The exporter provides a set of counters that can be used to determine how frequently keywords are being used.
Metric | Notes |
---|---|
twitter_stream_tweets_total | The total number of tweets delivered to the stream. |
twitter_stream_user_mentions_total | The number of times a username provided as an argument to -twitter.track has been @mentioned in the text of a tweet. |
twitter_stream_hashtag_mentions_total | Then number of times a hashtag provided as an argument to -twitter.track has been #mentioned in the text of a tweet. |
twitter_stream_word_mentions_total | The number of times an arguent to -twitter.track has been mentioned as a raw keyword (not an @mention or #hashtag) in the text of a tweet. |
All metrics have a retweet
label (true
or false
). The *_mentions_total
metrics also have a
keyword
label. Keywords are normalised to lowercase.
It's possible for the sum of the *_mentions_total
metrics to exceed twitter_stream_tweets_total
if individual tweets contain multiple keywords or keywords used in multiple contexts. The value of
twitter_stream_tweets_total
may exceed the sum of the other matreics when twitter reutrns tweets
using filtering logic this exporter does not replicate.
A full sample of output can be found below.
twitter_stream_hashtag_mentions_total{keyword="widgetfrobber",retweet="false"} 11
twitter_stream_hashtag_mentions_total{keyword="widgetfrobber",retweet="true"} 23
twitter_stream_hashtag_mentions_total{keyword="dodgycorp",retweet="false"} 7
twitter_stream_hashtag_mentions_total{keyword="dodgycorp",retweet="true"} 14
twitter_stream_tweets_total{retweet="false"} 89
twitter_stream_tweets_total{retweet="true"} 71
twitter_stream_user_mentions_total{keyword="dodgycorp",retweet="true"} 2
twitter_stream_word_mentions_total{keyword="widgetfrobber",retweet="false"} 29
twitter_stream_word_mentions_total{keyword="widgetfrobber",retweet="true"} 19
twitter_stream_word_mentions_total{keyword="dodgycorp",retweet="false"} 37
twitter_stream_word_mentions_total{keyword="dodgycorp",retweet="true"} 11
This exporter uses the Twitter streaming API. The streaming API returns a much more complete set of data than the regular search API, and the last thing we'd want to do is make ill-informed decisions. The streaming API brings some caveats with it.
-
As of 2017-04 each account is limited to a single active stream with up to 400 track keywords.
-
There's a small chance that the streaming API may return duplicate messages, and the exporter makes no attempt to account for this.
-
The underlying Golang Twitter stream handler does not support gzip.
The streaming API doesn't provide any indication as to which filter caused it to deliver a Tweet,
meaning the exporter needs to inspect each message it receives in order to set meaningful labels.
The exporter doesn't implement any of the fuzzier matching that Twitter does
and cannot match track
arguments other than when they appear as single words. The following
circumstances in which it can't detect keywords are known to arise.
- Immediately alongside UTF characters (
KEYWORDベ
). - When separated with an underscore (
KEYWORD_somethingelse
), and possibly other characters. - When adjacent to punctuation (
"KEYWORD"
). - Hashtags (and possibly @mentions and bare words?) on the other side of t.co direct links.
There are some odd occasions in which the stream also appears to return some tweets that seemingly match none of the filters. That may be an expected behaviour of the streaming API, or some less obvious filtering behaviour.