This repository features an example service consisting of multiple components working hand in hand to collect URLs mentioned on Twitter and create a hotlist of popular URLs.
It can be run with docker-compose or Kubernetes.
See the docker-compose.yml files and the Kubernetes manifests for a more technical description of what this example service provides.
The tracker component consumes the Twitter Streaming API, looking for tweets that contain strings such as https in order to fetch all tweets with links. The tweets are then parsed for the URLs they contain.
The URLs found are stored in the inbox Redis database.
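
The URL extraction step described above could look roughly like the following sketch. It assumes the tweet arrives as a decoded JSON dictionary and that links are listed under `entities.urls`, as in the Twitter API v1.1 tweet format, with a simple regular expression on the text as a fallback. The function name `extract_urls` and the pattern are made up for illustration and are not taken from this repository's code.

```python
import re

# Rough pattern for http/https links in plain text (an illustrative assumption).
URL_PATTERN = re.compile(r"https?://[^\s]+")


def extract_urls(tweet):
    """Return the URLs contained in a decoded tweet dictionary."""
    entities = tweet.get("entities") or {}
    # Prefer the expanded form of each link and fall back to the short one.
    urls = [
        item.get("expanded_url") or item.get("url")
        for item in entities.get("urls", [])
    ]
    urls = [url for url in urls if url]
    if not urls:
        # No entity data available: scan the raw tweet text instead.
        urls = URL_PATTERN.findall(tweet.get("text", ""))
    return urls
```

Each returned URL would then be pushed to the inbox Redis database; that step is left out here to keep the sketch free of external dependencies.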
The inbox component is a simple Redis database that receives all URLs found by the tracker component. It uses the official Redis Docker image.
This component deliberately does not provide a volume, so the database content is lost whenever the component is restarted.
The script inside the resolver component reads URLs from the inbox Redis database and requests each URL in order to follow redirects and reveal the actual target URL. The resulting URL is stored in the hotlist Redis database.
To avoid requesting the same URL several times, a cache is maintained in the hotlist Redis database. The resolver component can be thought of as a worker processing jobs from a queue. Since resolving URLs is often time-consuming, multiple instances of this component can work in parallel.
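
The worker pattern described above can be sketched as follows. This is not the repository's script: a `deque` and two dictionaries stand in for the inbox queue, the cache, and the hotlist, and counting occurrences is only an assumed placeholder for the real scoring. The names `follow_redirects` and `resolve_all` are hypothetical.

```python
from collections import deque
from urllib.request import urlopen


def follow_redirects(url, timeout=10):
    """Request the URL and return the final address after redirects."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def resolve_all(inbox, cache, hotlist, fetch=follow_redirects):
    """Drain the inbox queue, resolving each distinct URL at most once."""
    while inbox:
        url = inbox.popleft()
        if url not in cache:
            try:
                cache[url] = fetch(url)
            except OSError:
                # Network and HTTP errors: skip this URL in the sketch.
                continue
        target = cache[url]
        # Assumed scoring: count how often each target URL was seen.
        hotlist[target] = hotlist.get(target, 0) + 1
    return hotlist
```

Passing the `fetch` function as a parameter keeps the loop testable without network access, for example `resolve_all(deque(urls), {}, {}, fetch=my_fake)`.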
The resolver-scaler component contains a small script that watches the size of the inbox Redis database to find out whether it remains constant. If the inbox is growing, the script logs this and reports that there should be more resolver instances to prevent the inbox from growing too big.
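
One way to decide whether the inbox is growing is to compare a few consecutive size samples. The sketch below is an assumption about how such a check might work, not the logic used in this repository; `inbox_is_growing` is a hypothetical name and the samples are plain integers rather than values read from Redis.

```python
def inbox_is_growing(sizes):
    """Return True if the sampled sizes never shrink and end above the first sample."""
    if len(sizes) < 2:
        # A single sample says nothing about a trend.
        return False
    never_shrinks = all(
        later >= earlier for earlier, later in zip(sizes, sizes[1:])
    )
    return never_shrinks and sizes[-1] > sizes[0]
```

With samples taken at a fixed interval, a `True` result would be the point at which the script logs that more resolver instances are advisable.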
As a future improvement, the resolver-scaler can be modified to actually initiate the scaling of the resolver component via the Giant Swarm API.
The hotlist component is the second Redis database. It stores all resolved URLs together with scoring information, and it also contains the cache for the resolver. Just like the inbox component, it uses the official Redis Docker image.
In contrast to the inbox component, the hotlist provides a volume to persist the database across restarts.
A further component contains a small helper that periodically removes outdated information from the hotlist Redis database.
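
What counts as outdated is not specified here. As a minimal sketch, assuming each URL carries a last-seen timestamp and entries older than a maximum age are dropped, a cleanup pass could look like this. The dictionary stands in for the Redis data, and `remove_outdated` is a hypothetical name.

```python
import time


def remove_outdated(last_seen, max_age_seconds, now=None):
    """Delete entries older than max_age_seconds and return the removed keys."""
    now = time.time() if now is None else now
    expired = [
        url for url, seen in last_seen.items()
        if now - seen > max_age_seconds
    ]
    for url in expired:
        del last_seen[url]
    return expired
```

Running such a function in a loop with a sleep between passes would give the periodic behavior described above.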
Another component is a Python/Flask web application that offers a JSON API for fetching the resulting URL hotlist.
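
The response format of the JSON API is not documented in this section. The following sketch only illustrates the idea of turning scored URLs into a ranked JSON list. It uses the standard library instead of Flask, and the field names `url` and `score` as well as the function name `hotlist_json` are assumptions.

```python
import json


def hotlist_json(scores, limit=10):
    """Serialize the highest-scoring URLs as a JSON array, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return json.dumps(
        [{"url": url, "score": score} for url, score in ranked[:limit]]
    )
```

In a Flask application, a view function could return a string like this with a JSON content type.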
The rebrow component offers a web-based user interface ("rebrow" stands for "redis browser") for inspecting the content of both Redis databases. It uses a third-party Docker image.
Credentials to Access Twitter API
Create a Twitter application with the following details:
Name: thux
Description: Tracks URLs mentioned on Twitter and creates a ranked list
Website: https://github.com/giantswarm/twitter-hot-urls-example
Callback URL: <leave this field blank>
Additionally, an Access Token needs to be generated under "Keys and Access Tokens". In the end, four secrets or tokens need to be entered in secrets/twitter-api-secret.env for the docker-compose setup and in secrets/twitter-api-secret.yaml for the Kubernetes example. For Kubernetes, these values need to be base64-encoded; see the Kubernetes documentation on secrets.
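
The base64 encoding for the Kubernetes values can be done with any tool. As a small illustration, the following Python snippet encodes a string the way a Secret's `data` field expects it. The helper name `encode_secret` and the sample value are placeholders, not real credentials or names from this repository.

```python
import base64


def encode_secret(value):
    """Return the base64 encoding of a string as ASCII text."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


# Placeholder value; substitute each of the four Twitter secrets in turn.
print(encode_secret("my-consumer-key"))
```

Make sure the input carries no trailing newline, since it would become part of the encoded secret.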
Starting with Docker Compose
docker-compose up -d
docker-compose ps
docker-compose logs
docker-compose stop tracker
docker network ls
docker network inspect thux_default