Telegram channels' posts aggregator

🚧 ...Work in progress... 🚧

📩 📭 Aggregates posts from your Telegram channel(s) assigned to your bot(s), filters the data, saves it into MongoDB & renders the data using React (see the client folder).

Features & Technologies 💡

Installation

See the server/README.md & client/README.md files for detailed installation steps.

The big picture of CI/CD

CI and the architecture

Figure 1. CI/CD step-by-step.

The setup of Continuous Deployment and prerequisites

My set-up is a healthy combination of containers arranged into a Docker Swarm Stack, the Docker Hub Registry with automated builds, Traefik (v2) and the self-hosted lightweight container-native CI/CD platform called Drone.io with a couple of plugins.

Docker Hub's Automated Builds vs GitHub Actions

The set-up is somewhat similar to the one presented in this post. The post outlines a CI/CD pipeline based entirely on GitHub Actions and some bash scripts executed via SSH on a VPS. Besides the redundant configuration of the Docker Hub secrets and the recurring logins to Docker Hub, the author hard-codes a deployment webhook in the deploy step of the pipeline, i.e. a curl POST request to his own custom endpoint. His solution is a bit over-engineered; though it does allow you to control things manually, and it's useful if you're planning to stick to GitHub Actions.

If you connect your Docker Hub account with a GitHub repository, you actually don't need to configure any additional secrets in GitHub, nor use any GitHub Actions or any commands to build & push your containers (step 5 in Figure 1). With a pre-configured automated build for each stack (or even for each microservice coming from another repository), the Docker Hub Registry (re-)builds and (re-)places the required containers by itself, provided there was a push event (step 1).

An automated build avoids the manual work of building, tagging and pushing Docker images. It also makes it easier to keep the code in your images in sync with the code in your version control system. Lastly there are some important security benefits if you rebuild your images regularly to incorporate security updates. Source: Why use an automated build?
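In other words, an automated build takes over the steps you would otherwise run by hand on every change. A rough sketch of those manual steps, using this repo's frontend as an example (the image name and tag are placeholders):

# build & tag the image locally, then push it to the registry by hand
docker build -t {organisation-name}/{repo-name}:latest -f ./client/Dockerfile.prod ./client
docker push {organisation-name}/{repo-name}:latest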

For instance, I have two containers in the current GitHub repository with the corresponding Dockerfiles for the frontend and for the backend: ./client/Dockerfile.prod and ./server/Dockerfile.prod. The example configuration of the frontend's container is shown in the image below:

Automated Frontend Configuration of the Docker Build

Figure 2. Automated docker build: frontend container's configuration.

These two containers are (re-)built and saved in the Docker Hub Registry every time I push new changes to the repository these microservices belong to.

To avoid any misconfiguration, it's recommended to verify the GitHub webhooks (https://github.com/{github-user}/{github-repo}/settings/hooks) that Docker Hub adds after the set-up of the automated build(s) in the Docker account: https://hub.docker.com/repository/docker/{organisation-name}/{repo-name}/builds/edit.

Automated builds allow you to go even deeper. For instance, if you need to build a container with some custom parameters, e.g. pass additional arguments or environment variables, you can override the docker build command by setting up the custom build phase hooks. The docker build command would then look like this:

docker build -t $IMAGE_NAME -f $DOCKERFILE_PATH --build-arg CUSTOM_ENV_VAR=$ENV_VAR_FROM_DOCKER_HUB .

If you save the ENV_VAR_FROM_DOCKER_HUB environment variable in the Automated Build's settings (Build configurations in Figure 2), you will then be able to access it as an argument later, as explained below:

ENV vars in an Automated Build on Docker Hub

Figure 3. Set up of environment variables in Automated Build's settings.

As a result, the CUSTOM_ENV_VAR will be available and can then be accessed during the build phase like so:

# see the complete example in: ./client/Dockerfile.prod
FROM mhart/alpine-node:10 as machine-1
ARG CUSTOM_ENV_VAR
ENV CUSTOM_ENV $CUSTOM_ENV_VAR
RUN PASS_THIS_ENV_VAR_TO_SCRIPT=${CUSTOM_ENV} node scripts/script.js
# the script can read the variable via: process.env.PASS_THIS_ENV_VAR_TO_SCRIPT
# see: ./client/src/utils/constants/index.js
RUN echo $CUSTOM_ENV_VAR > /files/in/the/path/file_with_a_custom_var
# (...)
# and then in another build stage of your Dockerfile
# (each stage starts with its own FROM instruction):
COPY --from=machine-1 /files/in/the/path /copied/here/in/another/path
RUN cat /copied/here/in/another/path/file_with_a_custom_var

By the way, you need to prepend the REACT_APP_ prefix to your environment variables if you are compiling a React application (based on create-react-app) during the build phase and want to pass any ENV vars into it:

# (...) so the line with an environment variable becomes:
RUN REACT_APP_ENV_VAR=${CUSTOM_ENV} node scripts/build.js --env.NODE_ENV=production
# (...)
# you can then read the env variable in any script using: process.env.REACT_APP_ENV_VAR

Of course you can set up multiple environment variables (like in this example) using the --build-arg flag:

docker build --build-arg <varname1>=<value1> --build-arg <varname2>=<value2> -t $IMAGE_NAME -f $DOCKERFILE_PATH .

In case you're wondering what $IMAGE_NAME and $DOCKERFILE_PATH are: those are the default utility environment variables that "are available during automated builds". The flags -t and -f with these variables should remain unchanged in order to correctly rerun (i.e., override) the docker build command with the same build configuration, as shown previously in Figures 2 and 3.

Caution: A hooks/build file overrides the basic docker build command used by the builder, so you must include a similar build command in the hook or the automated build fails. Source: Override the "build" phase to set variables.

All this is required because:

For sensible reasons Docker don't allow dynamic code to run in the Dockerfile. Instead the ARG command is provided to pass data into the Dockerfile. This is easy when building images locally but for automated builds you need to use a build hook script. (...) Create a file called /hooks/build relative to your Dockerfile. This overrides the Docker build process so your script needs to build and tag the image. Source: Docker don't allow dynamic code to run in the Dockerfile

Read this post for another short explanation and a good example.

The complete code of my example of a build hook script is here.
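In case that link moves, a minimal sketch of such a hooks/build script (placed next to the Dockerfile, as the quote above describes) could look like this, with ENV_VAR_FROM_DOCKER_HUB being the variable from Figure 3:

#!/bin/bash
# hooks/build — overrides Docker Hub's default build step;
# $IMAGE_NAME and $DOCKERFILE_PATH are injected by the builder
docker build -t $IMAGE_NAME -f $DOCKERFILE_PATH --build-arg CUSTOM_ENV_VAR=$ENV_VAR_FROM_DOCKER_HUB .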

The deployment step with Drone CI/CD


As for the (re-)deployment (step 8 in Figure 1), the Drone CI/CD tool is responsible for it, as well as for the notification about the build's status (step 9 in Figure 1), at least in the pipeline of this repo. The new Docker containers are pulled (step 7 in Figure 1) and the Docker Swarm Stack is updated by executing one single command at the end via the Drone SSH plugin on a VPS:

docker stack deploy -c /path/to/docker-compose-stack-file.yml {stack-name}

If you prefer to use multiple repositories for each stack (or even for each microservice), you can also set up Drone CI/CD to trigger a Drone build (deployment) of the current repository with a set of alternative pipeline commands for the containers used in this stack only.

It is always possible to update services in a stack one after another, applying the rolling update to each microservice separately:

docker service update {SERVICE-NAME}
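For instance, to roll a single service of the stack to a freshly pushed image (the stack, service and image names below are placeholders):

# services created by `docker stack deploy` are named {stack-name}_{service-name}
docker service update --image {organisation-name}/{repo-name}:latest {stack-name}_{service-name}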

Docker Registry Webhooks & Drone.io insights

Docker Hub's outgoing Webhooks can be received by Drone CI/CD

Step 6 of Figure 1 is all about Docker Hub webhooks, which are "(...) POST requests sent to a URL you define in Docker Hub" by default. This way one can notify Drone CI/CD (or any other platform) that the containers were built. You can set up the webhook in Docker Hub in one of your repositories, preferably in the repository that's built last of all:

https://hub.docker.com/repository/docker/{organisation-name}/{repo-name}/webhooks

The Webhook's name is not important, but the Webhook URL is crucial and is explained in detail in the next section.

Which Docker Hub webhook URLs are valid to use with the Drone.io API?

To process an external call I use Drone.io's built-in REST API endpoint, which receives GET, POST, DELETE and other external requests.

Accessing the endpoint, you can, for instance, view the list of recent deployments (deployments in the Drone CI/CD platform are called builds) in one of your repositories by sending the following GET request using curl or just by opening this page in your browser:

http://YOUR_IP_OR_DNS:PORT/api/repos/{github-owner}/{repo-name}/builds
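The same list can be fetched with curl; DRONE_TOKEN below is a placeholder for the personal bearer authorization token mentioned further down:

curl -H "Authorization: Bearer ${DRONE_TOKEN}" \
  "http://YOUR_IP_OR_DNS:PORT/api/repos/{github-owner}/{repo-name}/builds"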

I'm not running any tests nor am I building & pushing any containers, as they were already built by Docker Hub. However, in my case I need to trigger the pipeline and start a new deployment process (a Drone build), which will then pull all the newly built containers from Docker Hub and deploy them to Swarm.

To trigger the deploy step in the pipeline, a new build needs to be invoked with the custom (or promote) Drone event type. Consequently, a new build with the corresponding listing of the pipeline commands is limited to the execution of the deploy step, as stated by the Drone developers. In fact, the execution isn't limited to just one step: the pipeline runs starting from the step with the custom event and then executes all the subsequent steps. In this way the deployment is triggered by Drone CI/CD after the webhook from Docker Hub is received by Drone's API endpoint as a valid POST request:

POST http://YOUR_IP_OR_DNS:PORT/api/repos/{github-owner}/{repo-name}/builds?branch={branch-in-the-repo-with-the-drone-yml-pipeline-file}

That would have been enough if the API endpoint were public. How to accomplish an authorized POST request without curl, as well as the /builds parameters, are so far (March 2020) quite badly documented. By the way, the personal bearer authorization token generated by Drone.io itself can be found in the account settings: http://YOUR_IP_OR_DNS:PORT/account.

In Docker Hub there is currently no way to send POST requests with parameters like you would do on reqbin.com, nor is there an option to provide a type of authorization, etc. So what should the webhook URL look like to form a valid request that satisfies the Drone.io API endpoint?

It turns out you can send an authorized POST request by passing the ?access_token= parameter, which, among other parameters, is validated during the creation of a build. The final webhook URL that triggers a new build should then look like this:

http://YOUR_IP_OR_DNS:PORT/api/repos/{github-owner}/{repo-name}/builds?branch={branch-in-the-repo-with-the-drone-yml-pipeline-file}&access_token={authorization-token}

You can omit the branch parameter to use the repo's default GitHub branch instead.
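Before pasting the URL into Docker Hub, you can verify it manually by replaying the webhook as a plain POST request (same placeholders as above):

curl -X POST "http://YOUR_IP_OR_DNS:PORT/api/repos/{github-owner}/{repo-name}/builds?branch={branch-in-the-repo-with-the-drone-yml-pipeline-file}&access_token={authorization-token}"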



Posts with similar setups I was inspired by:

Posts on Docker Swarm Clusters

Create a Docker Swarm Cluster on DigitalOcean

Posts on Traefik v2

Traefik v2 with the static and dynamic configuration

Other Drone.io sources

Sample of a .drone.yml configuration file.

Docs of the Drone plugin that executes commands on a remote server via SSH / Drone SSH on GitHub:

The Drone Telegram Plugin to send notifications of the build status to a chat in Telegram:


Credits 🙏

Inspired by and based on this repo.
