ai-kafka

An implementation using Kafka as a Service

Application Description

The ai-kafka library performs the following actions:

  • monitors website availability over the network,
  • produces metrics about the website availability,
  • persists the events passing through an Aiven Kafka instance into an Aiven PostgreSQL database.

To do this, it implements a Kafka producer that periodically checks the target websites and sends the check results to a Kafka topic, and a Kafka consumer that stores the data in an Aiven PostgreSQL database. For this local setup, both components run on the same machine.
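
The producer-to-consumer flow described above can be sketched roughly as follows. This is a minimal illustration and not the code shipped in this repository: it assumes check results are plain dicts sent as JSON, and the helper names (`encode_result`, `decode_result`, `ssl_options`, `make_producer`, `make_consumer`) are hypothetical. The certificate file names follow the certs/kafka/ layout described under "Running the application".

```python
import json
from pathlib import Path


def encode_result(result):
    """Serialize a check result (a dict) to UTF-8 JSON bytes for Kafka."""
    return json.dumps(result, sort_keys=True).encode("utf-8")


def decode_result(payload):
    """Inverse of encode_result, used on the consumer side."""
    return json.loads(payload.decode("utf-8"))


def ssl_options(cert_dir="certs/kafka"):
    """Build the SSL keyword arguments shared by producer and consumer."""
    base = Path(cert_dir)
    return {
        "security_protocol": "SSL",
        "ssl_cafile": str(base / "ca.pem"),
        "ssl_certfile": str(base / "service.cert"),
        "ssl_keyfile": str(base / "service.key"),
    }


def make_producer(bootstrap_servers, cert_dir="certs/kafka"):
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    return KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=encode_result,
        **ssl_options(cert_dir),
    )


def make_consumer(topic, bootstrap_servers, cert_dir="certs/kafka"):
    from kafka import KafkaConsumer

    return KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        value_deserializer=decode_result,
        auto_offset_reset="earliest",
        **ssl_options(cert_dir),
    )
```

With these helpers, the producer side would call `producer.send(topic, result)` followed by `producer.flush()`, and the consumer side would iterate over the consumer and read each decoded dict from `message.value`. The topic name and the service address are whatever the Aiven Kafka service is configured with.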

The website checker should perform the checks periodically and collect the HTTP response time and the error code returned, and optionally check the returned page contents for a regexp pattern that is expected to be found on the page.
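
A single check of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the project's implementation: it uses the standard library's `urllib` instead of the requests package listed in the prerequisites, and the function name `check_website`, its `opener` parameter, and the keys of the returned dict are all hypothetical.

```python
import re
import time
import urllib.error
import urllib.request


def check_website(url, pattern=None, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url once and return a dict describing the check result."""
    result = {
        "url": url,
        "status_code": None,
        "response_time": None,
        "pattern_found": None,
        "error": None,
    }
    body = None
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            result["status_code"] = response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still carry a status code worth recording.
        result["status_code"] = exc.code
    except OSError as exc:
        # Covers URLError, DNS failures, refused connections and timeouts.
        result["error"] = str(exc)
    result["response_time"] = time.monotonic() - start

    # The pattern check is optional and only meaningful if a page was read.
    if pattern is not None and body is not None:
        result["pattern_found"] = re.search(pattern, body) is not None
    return result
```

Calling this function in a loop with a `time.sleep(interval)` between rounds would give the periodic behaviour; the `opener` parameter exists only so that the function can be exercised without network access.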

The database writer should record the check results in one or more database tables and handle a reasonable number of checks performed over a longer period of time.
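
One possible shape for such a writer is sketched below. The table name web_metrics comes from this README, but the column names, the `row_from_result` and `store_results` helpers, and the result-dict keys are assumptions made for illustration; the actual schema used by the application may differ.

```python
# Hypothetical schema: only the table name is taken from this README.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS web_metrics (
    id            BIGSERIAL PRIMARY KEY,
    url           TEXT NOT NULL,
    checked_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    status_code   INTEGER,
    response_time DOUBLE PRECISION,
    pattern_found BOOLEAN
)
"""

INSERT_SQL = """
INSERT INTO web_metrics (url, status_code, response_time, pattern_found)
VALUES (%s, %s, %s, %s)
"""


def row_from_result(result):
    """Map a check-result dict to the parameter tuple used by INSERT_SQL."""
    return (
        result["url"],
        result.get("status_code"),
        result.get("response_time"),
        result.get("pattern_found"),
    )


def store_results(dsn, results):
    """Insert a batch of check results in a single transaction."""
    import psycopg2  # imported lazily; needs the psycopg2 package

    conn = psycopg2.connect(dsn)
    try:
        with conn:  # commits on success, rolls back on error
            with conn.cursor() as cur:
                cur.execute(CREATE_TABLE_SQL)
                cur.executemany(INSERT_SQL, [row_from_result(r) for r in results])
    finally:
        conn.close()
```

Writing results in batches, one transaction per batch, is one way to keep the insert cost manageable as the number of checks grows; an index on the URL and timestamp columns would likely help queries over longer periods.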

Pre-Implementation installations

This application was developed and tested on Ubuntu 20.04.1 LTS.

(Note: This application is tested with Python 3.6 or higher)

  • sudo apt-get install postgresql
  • sudo apt-get install libpq-dev
  • sudo pip3 install psycopg2
  • sudo pip3 install kafka-python
  • sudo pip3 install requests

Installing the ai-kafka application

$ git clone https://github.com/mmnelemane/ai-kafka
$ cd ai-kafka
$ sudo python3 setup.py install

This installs the aikafka executable in /usr/local/bin/ on the host.

Running the application

  1. Ensure that Aiven Kafka and Aiven PostgreSQL services are running.

  2. Download and store the certificates for Aiven Kafka and PostgreSQL services. The files are expected to be stored in the following directory structure:

    certs/
        kafka/
            ca.pem
            service.key
            service.cert
        pgsql/
            ca.pem
    
  3. Update the config file with the details of the Aiven services. A sample, ai-kafka.conf.sample, is included in the package; refer to it when filling in the config.

  4. Write an input file in the format of weburls.json, listing each URL and the text to search for on that page.

  5. Start the aikafka application:

    $ aikafka --configfile <configfile_name> --inputfile <inputfile_name>
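
As an illustration of step 4, the sketch below parses an input file into URL and search-pattern pairs. The exact layout of weburls.json is not reproduced in this README, so the format assumed here (a JSON array of objects with a "url" key and an optional "pattern" key) and the function name `load_targets` are hypothetical; check the weburls.json shipped with the package for the real format.

```python
import json


def load_targets(text):
    """Parse the JSON text of an input file into (url, pattern) pairs.

    Assumed (hypothetical) layout:
        [{"url": "https://example.com", "pattern": "Example"}, ...]
    """
    entries = json.loads(text)
    targets = []
    for entry in entries:
        url = entry["url"]
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an HTTP(S) URL: {url!r}")
        # The search pattern is optional; None means "skip the content check".
        targets.append((url, entry.get("pattern")))
    return targets
```

Validating the entries when the file is loaded means a malformed URL is reported at startup rather than during the first check.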
    

Additional commands to help

  1. To check that the configuration has been read properly:

    $ aikafka --configfile <configfile_name> --inputfile <inputfile_name> --printconfig

  2. To print help text for the application:

    $ aikafka --help

  3. Shortcuts for options:

    "--configfile"  == "-c"
    "--inputfile"   == "-i"
    "--printconfig" == "-p"
    "--help"        == "-h"

  4. The recorded website information can be obtained by logging into the pgsql database defaultdb. The entries are recorded in the web_metrics table and can be fetched with:

    SELECT * from web_metrics;
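
For readers curious how the options and shortcuts listed in this section map onto a parser, the following is a minimal `argparse` sketch. It is not the application's actual argument handling: the function name `build_parser`, the help strings, and the choice to make both file options required are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented aikafka options."""
    parser = argparse.ArgumentParser(
        prog="aikafka",
        description="Check websites and record the results via Kafka.",
    )
    # -h/--help is added automatically by argparse.
    parser.add_argument("-c", "--configfile", required=True,
                        help="path to the configuration file")
    parser.add_argument("-i", "--inputfile", required=True,
                        help="path to the JSON file listing the URLs to check")
    parser.add_argument("-p", "--printconfig", action="store_true",
                        help="print the parsed configuration")
    return parser
```

For example, `build_parser().parse_args(["-c", "ai-kafka.conf", "-i", "weburls.json", "-p"])` yields a namespace with `configfile`, `inputfile` and `printconfig` attributes.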
    

Features yet to be implemented

Enhancing the Application

  1. Create an events table which will record changes in the web_metrics. The entries in the events table could be done through a trigger in web_metrics table.

  2. A cleaner way for users to fetch database tables.

  3. A completed Debian or RPM package (.deb or .rpm) to install on several platforms.

  4. To be able to run aikafka as a systemd service daemon.

  5. An aikafkactl API that can interact with the systemd daemon to provide functionality for the user.

  6. A way to clear old entries (e.g., older than a few days) to ensure scalability.

  7. An improved logging mechanism with multithreading to help troubleshooting.

  8. Complete and improve tests.
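
As a rough idea of how item 6 could be approached, the sketch below deletes rows older than a given number of days. It is not part of the current code base: it assumes that web_metrics has a timestamp column, here called `checked_at` (a hypothetical name), and that `conn` is an open psycopg2 connection.

```python
from datetime import datetime, timedelta, timezone

# checked_at is an assumed column name; adjust it to the real schema.
DELETE_OLD_SQL = "DELETE FROM web_metrics WHERE checked_at < %s"


def retention_cutoff(days, now=None):
    """Return the timestamp before which entries count as old."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(days=days)


def delete_old_entries(conn, days=7):
    """Delete entries older than `days` days; return the number removed."""
    with conn:  # commits on success, rolls back on error
        with conn.cursor() as cur:
            cur.execute(DELETE_OLD_SQL, (retention_cutoff(days),))
            return cur.rowcount
```

Such a function could be run from a scheduled job, or from the consumer itself after every N inserts.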

References

  • Basic parts of the producer and consumer code were taken from:
  • Basic parts of the PostgreSQL client were taken from:
  • Several Stack Overflow answers and Python blogs were used to learn about specific usage syntax.
