
GSoC2014 Proposal: ZMQ transport, source and destination (lmesz)

lmesz edited this page Mar 20, 2014 · 5 revisions

Connection of Syslog-NG and ZeroMQ:

Today syslog-ng is the de facto logging standard: it is used on many different platforms and supports a large number of destinations. Although it is built for many platforms, it is also fast and reliable.

ZeroMQ is a very popular, "more practical" socket library. MQ stands for message queue, but in practice it is a more convenient way of using sockets. It hides a lot of overhead from developers (for example, automatic reconnection) and makes it possible to use many messaging patterns (pub-sub, req-rep, push-pull, router-dealer).

It would be good if syslog-ng could send logs to a zmq destination and receive logs from a zmq source, and if it could also use zmq for its internal communication, where zmq could replace logtransport and logproto, the layers that handle network issues.

One of the problems this idea could solve:

Sysadmins around the world install syslog-ng on hosts and servers alike, and servers generate a lot of logs every second. Admins do not have time to configure every server individually, so they create a general config that sends every log to a central log collector without any filtering. With the zmq pub-sub mechanism they could filter on the server side based on a simple string. With zmq, admins do not need to create a destination for every server: one is enough. Servers only need to subscribe to it, and after that they receive all the messages they subscribed to.
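The string-based filtering described above can be illustrated with a small model. In libzmq a subscriber registers a filter with zmq_setsockopt() and the ZMQ_SUBSCRIBE option, and a message is delivered when it starts with that filter. The sketch below only models that prefix rule in plain C, without linking libzmq. The function names and the sample log lines are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Models the prefix match applied to a pub-sub subscription: a message is
 * delivered when its first bytes equal the subscribed prefix.
 * An empty prefix matches every message. */
bool subscription_matches(const char *prefix, size_t prefix_len,
                          const char *msg, size_t msg_len)
{
    assert(prefix != NULL && msg != NULL);
    if (prefix_len > msg_len)
        return false;            /* message is shorter than the filter */
    return memcmp(prefix, msg, prefix_len) == 0;
}

/* Count how many of the n messages a subscriber with this prefix receives. */
size_t count_delivered(const char *prefix, const char *const msgs[], size_t n)
{
    size_t delivered = 0;
    for (size_t i = 0; i < n; i++) {
        if (subscription_matches(prefix, strlen(prefix),
                                 msgs[i], strlen(msgs[i])))
            delivered++;
    }
    return delivered;
}
```

With this rule, a subscriber interested only in one host could subscribe to that host's name, provided the publisher puts the host name at the start of every message.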

Benefits:

People could more easily debug Salt, a very popular configuration management tool, debug their own internal tools that use zmq, send data to Logstash via this protocol, or monitor their infrastructure.

The other benefit of this project is performance, if syslog-ng could use zmq in its transport layer. Syslog-ng uses logproto and logtransport internally. These carry a lot of overhead because of the nature of a plain socket, which could be replaced with a zmq interprocess socket. That has several advantages: speed, lower performance overhead, maintainability (thanks to less complexity), and internals that are easier to understand.
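One way to picture a replaceable transport is an interface of function pointers with interchangeable backends behind it. The sketch below is only an illustration under that assumption: the Transport and MemTransport types and every function name are hypothetical, and this is not syslog-ng's actual LogTransport API. The in-memory backend stands in for a real one, which would wrap a plain socket or a zmq socket behind the same two callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport interface (not syslog-ng's real LogTransport). */
typedef struct Transport Transport;
struct Transport {
    /* Both callbacks return the number of bytes handled, or -1 on error. */
    long (*send)(Transport *self, const void *buf, size_t len);
    long (*recv)(Transport *self, void *buf, size_t cap);
};

/* Stand-in backend that keeps a single message in memory. */
typedef struct {
    Transport base;   /* first member, so MemTransport* converts to Transport* */
    char data[256];
    size_t used;
} MemTransport;

static long mem_send(Transport *self, const void *buf, size_t len)
{
    MemTransport *m = (MemTransport *)self;
    if (len > sizeof m->data)
        return -1;                    /* message does not fit */
    memcpy(m->data, buf, len);
    m->used = len;
    return (long)len;
}

static long mem_recv(Transport *self, void *buf, size_t cap)
{
    MemTransport *m = (MemTransport *)self;
    size_t n = m->used < cap ? m->used : cap;   /* truncate to caller's buffer */
    memcpy(buf, m->data, n);
    m->used = 0;                      /* message consumed */
    return (long)n;
}

void mem_transport_init(MemTransport *m)
{
    assert(m != NULL);
    m->base.send = mem_send;
    m->base.recv = mem_recv;
    m->used = 0;
}

/* Callers see only the interface, so the backend can be chosen at runtime. */
long relay_message(Transport *t, const char *msg)
{
    return t->send(t, msg, strlen(msg));
}
```

Because relay_message() depends only on the interface, switching to a different backend would not require changes in the calling code.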

Zmq evolves from day to day, and new features keep arriving in its code base (e.g. authentication and encryption). These are not part of the GSoC project, but later I would like to use them in syslog-ng (for example, an encrypted connection between two syslog-ng instances via a zmq socket).

Aims:

There are two different aims that I would like to reach with this project.

Implement zmq transport

If possible, the type of transport would become optional.

Implement zmq source and destination.

A simple source like tcp/udp, file, etc., and a destination like sql, tcp, redis, amqp ...

Implementation:

To reach these aims I would like to use libzmq, which provides a C API and can be used easily with the syslog-ng code base.

I think implementation includes testing too. I would like to write functional tests that prove the destination and source work, and unit tests that show whether each function works correctly on its own.

Timeline:

  • May 19th - Apr 13th: Gain more knowledge about the syslog-ng internal code base

      • Get deeper knowledge about LogTransport and LogProto, find out how they can be replaced, and evaluate them.

  • Apr 14th - May 4th: Implement zmq source

      • Write an end-to-end functional test that checks whether pub-sub, req-rep and push-pull work properly (probably a simple C application)

      • Implement the main functions zmq_sd_init_instance(), zmq_sd_init_method(), zmq_sd_deinit_method(), zmq_sd_notify() (I am still not really sure whether this is the complete list)

      • Implement the helper functions and setters.

      • Create unit tests that will show if something breaks later (a kind of regression test)

  • May 5th - May 25th: Implement zmq destination

      • Write an end-to-end functional test that checks whether pub-sub, req-rep and push-pull work properly (probably a simple C application)

      • Implement the necessary functions zmq_dd_new(), zmq_dd_init(), zmq_dd_connect(), zmq_dd_disconnect(), zmq_worker_insert()

      • Implement the helper functions zmq_set_socket_type(), zmq_set_host(), zmq_set_port(), zmq_set_template(), etc.

      • Add unit tests that prove the correctness of the functions

  • May 26th - June 14th: Measure the code

      • Performance, code coverage, "leak hunting"

  • June 15th - Aug 3rd: Implement the transport layer

      • If the user would like to use zmq, logtransport and logproto may be replaced by an underlying zmq interprocess socket, which results in faster log relaying in syslog-ng.

  • Aug 4th - Aug 19th: Review and code cleanup

      • Add new tests

      • Make the code cleaner: descriptive function and variable names, less code duplication

      • Measure performance and check for bottlenecks (perf, valgrind ...)

(These times contain some buffer in case anything unexpected happens)
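As an illustration of what the destination setters in the timeline might feed into, the sketch below combines a host and a port into a ZeroMQ TCP endpoint string of the form tcp://host:port, which is the kind of string that zmq_connect() and zmq_bind() accept. The build_tcp_endpoint() helper is hypothetical and is not part of libzmq or syslog-ng. It assumes that the values come from setters such as zmq_set_host() and zmq_set_port().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "tcp://<host>:<port>" into out.
 * Returns 0 on success, -1 if the input is invalid or out is too small. */
int build_tcp_endpoint(char *out, size_t out_len, const char *host, int port)
{
    assert(out != NULL && host != NULL);
    if (strlen(host) == 0)
        return -1;                        /* empty host name */
    if (port < 1 || port > 65535)
        return -1;                        /* outside the TCP port range */
    int n = snprintf(out, out_len, "tcp://%s:%d", host, port);
    if (n < 0 || (size_t)n >= out_len)
        return -1;                        /* encoding error or truncated */
    return 0;
}
```

Validating the values in one place like this would let the unit tests cover bad configuration without opening any socket.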


About me:

I am a student at Széchenyi István University, studying software engineering (I will graduate next year). I have been using Linux since 2008, at home and at work. I mostly use Ubuntu, but I have also used Red Hat, CentOS and SUSE. I have used Solaris only at user level. I have knowledge of network security.

I have programming experience in C, Java, Python, PHP and Bash. My favourite scripting language is Python because, although it is very easy to use, it is also easy to build complex systems with it. If I have to choose a typical non-interpreted language, I choose C because of its efficiency.

I first met logs at Nokia Siemens Networks, where I had to create a PHP-based web service that could easily parse nightly build results, which were stored in really huge XML files (I have not seen XML files that big since). There I learned about the syntax of logs and why they are so important and useful.

I have known syslog-ng since 2008. I like its config syntax, its usage, and how it can be debugged. Last year I started to get to know its code base and to read its mailing list. I have not yet contributed to any open source project, but it is one of my aims to do so, and I think this is a good starting point. I believe that an open source project can be popular or useful only if it solves at least one problem, and this project could solve at least one. The other reason I chose this project is the community around syslog-ng and ZeroMQ.
