GSoC2014 Proposal: ZMQ transport (sanidhya)

Sanidhya edited this page Mar 21, 2014 · 4 revisions

ZMQ transport

Abstract

This project develops a new transport layer for syslog-ng using ZeroMQ, a high performance messaging library. ZeroMQ makes it easy to provide a single interface for different kinds of communication, such as inter-process, in-process and inter-node, by building on several existing messaging models that are prevalent in communication patterns.

Background

syslog-ng currently transfers data over the following interfaces:

  1. Unix based file descriptors
  2. sockets
     a. UDP (SOCK_DGRAM)
     b. TCP (SOCK_STREAM)

The basic structure for transferring data is LogTransport, which is used by structures such as LogProtoServer and LogProtoClient, and indirectly by the various server and client structures in the afsocket module. The LogProto mechanism transforms a message into a stream, and LogTransport then writes that stream to an fd or socket.

LogTransport is not the only mechanism involved. The core of asynchronous data transfer to a destination via a Unix fd is a queue (to store the data) combined with polling (to signal that the fd is writable).

Currently, all of this is maintained by syslog-ng itself with the help of the ivykis library, which handles the asynchronous transfer of data.
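The queue-plus-polling pattern described above can be sketched in plain POSIX C. This is only an illustration: `MsgQueue`, `queue_push` and `queue_flush` are hypothetical names, not syslog-ng types, and `poll()` stands in for the event loop that ivykis provides.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define QUEUE_CAP 8

/* Hypothetical fixed-size FIFO of pending messages (not a syslog-ng type). */
typedef struct {
    const char *items[QUEUE_CAP];
    int head;   /* index of the oldest queued message */
    int count;  /* number of queued messages */
} MsgQueue;

/* Append a message; returns 0 on success, -1 if the queue is full. */
static int queue_push(MsgQueue *q, const char *msg)
{
    if (q->count == QUEUE_CAP)
        return -1;
    q->items[(q->head + q->count) % QUEUE_CAP] = msg;
    q->count++;
    return 0;
}

/*
 * Wait until fd is writable, then write queued messages one at a time.
 * Returns the number of messages written, or -1 on error.
 * poll() is an assumption standing in for the ivykis event loop.
 */
static int queue_flush(MsgQueue *q, int fd, int timeout_ms)
{
    int sent = 0;

    while (q->count > 0) {
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int ready = poll(&pfd, 1, timeout_ms);

        if (ready < 0)
            return -1;
        if (ready == 0 || !(pfd.revents & POLLOUT))
            break;              /* not writable yet: keep the rest queued */

        const char *msg = q->items[q->head];
        size_t len = strlen(msg);
        if (write(fd, msg, len) != (ssize_t) len)
            return -1;          /* short writes are treated as errors in this sketch */

        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        sent++;
    }
    return sent;
}
```

A caller would push messages as they arrive and call `queue_flush` with the destination fd; anything that cannot be written yet simply stays in the queue.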

Besides this, there are two kinds of destinations: those that use the LogProto mechanism (e.g. affile, afsocket), and thread based ones that rely only on a LogMessage together with a LogTemplate and hand the data to an underlying library (e.g. afsql, redis, afsmtp).

Design and Implementation details

In this project, we aim to provide a ZMQ version of the transport mechanism for transferring data between multiple interfaces. The major reasons for doing so are the following:

  • ZMQ is a highly scalable, high performance asynchronous library for data transfer that supports almost every interface required by the syslog-ng transfer mechanism.
  • By using ZMQ, syslog-ng will no longer have to maintain its own networking code, as it has done until now.

The project consists of four major contributions:

  • Implement a generic ZMQ client model, in the form of either a fan-out/publisher or a push model.
  • Implement its counterpart, a generic ZMQ server model, which is either a fan-in/subscriber or a pull based model.
  • Change the LogWriter mechanism to incorporate ZMQ support for message transfer.
  • Allow the thread based destinations to use the LogWriter/LogProto mechanism. This not only provides a generic API for adding new modules to syslog-ng, but should also improve performance by removing the thread management currently done by LogThrDstDriver.

To support ZMQ, the LogTransport used by both sources and destinations needs to be modified. We need to keep the current LogTransport and provide an abstraction in which LogTransport becomes the base struct, with either the existing implementation or the ZMQ implementation selected at compile time. In other words, when configuring the build, the user chooses between ZMQ and the existing transport used by the community today.

The ZMQ version of the struct would look as follows:

typedef struct LogZMQTransport LogZMQTransport;

struct LogZMQTransport {
    void *zctx;            /* ZMQ context; the gint fd is no longer required */
    GIOCondition cond;     /* I/O condition the transport is waiting for */
    gssize (*read)(LogZMQTransport *self, gpointer buf, gsize count, LogTransportAuxData *aux);
    gssize (*write)(LogZMQTransport *self, const gpointer buf, gsize count);
    void (*free_fn)(LogZMQTransport *self);
};

Here, I believe we do not need the fd, since ZMQ takes care of it internally. I have used zctx, which comes from CZMQ, a high level C binding for libzmq.
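The base-struct idea can be sketched without glib or libzmq. In the example below, all names (`Transport`, `FdTransport`, `MsgTransport`, `transport_send`) are hypothetical, and `MsgTransport` is only an in-memory stand-in for a message-based ZMQ backend. The point is that calling code talks to the base struct and never needs to know which backend is behind it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical base type: callers only ever see these two operations. */
typedef struct Transport Transport;
struct Transport {
    ssize_t (*read)(Transport *self, void *buf, size_t count);
    ssize_t (*write)(Transport *self, const void *buf, size_t count);
};

/* Backend 1: stream over a Unix fd, as in the existing design. */
typedef struct {
    Transport super;    /* first member, so a Transport* can be cast back */
    int fd;
} FdTransport;

static ssize_t fd_transport_read(Transport *s, void *buf, size_t count)
{
    return read(((FdTransport *) s)->fd, buf, count);
}

static ssize_t fd_transport_write(Transport *s, const void *buf, size_t count)
{
    return write(((FdTransport *) s)->fd, buf, count);
}

static void fd_transport_init(FdTransport *self, int fd)
{
    self->super.read = fd_transport_read;
    self->super.write = fd_transport_write;
    self->fd = fd;
}

/* Backend 2: one-message mailbox, a stand-in for a message-based ZMQ socket. */
typedef struct {
    Transport super;
    char slot[256];
    size_t len;         /* 0 means the mailbox is empty */
} MsgTransport;

static ssize_t msg_transport_read(Transport *s, void *buf, size_t count)
{
    MsgTransport *self = (MsgTransport *) s;
    size_t n = self->len < count ? self->len : count;

    memcpy(buf, self->slot, n);
    self->len = 0;      /* the whole message is consumed at once */
    return (ssize_t) n;
}

static ssize_t msg_transport_write(Transport *s, const void *buf, size_t count)
{
    MsgTransport *self = (MsgTransport *) s;

    if (self->len != 0 || count > sizeof self->slot)
        return -1;      /* mailbox occupied or message too large */
    memcpy(self->slot, buf, count);
    self->len = count;
    return (ssize_t) count;
}

static void msg_transport_init(MsgTransport *self)
{
    self->super.read = msg_transport_read;
    self->super.write = msg_transport_write;
    self->len = 0;
}

/* Caller code is identical for both backends. */
static ssize_t transport_send(Transport *t, const char *msg)
{
    return t->write(t, msg, strlen(msg));
}
```

Which `*_init` function gets called could then be decided by a build-time option, which is one way to realize the compile-time selection described above.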

The approach above is one option. Another is to change only the LogProto mechanism so that it transfers the data directly, leaving the LogTransport mechanism untouched. This technique would not be stream based, but on the flip side ZMQ would provide the reliability guarantee.

syslog-ng currently supports UDP transmission, but ZMQ does not provide it. A UDP based transfer mechanism therefore cannot be written using ZMQ, and no support will be provided for it. According to a discussion with Viktor, UDP support will be removed in the future.

Another aspect is the LogWriter mechanism, which either writes data to disk through a Unix based fd or buffers it in a queue. The LogWriter relies on the ivykis library for polling on the fd. ZMQ can take over this role easily, as it is designed to queue data before transferring it.

Deliverables road-map:

Week 1, 2 -- Get more familiar with the latest code base of syslog-ng and discuss the actual implementation details with the community.

Week 3, 4 -- Modify the LogProto mechanism and make the required changes to the client and server so that they can run with ZMQ only.

Week 5, 6 -- Implement a ZMQ based LogWriter mechanism.

Week 7 -- Submit the patch for review and write the relevant test suites.

Week 8, 9, 10 -- Port some of the thread based destination types to LogProto mechanism based destinations.

Week 11 -- Test the code extensively and send the code for review and incorporate the suggestions given by the community.

Week 12 -- Buffer period for unexpected events.

About me:

Name:

Sanidhya Kashyap

IRC Nick:

halohell

Programming languages (fluent):

C, C++, Python, Java, Bash (scripting)

Version control systems:

git, svn

Relevant experiences:

  • I have worked in the area of virtualization for the past two years, mostly exploring live migration and its current state of the art code in QEMU/KVM.
  • I have also designed my own hybrid migration technique (pre-copy + post-copy), which also addresses the reliability of the post-copy algorithm. This work is under submission.
  • I have also designed a distributed hybrid live migration technique, which is also under submission. To make the distributed live migration work, I designed a system, NTracker, which tracks all the memory updates of each VM running on the physical machines inside a cluster. The information is distributed, and ZeroMQ was used for the data transfer.
  • Besides this, I have a very basic understanding of device drivers. Some of my project details can be found on my webpage.