GSoC2014 Proposal: XMPP Destination (Shonali Balakrishna)

ShonaliB edited this page Mar 21, 2014 · 14 revisions
  • Name: Shonali Balakrishna
  • Date of birth: 02-02-1990
  • Sex: Female
  • Nationality: Indian
  • Email: shon.balakrishna@gmail.com
  • Location: Bangalore, India
  • University: TU Delft
  • Major: Electrical Engineering (Telecommunications and Sensing Systems)
  • Year of graduation: 2016
  • Degree: Master of Science
  • Phone number: +919036083277

Project Proposal

The goal of this project is to deliver an XMPP destination for syslog-ng that pushes log messages as XMPP messages to XMPP servers. This will be achieved in two parts: first with a simple message push model, and second with a publish/subscribe model. The Libstrophe library will be used for both. Libstrophe is well documented, easy to use and extend, has minimal dependencies, and is configurable for various environments. It is also reasonably easy to integrate with syslog-ng.

Benefits of proposed project

Used in syslog-ng, XMPP can serve as a useful alert and event notification channel. A major advantage is that XMPP was designed to deliver messages in near real time using an efficient push mechanism. XMPP is pure XML and is extensible, scalable, network-aware, firewall-friendly, and manageable. Its long-lived XML streams avoid per-message connection setup, which helps scalability and performance. XMPP flows as XML streams over TCP sockets, so it inherits TCP's reliable delivery and congestion control. XMPP also builds security into the protocol, with TLS for channel encryption and SASL for authentication.

Project implementation

The Extensible Messaging and Presence Protocol (XMPP) is an application profile of the Extensible Markup Language [XML] that enables the near-real-time exchange of structured yet extensible data between any two or more network entities. Its purpose is to exchange relatively small pieces of structured data (called "XML stanzas") over a network. XMPP is typically implemented using a distributed client-server architecture, in which a client must connect to a server to gain access to the network and exchange XML stanzas with other entities.

Simple Message Push model: The process whereby a client connects to a server, exchanges XML stanzas, and ends the connection is:

  1. Determine the IP address and port at which to connect, typically based on resolution of a fully qualified domain name.
  2. Open a Transmission Control Protocol [TCP] connection.
  3. Open an XML stream over TCP.
  4. Preferably negotiate Transport Layer Security [TLS] for channel encryption.
  5. Authenticate using a Simple Authentication and Security Layer [SASL] mechanism.
  6. Bind a resource to the stream.
  7. Exchange an unbounded number of XML stanzas with other entities on the network.
  8. Close the XML stream.
  9. Close the TCP connection
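
The ordering of these steps can be sketched as a small state machine. This is only an illustration of the sequence: the names xmpp_phase_t, next_phase and count_steps are hypothetical and are not part of Libstrophe, and details such as the stream restart after TLS are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Phases of a client session, in the order of the steps listed above.
 * All names here are hypothetical and not part of any library. */
typedef enum {
    PHASE_RESOLVE,      /* 1. resolve the domain name to an address and port */
    PHASE_TCP_OPEN,     /* 2. open the TCP connection */
    PHASE_STREAM_OPEN,  /* 3. open the XML stream */
    PHASE_TLS,          /* 4. negotiate TLS (preferred, may be skipped) */
    PHASE_SASL,         /* 5. authenticate with SASL */
    PHASE_BIND,         /* 6. bind a resource to the stream */
    PHASE_EXCHANGE,     /* 7. exchange XML stanzas */
    PHASE_STREAM_CLOSE, /* 8. close the XML stream */
    PHASE_TCP_CLOSE,    /* 9. close the TCP connection */
    PHASE_DONE
} xmpp_phase_t;

/* Return the phase that follows `current`. TLS is only entered when the
 * server offered it; otherwise the client goes straight to SASL. */
xmpp_phase_t next_phase(xmpp_phase_t current, bool tls_offered)
{
    assert(current <= PHASE_DONE);
    switch (current) {
    case PHASE_STREAM_OPEN:
        return tls_offered ? PHASE_TLS : PHASE_SASL;
    case PHASE_DONE:
        return PHASE_DONE;
    default:
        return (xmpp_phase_t)(current + 1);
    }
}

/* Count how many phases a full session passes through. */
int count_steps(bool tls_offered)
{
    int steps = 0;
    for (xmpp_phase_t p = PHASE_RESOLVE; p != PHASE_DONE;
         p = next_phase(p, tls_offered))
        steps++;
    return steps;
}
```

With TLS offered, count_steps(true) walks through all nine steps; without it, the TLS phase is skipped.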

The general flow of a client request:

  1. A client opens a socket to the server.
  2. The server and client exchange stream stanzas. The server advertises TLS and SASL stream features.
  3. The client negotiates TLS with the server, creating a secure channel.
  4. The client sends a SASL stanza for channel authentication.
  5. The server advertises Bind and Session stream features.
  6. The client sends a bind stanza, then waits for the response.
  7. The client sends a session stanza, then waits for the response.
  8. At this point, the client is ready and can send XML stanzas.
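
Once the session is ready, a log line has to be wrapped in a message stanza. The following is a minimal sketch of that step using only the C standard library, so that the XML escaping is visible. In the real destination the stanza would be built with Libstrophe's stanza functions instead. The helper names append_text and format_message_stanza, and the choice of type='chat', are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src to dst (capacity cap, current length *len). When escape is
 * non-zero, the five XML special characters are replaced by entities.
 * Returns 0 on success, -1 if the result would not fit. */
int append_text(char *dst, size_t cap, size_t *len, const char *src, int escape)
{
    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        const char *rep = one;
        if (escape) {
            switch (*src) {
            case '&':  rep = "&amp;";  break;
            case '<':  rep = "&lt;";   break;
            case '>':  rep = "&gt;";   break;
            case '"':  rep = "&quot;"; break;
            case '\'': rep = "&apos;"; break;
            default:   break;
            }
        }
        size_t n = strlen(rep);
        if (*len + n + 1 > cap)
            return -1;
        memcpy(dst + *len, rep, n + 1); /* copies the terminator too */
        *len += n;
    }
    return 0;
}

/* Build <message type='chat' to='JID'><body>TEXT</body></message> in out.
 * Returns 0 on success, -1 if out is too small. */
int format_message_stanza(char *out, size_t cap, const char *to_jid,
                          const char *body)
{
    size_t len = 0;
    assert(out != NULL && cap > 0 && to_jid != NULL && body != NULL);
    out[0] = '\0';
    if (append_text(out, cap, &len, "<message type='chat' to='", 0) ||
        append_text(out, cap, &len, to_jid, 1) ||
        append_text(out, cap, &len, "'><body>", 0) ||
        append_text(out, cap, &len, body, 1) ||
        append_text(out, cap, &len, "</body></message>", 0))
        return -1;
    return 0;
}
```

Escaping matters here because log messages routinely contain characters such as < and &, which would otherwise break the XML stream.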

Publish Subscribe model: XMPP Publish Subscribe technology uses the classic "publish-subscribe" or "observer" design pattern: a person or application publishes information, and an event notification (with or without payload) is broadcast to all authorized subscribers. In general, the relationship between the publisher and subscriber is mediated by a service that receives publication requests, broadcasts event notifications to subscribers, and enables privileged entities to manage lists of people or applications that are authorized to publish or subscribe. The focal point for publication and subscription is a "node" to which publishers send data and from which subscribers receive event notifications. Nodes can also maintain a history of events and provide other services that supplement the pure pubsub model. The basic idea is simple:

  1. The entity publishes information to a node at a publish-subscribe service.
  2. The pubsub service pushes an event notification to all entities that are authorized to learn about the published information.
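
Step 1 corresponds to a publish request as defined in XEP-0060: an IQ stanza of type 'set' carrying a publish element that names the node. The sketch below formats such a request with the C standard library only. The function name format_publish_iq, the service address, the node name and the urn:example:syslog payload namespace used in the usage note are illustrative assumptions, and the attribute values are assumed to be already XML-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format an XEP-0060 publish request into out.
 * item_xml must be a single, already serialized and namespaced XML element.
 * service_jid, node and iq_id are assumed to need no XML escaping.
 * Returns 0 on success, -1 if out is too small. */
int format_publish_iq(char *out, size_t cap, const char *service_jid,
                      const char *node, const char *iq_id,
                      const char *item_xml)
{
    assert(out != NULL && cap > 0);
    assert(service_jid != NULL && node != NULL);
    assert(iq_id != NULL && item_xml != NULL);

    int n = snprintf(out, cap,
                     "<iq type='set' to='%s' id='%s'>"
                     "<pubsub xmlns='http://jabber.org/protocol/pubsub'>"
                     "<publish node='%s'><item>%s</item></publish>"
                     "</pubsub></iq>",
                     service_jid, iq_id, node, item_xml);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

For example, calling format_publish_iq with the service "pubsub.example.org", the node "syslog" and the payload "<log xmlns='urn:example:syslog'>link down</log>" produces a request that the pubsub service would then fan out to the node's subscribers (step 2).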

Why Libstrophe?

Libstrophe is a lightweight XMPP client library written in C. It has minimal dependencies and is configurable for various environments. It runs well on Linux, Unix, and Windows platforms. Its goals are to be quickly usable, well documented, and reliable. While most XMPP libraries and implementations focus on chat applications, Libstrophe takes a broader view: it has been used to implement real-time games, notification systems, and search engines, as well as traditional instant messaging. It is production ready and easy to extend. Since we intend to use XMPP in syslog-ng essentially as a notification system, and since syslog-ng has its own event loop, Libstrophe is the best fit for this project among the available options: xmpp_run_once() can be called from the syslog-ng event loop, so the destination can respond to XMPP events as they come in. These functions manage the Strophe event loop:

  1. void xmpp_run_once(xmpp_ctx_t *ctx, const unsigned long timeout): run the event loop once.
  2. void xmpp_run(xmpp_ctx_t *ctx): start the event loop.
  3. void xmpp_stop(xmpp_ctx_t *ctx): stop the event loop.
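
The idea of driving the XMPP side from a host event loop can be sketched as follows. To keep the example compilable without Libstrophe, the run-once call is passed in as a function pointer shaped like xmpp_run_once(), and a counting stand-in takes its place. The names run_once_fn, xmpp_pump_t, host_loop_tick and fake_run_once are hypothetical, and the timeout is assumed to be in milliseconds.

```c
#include <assert.h>
#include <stddef.h>

/* Shaped like xmpp_run_once(ctx, timeout); the context is a void pointer
 * so that this sketch builds without the library headers. */
typedef void (*run_once_fn)(void *ctx, unsigned long timeout_ms);

/* What the host loop needs to know in order to pump XMPP events. */
typedef struct {
    run_once_fn run_once;     /* wrapper around the library's run-once call */
    void *ctx;                /* would be the xmpp_ctx_t pointer */
    unsigned long timeout_ms; /* kept short so the host loop never stalls */
} xmpp_pump_t;

/* One iteration of a hypothetical host event loop. */
void host_loop_tick(const xmpp_pump_t *pump)
{
    assert(pump != NULL && pump->run_once != NULL);
    /* ... the host's own work (reading and routing logs) would go here ... */
    pump->run_once(pump->ctx, pump->timeout_ms);
}

/* Stand-in for the real call: records how often it was invoked. */
typedef struct {
    int calls;
    unsigned long last_timeout_ms;
} fake_ctx_t;

void fake_run_once(void *ctx, unsigned long timeout_ms)
{
    fake_ctx_t *fake = ctx;
    fake->calls++;
    fake->last_timeout_ms = timeout_ms;
}
```

In a real integration, the stand-in would be replaced by a one-line wrapper that casts ctx back to xmpp_ctx_t * and calls xmpp_run_once(); xmpp_run() would not be used, because it would take over the thread.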

Integration with Syslog-ng: Syslog-ng is an enhanced log daemon that receives and sends RFC3164 and RFC5424 style syslog messages and supports a wide range of input and output methods.

Source: syslog-ng gets its log messages from many sources, including internal messages, text files, named pipes, accounting logs on Linux, external applications, the IETF syslog protocol, the system-specific log messages of various platforms, remote hosts using the BSD syslog protocol, and UNIX domain sockets. Each of these sources is polled in the event loop, and every time a log is updated, xmpp_run_once() can be called. Connection handlers can be defined for each of these events, from which stanzas (log notifications) can be sent to the XMPP server with a configurable node, JID, password, and destination JID.

Once the implementation is done, I will write functional tests to verify that my XMPP destination works. For testing, I will set up an XMPP server on my own system. Having gone through the available options, I have decided to use ejabberd.

Deliverable

A simple XMPP destination that communicates with an XMPP server and delivers messages using a simple message push model. The code will be flexible enough to support the publish/subscribe model later, or at least easy to adapt and improve upon.

Nice-to-have

An XMPP destination that also communicates with the XMPP server using the publish/subscribe model.

Timeline and Schedule

  • Community bonding period: a. Getting familiar with the mentors. b. Studying the Libstrophe library and the syslog-ng source code, writing sample code. c. Discussing implementation approaches with the mentors.
  • 19th May - 1st June: a. Develop a simple XMPP client using Libstrophe. b. Set up an XMPP server for testing. c. Set up a working environment.
  • 2nd - 23rd June: Write code for XMPP connection handlers for each part/module of the syslog-ng event loop.
  • 23rd – 27th June: Mid-term evaluations
  • 28th June - 13th July: Write test cases and comprehensively test the code written.
  • 14th - 27th July: Buffer/Nice to have (Write code for publish/subscribe model).
  • 28th July - 10th Aug: Buffer/Nice to have (Implement and test publish/subscribe model).
  • 11th – 18th Aug: a. Look for bugs in the code, tests, code samples, evaluation. b. Start wrapping it up for merging. c. Write documentation. d. Code submission.
  • 18th – 22nd Aug: Final Evaluation

Academic & Industry experience

I will be starting graduate study at TU Delft this fall (Sept 2014) in Electrical Engineering (Telecommunications and Sensing Systems), expected 2014-2016. I hold a Bachelor of Engineering degree in Electronics and Telecommunication Engineering (2008-2012) from PES Institute of Technology, Bangalore, India, and my undergraduate studies have led to a strong inclination towards computer networking.

During my undergraduate study, I took two courses in C programming, and as part of the advanced second course I programmed a two-player game of Tic-Tac-Toe using the Turbo C/C++ 3.0 IDE. I also took courses in Computer Communication Networks, Unix Programming, and Embedded Systems, each with theory and laboratory components, which I believe will help me greatly while working on this project.

I have always been interested and active in programming projects. During my undergraduate degree, I was involved in two intensive research projects, in Wireless Sensor Networking (optimal sensor power allocations for robustness of detection) and Computer Vision (algorithms for real-time video tracking), both involving extensive programming in C and C++. The Computer Vision project involved programming real-time object tracking, using a combination of detection and prediction algorithms, in C++ on Microsoft Visual Studio with the OpenCV libraries, and then programming an autonomous robot (ATmega128 controller) to track and follow a soccer ball using the same algorithm. These research projects led to two publications (one at the IEEE level) and greatly honed my programming skills.

I’m currently working on scripting and programming (Shell, Perl, Java) of automation tools for network and cloning related tasks in the Cloud division at Oracle (June 2012-present). One such automation project involved working with XMPP and XML to create an automated notification system. I have also completed my CCNA certification in this period. This industry exposure has given me a strong grasp of systems, networking protocols, and programming in general, and of Linux networking in particular.

To sum up, I am proficient in programming and scripting in C, C++, Java and Perl. I have hands-on experience with XML and XMPP through an automation project at Oracle. I have user-level familiarity with syslog-ng and have been familiarizing myself with the syslog-ng source code over the last month. Most importantly, I enjoy learning and am always willing to learn something new.

How much time will I have during GSoC to work on your project?

If my proposal is accepted, I intend to give my notice at work immediately. Since the notice period is just one month and I start graduate school at TU Delft in September, I will be free to work on GSoC for its entire duration (May to August). GSoC will be my main focus during this period, and I will be able to devote my full attention and time to this project. I am willing to put in as many hours as the project demands and can guarantee at least 40 hours per week.

What other things I will be doing during GSoC (vacation, exams, travel)?

As of now, I have no plans to travel and no exams to take during this period. I will be devoting all my time in this period to this project.

Why work on this project?

Firstly, my experiences during my undergraduate degree and at work have led to a deep interest in computer networking and protocol implementations. Having already worked with the XMPP protocol in an automation project at work, I am keen to work on this project and delve deeper into the protocol. Secondly, at Oracle, the script fixes, automation tool testing and validation that I work on often involve working with logging solutions, diagnosing the problem from the logs, and finding a fix. As such, the work done by syslog-ng is all the more interesting to me: I can see the benefits of each feature or project and would love to contribute in whatever way I can towards its development.

References:

  • http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.3-guides/en/syslog-ng-ose-v3.3-guide-admin-en/html/
  • http://xmpp.org/extensions/xep-0060.html
  • https://tools.ietf.org/html/rfc6120
  • https://github.com/balabit/syslog-ng/wiki/GSoC2014-idea-&-project-list#xmpp-destination
  • http://strophe.im/libstrophe/doc/
