GSoC 2020 Proposal: Add support for the template() syntax in the kafka() destination in the C-based implementation (vivinperis)

vivinperis edited this page Apr 27, 2020 · 20 revisions

Google Summer of Code - 2020 Project Proposal : Syslog-ng

Background and Motivation

I’m Vivin, a final-year Mechanical Engineering undergraduate at NIT Surathkal. My coding journey began two years ago when I interned at IIT Madras. The project involved numerical modelling of the asymmetry of the lung, using Monte Carlo simulations to model the spread of viral disease. It was a challenge, as I had to learn FORTRAN in one week, and I completed the entire project in six weeks. There has been no looking back since then.

This experience helped me secure an internship last summer at a start-up, where I scraped data using Scrapy, processed it using Kafka and Flink to build a database with billions of entries of IP data from around the world, and visualized the data streams using Grafana and Prometheus. I realized how important logs are for debugging system issues and errors from first principles. I also realized how stream processing has changed the development world compared with earlier ETL processing. This is where I fell in love with open source technology. Over the span of two years, I have taken courses in Machine Learning, High Performance Computing Architecture, DSA, Web Development, Operating Systems, Compiler Design and Automata Theory, and completed many projects in languages such as C, C++, Java, Scala and Python. I have also solved thousands of questions on InterviewBit, LeetCode, CodeChef and HackerRank.

Experience with Syslog-ng

To be honest, Syslog-ng is the first open source organization I have contributed to, and the experience has been amazing. First, thanks to the brilliant mentors, Mr. Furiel, Mr. Attila and Mr. Bazsi, who have helped me immensely in every aspect of getting familiar with the codebase. In the span of one week, I fixed a bug from scratch and had the fix merged, built the code both with the Docker image and by compiling the source, learned how Kafka integrates with Syslog-ng, changed the syslog-ng.conf configuration to try out different possibilities, and debugged with the help of the mentors. I also learned the importance of writing succinct code that avoids redundancy. The learning curve has been great, and I hope to work here over the summer. I also actively contribute to conversations on Gitter.

Pull Requests (March 2020 - Present)

| No. | Description | Link |
| --- | --- | --- |
| 1 | logpipe-source: disable multi-line-timeout | #3192 |
| 2 | example-message-generator: add support for values(name1 => value1, ...) syntax where value follows template syntax | #3237 |
| 3 | python: added support for arrow syntax | #3247 |
| 4 | Makefile.am: Eliminate warning for discarded qualifiers | #3250 |
| 5 | Makefile.am: Eliminate warnings for multiple errors | #3262 |
| 6 | Kafkareadme: Updated the Kafka destination readme file with the current syntax | #3268 |
| 7 | crypto: fixes hang on boot due to lack of entropy | #3271 |

Project Title : Add support for template() syntax for kafka destination

Add support for syslog-ng template syntax in the topic() parameter of the C-based kafka implementation, which was added recently because it uses less memory than the earlier Java implementation. Also, provide support for the performance testing and load testing functionality available in Kafka.

Abstract

Adding standard syslog-ng template syntax will enhance the new C implementation of the kafka destination by bringing to the topic parameter the benefits that were earlier part of the Java implementation. This would make the destination easier to use for developers and users. Features such as filters can also be used to produce the desired format.
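To make the idea of a templated topic concrete, here is a minimal, self-contained C sketch. It is not syslog-ng or librdkafka code: `DemoMessage`, `expand_topic_template`, `topic_name_is_valid` and `resolve_topic` are hypothetical names invented for illustration, only the `${HOST}` and `${PROGRAM}` macros are handled, and falling back to a fixed topic when the expanded name is not a legal Kafka topic is an assumed design choice rather than something this proposal specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Kafka limits topic names to 249 characters drawn from [a-zA-Z0-9._-]. */
#define KAFKA_TOPIC_MAX_LEN 249

/* Hypothetical stand-in for the fields of a log message (assumed non-NULL). */
typedef struct {
    const char *host;
    const char *program;
} DemoMessage;

/* Returns true if name is a legal Kafka topic name. */
bool topic_name_is_valid(const char *name)
{
    size_t len = strlen(name);
    if (len == 0 || len > KAFKA_TOPIC_MAX_LEN)
        return false;
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return false;
    for (size_t i = 0; i < len; i++) {
        char c = name[i];
        bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                  (c >= '0' && c <= '9') || c == '.' || c == '_' || c == '-';
        if (!ok)
            return false;
    }
    return true;
}

/* Expands ${HOST} and ${PROGRAM} in tmpl into out (NUL-terminated on success).
 * Any other text is copied unchanged. Returns false if the result does not fit. */
bool expand_topic_template(const char *tmpl, const DemoMessage *msg,
                           char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    size_t used = 0;
    while (*tmpl != '\0') {
        const char *value = NULL;
        size_t skip = 0;
        if (strncmp(tmpl, "${HOST}", 7) == 0) {
            value = msg->host;
            skip = 7;
        } else if (strncmp(tmpl, "${PROGRAM}", 10) == 0) {
            value = msg->program;
            skip = 10;
        }
        if (value != NULL) {
            size_t vlen = strlen(value);
            if (used + vlen >= outlen)   /* keep room for the terminator */
                return false;
            memcpy(out + used, value, vlen);
            used += vlen;
            tmpl += skip;
        } else {
            if (used + 1 >= outlen)
                return false;
            out[used++] = *tmpl++;
        }
    }
    out[used] = '\0';
    return true;
}

/* Resolves the topic for one message; returns fallback when the expanded
 * name does not fit in buf or is not a legal Kafka topic name. */
const char *resolve_topic(const char *tmpl, const DemoMessage *msg,
                          const char *fallback, char *buf, size_t buflen)
{
    if (expand_topic_template(tmpl, msg, buf, buflen) && topic_name_is_valid(buf))
        return buf;
    return fallback;
}
```

With a message whose host is `web01` and program is `nginx`, the template `logs-${HOST}-${PROGRAM}` resolves to `logs-web01-nginx`. A host value containing a space would produce an illegal topic name, so the sketch returns the fallback topic instead.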

Why the project is interesting to me

Having worked on a project involving the transfer and processing of very large volumes of data during my previous internship, I know the importance of log processing in debugging and solving system issues, and the role Kafka plays in moving huge amounts of data. Contributing to an open source organization that deals with log collection and management is a great opportunity to serve the developer community.

What the project needs

  • Familiarity with Kafka

  • Familiarity with Syslog-ng

  • Familiarity with C language, Lex, Yacc and parallel programming/mutexes.

Areas I am familiar with

I worked extensively with Kafka in my previous internship, where I implemented a 3-node multi-cluster Kafka architecture, coded it in Scala, and load tested it using Kafka’s performance testing parameters and JMeter. With the help of the mentors, I have also built Syslog-ng both from source and using Docker, and experimented with configuration files to collect logs from different files. I have known C since the first year of engineering and have done competitive programming in it. I have also worked with Lex and Yacc, having built a simple parser that takes tokens and processes them. I have a basic understanding of mutexes from an Operating Systems MOOC I took on NPTEL last semester. This is the area where I need to improve.

Approach

  • Phase I: Get familiar with the workings of the Syslog-ng codebase and the intricacies of librdkafka. I already know how Kafka works, have had a bug-fix PR merged, and have built the source code both from the repository and using Docker.

  • Phase II: Code the changes in the kafka destination and implement the template functionality. Also attempt additional features, such as support for multi-broker clusters and fault-tolerance testing of the Kafka server using the functionality Kafka provides.

  • Phase III: Test intensively and fix any bugs. Update the GitBook documentation so that the changes are available to developers and users with the right syntax.
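One place where the mutex familiarity listed earlier could matter in Phase II: once the topic is a template, the set of topic names is only known at run time, so per-topic state may have to be created on demand while several worker threads are sending. The sketch below shows that pattern with a small fixed-size cache guarded by a pthread mutex. It is an assumption about one possible design, not a description of how syslog-ng or librdkafka handle topics, and `DemoTopic`, `DemoTopicCache` and the `topic_cache_*` functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 16

/* Hypothetical stand-in for a per-topic producer handle. */
typedef struct {
    char *name;
    unsigned long use_count;
} DemoTopic;

/* Fixed-size cache shared by all worker threads. */
typedef struct {
    pthread_mutex_t lock;
    DemoTopic slots[CACHE_SLOTS];
    size_t count;
} DemoTopicCache;

void topic_cache_init(DemoTopicCache *cache)
{
    pthread_mutex_init(&cache->lock, NULL);
    cache->count = 0;
}

/* Returns the handle for name, creating it on first use.
 * Returns NULL when the cache is full or memory allocation fails.
 * The lookup and the insert happen under one lock, so two threads asking
 * for the same new topic cannot both create it. */
DemoTopic *topic_cache_get(DemoTopicCache *cache, const char *name)
{
    assert(cache != NULL && name != NULL);
    DemoTopic *found = NULL;

    pthread_mutex_lock(&cache->lock);
    for (size_t i = 0; i < cache->count; i++) {
        if (strcmp(cache->slots[i].name, name) == 0) {
            found = &cache->slots[i];
            break;
        }
    }
    if (found == NULL && cache->count < CACHE_SLOTS) {
        char *copy = malloc(strlen(name) + 1);
        if (copy != NULL) {
            strcpy(copy, name);
            found = &cache->slots[cache->count++];
            found->name = copy;
            found->use_count = 0;
        }
    }
    if (found != NULL)
        found->use_count++;
    pthread_mutex_unlock(&cache->lock);

    return found;
}

/* Frees the cached names; call only after all workers have stopped. */
void topic_cache_destroy(DemoTopicCache *cache)
{
    for (size_t i = 0; i < cache->count; i++)
        free(cache->slots[i].name);
    cache->count = 0;
    pthread_mutex_destroy(&cache->lock);
}
```

A linear scan over a small array keeps the sketch short. A real implementation would more likely use a hash table, and would need to decide what to do when the number of distinct topic names grows without bound.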

Detailed Work Plan

Community Bonding Period

  • Increase familiarity with the mentor and the community

  • Improve my understanding of the working of kafka destination

  • Make a foolproof design plan

Implementation Period

June 1st - 7th

  • Familiarize myself with the codebase

June 8th - 14th

  • Understand the workings of librdkafka and mutexes

June 15th - 21st

  • Code the template functionality

June 22nd - 29th

  • Code the template functionality

June 29th - July 3rd

  • Phase 1 evaluation

July 3rd - 10th

  • Phase 1 review and changes

July 11th - 18th

  • Make further enhancements and fix bugs

July 19th - 25th

  • Make further enhancements and fix bugs

July 27th - July 31st

  • Phase 2 evaluation

August 1st - 7th

  • Complete the code and document the changes

August 7th - 14th

  • Review the code and documentation

August 15th - 31st

  • Address PR review comments and make the final submission

Others

Timezone

I will be based in India and will work on the project from 3:00 AM to 2:00 PM UTC.

Commitments

I can put in 50+ hours a week during the GSoC period, as I have no other commitments. There might be one week of exams during the period, but I will try to finish the planned work beforehand.

Post GSoC

If any parts of the project are left unimplemented (which is highly unlikely), I will try to complete them post-GSoC. I would also like to work on integrating other interesting features and on fixing bugs and issues in syslog-ng as and when they arise. I am looking forward to a long-term association with the Syslog-ng community, and I plan to actively maintain and review the kafka module.

Why me?

As a Mechanical Engineering undergraduate at a university that doesn’t allow minors in Computer Science, I’ve had a lot of motivation to work towards my passion, which is computer science. Apart from the C programming course at my college, I learned the fundamentals of computer science on my own, taking rigorous online courses in DSA, Computer Architecture, Operating Systems, Compiler Design and Automata Theory, and Machine Learning alongside the 174 credits I had to complete in my major. With this motivation and enthusiasm, I am ready to commit fully to GSoC and give it my best. I really hope I get this opportunity, as this is my final year and it would be great to work with such amazing mentors on an impactful open source project at Syslog-ng.