GSoC 2016 Proposal: Kafka source in Java (VIthulan)

Vithulan edited this page Mar 25, 2016 · 2 revisions

Project: Kafka source in Java


Description


Syslog-ng reads messages from sources, processes them with filters, rewrite rules and parsers, and finally sends them to destinations. Syslog-ng already has a Kafka destination, which is implemented in Java. Reading messages from Kafka as well would allow Kafka to act as a queue between two Syslog-ng instances.

Motivation


I have previously contributed to LogAnalyzer, a log analysis product, and have worked with Logstash, Elasticsearch and Kibana. Since I already have domain knowledge and a strong interest in data analytics, I would very much like to contribute to Syslog-ng for this GSoC. As I am already familiar with Kafka, I have decided to take on the Kafka source project.

Goal of the project


  • Implement a Kafka consumer that uses the high-level Kafka group consuming API.
  • Process the data read from Kafka.
  • After a reload or restart, Syslog-ng must be able to continue reading from the last read message.
  • If several Syslog-ng instances read the same Kafka input (that is, they share the same group name), avoid message loss and message duplication as much as possible.
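
The restart and shared-group goals above can be illustrated with a small, self-contained sketch. It is a toy in-memory model, not the Kafka client API and not Syslog-ng code: `PartitionLog`, `OffsetStore` and `GroupConsumer` are hypothetical names invented for this example. The assumption it shows is that progress is stored per group name outside the consumer and is recorded only after a message has been handed over, so a restarted instance with the same group name continues from the last read message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of "resume from the last committed offset"; this is NOT the Kafka client API. */
public class OffsetResumeSketch {

    /** Hypothetical stand-in for one Kafka partition: an append-only list of messages. */
    static final class PartitionLog {
        private final List<String> messages = new ArrayList<>();

        void append(String message) { messages.add(message); }
        int size() { return messages.size(); }
        String get(int offset) { return messages.get(offset); }
    }

    /** Hypothetical stand-in for the durable store of committed offsets, keyed by group name. */
    static final class OffsetStore {
        private final Map<String, Integer> committed = new HashMap<>();

        int committedOffset(String group) { return committed.getOrDefault(group, 0); }
        void commit(String group, int nextOffset) { committed.put(group, nextOffset); }
    }

    /** A consumer instance that belongs to a group and keeps no durable state of its own. */
    static final class GroupConsumer {
        private final String group;
        private final PartitionLog log;
        private final OffsetStore offsets;

        GroupConsumer(String group, PartitionLog log, OffsetStore offsets) {
            this.group = group;
            this.log = log;
            this.offsets = offsets;
        }

        /** Delivers up to maxMessages, starting at the group's committed offset. */
        List<String> poll(int maxMessages) {
            List<String> delivered = new ArrayList<>();
            int offset = offsets.committedOffset(group);
            while (offset < log.size() && delivered.size() < maxMessages) {
                delivered.add(log.get(offset)); // hand the message over first
                offset++;
                offsets.commit(group, offset);  // then record progress for the group
            }
            return delivered;
        }
    }

    public static void main(String[] args) {
        PartitionLog log = new PartitionLog();
        for (int i = 0; i < 5; i++) {
            log.append("msg-" + i);
        }
        OffsetStore offsets = new OffsetStore();

        GroupConsumer first = new GroupConsumer("syslog-ng", log, offsets);
        System.out.println(first.poll(2));      // prints [msg-0, msg-1]

        // Simulated restart: a new consumer object with the same group name.
        GroupConsumer restarted = new GroupConsumer("syslog-ng", log, offsets);
        System.out.println(restarted.poll(10)); // prints [msg-2, msg-3, msg-4]
    }
}
```

Committing only after delivery means a crash between the two steps would replay a message rather than lose it, which is one possible way to weigh loss against duplication; the actual implementation may choose differently.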

Benefits of the project


Kafka is a distributed messaging system that provides fast, highly scalable and redundant messaging through a pub-sub model. It supports a large number of ad-hoc consumers, recovers from errors on its own, and is highly available and resilient. Therefore, implementing a Kafka source in Syslog-ng should improve the clustering performance of Syslog-ng, the atomicity of data, and the communication between two Syslog-ng instances.

Knowledge Area


  • Java - Proficient
  • Kafka - Familiar
  • Syslog-ng - Beginner (Improving)
  • Github - Proficient
  • Linux - Proficient

Time schedule


March and April

  • Get familiar with Syslog-ng as a user (in progress)
  • Get familiar with the code base (in progress)
  • Increase familiarity with Kafka (in progress)
  • Set up the development environment (done)
  • Start implementing the basic scenario

April 22 - May 22 (Community bonding period)

  • Increase familiarity with Syslog-ng
  • Go through the relevant documentation thoroughly
  • Increase familiarity with the code base
  • Clarify any doubts
  • Continue implementing the basic scenario

May 22 - May 29

  • Set up Kafka
  • Test the basic implementation
  • Get a review from the mentor
  • Plan the implementation based on the review

May 30 - June 13

  • Implement the Kafka consumer
  • Testing and improvements
  • Get feedback from the mentor
  • Start integration with Syslog-ng

June 13 - June 21

  • Continue the integration
  • Testing

June 21 - June 28

  • Mid-term evaluation

June 28 - August 1st

  • Complete unfinished features
  • Bug fixes
  • End-to-end tests
  • Code refactoring
  • Review from the mentor

August 1st - August 24th

  • Bug fixing
  • Completing all tests
  • Documentation
  • Review from mentor
  • Final release

About me


I’m Vithulan, an undergraduate student at the Department of Computer Science and Engineering, University of Moratuwa. I have three years of academic experience in Computer Science and Engineering. I am a self-motivated individual who aims to achieve excellence through dedicated hard work as a dynamic role player, drawing on both my technical and social skills. I love competitive programming and have participated in IEEE programming competitions for the past two years. I have contributed to some WSO2 products (analytics-apim, LogAnalyzer) [7]. I have excellent knowledge of Java, Maven, C and open source products, and I am currently exploring Syslog-ng. I therefore believe I can complete this project successfully within the given time period.

Contact Information

E-mail: vithulanmv.12@cse.mrt.ac.lk
Github: https://github.com/VIthulan
Linkedin: https://lk.linkedin.com/in/vithulamv
