GSoC2016 Idea & Project list

juhaszviktor edited this page Mar 17, 2016 · 47 revisions

Table of contents

  1. Guidelines

  2. Hosting, license and other bits of information

  3. Adding a new idea

  4. Submitting a proposal

  5. Ideas

  6. Releasing

    1. Project: Automated release generation for syslog-ng
  7. Monitoring

    1. Project: WebSocket for syslog-ng
  8. Configuration

    1. Project: Web-based syslog-ng configuration editor
  9. New tool

    1. Project: syslog-ng as a command line tool
  10. New sources and destinations

    1. Project: Wildcard file source
    2. Project: Kafka source in Java
    3. Project: Kafka source using librdkafka
    4. Project: Implement JDBC source and destination drivers
    5. Project: Implement ODBC source and destination drivers
  11. Core features

    1. Project: Extend syslog-ng Java language binding with late acknowledgement mechanism
  12. Library ecosystem

    1. Project: Implement Python support in syslog-ng's correlation library

Guidelines

The ideas herein were contributed by the syslog-ng OSE community, by developers, users, and interested students. Some of them may be vague or incomplete. If you are a student and would like to apply to the Google Summer of Code, ask about any of these ideas on the mailing list, or contact the person who proposed it.

Being accepted as a Google Summer of Code (GSoC) student is not easy: the program is competitive. Research your chosen topic in depth, and contact the mentor and the community. If you have a new idea and would like to add it to the list, discuss it with the community at large and with the developers, to ensure that a mentor will be available for the project if it is selected.

In case no specific contact is given for a particular idea, questions can be asked on the mailing list or on Gitter.

Hosting, license and other bits of information

As required by the Google Summer of Code program, all contributions must be available under an open source license. syslog-ng uses two licenses: the GNU General Public License (GPL) and the GNU Lesser General Public License (LGPL). The outcome of each GSoC project must use one of these licenses. Consult the mentor of the idea for details.

We also prefer to do development in the open, with communication happening on the mailing list or on Gitter, and code hosted on GitHub, where the main repository is. Depending on the proposal, students will be asked to fork either syslog-ng itself, or the syslog-ng Incubator.

Adding a new idea

Before adding a new idea, consult the community and the mentors (see above), then follow the template set by the other ideas: a title, a brief description, expected results, skills required, difficulty, and topics the student may learn. See the existing ideas below.

Submitting a proposal

To submit your proposal, create a new Wiki page for it, as described in the GSoC 2016 Proposals document. Do not forget to also record your proposal on the Google Summer of Code page by posting a link to the previously created Wiki page.


Ideas

Releasing

Project: Automated release generation for syslog-ng

Brief description

We want to be able to release syslog-ng in a fully automated way.

My idea is to create a Python project that performs the following tasks:

  • prepare a new release on the Release branch
    • increase the version number
    • generate changelog (collect contributors, closed issues, commit messages, and so on)
  • create syslog-ng distribution tarball
  • tag syslog-ng release in GitHub
  • prepare a release draft in GitHub
    • attach changelog
    • attach distribution tarball to the release draft
  • upload distribution tarball to my OBS project
  • test the packages
  • when OBS has successfully finished the build, publish the release (and also send a mail to the mailing list)
  • generate standalone Debian packages
  • nice to have: support RPM packages

Keep in mind that this is not a simple Python script: it should follow SOLID principles, you should create classes (for example: Release class, and so on). If you have the time, you can also implement a web interface on top of the Python code (by using Django, for example).
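To make the class-based approach more concrete, here is a minimal sketch of what a Release class could look like. It only covers the version bump and the changelog steps from the list above. The class layout, the three-part version format, the "fixes #N" / "closes #N" commit convention, and the sample commits are all assumptions made for illustration; they are not taken from any existing syslog-ng tooling.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Release:
    """Hypothetical model of one release (illustration only)."""

    version: str
    commits: list = field(default_factory=list)  # (author, subject) pairs

    def next_version(self, part="patch"):
        """Return the version string with the requested part increased."""
        major, minor, patch = (int(x) for x in self.version.split("."))
        if part == "major":
            return f"{major + 1}.0.0"
        if part == "minor":
            return f"{major}.{minor + 1}.0"
        if part == "patch":
            return f"{major}.{minor}.{patch + 1}"
        raise ValueError(f"unknown version part: {part}")

    def closed_issues(self):
        """Collect issue numbers referenced as 'fixes #N' or 'closes #N'."""
        pattern = re.compile(r"\b(?:fixes|closes)\s+#(\d+)", re.IGNORECASE)
        found = set()
        for _author, subject in self.commits:
            found.update(int(number) for number in pattern.findall(subject))
        return sorted(found)

    def changelog(self):
        """Build a plain-text changelog from the recorded commits."""
        issues = ", ".join(f"#{number}" for number in self.closed_issues())
        contributors = ", ".join(sorted({author for author, _ in self.commits}))
        lines = [f"Version {self.version}", "", "Changes:"]
        lines += [f"  * {subject}" for _author, subject in self.commits]
        lines += ["", f"Closed issues: {issues}", f"Contributors: {contributors}"]
        return "\n".join(lines)


# Made-up sample data, only to show the calls.
release = Release(
    version="3.7.2",
    commits=[
        ("alice", "afsocket: handle reconnect, fixes #101"),
        ("bob", "docs: update README"),
        ("alice", "parser: reject empty input (closes #87)"),
    ],
)
print(release.changelog())
print("next version:", release.next_version())
```

In a real tool the commit list would come from git or the GitHub API, and tagging, tarball creation and the OBS upload would probably live in separate classes, so that each class keeps a single responsibility.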

Proposed by: Laszlo Budai

Mentor: Laszlo Budai (co-mentor: Laszlo Várady)

Difficulty: Medium (changelog generation could be an issue :-) )

Deliverables of the project:

  • A tool that can help us to release syslog-ng more often.

Desirable skills:

  • Familiarity with syslog-ng on at least a user level
  • Familiarity with GitHub
  • Familiarity with Debian packaging
  • Python

What the student will learn:

  • How to release an open source project with open source tools.
  • GitHub API.
  • Basics of Debian packaging.
  • OBS (Open Build Service) knowledge

Monitoring

Project: WebSocket for syslog-ng

Brief description

The goal of this project is to create a WebSocket destination that can push log messages to a web server using the WebSocket protocol (RFC 6455). It should be possible to configure a secure connection. syslog-ng will use this destination for alerting.
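For orientation, the sketch below shows the two pieces of RFC 6455 that a client-side destination has to get right: checking the server's handshake reply and framing a text message. It is written in Python only to illustrate the wire format; the actual destination would be written in C, and the function names here are made up for this example.

```python
import base64
import hashlib
import struct

# GUID fixed by RFC 6455 for computing the Sec-WebSocket-Accept header.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key):
    """Value the server must return for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def encode_text_frame(message, mask_key):
    """Build one unfragmented, masked text frame (client to server).

    A real client must pick a fresh random 4-byte mask for every frame;
    the mask is a parameter here only to keep the example reproducible.
    """
    if len(mask_key) != 4:
        raise ValueError("mask_key must be exactly 4 bytes")
    payload = message.encode("utf-8")
    header = bytearray([0x81])  # FIN bit set, opcode 0x1 (text)
    length = len(payload)
    if length < 126:
        header.append(0x80 | length)  # 0x80 marks the frame as masked
    elif length < 65536:
        header.append(0x80 | 126)
        header += struct.pack("!H", length)  # 16-bit extended length
    else:
        header.append(0x80 | 127)
        header += struct.pack("!Q", length)  # 64-bit extended length
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return bytes(header) + mask_key + masked


# Values from the examples in RFC 6455 (sections 1.3 and 5.7).
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
print(encode_text_frame("Hello", bytes.fromhex("37fa213d")).hex())
```

A C implementation could also rely on an existing WebSocket library instead of framing by hand; which approach fits syslog-ng better is something to discuss with the mentors.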

Proposed by: Laszlo Meszaros

Mentor: Laszlo Meszaros (co-mentor: Viktor Juhasz)

Difficulty: Easy-Medium

Deliverables of the project:

  • A sample Web server that is able to receive and handle messages coming from syslog-ng
  • A simple WebSocket destination to communicate with the sample Web server and to deliver messages.

Desirable skills:

  • Familiarity with syslog-ng on at least a strong user level
  • Familiarity with the WebSocket protocol
  • Familiarity with the C language (deep knowledge not required, but C coding experience is strongly recommended)
  • Prior knowledge of the syslog-ng code base is recommended, but not required.

What the student will learn:

  • Working with an existing, well-established software project
  • Simple, but efficient multi-threaded programming in C
  • WebSocket protocol knowledge
  • Basic Bison & Flex skills, an introduction into writing parsers and grammars.

Configuration

Project: Web-based syslog-ng configuration editor

Brief description

The syslog-ng application has a powerful configuration language. That power serves administrators well, but the language can be difficult for users who are not familiar with it and have not read the documentation in detail.

The goal of this project is to provide a Web-based application for creating syslog-ng configuration files using a drag-and-drop interface. The application should be written in JavaScript, using a Web framework of the student's choice. A data model will be provided.
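Whatever the user interface looks like, its output is a configuration file generated from a data model. The sketch below shows that last step under stated assumptions: the dictionary layout is invented for this example (it is not the data model that the mentors will provide), and only sources, destinations and log paths with string arguments are handled.

```python
def render_driver(driver):
    """Render one driver call, for example file("/var/log/app.log");"""
    args = ", ".join(f'"{arg}"' for arg in driver["args"])
    return f'{driver["type"]}({args});'


def render_config(model):
    """Turn the hypothetical data model into syslog-ng configuration text."""
    lines = []
    for kind in ("source", "destination"):
        for name, drivers in model.get(kind + "s", {}).items():
            body = " ".join(render_driver(driver) for driver in drivers)
            lines.append(f"{kind} {name} {{ {body} }};")
    for path in model.get("log_paths", []):
        refs = [f"source({name});" for name in path["sources"]]
        refs += [f"destination({name});" for name in path["destinations"]]
        lines.append("log { " + " ".join(refs) + " };")
    return "\n".join(lines)


# Made-up model: what a drag-and-drop editor might hand to the generator.
model = {
    "sources": {"s_app": [{"type": "file", "args": ["/var/log/app.log"]}]},
    "destinations": {"d_copy": [{"type": "file", "args": ["/tmp/app-copy.log"]}]},
    "log_paths": [{"sources": ["s_app"], "destinations": ["d_copy"]}],
}
print(render_config(model))
```

The printed text is only a fragment: a complete configuration file also needs a version declaration, and a real generator has to cover options, filters, parsers and rewrite rules as well.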

Proposed by: Viktor Juhasz

Mentor: Viktor Juhasz

Difficulty: Medium

Deliverables of the project:

  • A user-friendly configuration creator Web-based application.

Desirable skills:

  • UI design
  • JavaScript
  • Web frameworks (for example Django)
  • syslog-ng configuration file language

What the student will learn:

  • How to create a complex but user-friendly Web application

New tool

Project: syslog-ng as a command line tool

Brief description

There are situations when users need a tool that, like several standard UNIX command-line tools, reads log messages from stdin, transforms them, and writes them to stdout without having to run a daemon process or collect logs.

This is an offline syslog-ng:

  • reads input from stdin (that is, stdin source)
  • writes output to stdout (that is, stdout destination)
  • can read the syslog-ng configuration file from command line
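The intended behaviour is that of a classic UNIX filter. The Python sketch below imitates it with one hard-coded transformation (replacing a leading syslog priority value with the name of its severity). It is not syslog-ng code, and in the real tool the transformation would come from the configuration file; the function names are made up for this example.

```python
import io
import re

# A syslog priority value is facility * 8 + severity.
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$")
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def transform(line):
    """Replace a leading <PRI> with the severity name; pass other lines through."""
    match = PRI_RE.match(line)
    if not match:
        return line
    pri, rest = int(match.group(1)), match.group(2)
    return f"[{SEVERITIES[pri % 8]}] {rest}"


def run_filter(infile, outfile):
    """Read lines from infile, transform them, and write them to outfile."""
    for raw in infile:
        outfile.write(transform(raw.rstrip("\n")) + "\n")


# In-memory streams stand in for stdin and stdout in this demonstration.
source = io.StringIO("<13>Mar 17 10:00:00 host app: started\nplain line\n")
sink = io.StringIO()
run_filter(source, sink)
output = sink.getvalue()
print(output, end="")
```

Calling run_filter(sys.stdin, sys.stdout) instead would turn the same function into a filter that can sit in a shell pipeline.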

Proposed by: Balazs Scheidler

Mentor: Laszlo Budai (co-mentor: Viktor Juhasz)

Difficulty: Medium

Deliverables of the project:

  • Extending syslog-ng functionality to the command line world
  • Make it possible to use AFL (American Fuzzy Lop, a fuzzer) with syslog-ng.

Desirable skills:

  • Familiarity with syslog-ng on at least a strong user level
  • Familiarity with the C language

What the student will learn:

  • How to implement a source in syslog-ng (stdin source)
  • How to create a command line tool from an application designed to run as a server.

New sources and destinations

Project: Wildcard file source

Brief description

The syslog-ng application already has a file source, which lets syslog-ng collect messages from a plain-text file. However, if you want to read more than one file, you have to create a source for each file. This is inconvenient, and you probably do not know all the file names in advance either. The goal of this project is to create a new file source that can read from several files in a directory (and its subdirectories) that match a given filename pattern (for example, directory: /var/log/, pattern: *.log).
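The matching step itself is simple, as the Python sketch below shows: walk a base directory recursively and keep the files whose names match a shell-style pattern. The function name and the temporary files are made up for this example, and the real source driver would be written in C.

```python
import fnmatch
import os
import tempfile


def find_log_files(base_dir, pattern):
    """Return sorted paths under base_dir whose file name matches pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Build a small throwaway directory tree to demonstrate the matching.
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "nginx"))
    for relative in ("syslog.log", "notes.txt", os.path.join("nginx", "access.log")):
        with open(os.path.join(base, relative), "w") as handle:
            handle.write("example\n")
    found = [os.path.relpath(path, base) for path in find_log_files(base, "*.log")]

print(found)
```

A one-time scan like this is not enough for a source driver: files can appear, disappear or be rotated while syslog-ng is running, which is where the file event handling methods listed under the desirable skills come in.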

Proposed by: Viktor Juhasz

Mentor: Viktor Juhasz (co-mentor: Laszlo Budai)

Difficulty: Hard

Deliverables of the project:

  • A new type of file source that reads logs from several files

Desirable skills:

  • Familiarity with syslog-ng on at least a strong user level
  • Familiarity with the C language
  • Familiarity with some file event handling methods (at least on Linux)

What the student will learn:

  • How to implement a source driver in syslog-ng (file source)
  • Simple, but efficient multi-threaded programming in C

Project: Kafka source in Java

Brief description

The syslog-ng application can send messages into Kafka. However, it cannot read messages from Kafka. The goal of this project is to be able to read messages from Kafka. Reading messages from Kafka makes it possible to use Kafka as the queue between syslog-ng instances. This could be a first step for the horizontal scalability of syslog-ng.

Proposed by: Viktor Juhasz

Mentor: Viktor Juhasz

Difficulty: Medium

Deliverables of the project:

  • syslog-ng instances can read messages from Kafka without duplicating them.
  • flexible configuration
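Reading "without duplicating" mostly comes down to tracking how far a consumer has read. The sketch below illustrates the idea with a plain Python list standing in for one Kafka partition. It is not a Kafka client and uses no Kafka API; the class and its methods are made up for this example.

```python
class OffsetTrackingReader:
    """Reads a list that stands in for one Kafka partition (illustration only)."""

    def __init__(self, partition, committed_offset=0):
        self.partition = partition
        # Offset of the next message to read, as stored by the previous run.
        self.committed_offset = committed_offset

    def poll(self, max_messages=10):
        """Return (offset, message) pairs starting at the committed offset."""
        start = self.committed_offset
        end = min(start + max_messages, len(self.partition))
        return [(offset, self.partition[offset]) for offset in range(start, end)]

    def commit(self, last_offset):
        """Record that everything up to and including last_offset was delivered."""
        self.committed_offset = max(self.committed_offset, last_offset + 1)


partition = ["m0", "m1", "m2", "m3", "m4"]
delivered = []

# First run: read three messages, deliver them, then commit.
reader = OffsetTrackingReader(partition)
batch = reader.poll(max_messages=3)
delivered += [message for _offset, message in batch]
reader.commit(batch[-1][0])
stored_offset = reader.committed_offset  # a real consumer persists this

# Simulated restart: a new reader resumes from the stored offset.
reader = OffsetTrackingReader(partition, committed_offset=stored_offset)
batch = reader.poll()
delivered += [message for _offset, message in batch]
reader.commit(batch[-1][0])

print(delivered)
```

Note the weak spot: if the process stops after delivering a batch but before committing, the batch is read again after the restart. Closing that gap is part of what the project needs to address.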

Desirable skills:

  • Familiarity with Kafka (especially the consumer side)
  • Programming with Java (or C)
  • TDD

Project: Kafka source using librdkafka

Brief description

The syslog-ng application can send messages into Kafka. However, it cannot read messages from Kafka. The goal of this project is to be able to read messages from Kafka. Reading messages from Kafka makes it possible to use Kafka as the queue between syslog-ng instances. This could be a first step for the horizontal scalability of syslog-ng.

Proposed by: Viktor Juhasz

Mentor: Viktor Juhasz

Difficulty: Medium

Deliverables of the project:

  • syslog-ng instances can read messages from Kafka without duplicating them.
  • flexible configuration

Desirable skills:

  • Familiarity with Kafka (especially the consumer side)
  • Programming with C
  • librdkafka knowledge
  • TDD

Project: Implement JDBC source and destination drivers

Brief description

The goal is to connect to database servers that syslog-ng does not support directly. JDBC drivers exist not only for traditional SQL databases, but also for some Big Data systems.

As we have a Java language binding, it is possible to support JDBC drivers. Part of the task is to provide a detailed performance measurement.

Tasks:

  • implement JDBC destination driver
  • implement JDBC source driver
  • performance measurement
  • performance measurement on the final result (JDBC vs. libdbi)
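As a starting point for the performance measurement tasks, the sketch below times two ways of writing the same messages into a database. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in; it involves neither JDBC nor libdbi, and the table layout and function names are made up for this example.

```python
import sqlite3
import time


def make_db():
    """Create an in-memory database with a hypothetical log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (host TEXT, program TEXT, message TEXT)")
    return conn


def insert_one_by_one(conn, rows):
    """Insert and commit every message separately."""
    for row in rows:
        conn.execute("INSERT INTO logs VALUES (?, ?, ?)", row)
        conn.commit()


def insert_batched(conn, rows):
    """Insert all messages in one batch and commit once."""
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
    conn.commit()


def measure(insert, rows):
    """Return (number of stored rows, elapsed seconds) for one strategy."""
    conn = make_db()
    start = time.perf_counter()
    insert(conn, rows)
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return count, elapsed


rows = [("host1", "app", f"message {i}") for i in range(1000)]
for name, insert in (("one by one", insert_one_by_one), ("batched", insert_batched)):
    count, elapsed = measure(insert, rows)
    print(f"{name}: {count} rows in {elapsed:.4f} s")
```

For the real comparison the same harness idea applies, but the numbers only mean something when both drivers write to the same database server, with the same message set, over several runs.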

Proposed by: Laszlo Budai

Mentor: Laszlo Budai (co-mentor: Viktor Juhasz)

Difficulty: Medium

Deliverables of the project:

  • support a wider range of databases

Desirable skills:

  • Familiarity with syslog-ng
  • Familiarity with C and Java
  • Familiarity with JDBC

What the student will learn:

  • how to implement a syslog-ng source
  • how to implement a syslog-ng destination
  • how to use SQL sources and destinations in syslog-ng

Project: Implement ODBC source and destination drivers

Brief description

The goal is to connect to database servers that syslog-ng does not support directly. ODBC drivers exist not only for traditional SQL databases, but also for some Big Data systems.

As we have an existing libdbi-based SQL destination, it could be a good starting point. Part of the task is to provide a detailed performance measurement.

Tasks:

  • implement ODBC destination driver
  • implement ODBC source driver
  • performance measurement
  • performance measurement on the final result (ODBC vs. libdbi)
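A database source has to fetch only the rows it has not seen yet. One common way to do that is to remember the highest id read so far and to query for larger ids on the next poll. The sketch below shows this with Python's built-in sqlite3 module as a stand-in for an ODBC connection; the table, its columns and the function name are assumptions made for this example.

```python
import sqlite3


def fetch_new_rows(conn, last_id, limit=100):
    """Return rows with id greater than last_id, plus the updated bookmark."""
    cursor = conn.execute(
        "SELECT id, message FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, limit),
    )
    rows = cursor.fetchall()
    new_last_id = rows[-1][0] if rows else last_id
    return rows, new_last_id


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany("INSERT INTO events (message) VALUES (?)", [("first",), ("second",)])

# First poll: everything is new.
rows, bookmark = fetch_new_rows(conn, last_id=0)
print(rows, bookmark)

# A new row arrives; the second poll returns only that row.
conn.execute("INSERT INTO events (message) VALUES (?)", ("third",))
more, bookmark = fetch_new_rows(conn, last_id=bookmark)
print(more, bookmark)
```

This approach assumes the table has a monotonically increasing key. Tables without one need a different bookmark (for example a timestamp column), which is a design question for the project.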

Proposed by: Laszlo Budai

Mentor: Laszlo Budai (co-mentor: Viktor Juhasz)

Difficulty: Medium

Deliverables of the project:

  • support a wider range of databases

Desirable skills:

  • Familiarity with syslog-ng
  • Familiarity with C
  • Familiarity with ODBC

What the student will learn:

  • how to implement a syslog-ng source
  • how to implement a syslog-ng destination
  • how to use SQL sources and destinations in syslog-ng

Core features

Project: Extend syslog-ng Java language binding with late acknowledgement mechanism

Brief description

Late acknowledgement (ACK) is needed by Java destinations that support asynchronous or bulk message sending. Currently, when a message is delivered to a Java destination that forwards messages asynchronously, it is acknowledged immediately and the LogMessage on the C side is destroyed. When the asynchronous or bulk send finishes on the Java side, we can no longer send an ACK/NACK to the C side.

Tasks:

  • extend Java language binding with ACK
  • connect Java LogMessages to C LogMessages (keep the reference)
  • use ACK mechanism in existing Java destinations
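The idea behind late acknowledgement can be sketched in a few lines of Python: the destination keeps a callback for every buffered message and invokes it only when the bulk send has finished. This illustrates the control flow only. It is not the syslog-ng Java binding, and the class, its methods and the callback signature are made up for this example.

```python
class BulkDestination:
    """Hypothetical destination that buffers messages and acknowledges late."""

    def __init__(self, send_bulk, batch_size=3):
        self.send_bulk = send_bulk    # callable(list of payloads) -> bool
        self.batch_size = batch_size
        self.pending = []             # (payload, ack_callback) pairs

    def queue(self, payload, ack_callback):
        """Buffer a message; it is NOT acknowledged at this point."""
        self.pending.append((payload, ack_callback))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send the buffered batch, then ACK or NACK every message in it."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        try:
            ok = self.send_bulk([payload for payload, _callback in batch])
        except Exception:
            ok = False
        for _payload, ack_callback in batch:
            ack_callback(ok)  # True means ACK, False means NACK


acks = []
sent_batches = []


def fake_send(payloads):
    """Stand-in for a bulk request to Kafka or ElasticSearch."""
    sent_batches.append(payloads)
    return True


dest = BulkDestination(fake_send, batch_size=3)
dest.queue("msg 1", lambda ok: acks.append(("msg 1", ok)))
dest.queue("msg 2", lambda ok: acks.append(("msg 2", ok)))
acks_before_flush = list(acks)  # still empty: nothing was sent yet
dest.queue("msg 3", lambda ok: acks.append(("msg 3", ok)))

print(acks_before_flush)
print(acks)
```

In the real binding the callback would have to reach the C side through JNI and keep the C LogMessage alive until it is called, which is what the "keep the reference" task refers to.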

Proposed by: Laszlo Budai

Mentor: Laszlo Budai (co-mentor: Viktor Juhasz)

Difficulty: Medium (understanding the ACK mechanism is not an easy task)

Deliverables of the project:

  • Improving reliable message sending to Java destinations.
  • Support reliable message sending when using Kafka in asynchronous mode
  • Support reliable message sending when using ElasticSearch in bulk mode

Desirable skills:

  • Familiarity with syslog-ng
  • Familiarity with C and Java (JNI)

What the student will learn:

  • syslog-ng acknowledgment mechanism
  • how to develop a multi-language project
  • how to implement Java extensions to C code

Library ecosystem

Project: Implement Python support in syslog-ng's correlation library

Brief description

The syslog-ng application has an external library for log message correlation. Basically, log messages can be grouped and operations can be executed on the grouped messages. These operations can generate artificial messages that can contain information about the whole group (for example the number of messages, the value of the "HOST" key in the last message, and so on).

The artificial messages are built with the Handlebars templating library. Although it served well for the initial purposes, its syntax is alien to system administrators, and no consolidation functions (for example avg, min, max) are implemented.

The aim of this project is to use the Python language as a template library and integrate it into the correlation library.

Tasks:

  • implement Python bindings for the relevant types (for example LogMessage)
  • implement the consolidation functions
  • integration into the correlation library
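To show what grouping and consolidation mean in practice, the sketch below groups messages by a key and builds an artificial message with the count, minimum, maximum and average of a numeric field. Plain dictionaries stand in for LogMessage objects. The function names, the output keys and the sample fields (SESSION, RESPONSE_MS) are made up for this example and do not reflect the correlation library's API.

```python
from collections import defaultdict
from statistics import mean


def group_by(messages, key):
    """Group messages (plain dicts here) by the value of one key."""
    groups = defaultdict(list)
    for message in messages:
        groups[message[key]].append(message)
    return dict(groups)


def consolidate(group, field):
    """Compute count, min, max and avg of a numeric field over a group.

    Assumes that at least one message in the group carries the field.
    """
    values = [float(message[field]) for message in group if field in message]
    return {
        "count": len(group),
        "min": min(values),
        "max": max(values),
        "avg": mean(values),
    }


def artificial_message(group, field):
    """Build a summary message that describes the whole group."""
    stats = consolidate(group, field)
    return {
        "HOST": group[-1]["HOST"],  # value of "HOST" in the last message
        "COUNT": stats["count"],
        "MIN": stats["min"],
        "MAX": stats["max"],
        "AVG": stats["avg"],
    }


messages = [
    {"HOST": "web1", "SESSION": "a", "RESPONSE_MS": "10"},
    {"HOST": "web2", "SESSION": "a", "RESPONSE_MS": "30"},
    {"HOST": "web1", "SESSION": "b", "RESPONSE_MS": "20"},
]
groups = group_by(messages, "SESSION")
summary = artificial_message(groups["a"], "RESPONSE_MS")
print(summary)
```

In the project itself, code like this would run inside the correlation library through the Python bindings, operating on real LogMessage objects.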

Bonus: support SQL-like syntax where the performance-critical functions are written in Rust, for example: select(max("foo"), where=streq("bar", "beer"))

Proposed by: Tibor Benke

Mentor: Tibor Benke (co-mentor: Viktor Juhasz)

Difficulty: Medium

Deliverables of the project:

  • Improving data enrichment capabilities of syslog-ng's correlation library
  • Improving the template syntax of syslog-ng's correlation library
  • Improving the flexibility of syslog-ng's correlation library

Desirable skills:

  • Familiarity with Rust and Python

What the student will learn:

  • Development in Rust
  • Using Python from other languages through FFI
  • How to write memory safe code