
GSoC 2015 Proposal: syslog-ng in Java (Johnson Li)

Johnson Li edited this page Mar 27, 2015 · 4 revisions

Goal

The syslog-ng application is a flexible and highly scalable system logging application that is ideal for creating centralized and trusted logging solutions. Java is one of the most popular programming languages today. Although syslog-ng already makes it possible to write destinations in Java, that is far from enough. The goal of this project is to extend this functionality and make it possible to write filters, parsers, rewrite rules, template functions, and even sources in Java.

Benefits

For syslog-ng

Syslog-ng is powerful mostly because of its flexible configuration. The syslog-ng Open Source Edition 3.6 Administrator Guide provides a demo configuration that receives and parses logs from Tomcat, but it would be more convenient if the user could write the source and the parser in Java. This would also make it possible to produce highly structured messages. In addition, a Java programmer could easily extend the functionality of syslog-ng through Java APIs, making good use of existing Java libraries for string parsing, socket communication, file operations, and so on.

In short, this project will improve syslog-ng's functionality by providing a flexible API for Java, letting syslog-ng take advantage of the many existing Java libraries as well as the features of the Java language itself.

For me

To be honest, I am a Java programmer because the encapsulation that Java provides helps a lot when developing software. But I have never thought that a programmer using a high-level language like Java can ignore how it is implemented at the low level, so I always pay attention to the low-level details behind Java. For example, when I started using Java NIO, I first looked up selectors and epoll in C. In my mind, a good Java programmer must also be good at C. That is why I am really interested in this project: the binding between C and Java is exactly what I want to learn. This project gives me a great chance to gain project experience in both C and Java, and working with JNI will also give me a better understanding of Java.

Details

Structure in language level

The work in this project can be roughly split into two parts: Java and C. The Java part provides the high-level interface, or API, for the concrete Java code written by users. The C part plays the role of an interpreter, making it possible for Java code to communicate with the syslog-ng core. It is also important to map syslog-ng 'objects' such as logmsg and logpipe to Java classes, so that Java code can have full control over syslog-ng.

Java files are compiled and packaged into SyslogNg.jar, then deployed to the syslog-ng module directory. C files are compiled into both static and dynamic library files and deployed to the same directory. Both are loaded by syslog-ng when it starts (more accurately, the native library is responsible for starting the JVM and loading the jar). This work is not part of the official repository; it is a module for syslog-ng, as described on the syslog-ng-incubator home page.

More technically, the Java side declares native methods and the C side implements them using JNI. This C code is called the proxy in this structure, meaning that it only transfers data between the Java code and the syslog-ng core, in either direction. To write these proxies, however, it is essential to have a global view of syslog-ng and to know which function or variable to use when delivering a 'message'.
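
To make the native-method idea concrete, here is a minimal sketch of what the Java half of such a proxy could look like. The class name `NativeLogMessage`, its method names, and the use of a `long` handle to hold the C pointer are all assumptions made for illustration; they are not taken from the existing module. The native methods are only declared here, so calling them without a matching C library would fail.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical Java-side view of a syslog-ng log message.
// Names and the long "handle" are illustrative assumptions, not the module's API.
class NativeLogMessage {
    // Opaque handle: would hold the address of the C LogMessage structure.
    private final long handle;

    NativeLogMessage(long handle) {
        this.handle = handle;
    }

    long getHandle() {
        return handle;
    }

    // Declared in Java, to be implemented in C through JNI (the "proxy").
    // Invoking these without the native library throws UnsatisfiedLinkError.
    private native String getValue(long handle, String name);

    private native void setValue(long handle, String name, String value);

    // Public wrappers hide the handle from user code.
    String get(String name) {
        return getValue(handle, name);
    }

    void set(String name, String value) {
        setValue(handle, name, value);
    }

    // Counts the methods that need a C implementation behind them.
    static int countNativeMethods() {
        int count = 0;
        for (Method m : NativeLogMessage.class.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                count++;
            }
        }
        return count;
    }
}

class NativeProxyDemo {
    public static void main(String[] args) {
        // Only inspects the declarations; no native call is made.
        System.out.println("native methods to implement in C: "
                + NativeLogMessage.countNativeMethods());
    }
}
```

Keeping the native methods private and exposing plain Java wrappers is one possible design choice: user code never sees the handle, and the set of functions the C proxy has to implement stays small and easy to list.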

Structure in functionality level

Syslog-ng provides flexible configuration options; the ones I care about in this project are destination, filter, parser, rewrite rule, and template function. As described before, the destination functionality has already been implemented, so the rest make up most of my work, and they should all follow the design pattern of the destination.

The source functionality is much more complex and may need a different design. For functionality like the destination, the running syslog-ng process invokes the Java code when needed, so the call is synchronous. A source is different: the Java code has to notify the syslog-ng process that it has a message, which makes the procedure asynchronous. From operating systems we know that cross-process communication is much more complex than communication within a single process. My current design is to use JMX, but this is not decided yet and requires further discussion.
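
The asynchronous hand-off can be illustrated with a small in-process sketch. This is not the JMX-based design mentioned above and does not cross a process boundary; it only shows the general shape of a source that posts messages whenever it has them while a separate consumer (standing in for the syslog-ng core) collects them later. The names `QueueSource`, `post`, and `drain` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical asynchronous source; all names are illustrative only.
class QueueSource {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();

    // Called from any Java thread when a new message is available.
    void post(String message) {
        pending.add(message);
    }

    // Called by the consuming side (a stand-in for the syslog-ng core).
    // Moves every queued message into a list, in arrival order.
    List<String> drain() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        return batch;
    }
}

class QueueSourceDemo {
    // Posts two messages from a producer thread, then drains them.
    static List<String> run() {
        QueueSource source = new QueueSource();
        Thread producer = new Thread(() -> {
            source.post("first");
            source.post("second");
        });
        producer.start();
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return source.drain();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the sketch is the direction of control: the producer decides when a message exists, which is the opposite of the destination case where syslog-ng calls into Java.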

Structure in module level

Structure of module java

  • module/java/
  • module/java/native/
  • module/java/proxy/
  • Java files (module/java folder) This folder contains all the Java files needed by this project, including syslog-ng objects, extendable functionality classes, and the Java class loader.

    • Syslog-ng objects are Java classes that represent syslog-ng C structures. For example, LogMessage.java corresponds to the _LogMessage structure in logmsg.h. They should let Java code change the inner state of these structures.

    • Extendable functionality classes are grouped by functionality. For example, LogDestination.java represents the destination functionality in syslog-ng. They should provide the basic operations for each functionality, and users should extend them to define their own implementations.

    • The Java class loader is responsible for loading the jar file from disk. The JVM already provides a very convenient way to load external Java classes, so this problem is already solved.

  • Native files (module/java/native folder) Native files are the main part of this project. They cover syslog-ng configuration parsing, plugin registry, JVM setup, and the native C parts.

    • Syslog-ng configuration parsing is written in java-gramma.ym and java-parser.c. Syslog-ng uses yacc and lex to analyze the configuration file, and the Java module uses the same method. At compile time, java-gramma.ym is used to generate java-gramma.c and java-gramma.h, which are compiled into the library for the Java module. In java-gramma.ym, I need to import the 'tokens' defined in java-parser.c and map C functions to handle each configuration keyword defined there. java-parser.c contains a structure that records each configuration keyword.

    • For the plugin registry, each functionality should be defined separately in java-plugin.c as a member of the Plugin structure. The type field of each plugin identifies its functionality; for example, LL_CONTEXT_PARSER marks a parser.

    • JVM setup starts the Java virtual machine so that Java code can be invoked from C. These basic functions have already been finished in java_machine.c and java-class-loader.c, so I don't need to modify them.

    • Native C parts are tightly coupled to syslog-ng code. They should define the data structures that extend functionality to Java plugins. At setup, syslog-ng reads the configuration file, then initializes each module and every object according to it, so the native C parts should provide functions to initialize my modules and Java classes correctly. At run time, syslog-ng uses this part to invoke Java code. For example, when a log message reaches the destination stage, the LogMessage structure that represents it is passed to the java_dd_send_to_object function of java_destination.c, and this function should then pass the message on to the proxy.

  • Proxy files (module/java/proxy folder) As the name indicates, this part transfers data from C to Java and from Java to C. Every operation on a syslog-ng structure that is available in Java should be implemented here, and whenever native C code wants to invoke a Java function, it has to call the proxy functions instead. This part is designed to make the structure clearer, and it is not too much work because most of the code here follows the same pattern.
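
As an illustration of the extendable functionality classes described above, here is a minimal sketch of how a user-defined filter might look. `LogMessage` is the class name used in this proposal, but the version below is a pure-Java stand-in backed by a map rather than by the C structure. `LogFilter`, `evaluate`, and `ProgramFilter` are hypothetical names chosen for this example, and `PROGRAM` and `MESSAGE` are used simply as sample keys.

```java
import java.util.HashMap;
import java.util.Map;

// Pure-Java stand-in for the LogMessage class named in the proposal.
// In the real module its state would live in the C structure.
class LogMessage {
    private final Map<String, String> values = new HashMap<>();

    void setValue(String name, String value) {
        values.put(name, value);
    }

    // Returns the stored value, or an empty string if the name is unset.
    String getValue(String name) {
        return values.getOrDefault(name, "");
    }
}

// Hypothetical base class that users would extend to write a filter.
abstract class LogFilter {
    // Returns true if the message should pass the filter.
    abstract boolean evaluate(LogMessage msg);
}

// Example user implementation: pass only messages from one program.
class ProgramFilter extends LogFilter {
    private final String program;

    ProgramFilter(String program) {
        this.program = program;
    }

    @Override
    boolean evaluate(LogMessage msg) {
        return program.equals(msg.getValue("PROGRAM"));
    }
}

class FilterDemo {
    public static void main(String[] args) {
        LogMessage msg = new LogMessage();
        msg.setValue("PROGRAM", "sshd");
        msg.setValue("MESSAGE", "Accepted publickey for root");

        System.out.println(new ProgramFilter("sshd").evaluate(msg));
        System.out.println(new ProgramFilter("cron").evaluate(msg));
    }
}
```

With this shape, the user only writes the subclass; the base class and the message object are the parts the module would supply, and the proxy layer would be responsible for calling `evaluate` from C.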

Timeline

  • May 10th - May 24th
    • Preparation: autoconfig tools; figure out how to modify the configure files for the added files.
    • Preparation: base structure of syslog-ng; find out how the proxy code is triggered, possibly by analyzing the destination code.
  • May 25th - June 10th
    • Design: create Java classes and methods to meet the requirements of each functionality. Some demo user Java code may also be required.
    • Design: C source files and functions for every functionality, mainly the proxy files and the JNI functions behind the Java native methods.
  • June 11th - July 10th
    • Coding: start from the filter, as suggested by Juhász. This is the real start of the major part of the project, and every single function should be tested, so unit test code should be created here.
  • July 11th - July 21st
    • Coding: parser and rewrite rule functionality. Numerous test cases are also required, and all of them should be kept.
  • July 22nd - July 31st
    • Coding: template function functionality. The requirements are the same as for the previous coding stages. If time permits, I can also try to implement the source functionality, but it's optional.
  • Aug 1st - Aug 12th
    • Integration test; some documentation if needed.
    • Review: get feedback from the mentor and improve the design from a global view.
  • Aug 13th - Aug 20th
    • Cleaning: remove unnecessary code, mainly comments and debug messages.
    • Summary: commit the final version.

About me

Contact Info:

I'm a third-year B.S. student at the School of Computer Science, Fudan University, and I'm good at C and Java. In my operating systems class, I completed the ucore lab independently, which requires students to complete part of a Linux kernel based on version 2.4. That demonstrates my ability in C. As for Java, I gained a lot of experience in software development during my internship, where my work involved Continuous Integration, Continuous Delivery, web development based on Spring, and so on. Besides, as a Linux user, I'm familiar with shell scripting as well as some useful tools, including syslog-ng.

This is the first time I will work for an open source community. But ever since I first learned to code, I have used numerous open source projects, and it would be really exciting if I could contribute to syslog-ng.

Additional Information
