GSoC 2019 Proposal: Analyze and improve JSON performance (WqyJh)

Qiying Wang edited this page Apr 9, 2019 · 1 revision

Project Overview

The goal of this project is to improve the performance of JSON parsing and formatting in syslog-ng. The task is to design a benchmark, compare the currently used json-c with other existing JSON libraries, and choose a better one to replace it.

syslog-ng is a high-performance log daemon that, running on a single node, provides performance comparable to a large cluster. Under heavy load, its performance is strongly influenced by parsing and formatting. Among the many supported data formats, JSON is widely used for destinations such as Elasticsearch. This project will improve JSON performance and make syslog-ng more competitive with other log shippers.

Why am I interested

I am an undergraduate student with about 4 years of experience in Linux and the C programming language. I have designed and implemented a proxy system for a company, consisting of an Nginx proxy module, a Windows network driver and some Linux programs, all written in C.

I also really love open source. I often come up with interesting ideas and open-source my work. One example is sshx, a cross-platform tool that remembers SSH accounts and establishes SSH connections. I've also contributed to several open-source projects.

I'm interested in project management, too. I have led several projects, for which I decided the collaboration workflow, the project schedule and the releases.

Recently I have been working on a monitoring system that collects system metrics and visualizes them. I chose Elasticsearch to store the data, Grafana to visualize it, and syslog-ng to ship the logs. I got familiar with syslog-ng through this project and want to explore it further. I want to learn syslog-ng at the source-code level, which helps when I run into difficulties. I also want to learn about project collaboration and management. Therefore, I want to contribute to syslog-ng and join the community.

Goals of this project

  1. Design a benchmark for analyzing the performance of JSON parsing and formatting in syslog-ng.
  2. Design a benchmark for comparing JSON libraries.
  3. Define the selection criteria that make a JSON library the best choice.
  4. Select the best library and integrate it with syslog-ng. Since many high-performance JSON libraries are implemented in C++, we need to design a wrapper layer. The wrapper also keeps the JSON API stable, which improves extensibility and makes it easier to switch to another JSON library.
  5. Evaluate the performance of JSON parsing/formatting in syslog-ng with the selected library using the previously designed benchmark, and measure the performance improvement.
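
As a minimal sketch of what the wrapper layer in goal 4 could look like, the code below hides the JSON library behind a small table of function pointers, so callers never touch library-specific types. All names here (`JSONBackend`, `stub_backend`, `json_roundtrip`) are hypothetical and do not exist in syslog-ng. The stub backend only copies the text and stands in for a real library such as json-c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical backend-neutral interface; not part of syslog-ng. */
typedef struct JSONBackend
{
  const char *name;
  /* Parse len bytes of JSON text; return an opaque document or NULL on error. */
  void *(*parse)(const char *text, size_t len);
  /* Serialize doc into buf; return bytes written (without NUL) or -1. */
  int (*format)(const void *doc, char *buf, size_t buflen);
  /* Release a document returned by parse. */
  void (*free_doc)(void *doc);
} JSONBackend;

/* Stub backend: accepts text starting with '{' or '[' and keeps a copy.
 * A real backend would call into json-c or the selected library here. */
static void *stub_parse(const char *text, size_t len)
{
  char *copy;

  if (len == 0 || (text[0] != '{' && text[0] != '['))
    return NULL;
  copy = malloc(len + 1);
  if (!copy)
    return NULL;
  memcpy(copy, text, len);
  copy[len] = '\0';
  return copy;
}

static int stub_format(const void *doc, char *buf, size_t buflen)
{
  int n = snprintf(buf, buflen, "%s", (const char *) doc);

  return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}

static void stub_free(void *doc)
{
  free(doc);
}

const JSONBackend stub_backend = { "stub", stub_parse, stub_format, stub_free };

/* Caller-side code sees only the interface, never the library behind it. */
int json_roundtrip(const JSONBackend *be, const char *text, char *out, size_t outlen)
{
  void *doc;
  int n;

  assert(be != NULL && text != NULL && out != NULL);
  doc = be->parse(text, strlen(text));
  if (!doc)
    return -1;
  n = be->format(doc, out, outlen);
  be->free_doc(doc);
  return n;
}
```

With this shape, switching libraries would mean supplying another `JSONBackend` value. A backend for a C++ library could implement the same three functions behind `extern "C"`.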

Knowledge required

  • C programming language
  • Linux
  • git
  • autotools
  • unit tests
  • profiling

Already familiar

  • C programming language: experience from many projects, such as the Nginx module and the Windows network driver
  • Linux: about 4 years of experience using Linux as my main system
  • git: git commands, git workflow, GitHub PRs

To improve

  • autotools: get familiar with autotools configuration
  • unit tests: get familiar with the Criterion library
  • profiling: learn profiling and benchmarking

Time schedule

May 7 - May 27 (Preparation)

  • Get familiar with the codebase
  • Search for potential JSON libraries
  • Improve my profiling skills
  • Get familiar with the mentor and the community

May 27 - June 10

  • Design a benchmark for analyzing the performance of JSON parsing and formatting in syslog-ng.

June 10 - June 24

  • Design a benchmark for comparing JSON libraries.
  • Define the selection criteria that make a JSON library the best choice.
  • Select the best library.
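
One possible starting point for the library comparison is a small timing harness that runs every candidate through the same function-pointer signature on the same input. The sketch below is an assumption about how such a harness might look, not an existing syslog-ng tool: `bench_ns_per_call`, `bench_fn`, `count_quotes` and `sink` are hypothetical names, and `count_quotes` is only a placeholder workload where a real parse or format call would go. It relies on the POSIX `clock_gettime` function with `CLOCK_MONOTONIC`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif

#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Every candidate library is wrapped in a function with this signature. */
typedef void (*bench_fn)(const char *input, size_t len);

/* Run fn on the same input `iterations` times and return the average
 * wall-clock time per call in nanoseconds (0.0 when iterations is 0). */
double bench_ns_per_call(bench_fn fn, const char *input, size_t len,
                         unsigned long iterations)
{
  struct timespec start, end;
  unsigned long i;
  double elapsed_ns;

  assert(fn != NULL);
  if (iterations == 0)
    return 0.0;

  clock_gettime(CLOCK_MONOTONIC, &start);
  for (i = 0; i < iterations; i++)
    fn(input, len);
  clock_gettime(CLOCK_MONOTONIC, &end);

  elapsed_ns = (double) (end.tv_sec - start.tv_sec) * 1e9
               + (double) (end.tv_nsec - start.tv_nsec);
  return elapsed_ns / (double) iterations;
}

/* Volatile accumulator so the compiler cannot optimize the workload away. */
volatile unsigned long sink;

/* Placeholder workload: counts double quotes in the input.
 * A real benchmark would call the library's parse or format routine here. */
void count_quotes(const char *input, size_t len)
{
  unsigned long n = 0;
  size_t i;

  for (i = 0; i < len; i++)
    if (input[i] == '"')
      n++;
  sink += n;
}
```

Averaging over many iterations smooths out timer resolution. A fuller benchmark would also vary message size and structure, and report memory usage alongside time.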

June 24 - July 8

  • First evaluation and fixes
  • Design a JSON API wrapper layer and integrate it with the existing json-c API.

July 8 - July 22

  • Integrate the chosen JSON library
  • Create unit tests

July 22 - Aug 9

  • Second evaluation and fixes
  • Evaluate the performance of JSON parsing/formatting in syslog-ng with the selected library using the previously designed benchmark, and measure the performance improvement.

Aug 9 - Aug 19

  • Review code
  • Write documentation

Aug 19 - Aug 27

  • Submit code and documentation as a pull request
  • Address review comments on the pull request

Contact Information

Name: Qiying Wang
Email: qiyingwangwqy@gmail.com
University: Huazhong University of Science and Technology
Phone: will be provided in the GSoC proposal
Location: Wuhan, China (GMT +8 hours)