noizu-labs/StreamingJsonParser

Noizu Streaming JSON Parser

A C library for efficiently parsing JSON data from a stream, without needing to load the entire JSON document into memory first. It works in conjunction with the trie_gen library to simplify token management and to optimize parsing performance.

It was written to parse large JSON payloads on an IoT project where the underlying chip could not consume the incoming data fast enough, which led to buffer overruns, timeouts, and crashes.

Key Features

  • Efficient: Low memory footprint, ideal for resource-constrained environments.
  • Streaming: Parses JSON data as it arrives, suitable for large documents or real-time applications.
  • Callback-based: User-defined callbacks for handling parsed JSON elements.
  • Customizable: Supports different trie implementations (generated by trie_gen) for flexible tokenization.
  • Portable: Pure C implementation for broad compatibility.
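
To illustrate what "parses JSON data as it arrives" means in practice, here is a minimal sketch of a scanner that keeps only a few bytes of state between chunks. It is not this library's implementation; the demo_scanner type and its functions are hypothetical names invented for this example, and the scanner only tracks nesting and string state rather than extracting values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scanner state: everything that must survive between chunks. */
typedef struct {
    int depth;       /* current nesting depth of {} and [] */
    bool in_string;  /* currently inside a "..." literal */
    bool escaped;    /* previous byte was a backslash inside a string */
    int documents;   /* top-level objects/arrays closed so far */
} demo_scanner;

static void demo_scanner_init(demo_scanner *s) {
    s->depth = 0;
    s->in_string = false;
    s->escaped = false;
    s->documents = 0;
}

/* Consume one chunk. A chunk boundary may fall anywhere, even mid-string. */
static void demo_scanner_feed(demo_scanner *s, const char *chunk, size_t len) {
    for (size_t i = 0; i < len; i++) {
        char c = chunk[i];
        if (s->in_string) {
            if (s->escaped) {
                s->escaped = false;      /* escaped byte is literal */
            } else if (c == '\\') {
                s->escaped = true;
            } else if (c == '"') {
                s->in_string = false;
            }
            continue;                    /* braces inside strings are ignored */
        }
        if (c == '"') {
            s->in_string = true;
        } else if (c == '{' || c == '[') {
            s->depth++;
        } else if (c == '}' || c == ']') {
            if (s->depth > 0 && --s->depth == 0) {
                s->documents++;          /* a top-level value just closed */
            }
        }
    }
}
```

Because the state struct is a handful of bytes, the receive buffer can be reused for every chunk, so memory use does not grow with the size of the document.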

Getting Started

1. Install trie_gen

Follow the instructions in the trie_gen repository to install the trie generator tool.

2. Generate Trie Definition

Create a text file containing your JSON keys and corresponding tokens (one key-token pair per line, separated by a pipe |). Then use the trie_generator.exe tool to generate a C header file with the trie definition. The last argument selects the trie implementation: ARRAY, STRUCT, or COMPACT.

trie_generator.exe keys.txt my_trie.h my_trie ARRAY
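
As a rough picture of why a trie suits streaming key lookup, the sketch below walks a small hand-written trie one byte at a time and returns a token for a known key. The node layout, the demo_ names, and the token values are assumptions made for this illustration; they are not the format that trie_gen emits.

```c
#include <stddef.h>

/* Hypothetical tokens for the keys "value1" and "value2". */
enum { DEMO_TOKEN_NONE = 0, DEMO_TOKEN_VALUE1 = 1, DEMO_TOKEN_VALUE2 = 2 };

/* Hypothetical node layout: children form a sibling-linked list. */
typedef struct {
    char ch;           /* byte that leads to this node */
    int first_child;   /* index of first child, or -1 */
    int next_sibling;  /* index of next sibling, or -1 */
    int token;         /* token if a key ends here, else DEMO_TOKEN_NONE */
} demo_trie_node;

/* The shared prefix "value" is stored once; only the last byte branches. */
static const demo_trie_node demo_trie[] = {
    { 0,   1, -1, DEMO_TOKEN_NONE },   /* 0: root */
    { 'v', 2, -1, DEMO_TOKEN_NONE },   /* 1 */
    { 'a', 3, -1, DEMO_TOKEN_NONE },   /* 2 */
    { 'l', 4, -1, DEMO_TOKEN_NONE },   /* 3 */
    { 'u', 5, -1, DEMO_TOKEN_NONE },   /* 4 */
    { 'e', 6, -1, DEMO_TOKEN_NONE },   /* 5 */
    { '1', -1, 7, DEMO_TOKEN_VALUE1 }, /* 6 */
    { '2', -1, -1, DEMO_TOKEN_VALUE2 } /* 7 */
};

/* Advance one byte from `node`; returns the child index or -1 on mismatch. */
static int demo_trie_step(int node, char c) {
    if (node < 0) {
        return -1;
    }
    for (int child = demo_trie[node].first_child; child >= 0;
         child = demo_trie[child].next_sibling) {
        if (demo_trie[child].ch == c) {
            return child;
        }
    }
    return -1;
}

/* Look up a whole key; returns DEMO_TOKEN_NONE for unknown keys. */
static int demo_trie_token(const char *key, size_t len) {
    int node = 0;
    for (size_t i = 0; i < len && node >= 0; i++) {
        node = demo_trie_step(node, key[i]);
    }
    return node >= 0 ? demo_trie[node].token : DEMO_TOKEN_NONE;
}
```

Since demo_trie_step needs only the current node index and the next byte, a key can be matched as its bytes arrive, without buffering the key and comparing strings afterwards.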

3. Include Headers

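In your main C file, include the generated trie header and the Noizu Streaming JSON Parser header. The example in step 5 also calls printf and strlen, so include <stdio.h> and <string.h> as well:

#include <stdio.h>   // printf
#include <string.h>  // strlen
#include "my_trie.h"
#include "streaming-parser.h"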

4. Define Callback Function

Create a callback function to process parsed JSON elements. Use the generated tokens from my_trie.h to identify the keys:

// Example structure to hold parsed data
struct parsed_data {
    int value1;
    char value2[100];
};

// Callback function to handle parsed JSON elements
jsp_cb_command my_callback(json_parse_state state, json_parser* parser) {
    struct parsed_data* data = (struct parsed_data*)parser->output;
    switch (state) {
        case PS_COMPLETE:
            if (parser->token == MY_TRIE_TOKEN_VALUE1) {
                json_parser__extract_sint31(parser, &data->value1);
            } else if (parser->token == MY_TRIE_TOKEN_VALUE2) {
                json_parser__extract_string(parser, &data->value2);
            }
            break;
        // Handle other parsing states as needed (PS_LIST_ITEM_COMPLETE, etc.)
        default:
            break;
    }
    return JSPC_PROCEED;
}
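
The control flow behind a callback-based parser can be sketched independently of this library. In the example below, a driver hands events to a user callback and obeys the command it returns. All demo_ names are hypothetical, and DEMO_STOP is an invented command for illustration; the only command this README shows for the real library is JSPC_PROCEED.

```c
#include <stddef.h>

/* Hypothetical command set returned by the callback. */
typedef enum { DEMO_PROCEED, DEMO_STOP } demo_command;

/* Hypothetical event: a key token plus an already-decoded integer value. */
typedef struct {
    int token;
    long value;
} demo_event;

typedef demo_command (*demo_callback)(const demo_event *ev, void *output);

/* Deliver events in order until the callback asks to stop.
   Returns the number of events delivered. */
static size_t demo_dispatch(const demo_event *events, size_t n,
                            demo_callback cb, void *output) {
    size_t delivered = 0;
    for (size_t i = 0; i < n; i++) {
        delivered++;
        if (cb(&events[i], output) == DEMO_STOP) {
            break;
        }
    }
    return delivered;
}

/* User-owned result storage, passed through the opaque output pointer. */
struct demo_result {
    long value1;
    int seen;
};

/* Store the value for token 1, then stop; ignore every other token. */
static demo_command demo_on_event(const demo_event *ev, void *output) {
    struct demo_result *r = output;
    if (ev->token == 1) {
        r->value1 = ev->value;
        r->seen = 1;
        return DEMO_STOP;
    }
    return DEMO_PROCEED;
}
```

Passing the result struct through an opaque pointer mirrors the role parser->output plays in the callback above: the driver never needs to know the shape of the caller's data.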

5. Initialize and Parse

Create an offset buffer, initialize the parser with the generated trie and your callback, and start parsing:

int main(void) {
    // Example JSON data
    char json[] = "{\"value1\": 123, \"value2\": \"abc\"}";

    // Create an offset buffer
    offset_buffer req;
    req.buffer = json;
    req.buffer_size = strlen(json) + 1;

    // Create a zero-initialized parsed_data structure to store results,
    // so the printf calls below are safe even if a key is absent
    struct parsed_data data = {0};

    // Initialize the parser with the generated trie
    json_parser* parser = init_json_parser(&req, my_trie, my_callback, &data);

    // Start parsing
    json_streaming_parser(parser);

    // Access parsed data
    printf("value1: %d\n", data.value1);
    printf("value2: %s\n", data.value2);

    // Free the parser
    free_json_parser(parser, TRUE);

    return 0;
}

Use Cases

  • IoT devices
  • Embedded systems
  • Real-time data processing
  • Large JSON document parsing

Benefits

  • Efficiency: Low memory usage and fast parsing.
  • Flexibility: Supports various trie implementations and parsing options.
  • Ease of Use: trie_gen simplifies token management and improves code readability.

Additional Resources
