basiliqio/messy_json

Messy Json

Rust JSON Parser for dynamically structured documents

Introduction

The Rust ecosystem offers very good compile-time JSON deserialization into Rust structures. Options are sparser when it comes to run-time deserialization of dynamically structured objects. This crate approaches the problem in a simple manner, resembling serde_json's Value.

Example

    use messy_json::*;
    use serde::de::DeserializeSeed;

    fn main() {
        // Schema for the single field: a string; `false` marks it as not optional.
        let nested_string = MessyJson::from(MessyJsonInner::String(MessyJsonScalar::new(false)));
        // Root schema: an object with one key, "hello"; `false` again means not optional.
        let schema: MessyJson = MessyJson::from(MessyJsonInner::Obj(MessyJsonObject::from(
            MessyJsonObjectInner::new(
                vec![(arcstr::literal!("hello"), nested_string)]
                    .into_iter()
                    .collect(),
                false,
            ),
        )));
        let value = r#"
        {
            "hello": "world"
        }
        "#;

        // Drive serde_json's deserializer with the schema as the seed.
        let mut deserializer = serde_json::Deserializer::from_str(value);
        let parsed: MessyJsonValueContainer = schema
            .builder(MessyJsonSettings::default())
            .deserialize(&mut deserializer)
            .unwrap();

        println!("{:#?}", parsed);
    }
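To make the idea of a schema built at run time more concrete, here is a minimal, std-only sketch. It is not messy_json's implementation: the `Schema` and `Value` enums and the `conforms` and `hello_schema` functions are hypothetical names introduced for illustration, and the check runs on an already-built value tree, whereas the crate applies its schema while deserializing.

```rust
use std::collections::BTreeMap;

/// Hypothetical run-time schema: the shape a JSON value must have.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    String,
    Obj(BTreeMap<String, Schema>),
}

/// Hypothetical dynamic value, in the spirit of `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Obj(BTreeMap<String, Value>),
}

/// Returns `Ok(())` when `value` has exactly the shape described by `schema`.
fn conforms(schema: &Schema, value: &Value) -> Result<(), String> {
    match (schema, value) {
        (Schema::String, Value::String(_)) => Ok(()),
        (Schema::Obj(fields), Value::Obj(entries)) => {
            // Reject keys the schema does not know about.
            for key in entries.keys() {
                if !fields.contains_key(key) {
                    return Err(format!("unexpected key `{key}`"));
                }
            }
            // Every schema field must be present and must itself conform.
            for (key, field_schema) in fields {
                let entry = entries
                    .get(key)
                    .ok_or_else(|| format!("missing key `{key}`"))?;
                conforms(field_schema, entry)?;
            }
            Ok(())
        }
        _ => Err("type mismatch".to_string()),
    }
}

/// Builds a schema for `{"hello": <string>}`, mirroring the example above.
fn hello_schema() -> Schema {
    let mut fields = BTreeMap::new();
    fields.insert("hello".to_string(), Schema::String);
    Schema::Obj(fields)
}
```

Because the schema is ordinary data, it can be assembled from configuration or user input at run time, which is the case a derive-based deserializer cannot cover.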

Performance

This crate is more efficient than serde_json's Value when all the fields are required. Performance is on par with serde_json's Value when some fields are optional.

However, this crate is far slower than deserializing with serde's proc-macro (which is not dynamically structured at all).

This gap could be narrowed with a custom arena-based allocator, such as Bumpalo, once the Allocator trait is stabilized.
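As a rough illustration of why an arena can help, the sketch below stores every node of a parsed tree in one growable buffer instead of giving each node its own heap allocation. This is an index-based arena written with std only, not Bumpalo's API and not something the crate does today; `Arena`, `Node`, `NodeId` and `build_dummy` are hypothetical names. The strings and entry vectors still allocate on their own here, which a real bump allocator would also avoid.

```rust
/// Index of a node inside the arena (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeId(usize);

/// A parsed JSON node; children are referenced by index, not by pointer.
#[derive(Debug, PartialEq)]
enum Node {
    String(String),
    /// Object entries as (key, child index) pairs.
    Obj(Vec<(String, NodeId)>),
}

/// All nodes of one document live in a single contiguous buffer.
#[derive(Debug)]
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    /// Reserves room for `cap` nodes up front.
    fn with_capacity(cap: usize) -> Self {
        Arena { nodes: Vec::with_capacity(cap) }
    }

    /// Appends a node and returns its index.
    fn alloc(&mut self, node: Node) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(node);
        id
    }

    fn get(&self, id: NodeId) -> &Node {
        &self.nodes[id.0]
    }

    fn len(&self) -> usize {
        self.nodes.len()
    }
}

/// Builds `{"hello": {"hola": "world"}}` bottom-up and returns the root index.
fn build_dummy(arena: &mut Arena) -> NodeId {
    let world = arena.alloc(Node::String("world".to_string()));
    let inner = arena.alloc(Node::Obj(vec![("hola".to_string(), world)]));
    arena.alloc(Node::Obj(vec![("hello".to_string(), inner)]))
}
```

Dropping the arena frees every node at once, so there is no per-node deallocation cost either.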

This crate ships with benchmarks. The results below were obtained on a machine with the following specs:

  • CPU: Intel i9-9900K @ 4.7 GHz
  • RAM: 32 GB @ 2133 MHz
  • Kernel: 5.11.16-arch1-1
  • Rust: rustc 1.51.0 (2fd73fabe 2021-03-23)

In the following benchmarks, the messy_json crate is compared with deserializing into serde_json's Value and with a macro-generated deserializer using serde's derive.

Dummy object

This benchmark deserializes the following JSON document:

{
	"hello":
	{
		"hola": "world"
	}
}

The accepted schema should look like the following:

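use serde::{Serialize, Deserialize};
use std::borrow::Cow;

#[derive(Serialize, Deserialize)]
struct DummyObjNested<'a> {
    hola: Cow<'a, str>,
}

#[derive(Serialize, Deserialize)]
struct DummyObj<'a> {
    hello: DummyObjNested<'a>,
}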

The results show that messy_json is slower than the macro-generated deserializer but faster than serde_json's Value.

Partial object

This benchmark deserializes the following JSON document:

{
	"hello":
	{
		"hola": "world"
	}
}

The accepted schema should look like the following:

use serde::{Serialize, Deserialize};
use std::borrow::Cow;

#[derive(Serialize, Deserialize)]
struct PartialObjNested<'a> {
    hola: Cow<'a, str>,
}

#[derive(Serialize, Deserialize)]
struct PartialObj<'a> {
    hello: PartialObjNested<'a>,
    coucou: Option<Cow<'a, str>>,
    coucou1: Option<Cow<'a, str>>,
    coucou2: Option<Cow<'a, str>>,
}

The results show that messy_json is slower than the macro-generated deserializer and on par with serde_json's Value. When a schema contains optional values, this crate has to check, for each object, that every mandatory value was present, hence the performance regression. In the future, once Rust's allocator API is stabilized, optimizations could reduce the time needed to check for missing fields.
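The extra work caused by optional fields can be pictured with a small std-only sketch. It is an assumption about the general shape of such a check, not the crate's actual code: `FieldSpec`, `ObjectSchema`, `check` and `partial_obj_schema` are hypothetical names. When every field is required, and unknown or duplicate keys are rejected, comparing counts is enough; once some fields are optional, each mandatory field has to be looked up individually.

```rust
/// One field of an object schema (hypothetical type).
struct FieldSpec {
    name: &'static str,
    optional: bool,
}

/// An object schema that remembers whether any field is optional.
struct ObjectSchema {
    fields: Vec<FieldSpec>,
    has_optional: bool,
}

impl ObjectSchema {
    fn new(fields: Vec<FieldSpec>) -> Self {
        let has_optional = fields.iter().any(|f| f.optional);
        ObjectSchema { fields, has_optional }
    }

    /// Validates the list of keys encountered while parsing one object.
    fn check(&self, seen: &[&str]) -> Result<(), String> {
        // Reject unknown and duplicate keys in both paths.
        for (i, key) in seen.iter().enumerate() {
            if !self.fields.iter().any(|f| f.name == *key) {
                return Err(format!("unknown field `{key}`"));
            }
            if seen[..i].contains(key) {
                return Err(format!("duplicate field `{key}`"));
            }
        }
        if !self.has_optional {
            // Fast path: all keys are known and distinct, so equal counts
            // imply that no field is missing.
            return if seen.len() == self.fields.len() {
                Ok(())
            } else {
                Err("missing field".to_string())
            };
        }
        // Slow path: look up every mandatory field individually.
        for field in self.fields.iter().filter(|f| !f.optional) {
            if !seen.contains(&field.name) {
                return Err(format!("missing field `{}`", field.name));
            }
        }
        Ok(())
    }
}

/// Schema matching `PartialObj`: `hello` is required, the rest are optional.
fn partial_obj_schema() -> ObjectSchema {
    ObjectSchema::new(vec![
        FieldSpec { name: "hello", optional: false },
        FieldSpec { name: "coucou", optional: true },
        FieldSpec { name: "coucou1", optional: true },
        FieldSpec { name: "coucou2", optional: true },
    ])
}
```

With this layout, a document containing only `hello` passes, while one containing only `coucou` is rejected because the mandatory `hello` was never seen.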

Simple object

This benchmark deserializes the following JSON document:

{
	"hello": "world"
}

The accepted schema should look like the following:

use std::borrow::Cow;
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct SimpleObj<'a> {
    hello: Cow<'a, str>,
}

The results show that messy_json is slower than the macro-generated deserializer but still faster than serde_json's Value.
