stanford-futuredata/sparser

sparser

This code base implements Sparser, raw filtering for faster analytics over raw data. Sparser can parse JSON, Avro, and Parquet data up to 22x faster than the state of the art. For more details, check out our paper published at VLDB 2018.

See the demo-repl directory for a brief example. To run it:

# update rapidjson submodule
git submodule init
git submodule update
cd demo-repl
make
./bench /path/to/large/file.json

Then enter 1 at the Sparser> prompt.

Sparser itself is just a header file and depends only on standard C libraries available on most systems.
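To give a feel for what "raw filtering" means, here is a minimal sketch of the general idea, not Sparser's actual API or implementation. It assumes newline-delimited records and a single substring token; the names `raw_filter_pass` and `count_candidates` are hypothetical. A cheap byte-level search discards records that cannot match, so only the survivors would need to go to a full parser. The filter may let through false positives, but it never drops a record that contains the token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Sparser): returns 1 if the raw record
 * contains `token` somewhere in its bytes, 0 if it definitely does not. */
static int raw_filter_pass(const char *record, const char *token) {
    assert(record != NULL && token != NULL);
    return strstr(record, token) != NULL;
}

/* Hypothetical driver: walks newline-delimited records in `buf` and counts
 * those that pass the raw filter. Newlines are overwritten with NUL bytes,
 * so `buf` must be writable. */
static size_t count_candidates(char *buf, const char *token) {
    size_t passed = 0;
    char *line = buf;
    while (line != NULL && *line != '\0') {
        char *end = strchr(line, '\n');
        if (end != NULL) {
            *end = '\0';               /* terminate the current record */
        }
        if (raw_filter_pass(line, token)) {
            passed++;                  /* a real system would fully parse this record here */
        }
        line = (end != NULL) ? end + 1 : NULL;
    }
    return passed;
}
```

For example, searching three JSON records for the token `"en` would pass a record with `"lang":"en"` and also one with `"name":"english"`. The second is a false positive that the full parser would later reject.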

About

Sparser: Raw Filtering for Faster Analytics over Raw Data
