ChronoWave Search

ChronoWave is a distributed, schema-agnostic search and analytics solution for append-only, machine-generated data. It is built on a self-compressed full-text search engine capable of handling ad hoc queries and machine learning workloads. It aims to help agile businesses bring applications to market faster while lowering the cost of development and operation.

ChronoWave can be used for:

  • distributed tracing
  • logging and log analytics
  • infrastructure and application performance monitoring
  • machine learning on IOT data
  • data archiving
  • UTF-8 encoded multi-byte text document search, e.g. fluent search on Chinese, Korean and Japanese characters.

Design Goal: Towards Simple, Small and Performant

ChronoWave batches the input data stream into segments (or blocks) and transforms the semi-structured data into columnar formats. It then builds a self-compressed index, producing a succinct index data structure.
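
The idea of turning a batch of semi-structured documents into columns can be sketched as follows. This is a minimal illustration under stated assumptions, not ChronoWave's implementation: `Cell`, `flatten` and `buildSegment` are hypothetical names, values are kept as plain strings, and the compression and indexing steps are omitted. Each leaf value is stored under its JSON path, so documents with different shapes can share one segment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Cell is one value in a column, tagged with the row (document) it came from.
type Cell struct {
	Row   int
	Value string
}

// flatten walks a decoded JSON value and records every leaf under its path,
// e.g. "/tags/0/key".
func flatten(path string, v interface{}, out map[string]string) {
	switch t := v.(type) {
	case map[string]interface{}:
		for k, child := range t {
			flatten(path+"/"+k, child, out)
		}
	case []interface{}:
		for i, child := range t {
			flatten(path+"/"+strconv.Itoa(i), child, out)
		}
	default:
		out[path] = fmt.Sprint(t)
	}
}

// buildSegment converts a batch of JSON documents into path-keyed columns.
// No schema is declared up front: a column appears when its path is first seen.
func buildSegment(docs []string) (map[string][]Cell, error) {
	cols := make(map[string][]Cell)
	for row, doc := range docs {
		dec := json.NewDecoder(strings.NewReader(doc))
		dec.UseNumber() // keep large integers such as timestamps exact
		var v interface{}
		if err := dec.Decode(&v); err != nil {
			return nil, fmt.Errorf("row %d: %w", row, err)
		}
		leaves := make(map[string]string)
		flatten("", v, leaves)
		for path, val := range leaves {
			cols[path] = append(cols[path], Cell{Row: row, Value: val})
		}
	}
	return cols, nil
}

func main() {
	docs := []string{
		`{"traceID":"464382d9a88849ff","startTime":1601613777130370}`,
		`{"traceID":"464382d9a88849ff","tags":[{"key":"http.url"}]}`,
	}
	cols, err := buildSegment(docs)
	if err != nil {
		panic(err)
	}
	paths := make([]string, 0, len(cols))
	for p := range cols {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Println(p, cols[p])
	}
}
```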

Simple to develop, use and operate. Machine-generated data comes in different shapes and sizes. ChronoWave simplifies the development cycle by supporting schema-agnostic, semi-structured data. Operation is also fairly easy: ChronoWave is engineered on a succinct, self-compressed index data structure, the basis of a shared-nothing architecture that can scale out under load.

ChronoWave uses only a single copy of the machine-generated data to support major use cases such as analytics, AI and real-time monitoring. The self-compressed index is only a fraction of the size of the original data. ChronoWave requires only the index to filter or extract information, or to restore the entire data set, using SSQL (Semi Structured Query Language).

ChronoWave transforms semi-structured data into columnar formats and leverages modern CPU vector instructions to execute queries and full-text search.

ChronoWave In Action

SSQL

command line example

  1. build command line executable
cd cmd/waverider
go build

sample.json contains distributed trace data generated by the HotRod app and captured in Jaeger. Only part of the data is listed.

{
   "tags" : [
      {
         "type" : "string",
         "key" : "http.url",
         "value" : "http://0.0.0.0:8083/route?dropoff=577%2C322&pickup=516%2C208"
      }
   ],
   "startTime" : 1601613777130370,
   "spanID" : "2ef6e3c30af421ea",
   "traceID" : "464382d9a88849ff"
}
  2. construct index: /startTime is required as the time partition; /traceID and /spanID are optional and will be used later for key/value lookups
./waverider index ./testdata/sample.json -d data -t '/startTime' -k '/traceID' -k '/spanID'
  3. query data: timeframe is an SSQL keyword that tells ChronoWave to search data within the given time range. ChronoWave supports partial-word and wildcard full-text search.
./waverider query -d data 'find $log where [$log /logs][/startTime timeframe(1601613777130350, 1801613777130470)] [/tags [/key contain("http.url")] [/value contain("dropoff*pickup")]]'
  4. key/value lookup: key is an SSQL keyword that tells ChronoWave to look up data by key. The JSON path of the key must be provided when the index is built.
./waverider query -d data 'find $a where [$a /process][/traceID key("464382d9a88849ff")]'

Version

feature Open Source Community Edition
search & analytics
single node / embedded
multi-nodes cluster
label index selection
SIMD / AVX512
License Apache v2.0 coming soon

Questions or Suggestions

Please comment on Gitter

Popular repositories

  1. chronowave (public)

    A schema agnostic data store for append only IOT data

    Go 2

  2. gateway (public)

    Instrument web app with opentelemetry using WebWorker

    Go 1

  3. ext (public)

    external auxiliary functions

    Go

  4. opentelemetry (public)

    Go

  5. xorf (public)

    Forked from ayazhafiz/xorf

    Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters.

    Rust

  6. fbs (public)

    Flatbuffers schema definition files

    Rust