Introduction

Applications that run on unstructured or semi-structured data spend a considerable amount of their execution time parsing that data. Sparser addresses this bottleneck by filtering the raw data before parsing it, so that only records that might satisfy the query are parsed.

There are two key observations that reinforce the idea of filtering:

- High selectivity: the Sparser paper shows that most queries are highly selective, that is, they match only a small fraction of the data. Not having to parse the large portion of data that cannot match can bring a real performance gain.
- Modern hardware: vectorized instructions on modern hardware can be used to make filtering and parsing faster. This observation does not apply to environments like the JVM, where (at least currently) you don't have low-level control of the hardware.
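The idea of filtering before parsing can be sketched as follows. This is a minimal illustration, not Sparser's implementation: `raw_filter`, `query`, and the sample records are hypothetical names and data, and a plain byte-substring search stands in for Sparser's vectorized filters. The raw filter may let through false positives, which the real predicate then rejects after parsing, but it must never drop a record that actually matches.

```python
import json


def raw_filter(line: bytes, needle: bytes) -> bool:
    """Cheap check on the unparsed bytes (hypothetical helper)."""
    return needle in line


def query(lines, needle, predicate):
    """Parse only the lines that pass the raw filter, then apply the real predicate."""
    results = []
    parsed = 0
    for line in lines:
        if not raw_filter(line, needle):
            continue  # skipped without paying the parsing cost
        parsed += 1
        record = json.loads(line)
        if predicate(record):  # rejects false positives from the raw filter
            results.append(record)
    return results, parsed


records = [
    b'{"name": "alice", "city": "Berlin"}',
    b'{"name": "bob", "city": "Paris"}',
    b'{"name": "Berlin", "city": "Rome"}',  # false positive for the raw filter
    b'{"name": "carol", "city": "Oslo"}',
]

matches, parsed_count = query(records, b"Berlin", lambda r: r["city"] == "Berlin")
```

In this example only two of the four records are parsed, and the one whose `name` happens to be "Berlin" is discarded by the predicate after parsing.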

Limitations

Limitations on predicate support:

- Doesn't support equality for data types whose values can be encoded in more than one way. For example, in JSON, numeric equality is not supported because the same number can be written as both "3.4" and "34e-1".
- Doesn't support inequality for string values (???).
- The Key-Value Match filter is only valid for data formats such as JSON, where keys appear explicitly in each record.
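The first limitation can be illustrated with a small sketch, assuming a raw filter that is a plain byte-substring search (the `raw_filter` helper and the `price` field are hypothetical). The two records below hold the same number once parsed, but a filter looking for the bytes `3.4` finds only one of them, so it would wrongly discard a matching record.

```python
import json


def raw_filter(line: bytes, needle: bytes) -> bool:
    """Byte-substring search on the unparsed record (hypothetical helper)."""
    return needle in line


a = b'{"price": 3.4}'
b = b'{"price": 34e-1}'

# After parsing, both encodings yield the same value.
parsed_equal = json.loads(a)["price"] == json.loads(b)["price"]

# On the raw bytes, only the first encoding contains the searched text.
filter_a = raw_filter(a, b"3.4")
filter_b = raw_filter(b, b"3.4")
```

Because the raw filter must not produce false negatives, an equality predicate on such a field cannot be turned into a substring search.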
