New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor JsonQueryFactory so that parameters are not expanded as text in the query string. #146

Merged
merged 2 commits into from Sep 10, 2013

Conversation

Projects
None yet
2 participants
@swallez
Contributor

swallez commented Aug 19, 2013

This allows preserving the actual types (e.g. longs aren't converted to ints, which was a problem for my use case) and speeds things up by avoiding a round trip through a text representation.

swallez added some commits Aug 19, 2013

Refactor JsonQueryFactory so that parameters are not expanded as text…
… in the query string. This allows preserving the actual types (e.g. longs aren't converted to ints) and speeds things up by avoiding a round trip through a text representation.
@bguerout

This comment has been minimized.

Owner

bguerout commented Aug 23, 2013

Sorry for the late response, thanks for the pull-request.
We are going to have a look this week-end :-)

@swallez

This comment has been minimized.

Contributor

swallez commented Aug 24, 2013

No worries for the delay!

A bit more context to what led me to this change: I store time series in a structure that looks like this

{
  start : 134523, // first timestamp
  end : 134532,   // last timestamp
  time : [NumberLong(134523), NumberLong(134532), NumberLong(0), NumberLong(0), ...],
  value : [1.2, 2.3, 0.0, 0.0, ...]
}

The purpose is to have a single document containing a large number of values, and fill it with partial updates of the arrays. This reduces database size by storing values in chunks by time ranges and increases transfer speed. Time series are well suited for that since individual values are rarely read in isolation.

An important point here is that when a new chunk is created, its arrays are pre-filled with typed zeros (long for timestamp and double for values) so that document structure and size don't change over time when values are added. This allows the database to update documents in place since the BSON layout hasn't changed.

This means just a few bytes are updated rather than rewriting the whole document, possibly with a different size that would require to relocate it on disk.

With the current QueryFactory implementation, update instructions are serialized as text before being parsed as json. The consequence is that numbers are converted to the smallest representation that can fit them, e.g. a value of "0.0" will result in an int32 rather than a double, thus changing the document layout.

My refactoring fixes this by avoiding the roundtrip through a textual representation, thus keeping the original number types.

@swallez

This comment has been minimized.

Contributor

swallez commented Sep 9, 2013

Ping! Hey there, did you have a chance to look at the code? ;-)

@bguerout

This comment has been minimized.

Owner

bguerout commented Sep 10, 2013

Hello,
Your pull request is awesome. I've just renamed JsonQuery* into BsonQuery* and renamed $jpParam into $umarshall
I'm going to merge you pullr request this evening.

@swallez

This comment has been minimized.

Contributor

swallez commented Sep 10, 2013

It actually started as BsonQuery in my project to live besides JsonQuery. So it goes back to its original name :-)

Thanks!

@bguerout bguerout merged commit 674f0df into bguerout:master Sep 10, 2013

@bguerout

This comment has been minimized.

Owner

bguerout commented Sep 10, 2013

Previous JsonQuery has been moved here : https://github.com/bguerout/jongo-extras

@swallez

This comment has been minimized.

Contributor

swallez commented Sep 11, 2013

Thanks again!

@bguerout

This comment has been minimized.

Owner

bguerout commented Sep 12, 2013

This pull request has been included into an early release.
This version must be considered as a SNAPSHOT of future 0.5 release

<dependencies>
  <dependency>
    <groupId>org.jongo</groupId>
    <artifactId>jongo</artifactId>
    <version>0.5-early-20130912-1506</version>
  </dependency>
</dependencies>
...
<repositories>
    <repository>
       <id>cloudbees-jongo-early-release</id>
        <url>http://repository-jongo.forge.cloudbees.com/release</url>
        <releases>
            <enabled>true</enabled>
        </releases>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>

@swallez swallez deleted the swallez:bson-query branch Sep 19, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment