Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor JsonQueryFactory so that parameters are not expanded as text in the query string. #146

Merged
merged 2 commits into from
Sep 10, 2013

Conversation

swallez
Copy link
Contributor

@swallez swallez commented Aug 19, 2013

This allows preserving the actual types (e.g. longs aren't converted to ints, which was a problem for my use case) and speeds things up by avoiding a round trip through a text representation.

… in the query string. This allows preserving the actual types (e.g. longs aren't converted to ints) and speeds things up by avoiding a round trip through a text representation.
@bguerout
Copy link
Owner

Sorry for the late response, thanks for the pull-request.
We are going to have a look this week-end :-)

@swallez
Copy link
Contributor Author

swallez commented Aug 24, 2013

No worries for the delay!

A bit more context to what led me to this change: I store time series in a structure that looks like this

{
  start : 134523, // first timestamp
  end : 134532,   // last timestamp
  time : [NumberLong(134523), NumberLong(134532), NumberLong(0), NumberLong(0), ...],
  value : [1.2, 2.3, 0.0, 0.0, ...]
}

The purpose is to have a single document containing a large number of values, and fill it with partial updates of the arrays. This reduces database size by storing values in chunks by time ranges and increases transfer speed. Time series are well suited for that since individual values are rarely read in isolation.

An important point here is that when a new chunk is created, its arrays are pre-filled with typed zeros (long for timestamp and double for values) so that document structure and size don't change over time when values are added. This allows the database to update documents in place since the BSON layout hasn't changed.

This means just a few bytes are updated rather than rewriting the whole document, possibly with a different size that would require to relocate it on disk.

With the current QueryFactory implementation, update instructions are serialized as text before being parsed as json. The consequence is that numbers are converted to the smallest representation that can fit them, e.g. a value of "0.0" will result in an int32 rather than a double, thus changing the document layout.

My refactoring fixes this by avoiding the roundtrip through a textual representation, thus keeping the original number types.

@swallez
Copy link
Contributor Author

swallez commented Sep 9, 2013

Ping! Hey there, did you have a chance to look at the code? ;-)

@bguerout
Copy link
Owner

Hello,
Your pull request is awesome. I've just renamed JsonQuery* into BsonQuery* and renamed $jpParam into $umarshall
I'm going to merge you pullr request this evening.

@swallez
Copy link
Contributor Author

swallez commented Sep 10, 2013

It actually started as BsonQuery in my project to live besides JsonQuery. So it goes back to its original name :-)

Thanks!

@bguerout bguerout merged commit 674f0df into bguerout:master Sep 10, 2013
@bguerout
Copy link
Owner

Previous JsonQuery has been moved here : https://github.com/bguerout/jongo-extras

@swallez
Copy link
Contributor Author

swallez commented Sep 11, 2013

Thanks again!

@bguerout
Copy link
Owner

This pull request has been included into an early release.
This version must be considered as a SNAPSHOT of future 0.5 release

<dependencies>
  <dependency>
    <groupId>org.jongo</groupId>
    <artifactId>jongo</artifactId>
    <version>0.5-early-20130912-1506</version>
  </dependency>
</dependencies>
...
<repositories>
    <repository>
       <id>cloudbees-jongo-early-release</id>
        <url>http://repository-jongo.forge.cloudbees.com/release</url>
        <releases>
            <enabled>true</enabled>
        </releases>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>

@swallez swallez deleted the bson-query branch September 19, 2013 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants