Chunkstore performance enhancements (pandas-dev#182)
- New serializer for ChunkStore
- Supports column-wise serialization
- Significantly faster than the record serializer for this use case
- Supports DataFrames and Series only
- Changes to chunker that boost performance
- Ability to read subset of columns
- Also fixes pandas-dev#164
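
To illustrate the idea behind column-wise serialization and reading a subset of columns, here is a minimal sketch. It is not the serializer added in this commit: the function names `serialize_by_column` and `deserialize_columns` are hypothetical, and it uses plain dicts with `json` and `zlib` rather than pandas objects. The point is only that when each column is stored as its own blob, a reader can decode just the columns it asks for.

```python
import json
import zlib


def serialize_by_column(table):
    """Hypothetical helper: compress each column into its own blob."""
    return {
        name: zlib.compress(json.dumps(values).encode("utf-8"))
        for name, values in table.items()
    }


def deserialize_columns(blobs, columns=None):
    """Hypothetical helper: decode only the requested columns.

    When columns is None, every stored column is decoded.
    """
    wanted = list(blobs) if columns is None else columns
    return {
        name: json.loads(zlib.decompress(blobs[name]).decode("utf-8"))
        for name in wanted
    }


table = {"price": [1.5, 2.5, 3.5], "volume": [10, 20, 30]}
blobs = serialize_by_column(table)

# Only the "price" blob is decompressed and parsed here.
subset = deserialize_columns(blobs, columns=["price"])
```

A record-oriented layout would have to decode every row in full before a column could be dropped, which is one plausible reason a per-column layout helps in this use case.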
bmoscon committed Aug 2, 2016
1 parent 5ffd781 commit 7624809
Showing 13 changed files with 761 additions and 440 deletions.
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -1,5 +1,10 @@
## Changelog

### 1.27

* Bugfix: #187 Compatibility with latest version of pytest-dbfixtures
* Feature: #182 Improve ChunkStore read/write performance

### 1.26 (2016-07-20)

* Bugfix: Faster TickStore querying for multiple symbols simultaneously
16 changes: 16 additions & 0 deletions arctic/chunkstore/_chunker.py
@@ -1,3 +1,7 @@
START = 's'
END = 'e'


class Chunker(object):

    def to_chunks(self, data, *args, **kwargs):
@@ -60,3 +64,15 @@ def exclude(self, data, range_obj):
        data, filtered by range_obj
        """
        raise NotImplementedError

    def chunk_to_str(self, chunk_id):
        """
        Converts parts of a chunk range (start or end) to a string. These
        chunk ids/indexes/markers are produced by to_chunks.
        (See to_chunks)

        returns
        -------
        string
        """
        raise NotImplementedError
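
As a rough sketch of how a concrete chunker might satisfy the new `chunk_to_str` hook, the example below defines a stand-in base class and a hypothetical `IsoDateChunker` subclass. Both the subclass and the use of `START` and `END` as dictionary keys for a chunk range are assumptions made for illustration; the diff shows only the constants and the abstract method, not how they are used elsewhere.

```python
from datetime import date

START = 's'
END = 'e'


class Chunker(object):
    """Stand-in for the abstract base class shown in the diff."""

    def chunk_to_str(self, chunk_id):
        raise NotImplementedError


class IsoDateChunker(Chunker):
    """Hypothetical subclass whose chunk markers are datetime.date objects."""

    def chunk_to_str(self, chunk_id):
        # ISO 8601 strings sort in the same order as the dates they encode.
        return chunk_id.isoformat()


chunker = IsoDateChunker()

# Assumed shape of a chunk range: start and end markers keyed by START/END.
chunk_range = {START: date(2016, 8, 1), END: date(2016, 8, 2)}
labels = {key: chunker.chunk_to_str(value) for key, value in chunk_range.items()}
```

Keeping the conversion on the chunker means each chunker decides how its own markers are rendered as strings, so callers do not need to know the marker type.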