
Can I use the columnar store extension with timescales #89

Closed
tr8dr opened this issue Jun 9, 2017 · 5 comments
tr8dr commented Jun 9, 2017

I have "wide" tables where some analytical queries would greatly benefit from columnar storage and other efficiencies. Can TimescaleDB work with a columnar extension? For example:

https://github.com/citusdata/cstore_fdw
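For reference, cstore_fdw exposes columnar storage through Postgres's foreign data wrapper interface. A minimal sketch of its setup (table and column names here are illustrative, not from this thread):

```sql
-- Load the extension and create a foreign server backed by cstore_fdw
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- A "wide" table stored column-by-column, with compression enabled
CREATE FOREIGN TABLE ticks_columnar (
    time   timestamptz,
    symbol text,
    bid    double precision,
    ask    double precision
)
SERVER cstore_server
OPTIONS (compression 'pglz');
```

Because the data lives behind the FDW rather than in heap tables, it does not pass through TimescaleDB's chunking machinery.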

@stalltron

The answer at this point in time is no. The data partitioning (chunking) TimescaleDB uses is optimized for indexing the data, so that queries remain performant across large volumes of data, especially as they increase in complexity. With a columnar store you lose almost all indexing (e.g., there is no B-tree support at all), so it doesn't make sense to combine the two models given our design decisions. We've had some internal engineering discussions about ideas for columnar storage, but it is not on any shorter-term roadmap.
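For contrast, the hypertable model leans on ordinary B-tree indexes over time-partitioned chunks. A minimal sketch (names are illustrative):

```sql
CREATE TABLE conditions (
    time        timestamptz NOT NULL,
    device_id   text,
    temperature double precision
);

-- Partition the table into time-based chunks; queries can then
-- prune chunks by time and use per-chunk B-tree indexes.
SELECT create_hypertable('conditions', 'time');
CREATE INDEX ON conditions (device_id, time DESC);
```

It is this reliance on per-chunk indexes that a pure columnar layout would forgo.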

@akulkarni
Member

Also, if you feel comfortable sharing the general structure of the data you are storing (and the relevant queries), we can take a closer look and suggest how best to store that data in Timescale.

@tr8dr
Author

tr8dr commented Jun 10, 2017

I recognize that columnar storage is poor for certain workloads and better for others. My main issue is the cost of table scans when a column-narrow ad-hoc query cannot be resolved by an index.

I suspect the biggest win for my sort of queries would be the ability to apply parallel disk reads plus filtering (in this case on one server with a 10-way disk array and many cores). This would be similar, without the specialized hardware, to what Netezza does: parallel reads, with filtering pushed down to whatever part of a query can run on a chunk of data, on each tightly coupled CPU <-> disk pair.

At the moment, short of creating numerous indices across many columns, some queries will involve a linear table scan. A linear scan can work reasonably well with parallelism.
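Stock PostgreSQL (9.6 and later) can already parallelize sequential scans across worker processes, which is one way to get part of this behavior without columnar storage. A sketch, assuming a hypothetical `ticks` table:

```sql
-- Allow up to 8 workers per Gather node for this session
SET max_parallel_workers_per_gather = 8;

-- On a sufficiently large table, the planner can choose a
-- parallel sequential scan for an unindexed filter + aggregate
EXPLAIN (COSTS OFF)
SELECT avg(bid) FROM ticks WHERE symbol = 'AAPL';
-- With enough data, the plan contains a Gather node
-- over a Parallel Seq Scan on ticks.
```

Whether the filter is actually pushed to each worker, and how well I/O parallelizes across a disk array, depends on the storage layout and planner costing.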

@mfreed
Member

mfreed commented Jul 6, 2017

Hi @tr8dr, sorry for the delay in responding.

One of the lesser-advertised features in our recent 0.1.0 release is the ability to associate multiple Postgres tablespaces with a single hypertable, so that this single "table" can reside across multiple disks, and chunks belonging to the hypertable can then be queried in parallel.

Better documentation is forthcoming for the new attach_tablespace() API command, but if you are interested in the details:

71c5e78
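Based on the description above, usage looks roughly like the following sketch (tablespace names and paths are illustrative, and the exact `attach_tablespace()` signature may have changed since 0.1.0):

```sql
-- One tablespace per physical disk
CREATE TABLESPACE disk1 LOCATION '/mnt/disk1/pgdata';
CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

-- Spread the hypertable's chunks across the attached tablespaces,
-- so chunk scans can proceed in parallel on separate disks
SELECT attach_tablespace('disk1', 'conditions');
SELECT attach_tablespace('disk2', 'conditions');
```

New chunks are then placed across the attached tablespaces rather than all on the default one.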

@mfreed
Member

mfreed commented Aug 15, 2017

@tr8dr I'm going to close out this issue unless there's anything else?

@mfreed mfreed closed this as completed Sep 20, 2017
syvb pushed a commit to syvb/timescaledb that referenced this issue Sep 8, 2022
89: Adding compound aggregate for uddsketch r=WireBaron a=WireBaron

This change also fixes some errors in the udd sketch combining code.

Co-authored-by: Brian Rowe <brian@timescale.com>