Releases: chdb-io/chdb

v1.0.0rc2

20 Nov 14:28
Pre-release

What's Changed

Full Changelog: v1.0.0rc1...v1.0.0rc2

v1.0.0rc1

17 Nov 04:50
Pre-release

What's Changed

  • Fix changing default database by USE in session mode by @auxten in #133

Full Changelog: v0.16.0rc2...v1.0.0rc1

v0.16.0rc2

11 Nov 13:25
Pre-release

What's Changed

Full Changelog: v0.16.0rc1...v0.16.0rc2

v0.16.0rc1

10 Nov 10:39
Pre-release

chdb Release Summary

chdb 0.16 is based on ClickHouse 23.10.

Query Enhancements

  • Vector Addition:

    • python3 -m chdb "SELECT [1, 2, 3] + [4, 5, 6]"
  • Omit file() Function:

    • python3 -m chdb "SELECT * from '/home/Clickhouse/bench/hits_0.parquet' limit 10"
  • NumPy as Input Format:

    • NumPy files are supported as an input format, for example SELECT * FROM 'data.npy'.
  • Parquet Optimizations:

    • Writing Parquet files is now multi-threaded and about 10x faster, almost the same speed as reading.
    • Parquet filter pushdown: when reading Parquet files, row groups (chunks of the file) are skipped based on the WHERE condition and the min/max values in each column.
    • Small row groups are batched together when reading Parquet files.
  • Condition Pushdown for ORC:

    • Reading ORC files now uses data skipping indices, similarly to Parquet.
  • PRQL Support:

    • Added support for PRQL as a query language.
  • urlCluster Function:

    • Added the urlCluster table function.
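
To illustrate the Parquet filter pushdown described above, here is a minimal pure-Python sketch of min/max row-group skipping. It is not chdb's or ClickHouse's implementation: RowGroup and scan_greater_than are hypothetical names, and each row group is modeled as a plain list with precomputed statistics for a single column.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group with one column's statistics."""
    min_value: int
    max_value: int
    rows: list


def scan_greater_than(row_groups, threshold):
    """Evaluate `WHERE value > threshold`, skipping row groups that cannot match.

    Returns the matching values and the number of row groups skipped.
    """
    matches, skipped = [], 0
    for group in row_groups:
        # If even the largest value fails the predicate, no row in the group can match.
        if group.max_value <= threshold:
            skipped += 1
            continue
        matches.extend(v for v in group.rows if v > threshold)
    return matches, skipped


groups = [
    RowGroup(1, 10, [1, 5, 10]),
    RowGroup(11, 20, [11, 15, 20]),
    RowGroup(21, 30, [21, 25, 30]),
]
matches, skipped = scan_greater_than(groups, 18)
print(matches, skipped)
```

Only the statistics of the first row group are inspected; its rows are never read. The benefit grows with the number of row groups whose min/max range falls outside the WHERE condition.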

New Features

  • Introduced arrayFold, which applies a lambda function to one or more arrays of equal cardinality and collects the result in an accumulator.
  • Extended support for asynchronous inserts with external data via the native protocol.
  • Introduced function jsonMergePatch for merging JSON strings.
  • Continued support for Kusto Query Language dialect with Phase 1 implementation.
  • Introduced a new SQL function arrayRandomSample for sampling elements from an input array.
  • Added support for dropping cache for Protobuf format with SYSTEM DROP SCHEMA FORMAT CACHE [FOR Protobuf].
  • If a table has a space-filling curve in its key, conditions on the curve's arguments can now be used for indexing.
  • New setting force_optimize_projection_name takes a projection name and checks that this projection is used in the query.
  • Added aggregation function lttb using the Largest-Triangle-Three-Buckets algorithm for downsampling data.
  • CHECK TABLE query has better performance and usability, supporting checking particular parts.
  • Introduced function byteSwap for reversing the bytes of unsigned integers.
  • Added functions formatQuery and formatQuerySingleLine for formatted SQL query output.
  • Introduced DWARF input format for reading debug symbols from an ELF file.
  • Introduced SHOW SETTING setting_name as a simpler version of SHOW SETTINGS.
  • Added fields substreams and filenames to the system.parts_columns table.
  • Introduced a setting create_table_empty_primary_key_by_default for default ORDER BY ().
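
The lttb aggregate function listed above is named after the Largest-Triangle-Three-Buckets algorithm. As a rough illustration of what that algorithm does, here is a minimal pure-Python sketch of the generic, textbook version. It is not ClickHouse's implementation, and the function name and signature below are assumptions made for this example; input points are assumed to be (x, y) pairs sorted by x.

```python
def lttb(points, threshold):
    """Downsample (x, y) points, sorted by x, to `threshold` points using LTTB."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)  # nothing to downsample

    # The first and last points are always kept; the rest are split into buckets.
    bucket_size = (n - 2) / (threshold - 2)
    sampled = [points[0]]
    prev = 0  # index of the most recently selected point

    for i in range(threshold - 2):
        # Average of the next bucket (for the final bucket this is the last point).
        nxt_start = int((i + 1) * bucket_size) + 1
        nxt_end = min(int((i + 2) * bucket_size) + 1, n)
        nxt = points[nxt_start:nxt_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        cur_start = int(i * bucket_size) + 1
        cur_end = int((i + 1) * bucket_size) + 1
        ax, ay = points[prev]

        def area(j):
            # Twice the area of the triangle (previous pick, candidate, next average).
            px, py = points[j]
            return abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay))

        # Keep the candidate in the current bucket that forms the largest triangle.
        prev = max(range(cur_start, cur_end), key=area)
        sampled.append(points[prev])

    sampled.append(points[-1])
    return sampled


series = [(float(i), 0.0) for i in range(20)]
series[7] = (7.0, 100.0)  # a single spike in otherwise flat data
print(lttb(series, 5))
```

Because each bucket keeps the point that forms the largest triangle with its neighbors, the spike at x = 7 survives the reduction from 20 points to 5, which is the property that makes this algorithm useful for downsampling data for plots.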

Performance Improvements

  • Fixed contention on Context lock, significantly improving performance for short-running concurrent queries.
  • Improved the performance of inverted index creation by 30%.
  • Optimized memory consumption for external aggregation with many temporary files.
  • Added option query_plan_preserve_num_streams_after_window_functions to preserve the number of streams after evaluating window functions.
  • Released more streams if data is small, optimizing resource usage.
  • Optimized RoaringBitmaps before serialization.
  • Optimized inverted index posting lists to use the smallest possible representation.
  • Set a reasonable size for the marks cache for secondary indices by default.
  • Avoided unnecessary reconstruction of index granules when reading skip indexes.
  • Cached CAST function in set during execution to improve the performance of function IN when set element type doesn't match column type.
  • Improved write performance to EmbeddedRocksDB tables.
  • Improved overall resilience for ClickHouse in case of many parts within a partition.
  • Reduced memory consumption during loading of hierarchical dictionaries.
  • All dictionaries now support the setting dictionary_use_async_executor.
  • Prevented excessive memory usage when deserializing AggregateFunctionTopKGenericData.
  • Reduced CPU consumption for AsyncMetrics threads on a Keeper with lots of watches.
  • Experimental inverted indexes now do not store tokens with too many matches, saving space.
  • Improved write performance to hierarchical dictionaries.

v0.15.0

01 Nov 09:45
0fe3f98

What's Changed

Full Changelog: v0.14.2...v0.15.0

v0.14.2

06 Sep 07:42
92024bd

Keep debug info in shared lib

v0.14.1

04 Sep 13:54

What's Changed

Full Changelog: v0.13.0...v0.14.0

v0.14.0

04 Sep 09:36
409350e

What's Changed

Full Changelog: v0.13.0...v0.14.0

v0.13.0

17 Aug 11:04

What's Changed

Full Changelog: v0.12.0...v0.13.0

v0.12.0

17 Aug 06:12
c330249

What's Changed

  • Query on multiple Pandas DataFrame by @auxten in #89

Full Changelog: https://github.com/chdb-io/chdb/commits/v0.12.0