Releases · StarRocks/starrocks

20 Dec 03:57

bearmmx

3.1.6

fcc0c6b

3.1.6

Release date: December 18, 2023

New Features

Added the now(p) function to return the current date and time with the specified fractional seconds precision (accurate to the microsecond). If p is not specified, this function returns only date and time accurate to the second. #36676
Added a new metric max_tablet_rowset_num for setting the maximum allowed number of rowsets. This metric helps detect possible compaction issues and thus reduces the occurrences of the error "too many versions". #36539
Supports obtaining heap profiles by using a command line tool, making troubleshooting easier. #35322
Supports creating asynchronous materialized views with common table expressions (CTEs). #36142
Added the following bitmap functions: subdivide_bitmap, bitmap_from_binary, and bitmap_to_binary. #35817 #35621
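
As a rough illustration of the precision argument of now(p) described above, the following Python sketch formats a timestamp with p fractional-second digits. It only mimics the described behavior and is not StarRocks code: the helper name format_now is hypothetical, and truncating (rather than rounding) the extra digits is an assumption that the release note does not state.

```python
from datetime import datetime
from typing import Optional


def format_now(ts: datetime, p: Optional[int] = None) -> str:
    """Format ts with p fractional-second digits (0 to 6).

    Hypothetical helper: when p is omitted, only second-level precision
    is returned, mirroring the description of now() without an argument.
    """
    base = ts.strftime("%Y-%m-%d %H:%M:%S")
    if p is None or p == 0:
        return base
    if not 0 < p <= 6:
        raise ValueError("precision must be between 0 and 6")
    # Microseconds are zero-padded to 6 digits, then cut to p digits
    # (truncation is an assumption of this sketch).
    micros = f"{ts.microsecond:06d}"
    return f"{base}.{micros[:p]}"


if __name__ == "__main__":
    sample = datetime(2023, 12, 18, 10, 30, 45, 123456)
    print(format_now(sample))      # second-level precision
    print(format_now(sample, 3))   # millisecond precision
    print(format_now(sample, 6))   # microsecond precision
```

The upper bound of 6 reflects the microsecond accuracy mentioned in the note.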

Parameter Changes

The FE dynamic parameter enable_new_publish_mechanism is changed to a static parameter. You must restart the FE after you modify the parameter settings. #35338
The default retention period of trash files is changed to 1 day from the original 3 days. #37113
A new FE configuration item routine_load_unstable_threshold_second is added. #36222
A new BE configuration item enable_stream_load_verbose_log is added. The default value is false. With this parameter set to true, StarRocks can record the HTTP requests and responses for Stream Load jobs, making troubleshooting easier. #36113
A new BE configuration item enable_lazy_delta_column_compaction is added. The default value is true, indicating that StarRocks does not perform frequent compaction operations on delta columns. #36654

Improvements

A new value option GROUP_CONCAT_LEGACY is added to the session variable sql_mode to provide compatibility with the implementation logic of the group_concat function in versions earlier than v2.5. #36150
The Primary Key table size returned by the SHOW DATA statement includes the sizes of .cols files (these are files related to partial column updates and generated columns) and persistent index files. #34898
Queries on MySQL external tables and the external tables within JDBC catalogs support including keywords in the WHERE clause. #35917
Plugin loading failures no longer cause an error or an FE start failure. Instead, the FE starts properly, and the error status of the plugin can be queried using SHOW PLUGINS. #36566
Dynamic partitioning supports random distribution. #35513
The result returned by the SHOW ROUTINE LOAD statement now includes the timestamps of consumption messages from each partition. #36222
The result returned by the SHOW ROUTINE LOAD statement provides a new field OtherMsg, which shows information about the last failed task. #35806
The authentication information aws.s3.access_key and aws.s3.access_secret for AWS S3 in Broker Load jobs are hidden in audit logs. #36571
The be_tablets view in the information_schema database provides a new field INDEX_DISK, which records the disk usage (measured in bytes) of persistent indexes. #35615

Bug Fixes

Fixed the following issues:

The BEs crash if users create persistent indexes in the event of data corruption. #30841
If users create an asynchronous materialized view that contains nested queries, the error "resolve partition column failed" is reported. #26078
If users create an asynchronous materialized view on a base table whose data is corrupted, the error "Unexpected exception: null" is reported. #30038
If users run a query that contains a window function, the SQL error "[1064] [42000]: Row count of const column reach limit: 4294967296" is reported. #33561
The FE performance plunges after the FE configuration item enable_collect_query_detail_info is set to true. #35945
In the StarRocks shared-data mode, the error "Reduce your request rate" may be reported when users attempt to delete files from object storage. #35566
Deadlocks may occur when users refresh materialized views. #35736
After the DISTINCT window operator pushdown feature is enabled, errors are reported if SELECT DISTINCT operations are performed on the complex expressions of the columns computed by window functions. #36357
The BEs crash if the source data file is in ORC format and contains nested arrays. #36127
Some S3-compatible object storage returns duplicate files, causing the BEs to crash. #36103

26 Dec 09:01

bearmmx

3.2.0

758f3d7

3.2.0

Release date: December 1, 2023

New Features

Shared-data cluster

Supports persisting indexes of Primary Key tables to local disks.
Supports even distribution of Data Cache among multiple local disks.

Materialized View

Asynchronous materialized view

The Query Dump file can include information of asynchronous materialized views.
The Spill to Disk feature is enabled by default for the refresh tasks of asynchronous materialized views, reducing memory consumption.

Data Lake Analytics

Supports creating and dropping databases and managed tables in Hive catalogs, and supports exporting data to Hive's managed tables using INSERT or INSERT OVERWRITE.
Supports Unified Catalog, with which users can access different table formats (Hive, Iceberg, Hudi, and Delta Lake) that share a common metastore like Hive metastore or AWS Glue.
Supports collecting statistics of Hive and Iceberg tables using ANALYZE TABLE, and storing the statistics in StarRocks, thus facilitating optimization of query plans and accelerating subsequent queries.
Supports Information Schema for external tables, providing additional convenience for interactions between external systems (such as BI tools) and StarRocks.

Storage engine, data ingestion, and export

Added the following features of loading with the table function FILES():
- Loading Parquet and ORC format data from Azure or GCP.
- Extracting the value of a key/value pair from the file path as the value of a column using the parameter columns_from_path.
- Loading complex data types including ARRAY, JSON, MAP, and STRUCT.
Supports unloading data from StarRocks to Parquet-formatted files stored in AWS S3 or HDFS by using INSERT INTO FILES. For detailed instructions, see Unload data using INSERT INTO FILES.
Supports manual optimization of table structure and data distribution strategy used in an existing table to optimize the query and loading performance. You can set a new bucket key, bucket number, or sort key for a table. You can also set a different bucket number for specific partitions.
Supports continuous data loading from AWS S3 or HDFS using the PIPE method.
- When PIPE detects new or modified data in a remote storage directory, it can automatically load that data into the destination table in StarRocks. While loading data, PIPE automatically splits a large loading task into smaller, serialized tasks, enhancing stability in large-scale data ingestion scenarios and reducing the cost of error retries.
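
The idea of splitting one large load into smaller, serialized tasks can be sketched in Python as follows. This is a minimal illustration, not the PIPE implementation: the function names, the file-count-based batch size, and the retry policy are all assumptions made for the example. It shows why the cost of an error retry shrinks: only the failed batch is retried, not the whole load.

```python
from typing import Callable, List


def split_into_tasks(files: List[str], max_files_per_task: int) -> List[List[str]]:
    """Split a list of files into consecutive batches (hypothetical policy)."""
    if max_files_per_task <= 0:
        raise ValueError("max_files_per_task must be positive")
    return [
        files[i:i + max_files_per_task]
        for i in range(0, len(files), max_files_per_task)
    ]


def run_serially(
    tasks: List[List[str]],
    load: Callable[[List[str]], None],
    max_retries: int = 1,
) -> List[str]:
    """Run the batches one after another, retrying only the batch that failed."""
    loaded: List[str] = []
    for task in tasks:
        for attempt in range(max_retries + 1):
            try:
                load(task)
                loaded.extend(task)
                break
            except RuntimeError:
                # Give up only after the last allowed attempt for this batch.
                if attempt == max_retries:
                    raise
    return loaded
```

With five files and a batch size of two, a transient failure in the second batch causes one extra call for that batch only, while the first batch is never reloaded.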

Query

Supports HTTP SQL API, enabling users to access StarRocks data via HTTP and execute SELECT, SHOW, EXPLAIN, or KILL operations.
Supports Runtime Profile and text-based Profile analysis commands (SHOW PROFILELIST, ANALYZE PROFILE, EXPLAIN ANALYZE) to allow users to directly analyze profiles via MySQL clients, facilitating bottleneck identification and discovery of optimization opportunities.

SQL reference

Added the following functions:

String functions: substring_index, url_extract_parameter, url_encode, url_decode, and translate
Date functions: dayofweek_iso, week_iso, quarters_add, quarters_sub, milliseconds_add, milliseconds_sub, date_diff, jodatime_format, str_to_jodatime, to_iso8601, to_tera_date, and to_tera_timestamp
Pattern matching function: regexp_extract_all
Hash function: xx_hash3_64
Aggregate function: approx_top_k
Window functions: cume_dist, percent_rank, and session_number
Utility functions: dict_mapping and get_query_profile

Privileges and security

StarRocks supports access control through Apache Ranger, providing a higher level of data security and allowing the reuse of existing services of external data sources. After integrating with Apache Ranger, StarRocks enables the following access control methods:

When accessing internal tables, external tables, or other objects in StarRocks, access control can be enforced based on the access policies configured for the StarRocks Service in Ranger.
When accessing an external catalog, access control can also leverage the corresponding Ranger service of the original data source (such as Hive Service) to control access (currently, access control for exporting data to Hive is not yet supported).

For more information, see Manage permissions with Apache Ranger.

Improvements

Data Lake Analytics

Optimized ORC Reader:
- Optimized the ORC Column Reader, resulting in nearly a two-fold performance improvement for VARCHAR and CHAR data reading.
- Optimized the decompression performance of ORC files in Zlib compression format.
Optimized Parquet Reader:
- Supports adaptive I/O merging, allowing adaptive merging of columns with and without predicates based on filtering effects, thus reducing I/O.
- Optimized Dict Filter for faster predicate rewriting. Supports STRUCT sub-columns, and on-demand dictionary column decoding.
- Optimized Dict Decode performance.
- Optimized late materialization performance.
- Supports caching file footers to avoid repeated computation overhead.
- Supports decompression of Parquet files in lzo compression format.
Optimized CSV Reader:
- Optimized the Reader performance.
- Supports decompression of CSV files in Snappy and lzo compression formats.
Optimized the performance of the count calculation.
Optimized Iceberg Catalog capabilities:
- Supports collecting column statistics from Manifest files to accelerate queries.
- Supports collecting NDV (number of distinct values) from Puffin files to accelerate queries.
- Supports partition pruning.
- Reduced Iceberg metadata memory consumption to enhance stability in scenarios with large metadata volume or high query concurrency.

Materialized View

Asynchronous materialized view

Supports automatic refresh for an asynchronous materialized view created upon views or materialized views when schema changes occur on the views, materialized views, or their base tables.
Data consistency:
- Added the property query_rewrite_consistency for asynchronous materialized view creation. This property defines the query rewrite rules based on the consistency check.
- Added the property force_external_table_query_rewrite for external catalog-based asynchronous materialized view creation. This property defines whether to allow forced query rewrite for asynchronous materialized views created upon external catalogs.
- For detailed information, see CREATE MATERIALIZED VIEW.
Added a consistency check for materialized views' partitioning key.
- When users create an asynchronous materialized view with window functions that include a PARTITION BY expression, the partitioning column of the window function must match that of the materialized view.

Storage engine, data ingestion, and export

Optimized the persistent index for Primary Key tables by improving memory usage logic while reducing I/O read and write amplification. #24875 #27577 #28769
Supports data re-distribution across local disks for Primary Key tables.
Partitioned tables support automatic cooldown based on the partition time range and cooldown time. Compared to the original cooldown logic, it is more convenient to perform hot and cold data management on the partition level. For more information, see Specify initial storage medium, automatic storage cooldown time, replica number.
The Publish phase of a load job that writes data into a Primary Key table is changed from asynchronous mode to synchronous mode. As such, the data loaded can be queried immediately after the load job finishes. For more information, see enable_sync_publish.
Supports Fast Schema Evolution, which is controlled by the table property fast_schema_evolution. After this feature is enabled, the execution efficiency of adding or dropping columns is significantly improved. This mode is disabled by default (default value: false). You cannot modify this property for existing tables using ALTER TABLE.
Supports dynamically adjusting the number of tablets to create according to cluster information and the size of the data for Duplicate Key tables created with the Random Bucketing strategy.

Query

Optimized StarRocks' compatibility with Metabase and Superset. Supports integrating them with external catalogs.

SQL Reference

array_agg supports the keyword DISTINCT.
INSERT, UPDATE, and DELETE operations now support SET_VAR. #35283

Others

Added the session variable `large_decimal_underlying_type = "p...

13 Dec 06:32

bearmmx

2.5.16

8ddf3c7

2.5.16

Release date: December 1, 2023

Bug Fixes

Fixed the following issues:

Global Runtime Filter may cause BEs to crash in certain scenarios. #35776

13 Dec 06:25

bearmmx

2.5.15

8b17325

2.5.15

Release date: November 29, 2023

Improvements

Added slow request logs to track slow requests. #33908
Optimized the performance of using Spark Load to read Parquet and ORC files when there are a large number of files. #34787
Optimized the performance of some Bitmap-related operations, including:
- Optimized nested loop joins. #34804 #35003
- Optimized the bitmap_xor function. #34069
- Supports Copy on Write to optimize Bitmap performance and reduce memory consumption. #34047

Compatibility Changes

Parameters

The FE dynamic parameter enable_new_publish_mechanism is changed to a static parameter. You must restart the FE after you modify the parameter settings. #35338

Bug Fixes

If a filtering condition is specified in a Broker Load job, BEs may crash during the data loading in certain circumstances. #29832
Failures in replaying replica operations may cause FEs to crash. #32295
Setting the FE parameter recover_with_empty_tablet to true may cause FEs to crash. #33071
The error "get_applied_rowsets failed, tablet updates is in error state: tablet:18849 actual row size changed after compaction" is returned for queries. #33246
A query that contains a window function may cause BEs to crash. #33671
Running show proc '/statistic' may cause a deadlock. #34237
Errors may be thrown if large amounts of data are loaded into a Primary Key table with persistent index enabled. #34566
After StarRocks is upgraded from v2.4 or earlier to a later version, compaction scores may rise unexpectedly. #34618
If INFORMATION_SCHEMA is queried by using the database driver MariaDB ODBC, the CATALOG_NAME column returned in the schemata view holds only null values. #34627
If schema changes are being executed while a Stream Load job is in the PREPARED state, a portion of the source data to be loaded by the job is lost. #34381
Including two or more slashes (/) at the end of the HDFS storage path causes the backup and restore of the data from HDFS to fail. #34601
Running a loading task or a query may cause the FEs to hang. #34569

13 Dec 06:22

bearmmx

3.1.5

5d8438a

3.1.5

Release date: November 28, 2023

New Features

The CN nodes of a StarRocks shared-data cluster now support data export. #34018

Improvements

The COLUMNS view in the system database INFORMATION_SCHEMA can display ARRAY, MAP, and STRUCT columns. #33431
Supports queries against Parquet, ORC, and CSV formatted files that are compressed by using LZO and stored in Hive. #30923 #30721
Supports updates onto the specified partitions of an automatically partitioned table. If the specified partitions do not exist, an error is returned. #34777
Supports automatic refresh of materialized views when Swap, Drop, or Schema Change operations are performed on the tables and views (including the other tables and materialized views associated with these views) on which these materialized views are created. #32829
Optimized the performance of some Bitmap-related operations, including:
- Optimized nested loop joins. #34804 #35003
- Optimized the bitmap_xor function. #34069
- Supports Copy on Write to optimize Bitmap performance and reduce memory consumption. #34047

Bug Fixes

Fixed the following issues:

If a filtering condition is specified in a Broker Load job, BEs may crash during the data loading in certain circumstances. #29832
An unknown error is reported when SHOW GRANTS is executed. #30100
When data is loaded into a table that uses expression-based automatic partitioning, the error "Error: The row create partition failed since Runtime error: failed to analyse partition value" may be thrown. #33513
The error "get_applied_rowsets failed, tablet updates is in error state: tablet:18849 actual row size changed after compaction" is returned for queries. #33246
In a StarRocks shared-nothing cluster, queries against Iceberg or Hive tables may cause BEs to crash. #34682
In a StarRocks shared-nothing cluster, if multiple partitions are automatically created during data loading, the data loaded may occasionally be written to unmatched partitions. #34731
Frequent data loading over a long period into a Primary Key table with persistent index enabled may cause BEs to crash. #33220
The error "Exception: java.lang.IllegalStateException: null" is returned for queries. #33535
If a query starts to run while show proc '/current_queries'; is being executed, BEs may crash. #34316
Errors may be thrown if large amounts of data are loaded into a Primary Key table with persistent index enabled. #34352
After StarRocks is upgraded from v2.4 or earlier to a later version, compaction scores may rise unexpectedly. #34618
If INFORMATION_SCHEMA is queried by using the database driver MariaDB ODBC, the CATALOG_NAME column returned in the schemata view holds only null values. #34627
FEs crash due to the abnormal data loaded and cannot restart. #34590
If schema changes are being executed while a Stream Load job is in the PREPARED state, a portion of the source data to be loaded by the job is lost. #34381
Including two or more slashes (/) at the end of the HDFS storage path causes the backup and restore of the data from HDFS to fail. #34601
Setting the session variable enable_load_profile to true makes Stream Load jobs prone to fail. #34544
Performing partial updates in column mode onto a Primary Key table causes some tablets of the table to show data inconsistencies between their replicas. #34555
The partition_live_number property added by using the ALTER TABLE statement does not take effect. #34842
FEs fail to start and report the error "failed to load journal type 118". #34590
Setting the FE parameter recover_with_empty_tablet to true may cause FEs to crash. #33071
Failures in replaying replica operations may cause FEs to crash. #32295

Compatibility Changes

Parameters

Added an FE configuration item enable_statistics_collect_profile, which controls whether to generate profiles for statistics queries. The default value is false. #33815
The FE configuration item mysql_server_version is now mutable. The new setting can take effect for the current session without requiring an FE restart. #34033
Added a BE/CN configuration item update_compaction_ratio_threshold, which controls the maximum proportion of data that a compaction can merge for a Primary Key table in a StarRocks shared-data cluster. The default value is 0.5. We recommend shrinking this value if a single tablet becomes excessively large. For a StarRocks shared-nothing cluster, the proportion of data that a compaction can merge for a Primary Key table is still automatically adjusted. #35129

System Variables

Added a session variable cbo_decimal_cast_string_strict, which controls how the CBO converts data from the DECIMAL type to the STRING type. If this variable is set to true, the logic built in v2.5.x and later versions prevails and the system implements strict conversion (namely, the system truncates the generated string and fills 0s based on the scale length). If this variable is set to false, the logic built in versions earlier than v2.5.x prevails and the system processes all valid digits to generate a string. The default value is true. #34208
Added a session variable cbo_eq_base_type, which specifies the data type used for data comparison between DECIMAL-type data and STRING-type data. The default value is VARCHAR, and DECIMAL is also a valid value. #34208
Added a session variable big_query_profile_second_threshold. When the session variable enable_profile is set to false and the amount of time taken by a query exceeds the threshold specified by the big_query_profile_second_threshold variable, a profile is generated for that query. #33825
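
The two conversion modes of cbo_decimal_cast_string_strict can be approximated with Python's decimal module. This is a sketch under stated assumptions, not the CBO's actual logic: the function name decimal_to_string is hypothetical, the strict branch follows the description above (truncate to the scale and pad with zeros), and the non-strict branch is only one plausible reading of "processes all valid digits".

```python
from decimal import Decimal, ROUND_DOWN


def decimal_to_string(value: Decimal, scale: int, strict: bool = True) -> str:
    """Convert a decimal value to a string in strict or non-strict mode."""
    if strict:
        # Strict: the result always has exactly `scale` fractional digits.
        # Extra digits are truncated and missing digits are filled with 0s.
        quantum = Decimal(1).scaleb(-scale)  # scale=2 gives a quantum of 0.01
        return format(value.quantize(quantum, rounding=ROUND_DOWN), "f")
    # Non-strict (assumed reading): keep every significant digit and
    # drop padding zeros after the decimal point.
    text = format(value, "f")
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


if __name__ == "__main__":
    print(decimal_to_string(Decimal("1.5"), scale=3))                     # strict
    print(decimal_to_string(Decimal("1.23456"), scale=2))                 # strict
    print(decimal_to_string(Decimal("1.500"), scale=3, strict=False))     # non-strict
```

The difference matters mainly when DECIMAL values are compared with STRING values, because the two modes can produce different strings for the same number.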

20 Nov 06:52

bearmmx

3.2.0-rc01

9d64ad2

[Candidate] 3.2.0-rc01

Release date: November 15, 2023

New Features

Shared-data cluster

Supports the persistent index for Primary Key tables on local disks.
Supports the even distribution of Data Cache among multiple local disks.

Data Lake Analytics

Supports creating and dropping databases and managed tables in Hive catalogs, and supports exporting data to Hive's managed tables using INSERT or INSERT OVERWRITE.
Supports Unified Catalog, with which users can access different table formats (Hive, Iceberg, Hudi, and Delta Lake) that share a common metastore like Hive metastore or AWS Glue.

Storage engine, data ingestion, and export

Added the following features of loading with the table function FILES():
- Loading Parquet and ORC format data from Azure or GCP.
- Extracting the value of a key/value pair from the file path as the value of a column using the parameter columns_from_path.
- Loading complex data types including ARRAY, JSON, MAP, and STRUCT.
Supports the dict_mapping column property, which can significantly facilitate the loading process during the construction of a global dictionary, accelerating the exact COUNT DISTINCT calculation.
Supports unloading data from StarRocks to Parquet-formatted files stored in AWS S3 or HDFS by using INSERT INTO FILES. For detailed instructions, see Unload data using INSERT INTO FILES.
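
To illustrate what extracting a key/value pair from the file path means for columns_from_path, the Python sketch below pulls the requested keys out of key=value path segments. It is only an illustration of the idea, not the FILES() implementation: the function, the bucket name, and the path layout are hypothetical.

```python
from typing import Dict, List


def columns_from_path(path: str, keys: List[str]) -> Dict[str, str]:
    """Return the values of the requested keys found as key=value path segments."""
    found: Dict[str, str] = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        # Only segments of the form key=value for a requested key are used.
        if sep and key in keys:
            found[key] = value
    missing = [k for k in keys if k not in found]
    if missing:
        raise ValueError(f"keys not present in path: {missing}")
    return found


if __name__ == "__main__":
    # Hypothetical path; each extracted value would populate a column.
    example = "s3://bucket/sales/dt=2023-12-01/region=eu/part-0.parquet"
    print(columns_from_path(example, ["dt", "region"]))
```

Each extracted value would then be used as the value of the column with the same name for every row loaded from that file.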

SQL reference

Added the following functions:

String functions: substring_index, url_extract_parameter, url_encode, url_decode, and translate
Date functions: dayofweek_iso, week_iso, quarters_add, quarters_sub, milliseconds_add, milliseconds_sub, date_diff, jodatime_format, str_to_jodatime, to_iso8601, to_tera_date, and to_tera_timestamp
Pattern matching function: regexp_extract_all
Hash function: xx_hash3_64
Aggregate function: approx_top_k
Window functions: cume_dist, percent_rank, and session_number
Utility functions: dict_mapping and get_query_profile

Privileges and security

StarRocks supports access control through Apache Ranger, providing a higher level of data security and allowing the reuse of existing Ranger Service of external data sources. After integrating with Apache Ranger, StarRocks enables the following access control methods:

When accessing internal tables, external tables, or other objects in StarRocks, access control can be enforced based on the access policies configured for the StarRocks Service in Ranger.
When accessing an external catalog, access control can also leverage the corresponding Ranger service of the original data source (such as Hive Service) to control access (currently, access control for exporting data to Hive is not yet supported).

For more information, see Manage permissions with Apache Ranger.

Improvements

Materialized View

Asynchronous materialized view

Creation:
- Supports automatic refresh for an asynchronous materialized view created upon views or materialized views when schema changes occur on the views, materialized views, or their base tables.
Observability:
- Supports Query Dump for asynchronous materialized views.
- The Spill to Disk feature is enabled by default for the refresh tasks of asynchronous materialized views, reducing memory consumption.
Data consistency:
- Added the property query_rewrite_consistency for asynchronous materialized view creation. This property defines the query rewrite rules based on the consistency check.
- Added the property force_external_table_query_rewrite for external catalog-based asynchronous materialized view creation. This property defines whether to allow forced query rewrite for asynchronous materialized views created upon external catalogs.
- For detailed information, see CREATE MATERIALIZED VIEW.
Added a consistency check for materialized views' partitioning key.
- When users create an asynchronous materialized view with window functions that include a PARTITION BY expression, the partitioning column of the window function must match that of the materialized view.

Storage engine, data ingestion, and export

Optimized the persistent index for Primary Key tables by improving memory usage logic while reducing I/O read and write amplification. #24875 #27577 #28769
Supports data re-distribution across local disks for Primary Key tables.
Partitioned tables support automatic cooldown based on the partition time range and cooldown time. For detailed information, see Set initial storage medium and automatic storage cooldown time.
The Publish phase of a load job that writes data into a Primary Key table is changed from asynchronous mode to synchronous mode. As such, the data loaded can be queried immediately after the load job finishes. For detailed information, see enable_sync_publish.

Query

Optimized StarRocks' compatibility with Metabase and Superset. Supports integrating them with external catalogs.

SQL Reference

array_agg supports the keyword DISTINCT.

Developer tools

Supports Trace Query Profile for asynchronous materialized views, which can be used to analyze their transparent rewrite.

Compatibility Changes

Parameters

Added new parameters for Data Cache.

Bug Fixes

Fixed the following issues:

BEs crash when libcurl is invoked. #31667
Schema Change may fail if it takes an excessive period of time, because the specified tablet version is handled by garbage collection. #31376
Failed to access the Parquet files in MinIO or AWS S3 via file external tables. #29873
The ARRAY, MAP, and STRUCT type columns are not correctly displayed in information_schema.columns. #33431
DATA_TYPE and COLUMN_TYPE for BINARY or VARBINARY data types are displayed as unknown in the information_schema.columns view. #32678

20 Nov 06:43

bearmmx

2.5.14

e4ca4dd

2.5.14

Release date: November 14, 2023

Improvements

The COLUMNS table in the system database INFORMATION_SCHEMA can display ARRAY, MAP, and STRUCT columns. #33431

Bug Fixes

Fixed the following issues:

The error java.lang.IllegalStateException: null is reported if the ON condition is nested with a subquery. #30876
The result of COUNT() is inconsistent among replicas if COUNT() is run immediately after INSERT INTO SELECT ... LIMIT is successfully executed. #24435
BE may crash for specific data types if the target data type specified in CAST is the same as the original data type. #31465
An error is reported if specific path formats are used during data loading via Broker Load: msg:Fail to parse columnsFromPath, expected: [rec_dt]. #32721
During an upgrade to 3.x, if some column types are also upgraded (for example, Decimal is upgraded to Decimal v3), BEs crash when Compaction is performed on tables with specific characteristics. #31626
When data is loaded by using Flink Connector, the load job is suspended unexpectedly if there are highly concurrent load jobs and both the number of HTTP threads and the number of Scan threads have reached their upper limits. #32251
BEs crash when libcurl is invoked. #31667
Adding BITMAP columns to a Primary Key table fails with the following error: Analyze columnDef error: No aggregate function specified for 'userid'. #31763
Frequent data loading over a long period into a Primary Key table with persistent index enabled may cause BEs to crash. #33220
The query result is incorrect when Query Cache is enabled. #32778
Specifying a nullable Sort Key when creating a Primary Key table causes compaction to fail. #29225
The error "StarRocks planner use long time 10000 ms in logical phase" occasionally occurs for complex Join queries. #34177

06 Nov 03:17

bearmmx

3.1.4

0c4b2a3

3.1.4

Release date: November 2, 2023

New Features

Supports sort keys for Primary Key tables created in shared-data StarRocks clusters.
Supports using the str2date function to specify partition expressions for asynchronous materialized views. This helps facilitate incremental updates and query rewrites of asynchronous materialized views created on tables that reside in external catalogs and use the STRING-type data as their partitioning expressions. #29923 #31964
Added a new session variable enable_query_tablet_affinity, which controls whether to direct multiple queries against the same tablet to a fixed replica. This session variable is set to false by default. #33049
Added the utility function is_role_in_session, which is used to check whether the specified roles are activated in the current session. It supports checking nested roles granted to a user. #32984
Supports resource group-level query queues, controlled by the global variable enable_group_level_query_queue (default value: false). When the global-level or resource group-level resource consumption reaches a predefined threshold, new queries are placed in the queue and run once both the global-level resource consumption and the resource group-level resource consumption fall below their thresholds.
- Users can set concurrency_limit for each resource group to limit the maximum number of concurrent queries allowed per BE.
- Users can set max_cpu_cores for each resource group to limit the maximum CPU consumption allowed per BE.
Added two parameters, plan_cpu_cost_range and plan_mem_cost_range, for resource group classifiers.
- plan_cpu_cost_range: the CPU consumption range estimated by the system. The default value NULL indicates no limit is imposed.
- plan_mem_cost_range: the memory consumption range estimated by the system. The default value NULL indicates no limit is imposed.
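
The admission rule for resource group-level query queues described above can be sketched as a small decision function. This is a simplified model, not StarRocks code: it tracks concurrency only (the real thresholds also cover other resources), and the class name, field names, and the convention that a non-positive limit means no limit are assumptions of the example.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    running: int  # queries currently running at this level
    limit: int    # concurrency threshold; 0 or less means no limit (assumption)

    def below_threshold(self) -> bool:
        return self.limit <= 0 or self.running < self.limit


def admit(global_usage: Usage, group_usage: Usage) -> str:
    """Return "run" only when both levels are below their thresholds."""
    if global_usage.below_threshold() and group_usage.below_threshold():
        return "run"
    # Either level at its threshold is enough to queue the new query.
    return "queue"


if __name__ == "__main__":
    print(admit(Usage(running=3, limit=10), Usage(running=1, limit=2)))
    print(admit(Usage(running=3, limit=10), Usage(running=2, limit=2)))
```

A queued query would be re-evaluated with the same check and starts only once both levels report spare capacity.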

Improvements

Window functions COVAR_SAMP, COVAR_POP, CORR, VARIANCE, VAR_SAMP, STD, and STDDEV_SAMP now support the ORDER BY clause and Window clause. #30786
An error instead of NULL is returned if a decimal overflow occurs during queries on the DECIMAL type data. #30419
The number of concurrent queries allowed in a query queue is now managed by the leader FE. Each follower FE notifies the leader FE when a query starts and finishes. If the number of concurrent queries reaches the global-level or resource group-level concurrency_limit, new queries are rejected or placed in the queue.

Bug Fixes

Fixed the following issues:

Spark or Flink may report data read errors due to inaccurate memory usage statistics. #30702 #30751
Memory usage statistics for Metadata Cache are inaccurate. #31978
BEs crash when libcurl is invoked. #31667
When StarRocks materialized views created on Hive views are refreshed, an error "java.lang.ClassCastException: com.starrocks.catalog.HiveView cannot be cast to com.starrocks.catalog.HiveMetaStoreTable" is returned. #31004
If the ORDER BY clause contains aggregate functions, an error "java.lang.IllegalStateException: null" is returned. #30108
In shared-data StarRocks clusters, the information of table keys is not recorded in information_schema.COLUMNS. As a result, DELETE operations cannot be performed when data is loaded by using Flink Connector. #31458
When data is loaded by using Flink Connector, the load job is suspended unexpectedly if there are highly concurrent load jobs and both the number of HTTP threads and the number of Scan threads have reached their upper limits. #32251
When a field of only a few bytes is added, executing SELECT COUNT(*) before the data change finishes returns an error that reads "error: invalid field name". #33243
Query results are incorrect after the query cache is enabled. #32781
Queries fail during hash joins, causing BEs to crash. #32219
DATA_TYPE and COLUMN_TYPE for BINARY or VARBINARY data types are displayed as unknown in the information_schema.columns view. #32678

Behavior Change

From v3.1.4 onwards, persistent indexing is enabled by default for Primary Key tables created in new StarRocks clusters (this does not apply to existing StarRocks clusters whose versions are upgraded to v3.1.4 from an earlier version). #33374
A new FE parameter enable_sync_publish is added, which is set to true by default. When this parameter is set to true, the Publish phase of a data load into a Primary Key table returns the execution result only after the Apply task finishes. As such, the loaded data can be queried immediately after the load job returns a success message. However, setting this parameter to true may cause data loads into Primary Key tables to take longer. (Before this parameter was added, the Apply task ran asynchronously with the Publish phase.) #27055
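
The difference between synchronous and asynchronous publish can be illustrated with a deterministic toy model. This is a sketch only: PrimaryKeyTable, publish, and run_apply are hypothetical names, and the real Publish and Apply phases run inside the FE and BEs rather than in a single process.

```python
class PrimaryKeyTable:
    """Toy table: rows become queryable only after the Apply step runs."""

    def __init__(self):
        self.visible_rows = []    # what a query would see
        self.pending_apply = []   # published batches not yet applied

    def run_apply(self):
        # Apply every published batch in order, making its rows visible.
        while self.pending_apply:
            self.visible_rows.extend(self.pending_apply.pop(0))


def publish(table, rows, enable_sync_publish=True):
    """Publish a batch; wait for Apply only when enable_sync_publish is true."""
    table.pending_apply.append(list(rows))
    if enable_sync_publish:
        table.run_apply()         # success is reported after Apply finishes
    return "success"              # otherwise Apply happens some time later
```

With enable_sync_publish=True, visible_rows already contains the batch when publish returns. With False, the caller sees "success" while the rows are still in pending_apply, which models the window in which freshly loaded data was not yet queryable.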

18 Oct 06:56

bearmmx

2.5.13

a3b58a0

2.5.13

Release date: September 28, 2023

Improvements

Window functions COVAR_SAMP, COVAR_POP, CORR, VARIANCE, VAR_SAMP, STD, and STDDEV_SAMP now support the ORDER BY clause and Window clause. #30786
An error instead of NULL is returned if a decimal overflow occurs during queries on DECIMAL type data. #30419
Executing SQL commands with invalid comments now returns results consistent with MySQL. #30210
Rowsets corresponding to tablets that have been deleted are cleaned up, reducing the memory usage during BE startup. #30625

Bug Fixes

Fixed the following issues:

An error "Set cancelled by MemoryScratchSinkOperator" occurs when users read data from StarRocks using the Spark Connector or Flink Connector. #30702 #30751
An error "java.lang.IllegalStateException: null" occurs during queries with an ORDER BY clause that includes aggregate functions. #30108
FEs fail to restart when there are inactive materialized views. #30015
Performing INSERT OVERWRITE operations on duplicate partitions corrupts the metadata, leading to FE restart failures. #27545
An error "java.lang.NullPointerException: null" occurs when users modify columns that do not exist in a Primary Key table. #30366
An error "get TableMeta failed from TNetworkAddress" occurs when users load data into a partitioned StarRocks external table. #30124
If users use CloudCanal to load data into table columns that are set to NOT NULL but have no default value specified, an error "Unsupported dataFormat value is : \N" is thrown. #30799
An error "current running txns on db xxx is 200, larger than limit 200" occurs when users load data via the Flink Connector or perform DELETE and INSERT operations. #18393
Asynchronous materialized views that use HAVING clauses containing aggregate functions cannot rewrite queries properly. #29976

25 Sep 07:08

bearmmx

3.1.3

384ba23

3.1.3

Release date: September 25, 2023

New Features

The aggregate function group_concat supports the DISTINCT keyword and the ORDER BY clause. #28778
Stream Load, Broker Load, Kafka Connector, Flink Connector, and Spark Connector support partial updates in column mode on a Primary Key table. #28288
Data in partitions can be automatically cooled down over time. (This feature is not supported for list partitioning.) #29335 #29393
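
As a rough illustration of what DISTINCT and ORDER BY add to group_concat, the following Python function emulates the aggregation over one group. It is a sketch, not StarRocks behavior: the separator default, the skipping of NULL (None) inputs, and the None result for an empty group are assumptions borrowed from common SQL semantics.

```python
def group_concat(values, distinct=False, order_by=None, descending=False, separator=","):
    """Emulate group_concat over one group of values (assumed semantics)."""
    items = [v for v in values if v is not None]   # assumption: NULL inputs are skipped
    if distinct:
        items = list(dict.fromkeys(items))         # drop duplicates, keep first occurrence
    if order_by is not None:
        items = sorted(items, key=order_by, reverse=descending)
    if not items:
        return None                                # assumption: empty group yields NULL
    return separator.join(str(v) for v in items)
```

For example, group_concat(["b", "a", "b"], distinct=True, order_by=lambda v: v) joins the de-duplicated, sorted values, whereas the call without those options concatenates the values in input order.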

Improvements

Executing SQL commands with invalid comments now returns results consistent with MySQL. #30210

Bug Fixes

Fixed the following issues:

If the BITMAP or HLL data type is specified in the WHERE clause of a DELETE statement, the statement cannot be executed properly. #28592
After a follower FE is restarted, CpuCores statistics are not up-to-date, resulting in query performance degradation. #28472 #30434
The execution cost of the to_bitmap() function is incorrectly calculated. As a result, an inappropriate execution plan is selected for the function after materialized views are rewritten. #29961
In certain use cases of the shared-data architecture, after a follower FE is restarted, queries submitted to the follower FE return an error that reads "Backend node not found. Check if any backend node is down". #28615
If data is continuously loaded into a table that is being altered by using the ALTER TABLE statement, an error "Tablet is in error state" may be thrown. #29364
Modifying the FE dynamic parameter max_broker_load_job_concurrency using the ADMIN SET FRONTEND CONFIG command does not take effect. #29964 #29720
BEs crash if the time unit in the date_diff() function is a constant but the dates are not constants. #29937
In the shared-data architecture, automatic partitioning does not take effect after asynchronous load is enabled. #29986
If users create a Primary Key table by using the CREATE TABLE LIKE statement, an error "Unexpected exception: Unknown properties: {persistent_index_type=LOCAL}" is thrown. #30255
Restoring Primary Key tables causes metadata inconsistency after BEs are restarted. #30135
If users load data into a Primary Key table on which truncate operations and queries are concurrently performed, an error "java.lang.NullPointerException" is thrown in certain cases. #30573
If predicate expressions are specified in materialized view creation statements, the refresh results of those materialized views are incorrect. #29904
After users upgrade their StarRocks cluster to v3.1.2, the storage volume properties of the tables created before the upgrade are reset to null. #30647
If checkpointing and restoration are concurrently performed on tablet metadata, some tablet replicas will be lost and cannot be retrieved. #30603
If users use CloudCanal to load data into table columns that are set to NOT NULL but have no default value specified, an error "Unsupported dataFormat value is : \N" is thrown. #30799
