Released on 2020/01/15.
Note
If you are upgrading a cluster, you must be running CrateDB 4.0.2 or higher before you upgrade to 4.1.0.
We recommend that you upgrade to the latest 4.0 release before moving to 4.1.0.
A rolling upgrade to 4.1.0 from 4.0.2+ is supported.
Before upgrading, you should back up your data.
Table of Contents
- Changed
arithmetic operations <arithmetic>
*
,+
, and-
of typesinteger
andbigint
to throw an exception instead of rolling over from positive to negative or the other way around. - Remap CrateDB
data-types-objects
array data type from the PostgreSQL JSON to JSON array type. That might effect some drivers that use the PostgreSQL wire protocol to insert data into tables with object array typed columns. For instance, when using theNpgsql
driver, it is not longer possible to insert an array of objects into a column of the object array data type by using the parameter of a SQL statement that has the JSON data type and an array of CLR as its value. Instead, use a string array with JSON strings that represent the objects. See theNpgsql
documentation for more details. Changed how columns of type
data-types-geo
are being communicated to PostgreSQL clients.Before, clients were told that those columns are double arrays. Now, they are correctly mapped to the PostgreSQL
point
type. This means that applications using clients likeJDBC
will have to be adapted to usePgPoint
. (See Geometric DataTypes in JDBC)- Changed the behavior of
unnest
to fully unnest multi dimensional arrays to their innermost type to be compatible with PostgreSQL.
- Deprecated the
node.store.allow_mmapfs
setting in favour ofnode.store.allow_mmap <node.store_allow_mmap>
.
- Allow user to limit the number of threads on a single shard that may be merging at once via the
merge.scheduler.max_thread_count <sql-create-table-merge-scheduler-max-thread-count>
table parameter. - Some
ALTER TABLE
operations now internally invoke a single cluster state update instead of multiple cluster state updates. This change improves resiliency because there is no longer a window where the cluster state could be inconsistent. - Changed the default garbage collector from Concurrent Mark Sweep to G1GC. This should lead to shorter GC pauses.
- Added a dynamic bulk sizing mechanism that should prevent
INSERT INTO ... FROM query
operations from running into out-of-memory errors when the individual records of a table are large. - Added the
cluster.routing.allocation.total_shards_per_node <cluster.routing.allocation.total_shards_per_node>
setting.
- Optimized
SELECT DISTINCT .. LIMIT n
queries. On high cardinality columns, these types of queries now execute up to 200% faster and use less memory. - The optimizer now utilizes internal statistics to approximate the number of rows returned by various parts of a query plan. This should result in more efficient execution plans for joins.
- Reduced recovery time by sending file-chunks concurrently. This change only applies when transport communication is secured or compressed. The number of chunks is controlled by the
indices.recovery.max_concurrent_file_chunks <indices.recovery.max_concurrent_file_chunks>
setting. - Added an optimization that allows
WHERE
clauses on top of derived tables containingtable functions <table-functions>
to run more efficiently in some cases. - Allow user to control how table data is stored and accessed on a disk via the
store.type <sql-create-table-store-type>
table parameter andnode.store.allow_mmap <node.store_allow_mmap>
node setting. - Changed the default table data store type from
mmapfs
tohybridfs
.
- Added support for the
lag <window-functions-lag>
andlead <window-functions-lead>
window functions as enterprise features. - Added support for
ROWS
frame definitions in the context of window functionswindow definitions <window-definition>
. - Added support for the
named window definition <window-definition-named-windows>
. This change allows a user to define a list of window definitions in theWINDOW <sql-select-window>
clause that can be referenced inOVER <sql-select-over>
clauses. - Added support for
offset PRECEDING
andoffset FOLLOWING
window definitions <window-definition>
.
- Added support for the
ALL <all_array_comparison>
operator for array andsubquery <gloss-subquery>
comparisons. - Added a
PG_GET_KEYWORDS <pg_catalog.pg_get_keywords>
table function. - Extended
CONCAT <scalar-concat>
to do implicit casts, so that calls likeSELECT 't' || 5
are supported. - Added support for casting values of type
object
totext
. This casting will cause the object to be converted to a JSON string. Added support for casting to
data-types-geo
,data-types-geo-shape
anddata-types-objects
array data types.For example:
cast(['POINT(2 3)','POINT(1 3)'] AS array(geo_point))
- Added the
PG_TYPEOF <scalar-pg_typeof>
system function. - Added the
INTERVAL <type-interval>
data type and extendedtable-functions-generate-series
to work with timestamps and the newINTERVAL <type-interval>
type. - Added
LPAD <scalar-lpad>
andRPAD <scalar-rpad>
scalar functions. - Added the
LTRIM <scalar-ltrim>
andRTRIM <scalar-rtrim>
scalar functions. - Added
LEFT <scalar-left>
andRIGHT <scalar-right>
scalar functions. - Added
TIMEZONE <scalar-timezone>
scalar function. - Added
AT TIME ZONE <type-timestamp-at-tz>
syntax. - Added support for the operator
ILIKE <sql_dql_like>
, the case insensitive complement toLIKE
. Added support for CIDR notation comparisons through special purpose
operator <gloss-operator>
<<
associated with type IP.Statements like
192.168.0.0 << 192.168.0.1/24
evaluate <gloss-evaluation>
as true, meaningSELECT ip FROM ips_table WHERE ip << 192.168.0.1/24
returns matchingIP addresses <type-ip>
.
- Added a
ANALYZE <analyze>
command that can be used to update statistical data about the contents of the tables in the CrateDB cluster. This data is visible in a newly addedpg_stats <pg_stats>
table. - Added a
PROMOTE REPLICA <sql-alter-table-reroute>
subcommand tosql-alter-table
. - Added support for the filter clause in
aggregate expressions <aggregation-expressions>
andwindow functions <window-function-call>
that areaggregates <aggregation>
. - Added support for using
ref-values
as a top-level relation.
- Added a
failures
column to thesys.snapshots <sys-snapshots>
table. Improved the error messages that were returned if a relation or schema is not found.
The error messages may now include suggestions for similarly named tables, which should make typos more apparent and help users figure out they are missing double quotes (e.g., when a table name contains upper case letters).
- Added a
seq_no_stats
and atranslog_stats
column to thesys.shards <sys-shards>
table. - Added new system table
sys.segments <sys-segments>
which contains information about the Lucene segments of a shard. - Added a
node
column tosys.jobs_log <sys-logs>
. - Statements containing limits, filters,
window functions <window-functions>
, ortable functions <table-functions>
will now be labelled accordingly insys-jobs-metrics
.
- Changed the default for
sql-create-table-write-wait-for-active-shards
fromALL
to1
. This update improves the out of the box experience by allowing a subset of nodes to become unavailable without blocking write operations. See the documentation linked above for more details about the implications. - Added
phonetic
token filter with following encoders:metaphone
,double_metaphone
,soundex
,refined_soundex
,caverphone1
,caverphone2
,cologne
,nysiis
,koelnerphonetik
,haasephonetik
,beider_morse
, anddaitch_mokotoff
. - Removed a restriction for predicates in the
WHERE
clause involvingpartitioned columns <gloss-partitioned-column>
, which could result in a failure response with the message:logical conjunction of the conditions in the WHERE clause which involve partitioned columns led to a query that can't be executed
. - Support implicit object creation in update statements. For example,
UPDATE t SET obj['x'] = 10
will now implicitly setobj
to{obj: {x: 10}}
on rows whereobj
wasnull
. - Added the
sql-create-table-codec
parameter tosql-create-table
to control the compression algorithm used to store data. - The
node
argument of theREROUTE <sql-alter-table-reroute>
commands ofsql-alter-table
can now either be the ID or the name of a node. - Added support for the PostgreSQL array string literal notation.