3.7.0

andrross released this 09 Jun 23:41
72121f0

Version 3.7.0 Release Notes

Compatible with OpenSearch and OpenSearch Dashboards version 3.7.0

Features

  • Add dynamic properties support for pattern-based field definitions without cluster state mapping updates (#20816)
  • Add pluggable data format engine with DataFormatAwareEngine for multi-format indexing (#21181)
  • Add Lucene engine implementation for pluggable data formats (#21299)
  • Add merge support for Parquet data format plugin via streaming k-way merge sort (#21079)
  • Add directory and IndexInput layers for WritableWarm tiered storage (#21178)
  • Add server-side implementation for tiering status APIs (GetTieringStatus and ListTieringStatus) (#21220)
  • Add server-side implementation for HotToWarm, WarmToHot, and CancelTiering APIs (#21295)
  • Add prefetch settings and stored fields prefetch for WritableWarm tiered storage (#21285)
  • Add slow logs, per-query metrics, and migration metrics for WritableWarm tiered storage (#21332)
  • Add module wiring and integration tests for WritableWarm tiered storage (#21427)
  • Add tiered object storage crate for warm node file routing (#21204)
  • Add event-driven scheduler and stage execution for analytics engine (#21242)
  • Add coordinator-side DataFusion reduce with streaming Arrow batches (#21356)
  • Add distributed aggregation with partial/final mode for analytics engine (#21457)
  • Add distributed join planning and execution for analytics engine (#21639)
  • Add PPL append command support with multi-child stage runtime for Union (#21474)
  • Add PPL dedup command support via ROW_NUMBER window function (#21622)
  • Add PPL eventstats and streamstats window function support (#21734)
  • Add PPL top and rare command support via window functions (#21593)
  • Add PPL parse command with regex mode via Rust UDFs (#21573)
  • Add PPL rex command with sed and extract modes (#21550)
  • Add PPL spath command with auto-extract mode via json_extract_all UDF (#21664)
  • Add 7 PPL JSON scalar functions to analytics engine route (#21513)
  • Add 23 PPL datetime scalar functions to analytics engine route (#21556)
  • Add 14 additional PPL datetime functions (Wave A) including strftime, date_format, maketime (#21582)
  • Add 30+ PPL math scalar functions to analytics engine (#21520)
  • Add 18 PPL string scalar functions to analytics engine (#21543)
  • Add PPL conditional functions (coalesce, isempty, isblank, case, if, ifnull) to analytics engine (#21643)
  • Add PPL conversion scalar functions (num, auto, memk, rmcomma, dur2sec, ctime, mktime) to analytics engine (#21628)
  • Add PPL cryptographic functions (md5, sha1, sha2, crc32) to analytics engine (#21611)
  • Add PPL array constructor and 8 multivalue functions to analytics engine (#21554)
  • Add PPL bucketing scalars (span_bucket, width_bucket, minspan_bucket, range_bucket) (#21621)
  • Add PPL TAKE, FIRST, LAST, LIST, VALUES aggregate functions (#21731)
  • Add Lucene filter delegation from DataFusion for full-text search predicates (#21555)
  • Add performance delegation to Lucene for selective filter predicates (#21701)
  • Add native Arrow transport path with zero-copy transfer for stream transport (#21253)
  • Stream Arrow batches on data-node fragment execution path (#21418)
  • Add support for extra_fields outside _source indexing for improved vector ingestion throughput (#20635)
  • Add gRPC support for Min, Max, and Terms aggregations (#21205)
  • Add partition strategy setting for flexible shard-to-partition mapping in pull-based ingestion (#21165)
  • Add SplitToFieldsProcessor for distributing split values to target fields (#21216)
  • Add native memory based admission control for transport request throttling (#21191)
  • Add native memory search backpressure for off-heap query cancellation (#21647)
  • Add unified native allocator framework for Arrow allocations with elastic rebalancing (#21703)
  • Add on-demand jemalloc heap profiling support via JMX CLI tool (#21599)
  • Add search.max_buckets to workload group settings for per-tenant bucket limits (#21721)
  • Add additional search settings and override_request_values to workload management groups (#21523)
  • Add hunspell dictionary hot-reload support via _refresh_search_analyzers API (#21559)
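
The Parquet merge entry above (#21079) names a streaming k-way merge sort. As a generic illustration of that technique only, and not the plugin's actual code, the following minimal sketch merges several already-sorted inputs with a min-heap, holding one pending element per input at a time. The class and method names (`KWayMergeSketch`, `merge`) are hypothetical, and plain integers stand in for sorted rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of a k-way merge; not taken from the OpenSearch code base.
final class KWayMergeSketch {

    // Holds the current smallest unread value of one input plus the input it came from.
    private static final class Head {
        final int value;
        final Iterator<Integer> source;

        Head(int value, Iterator<Integer> source) {
            this.value = value;
            this.source = source;
        }
    }

    // Merges inputs that are each sorted ascending into one ascending list.
    static List<Integer> merge(List<Iterator<Integer>> sortedInputs) {
        PriorityQueue<Head> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a.value, b.value));

        // Seed the heap with the first element of every non-empty input.
        for (Iterator<Integer> input : sortedInputs) {
            if (input.hasNext()) {
                heap.add(new Head(input.next(), input));
            }
        }

        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Emit the globally smallest pending value, then refill from the same input.
            Head smallest = heap.poll();
            merged.add(smallest.value);
            if (smallest.source.hasNext()) {
                heap.add(new Head(smallest.source.next(), smallest.source));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Iterator<Integer>> inputs = Arrays.asList(
            Arrays.asList(1, 4, 7).iterator(),
            Arrays.asList(2, 5).iterator(),
            Arrays.asList(3, 6, 8).iterator());
        System.out.println(merge(inputs));
    }
}
```

Because the heap never holds more than one element per input, memory use grows with the number of inputs rather than with their total size, which is what makes the approach suitable for streaming.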

Enhancements

  • Add adaptive query budget for DataFusion engine with bounded memory and improved throughput (#21695)
  • Add DynamicLimitPool for runtime memory pool limit changes in DataFusion (#21286)
  • Add configurable coordinator buffer limit for per-query Arrow allocator (#21726)
  • Add CPU task cancellation for DataFusion queries (#21560)
  • Add IO task cancellation support for DataFusion queries (#21531)
  • Add DataFusion logical and physical plan logging at DEBUG level (#21646)
  • Add dynamic settings for indexed query execution path (#21522)
  • Add dedicated analytics_scheduler thread pool to prevent coordinator deadlock (#21771)
  • Add dedicated analytics_reduce thread pool for coordinator reduce drains (#21800)
  • Add native memory stats and task cancellation stats to node stats API (#21637)
  • Add current_application_duration_ms to cluster state download stats in node stats API (#20922)
  • Add segments and segment stats support for DataFormatAwareEngine (#21696)
  • Add DataFormat-aware NRT replication engine and remote-store wiring (#21311)
  • Add DataFormat-aware shallow snapshot v2 support (#21742)
  • Add DataFormat-aware read-only engine for warm primaries with tiering service improvements (#21720)
  • Add dynamic mapping support for pluggable data formats (#21444)
  • Add delete execution engine abstraction for DataFormatAwareEngine (#21313)
  • Add cluster-scope defaults for pluggable dataformat settings (#21435)
  • Add indexing support for metadata fields in pluggable data formats (#21585)
  • Add Lucene merge support for pluggable data format composite engine (#21422)
  • Add composite merge handler and merge policy for data-format-aware engine (#21128)
  • Add sort-on-refresh for composite engine with cross-format row-ID consistency (#21468)
  • Add warm+format directory wiring with per-format tiered directory routing (#21361)
  • Add block cache SPI and Foyer plugin for warm nodes (#21530)
  • Add REST API paths for block cache prune and detailed file cache stats (#21705)
  • Add cancellation checkpoints in field data loading and aggregation paths (#21318)
  • Add queryTimeout to IndexSearcher for KNN vector search timeout enforcement (#21316)
  • Add index-level authorization to analytics engine via ActionFilter dispatch (#21789)
  • Add /_analytics/ppl/_explain endpoint with stage profiling (#21660)
  • Add relevance function support (match_phrase, multi_match, query_string, etc.) to analytics engine (#21562)
  • Add relevance functions optional parameter support and new functions (wildcard_query, query, match_all) (#21661)
  • Add filter pushdown rules and Calcite rule metrics for profiling (#21684)
  • Add per-column encoding and compression configuration for Parquet data format (#21665)
  • Avoid repeated encoding and compression for sort column writes in Parquet (#21464)
  • Add pipeline execution metrics to PollingIngestStats for pull-based ingestion (#21024)
  • Add batching for persistent task cluster service to reduce cluster manager load (#21245)
  • Refactor BitsetFilterCache to node-level cache with configurable size limit (#21179)
  • Skip zone awareness when auto_expand_replicas is set to all (#21217)
  • Relax field-level meta validation constraints to allow any number of entries with string values (#20578)
  • Deprecate boolean constructor of FetchSourceContext in favor of static constants (#21235)
  • Add validation and deprecation warnings for ambiguous _source filtering (#21203)
  • Speed up Painless Script Engine initialization by ~10% (#21463)
  • Fix accumulation of file sizes when multiple files share the same extension in segment stats (#21000)
  • Improve native memory admission control precision with auto-derived budget and JVM non-heap subtraction (#21749)
  • Tighten DataFusion memory guard with RSS-based hard guard to prevent OOM under concurrent load (#21814)
  • Support indices_boost_2 array format for gRPC search (#21300)
  • Add configurable Kafka metadata timeout for pull-based ingestion (#21425)
  • Expose tokio-metrics as DataFusion plugin stats (#21303)
  • Add Lucene FFM callbacks to task resource tracking (#21610)
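
The segment stats entry above (#21000) concerns file sizes for files that share an extension. The notes do not describe the original defect, so the sketch below only illustrates the general idea of accumulating per-extension totals rather than overwriting them. It assumes a simple map from file name to size in bytes; `ExtensionSizeSketch` and `sizeByExtension` are hypothetical names, not OpenSearch APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of per-extension size accumulation; not OpenSearch code.
final class ExtensionSizeSketch {

    // Returns the total size in bytes for each file extension.
    static Map<String, Long> sizeByExtension(Map<String, Long> fileSizes) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> entry : fileSizes.entrySet()) {
            String name = entry.getKey();
            int dot = name.lastIndexOf('.');
            // Files without a dot are grouped under the empty extension.
            String extension = dot < 0 ? "" : name.substring(dot + 1);
            // merge() adds to an existing total; a plain put() would keep only the last file.
            totals.merge(extension, entry.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("_0.cfs", 100L);
        files.put("_1.cfs", 50L);
        files.put("_0.si", 10L);
        System.out.println(sizeByExtension(files));
    }
}
```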

Bug Fixes

  • Fix YAML parser corrupting string values that resemble booleans after Jackson 3.x migration (#21294)
  • Fix map_unmapped_fields_as_text lost after dynamic mapping update in PercolatorFieldMapper (#21301)
  • Fix O(n²) removeAll in remote translog metadata cleanup causing CPU spikes (#21350)
  • Fix Rounding.isUTC() to recognize UTC timezone aliases for date histogram optimization (#21221)
  • Fix NPE in QueryPhaseResultConsumer when all shards fail (#21158)
  • Fix bulk request hang when index is deleted during primary phase (#21305)
  • Fix deadlock between engineMutex and writeLock during index close and engine reset (#21404)
  • Fix FlightOutboundHandler clearing caller's ThreadContext (#21167)
  • Fix IndicesRequestCacheCleanupIT flakiness by removing too-short assertBusy timeouts (#21494)
  • Fix negative fielddata stats by guarding against stale removals after shard reallocation (#21667)
  • Fix half_float ingest writing wrong fp16 bit pattern in Parquet (#21783)
  • Fix StringView buffer bloat in DataFusion stream_next FFI export causing 435x data amplification (#21753)
  • Fix Utf8View/Utf8 schema mismatch panic in indexed parquet path (#21826)
  • Fix memory leak in transport-reactor-netty4 plugin with persistent connections (#21788)
  • Fix ExitablePostingsEnum to extend FilterPostingsEnum for proper delegation (#21558)
  • Fix local recovery from flush for DataFormatAwareEngine (#21553)
  • Fix safe-commit info and replication checksum for DFA shards (#21787)
  • Fix DFA recovery failures: file-handle leak and reset-path crash (#21759)
  • Handle null scripted metric combine results (#21534)
  • Demote "No resource usage stats available for node" log from WARN to DEBUG (#21638)
  • Fix pull-based ingestion document mapper usage to reflect mapping updates (#21183)
  • Fix pull-based ingestion consumer factory to be stateless and prevent race conditions (#21652)
  • Fix pull-based ingestion multi-threaded writer batchStartPointer computation (#21697)
  • Fix Netty4Http3ServerTransport to use configured HeaderVerifier and Decompressor instances (#21281)
  • Convert varchar to str in analytics engine Project operations to fix DataFusion type errors (#21794)
  • Fix microsecond() function and add timestamp lower-bound validation in analytics engine (#21793)
  • Enforce write blocks for DFA hot-to-warm tiering to survive DiskThresholdMonitor removal (#21828)
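
The remote translog entry above (#21350) mentions an O(n²) `removeAll`. The notes do not show the actual change, so the following is only a sketch of one common way this cost arises in Java and one common remedy. When the argument to `removeAll` is a `List`, each membership check can be a linear scan, giving quadratic work overall. Copying the argument into a `HashSet` first makes each check expected constant time. `RemoveAllSketch` and `removeAllFast` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of avoiding quadratic removeAll; not the OpenSearch fix itself.
final class RemoveAllSketch {

    // Removes every element of toRemove from target in roughly linear time.
    static <T> void removeAllFast(Collection<T> target, Collection<? extends T> toRemove) {
        // One pass to build a hash-based lookup, so contains() avoids a linear scan.
        Set<Object> lookup = new HashSet<>(toRemove);
        // One pass over target, dropping anything present in the lookup.
        target.removeIf(lookup::contains);
    }

    public static void main(String[] args) {
        List<Integer> generations = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        removeAllFast(generations, Arrays.asList(2, 4));
        System.out.println(generations);
    }
}
```

The trade-off is the extra memory for the temporary set, which is usually small compared with the time saved on large collections.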

Maintenance

  • Bump Netty to 4.2.14.Final (#21772)
  • Update Jackson to 2.21.3 / 3.1.3 (#21493)
  • Update ASM to 9.10 (#21764)
  • Update OpenTelemetry to 1.62.0 and SemConv to 1.41.0 (#21595)
  • Update Project Reactor to 3.8.5 and Reactor Netty to 1.3.5 (#21226)
  • Update bundled JDK to JDK 25.0.3 (#21353)
  • Update log4j2 to 2.25.4 (#21416)
  • Update httpclient5 to 5.6.1 (#21441)
  • Bump commons-configuration2 from 2.14.0 to 2.15.0 (#21806)
  • Bump org.apache.commons:commons-configuration2 from 2.13.0 to 2.14.0 (#21213)
  • Bump com.google.protobuf from 0.9.6 to 0.10.0 (#21291)
  • Bump org.apache.hadoop:hadoop-minicluster from 3.4.2 to 3.5.0 (#21138)
  • Bump org.codehaus.woodstox:stax2-api from 4.2.2 to 4.3.0 (#21137)
  • Bump org.jline:jline from 4.0.0 to 4.0.14 (#21471)
  • Bump org.jsoup:jsoup from 1.22.1 to 1.22.2 (#21290)
  • Bump com.nimbusds:nimbus-jose-jwt from 10.8 to 10.9 (#21214)
  • Remove Unsafe class injection from Java agent (#21542)
  • Replace mimalloc with jemalloc as global allocator for native sandbox plugins (#21497)
  • Upgrade DataFusion to v53 and Arrow to v58 (#21590)
  • Pin GitHub Actions to commit SHAs for supply chain security (#21808)
  • Update FIPS bootstrap check to use OpenSearch env var instead of BouncyCastle system property (#21415)