- Support for Snapshot Readonly transactions mode
- Better resource management for KQP Resource Manager (share information about nodes resources, avoid OOMs)
- ✅ Switch to New Engine for OLTP queries
- ✅ Support `not null` for PK (primary key) table columns
- Aggregates and predicates push down to column-oriented tables
- Optimize data formats for data transition between query phases
- Index Rename/Rebuild
- KQP Session Actor as a replacement for KQP Worker Actor (optimize to reduce CPU usage)
- PostgreSQL compatibility
- Support PostgreSQL datatypes serialization/deserialization in YDB Public API
- PostgreSQL compatible query execution (TPC-C, TPC-H queries should work)
- Support for PostgreSQL wire protocol
- Support a single Database connection string instead of multiple parameters
- Support constraints in query optimizer
- Query Processor 3.0 (a set of tasks to make query execution functionality more like that of a traditional database)
- Support for Streaming Lookup Join via MVCC snapshots (avoid distributed transactions, scalability is better)
- Universal API call for DML, DDL with unlimited result size (aka StreamExecuteQuery, which allows executing any query)
- Support for secondary indexes in ScanQuery
- Transaction can see its own updates (updates made during transaction execution are not buffered in RAM anymore, but rather are written to disk and available to read by this transaction)
- Computation graphs caching (compute/datashard programs) (optimize CPU usage)
- RPC Deadline & Cancellation propagation (smooth timeout management)
- DDL for column-oriented tables
- ✅ Get YDB topics (aka pers queue, streams) ready for production
- ✅ Turn on MVCC support by default
- ✅ Enable Snapshot read mode by default (take and use MVCC snapshot for reads instead of running distributed transaction for reads)
- ✅ Change Data Capture (be able to get change feed of table updates)
- Async Replication between YDB databases
- ✅ Background compaction for DataShards
- ✅ Compressed Backups. Add functionality to compress backup data
- Process of Extending State Storage without cluster downtime. If a cluster grows from, say, 9 nodes to 900, the State Storage configuration stays the same (9 nodes), which leads to a performance bottleneck.
- Split/Merge DataShards by load by default. Most users require this feature turned on by default
- Support PostgreSQL datatypes in tablet local database
- Basic histogram for DataShards (first step towards cost based optimizations)
- Transaction can see its own updates (updates made during transaction execution are not buffered in RAM anymore, but rather are written to disk and available to read by this transaction)
- Data Ingestion from topic to table (implement built-in compatibility to ingest data to YDB tables from topics)
- Support snapshot read over read replicas (consistent reads against read replicas)
- Transactions between topics and tables
- Datashard iterator reads via MVCC
- Switch to TRope (or don't use TString/std::string directly, provide zero-copy data passing between components)
- Avoid Node Broker as SPOF (NBS must work without Node Broker under emergency conditions)
- Subscriptions in SchemeBoard (optimize interaction with SchemeBoard via subscription to updates)
- "One leg" storage migration without downtime (migrate 1/3 of the cluster from one AZ to another for mirror3-dc erasure encoding)
- ActorSystem 1.5 (dynamically reassign threads in different thread pools)
- Publish a utility for BlobStorage management (it's called ds_tool for now; improve and open it)
- Self-heal for degraded BlobStorage groups (automatic self-heal for groups with two broken disks, get VDisk Donors production ready)
- BlobDepot (a component for smooth blobs management between groups)
- Avoid BSC (BlobStorage Controller) as SPOF (be able to run the cluster without BSC in emergency cases)
- BSC manages static group (reconfiguration of the static BlobStorage group must be done by the BlobStorage Controller, as for any other group)
- (Semi-)Hard disk space separation (Better guarantees for disk space usage by VDisks on a single PDisk)
- Reduce space amplification (Optimize storage layer)
- Storage nodes decommission (Add ability to remove storage nodes)
- Log Store (log-friendly column-oriented storage that allows creating 1+ million tables for storing logs)
- Column-oriented Tables (introduce Column-oriented tables in addition to Row-oriented tables)
- Tiered Storage for Column-oriented Tables (with the ability to store the data in S3)
- Run the first version
- Support for all schema entities
- YDB Topics (add support for viewing metadata of YDB topics, their data, lag, etc.)
- CDC Streams
- Secondary Indexes
- Read Replicas
- Column-oriented Tables
- Basic charts for database monitoring
- Use a single `ydb yql` instead of `ydb table query` or `ydb scripting`
- Interactive CLI
- Built-in load test for DataShards in YCSB manner
- `ydb workload` for topics
- Jepsen tests support
- Try RTMR-tablet for key-value workload