Introduce MemTable for fast upsert and consistent read #3985

@jackye1995

Based on discussion #3842

We have already added the basic metadata structure that allows an internal implementation of LSM writes. The newly created tasks track contributing that implementation to open-source Lance.

In short, we will create a new job type, LogStructuredMergeJob. From the user's perspective, this is a long-running job: the user calls start, and the job then continuously accepts writes. Writes are batched and written to the WAL asynchronously and, at the same time, applied to a MemTable kept in memory (the MemTable is itself a Lance table).
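As a rough illustration of the write path described above, here is a minimal Python sketch. It is not the Lance API: the class `ToyLogStructuredMergeJob`, its methods, and the use of a plain dict and list as stand-ins for the MemTable and the WAL are all assumptions made for this example. It only shows the shape of the idea: a write updates the in-memory MemTable immediately, while the same batch is handed to a background thread that appends it to the WAL.

```python
import queue
import threading


class ToyLogStructuredMergeJob:
    """Toy stand-in for the proposed job type (hypothetical, not the Lance API)."""

    def __init__(self):
        # Stand-in for the in-memory MemTable: primary key -> latest row.
        self.memtable = {}
        # Stand-in for durable WAL storage: a list of written batches.
        self.wal = []
        self._pending = queue.Queue()
        self._worker = None

    def start(self):
        # Launch the background WAL writer; the job can now accept writes.
        self._worker = threading.Thread(target=self._drain_wal, daemon=True)
        self._worker.start()

    def write(self, batch):
        # batch is a list of (primary_key, row) pairs.
        for key, row in batch:
            self.memtable[key] = row      # upsert into the MemTable right away
        self._pending.put(list(batch))    # WAL append happens asynchronously

    def stop(self):
        # Signal the WAL writer to finish and wait for it to drain the queue.
        self._pending.put(None)
        self._worker.join()

    def _drain_wal(self):
        while True:
            batch = self._pending.get()
            if batch is None:
                return
            self.wal.append(batch)
```

A caller would use it as `job.start()`, then any number of `job.write(...)` calls, then `job.stop()`. A real implementation would also need to decide when a write is acknowledged relative to WAL durability, which this sketch leaves out.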

A few internal details:

  1. Each job works against a specific region of the table. How regions are defined is up to the job creator (see point 5 for more details).
  2. On creation, the MemTable has the same set of indexes as the source table, and every time a write happens, the MemTable indexes are updated accordingly.
  3. When the MemTable reaches a configurable size, it triggers a flush operation that writes the MemTable to disk.
  4. We expect an asynchronous process (e.g. a table maintenance process) to continuously merge the flushed MemTables into the source Lance table. After this merge, the MemTable can be dropped from the index.
  5. This job is intended to be integrated with distributed engines such as Ray and Spark that can launch distributed writers. Each "writer" can create a job for its specific region and continuously accept writes.
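The flush and merge lifecycle in points 3 and 4 can be sketched as follows. This is a simplified model under stated assumptions, not the proposed implementation: `ToyRegionWriter`, `merge_flushed`, the row-count threshold `max_rows`, and the use of dicts in place of Lance tables are all hypothetical. Index maintenance (point 2) is omitted.

```python
class ToyRegionWriter:
    """Hypothetical per-region writer; dicts stand in for Lance tables."""

    def __init__(self, region, max_rows):
        self.region = region        # region this writer is responsible for
        self.max_rows = max_rows    # assumed flush threshold, counted in rows
        self.memtable = {}          # active in-memory MemTable
        self.flushed = []           # flushed MemTables, oldest first

    def write(self, key, row):
        self.memtable[key] = row
        # Flush once the MemTable reaches the configured size.
        if len(self.memtable) >= self.max_rows:
            self.flush()

    def flush(self):
        # "Flushing to disk" is modeled as moving the MemTable to a list.
        if self.memtable:
            self.flushed.append(self.memtable)
            self.memtable = {}


def merge_flushed(base, writer):
    """Model of the maintenance step: fold flushed MemTables into the base
    table in flush order, then drop them."""
    while writer.flushed:
        base.update(writer.flushed.pop(0))
```

In this model a distributed engine would create one `ToyRegionWriter` per region, and a separate maintenance task would call `merge_flushed` periodically. The active MemTable is untouched by the merge, so rows written after the last flush remain visible only in memory.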

When reading data, the scanner exposes 2 options:

  1. Use the memwal index: the scanner looks into all MemTables that are flushed but not yet merged, and creates a merged scan plan.
  2. Use a job: the user supplies a running job to the scanner, so that the scanner also gains access to the in-memory MemTable for the merged scan plan.
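To make the two read options concrete, here is a small Python sketch of a merged read. It is illustrative only: the function `merged_scan` and its parameters are hypothetical, dicts keyed by primary key stand in for tables, and the precedence rule (newer data overrides older data for the same key) is an assumption based on the upsert semantics described in this issue.

```python
def merged_scan(base, flushed_memtables=(), job_memtable=None):
    """Return a merged view keyed by primary key.

    base: rows already merged into the source table.
    flushed_memtables: flushed-but-unmerged MemTables, oldest first
        (what the first option would discover).
    job_memtable: the in-memory MemTable of a running job, if supplied
        (what the second option adds).
    """
    merged = dict(base)
    for memtable in flushed_memtables:
        merged.update(memtable)       # later flushes override earlier data
    if job_memtable is not None:
        merged.update(job_memtable)   # in-memory data is assumed to be newest
    return merged
```

Calling `merged_scan(base, flushed)` corresponds to the first option, and passing `job_memtable` as well corresponds to the second. An actual scan plan would merge at the plan level rather than materialize dictionaries.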

Some future work is also listed here but is not in the immediate plan:

  1. Support delete markers in the job (the plan above only allows users to write record batches, similar to merge-insert).

Some identified bug fixes:

  1. primary key field IDs should be ordered
