Implement subjective database #710

Closed
swatanabe opened this issue May 16, 2024 · 0 comments · Fixed by #739

swatanabe commented May 16, 2024

The subjective database does not work as advertised and is currently disabled.

In addition, the defined semantics are problematic, because they do not handle concurrent writes to the subjective database from queries.

Proposed wasm API for managing concurrent access

// Starts a transaction for accessing the subjective database
// - The subjective database may not be read or written unless a transaction is active
// - There is a total ordering of all top-level transactions. Writes made by unsuccessful
//   transactions are not visible to other transactions.
// - Nested transactions are permitted up to some maximum depth, which we need to specify.
void subjectiveCheckout();
// Attempts to commit changes to the subjective databases.
// - If changes were successfully committed, returns true and ends the transaction
// - If changes were not successfully committed, returns false and restarts the transaction
// - If this is a nested transaction, commit always succeeds
bool subjectiveCommit();
// Discards changes made to the subjective database and ends the transaction
void subjectiveAbort();

Expected Usage

subjectiveCheckout();
do {
   auto table = Tables{DbId::subjective}.open<MyTable>();
   // use table
} while(!subjectiveCommit());
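
To make the retry semantics concrete, here is a minimal single-threaded sketch of the loop above wrapped in a helper. Everything in the `mock` namespace is a hypothetical stand-in for the proposed host functions (including `failuresToInject`, which only simulates lost races), and calling `subjectiveAbort` when the body throws is an assumption, not something the proposal specifies.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for the proposed host functions.
namespace mock
{
   using Db = std::map<std::string, std::string>;
   Db             committed;             // state seen by new top-level transactions
   std::deque<Db> stack;                 // one working copy per active (nested) transaction
   int            failuresToInject = 0;  // simulates commits that lose a race

   void subjectiveCheckout() { stack.push_back(stack.empty() ? committed : stack.back()); }

   bool subjectiveCommit()
   {
      assert(!stack.empty());
      if (stack.size() > 1)
      {  // nested commit always succeeds: fold changes into the parent
         Db top = std::move(stack.back());
         stack.pop_back();
         stack.back() = std::move(top);
         return true;
      }
      if (failuresToInject > 0)
      {  // failed commit: the transaction restarts on fresh state
         --failuresToInject;
         stack.back() = committed;
         return false;
      }
      committed = std::move(stack.back());
      stack.pop_back();
      return true;
   }

   void subjectiveAbort()
   {
      assert(!stack.empty());
      stack.pop_back();
   }

   Db& db()
   {
      assert(!stack.empty());
      return stack.back();
   }
}  // namespace mock

// Runs body inside a transaction, retrying until the commit succeeds.
// Assumption: an exception from body aborts the transaction.
template <typename F>
void withSubjectiveTx(F&& body)
{
   mock::subjectiveCheckout();
   try
   {
      do
      {
         body(mock::db());
      } while (!mock::subjectiveCommit());
   }
   catch (...)
   {
      mock::subjectiveAbort();
      throw;
   }
}
```

Because the body may run more than once, any side effects it has outside the subjective database need to be safe to repeat.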

Implementation

The core of the implementation is a new database primitive:

// If self and expected are equal over the range [lower, upper),
// then this range will be overwritten by the corresponding range
// in value.
bool triedent::write_session::compare_exchange_weak(
   std::shared_ptr<root>& self,
   const std::shared_ptr<root>& expected,
   std::shared_ptr<root>&& value,
   std::span<const char> lower,
   std::span<const char> upper);

This can be implemented efficiently by comparing node ids.
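
As a rough illustration of why node identity makes the range comparison cheap, the sketch below uses a toy persistent binary trie over 8-bit keys rather than triedent itself. `Node`, `put`, `get`, `rangeEqual`, `spliceRange`, and `compareExchange` are all hypothetical names, and pointer equality stands in for comparing node ids. Unlike a pure id comparison, this sketch falls back to a structural comparison when pointers differ.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <utility>

struct Node;
using NodePtr = std::shared_ptr<const Node>;

// Hypothetical node of a fixed-depth binary trie over 8-bit keys.
struct Node
{
   NodePtr                    child[2];  // interior levels: subtree for bit 0 / bit 1
   std::optional<std::string> value;     // leaf level only
};
constexpr int kBits = 8;

// A null node stands for an empty subtree.
NodePtr childOf(const NodePtr& n, int bit) { return n ? n->child[bit] : NodePtr{}; }
std::optional<std::string> valueOf(const NodePtr& n)
{
   return n ? n->value : std::optional<std::string>{};
}

// Path-copying insert: returns a new root, sharing every untouched subtree.
NodePtr put(const NodePtr& n, std::uint8_t key, std::string value, int depth = 0)
{
   auto copy = n ? std::make_shared<Node>(*n) : std::make_shared<Node>();
   if (depth == kBits)
      copy->value = std::move(value);
   else
   {
      int bit          = (key >> (kBits - 1 - depth)) & 1;
      copy->child[bit] = put(copy->child[bit], key, std::move(value), depth + 1);
   }
   return copy;
}

std::optional<std::string> get(NodePtr n, std::uint8_t key)
{
   for (int depth = 0; n && depth < kBits; ++depth)
   {
      NodePtr next = n->child[(key >> (kBits - 1 - depth)) & 1];
      n            = std::move(next);
   }
   return valueOf(n);
}

// True if a and b have identical contents for every key in [lo, hi).
bool rangeEqual(const NodePtr& a, const NodePtr& b, unsigned lo, unsigned hi,
                unsigned base = 0, int depth = 0)
{
   unsigned span = 1u << (kBits - depth);  // number of keys below this node
   if (hi <= base || base + span <= lo)
      return true;  // subtree lies outside the range
   if (a == b)
      return true;  // same node: the whole subtree is shared, skip it
   if (depth == kBits)
      return valueOf(a) == valueOf(b);
   return rangeEqual(childOf(a, 0), childOf(b, 0), lo, hi, base, depth + 1) &&
          rangeEqual(childOf(a, 1), childOf(b, 1), lo, hi, base + span / 2, depth + 1);
}

// Result equals from inside [lo, hi) and into everywhere else.
NodePtr spliceRange(const NodePtr& into, const NodePtr& from, unsigned lo, unsigned hi,
                    unsigned base = 0, int depth = 0)
{
   unsigned span = 1u << (kBits - depth);
   if (hi <= base || base + span <= lo || into == from)
      return into;
   if (lo <= base && base + span <= hi)
      return from;  // subtree fully inside the range: reuse it wholesale
   auto n      = std::make_shared<Node>();
   n->child[0] = spliceRange(childOf(into, 0), childOf(from, 0), lo, hi, base, depth + 1);
   n->child[1] = spliceRange(childOf(into, 1), childOf(from, 1), lo, hi, base + span / 2, depth + 1);
   return n;
}

// If self and expected agree on [lo, hi), overwrite that range of self from value.
bool compareExchange(NodePtr& self, const NodePtr& expected, const NodePtr& value,
                     unsigned lo, unsigned hi)
{
   assert(lo <= hi && hi <= (1u << kBits));
   if (!rangeEqual(self, expected, lo, hi))
      return false;
   self = spliceRange(self, value, lo, hi);
   return true;
}
```

Since unchanged subtrees are shared between roots, the comparison only descends along paths that were actually modified or that straddle a range boundary.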

While a transaction is active, we keep track of the key ranges accessed by every database operation. subjectiveCommit iterates over those ranges and runs compare_exchange_weak on each one. This loop requires holding a lock on the subjective database.
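
The range tracking and locked commit described above could look roughly like the following sketch. It models the database as a `std::map` and copies whole snapshots, which the real implementation would avoid. `SubjectiveDb`, `Tx`, and their members are hypothetical names. As a simplification, it validates every range first and then applies them all, instead of calling a compare-exchange primitive per range.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

using Db = std::map<std::string, std::string>;

struct SubjectiveDb
{
   std::mutex mutex;  // held for the whole commit loop
   Db         head;   // latest committed state
};

// Contents of db restricted to the key range [lo, hi).
Db slice(const Db& db, const std::string& lo, const std::string& hi)
{
   assert(lo <= hi);
   return Db(db.lower_bound(lo), db.lower_bound(hi));
}

class Tx
{
  public:
   explicit Tx(SubjectiveDb& db) : db_(db)
   {
      std::lock_guard<std::mutex> lock(db_.mutex);
      reset();
   }

   std::string get(const std::string& key)
   {
      track(key);
      auto it = work_.find(key);
      return it == work_.end() ? std::string{} : it->second;
   }

   void put(const std::string& key, std::string value)
   {
      track(key);
      work_[key] = std::move(value);
   }

   // Returns true on success; on conflict, restarts on fresh state and returns false.
   bool commit()
   {
      std::lock_guard<std::mutex> lock(db_.mutex);
      bool                        unchanged = true;
      for (const auto& [lo, hi] : ranges_)
         unchanged = unchanged && slice(db_.head, lo, hi) == slice(base_, lo, hi);
      if (!unchanged)
      {
         reset();
         return false;
      }
      for (const auto& [lo, hi] : ranges_)
      {  // overwrite each accessed range with this transaction's version
         db_.head.erase(db_.head.lower_bound(lo), db_.head.lower_bound(hi));
         Db mine = slice(work_, lo, hi);
         db_.head.insert(mine.begin(), mine.end());
      }
      return true;
   }

  private:
   // A single key k is recorded as the half-open range [k, k + '\0').
   void track(const std::string& key) { ranges_.emplace_back(key, key + '\0'); }

   // Caller must hold db_.mutex.
   void reset()
   {
      base_ = db_.head;
      work_ = base_;
      ranges_.clear();
   }

   SubjectiveDb&                                    db_;
   Db                                               base_;  // snapshot taken at checkout
   Db                                               work_;  // snapshot plus local writes
   std::vector<std::pair<std::string, std::string>> ranges_;
};
```

Transactions that touch disjoint ranges both commit, while a transaction whose accessed range changed underneath it is restarted.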

Several alternative implementations can work with the same interface:

  • Checkout locks, commit always succeeds
  • Don't track ranges. Fail commit on any concurrent change, but acquire a lock after one failure.
  • Dynamically choose a locking strategy based on context
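
As one possible reading of the second alternative, the sketch below drops range tracking in favor of a single version counter: the first attempt is optimistic, and a failed commit keeps the lock so that the retry cannot fail. The names `SubjectiveDb`, `Tx`, `version`, and `holdsLock` are hypothetical, and the whole-map copies are only for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>

using Db = std::map<std::string, std::string>;

struct SubjectiveDb
{
   std::mutex    mutex;
   Db            head;
   std::uint64_t version = 0;  // bumped on every successful commit
};

class Tx
{
  public:
   explicit Tx(SubjectiveDb& db) : db_(db)
   {
      std::lock_guard<std::mutex> lock(db_.mutex);
      snapshot();
   }

   Db&  data() { return work_; }
   bool holdsLock() const { return held_.owns_lock(); }

   bool commit()
   {
      // Reuse the lock kept from a failed attempt, otherwise lock now.
      std::unique_lock<std::mutex> lock    = std::move(held_);
      const bool                   hadLock = lock.owns_lock();
      if (!hadLock)
         lock = std::unique_lock<std::mutex>(db_.mutex);

      if (db_.version != baseVersion_)
      {
         assert(!hadLock);  // nobody else can commit while we hold the lock
         snapshot();        // restart on fresh state
         held_ = std::move(lock);  // keep the lock so the retry succeeds
         return false;
      }
      db_.head = std::move(work_);
      ++db_.version;
      return true;  // lock is released when it goes out of scope
   }

  private:
   // Caller must hold db_.mutex.
   void snapshot()
   {
      work_        = db_.head;
      baseVersion_ = db_.version;
   }

   SubjectiveDb&                db_;
   Db                           work_;
   std::uint64_t                baseVersion_ = 0;
   std::unique_lock<std::mutex> held_;  // owns the mutex only after a failed commit
};
```

The trade-off is that unrelated writers conflict with each other, but each transaction body runs at most twice.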

Rejected designs

Commit subjective data with regular transactions.

This is the simplest model. However, it is ruled out because it is too easy to prevent all parallelism. If we look up available time at the start of a query to set the timer and then bill available time at the end of the query, this will bracket every query in a way that forces serial execution.

Commit subjective data when returning from a subjective service

This will probably do the right thing most of the time, but it isn't obvious from reading the code when the commit actually happens. Logic errors from user misunderstanding are likely.

Pass a DbId to the transaction functions and manage each subjective database independently

At the moment there is only one subjective database. If there are ever other databases (e.g. nativeSubjective) with the same semantics, it makes more sense to commit them as a group than independently. Databases that are not accessed in a transaction will not significantly affect performance.

Implicitly start a transaction on access

This simplifies user code, because only commit is needed. The problem is that it's too easy to start accessing the database outside the retry loop.

James-Mart added this to the R2 milestone Jun 17, 2024
James-Mart added the "Node infra" label (Related to the necessary infrastructure provided by all psibase infrastructure providers) Jun 17, 2024