Database sharding #16629

Open
5 of 8 tasks
gtamasi opened this issue Oct 13, 2023 · 1 comment
Labels
type: feature For issues and PRs. For new features. Never breaking changes.

Comments

@gtamasi

gtamasi commented Oct 13, 2023

Issue Creation Checklist

  • I understand that my issue will be automatically closed if I don't fill in the requested information
  • I have read the contribution guidelines

Feature Description

Describe the feature you'd like to see implemented

As a feature, Sequelize could handle a distributed environment with multiple database instances (shards) while the underlying schema is the same for each shard. This environment would consist of multiple primary databases along with their read replicas.

To achieve this, Sequelize needs to:

  • be able to establish a connection to a database based on a provided shard identifier
  • start a transaction on the right shard
  • maintain a connection pool from which a connection to a specific shard can be acquired, released, etc.
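
The first requirement above, resolving a connection target from a shard identifier, could be sketched roughly as follows. This is only an illustration: `ShardRegistry`, `ShardConfig`, `ConnectionConfig` and `resolve` are hypothetical names that do not exist in Sequelize, and treating `readConfig` as an array of replicas is an assumption.

```typescript
// Hypothetical types; none of these names exist in Sequelize today.
interface ConnectionConfig {
  host: string;
  port: number;
}

interface ShardConfig {
  shardId: string;
  writeConfig: ConnectionConfig; // primary database
  readConfig: ConnectionConfig[]; // read replicas (assumed to be a list)
}

type QueryType = "read" | "write";

class ShardRegistry {
  private readonly shards = new Map<string, ShardConfig>();
  private readonly nextReplica = new Map<string, number>();

  constructor(shards: ShardConfig[]) {
    for (const shard of shards) {
      if (this.shards.has(shard.shardId)) {
        throw new Error(`Duplicate shardId: ${shard.shardId}`);
      }
      this.shards.set(shard.shardId, shard);
    }
  }

  // Returns the connection config to use for a query on the given shard.
  resolve(shardId: string, type: QueryType): ConnectionConfig {
    const shard = this.shards.get(shardId);
    if (!shard) {
      throw new Error(`Unknown shardId: ${shardId}`);
    }
    // Writes go to the primary; so do reads when no replica is configured.
    if (type === "write" || shard.readConfig.length === 0) {
      return shard.writeConfig;
    }
    // Round-robin across the shard's read replicas.
    const index = this.nextReplica.get(shardId) ?? 0;
    this.nextReplica.set(shardId, (index + 1) % shard.readConfig.length);
    return shard.readConfig[index];
  }
}
```

Failing loudly on an unknown `shardId` is a deliberate choice in this sketch: silently falling back to a default shard would write data to the wrong database.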

Example

import { Sequelize } from "@sequelize/core";

const sequelize = new Sequelize({
  dialect: "postgres",

  // The sharding option contains the configuration for each shard.
  // Sharding can be disabled by setting the sharding option to false.
  // When the sharding option is set, Sequelize disregards the replication option.
  sharding: {
    shards: [
      {
        shardId: "shard-0",
        writeConfig: ..., // primary database config
        readConfig: ..., // read replica config
      },
      {
        shardId: "shard-1",
        writeConfig: ..., // primary database config
        readConfig: ..., // read replica config
      },
    ],
  },
  replication: false,
});

// Use application logic to determine which database shard the Simpson family lives in, and pass it to the transaction

const transaction = await sequelize.startUnmanagedTransaction({
  // Pass the identifier of the shard on which the transaction should be started
  shardId: "shard-0",
});

// Use the transaction bound to the selected shard
try {
  const user = await User.create(
    {
      firstName: "Bart",
      lastName: "Simpson",
    },
    { transaction }
  );
  await transaction.commit();
} catch (error) {
  // custom error handling and rollback
  await transaction.rollback();
}
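
The "application logic" that picks a shard in the example is left open by this proposal. One common approach is to hash a sharding key, sketched below under stated assumptions: `shardIdFor`, `fnv1a` and `SHARD_IDS` are hypothetical names, the hash is an FNV-1a-style function over UTF-16 code units, and the key (for instance a family or tenant ID) is assumed to be a string.

```typescript
// Hypothetical helper: maps a sharding key to one of the configured shards.
const SHARD_IDS: readonly string[] = ["shard-0", "shard-1"];

// FNV-1a-style 32-bit hash computed over UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  // Convert the signed 32-bit result to an unsigned integer.
  return hash >>> 0;
}

// The same key always maps to the same shard for a fixed shard list.
function shardIdFor(key: string, shardIds: readonly string[] = SHARD_IDS): string {
  if (shardIds.length === 0) {
    throw new Error("No shards configured");
  }
  return shardIds[fnv1a(key) % shardIds.length];
}
```

The result would then be passed as the `shardId` when starting the transaction. Note that plain modulo hashing remaps most keys when the number of shards changes, so a lookup table or consistent hashing may fit better if shards are expected to be added later.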

Possible ideas:

After looking through the codebase, one concrete idea is to extend the capabilities of ReplicationPool (abstract/replication-pool.ts) by adding a new replication pool that also maintains the shard configuration.
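
As a rough illustration of that idea, a shard-aware pool could keep one read pool and one write pool per shard and route acquire/release calls by shard identifier. The sketch below is standalone and does not reflect the real ReplicationPool API: `ShardedPool`, `SimplePool` and `Connection` are hypothetical, and connections are plain objects rather than real database connections.

```typescript
// Hypothetical sketch; Sequelize's actual ReplicationPool has a different API.
type PoolType = "read" | "write";

interface Connection {
  shardId: string;
  type: PoolType;
  id: number;
}

// A tiny synchronous pool that reuses released connections up to a maximum.
class SimplePool {
  private readonly idle: Connection[] = [];
  private created = 0;
  private readonly shardId: string;
  private readonly type: PoolType;
  private readonly max: number;

  constructor(shardId: string, type: PoolType, max: number) {
    this.shardId = shardId;
    this.type = type;
    this.max = max;
  }

  acquire(): Connection {
    const reused = this.idle.pop();
    if (reused) return reused;
    if (this.created >= this.max) {
      throw new Error(`Pool exhausted for ${this.shardId}/${this.type}`);
    }
    this.created += 1;
    return { shardId: this.shardId, type: this.type, id: this.created };
  }

  release(connection: Connection): void {
    this.idle.push(connection);
  }
}

// Routes acquire/release to the read or write pool of the requested shard.
class ShardedPool {
  private readonly pools = new Map<string, Record<PoolType, SimplePool>>();

  constructor(shardIds: string[], maxPerPool = 5) {
    for (const shardId of shardIds) {
      this.pools.set(shardId, {
        read: new SimplePool(shardId, "read", maxPerPool),
        write: new SimplePool(shardId, "write", maxPerPool),
      });
    }
  }

  acquire(shardId: string, type: PoolType): Connection {
    return this.poolFor(shardId, type).acquire();
  }

  release(connection: Connection): void {
    this.poolFor(connection.shardId, connection.type).release(connection);
  }

  private poolFor(shardId: string, type: PoolType): SimplePool {
    const shardPools = this.pools.get(shardId);
    if (!shardPools) throw new Error(`Unknown shardId: ${shardId}`);
    return shardPools[type];
  }
}
```

Tagging each connection with its `shardId` and pool type lets `release` find the owning pool without the caller having to repeat the shard identifier.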

Describe why you would like this feature to be added to Sequelize

We are currently working on a sharding project that splits our primary Postgres database into multiple shards and distributes the data across them. Although we will have multiple database instances (shards), we plan for the application to configure and instantiate only one Sequelize instance. Because we have decided to adopt a sharded database, our application layer has to determine the right shard and commit each transaction to it. At present, Sequelize can manage only one primary database with its read replicas, so we would like to extend it to manage database shards.

After a quick search, I've found previous issues requesting this sort of feature.

#10154
#3806
#263

I deeply believe this feature will be appreciated across the community and bring extra value to Sequelize.

While we aim to tackle this problem ourselves, we are also committed to getting this feature approved and upstreamed in the future.

Is this feature dialect-specific?

  • No. This feature is relevant to Sequelize as a whole.
  • Yes. This feature only applies to the following dialect(s):

Would you be willing to resolve this issue by submitting a Pull Request?

  • Yes, I have the time and I know how to start.
  • Yes, I have the time but I will need guidance (some feedback would be appreciated).
  • No, I don't have the time, but my company or I are supporting Sequelize through donations on OpenCollective.
  • No, I don't have the time, and I understand that I will need to wait until someone from the community or maintainers is interested in implementing my feature.

Indicate your interest in the addition of this feature by adding the 👍 reaction. Comments such as "+1" will be removed.

@gtamasi gtamasi added pending-approval Bug reports that have not been verified yet, or feature requests that have not been accepted yet type: feature For issues and PRs. For new features. Never breaking changes. labels Oct 13, 2023
@ephys ephys removed the pending-approval Bug reports that have not been verified yet, or feature requests that have not been accepted yet label Feb 2, 2024
@Bytedefined

Would love this
