Introduce sharding rules to MongoDB collections #673

Closed
sejongk opened this issue Nov 9, 2023 · 0 comments · Fixed by #776
Labels: enhancement 🌟 (New feature or request), hard 🧑‍🔬 (Difficult to deal with or require research), protocol changed 📝 (Whether the protocol has changed), sdk ⚒️


sejongk commented Nov 9, 2023

Last updated at January 31, 2024
Last edited by @sejongk
If you find something outdated, please feel free to add to or correct this issue.

What would you like to be added:
Support the MongoDB sharding feature in Yorkie. The previous issue on this topic, #530, contains the specification and the initial discussion.

This issue focuses on sharing the up-to-date sharding-related discussions and implementation progress, with the goal of getting more contributors involved in the work.

Why is this needed:
The Yorkie cluster stores most of its data in MongoDB, so most of the load is concentrated on the DB cluster rather than on the application servers. For real-world service scenarios, it is necessary to support sharding to distribute data and query loads across horizontally scalable DB clusters.

1. Background Knowledge

1.1. What is Sharding?

When large volumes of data must be stored or high query throughput is required, data can be divided into several partitions and distributed to several nodes. This method is called “Sharding” or “Partitioning”. Sharding enables horizontal scaling since each shard only has to manage a subset of the data of a sharded collection. As a result, data and query loads can be distributed evenly across nodes.

1.2. Sharded Cluster Components

A MongoDB sharded cluster consists of the following components.

  1. shard: Each shard contains a subset of the sharded data.
  2. mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
  3. config servers: Config servers store metadata and configuration settings for the cluster.

For a production deployment, consider the following to ensure data redundancy and system availability.

  • Config Server (3-member replica set): config1, config2, config3
  • 3 Shards (each a 3-member replica set):
    • shard1-1, shard1-2, shard1-3
    • shard2-1, shard2-2, shard2-3
    • shard3-1, shard3-2, shard3-3
  • 2 Mongos: mongos1, mongos2
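
For illustration, here is a minimal sketch of how an application could connect to such a cluster through the mongos routers using the official Go driver. The hostnames follow the example deployment above and are placeholders:

package main

import (
    "context"
    "log"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
    ctx := context.Background()
    // Connect through the mongos routers; the driver distributes operations between them.
    client, err := mongo.Connect(ctx, options.Client().
        ApplyURI("mongodb://mongos1:27017,mongos2:27017"))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Disconnect(ctx)

    // All reads and writes go through mongos, which routes them to the shards.
    if err := client.Ping(ctx, nil); err != nil {
        log.Fatal(err)
    }
}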

1.3. Shard Key Selection

It is important to choose a good shard key to distribute data evenly across shards and to achieve efficient, high-performance operations.

1.3.1. Shard Key Selection Criteria

Note that no single criterion below, on its own, guarantees an even distribution of data across the sharded cluster. Therefore, all of the criteria should be considered together when deciding on appropriate shard keys.

  1. High Shard Key Cardinality
    • Cardinality originally represents the number of elements within a set of values. Shard key cardinality determines the maximum number of chunks the balancer can create.
    • For example, if the cardinality of a field is 7, it means there can be no more than 7 chunks within the sharded cluster, each storing one unique shard key value. Adding more than 7 shards would not provide any benefit.
  2. Low Shard Key Frequency
    • Shard key frequency represents how often a given shard key value occurs in the data.
  3. Non-monotonically Changing Shard Keys
    • A monotonically changing shard key means that the shard key value is always increasing or always decreasing.
    • Common examples are timestamps and ObjectId values.
  4. Shard Keys Suitable for Query Patterns
    • Consider the most common query patterns and whether a given shard key covers them, in order to avoid “scatter-gather” queries.

1.3.2. Issues with Poorly Selected Shard Keys

If the selected shard key has low cardinality, high value frequency, or is monotonically changing, the cluster may experience the following potential issues.

  1. Uneven load distribution
    • Shard keys with low cardinality, high value frequency, or a monotonically changing property may cause data to concentrate on a small number of chunks and shards.
      • Low cardinality: Each unique shard key value can exist on no more than a single chunk at any given time. When the number of possible values is small, then just a few chunks are created for each possible value and can become a bottleneck.
      • High value frequency: If the majority of documents contain only a subset of the possible shard key values, then the chunks storing the documents with those values can become a bottleneck.
      • Monotonically changing values: Inserts are more likely to be directed to a single chunk within the cluster. For example, all new inserts are routed to the chunk with maxKey (or minKey) as its upper (or lower) bound. Until MongoDB eventually rebalances the data, the cluster directs insert operations to only a single shard at any given moment, creating an insert throughput bottleneck.
    • Chunks can grow beyond the specified chunk size; such chunks are called jumbo chunks. Jumbo chunks can become a performance bottleneck as they continue to grow.
  2. Scatter-gather query
    • In a sharded cluster, if a query includes the shard key or the prefix of a compound shard key, the mongos routes to only the shards that contain the relevant data.
    • On the other hand, if a query doesn't contain the shard key, it is broadcast to every shard for evaluation. This makes execution less efficient and prevents the system from scaling linearly as more shards are added.

1.3.3. Solutions for Issues with Poorly Selected Shard Keys

  1. Make shard keys have high cardinality and low value frequency
    1. Check the properties of fields that are going to be used as shard keys
    2. Use a compound shard key, consisting of two or more fields, to increase cardinality and to include a unique or low-frequency value.
  2. Make shard keys non-monotonically changing
    1. Use hashed sharding to map original values into a wide range of hashed values.
    2. Split and migrate chunks via the balancer or manually (especially the chunk bounded by maxKey (or minKey)), as sketched after this list.
  3. Make shard keys suitable for query patterns
    1. Avoid using a hashed shard key for range-based queries.
    2. Identify the fields included in the most performance-critical queries.
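
As a sketch of solution 2.2 above, chunks can be split and migrated manually with the split and moveChunk admin commands. This assumes an admin connection through mongos, the yorkie-meta.documents namespace that appears later in section 5.1, and a placeholder target shard name and split point:

// Assumes client is a *mongo.Client connected through mongos, and that the
// "context" and "go.mongodb.org/mongo-driver/bson" packages are imported.
func splitAndMoveChunk(ctx context.Context, client *mongo.Client, splitPoint interface{}) error {
    admin := client.Database("admin")

    // Split the chunk containing splitPoint into two chunks at that point.
    if err := admin.RunCommand(ctx, bson.D{
        {Key: "split", Value: "yorkie-meta.documents"},
        {Key: "middle", Value: bson.D{{Key: "project_id", Value: splitPoint}}},
    }).Err(); err != nil {
        return err
    }

    // Migrate the chunk containing splitPoint to another shard (shard name is a placeholder).
    return admin.RunCommand(ctx, bson.D{
        {Key: "moveChunk", Value: "yorkie-meta.documents"},
        {Key: "find", Value: bson.D{{Key: "project_id", Value: splitPoint}}},
        {Key: "to", Value: "shard2"},
    }).Err()
}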

1.4. Sharding Approach Selection

There are mainly two sharding approaches: hashed sharding and ranged sharding.

  1. Hashed sharding
    1. provides a more even data distribution across the sharded cluster, thanks to the property of hashing
    2. resolves distribution issues related to monotonically changing fields.
    3. is inefficient due to scatter-gather operations when range queries over a contiguous range are commonly used.
  2. Ranged sharding
    1. divides data into contiguous ranges by the shard key values.
    2. locates documents with close shard key values in the same chunk or shard.
    3. prevents scatter-gather executions for a read operation within a contiguous range.

I believe storing related data in the same chunk involves a trade-off between ranged query performance and the creation of hot chunks. There is no silver bullet for this, so a sharding approach should be determined with the various factors above in mind.
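
As a rough sketch of how the two approaches are specified (the test.orders and test.events namespaces and the user_id field are hypothetical), both use the same shardCollection admin command and differ only in the key specification:

// Assumes client is a *mongo.Client connected through mongos.
func shardByApproach(ctx context.Context, client *mongo.Client) error {
    admin := client.Database("admin")

    // Ranged sharding: documents with close user_id values land in the same chunk.
    if err := admin.RunCommand(ctx, bson.D{
        {Key: "shardCollection", Value: "test.orders"},
        {Key: "key", Value: bson.D{{Key: "user_id", Value: 1}}},
    }).Err(); err != nil {
        return err
    }

    // Hashed sharding: values are spread evenly, but range queries hit every shard.
    return admin.RunCommand(ctx, bson.D{
        {Key: "shardCollection", Value: "test.events"},
        {Key: "key", Value: bson.D{{Key: "user_id", Value: "hashed"}}},
    }).Err()
}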

1.5. Specifying Unique Constraints

1.5.1. Unique Constraints on Indexes

A unique constraint on an index ensures that at most one document in a collection has a given value for the indexed field. MongoDB can enforce a uniqueness constraint via the unique: true option on a ranged shard key index.

However, a unique constraint cannot be specified on a hashed index due to potential hash collisions on the keys. Instead, an additional non-hashed secondary index with a unique constraint can be created; MongoDB can then use that non-hashed index to enforce uniqueness on the chosen field.
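
For example, a rough sketch with the Go driver (the collection and field names are hypothetical): the hashed index itself cannot be unique, so a separate non-hashed index carries the unique constraint.

// Assumes coll is a *mongo.Collection and that "context", the bson package,
// mongo, and mongo/options are imported.
func createHashedAndUniqueIndexes(ctx context.Context, coll *mongo.Collection) error {
    _, err := coll.Indexes().CreateMany(ctx, []mongo.IndexModel{
        // Hashed index used as the basis of hashed sharding; unique is not allowed here.
        {Keys: bson.D{{Key: "key", Value: "hashed"}}},
        // Non-hashed secondary index that actually enforces uniqueness on the field.
        {Keys: bson.D{{Key: "key", Value: 1}}, Options: options.Index().SetUnique(true)},
    })
    return err
}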

1.5.2. Unique Constraints for Sharded collections

For a ranged sharded collection, MongoDB doesn't support unique indexes across shards, because insert and indexing operations are local to each shard, except for the following indexes:

  • the index on the shard key
  • a compound index where the shard key is a prefix
    • Although you can have a unique compound index where the shard key is a prefix, when the unique parameter is used, the collection must have a unique index that is on the shard key.
  • the default _id index; however, the _id index only enforces the uniqueness constraint per shard if the _id field is not the shard key or the prefix of the shard key.
    • Although the _id index enforces the uniqueness constraint globally in unsharded clusters, if the _id field is not the shard key or the prefix of the shard key, the _id index only enforces the uniqueness constraint per shard and not across shards.

Through the use of a unique index on the shard key, MongoDB enforces uniqueness on the entire key combination and not on individual components of the shard key. For example, with the shard key {x: 1, y: 1}, the entire combination (x, y) has a unique constraint, but each individual field x or y does not.
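
A small sketch illustrating this behavior, assuming coll is a *mongo.Collection with a unique index on {x: 1, y: 1} and that "context", bson, mongo, and log are imported:

// Only the (x, y) combination must be unique; repeating x or y alone is allowed.
func demoCompoundUniqueness(ctx context.Context, coll *mongo.Collection) {
    _, _ = coll.InsertOne(ctx, bson.D{{Key: "x", Value: 1}, {Key: "y", Value: 1}})   // inserted
    _, _ = coll.InsertOne(ctx, bson.D{{Key: "x", Value: 1}, {Key: "y", Value: 2}})   // inserted: same x, different y
    _, err := coll.InsertOne(ctx, bson.D{{Key: "x", Value: 1}, {Key: "y", Value: 1}}) // duplicate (x, y): rejected
    if err != nil {
        log.Println("duplicate key error:", err)
    }
}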

1.6. Data Partitioning with Chunks

Data is partitioned into chunks, each owned by a specific shard. A chunk consists of a contiguous range of sharded data. Initially, for an empty collection, one or more chunks are created depending on the sharding approach. After the initial chunk setup, the balancer migrates chunks across shards when necessary to keep data evenly distributed.

The default range size (i.e., chunk size) is 128 MB and is adjustable between 1 MB and 1024 MB. The chunk size can also be set for a specific collection via the configureCollectionBalancing option. Keep in mind that small ranges lead to a more even distribution at the expense of more frequent migrations, which incur overhead in the networking and query routing layers.
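
For instance, a hedged sketch of setting a per-collection chunk size with the configureCollectionBalancing command (available since MongoDB 5.3; the namespace and size are illustrative):

// Assumes client is a *mongo.Client connected through mongos.
func setCollectionChunkSize(ctx context.Context, client *mongo.Client) error {
    // Set the chunk (range) size of a single collection to 256 MB.
    return client.Database("admin").RunCommand(ctx, bson.D{
        {Key: "configureCollectionBalancing", Value: "yorkie-meta.documents"},
        {Key: "chunkSize", Value: 256},
    }).Err()
}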

1.7. Sharded Cluster Balancer

1.7.1. Balancer

The balancer is a background process that manages data migrations. If the difference in the amount of data for a single collection between the largest and smallest shards exceeds the migration threshold, the balancer begins migrating data across the cluster to ensure an even distribution. The migration threshold is three times the configured range size. For the default range size of 128 MB, two shards must differ in data size for a given collection by at least 384 MB for a migration to occur.

1.7.2. Relationship between Chunk Split and Chunk Migration

Starting in MongoDB 6.0, a sharded cluster only splits chunks when chunks must be migrated. This means a chunk's size may exceed the configured chunk size; for example, you might see a 1 TB chunk on a shard even though the chunk size is set to 256 MB. This is because larger chunks reduce the number of chunks on a shard and improve performance by reducing the time to update the shard metadata.
However, by default, MongoDB cannot move a range if the number of documents in the range is greater than 2 times the result of dividing the configured range size by the average document size. Therefore, it is better to prevent chunks from becoming jumbo at an early stage.
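
To observe the balancer described above, its state can be checked with the balancerStatus admin command; a minimal sketch with the Go driver (assumes "context", fmt, bson, and mongo are imported):

func printBalancerStatus(ctx context.Context, client *mongo.Client) error {
    var status bson.M
    // balancerStatus reports whether the balancer is enabled and currently running a round.
    if err := client.Database("admin").RunCommand(ctx, bson.D{
        {Key: "balancerStatus", Value: 1},
    }).Decode(&status); err != nil {
        return err
    }
    fmt.Printf("mode=%v inBalancerRound=%v\n", status["mode"], status["inBalancerRound"])
    return nil
}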

2. Considerations

2.1. Relations between Collections

  1. Cluster-wide: users, projects
  2. Project-wide: documents, clients
  3. Document-wide: changes, snapshots, syncedseqs
(Screenshot: diagram of the relations between the cluster-wide, project-wide, and document-wide collections)

2.2. Goals

  • Shard the project-wide and document-wide collections due to the large number of documents in each collection:
    • Cluster-wide: less than 10,000
    • Project-wide: more than 1 million
    • Document-wide: more than 100 million

2.3. Unique Constraint Requirements

  1. Documents: (project_id, key) with removed_at: null
  2. Clients: (project_id, key)
  3. Changes: (doc_id, server_seq)
  4. Snapshots: (doc_id, server_seq)
  5. Syncedseqs: (doc_id, client_id)

2.4. Main Query Patterns

Project-wide collections

Project-wide collections are mostly queried with range queries that include a project_id filter.

Clients

cursor, err := c.collection(ColClients).Find(ctx, bson.M{
    "project_id": project.ID,
    "status":     database.ClientActivated,
    "updated_at": bson.M{
        "$lte": gotime.Now().Add(-clientDeactivateThreshold),
    },
}, options.Find().SetLimit(int64(candidatesLimit)))

Documents

filter := bson.M{
    "project_id": bson.M{
        "$eq": projectID,
    },
    "removed_at": bson.M{
        "$exists": false,
    },
}
if paging.Offset != "" {
    k := "$lt"
    if paging.IsForward {
        k = "$gt"
    }
    filter["_id"] = bson.M{
        k: paging.Offset,
    }
}

opts := options.Find().SetLimit(int64(paging.PageSize))
if paging.IsForward {
    opts = opts.SetSort(map[string]int{"_id": 1})
} else {
    opts = opts.SetSort(map[string]int{"_id": -1})
}

cursor, err := c.collection(ColDocuments).Find(ctx, filter, opts)

Document-wide collections

Document-wide collections are mostly queried with range queries that include a doc_id filter.

Changes

cursor, err := c.collection(colChanges).Find(ctx, bson.M{
    "doc_id": encodedDocID,
    "server_seq": bson.M{
        "$gte": from,
        "$lte": to,
    },
}, options.Find())

Snapshots

result := c.collection(colSnapshots).FindOne(ctx, bson.M{
    "doc_id": encodedDocID,
    "server_seq": bson.M{
        "$lte": serverSeq,
    },
}, option)

3. Sharding Rules

3.1. Selected Shard Keys and Approaches

Select shard keys based on the query patterns and properties (cardinality, frequency) of keys.

  1. Project-wide: project_id, ranged
  2. Document-wide: doc_id, ranged

In addition, every unique constraint can be satisfied because each has the shard key as a prefix (a sketch of applying these rules follows the list below).

  1. Documents: (project_id, key) with removed_at: null
  2. Clients: (project_id, key)
  3. Changes: (doc_id, server_seq)
  4. Snapshots: (doc_id, server_seq)
  5. Syncedseqs: (doc_id, client_id)
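
A minimal sketch of how these rules might be applied with the Go driver. The yorkie-meta database name comes from the log in section 5.1; treat the exact namespaces as illustrative. It shards each collection with the ranged keys above and creates one of the unique indexes as an example:

// Assumes client is a *mongo.Client connected through mongos, and that
// "context", bson, mongo, and mongo/options are imported.
func applyShardingRules(ctx context.Context, client *mongo.Client) error {
    admin := client.Database("admin")

    shardKeys := []struct {
        ns  string
        key bson.D
    }{
        {"yorkie-meta.clients", bson.D{{Key: "project_id", Value: 1}}},
        {"yorkie-meta.documents", bson.D{{Key: "project_id", Value: 1}}},
        {"yorkie-meta.changes", bson.D{{Key: "doc_id", Value: 1}}},
        {"yorkie-meta.snapshots", bson.D{{Key: "doc_id", Value: 1}}},
        {"yorkie-meta.syncedseqs", bson.D{{Key: "doc_id", Value: 1}}},
    }
    for _, sk := range shardKeys {
        if err := admin.RunCommand(ctx, bson.D{
            {Key: "shardCollection", Value: sk.ns},
            {Key: "key", Value: sk.key},
        }).Err(); err != nil {
            return err
        }
    }

    // Example unique index for changes: (doc_id, server_seq) has the shard key as a prefix.
    _, err := client.Database("yorkie-meta").Collection("changes").Indexes().CreateOne(ctx, mongo.IndexModel{
        Keys:    bson.D{{Key: "doc_id", Value: 1}, {Key: "server_seq", Value: 1}},
        Options: options.Index().SetUnique(true),
    })
    return err
}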

3.2. Changes of Reference Keys

Since the uniqueness of _id isn't guaranteed across shards, the reference keys used to identify a single record in each collection should be changed:

  1. Documents: _id -> (project_id, _id)
  2. Clients: _id -> (project_id, _id)
  3. Changes: _id -> (project_id, doc_id, server_seq)
  4. Snapshots: _id -> (project_id, doc_id, server_seq)
  5. Syncedseqs: _id -> (project_id, doc_id, client_id)

Considering that MongoDB ensures the uniqueness of _id per shard, Documents and Clients can be identified by the combination of project_id and _id. Following this change, the reference keys of the document-wide collections change as well.
(Screenshot: diagram of the changed reference keys)
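
A hedged sketch of what a lookup by the new compound reference key could look like, mirroring the query style in section 2.4 (c.collection and the variable names are illustrative):

// Fetch a single document by (project_id, _id) instead of _id alone,
// since _id is only unique per shard.
result := c.collection(ColDocuments).FindOne(ctx, bson.M{
    "project_id": projectID,
    "_id":        docID,
})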

3.3. Relations between Collections

(Screenshot: diagram of the relations between collections after the reference key changes)

4. Kubernetes Deployment

It is desirable to provide a Helm chart for handy deployment of sharded clusters in Kubernetes.
(Screenshot: Kubernetes deployment diagram of the sharded cluster)

5. Performance Optimization

5.1. Effectiveness of Query Execution

After deploying a sharded cluster in production, it is necessary to measure the exact performance improvement compared to a standalone deployment.
It is also possible to check whether the selected shard key is appropriate by examining query execution plans. The following log shows that the query is executed through an index scan.

{
  "t": {
    "$date": "2024-01-26T11:58:08.140+00:00"
  },
  "attr": {
    "type": "command",
    "ns": "yorkie-meta.documents",
    "command": {
      "find": "documents",
      "filter": {
        "_id": {
          "$lt": {
            "$oid": "647c0ce0a7b73..."
          }
        },
        "project_id": {
          "$eq": {
            "$oid": "62f1f818a1e65..."
          }
        },
        "removed_at": {
          "$exists": false
        }
      },
      "sort": {
        "_id": -1
      },
    },
    "planSummary": "IXSCAN { _id: 1 }",
    "keysExamined": 38,
    "docsExamined": 38,

    "durationMillis": 2
  }
}
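
Such a plan can also be requested directly with the explain command; a rough sketch using the Go driver (the database, collection, and filter are illustrative):

// Assumes client is a *mongo.Client and that "context" and bson are imported.
func explainDocumentsQuery(ctx context.Context, client *mongo.Client, projectID interface{}) (bson.M, error) {
    var plan bson.M
    err := client.Database("yorkie-meta").RunCommand(ctx, bson.D{
        {Key: "explain", Value: bson.D{
            {Key: "find", Value: "documents"},
            {Key: "filter", Value: bson.D{{Key: "project_id", Value: projectID}}},
        }},
        // executionStats includes keysExamined/docsExamined, like the log above.
        {Key: "verbosity", Value: "executionStats"},
    }).Decode(&plan)
    return plan, err
}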

5.2. Measuring Performance in Production

It is necessary to evaluate the performance of sharded clusters in production.
There are many aspects to assess, including query performance, rebalancing performance, and the frequency of hot chunks.

6. Risks

6.1. Limited Scalability due to High Value Frequency of project_id

When there is a limited number of projects, data is likely to be concentrated in a small number of chunks.
This may limit the scalability of the cluster, meaning that adding more shards becomes ineffective and meaningless.

6.1.1. Solution: Change the Shard Key

As mentioned above, using a compound shard key can resolve this issue. Specifically, use (project_id, key) as the shard key of project-wide collections instead of project_id.
(Screenshot: chunk distribution with the (project_id, key) shard key)

It's then possible to split large chunks by key values and migrate the split chunks to newly added shards.

(Screenshot: splitting chunks by key values and migrating them to newly added shards)
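
A hedged sketch of switching to the compound shard key with the reshardCollection command (available since MongoDB 5.0; the namespace is illustrative):

// Assumes client is a *mongo.Client connected through mongos.
func reshardDocumentsByCompoundKey(ctx context.Context, client *mongo.Client) error {
    // Reshard the collection on (project_id, key) so chunks can be split by key
    // values within a single project and migrated to newly added shards.
    return client.Database("admin").RunCommand(ctx, bson.D{
        {Key: "reshardCollection", Value: "yorkie-meta.documents"},
        {Key: "key", Value: bson.D{{Key: "project_id", Value: 1}, {Key: "key", Value: 1}}},
    }).Err()
}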

However, this change allows both actor_id and owner to be duplicated, and such duplication can break the consistency of a document. The reason is that both actor_id and owner currently use client_id as their value, which can now be duplicated within the same project, unlike under the previous sharding rules.
Under the previous sharding rules, every client in the same project was located in the same shard, so the per-shard unique constraint prevented duplication of client_id.

There are three approaches to resolve this issue:

  1. Use client_key + client_id as a value.
    • This may increase the size of CRDT metadata and the size of document snapshots.
  2. Introduce a cluster-level GUID generator.
  3. Depend on the low possibility of duplication of MongoDB ObjectIDs.
    • See details in the following section.

6.2. Duplicate MongoDB ObjectIDs

Both client_id and doc_id use MongoDB ObjectID as a value.

When duplicate ObjectIDs exist, the system works well thanks to the changed reference keys, until the MongoDB balancer migrates a chunk containing a duplicate ObjectID. Such a migration is expected to produce an error that doesn't harm the consistency of documents but causes a temporary failure of the cluster. This conflict must be handled manually by administrators.
(Screenshot: chunk migration conflict caused by duplicate ObjectIDs)

However, the possibility of duplicate ObjectIDs is extremely low in practical use cases because of how they are generated.

ObjectID uses the following format:

TimeStamp(4 bytes) + MachineId(3 bytes) + ProcessId(2 bytes) + Counter(3 bytes)

Duplicate ObjectIDs can only be generated when more than 16,777,216 documents/clients are created in a single second by a single machine and process. Considering that Google processes over 99,000 searches every second, this is unlikely to occur.
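
For reference, a small sketch showing how an ObjectID can be inspected with the Go driver's primitive package (the trailing 3-byte counter is what yields the 2^24 = 16,777,216 limit per second):

package main

import (
    "fmt"

    "go.mongodb.org/mongo-driver/bson/primitive"
)

func main() {
    id := primitive.NewObjectID()
    // The leading 4 bytes encode the creation time in seconds; the trailing
    // 3 bytes are a counter that wraps after 16,777,216 values.
    fmt.Println(id.Hex(), id.Timestamp())
}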

We can see more details in the following references.

If we ever have to handle that amount of traffic, consider the following options:

  1. Introduce a cluster-level GUID generator.
  2. Give up the current policies for re-using doc_key and client_key and use (project_id, key) as a reference key.
  3. Disable balancing chunks of documents and clients.
    • Just isolate each shard for a single project.
    • Manually handle chunk split and migration.