Sharding guide #7499

Closed
wants to merge 5 commits into from

Conversation

Contributor

@nihaals nihaals commented Mar 6, 2022

To-do

  • Link to Discord docs
  • Include code examples
  • Include more info about differences between Client and AutoShardedClient
  • Insert newlines in long lines

@nihaals nihaals changed the base branch from master to docs/guide March 6, 2022 01:37
@Rapptz Rapptz added the "guide" label (This relates to the discord.py guide) Mar 9, 2022
@Rapptz Rapptz mentioned this pull request Mar 12, 2022
23 tasks
Contributor

@Gorialis Gorialis left a comment

Just copying over my old review comments since they still apply. Also, this document feels a bit short. Not sure what we can do about it at the moment, though.

Sharding
==========

For bots in a large number of guilds, sharding may be required. Sharding is where a subset of the bot's total guilds are processed in each gateway connection. This allows a bot to handle more events, by splitting them by connection and possibly process. Sharding is generally not recommended for bots in less than 1,000 guilds and is required by Discord when a bot is in over 2,500 guilds.
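As a rough illustration of how guilds end up on a given connection, Discord's gateway documentation gives the formula `shard_id = (guild_id >> 22) % num_shards`. The sketch below applies it in plain Python; the function name and the sample guild IDs are made up for illustration and are not part of discord.py.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def shard_id_for_guild(guild_id: int, shard_count: int) -> int:
    """Return the shard that receives events for ``guild_id``.

    Uses the formula from Discord's gateway documentation:
    shard_id = (guild_id >> 22) % num_shards
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return (guild_id >> 22) % shard_count


def group_guilds_by_shard(
    guild_ids: Iterable[int], shard_count: int
) -> Dict[int, List[int]]:
    """Group guild IDs by the shard responsible for them."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for guild_id in guild_ids:
        groups[shard_id_for_guild(guild_id, shard_count)].append(guild_id)
    return dict(groups)


# Hypothetical guild IDs, built so the bits above bit 22 are 0..7.
sample_guild_ids = [n << 22 for n in range(8)]
groups = group_guilds_by_shard(sample_guild_ids, shard_count=4)
for shard_id in sorted(groups):
    print(f"shard {shard_id}: {len(groups[shard_id])} guild(s)")
```

With four shards, each gateway connection in this sample ends up responsible for two of the eight guilds.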
Contributor

This is a good explanation, but for heading paragraphs I generally recommend breaking the information into small, individually digestible units. For beginners, it makes the text a bit more approachable, and for more experienced readers, it helps quickly determine whether the page covers the topic they're expecting.

Not putting all my chips in on this particular presentation, but:

Suggested change
For bots in a large number of guilds, sharding may be required. Sharding is where a subset of the bot's total guilds are processed in each gateway connection. This allows a bot to handle more events, by splitting them by connection and possibly process. Sharding is generally not recommended for bots in less than 1,000 guilds and is required by Discord when a bot is in over 2,500 guilds.
When bots start to get considerably large, the number of events they have to deal with can start to become problematic.
At high user and guild counts, incoming messages, typing events, and status updates can climb to hundreds per second.
To help deal with this large amount of traffic, Discord supports **sharding**, a feature where your bot can split its guilds amongst separate connections, reducing the amount of data each individual connection has to handle.
This not only helps Discord by reducing how much data they need to direct to one place, but also helps us, as we can split our bot's overall work across different connections, environments, or even entirely different machines.
Discord recommends using sharding once you get over 1,000 guilds, and once you reach beyond 2,500 guilds, it becomes a requirement.
Let's discuss what options we have for setting up sharding within discord.py.

Contributor Author

I agree. I think rewriting it in a form where you can both quickly skim it and process/understand it in "chunks" makes it more useful for everyone.

Contributor Author

Have we already used "us" and "let's" in other guides? It seems like an important thing to check now to make sure our tone is consistent across the guides. I stuck to avoiding it and using a more typical docs-style tone initially, although since this is a guide, I think either works as long as it's consistent.

docs/guide/sharding/index.rst (outdated, resolved)
Contributor Author

nihaals commented Mar 17, 2022

We still need to add clustering, but I'm unsure how much we actually want to write for that. Maybe it could be left for a future PR, since the guides project is still mostly WIP and we haven't decided yet?

Contributor Author

nihaals commented Apr 26, 2022

I'm not sure how in-depth we want this page to be beyond explaining the concept of sharding. With code examples, do we need to provide anything more than an example with shard_count and shard_ids?
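For reference, a minimal sketch of what such an example could look like, assuming discord.py 2.x (where `intents` is a required keyword argument). `AutoShardedClient`, `shard_count`, and `shard_ids` are the real discord.py names; `validate_shard_ids`, `build_client`, and `TOKEN` are hypothetical helpers and placeholders used only for this illustration.

```python
from typing import List, Optional, Sequence


def validate_shard_ids(shard_count: int, shard_ids: Sequence[int]) -> List[int]:
    """Check that every shard ID is unique and within range(shard_count)."""
    ids = list(shard_ids)
    if len(set(ids)) != len(ids):
        raise ValueError("shard_ids must not contain duplicates")
    for shard_id in ids:
        if not 0 <= shard_id < shard_count:
            raise ValueError(
                f"shard ID {shard_id} is outside range(0, {shard_count})"
            )
    return ids


def build_client(shard_count: int, shard_ids: Optional[Sequence[int]] = None):
    """Create an AutoShardedClient that launches only the given shards.

    Passing shard_ids=None lets the client launch every shard in
    range(shard_count) within this one process.
    """
    # Imported here so the helper above can be used without discord.py installed.
    import discord

    ids = None if shard_ids is None else validate_shard_ids(shard_count, shard_ids)
    return discord.AutoShardedClient(
        shard_count=shard_count,
        shard_ids=ids,
        intents=discord.Intents.default(),
    )


# Usage (TOKEN is a placeholder for the bot token):
# client = build_client(shard_count=4, shard_ids=[0, 1])
# client.run(TOKEN)
```

Here this process would handle shards 0 and 1 out of 4, and a second process would need to be started with `shard_ids=[2, 3]` to cover the remaining guilds.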

There's also this suggestion but I'm not sure what we really need to include:

A thought about sharding: perhaps it could go over "[clustering]" as well, i.e. splitting shards between processes and having them communicate with each other.

Originally posted by @XuaTheGrate

Originally posted by @nihaals

Contributor Author

nihaals commented Apr 26, 2022

"Cluster" is a term used by larger bots (20k+ guilds) that split the bot into multiple processes, each handling a handful of shards, for example:

  • Cluster 1 (3k guilds)
    • Shard 0 (1.5k guilds)
    • Shard 1 (1.5k guilds)
  • Cluster 2 (3k guilds)
    • Shard 2 (1.5k guilds)
    • Shard 3 (1.5k guilds)
  • and so on...
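
The layout above can be sketched as a small helper that assigns consecutive shard IDs to clusters. This is only an illustration: `plan_clusters` and `shards_per_cluster` are hypothetical names, not part of discord.py, and real bots may distribute shards differently.

```python
from typing import List


def plan_clusters(shard_count: int, shards_per_cluster: int) -> List[List[int]]:
    """Split shard IDs 0..shard_count-1 into consecutive groups.

    The last cluster may hold fewer shards when shard_count is not a
    multiple of shards_per_cluster.
    """
    if shard_count < 1 or shards_per_cluster < 1:
        raise ValueError("shard_count and shards_per_cluster must be at least 1")
    return [
        list(range(start, min(start + shards_per_cluster, shard_count)))
        for start in range(0, shard_count, shards_per_cluster)
    ]


# Clusters are numbered from 1 to match the list above.
for number, shard_ids in enumerate(plan_clusters(4, 2), start=1):
    print(f"Cluster {number}: shards {shard_ids}")
```

Each inner list could then be given to one process as its `shard_ids`, with `shard_count` set to the total number of shards across all clusters.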

Some bots have the clusters "communicate" with each other (be it through IPC, by sharing data through their database, or by some other means).

This is used to reduce the overall CPU load: having all the shards clumped together in one process can cause it to hang, increasing latency and the resources needed to keep the bot running. Split processes use resources more efficiently.

Originally posted by @XuaTheGrate

Contributor Author

nihaals commented Apr 26, 2022

I think I can extend my explanation of using multiple processes with this to explain why you might actually do that and a bit of how you might do it

Originally posted by @nihaals

5 participants