
Add rate limit buffering support #430

Merged
merged 14 commits into bwmarrin:develop on Oct 7, 2017
Conversation

ErikMcClure

My anti-spam bot needs to be able to detect when it is about to get rate limited, and then in response to this start aggregating small messages together into a single larger message. This is intended to deal with bot raids, which can trigger 300+ "user has joined" notifications at the same time.

Doing this in a thread-safe manner requires several modifications to discordgo, and the core "buffered message" function must also be implemented on the Session object because it needs access to the ratelimiter object. While this pull request tries to make the absolute minimum number of changes necessary to support this feature, it could be shrunk even further by exposing the rate limiter object and the internal requestInner function outside of discordgo, which would then allow the bot to implement the buffered Post function itself.

However, if that would expose too much of the library's internals, the following is the minimum set of changes required to make this work:

  • Break request into requestInner
  • Break LockBucket into LockBucketObject
  • Change getBucket to GetBucket so it can be externally accessed
  • Add RequestBuffer interface
  • Add RequestBuffer pointer to Bucket struct
  • Add RequestPostWithBuffer() function to Session

If you believe RequestPostWithBuffer should, in fact, be added to Session, tests will need to be written to ensure it works properly; they are not currently included. Suggestions for how to implement those tests, or pointers to where similar tests can be found, are appreciated.
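The proposed RequestBuffer interface is not spelled out in this thread. As a rough sketch of the idea (the method names and the messageBuffer implementation below are illustrative assumptions, not discordgo code), a per-bucket buffer that accumulates payloads while the bucket is exhausted and merges them into one request might look like:

```go
package main

import "fmt"

// RequestBuffer is a hypothetical sketch of the interface proposed in this
// PR: it accumulates payloads while a rate-limit bucket is exhausted.
type RequestBuffer interface {
	// Append stores a payload for later sending; returns true if buffered.
	Append(payload string) bool
	// Flush merges everything buffered into one payload and clears the buffer.
	Flush() string
}

// messageBuffer is a trivial RequestBuffer that joins text messages.
type messageBuffer struct {
	pending []string
}

func (m *messageBuffer) Append(p string) bool {
	m.pending = append(m.pending, p)
	return true
}

func (m *messageBuffer) Flush() string {
	out := ""
	for i, p := range m.pending {
		if i > 0 {
			out += "\n"
		}
		out += p
	}
	m.pending = nil // clear the buffer after flushing
	return out
}

func main() {
	var buf RequestBuffer = &messageBuffer{}
	buf.Append("user A has joined")
	buf.Append("user B has joined")
	// Two notifications are merged into a single message body.
	fmt.Println(buf.Flush())
}
```

The library would then call Flush once a request slot opens, so 300 join notifications during a raid become a handful of merged messages instead of 300 rate-limited requests.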

voice.go Outdated
@@ -659,8 +658,6 @@ func (v *VoiceConnection) opusSender(udpConn *net.UDPConn, close <-chan struct{}
return
}

runtime.LockOSThread()
Collaborator

Could you PR this separately please.

Author

It looks like this change was directly merged into the master branch, but never pulled into the develop branch, and it was accidentally swept up by my pull request. You should properly merge this into the develop branch, and then I can re-do my pull request and it won't show up anymore.

@ErikMcClure
Author

I've updated this and decided to go for the more general route of exposing the rate limiter itself rather than force bots into using a specific buffering method. There is no longer any new functionality - a few functions have been split up and a few variables have been exposed. The Bucket object now exposes a generic interface{} pointer.

@Hinara
Contributor

Hinara commented Aug 22, 2017

I'm not sure that exposing the rate limiter is the best idea.
For your problem, why not use a reader/deleter goroutine on one side and the aggregating process on the other, and have them communicate through a channel of messages?
The aggregator would just take each message as it arrives, and the reader/deleter would be the only goroutine blocked by buckets.
If you want to send messages from the aggregator's output without blocking it, create a third goroutine that just sends the messages the aggregator passes along.

@ErikMcClure
Author

ErikMcClure commented Aug 22, 2017

That doesn't solve the problem. I only want to aggregate messages if the bot is about to be rate limited. I have to have access to the rate limit to be able to detect this. There is simply no possible way to only aggregate messages in response to a rate limit without having access to the rate limit.

If this change is not acceptable I will simply retract the pull request.

@Hinara
Contributor

Hinara commented Aug 22, 2017

OK, I think I got it. Is your problem that you edit your message to aggregate each new one on every user join, so that when your bot hits the limit it has to wait, and during that time new users keep arriving, causing the bot to quickly hit the limit again?
If so, then yes, a PR for something like that could be pretty cool. However, you could also avoid the problem by not editing the message on every event, but only, for example, once per second, buffering all the new messages from that second. That would be easier on your bot and on Discord's servers, though updates would become delayed.
If you want them to be immediate, then yes, you could also expose the bucket limits.
Edit: If you want, I can also email you some Go code that works well and aggregates only when needed, after my workday.

@ErikMcClure
Author

No, I want to specifically avoid that by detecting that the bot will be rate limited after the next message it sends, so that it sends 4 messages and then aggregates the rest of the messages into one giant 5th message that it sends before the rate limit is hit. Here is an example of how it behaves right now:
[screenshot: the bot's current behavior when it hits the rate limit]
Because the bot should only start aggregating messages once the rate limit is about to be hit, the bot MUST have access to the rate limit. If absolutely necessary, I can avoid adding Userdata to the bucket; it is only used because messages need to be aggregated on a per-channel basis, which is essentially the same as aggregating them per-bucket, so it lets the bot reuse the bucket lock. However, the rate limit, the request function, and the bucket lock all have to be exposed no matter what. This is true for anyone implementing anything that responds to being rate limited.

I strongly oppose simply not allowing bots to access their rate limit status. That just throws away information bots could use to respond effectively to being rate limited, without resorting to sending a message every second and hurting their responsiveness. There is no excuse for discarding the rate limits that the Discord API specifically says you should pay attention to. Anyone hardcoding a 1-second aggregation bucket is implicitly hardcoding assumptions about the rate limit into their code, which is exactly what the Discord API says you shouldn't do.

Discordgo must expose its rate-limiting information. If someone would rather expose it in a different way, that's fine, but that information must be exposed somehow in order for people to write well-behaved Discord bots. Simply not exposing it is not an option.

@Hinara
Contributor

Hinara commented Aug 22, 2017

Okay, I understand. I'll just leave you a piece of code that has no delay but, when blocked by the rate limit, will send one big packet 😄 This code only handles too many messages arriving at the same time; it doesn't aggregate just before the rate limit, so yes, I think your PR can be useful in some cases ;)
PS: This code currently just takes every message the bot receives, forwards it to another channel, and joins messages that arrive too close together.

package main

import (
	"flag"
	"fmt"
	"os"
	"os/signal"
	"syscall"

	"github.com/bwmarrin/discordgo"
)

// Variables used for command line parameters
var (
	Token      string
	ChannelID  string
	BufferChan chan string
)

func init() {

	flag.StringVar(&Token, "t", "", "Bot Token")
	flag.StringVar(&ChannelID, "c", "", "Channel")
	flag.Parse()
}

func main() {
	if Token == "" || ChannelID == "" {
		fmt.Println("Needed parameters")
		return
	}
	BufferChan = make(chan string, 200)
	// Create a new Discord session using the provided bot token.
	dg, err := discordgo.New("Bot " + Token)
	if err != nil {
		fmt.Println("error creating Discord session,", err)
		return
	}

	// Register the messageCreate func as a callback for MessageCreate events.
	dg.AddHandler(messageCreate)

	// Open a websocket connection to Discord and begin listening.
	err = dg.Open()
	if err != nil {
		fmt.Println("error opening connection,", err)
		return
	}

	// Start the buffering pipeline; bufferizer feeds sender via a channel.
	go bufferizer(dg)

	// Wait here until CTRL-C or other term signal is received.
	fmt.Println("Bot is now running.  Press CTRL-C to exit.")
	sc := make(chan os.Signal, 1)
	signal.Notify(sc, syscall.SIGINT, syscall.SIGTERM, os.Interrupt, os.Kill)
	<-sc

	// Cleanly close down the Discord session.
	dg.Close()
	close(BufferChan)
}

// This function will be called (due to AddHandler above) every time a new
// message is created on any channel that the authenticated bot has access to.
func messageCreate(s *discordgo.Session, m *discordgo.MessageCreate) {
	BufferChan <- m.Content
}

func sender(msgChan chan string, s *discordgo.Session) {
	msg, ok := <-msgChan
	for ok {
		s.ChannelMessageSend(ChannelID, msg)
		msg, ok = <-msgChan
	}
}

func bufferizer(s *discordgo.Session) {
	sendChannel := make(chan string)
	defer close(sendChannel)
	// sender must run concurrently, otherwise this call would block forever
	// and bufferizer would never drain BufferChan.
	go sender(sendChannel, s)
	msg := ""
	ok := true
	for ok {
		if msg == "" {
			msg, ok = <-BufferChan
		} else {
			select {
			case msgAdd, state := <-BufferChan:
				msg += msgAdd
				ok = state
			case sendChannel <- msg:
				msg = ""
			}
		}
	}
}

@ErikMcClure
Author

Aggregating messages AFTER the rate limit is not acceptable behavior. I also have to point out that your code fundamentally doesn't work in a multi-channel context, because the rate limits between channels are different. This is why the buckets must be exposed and why there is now Userdata on the buckets, so that messages can be aggregated per-bucket, which is what the API expects you to do.

The rate limits MUST be exposed to support this behavior.

@Xe

Xe commented Aug 22, 2017

I've hit rate limits like this myself and having the rate limits exposed would be a great thing. I'm 👎 on the interface{} though.

@andersfylling

I have a bot that often updates its status, and I run into rate limits as well, so this change would be greatly appreciated.

@andersfylling

it's been over a month, what's the status on this PR?

@ErikMcClure
Author

@sciencefyll This PR appears to be completely dead. The maintainers do not want to expose the rate limit. If you want this feature, you will have to use the develop branch of my fork. I attempt to keep the develop branch in sync, but the master branch is probably out of date. Sorry.

@iopred
Collaborator

iopred commented Sep 30, 2017

@jonas747 care to take a look?

@jogramming
Contributor

looks fine to me

@ErikMcClure
Author

If this pull request isn't accepted soon, I'm going to close it rather than keep merging upstream changes into it.

@andersfylling

You use your own fork, then? I guess I'd have to switch from the official library to yours, so do you maintain your fork?

@ErikMcClure
Author

I use my own fork. Every now and then I sync it with develop. I will not guarantee any support whatsoever for my fork because I shouldn't need it in the first place, so you can use it at your own risk.

@bwmarrin
Owner

bwmarrin commented Oct 6, 2017

Currently my only remaining hang-up here is the Userdata interface{} field. I'd love to see some discussion of why it is required and any possible alternatives that would remove that requirement.

@ErikMcClure
Author

The Userdata field is mandatory because any bot that wants to do anything useful with rate limits must store information on a per-bucket basis; any sort of buffering or waiting has to be done on the same bucket. My bot uses the Userdata field to implement a message buffer that merges text messages together.

The requirement here is that bots must be able to assign arbitrary user data to buckets. A hash cannot work because there is no way to tell when or if a bucket will be discarded, so the hash grows without bound. A "bucket creation hook" might work if you let the bot create its own bucket object that embeds the internal Bucket object and casts back to it. Neither of these seems like a compelling alternative to the much simpler approach of a Userdata interface{} field on the bucket.
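As a sketch of how a bot might use such a Userdata interface{} field, assuming a simplified bucket type (not the real discordgo Bucket) and a hypothetical channelBuffer for the bot's per-bucket state, a type assertion recovers the attached data:

```go
package main

import "fmt"

// channelBuffer is hypothetical bot-side state: merged message text for
// one channel, which maps one-to-one onto a rate-limit bucket.
type channelBuffer struct {
	merged string
}

// bucket is a simplified stand-in for the library's Bucket type.
type bucket struct {
	Key      string
	Userdata interface{} // bot-owned state, opaque to the library
}

// bufferFor lazily attaches a channelBuffer to the bucket and returns it.
// The type assertion is safe because only the bot writes to Userdata.
func bufferFor(b *bucket) *channelBuffer {
	if b.Userdata == nil {
		b.Userdata = &channelBuffer{}
	}
	return b.Userdata.(*channelBuffer)
}

func main() {
	b := &bucket{Key: "channel:1234"}
	// Repeated lookups land on the same buffer, so join notifications
	// accumulate per bucket (and therefore per channel).
	bufferFor(b).merged += "user A has joined\n"
	bufferFor(b).merged += "user B has joined\n"
	fmt.Print(bufferFor(b).merged)
}
```

Because the buffer lives on the bucket itself, the bot can reuse the bucket's lock for thread safety instead of maintaining a separate, unbounded map from channel IDs to buffers.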

@iopred iopred merged commit 97a510c into bwmarrin:develop Oct 7, 2017
@bwmarrin bwmarrin added this to the v0.18.0 milestone Dec 27, 2017
@bwmarrin bwmarrin added the feature Major feature implementation label Dec 27, 2017
ErikMcClure added a commit to ErikMcClure/discordgo that referenced this pull request Aug 4, 2020
* Fix bwmarrin#406: reconnect() can be called while still connected

* Add memberMap to speed up member queries

* Fix error return value and remove deletion

* Fix GuildAdd member map initialization edge case

* Add rate limit buffering support

- Break request into requestInner
- Break LockBucket into LockBucketObject
- Change getBucket to GetBucket so it can be externally accessed
- Add RequestBuffer interface
- Add RequestBuffer pointer to Bucket struct
- Add RequestPostWithBuffer() function to Session

* Remove internal implementation, export ratelimiter instead
7 participants