Autocomplete: Expose rate limiting and other errors to users #634

Closed
2 tasks done
Tracked by #582 ...
toolmantim opened this issue Aug 9, 2023 · 15 comments · Fixed by #851

@toolmantim
Contributor

toolmantim commented Aug 9, 2023

There are some rate limits in place, both for chats and autocomplete. For chat, we display error messages in the transcript, so users know that their chat message failed because of rate limit exhaustion. But for autocomplete, we give no visible feedback.

We need to:

  • Make it visible that a rate limit has been hit
  • Give users enough information to understand what's happening
  • Tell them what they can do about it (wait, read the docs?)

From @eseliger’s Slack message:

> For autocompletions, as far as I know we don’t indicate errors too heavily today.
> We’ve been thinking of making the Cody navbar item turn red or something like that and on hover show it’s not available at the moment due to rate limits.

  • Design
  • Reviewed & ready to go
@toolmantim
Contributor Author

@eseliger @philipp-spiess I've done a bit of testing locally with adding an additional icon, or seeing if it's possible to do the split/two-button style (it doesn't appear to be possible), and landed exactly back at what you suggested:

Screenshot 2023-08-09 at 11 32 18 pm

(This is from a slightly newer design than what's on main, so please ignore the slightly different statusbar label and quickpick items)

Use the current icon, change the background and tooltip, and add an item to the top of the quickpick (selected, with a separator below).

@eseliger How does the rate limiting logic work? And what error message/info do we have available to show? An example of whatever string the backend spits out that we'd show to the user could do, just so I can sanity check it with reality.

As far as docs go, I found https://docs.sourcegraph.com/cody/explanations/cody_gateway#rate-limits-and-quotas — should we update it while we're here?

@eseliger
Member

eseliger commented Aug 9, 2023

> @eseliger How does the rate limiting logic work? And what error message/info do we have available to show? An example of whatever string the backend spits out that we'd show to the user could do, just so I can sanity check it with reality.

You will get 429 HTTP errors when making requests while throttled, with headers that tell you when the limit resets and so forth:

	w.Header().Set("x-ratelimit-limit", strconv.Itoa(err.Limit))
	w.Header().Set("x-ratelimit-remaining", strconv.Itoa(max(err.Limit-err.Used, 0)))
	w.Header().Set("retry-after", err.RetryAfter.Format(time.RFC1123))

They can do:

  • For app, reach out to us on discord
  • For enterprise, talk to their sales rep
  • Or simply, wait
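
For reference, here is a minimal sketch of how a client might read the three headers quoted above from a 429 response. It assumes the `retry-after` value is a timestamp in Go's `time.RFC1123` layout, as the server snippet suggests. The `RateLimitInfo` type, `parseRateLimit`, and `exampleHeaders` are hypothetical names for illustration and are not taken from the Cody codebase.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// RateLimitInfo holds the values carried by the three rate limit headers.
// The type and its field names are illustrative assumptions.
type RateLimitInfo struct {
	Limit      int
	Remaining  int
	RetryAfter time.Time
}

// parseRateLimit returns ok=false when the response is not a 429 or when
// any of the expected headers is missing or malformed.
func parseRateLimit(status int, h http.Header) (RateLimitInfo, bool) {
	if status != http.StatusTooManyRequests {
		return RateLimitInfo{}, false
	}
	limit, err := strconv.Atoi(h.Get("x-ratelimit-limit"))
	if err != nil {
		return RateLimitInfo{}, false
	}
	remaining, err := strconv.Atoi(h.Get("x-ratelimit-remaining"))
	if err != nil {
		return RateLimitInfo{}, false
	}
	// The server formats this header with time.RFC1123, so parse it the same way.
	retryAfter, err := time.Parse(time.RFC1123, h.Get("retry-after"))
	if err != nil {
		return RateLimitInfo{}, false
	}
	return RateLimitInfo{Limit: limit, Remaining: remaining, RetryAfter: retryAfter}, true
}

// exampleHeaders builds made-up header values for the demo below.
func exampleHeaders() http.Header {
	h := http.Header{}
	h.Set("x-ratelimit-limit", "1000")
	h.Set("x-ratelimit-remaining", "0")
	h.Set("retry-after", "Thu, 10 Aug 2023 00:00:00 UTC")
	return h
}

func main() {
	info, ok := parseRateLimit(http.StatusTooManyRequests, exampleHeaders())
	fmt.Println(ok, info.Limit, info.Remaining, info.RetryAfter.Format(time.RFC3339))
}
```

Returning a boolean rather than an error keeps the caller simple: anything that is not a well-formed rate limit response can fall through to generic error handling.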

@toolmantim
Contributor Author

toolmantim commented Aug 10, 2023

Thanks! In that case, I think the UI should probably just say "Retry again in {friendlyTimeDuration}" type thing (e.g. "Retry again in 2 minutes") — but in the documentation we can have that information for how to request a limit increase.

What are the time buckets? 1 minute?

@toolmantim
Contributor Author

For the doc update:

> For app, reach out to us on discord

Is there a specific channel/person/group they should talk to?

Does that also apply to sourcegraph.com users?

@philipp-spiess
Contributor

> Thanks! In that case, I think the UI should probably just say "Retry again in {friendlyTimeDuration}" type thing (e.g. "Retry again in 2 minutes") — but in the documentation we can have that information for how to request a limit increase.
>
> What are the time buckets? 1 minute?

The buckets are 24 hours. And I think for autocomplete, "retry" is not really a useful term since the code change was likely already made. For that feature, we should say something like "Autocomplete will be available again in {friendlyTimeDuration}. {Some CTA} now to increase your quota" or so.
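
As a rough illustration of the wording proposed above, the sketch below turns a remaining duration into the "Autocomplete will be available again in ..." notice. The function names and the rounding rules (nearest minute under an hour, nearest hour otherwise) are assumptions made for this example, not the behaviour that shipped.

```go
package main

import (
	"fmt"
	"time"
)

// plural formats a count with a unit, adding "s" unless the count is 1.
func plural(n int, unit string) string {
	if n == 1 {
		return fmt.Sprintf("%d %s", n, unit)
	}
	return fmt.Sprintf("%d %ss", n, unit)
}

// friendlyDuration rounds to the nearest minute below one hour and to the
// nearest hour otherwise. These thresholds are an assumption.
func friendlyDuration(d time.Duration) string {
	switch {
	case d < time.Minute:
		return "less than a minute"
	case d < time.Hour:
		return plural(int(d.Round(time.Minute)/time.Minute), "minute")
	default:
		return plural(int(d.Round(time.Hour)/time.Hour), "hour")
	}
}

// autocompleteNotice builds the single-line notice, without a CTA.
func autocompleteNotice(untilReset time.Duration) string {
	return "Autocomplete will be available again in " + friendlyDuration(untilReset) + "."
}

func main() {
	fmt.Println(autocompleteNotice(2 * time.Minute))
	fmt.Println(autocompleteNotice(23*time.Hour + 10*time.Minute))
}
```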

@toolmantim
Contributor Author

Thanks @philipp-spiess, will update.

(Is anyone hitting this on their first day of use? That's a long time to wait…)

@eseliger
Member

> (Is anyone hitting this on their first day of use? That's a long time to wait…)

Note that this is instance-wide, so for enterprise customers with 500 licensed users, the limit is only reached once all 500 combined have, on average, used the per-user allocated number of requests.

> Is there a specific channel/person/group they should talk to?

Not sure, @jdorfman usually handles these I think.

> Does that also apply to sourcegraph.com users?

Yes, dotcom and app :)

@toolmantim
Contributor Author

Thanks @eseliger, will get a design & docs update ready.

It might be a nice improvement to reduce the bucket duration to 30min at some point… 24h seems a bit unfriendly (it's more like a 1 day ban than a rate limit).

@eseliger
Member

eseliger commented Aug 11, 2023

> It might be a nice improvement to reduce the bucket duration to 30min at some point… 24h seems a bit unfriendly (it's more like a 1 day ban than a rate limit).

Our rate limits depend on requests made, not on spend incurred on our end, and we don't have rollover into later windows, so using a 1h window might actually make it worse for most of our customers, because:

  • To keep cost in a reasonable frame, we define a per user per month spend for these features
  • Since we don't track tokens due to API limitations, we estimated what the average request will cost, so our gating metric is reqs
  • Then we distribute that amount evenly over the month, currently by day

For an instance of 5000 users that can do 1000 requests per user per day, that means all users combined can make 5M requests in a 24h window.
If we switch to a 1h window, they will get about 208k requests an hour, again with no rollover.

Now, when the work day starts and everyone's hacking away happily, they might want to burn through their 24h limit in the 8h of their workday. With the 24h window, they get all their 5M permitted requests during the work day. If it was an hourly window, they would only get to use 1/3 of that during their working hours.
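
The arithmetic above can be checked with a short sketch. It assumes the workday starts exactly on a window boundary, that demand always exceeds the budget, and that unused requests never roll over. The function names are hypothetical.

```go
package main

import "fmt"

// requestsPerWindow splits the instance-wide daily budget evenly across
// windows of windowHours (integer division, so fractions are dropped).
func requestsPerWindow(users, perUserPerDay, windowHours int) int {
	return users * perUserPerDay * windowHours / 24
}

// usableDuringWorkday returns how many requests can be made during
// workHours when every window touched by the workday is fully used.
// Assumes the workday begins at the start of a window.
func usableDuringWorkday(users, perUserPerDay, windowHours, workHours int) int {
	windows := (workHours + windowHours - 1) / windowHours // windows touched, rounded up
	return windows * requestsPerWindow(users, perUserPerDay, windowHours)
}

func main() {
	fmt.Println(requestsPerWindow(5000, 1000, 24))      // 5000000
	fmt.Println(requestsPerWindow(5000, 1000, 1))       // 208333
	fmt.Println(usableDuringWorkday(5000, 1000, 24, 8)) // 5000000
	fmt.Println(usableDuringWorkday(5000, 1000, 1, 8))  // 1666664
}
```

With hourly windows the 8h workday can draw on 8 of 24 windows, which is the 1/3 figure mentioned above.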

There are more changes coming to billing here, likely as we get closer to GA. Do you think we need to revisit this before then? Ideally, 95%+ of users should not hit their limits, at least not enterprise users. Those numbers were chosen carefully so that most people get through the day without interruptions, favoring user experience while keeping the cost incurred fair.

@toolmantim
Contributor Author

toolmantim commented Aug 11, 2023

Thanks for getting me up to speed. The Enterprise case makes sense as is. I think we'd only need to revisit if we see that free users are hitting it during their first days of use, and getting stuck for a whole day.

@toolmantim
Contributor Author

toolmantim commented Aug 11, 2023

Here's an updated design, based on those words @philipp-spiess, and I reckon this is good to go:

image

Given it's a single line, I've left off the CTA in the text, and assume they'll discover they can click the item and read the docs page.

@toolmantim
Contributor Author

(I don't love "notice" either — suggestions welcome!)

@jdorfman
Member

> Is there a specific channel/person/group they should talk to?

Users usually just drop a message in #help, and either I or a community member like @deepak2431 flags it, and I'll take care of the request.

Image 2023-08-11 at 9 02 15 AM png

@philipp-spiess
Contributor

I think we should probably work on this sooner rather than later. Dotcom users are getting hit by this and are confused: https://github.com/sourcegraph/cody/discussions/660

@toolmantim Is this ready to be taken over from your side? 🙂

@toolmantim
Contributor Author

Yeah this is good to go @philipp-spiess!

toolmantim changed the title from "Design: Expose autocomplete rate limiting status to users" to "Expose autocomplete rate limiting status to users" on Aug 14, 2023
toolmantim added the design-ready label ("Design is ready to go") on Aug 14, 2023
philipp-spiess changed the title from "Expose autocomplete rate limiting status to users" to "Autocomplete: Expose rate limiting and other errors to users" on Aug 21, 2023
philipp-spiess added a commit that referenced this issue Aug 31, 2023
Closes #634

This PR adds a UI indicator for autocomplete errors. We special case
rate limiting errors to provide more information.

- When a rate limit error is encountered, we show it only once per rate
limit interval (the UI indicator will reset with a VS Code restart).
When the option is selected in quick pick, we open
https://about.sourcegraph.com/blog/increasing-the-completions-rate-limit
(the only resource I found that we have for rate limits)
- When any other error (e.g. a network error) is encountered, we show the
error message and open the output channel. We also add a new `error` API
similar to the `debug` one but that will always log errors to the output
channel.
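
The "show it only once per rate limit interval" behaviour described above could look roughly like the sketch below. It is not the code from the PR: the `rateLimitNotifier` type, `shouldShow`, and the `at` demo helper are hypothetical, and the state is kept in memory only, which is why it would reset on restart.

```go
package main

import (
	"fmt"
	"time"
)

// rateLimitNotifier remembers until when the current notice is valid.
// The state lives in memory only, so a restart clears it.
type rateLimitNotifier struct {
	shownUntil time.Time
}

// shouldShow reports whether a rate limit notice should be surfaced for an
// error seen at now whose window resets at retryAfter. It returns true at
// most once per window.
func (n *rateLimitNotifier) shouldShow(now, retryAfter time.Time) bool {
	if now.Before(n.shownUntil) {
		return false // already shown for the current window
	}
	n.shownUntil = retryAfter
	return true
}

// at is a demo helper that builds a UTC timestamp in August 2023.
func at(day, hour int) time.Time {
	return time.Date(2023, time.August, day, hour, 0, 0, 0, time.UTC)
}

func main() {
	n := &rateLimitNotifier{}
	fmt.Println(n.shouldShow(at(11, 9), at(12, 0)))  // first error in this window
	fmt.Println(n.shouldShow(at(11, 15), at(12, 0))) // same window, suppressed
	fmt.Println(n.shouldShow(at(12, 9), at(13, 0)))  // next window, shown again
}
```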

## ToDo

- [x] Add a new `error` API similar to the `debug` one but that will
always log errors to the output channel.
- [x] Add test cases

## Test plan


https://github.com/sourcegraph/cody/assets/458591/efa5f660-706e-47db-96e1-63b5f7b2d2fc




---------

Co-authored-by: Tim Lucas <t@toolmantim.com>