Document fair EIS usage limits in free trial #4206
@@ -76,6 +76,13 @@
* Scaling is limited for {{serverless-short}} projects in trials. Failures might occur if the workload requires memory or compute beyond what the above search power and search boost window setting limits can provide.
* We monitor token usage per account for the Elastic Managed LLM. If an account uses over one million tokens in 24 hours, we will inform you and then disable access to the LLM. This is in accordance with our fair use policy for trials.
**Inference tokens**
* You can use the models hosted by the Elastic {{infer-cap}} Service with the following limits:
Collaborator

Suggested change
* **Elastic Managed LLM:** 100 million input tokens in a 24-hour period, or 5 million output tokens in a 24-hour period
* **ELSER:** 1 billion tokens in a 24-hour period
* Access to some models may be paused temporarily if either of these limits is exceeded
Collaborator

This sentence only works in the context of the bullet above, so we should not make it its own bullet. Is access ALWAYS paused when these limits are exceeded? Can we be specific? Also, does the pausing vary by model? If it doesn't, we should drop the "some models" qualifier.

If the answer to both of these questions is yes:

Suggested change

If no:

Suggested change
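To make the quotas above concrete, here is a minimal client-side sketch of tracking usage against a rolling 24-hour budget. This is illustrative only: Elastic enforces these limits server-side, and the `TokenBudget` class below is hypothetical, not part of any Elastic client library.

```python
from collections import deque
from time import time

class TokenBudget:
    """Hypothetical rolling 24-hour token counter (illustrative only)."""

    def __init__(self, limit: int, window_seconds: int = 24 * 60 * 60):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # (timestamp, token_count) pairs
        self.total = 0

    def _expire(self, now: float) -> None:
        # Drop usage records that have aged out of the rolling window.
        while self.events and now - self.events[0][0] > self.window:
            self.total -= self.events.popleft()[1]

    def record(self, tokens: int) -> None:
        now = time()
        self._expire(now)
        self.events.append((now, tokens))
        self.total += tokens

    def remaining(self) -> int:
        self._expire(time())
        return max(self.limit - self.total, 0)

# Documented trial limits, per 24 hours:
llm_input = TokenBudget(limit=100_000_000)  # Elastic Managed LLM, input tokens
llm_output = TokenBudget(limit=5_000_000)   # Elastic Managed LLM, output tokens
elser = TokenBudget(limit=1_000_000_000)    # ELSER tokens

llm_input.record(250_000)
print(f"LLM input tokens left in the current window: {llm_input.remaining():,}")
```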
**Remove limitations**
Subscribe to [{{ecloud}}](/deploy-manage/cloud-organization/billing/add-billing-details.md) for the following benefits:
Collaborator

Having this be a sibling to hosted/serverless is not ideal, because those are sibling deployment types. We should add some applies tags to indicate that the inference tokens limitation applies to both deployment types.
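For illustration, a minimal sketch of what such applies tags could look like in the page's frontmatter. The exact keys and lifecycle values here are an assumption based on general docs-builder conventions, not taken from this PR:

```yaml
applies_to:
  deployment:
    ess: ga
  serverless: ga
```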