@@ -76,6 +76,13 @@
* Scaling is limited for {{serverless-short}} projects in trials. Failures might occur if the workload requires memory or compute beyond what the above search power and search boost window setting limits can provide.
* We monitor token usage per account for the Elastic Managed LLM. If an account uses over one million tokens in 24 hours, we will inform you and then disable access to the LLM. This is in accordance with our fair use policy for trials.

**Inference tokens**
Collaborator

having this be a sibling to hosted/serverless is not ideal because those two are "sibling" deployment types. we should add some `applies_to` tags to indicate that the inference tokens limitation applies to both deployment types

Suggested change
**Inference tokens**
**Inference tokens** {applies_to}`ess: ga` {applies_to}`serverless: ga`


* You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
Collaborator

Suggested change
* You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
* You can use the following models hosted by the Elastic {{infer-cap}} Service, with the following limits:

* **Elastic Managed LLM:** 100 million input tokens in a 24-hour period or 5 million output tokens in a 24-hour period
* **ELSER**: 1 billion tokens in a 24-hour period

Check notice on line 83 in deploy-manage/deploy/elastic-cloud/create-an-organization.md

GitHub Actions / vale

Elastic.Acronyms: 'ELSER' has no definition.
* Access to some models may be paused temporarily if either of these limits are exceeded

Check notice on line 84 in deploy-manage/deploy/elastic-cloud/create-an-organization.md

GitHub Actions / vale

Elastic.WordChoice: Consider using 'can, might' instead of 'may', unless the term is in the UI.
Collaborator

this sentence only works in the context of the bullet above, so we should not make it its own bullet.

is access ALWAYS paused when these limits are exceeded? can we be specific? also, does the pausing vary by model? if it doesn't, we should drop the word "some"

if the answer to both of these questions is yes:

Suggested change
* Access to some models may be paused temporarily if either of these limits are exceeded
Access to these models is paused temporarily if either of these limits are exceeded.

if no:

Suggested change
* Access to some models may be paused temporarily if either of these limits are exceeded
Access to some models might be paused temporarily if either of these limits are exceeded.
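
For anyone checking the wording against the numbers, here is a minimal sketch of the limits as the diff states them. It only illustrates the "either of these limits" reading for the Elastic Managed LLM (input or output, each counted separately) next to the single ELSER limit. The names `DailyUsage` and `exceeded_limits` are hypothetical, and nothing here reflects how Elastic actually meters or pauses access.

```python
from __future__ import annotations

from dataclasses import dataclass

# Limits copied from the diff; everything else is an illustrative assumption,
# not Elastic's implementation.
LLM_INPUT_LIMIT = 100_000_000   # Elastic Managed LLM input tokens per 24-hour period
LLM_OUTPUT_LIMIT = 5_000_000    # Elastic Managed LLM output tokens per 24-hour period
ELSER_LIMIT = 1_000_000_000     # ELSER tokens per 24-hour period


@dataclass
class DailyUsage:
    """Hypothetical tally of tokens used in one 24-hour period."""
    llm_input: int = 0
    llm_output: int = 0
    elser: int = 0


def exceeded_limits(usage: DailyUsage) -> list[str]:
    """Return the names of the limits that the tally strictly exceeds."""
    exceeded = []
    if usage.llm_input > LLM_INPUT_LIMIT:
        exceeded.append("llm_input")
    if usage.llm_output > LLM_OUTPUT_LIMIT:
        exceeded.append("llm_output")
    if usage.elser > ELSER_LIMIT:
        exceeded.append("elser")
    return exceeded
```

With this reading, going over the output limit alone is enough to trip the LLM limit, which is what "either of these limits" implies. Whether access is then always paused, and for which models, is exactly the open question in the comment above.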


**Remove limitations**

Subscribe to [{{ecloud}}](/deploy-manage/cloud-organization/billing/add-billing-details.md) for the following benefits: