Skip to content

GPT‐RAG ‐ Pricing Model

Gonzalo Becerra edited this page Feb 21, 2024 · 10 revisions

In the following link will find the Pricing Model that include all the components of the solution

https://azure.com/e/7e2b3a9bf08645abba7fae1d8aa8ef9d

We should update the Parameters of the following components:

  • Number of Private Links
  • Azure Functions (Instances to escalate, default 1)
  • Storage Account (Currently 1TB by default)
  • Azure OpenAI (Prompt): Calculated for (2.000 tokens for Prompt) for 1M of Messages Monthly
  • Azure OpenAI (Completion) Calculated for (400 Characters = 136 Tokens) of response size.
  • Cognitive Search (1 Replica + 1 Partition)
  • Semantic Search (Based on 1M of Queries)
  • Cosmos DB (10 GB)

This is a reference model on pricing of the solution.

The following information is required to improve the accuracy of the calculation: Numbers based on the example above:

  • Number of Messages (100.000)
  • Prompt Size (2000 tokens)
  • Maximum numbers of characters in Response (400 characters)
  • Storage Required for all documents (1TB).

Calculation for Tokens in Azure OpenAI

Prompt:

GPT (Triage) = 450 GPT (Answer) = 221 + Sources = 2221 GPT (Not in source) = 106 tokens GPT (Is Grounded) = 99 + Sources = 2099

Prompt (tokens x1000): (2099 + 2221 + 450 + (106 * 0.1))*100000/1000 = 478,060

Completion:

GPT (Not in source) = 4 GPT (Is Grounded) = 4 GPT (Triage) = 10 GPT (Answer) = 800 Prompt (tokens x1000): (4 + 800 + 10 + (4 * 0.1))*100000/1000 = 81,440