docs: free tier cost management guide#4
Conversation
9134e59 to
002c09a
Compare
|
@OfficialAbhinavSingh , Kindly have a research over graphify tool it help in token reduction as well. |
| @@ -0,0 +1,177 @@ | |||
| # Free Tier Cost Management — Mergit | |||
|
|
|||
| > Target: **< $0.50/org/month** · Without optimization: ~$27/org/month · Required reduction: **98%** | |||
There was a problem hiding this comment.
What is this data based on ?
| ## Model Pricing (verified June 2026) | ||
|
|
||
| | Model | Input $/M | Output $/M | Use for | | ||
| |-------|----------|-----------|---------| | ||
| | groq/llama-4-maverick | $0.20 | $0.60 | Free tier primary (Researcher/Writer) | | ||
| | groq/llama-3.1-8b | $0.05 | $0.08 | Summarization only | | ||
| | claude-haiku-4-5 | $0.80 | $4.00 | Free tier Coder role | | ||
| | claude-sonnet-4-6 | $3.00 | $15.00 | Startup tier | | ||
| | claude-opus-4-8 | $15.00 | $75.00 | Enterprise only | | ||
|
|
||
| **Anthropic prompt cache:** read = **0.1×** · write 5-min = 1.25× · write 1-hour = 2× | ||
| **Anthropic Batch API:** **50% off** all tokens (free tier always uses this) | ||
| **Groq:** no prompt caching support |
There was a problem hiding this comment.
This planning is good for the free tier users but what would we do for the pro users that are paying for the service. As they will have access to change between models depending on their needs.
|
|
||
| ## Tier Limits | ||
|
|
||
| | | Free | Startup | Enterprise | |
There was a problem hiding this comment.
not startup we are to name it pro plan
| ## 5-Layer Cost Stack | ||
|
|
||
| ``` | ||
| Layer 0 Anthropic prefix cache cache_control: ephemeral on system prompt |
There was a problem hiding this comment.
avoid usnig caching of other provider that way we wont be able to use the data that our customer is giving for our improvement purpose.
| Layer 0 Anthropic prefix cache cache_control: ephemeral on system prompt | ||
| All free orgs share one prefix → 0.1× input cost after first write | ||
|
|
||
| Layer 1 Model routing by role Groq for Researcher/Writer, Haiku for Coder |
There was a problem hiding this comment.
How the hell is haiku better in coding part ??
Viscous106
left a comment
There was a problem hiding this comment.
@OfficialAbhinavSingh There is lot of ai slop in this pr kindly refrain from using ai for all things try to filter out things that are not true by reviewing your own changes before pushing or opening a pr .
Adds production-level free tier cost management reference covering 5-layer cost defense stack, model pricing (Groq/Haiku/Sonnet/Opus),
Claude.ai-style UX pattern (4 states, upgrade prompts, at-capacity handling), token budget ledger, shared cross-org semantic cache, context compression, Batch API 50% discount, and full implementation code.
Target: under .50/org/month for free tier.