fi — free LLM calls


A Claude Code skill that calls free LLMs with your keys.
No server. No Docker. Just text in, text out.

License: MIT · npx skills · stdlib only · no Docker


Install

npx skills add bradagi/fi

That's it. The skill installs into your agent's skills directory — SKILL.md, the CLI, and the provider catalog all arrive together.
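Roughly, the installed layout looks like this (the exact location depends on your agent; the file names are the ones this README mentions, and the arrangement is an assumption):

<skills dir>/fi/
  SKILL.md          # tells the agent when and how to reach for the skill
  fi                # the stdlib-only Python CLI
  providers.toml    # the provider and model catalog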

Use it

Talk to your agent. The skill wires up the common workflows:

you> add my gemini key AIza…
claude> [stores as GEMINI_API_KEY_1]

you> what free models do I have?
claude> [runs ./fi models — shows your aliases across providers]

you> ask gemini to summarize this pdf
claude> [pipes the pdf text into ./fi call gemini-2.5-flash]

you> use a reasoning model to solve this problem
claude> [runs ./fi call deepseek/deepseek-r1:free --prompt "…"]

you> add my openrouter key sk-or-v1-… and refresh
claude> [./fi keys add + ./fi sync — discovers 30+ new models]

Under the hood, the skill drives a tiny Python CLI (fi) that makes direct HTTPS calls to each provider's OpenAI-compatible /v1/chat/completions endpoint. No localhost proxy, no Docker, no daemon.
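For a feel of what that looks like on the wire, here is a minimal Python stdlib sketch of the same kind of request; the endpoint URL, key name, and model ID are placeholders, not fi's actual source:

import json
import os
import urllib.request

# Placeholder endpoint; fi resolves the real base_url from its catalog.
url = "https://api.example-provider.com/v1/chat/completions"

body = json.dumps({
    "model": "their/model-id",  # placeholder upstream model ID
    "messages": [{"role": "user", "content": "Hello"}],
}).encode()

req = urllib.request.Request(url, data=body, headers={
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['EXAMPLE_API_KEY_1']}",  # placeholder key slot
})

with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])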

Direct CLI usage

If you want to call it yourself instead of through an agent:

git clone https://github.com/bradAGI/fi
cd fi

./fi keys add gemini AIza...
./fi call gemini-2.5-flash --prompt "Hello"
./fi stream gemini-2.5-pro --prompt "Tell me a story"

echo "summarize this" | ./fi call llama-3.3-70b-groq

Commands

./fi call <alias> --prompt "..."        Make a call, print the response
./fi call <alias> --prompt "..." --system "..." --max-tokens 256 --temperature 0.2
./fi stream <alias> --prompt "..."      Stream tokens to stdout

./fi keys add <provider> <key>          Store a key (numbered slots auto-rotate)
./fi keys list                          Masked list of configured keys
./fi keys remove <provider> [--index N]

./fi providers                          Active vs inactive providers
./fi models [-g GROUP]                  Callable model aliases (filter by group)
./fi sync                               Refresh auto-discovered catalogs
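"Numbered slots" refers to keys stored as PROVIDER_API_KEY_1, PROVIDER_API_KEY_2, and so on (as in the GEMINI_API_KEY_1 example above). A sketch of the rotation idea, assuming fi simply cycles through whichever slots are set (an assumption about its internals, not its actual code):

import itertools
import os

def numbered_keys(provider: str) -> list[str]:
    """Collect PROVIDER_API_KEY_1, _2, ... until the first empty slot."""
    keys = []
    for i in itertools.count(1):
        key = os.environ.get(f"{provider.upper()}_API_KEY_{i}")
        if key is None:
            return keys
        keys.append(key)

# Hypothetical round-robin: each request takes the next configured key.
keys = numbered_keys("gemini")
if keys:
    rotation = itertools.cycle(keys)
    api_key = next(rotation)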

Both call and stream accept the prompt via --prompt or stdin:

cat prompt.txt | ./fi call gemini-2.5-pro
./fi stream llama-3.3-70b-groq --prompt "Write a haiku"

Requirements

  • Python 3.11+ (stdlib only — no pip install)
  • Internet access (calls go directly to upstream providers)

That's it. No Docker. No server. No background process.

Providers

Auto-discovered (catalogs refresh live from upstream, cached 24h):

Provider        Source                                                                 Size
OpenRouter      openrouter.ai/api/v1/models (pricing.prompt == 0)                      ~30 free models
NVIDIA NIM      integrate.api.nvidia.com/v1/models                                     ~135 models
Hugging Face    router.huggingface.co/v1/models (live providers, $0.10/mo credits)     ~120 models
Pollinations    gen.pollinations.ai/v1/models (keyless)                                ~10 chat models
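For example, the OpenRouter row means sync keeps only zero-priced entries from the public catalog. A sketch of that filter, assuming the documented /api/v1/models response shape (pricing fields arrive as strings, hence the float()):

import json
import urllib.request

# Fetch OpenRouter's public model catalog (no key needed for this endpoint).
with urllib.request.urlopen("https://openrouter.ai/api/v1/models") as resp:
    catalog = json.load(resp)

# Keep models whose prompt price is zero, per the table above.
free = [m["id"] for m in catalog["data"] if float(m["pricing"]["prompt"]) == 0]
print(f"{len(free)} free models, e.g. {free[:3]}")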

Static catalog (add models manually to providers.toml):

Gemini, Groq, Cerebras, Mistral, GitHub Models, Cohere, Together, Scaleway, Tencent Hunyuan, Chutes, LLM7, Ollama Cloud.

Groups

Filter models by capability with ./fi models -g <group>:

Group       Matches
fast        Low-latency small models
smart       Flagship large models
code        Coder-tuned models
reasoning   Thinking models (R1, QwQ, o-series)
vision      Multimodal text+image
embed       Embedding models
rerank      Rerankers
free        Models with at least one zero-priced provider

Adding a provider

Append one TOML block to providers.toml, then run ./fi sync if the provider is auto-discovered; if it's static, you're done:

[[provider]]
name = "your-provider"
key_env = "YOURPROVIDER_API_KEY"
base_url = "https://api.yours.example/v1"     # OpenAI-compatible endpoint
optional_key = false                           # true for keyless providers

[[provider.model]]
alias = "yours-flagship"
upstream = "their/model-id"
groups = ["smart", "vision"]

Then ./fi keys add your-provider <key> (unless optional_key = true) and ./fi call yours-flagship --prompt "…".

Why not a gateway?

Earlier versions of this repo ran a local LiteLLM-backed gateway at localhost:4000 that spoke both the OpenAI and Anthropic API shapes to any client, which is useful if you're building an app that needs a persistent endpoint.

For agent use (Claude calling a model with your keys on your behalf), the gateway is overkill — the agent can just shell out directly. This repo is the simplified direct-call version.

If you do want the full gateway, it lives at bradAGI/fi-gateway.

Caveats

  • Providers that aren't OpenAI-compatible for chat (Voyage, Jina, Mixedbread, Nomic — mostly embedding-focused) aren't in this catalog yet. Add them to fi-gateway if you need translation.
  • Hugging Face models are "credit-gated free" — callable with HF's $0.10/mo free credit, not zero-cost per call. Use -g free to filter to HF models with at least one zero-priced provider.
  • ./fi stream uses SSE and requires the upstream to support stream: true on chat completions (almost all do); the sketch below shows the shape of that stream.
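A minimal sketch of consuming such a stream with the Python stdlib; as before, the endpoint and key name are placeholders, and the data:/[DONE] framing is the standard OpenAI-style SSE format rather than anything fi-specific:

import json
import os
import urllib.request

# Same chat-completions POST as a normal call, plus "stream": true.
url = "https://api.example-provider.com/v1/chat/completions"  # placeholder
body = json.dumps({
    "model": "their/model-id",  # placeholder
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": True,
}).encode()

req = urllib.request.Request(url, data=body, headers={
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['EXAMPLE_API_KEY_1']}",  # placeholder
})

with urllib.request.urlopen(req) as resp:
    for raw in resp:                       # SSE: one "data: {...}" line per chunk
        line = raw.decode().strip()
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":            # OpenAI-style end-of-stream marker
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)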

License

MIT
