jina-cli

All Jina AI APIs as Unix commands. Search, read, embed, rerank - with pipes.

This CLI is designed for both humans and AI agents. An agent with shell access needs only run(command="jina search ...") instead of managing 20 separate tool definitions. The CLI supports pipes, chaining (&&, ||, ;), and --help for self-discovery.

Install

pip install jina-cli
# or
uv pip install jina-cli

Set your API key:

export JINA_API_KEY=your-key-here
# Get one at https://jina.ai/?sui=apikey

Commands

Command	Description
`jina read URL`	Extract clean markdown from web pages
`jina search QUERY`	Web search (also --arxiv, --ssrn, --images, --blog)
`jina embed TEXT`	Generate embeddings
`jina rerank QUERY`	Rerank documents from stdin by relevance
`jina dedup`	Deduplicate text from stdin
`jina screenshot URL`	Capture screenshot of a URL
`jina bibtex QUERY`	Search BibTeX citations (DBLP + Semantic Scholar)
`jina expand QUERY`	Expand a query into related queries
`jina pdf URL`	Extract figures/tables/equations from PDFs
`jina datetime URL`	Guess publish/update date of a URL
`jina primer`	Context info (time, location, network)
`jina grep PATTERN`	Semantic grep (requires `pip install jina-grep`)

Pipes

The point of a CLI is composability. Every command reads from stdin and writes to stdout.

# Search and rerank
jina search "transformer models" | jina rerank "efficient inference"

# Read multiple URLs
cat urls.txt | jina read

# Search, deduplicate results
jina search "attention mechanism" | jina dedup

# Chain searches
jina expand "climate change" | head -1 | xargs -I {} jina search "{}"

# Get BibTeX for arXiv results
jina search --arxiv "BERT" --json | jq -r '.results[].title' | head -3

Usage

Read web pages

jina read https://example.com
jina read https://example.com --links --images
echo "https://example.com" | jina read

Search

jina search "what is BERT"
jina search --arxiv "attention mechanism" -n 10
jina search --ssrn "corporate governance"
jina search --images "neural network diagram"
jina search --blog "embeddings"
jina search "AI news" --time d          # past day
jina search "LLMs" --gl us --hl en     # US, English

Embed

jina embed "hello world"
jina embed "text1" "text2" "text3"
cat texts.txt | jina embed
jina embed "hello" --model jina-embeddings-v3 --task retrieval.query

Rerank

cat docs.txt | jina rerank "machine learning"
jina search "AI" | jina rerank "embeddings" --top-n 5

Deduplicate

cat items.txt | jina dedup
cat items.txt | jina dedup -k 10

Screenshot

jina screenshot https://example.com                        # prints screenshot URL
jina screenshot https://example.com -o page.png            # saves to file
jina screenshot https://example.com --full-page -o page.jpg

BibTeX

jina bibtex "attention is all you need"
jina bibtex "transformer" --author Vaswani --year 2017

PDF extraction

jina pdf https://arxiv.org/pdf/2301.12345
jina pdf 2301.12345                        # arXiv ID shorthand
jina pdf https://example.com/paper.pdf --type figure,table

JSON output

Every command supports --json for structured output, useful for piping to jq:

jina search "BERT" --json | jq '.results[0].url'
jina read https://example.com --json | jq '.data.content'

Exit codes

Code	Meaning
0	Success
1	User/input error (missing args, bad input, missing API key)
2	API/server error (network, timeout, server error)
130	Interrupted (Ctrl+C)

Useful for scripting and agent workflows:

jina search "query" && echo "success" || echo "failed with $?"

Environment variables

Variable	Description
`JINA_API_KEY`	API key for Jina services (required for most commands)

For AI agents

An agent with shell access can use this CLI directly:

result = run(command="jina search 'transformer architecture'")
result = run(command="jina read https://arxiv.org/abs/2301.12345")
result = run(command="jina search 'AI' | jina rerank 'embeddings'")

No tool catalog needed. The agent discovers capabilities via jina --help and jina search --help. Errors include actionable guidance.

Semantic grep

jina grep provides semantic search over files using local Jina embeddings on MLX. It requires a separate install:

pip install jina-grep

jina grep "error handling" src/
jina grep -r --threshold 0.3 "database connection" .
grep -rn "error" src/ | jina grep "retry logic"

Supports most GNU grep flags (-r, -n, -l, -c, -A/-B/-C, --include, --exclude) plus semantic flags (--threshold, --top-k, --model). Run jina grep --help for full options.

Server mode

For repeated queries, start a persistent embedding server to avoid model reload:

jina grep serve start    # background server, model stays in GPU memory
jina grep serve stop     # stop when done

Local mode

jina embed and jina rerank support --local to run on Apple Silicon via the jina-grep embedding server instead of the Jina API. No API key needed.

# Start the local server first
jina grep serve start

# Local embeddings
jina embed --local "hello world"
cat texts.txt | jina embed --local --json

# Local reranking (cosine similarity on local embeddings)
cat docs.txt | jina rerank --local "machine learning"

Local mode uses jina-embeddings-v5-nano by default. Override with --model jina-embeddings-v5-small.

Requires pip install jina-grep and jina grep serve start.

Design principles

Inspired by CLI is All Agents Need:

One tool, not twenty. A single run(command="jina search ...") replaces a sprawling tool catalog. Less tool selection overhead, more problem solving.
Unix pipes are the composition model. stdout is data, stderr is diagnostics. Commands chain with |, &&, ||. No SDK needed.
Progressive --help for self-discovery. Layer 0: command list. Layer 1: usage + examples. Layer 2: full options. The agent fetches only what it needs, saving context budget.
Error messages that course-correct. Every error says what went wrong and exactly how to fix it. One bad command should not cost more than one retry.
stderr is the agent's most important channel. When a command fails, stderr carries the fix. Never discard it. Never mix it with data.
Consistent output format. Same structure every time so the agent learns once, not every time. --json for structured, plain text for pipes.
Meaningful exit codes. 0 success, 1 user error, 2 API error, 130 interrupted. Scripts and agents branch on these, not on parsing error strings.
Layer 1 is raw Unix, Layer 2 is for LLM cognition. Pipe internals stay pure (no metadata, no truncation). Formatting and context only at the final output boundary.

License

Apache-2.0

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.github/workflows		.github/workflows
jina_cli		jina_cli
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

jina-cli

Install

Commands

Pipes

Usage

Read web pages

Search

Embed

Rerank

Deduplicate

Screenshot

BibTeX

PDF extraction

JSON output

Exit codes

Environment variables

For AI agents

Semantic grep

Server mode

Local mode

Design principles

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

jina-cli

Install

Commands

Pipes

Usage

Read web pages

Search

Embed

Rerank

Deduplicate

Screenshot

BibTeX

PDF extraction

JSON output

Exit codes

Environment variables

For AI agents

Semantic grep

Server mode

Local mode

Design principles

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages