Skip to content

jina-ai/cli

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

jina-cli

MCP version

All Jina AI APIs as Unix commands. Search, read, embed, rerank - with pipes.

This CLI is designed for both humans and AI agents. An agent with shell access needs only run(command="jina search ...") instead of managing 20 separate tool definitions. The CLI supports pipes, chaining (&&, ||, ;), and --help for self-discovery.

Install

pip install jina-cli
# or
uv pip install jina-cli

Set your API key:

export JINA_API_KEY=your-key-here
# Get one at https://jina.ai/?sui=apikey

Commands

Command Description
jina read URL Extract clean markdown from web pages
jina search QUERY Web search (also --arxiv, --ssrn, --images, --blog)
jina embed TEXT Generate embeddings
jina rerank QUERY Rerank documents from stdin by relevance
jina dedup Deduplicate text from stdin
jina screenshot URL Capture screenshot of a URL
jina bibtex QUERY Search BibTeX citations (DBLP + Semantic Scholar)
jina expand QUERY Expand a query into related queries
jina pdf URL Extract figures/tables/equations from PDFs
jina datetime URL Guess publish/update date of a URL
jina primer Context info (time, location, network)
jina grep PATTERN Semantic grep (requires pip install jina-grep)

Pipes

The point of a CLI is composability. Every command reads from stdin and writes to stdout.

# Search and rerank
jina search "transformer models" | jina rerank "efficient inference"

# Read multiple URLs
cat urls.txt | jina read

# Search, deduplicate results
jina search "attention mechanism" | jina dedup

# Chain searches
jina expand "climate change" | head -1 | xargs -I {} jina search "{}"

# Get BibTeX for arXiv results
jina search --arxiv "BERT" --json | jq -r '.results[].title' | head -3

Usage

Read web pages

jina read https://example.com
jina read https://example.com --links --images
echo "https://example.com" | jina read

Search

jina search "what is BERT"
jina search --arxiv "attention mechanism" -n 10
jina search --ssrn "corporate governance"
jina search --images "neural network diagram"
jina search --blog "embeddings"
jina search "AI news" --time d          # past day
jina search "LLMs" --gl us --hl en     # US, English

Embed

jina embed "hello world"
jina embed "text1" "text2" "text3"
cat texts.txt | jina embed
jina embed "hello" --model jina-embeddings-v3 --task retrieval.query

Rerank

cat docs.txt | jina rerank "machine learning"
jina search "AI" | jina rerank "embeddings" --top-n 5

Deduplicate

cat items.txt | jina dedup
cat items.txt | jina dedup -k 10

Screenshot

jina screenshot https://example.com                        # prints screenshot URL
jina screenshot https://example.com -o page.png            # saves to file
jina screenshot https://example.com --full-page -o page.jpg

BibTeX

jina bibtex "attention is all you need"
jina bibtex "transformer" --author Vaswani --year 2017

PDF extraction

jina pdf https://arxiv.org/pdf/2301.12345
jina pdf 2301.12345                        # arXiv ID shorthand
jina pdf https://example.com/paper.pdf --type figure,table

JSON output

Every command supports --json for structured output, useful for piping to jq:

jina search "BERT" --json | jq '.results[0].url'
jina read https://example.com --json | jq '.data.content'

Exit codes

Code Meaning
0 Success
1 User/input error (missing args, bad input, missing API key)
2 API/server error (network, timeout, server error)
130 Interrupted (Ctrl+C)

Useful for scripting and agent workflows:

jina search "query" && echo "success" || echo "failed with $?"

Environment variables

Variable Description
JINA_API_KEY API key for Jina services (required for most commands)

For AI agents

An agent with shell access can use this CLI directly:

result = run(command="jina search 'transformer architecture'")
result = run(command="jina read https://arxiv.org/abs/2301.12345")
result = run(command="jina search 'AI' | jina rerank 'embeddings'")

No tool catalog needed. The agent discovers capabilities via jina --help and jina search --help. Errors include actionable guidance.

Semantic grep

jina grep provides semantic search over files using local Jina embeddings on MLX. It requires a separate install:

pip install jina-grep
jina grep "error handling" src/
jina grep -r --threshold 0.3 "database connection" .
grep -rn "error" src/ | jina grep "retry logic"

Supports most GNU grep flags (-r, -n, -l, -c, -A/-B/-C, --include, --exclude) plus semantic flags (--threshold, --top-k, --model). Run jina grep --help for full options.

Server mode

For repeated queries, start a persistent embedding server to avoid model reload:

jina grep serve start    # background server, model stays in GPU memory
jina grep serve stop     # stop when done

Local mode

jina embed and jina rerank support --local to run on Apple Silicon via the jina-grep embedding server instead of the Jina API. No API key needed.

# Start the local server first
jina grep serve start

# Local embeddings
jina embed --local "hello world"
cat texts.txt | jina embed --local --json

# Local reranking (cosine similarity on local embeddings)
cat docs.txt | jina rerank --local "machine learning"

Local mode uses jina-embeddings-v5-nano by default. Override with --model jina-embeddings-v5-small.

Requires pip install jina-grep and jina grep serve start.

Design principles

Inspired by CLI is All Agents Need:

  • One tool, not twenty. A single run(command="jina search ...") replaces a sprawling tool catalog. Less tool selection overhead, more problem solving.
  • Unix pipes are the composition model. stdout is data, stderr is diagnostics. Commands chain with |, &&, ||. No SDK needed.
  • Progressive --help for self-discovery. Layer 0: command list. Layer 1: usage + examples. Layer 2: full options. The agent fetches only what it needs, saving context budget.
  • Error messages that course-correct. Every error says what went wrong and exactly how to fix it. One bad command should not cost more than one retry.
  • stderr is the agent's most important channel. When a command fails, stderr carries the fix. Never discard it. Never mix it with data.
  • Consistent output format. Same structure every time so the agent learns once, not every time. --json for structured, plain text for pipes.
  • Meaningful exit codes. 0 success, 1 user error, 2 API error, 130 interrupted. Scripts and agents branch on these, not on parsing error strings.
  • Layer 1 is raw Unix, Layer 2 is for LLM cognition. Pipe internals stay pure (no metadata, no truncation). Formatting and context only at the final output boundary.

License

Apache-2.0

About

All Jina AI APIs as Unix CLI commands. Search, read, embed, rerank - with pipes.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages