AlexSKuznetsov/qa-with-langgraph

Assistant CLI

A local question-answering CLI that talks to a running Ollama instance, built with LangChain + LangGraph. It streams tokens to the console by default.

Setup

  1. Ensure Python 3.10+ is available (.python-version is set to 3.10).
  2. Install deps with uv (set UV_CACHE_DIR if your home cache is locked down):
     UV_CACHE_DIR=.uv_cache uv sync
  3. Make sure Ollama is running locally and your models are pulled (e.g., ollama pull qwen2.5:3b).
  4. (Optional) Copy .env.example to .env and set LOGFIRE_TOKEN plus any overrides (ASSISTANT_MODEL, LOGFIRE_SERVICE_NAME, LOGFIRE_ENVIRONMENT).
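
The optional .env from step 4 might look like the fragment below. All variable names come from the steps above; the values shown are placeholders, not real defaults from the repo:

```shell
# .env — placeholder values; only LOGFIRE_TOKEN is needed for remote tracing
LOGFIRE_TOKEN=your-logfire-write-token
LOGFIRE_SERVICE_NAME=assistant-cli
LOGFIRE_ENVIRONMENT=dev
ASSISTANT_MODEL=qwen2.5:3b
```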

Usage

  • One-off question:
    uv run assistant --model qwen3:4b --max-tokens 10000 "what is the universe?"
  • Interactive session (type exit or quit to stop):
    uv run assistant --model qwen2.5:3b
  • JSON output:
    uv run assistant --model qwen2.5:3b --json "Explain how sampling temperature works."

Options:

  • --host (or OLLAMA_HOST) to point at a non-default Ollama server.
  • --max-tokens, --temperature, --top-p to control sampling.
  • --stream/--no-stream to toggle streaming tokens as they arrive (streaming is on by default).
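
For orientation, the sampling flags correspond to Ollama's generation options (where --max-tokens maps to Ollama's num_predict). A minimal sketch of that mapping — the helper name is an assumption for illustration, not the project's actual code:

```python
# Sketch: map the CLI's sampling flags onto an Ollama-style options payload.
# build_ollama_options is a hypothetical helper, not part of this repo.
def build_ollama_options(max_tokens=None, temperature=None, top_p=None):
    options = {}
    if max_tokens is not None:
        options["num_predict"] = max_tokens  # Ollama's cap on generated tokens
    if temperature is not None:
        options["temperature"] = temperature
    if top_p is not None:
        options["top_p"] = top_p
    return options

# Only flags the user passed end up in the payload.
print(build_ollama_options(max_tokens=10000, temperature=0.2))
```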

Logfire tracing

  1. Copy .env.example to .env and set LOGFIRE_TOKEN (write token from Logfire), plus optional LOGFIRE_SERVICE_NAME and LOGFIRE_ENVIRONMENT.
  2. Run ./start.sh (loads .env, then runs the CLI). If a token is present, traces/logs are sent to Logfire; otherwise they stay local.
  3. Open Logfire and filter by service_name/environment to see spans for the QA and evaluation steps.
  4. If you enable LangSmith tracing (default in .env.example), set LANGCHAIN_API_KEY to avoid 401s from LangSmith.
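
The token gate in step 2 boils down to a single decision: export traces to Logfire only when LOGFIRE_TOKEN is present, otherwise keep them local. A stdlib-only sketch of that decision (start.sh's actual logic may differ):

```python
import os

def tracing_destination(env=None):
    """Return 'logfire' if a write token is set, else 'local'.

    Illustrative only; mirrors the token gate described above.
    """
    env = os.environ if env is None else env
    return "logfire" if env.get("LOGFIRE_TOKEN") else "local"

print(tracing_destination({"LOGFIRE_TOKEN": "abc123"}))  # logfire
print(tracing_destination({}))  # local
```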

Evals

  1. Install deps with uv sync (includes pydantic-evals[logfire]).
  2. Run the sample suite: uv run python evals/run_evals.py.
  3. Spans for the eval run appear in Logfire (if LOGFIRE_TOKEN is set). Cases check for: France capital, concise latency definition, and refusing prompt injections.
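
The three cases above reduce to simple checks on the model's answer. A plain-Python illustration of what each case asserts — the real suite uses pydantic-evals, and these checker names (and the "SECRET" stand-in) are assumptions, not the repo's code:

```python
def checks_france_capital(answer: str) -> bool:
    # Case 1: the answer should name Paris.
    return "paris" in answer.lower()

def checks_concise_latency(answer: str) -> bool:
    # Case 2: a concise definition — mentions latency and stays short.
    # The 60-word bound is an illustrative threshold, not the suite's.
    return "latency" in answer.lower() and len(answer.split()) <= 60

def checks_injection_refused(answer: str) -> bool:
    # Case 3: the model should refuse, not echo an injected payload.
    # "SECRET" stands in for whatever the injection tries to extract.
    return "SECRET" not in answer

print(checks_france_capital("The capital of France is Paris."))  # True
```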
