researchMind

researchMind is a lightweight research paper agent. It combines a readable terminal agent core with practical paper workflows: PDF reading, single-paper search, multi-paper RAG, paper comparison, and a small metadata catalog for managing your own paper library.

Chinese (中文) | Environment

Features

  • Terminal agent for OpenAI-compatible LLM APIs.
  • PDF reading with PyMuPDF and optional OCR fallback for scanned PDFs.
  • Single-paper semantic search with build_paper_index and search_paper.
  • Multi-paper vector library with add_paper_to_library, search_paper_library, and compare_papers.
  • Paper catalog management with add_paper_record, list_paper_library, update_paper_metadata, remove_paper_record, and export_paper_library.
  • Streamlit app for uploading papers, searching evidence, generating answers, writing review drafts, editing metadata, and exporting the catalog.

The old internal corecoder package name is kept as a compatibility layer, but new users should run the researchmind command and import researchmind.

Quick Start

git clone https://github.com/Pepper66/researchMind.git
cd researchMind

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
python -m pip install -U pip
python -m pip install -e .

cp .env.example .env

Edit .env:

OPENAI_API_KEY=sk-your-key
OPENAI_BASE_URL=https://api.openai.com/v1
RESEARCHMIND_MODEL=gpt-4o

Run the CLI:

researchmind -m gpt-4o

One-shot mode:

researchmind -p "Read papers/SAM.pdf and summarize its method."

Paper RAG Setup

Install the optional paper dependencies:

python -m pip install -e ".[paper]"

or:

python -m pip install -r requirements-paper.txt

Run the app:

streamlit run app.py

The default embedding model is:

RESEARCHMIND_EMBED_MODEL=sentence-transformers/all-MiniLM-L6-v2

In your private .env file, you can set this to any Hugging Face sentence-transformers model id or to a local model path.
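As a rough illustration of what the embedding model is for, the sketch below ranks text chunks by cosine similarity to a query vector. It is not researchMind's implementation: the `cosine` and `rank_chunks` helpers are hypothetical, and the hand-made three-dimensional vectors stand in for real sentence-transformer embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(query_vec, chunk_vecs, top_k=2):
    """Return up to top_k (chunk_id, score) pairs, most similar first."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors invented for this example; a real embedding model would produce
# much higher-dimensional vectors from the chunk text.
chunks = {
    "method": [0.9, 0.1, 0.0],
    "limitations": [0.1, 0.9, 0.2],
    "datasets": [0.0, 0.2, 0.9],
}
query = [0.2, 0.8, 0.1]  # pretend embedding of a question about limitations

top = rank_chunks(query, chunks)
for chunk_id, score in top:
    print(f"{chunk_id}: {score:.3f}")
```

Swapping the embedding model changes the vectors, and therefore the ranking, which is why an index built with one model should presumably be rebuilt after changing RESEARCHMIND_EMBED_MODEL.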

Example Agent Tasks

Add papers/SAM.pdf to my library named medlib, tag it segmentation, then search
for its limitations.

The agent can call:

add_paper_to_library(file_path="papers/SAM.pdf", library_name="medlib")
update_paper_metadata(paper_id="...", library_name="medlib", tags="segmentation")
search_paper_library(query="SAM limitations", library_name="medlib")

Compare multiple topics or papers:

Compare SAM and MedSAM in research problem, method, datasets, and limitations.

Export catalog metadata:

Export my medlib paper catalog to papers/medlib_catalog.csv.

Configuration

Common environment variables:

| Variable | Purpose |
| --- | --- |
| OPENAI_API_KEY | API key for OpenAI-compatible providers. |
| OPENAI_BASE_URL | Provider endpoint. Leave unset for OpenAI. |
| RESEARCHMIND_MODEL | Main CLI model. |
| RESEARCHMIND_ANSWER_MODEL | Streamlit answer-generation model. |
| RESEARCHMIND_EMBED_MODEL | Embedding model id or local path. |
| RESEARCHMIND_INDEX_DIR | Single-paper index directory. |
| RESEARCHMIND_LIBRARY_DIR | Multi-paper vector library directory. |
| RESEARCHMIND_PROVIDER | Set to litellm for LiteLLM backend. |

See ENVIRONMENT.md for more details.
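To check what a given environment would hand to the tool, a small standalone script along these lines may help. It is a sketch, not researchMind's own configuration loader: `resolve_settings` and `describe` are hypothetical helpers, the variable names come from the table above, and the only default applied is the documented embedding model.

```python
import os

# Variable names as listed in the configuration table.
VARIABLES = [
    "OPENAI_API_KEY",
    "OPENAI_BASE_URL",
    "RESEARCHMIND_MODEL",
    "RESEARCHMIND_ANSWER_MODEL",
    "RESEARCHMIND_EMBED_MODEL",
    "RESEARCHMIND_INDEX_DIR",
    "RESEARCHMIND_LIBRARY_DIR",
    "RESEARCHMIND_PROVIDER",
]

# The one default this README documents.
DEFAULT_EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"


def resolve_settings(env=None):
    """Collect the known variables from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Treat empty strings the same as unset.
    settings = {name: env.get(name) or None for name in VARIABLES}
    if settings["RESEARCHMIND_EMBED_MODEL"] is None:
        settings["RESEARCHMIND_EMBED_MODEL"] = DEFAULT_EMBED_MODEL
    return settings


def describe(settings):
    """Render settings one per line, masking the API key."""
    lines = []
    for name, value in settings.items():
        if name == "OPENAI_API_KEY" and value:
            value = value[:3] + "..."  # never print the full key
        lines.append(f"{name}={value if value is not None else '(unset)'}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe(resolve_settings()))
```

Passing a plain dictionary instead of relying on `os.environ` makes the helper easy to try without touching your shell environment.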

Local Data

Runtime data is intentionally ignored by git:

  • papers/
  • .researchmind_index/
  • .researchmind_library/
  • legacy .paperagent_index/
  • legacy .paperagent_library/

Do not commit private PDFs, vector indexes, API keys, or local model paths.
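If you want an extra guard on top of .gitignore, a check like the following could run before a commit. It is a minimal sketch under stated assumptions: `is_runtime_data` and `flag_paths` are hypothetical helpers, the directory names are the ones listed above, and flagging a file named .env is an added assumption based on the advice not to commit API keys.

```python
from pathlib import PurePosixPath

# Runtime directories listed in this section, including the legacy names.
RUNTIME_DIRS = {
    "papers",
    ".researchmind_index",
    ".researchmind_library",
    ".paperagent_index",
    ".paperagent_library",
}


def is_runtime_data(path):
    """True if a repo-relative POSIX path sits under a runtime directory."""
    parts = PurePosixPath(path).parts
    return bool(parts) and parts[0] in RUNTIME_DIRS


def flag_paths(paths):
    """Return the paths that should not be committed, in input order."""
    return [
        p for p in paths
        # Assumption: a file named exactly ".env" holds private keys.
        if is_runtime_data(p) or PurePosixPath(p).name == ".env"
    ]


# Example file list; the source file path is made up for illustration.
candidates = ["papers/SAM.pdf", "researchmind/cli.py", ".env", ".env.example"]
flagged = flag_paths(candidates)
print(flagged)
```

The list of candidate paths could come from `git diff --cached --name-only`; .env.example is deliberately not flagged because it is meant to be shared.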

Development

python -m pip install -e ".[dev]"
python -m pytest -q
python -m compileall -q corecoder researchmind tests

Build the package:

python -m pip install -U build twine
python -m build
python -m twine check dist/*

License

MIT.
