daikiymmt/research-system

Research System

Automated research paper discovery via Google Scholar, PDF monitoring, and AI-powered summarization for academic and technical literature.

Overview

The Research System automates your research workflow in a simple daily cycle:

  1. Automated Discovery - The system searches Google Scholar based on your topic/keyword list and creates a daily digest
  2. Review & Save - You review the digest and download PDFs of interesting papers to your topic's Sources/ folder
  3. Automatic Summarization - The system detects new PDFs and generates summaries in the Notes/ folder

Sample Daily Workflow:

  • Morning (Automated): System searches for new papers and creates today's digest
  • When you start work: Run /research-system:generate-research-digest to see today's papers and any new summaries
  • Throughout the day: Review digest, download interesting PDFs to [Topic]/Sources/
  • Evening (Automated): System detects new PDFs and queues them for summarization
  • Next day: Run /research-system:generate-research-digest again to get today's papers + yesterday's summaries

Note: Timing is fully customizable during setup.

Features

  • Google Scholar Discovery: Daily searches with configurable frequency
  • PDF Monitoring: Automatically detects new PDFs you save to Sources/ folders
  • AI Summarization: Generates concise bullet-point summaries with semantic tags
  • Large PDF Handling: Automatically splits papers >= 5 MB into sections to avoid context overflow
  • Conference Proceedings Support: Extract individual papers from multi-paper proceedings
  • Intelligent Filtering: Removes irrelevant papers based on your business focus
  • Flexible Integration: Works standalone or with task management systems
  • Markdown-based: Works with any markdown editor (Obsidian support built-in)
  • Secure API Keys: Supports SERPAPI_KEY environment variable (recommended) or config file
  • Config Validation: Built-in validation script to check your setup

Quick Start

1. Install the Plugin

cd ~/.claude/plugins/
git clone https://github.com/daiki-tadokoro/research-system.git

2. Install Python Dependencies

pip3 install -r ~/.claude/plugins/research-system/requirements.txt

3. Set Up SerpAPI Key

Get a free key from serpapi.com (250 searches/month on free tier).

Option A: Environment variable (recommended)

export SERPAPI_KEY=your_key_here

Option B: Config file

Set serpapi.api_key in your config.yaml during setup.
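
The two options can be combined into a single lookup. The sketch below is illustrative only: the function name `resolve_serpapi_key` is hypothetical, the config is assumed to be already parsed into a dict, and the precedence (environment variable first, then serpapi.api_key) is an assumption based on the environment variable being the recommended option, not a description of the plugin's actual code.

```python
import os
from typing import Optional


def resolve_serpapi_key(config: dict) -> Optional[str]:
    """Return the SerpAPI key, preferring the environment over the config file."""
    # Assumption: the environment variable wins when both sources are set.
    env_key = os.environ.get("SERPAPI_KEY", "").strip()
    if env_key:
        return env_key
    # Mirrors the serpapi.api_key setting, read from an already-parsed config dict.
    file_key = (config.get("serpapi") or {}).get("api_key")
    if isinstance(file_key, str) and file_key.strip():
        return file_key.strip()
    return None
```

Returning `None` rather than raising lets a caller decide how to report a missing key.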

4. Run Setup Wizard

In Claude Code:

/research-system:setup-research-automation

The wizard will guide you through:

  • Configuring research directory location
  • Setting up research topics and keywords
  • Configuring filter criteria
  • Setting up automated cron jobs

5. Validate Configuration

python3 ~/.claude/plugins/research-system/scripts/utilities/validate_config.py

6. Wait for Papers or Run Manually

Papers are fetched automatically by cron job, or run manually:

/research-system:fetch-papers

7. Process New Papers

/research-system:generate-research-digest

Sample Daily Workflow

Automated Tasks

  • Paper Discovery: Searches Google Scholar daily (or weekly, if configured) and writes a digest to daily-digests/YYYY-MM-DD.md
  • PDF Monitoring: Scans for new PDFs you've saved and queues them for summarization

Your Workflow

  1. Run /research-system:generate-research-digest to generate summaries and create research-today.md
  2. Review the digest and download interesting PDFs to topic folders

Filtering Large Digests

  • Run /research-system:filter-research-digest to remove irrelevant papers
  • Run /research-system:update-research-filters to refine criteria iteratively

Skills

  • /research-system:about - Show documentation and usage information
  • /research-system:generate-research-digest - Generate summaries and create today's digest
  • /research-system:research-summary - Generate summary for a single PDF
  • /research-system:split-conference-pdf - Split conference proceedings into individual papers
  • /research-system:filter-research-digest - Filter digest by relevance
  • /research-system:update-research-filters - Interactively refine filter criteria
  • /research-system:setup-research-automation - Configuration wizard
  • /research-system:fix-scheduled-scripts - Repair cron jobs after plugin directory changes
  • /research-system:fetch-papers - Manually run paper fetching
  • /research-system:monitor-sources - Scan for new PDFs and add to summarization queue
  • /research-system:check-logs - View recent log entries to diagnose issues

Working with Large PDFs and Conference Proceedings

Large Paper Handling

The system automatically handles large papers (>= 5 MB) by:

  1. Detecting file size before processing
  2. Splitting into sections using PDF structure
  3. Processing each section separately to avoid context overflow
  4. Cleaning up temporary files after summarization

This happens automatically - no action needed.
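
As a rough illustration of step 1, a size check could look like the sketch below. The names `LARGE_PDF_BYTES` and `needs_splitting` are hypothetical, and reading "5 MB" as 5 * 1024 * 1024 bytes is an assumption; the plugin may define the threshold differently. The splitting itself is not shown.

```python
from pathlib import Path

# Assumption: "5 MB" is interpreted as 5 * 1024 * 1024 bytes.
LARGE_PDF_BYTES = 5 * 1024 * 1024


def needs_splitting(pdf_path: Path, threshold: int = LARGE_PDF_BYTES) -> bool:
    """Return True when the file size is at or above the threshold (>=)."""
    return pdf_path.stat().st_size >= threshold
```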

Conference Proceedings

  1. Split the proceedings: /research-system:split-conference-pdf
  2. Review extracted papers in the output directory
  3. Copy desired papers to your topic's Sources/ folder
  4. Run /research-system:generate-research-digest to summarize them

Note: Conference splitting requires the PDF to have embedded bookmarks/table of contents.

Directory Structure

research-directory/
├── research-today.md           # Your daily starting point
├── research-today-archive/     # Historical daily digests
├── daily-digests/              # Daily paper discovery results
│   ├── 2025-11-04.md
│   └── 2025-11-03.md
├── .research-data/             # Tracking files and logs
│   ├── .research-queue.json
│   ├── .seen_scholar_papers.json
│   ├── .processed_pdfs.json
│   ├── keywords.md
│   ├── fetch_papers.log
│   └── monitor_sources.log
└── [Topic Folders]/            # One per research topic
    ├── Sources/                # Put PDFs here
    └── Notes/                  # Auto-generated summaries
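
Given this layout, PDF monitoring amounts to comparing the contents of each Sources/ folder against a record of what has already been handled. The sketch below is one way to express that idea, not the plugin's implementation: `find_new_pdfs` is a hypothetical name, and the format of .processed_pdfs.json (a JSON list of paths relative to the research directory) is an assumption.

```python
import json
from pathlib import Path
from typing import List


def find_new_pdfs(research_dir: Path) -> List[Path]:
    """List PDFs under [Topic]/Sources/ that are not yet recorded as processed."""
    tracking = research_dir / ".research-data" / ".processed_pdfs.json"
    # Assumption: the tracking file holds a JSON list of relative POSIX paths.
    processed = set(json.loads(tracking.read_text())) if tracking.exists() else set()
    new_pdfs = []
    # Matches one level of topic folders; the ".pdf" suffix match is case-sensitive.
    for pdf in sorted(research_dir.glob("*/Sources/*.pdf")):
        if pdf.relative_to(research_dir).as_posix() not in processed:
            new_pdfs.append(pdf)
    return new_pdfs
```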

Configuration

Run /research-system:setup-research-automation to configure interactively, or manually create ~/.claude/research-system-config/config.yaml (see config/config.template.yaml).

Validate Configuration

python3 path/to/research-system/scripts/utilities/validate_config.py

This checks the config file format, required fields, paths, API keys, and the keywords file.

Requirements

  • Python 3.8+
  • Claude Code (for summarization)
  • SerpAPI key for Google Scholar (free tier: 250 searches/month)

API Usage

  • Google Scholar (via SerpAPI): Free tier allows 250 searches/month
    • Default: daily searches
    • Configurable: set search_frequency: "weekly" to search only on Sundays
    • Example: 10 topics x 3 keywords x 30 days = 900 searches/month, well above the 250 in the free tier. In weekly mode the same setup needs 30 searches per Sunday, or roughly 120-150 per month.
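
To check a planned setup against the free-tier quota, the arithmetic above can be sketched as follows. The function name `searches_per_month` is hypothetical, and the estimate assumes one search per keyword per run, which is how the figures above are derived; actual usage may differ.

```python
import calendar
from datetime import date


def searches_per_month(topics: int, keywords_per_topic: int,
                       year: int, month: int, frequency: str = "daily") -> int:
    """Estimate monthly searches, assuming one search per keyword per run."""
    days_in_month = calendar.monthrange(year, month)[1]
    if frequency == "weekly":
        # Weekly mode runs on Sundays only; date.weekday() == 6 means Sunday.
        runs = sum(1 for day in range(1, days_in_month + 1)
                   if date(year, month, day).weekday() == 6)
    else:
        runs = days_in_month
    return topics * keywords_per_topic * runs
```

For November 2025 (30 days, 5 Sundays), 10 topics with 3 keywords each come to 900 searches in daily mode and 150 in weekly mode.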

Tips

  • Start with 3-5 topics with 3-5 keywords each
  • Use weekly mode if you have many keywords to conserve API quota
  • Refine filter criteria iteratively using /research-system:update-research-filters
  • Check logs if papers stop appearing: run /research-system:check-logs
  • Validate config after changes: python3 scripts/utilities/validate_config.py

Troubleshooting

Cron jobs stopped working

Run /research-system:fix-scheduled-scripts to repair the symlink and update crontab.

No papers in digest

  • Run /research-system:check-logs to see errors
  • Run /research-system:fetch-papers to manually trigger
  • Run /research-system:fix-scheduled-scripts if cron paths are broken
  • Verify SerpAPI key: python3 scripts/utilities/validate_config.py

Summaries not generating

  • Run /research-system:monitor-sources to scan for new PDFs
  • Run /research-system:generate-research-digest to process queued PDFs
  • Verify PDFs exist in your topic's Sources/ folders

Too many irrelevant papers

  • Run /research-system:filter-research-digest on large digests
  • Run /research-system:update-research-filters to refine criteria

License

MIT

Credits

Originally based on ttorres33/research-system by Teresa Torres. Improved with skills migration, Google Scholar focus, config validation, and security enhancements.
