Search runbook assumes Boolean query syntax that Semantic Scholar doesn't support #1

@melek

Problem

The search runbook (runbooks/search.md) and the protocol template guide users toward writing Boolean query strings with AND/OR operators and quoted phrases for Semantic Scholar. However, the Semantic Scholar API explicitly does not support Boolean syntax — its search endpoint accepts plain-text queries only.

From the MCP tool description:

"A plain-text search query string. No special query syntax is supported."

This means queries like "response calibration" AND ("human-AI" OR "human-computer") are silently treated as a bag of words, producing unpredictable results rather than the intended Boolean filtering.

By contrast, arXiv does support Boolean operators and quoted phrases. The paper-search-mcp tool appears to be plain-text only.
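To make the mismatch concrete, here is a minimal sketch of what is left of the example query once the Boolean syntax is discarded. It is only an approximation of the "bag of words" effect: how Semantic Scholar actually tokenizes or ranks the string is not confirmed here, and `flatten_boolean_query` is a hypothetical helper, not part of the runbook or any API.

```python
import re

# Hypothetical helper for illustration; not part of the runbook or any API.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def flatten_boolean_query(query: str) -> list[str]:
    """Return the terms that remain after dropping operators, parentheses, and quotes."""
    # Replace grouping and phrase punctuation with spaces so terms stay separated.
    cleaned = re.sub(r'[()"]', " ", query)
    # Drop the uppercase Boolean operators; everything else is kept as a bare term.
    return [token for token in cleaned.split() if token not in BOOLEAN_OPERATORS]


terms = flatten_boolean_query('"response calibration" AND ("human-AI" OR "human-computer")')
print(terms)  # ['response', 'calibration', 'human-AI', 'human-computer']
```

The phrase grouping and the OR alternative are both lost: "response" and "calibration" become independent terms, and nothing tells the endpoint that either "human-AI" or "human-computer" is enough on its own.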

Impact

  • Users will write Boolean queries for Semantic Scholar during Phase 0 protocol definition, believing they work
  • Search results will be unpredictable — sometimes adequate (relevant words present), sometimes missing targeted papers
  • The protocol documents a search strategy that cannot be faithfully executed, undermining reproducibility (A7)

Proposed Fix

  1. Search runbook: Add a per-database query syntax reference table at the top. Flag Semantic Scholar as plain-text only. Recommend using fieldsOfStudy, year, and venue API parameters for filtering instead of query-level Boolean.
  2. Protocol template: Split the search terms table to have database-specific guidance. Semantic Scholar rows should note "plain-text, use API filters" vs arXiv rows noting "Boolean + quoted phrases supported."
  3. SKILL.md Phase 0 Field 2: Update the search term guidance to explain the syntax difference between databases during protocol definition.
  4. Search agent: Validate query syntax against database capabilities before executing, and log a warning if Boolean operators are detected in a query for a plain-text-only database.
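Fixes 1 and 4 could be sketched roughly as follows. This is an illustration under assumptions, not the search agent's actual code: the capability map mirrors the findings in this issue (the paper-search-mcp entry is unverified), the database labels and function names are hypothetical, and the filter value formats passed to `fieldsOfStudy`, `year`, and `venue` should be checked against the Semantic Scholar API documentation before use.

```python
import logging
import re

logger = logging.getLogger("search_agent")  # hypothetical logger name

# Assumed capability map based on this issue; the keys are hypothetical labels.
DATABASE_SYNTAX = {
    "semantic_scholar": "plain-text",
    "arxiv": "boolean",
    "paper-search-mcp": "plain-text",  # "appears to be" plain-text; unverified
}

# Matches uppercase AND/OR/NOT as whole words, quoted phrases, or parentheses.
BOOLEAN_PATTERN = re.compile(r'\b(?:AND|OR|NOT)\b|"[^"]+"|[()]')


def find_boolean_syntax(query: str) -> list[str]:
    """Return every Boolean-style fragment found in the query."""
    return BOOLEAN_PATTERN.findall(query)


def validate_query(database: str, query: str) -> bool:
    """Return False and log a warning if a plain-text-only database gets Boolean syntax.

    Databases missing from the map are not checked and are treated as compatible.
    """
    if DATABASE_SYNTAX.get(database) != "plain-text":
        return True
    found = find_boolean_syntax(query)
    if found:
        logger.warning(
            "Query for %s contains Boolean syntax %s, but that database is plain-text only",
            database,
            found,
        )
        return False
    return True


def semantic_scholar_params(terms, year=None, fields_of_study=None, venue=None):
    """Build request parameters that filter via API parameters instead of Boolean syntax."""
    params = {"query": " ".join(terms)}
    if year is not None:
        params["year"] = year
    if fields_of_study is not None:
        params["fieldsOfStudy"] = fields_of_study
    if venue is not None:
        params["venue"] = venue
    return params
```

For example, `validate_query("semantic_scholar", '"response calibration" AND "human-AI"')` returns `False` and logs a warning, while the same string passes for `"arxiv"`. Whether the agent should then block the query, rewrite it, or only warn is a design decision this sketch leaves open.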

Context

Discovered during panel review (2026-03-23). The first live review run hit this immediately.
