Fix broken literature_enhanced imports in two writer scripts by realmarcin · Pull Request #88 · CultureBotAI/CommunityMech

realmarcin · 2026-05-26T02:27:49Z

Summary

Fix a pre-existing import bug surfaced as a heads-up by PR #87.

scripts/add_evidence_source.py and scripts/intelligent_snippet_fixer.py both import EnhancedLiteratureFetcher from communitymech.literature_enhanced — a module that was never committed to git (only a stale .pyc in __pycache__/ was shadowing the missing source locally). Both scripts raise ModuleNotFoundError on import for anyone running them today.

Swap to LiteratureFetcher from communitymech.literature, which exists in the repo, is well-tested, and exposes the same fetch_pubmed_abstract + fetch_paper surface plus a richer DOI fallback chain (CrossRef → PubMed-via-DOI → PMC full-text → OpenAlex → Semantic Scholar → Europe PMC → publisher meta-tag scrape) that subsumes what the missing fetch_abstract_for_doi did.

API differences

Difference	Impact
`fetch_paper()` returns `(abstract, pdf_url)` not a dict	Tuple-unpack at call sites
`fetch_paper()` has no `download_pdf` kwarg	The flag was effectively a no-op in `LiteratureFetcher`'s pipeline (the PDF URL is just returned alongside the abstract)
Title field unavailable separately	`add_evidence_source.guess_evidence_source` `filter(None, …)`-merged title with snippet + abstract; losing it degrades classification marginally. PubMed abstracts include the title in the abstract text, so PMID references are unaffected. If DOI title is needed later, `fetch_doi_metadata()` returns CrossRef metadata with a title field.
Constructor: no `use_fallback_pdf` arg	Drop it; was a no-op

After-state

Both scripts now import + run their initialization paths cleanly (verified via importlib.util.spec_from_file_location + exec_module).
pytest tests/: 136 passed, 9 skipped (unchanged).
ruff check: pre-existing F541 warnings on lines this PR didn't touch are left alone.

Test plan

Both scripts pass ast.parse.
Both scripts now import without ModuleNotFoundError.
Existing tests pass.
(Manual, post-merge) Run one of the converted writers against real data with --apply to confirm end-to-end behavior.

🤖 Generated with Claude Code

scripts/add_evidence_source.py and scripts/intelligent_snippet_fixer.py both import EnhancedLiteratureFetcher from communitymech.literature_enhanced — a module that was never committed to git (only a stale .pyc was shadowing the missing source locally). Both scripts have raised ModuleNotFoundError on import for as long as anyone has tried to run them, which was surfaced as a pre-existing-state heads-up by the recent writer-conversion PR #87. Swap to LiteratureFetcher from communitymech.literature, which exposes the same fetch_pubmed_abstract + fetch_paper surface plus a richer DOI fallback chain (CrossRef → PubMed via DOI lookup → PMC full-text → OpenAlex → Semantic Scholar → Europe PMC → publisher meta-tag scrape) that subsumes what fetch_abstract_for_doi did. API differences: - fetch_paper returns (abstract, pdf_url) not a dict; tuple-unpack at call sites. - LiteratureFetcher.fetch_paper has no download_pdf kwarg (the older version's flag was a no-op in the LiteratureFetcher pipeline; the pdf URL is just returned alongside the abstract). - Title field is unavailable separately. In add_evidence_source.py's guess_evidence_source classifier the title was filter(None, …)-merged with snippet and abstract anyway; losing it degrades classification marginally (PubMed abstracts include the title in the abstract text, so PMID references are unaffected). If richer DOI classification is needed later, LiteratureFetcher.fetch_doi_metadata() returns CrossRef metadata with a title field. After-state: both scripts now import and run their initialization paths cleanly. pytest tests/ still passes (136 passed, 9 skipped). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR fixes two writer scripts that currently fail to import due to referencing a non-existent communitymech.literature_enhanced module, by switching them to the in-repo communitymech.literature.LiteratureFetcher API.

Changes:

Replace EnhancedLiteratureFetcher imports/usages with LiteratureFetcher in scripts/intelligent_snippet_fixer.py and scripts/add_evidence_source.py.
Update call sites to match LiteratureFetcher.fetch_paper() returning (abstract, pdf_url) (tuple-unpack and ignore pdf_url).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
scripts/intelligent_snippet_fixer.py	Switches to `LiteratureFetcher` and updates abstract-fetching logic to use `fetch_paper()` tuple return.
scripts/add_evidence_source.py	Switches to `LiteratureFetcher`, updates `fetch_paper()` usage, and adjusts exception handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot flagged that title was assigned None and then passed through guess_evidence_source as a parameter that the classifier merged into its keyword-matching text via filter(None, ...). With title always None the parameter was dead code that just clutters the call sites. Remove the title parameter from guess_evidence_source and from both caller blocks. PubMed abstracts already embed the title in the abstract text (so PMID-driven classification is unchanged), and CrossRef titles for DOI references are available via LiteratureFetcher.fetch_doi_metadata() if richer classification is wanted later — that's now a clear future-work hook rather than a hard-coded-None pretense. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add validate-strict CI workflow Locks in the 0-error closed-schema baseline established across PRs #84-#88 so it can't silently regress on future merges. Mirrors the qc.yaml workflow shipped to TraitMech in PR #77 and adapted for the CommunityMech runner conventions. Runs `just validate-strict` + `just audit-writers` + `pytest tests/` on PRs touching kb/communities/, schema, source, scripts, justfile, or this workflow. Uploads the categorized TSV reports as workflow artifacts so reviewers can inspect failures without re-running locally. Deliberately scoped narrower than `just qc` — the existing network-quality.yml handles the network-integrity audit; this new workflow specifically gates closed-schema validation + writer audit + unit tests, which are fast and run on every PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Address Copilot review on PR #89 Three findings, all addressed: - Path globs: \`**.py\` doesn't actually match nested subdirs in GitHub Actions glob semantics — it behaves like \`*.py\`. Switch to \`**/*.py\` so changes under src/communitymech/<subpkg>/ (like network/, validators/, embedding/) trigger the workflow. - Push trigger had no \`paths:\` filter, so the workflow ran on every commit to main. Mirror the pull_request path list via a YAML anchor (&trigger_paths + *trigger_paths) so the two stay in sync. - uv sync: switch to \`--frozen --all-extras\` so the workflow fails if uv.lock is stale (instead of silently re-resolving) while keeping the dev/test extras the existing network-quality.yml uses. Also added uv.lock + tests/**/*.py to the trigger paths so dependency and test changes re-run the workflow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 26, 2026 02:27

Copilot started reviewing on behalf of realmarcin May 26, 2026 02:27 View session

Copilot AI reviewed May 26, 2026

View reviewed changes

Comment thread scripts/add_evidence_source.py Outdated

Comment thread scripts/add_evidence_source.py Outdated

realmarcin merged commit 393b373 into main May 26, 2026

realmarcin deleted the fix/broken-literature-imports branch May 26, 2026 02:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix broken literature_enhanced imports in two writer scripts#88

Fix broken literature_enhanced imports in two writer scripts#88
realmarcin merged 2 commits into
mainfrom
fix/broken-literature-imports

realmarcin commented May 26, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

realmarcin commented May 26, 2026

Summary

API differences

After-state

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants