
add article fact checking skills for Claude and OpenClaw #371

Merged
e06084 merged 10 commits into MigoXLab:dev from seancoding-day:feature/add-article-skills
Mar 24, 2026

Conversation

@seancoding-day
Collaborator

No description provided.

seancoding-day and others added 10 commits March 24, 2026 10:50
SDK wrapper for ArticleFactChecker with format detection,
plaintext JSONL wrapping, and structured JSON output.
- Remove unused List import, use NoReturn for error_exit
- Initialize temp_path before try blocks in tests to prevent NameError
Defines skill frontmatter, prerequisites, usage flow,
and result presentation guidelines.
Covers model selection, claim types, tuning parameters,
environment variables, and troubleshooting.
- Default model: gpt-4o-mini → gpt-5.4-mini
- Model table: add gpt-5.4, gpt-5.4-nano, o3, o4-mini, deepseek-chat
- Remove CSV from format detection (unsupported for article wrapping)
- Add model selection guide and alternative providers section
Port dingo-verify to Agent Skills / OpenClaw format:
- Use {baseDir} instead of ${CLAUDE_SKILL_DIR}
- Remove Claude Code-only fields (argument-hint, allowed-tools)
- Add metadata.openclaw with requires.env, bins, primaryEnv, emoji
- Add license, compatibility fields per Agent Skills spec
- Reuse fact_check.py and advanced-config.md unchanged (portable)

Distributable via ClawHub as skills/dingo-verify/
Security (fact_check.py):
- Add validate_article_path(): block /proc/, /sys/, /dev/, symlinks, unsupported extensions
- Secure temp file: NamedTemporaryFile (O_CREAT|O_EXCL, mode 0o600, full-entropy name)
- Add 10 MB file size limit before full read to prevent OOM
- Bound --max-claims (1-200) and --max-concurrent (1-20) to prevent API cost attack
- Fix exception handler: do not leak str(e) which may contain SDK/config internals
- Separate ValueError handler for user-facing validation errors
- Tavily warning: plain text to stderr instead of JSON (reserve JSON for errors)

Code quality (fact_check.py):
- Module docstring: remove "Claude Code Skill" branding
- extract_detail_report: remove redundant len() check
- LangChain hint: use pip install "dingo-python[agent]" (portable, no repo path)

ClawSkill spec (skills/dingo-verify/SKILL.md):
- Python version: 3.9+ → 3.10+ (matches dingo python_requires)
- compatibility: repo-relative path → pip install "dingo-python[agent]"
- Add install array: [{kind: uv, package: dingo-python[agent]}] for macOS UI
- Description: clarify TAVILY_API_KEY is optional but recommended

Tests (21 passing):
- Add TestWrapPlaintextEmpty.test_file_too_large_raises_value_error
- Add TestValidateArticlePath (5 tests: valid md, valid jsonl, csv rejected,
  /proc/ rejected, symlink rejected)
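The validation and bounding described in the security bullets above could look roughly like the sketch below. The function name and the `frozenset[str]` annotation come from the commit messages, but the exact allowed extensions, blocked prefixes, and rule ordering here are assumptions, not the merged implementation:

```python
import pathlib

# Assumed constants -- the real skill may allow a different extension set.
ALLOWED_EXTENSIONS: frozenset[str] = frozenset({".md", ".txt", ".json", ".jsonl"})
BLOCKED_PREFIXES = ("/proc/", "/sys/", "/dev/")
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MB cap before full read, to prevent OOM


def validate_article_path(path: str) -> pathlib.Path:
    """Reject unsupported extensions, sensitive system paths, symlinks, and oversized files."""
    p = pathlib.Path(path)
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported extension: {p.suffix!r}")
    resolved = p.resolve()
    if any(str(resolved).startswith(prefix) for prefix in BLOCKED_PREFIXES):
        raise ValueError("Refusing to read from a blocked system path")
    if p.is_symlink():
        raise ValueError("Symlinks are not allowed")
    if p.exists() and p.stat().st_size > MAX_FILE_SIZE:
        raise ValueError("File exceeds 10 MB size limit")
    return p
```

Raising `ValueError` for each failure matches the commit's note about a separate `ValueError` handler for user-facing validation errors.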
fact_check.py:
- frozenset → frozenset[str] (precise element type)
- validate_article_path: extract p = pathlib.Path(path) (single construction)
- main(): remove 5 defensive 'if result else' expressions (execute() never returns None)

test_fact_check_script.py:
- Hoist all method-level 'from fact_check import' to module top (15 → 1 import block)
- Fix weak assertion: 'endswith(.md) or result' → 'os.path.isabs(result)'
- Remove orphaned blank lines from method bodies
Root cause (from screenshot): AI gave up on ArticleFactChecker because
SKILL.md had no input prep instructions (plaintext needs JSONL wrapping),
no full config params, and no runnable example.

Changes:
- clawhub/scripts/fact_check.py: bundled wrapper script with path
  validation, secure temp files, bounded args, structured JSON output
- clawhub/references/advanced-config.md: model selection, claim types,
  env vars, output artifacts, troubleshooting
- clawhub/SKILL.md: add "Fact-Checking Articles" section with
  - script quick-start (one-liner via {baseDir}/scripts/fact_check.py)
  - manual SDK snippet with JSONL wrapping pattern
  - if __name__ == "__main__" guard note (multiprocessing requirement)
  - output structure and result interpretation guide
- clawhub/_meta.json: add python3 to bins, dingo-python[agent] to
  packages, TAVILY/OPENAI env vars with descriptions

Architecture: clawhub/dingo-data-quality = comprehensive entry skill;
skills/dingo-verify = lightweight specialist for Claude Code / OpenClaw.
…verification

Enhance arxiv_search with optional fetch_affiliations config that scrapes the
ltx_authors section from arXiv HTML paper pages, providing authoritative
author+institution text (e.g. "1 Shanghai AI Laboratory  2 Abaka AI") that
the arXiv Atom API and feedparser do not expose.

Changes:
- arxiv_search.py: add ArxivConfig.fetch_affiliations field (default False),
  _fetch_html_affiliations() method, and per-result HTML enrichment loop in
  execute(); fix 429 error message (was generic "Search failed: HTTPError")
- agent_article_fact_checker.py: update TOOLS_DESCRIPTION, WORKFLOW_STEPS and
  PER_CLAIM_VERIFICATION_PROMPT so agents treat affiliations_text as the
  authoritative source for institutional/attribution claims
- Enable fetch_affiliations=True in all three entry points:
  .claude/skills/dingo-verify/scripts/fact_check.py,
  skills/dingo-verify/scripts/fact_check.py,
  examples/agent/agent_article_fact_checking_example.py

Result: institutional claims previously UNVERIFIABLE (e.g. "OmniDocBench
released by Tsinghua/Alibaba/Shanghai AI Lab") now correctly judged FALSE
from paper affiliation data without requiring Tavily web search.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands Dingo's capabilities by integrating a robust article fact-checking skill. It introduces an autonomous agent (ArticleFactChecker) that can extract and verify factual claims from various document formats using web and academic search. The changes streamline the user experience through a dedicated script and enhance the underlying verification logic, particularly for institutional claims, by improving data sourcing from arXiv. This empowers users to quickly assess the accuracy of information, fostering greater trust and reliability in data analysis workflows.

Highlights

  • New Fact-Checking Skill: Introduced a new 'dingo-verify' skill for both Claude and OpenClaw, enabling users to fact-check articles and verify factual claims using Dingo's ArticleFactChecker agent.
  • Dedicated Script for Ease of Use: A new Python script ('fact_check.py') was added for both Claude and OpenClaw skills, streamlining the execution of the ArticleFactChecker by handling input validation, format detection, configuration, and structured report generation.
  • Enhanced ArXiv Search for Attribution: The 'arxiv_search' tool within Dingo was improved with a 'fetch_affiliations' option, allowing the agent to scrape authoritative author and institution data directly from arXiv HTML pages, significantly boosting the accuracy of institutional and attribution claim verification.
  • Agent Logic Update: The ArticleFactChecker agent's prompt templates were updated to prioritize and effectively utilize the newly available 'affiliations_text' from 'arxiv_search' for more reliable claim verification.
  • Comprehensive Documentation: Updated the main Dingo skill documentation ('clawhub/SKILL.md') and added advanced configuration guides ('references/advanced-config.md' for both Claude and OpenClaw skills) to support the new fact-checking capabilities.
  • Metadata and Dependencies: The skill metadata ('clawhub/_meta.json') was updated to reflect necessary 'python3' binary and 'dingo-python[agent]' package requirements, along with the optional 'TAVILY_API_KEY' for web search.
  • Robust Testing: New unit tests were added for the 'fact_check.py' script to ensure the reliability of its core functionalities.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new fact-checking skill for Claude and OpenClaw, which is a great addition. The implementation includes a new Python script with good security practices, comprehensive documentation, and tests. However, there is a significant issue with file duplication across the .claude/skills/dingo-verify/, skills/dingo-verify/, and clawhub/ directories. This will create a maintenance burden, as changes will need to be synchronized across all copies. It's highly recommended to refactor this to avoid duplication, perhaps by using symlinks or a build process that copies files from a single source of truth. I've also included a few suggestions for improving error handling, performance, and the robustness of a code example in the documentation.

Comment on lines +423 to +426

```python
    except Exception:
        # Do not echo exception message to avoid leaking SDK internals or config values
        error_exit("Execution failed. Check Dingo SDK logs in the output directory.")
    return 1  # unreachable
```

medium

This broad except Exception: can hide bugs and make debugging difficult. While the intention to avoid leaking information is good, it would be better to catch more specific exceptions that you expect from the Dingo SDK, and then have this broad except as a final fallback. This would provide better error handling without sacrificing security.

For example:

```python
    except DingoExecutionError as e:  # Assuming a specific exception from the SDK
        error_exit(f"Dingo execution failed: {e}", "Check your Dingo configuration and input file.")
        return 1
    except Exception:
        # Do not echo exception message to avoid leaking SDK internals or config values
        error_exit("An unexpected error occurred. Check Dingo SDK logs in the output directory.")
        return 1  # unreachable
```

Comment on lines +450 to +495

```python
# IMPORTANT: wrap article into JSONL — plaintext is read line-by-line otherwise
article_text = open("article.md", encoding="utf-8").read()
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False, encoding="utf-8")
tmp.write(json.dumps({"content": article_text}, ensure_ascii=False) + "\n")
tmp.close()

config = {
    "input_path": tmp.name,
    "dataset": {"source": "local", "format": "jsonl"},
    "executor": {"max_workers": 1},
    "evaluator": [{
        "fields": {"content": "content"},
        "evals": [{
            "name": "ArticleFactChecker",
            "config": {
                "key": os.environ["OPENAI_API_KEY"],
                "model": os.getenv("OPENAI_MODEL", "gpt-5.4-mini"),
                "api_url": os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
                "parameters": {
                    "temperature": 0,
                    "agent_config": {
                        "max_concurrent_claims": 5,
                        "max_iterations": 50,
                        "tools": {
                            "claims_extractor": {
                                "api_key": os.environ["OPENAI_API_KEY"],
                                "model": os.getenv("OPENAI_MODEL", "gpt-5.4-mini"),
                                "base_url": os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
                                "max_claims": 50
                            },
                            "arxiv_search": {"max_results": 5},
                            **({"tavily_search": {"api_key": os.environ["TAVILY_API_KEY"]}}
                               if os.getenv("TAVILY_API_KEY") else {})
                        }
                    }
                }
            }
        }]
    }]
}

if __name__ == "__main__":
    result = Executor.exec_map["local"](InputArgs(**config)).execute()
    print(f"Score: {result.score:.1f}% | Output: {result.output_path}")
    os.unlink(tmp.name)
```

medium

The code example for manual SDK usage has a potential resource leak. The temporary file created with tempfile.NamedTemporaryFile(delete=False) is not guaranteed to be cleaned up if an exception occurs before os.unlink(tmp.name) is called. Also, the file is created at the module level, so if this script is imported, it will create a temp file that is never deleted.

It's better to create and clean up the temporary file within a try...finally block inside the if __name__ == "__main__" guard to ensure it's always deleted.

```python
if __name__ == "__main__":
    tmp_path = None
    try:
        # IMPORTANT: wrap article into JSONL — plaintext is read line-by-line otherwise
        article_text = open("article.md", encoding="utf-8").read()
        with tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp_file:
            tmp_file.write(json.dumps({"content": article_text}, ensure_ascii=False) + "\n")
            tmp_path = tmp_file.name

        config = {
            "input_path": tmp_path,
            "dataset": {"source": "local", "format": "jsonl"},
            "executor": {"max_workers": 1},
            "evaluator": [{
                "fields": {"content": "content"},
                "evals": [{
                    "name": "ArticleFactChecker",
                    "config": {
                        "key": os.environ["OPENAI_API_KEY"],
                        "model": os.getenv("OPENAI_MODEL", "gpt-5.4-mini"),
                        "api_url": os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
                        "parameters": {
                            "temperature": 0,
                            "agent_config": {
                                "max_concurrent_claims": 5,
                                "max_iterations": 50,
                                "tools": {
                                    "claims_extractor": {
                                        "api_key": os.environ["OPENAI_API_KEY"],
                                        "model": os.getenv("OPENAI_MODEL", "gpt-5.4-mini"),
                                        "base_url": os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
                                        "max_claims": 50
                                    },
                                    "arxiv_search": {"max_results": 5},
                                    **({"tavily_search": {"api_key": os.environ["TAVILY_API_KEY"]}}
                                       if os.getenv("TAVILY_API_KEY") else {})
                                }
                            }
                        }
                    }
                }]
            }]
        }

        result = Executor.exec_map["local"](InputArgs(**config)).execute()
        print(f"Score: {result.score:.1f}%  |  Output: {result.output_path}")
    finally:
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)
```

Comment on lines +245 to +252

```python
        if cls.config.fetch_affiliations and results:
            for i, entry_id in enumerate(entry_ids):
                html_url = entry_id.replace('/abs/', '/html/')
                affiliations_text = cls._fetch_html_affiliations(
                    html_url, timeout=cls.config.timeout
                )
                if affiliations_text is not None:
                    results[i]['affiliations_text'] = affiliations_text
```

medium

Fetching affiliations involves making a network request for each paper. This is done sequentially in a loop, which can be slow if there are many results. To improve performance, consider fetching these pages in parallel using a ThreadPoolExecutor.
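A minimal sketch of that suggestion is below. The names `enrich_results_parallel` and `fetch_affiliations_fn` are stand-ins for illustration (the real code calls `cls._fetch_html_affiliations` inside `execute()`), and the worker count is an arbitrary assumption:

```python
from concurrent.futures import ThreadPoolExecutor


def enrich_results_parallel(results, entry_ids, fetch_affiliations_fn,
                            timeout=10, max_workers=5):
    """Fetch all affiliation pages concurrently instead of one at a time."""
    html_urls = [eid.replace("/abs/", "/html/") for eid in entry_ids]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so texts[i] matches results[i]
        texts = list(pool.map(lambda u: fetch_affiliations_fn(u, timeout=timeout),
                              html_urls))
    for result, text in zip(results, texts):
        if text is not None:
            result["affiliations_text"] = text
    return results
```

Threads are a reasonable fit here because the work is network-bound, not CPU-bound.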

Comment on lines +525 to +527

```python
        except Exception as exc:
            log.debug("Failed to fetch HTML affiliations from %s: %s", html_url, exc)
            return None
```

medium

Catching a broad Exception can hide underlying issues and make debugging harder. It's better to catch more specific exceptions from the requests library, such as requests.exceptions.RequestException, to handle network-related errors more gracefully and avoid catching unrelated exceptions.

Suggested change

```diff
-        except Exception as exc:
-            log.debug("Failed to fetch HTML affiliations from %s: %s", html_url, exc)
-            return None
+        except _requests.exceptions.RequestException as exc:
+            log.debug("Failed to fetch HTML affiliations from %s: %s", html_url, exc)
+            return None
```

@e06084 e06084 merged commit 08a5bfc into MigoXLab:dev Mar 24, 2026
2 checks passed