Safer and more efficient GitHub content fetching #92

@httpdss

Summary
Fetching a single file from GitHub currently clones or pulls the entire repository branch, which is slow and brittle. This issue proposes making single-file fetches safer, faster, and more resilient.

Proposed changes

  • Prefer lightweight fetch for single files:
    • Use raw.githubusercontent.com or the GitHub Contents API for HTTPS where possible.
    • Support pinning by tag or commit SHA in addition to branch.
  • Add network robustness:
    • Timeouts, retries with exponential backoff.
    • Configurable retry/timeout via CLI/env.
  • Security controls:
    • Optional allowlist/denylist of protocols.
    • --deny-network flag to disable all remote fetching.
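
A minimal sketch of the first and third groups above, assuming a plain HTTPS fetch against raw.githubusercontent.com. The function names `build_raw_url` and `is_fetch_allowed`, their parameters, and the default allowlist are hypothetical and do not come from the existing `content_fetcher.py`. The `ref` segment of a raw URL can be a branch name, a tag, or a commit SHA, which is what makes pinning possible without a clone.

```python
from urllib.parse import quote, urlsplit

RAW_HOST = "https://raw.githubusercontent.com"


def build_raw_url(owner: str, repo: str, path: str, ref: str = "main") -> str:
    """Build a raw-content URL for a single file (hypothetical helper).

    `ref` may be a branch, a tag, or a commit SHA; pinning to a tag or
    SHA gives a reproducible fetch.
    """
    safe_path = quote(path.lstrip("/"))
    return f"{RAW_HOST}/{quote(owner)}/{quote(repo)}/{quote(ref)}/{safe_path}"


def is_fetch_allowed(
    url: str,
    allowed_protocols: tuple = ("https",),
    deny_network: bool = False,
) -> bool:
    """Return True if the URL may be fetched (hypothetical policy check).

    `deny_network` mirrors the proposed --deny-network flag and blocks
    every remote fetch; otherwise the URL scheme must be in the allowlist.
    """
    if deny_network:
        return False
    return urlsplit(url).scheme.lower() in allowed_protocols


# Usage: pin a file to a tag, then check it against the policy.
url = build_raw_url("example-owner", "example-repo", "docs/file-handling.md", ref="v1.0.0")
print(url, is_fetch_allowed(url))
```

A denylist could be layered on the same check by rejecting listed schemes before consulting the allowlist. The Contents API is the alternative when authentication or metadata is needed, at the cost of API rate limits.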

Files

  • struct_module/content_fetcher.py
  • docs/file-handling.md (document protocol behavior and options)

Acceptance criteria

  • Single-file fetch no longer requires a full repo clone by default.
  • Pinning by commit or tag is supported and documented.
  • Timeouts and retries are configurable and covered by tests.
  • Backwards compatibility maintained for existing configs.
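
One way the timeout and retry criterion could be met and tested is sketched below, using only the standard library. The function name `fetch_with_retries` and the environment variable names `STRUCT_FETCH_TIMEOUT` and `STRUCT_FETCH_RETRIES` are assumptions for illustration, not existing options. The `opener` and `sleep` parameters are injectable so tests can exercise the backoff schedule without touching the network.

```python
import os
import time
import urllib.request


def fetch_with_retries(
    url: str,
    timeout: float = None,
    retries: int = None,
    backoff: float = 0.5,
    opener=urllib.request.urlopen,
    sleep=time.sleep,
) -> bytes:
    """Fetch a URL with a timeout and exponential backoff (sketch).

    Explicit arguments win; otherwise the values come from the
    hypothetical STRUCT_FETCH_TIMEOUT / STRUCT_FETCH_RETRIES env vars.
    """
    if timeout is None:
        timeout = float(os.environ.get("STRUCT_FETCH_TIMEOUT", "10"))
    if retries is None:
        retries = int(os.environ.get("STRUCT_FETCH_RETRIES", "3"))
    retries = max(0, retries)

    for attempt in range(retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError:
            # URLError, HTTPError, and socket timeouts all derive from OSError.
            # A fuller version would likely not retry 4xx responses such as 404.
            if attempt == retries:
                raise
            # Delay doubles each attempt: backoff, 2*backoff, 4*backoff, ...
            sleep(backoff * (2 ** attempt))
```

Because the defaults are read only when an argument is omitted, existing callers that pass nothing keep working, which fits the backwards-compatibility criterion.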
