chore: open-source governance, slim Docker default, real-PDF example #1

Merged
magic-alt merged 1 commit into main from feature/oss-setup-and-real-pdf-example on May 21, 2026

@magic-alt

Owner

Summary

Three things in one branch:

  1. End-to-end verification of the existing code — ruff lint clean, pytest 317 passed, 2 skipped, Docker image builds (~50 s with base deps) and curl /health returns {"status":"ok","version":"0.1.0"} from the running container.
  2. Standard open-source project governance so all future changes land via PR (no direct pushes to main).
  3. A runnable example that downloads a real listed-company financial report PDF (Apple Inc., NASDAQ:AAPL) and runs the full jetbot pipeline against it with the mock LLM — fully reproducible, no API key required.

Scope of changes

  • Licensing & governance: LICENSE (MIT), CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md, .github/CODEOWNERS, .github/pull_request_template.md, .github/ISSUE_TEMPLATE/*, docs/BRANCH_PROTECTION.md (canonical rules + gh CLI recipe), scripts/git-hooks/pre-commit (blocks direct commits to main, runs local CI).
  • CI workflow (.github/workflows/ci.yml): PR-only on main, concurrency cancellation, pip cache, GHA-cached Docker buildx build.
  • Docker (Dockerfile, docker-compose.yml): default image now installs base deps only (~50 s build instead of >5 min downloading paddlepaddle/opencv); extras opt-in via --build-arg EXTRAS=.... Compose passes EXTRAS=celery,postgres,s3 to api/worker and makes .env optional. Dropped obsolete version: key.
  • Real-PDF example (examples/real_pdf_analysis/): run_example.py downloads Apple's FY24-Q4 Consolidated Financial Statements PDF (~4.9 MB, 4 pages), runs the LangGraph pipeline through LocalStore, prints the head of the generated trader_report.md. README documents how to point it at other filings or switch to a real LLM.
  • Cleanup: ignore .mypy_cache/, *.egg-info/, example outputs/fixtures; untrack src/financial_report_agent.egg-info/.

How was this tested?

# 1. Tests + lint
python -m ruff check src tests              # All checks passed!
python -m pytest --timeout=60               # 317 passed, 2 skipped in 6.18s

# 2. Docker
docker build -t jetbot:test .               # OK in ~50 s
docker run -d -p 18000:8000 --name s jetbot:test
curl http://localhost:18000/health          # {"status":"ok","version":"0.1.0"}

# 3. Real-PDF example
python examples/real_pdf_analysis/run_example.py
# [download] OK (4919216 bytes)
# [run] doc_id=...
# OK extracted/statements.json, risk_signals.json, trader_report.md

Follow-up (maintainer action required)

Branch protection rules on main need to be enabled via GitHub UI or gh CLI — see docs/BRANCH_PROTECTION.md. The repo files in this PR cannot enforce that by themselves.

Checklist

  • Branch is created from latest main
  • Lint + tests pass locally
  • New code (example) has been smoke-tested end to end
  • Docs updated (README links to example + contributing flow)
  • No secrets or API keys committed

OSS governance:
- Add LICENSE (MIT), CONTRIBUTING, CODE_OF_CONDUCT, SECURITY
- Add PR template, issue templates, CODEOWNERS
- Add docs/BRANCH_PROTECTION.md with gh-cli recipe
- Add scripts/git-hooks/pre-commit (blocks direct commits to main, runs local CI)
- Update CI workflow: PR-only on main, add ruff format check, GHA docker buildx cache
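
The branch guard in the pre-commit hook can be pictured roughly as follows. This is a Python sketch for illustration only, not the contents of scripts/git-hooks/pre-commit (which are not shown in this conversation); the function names and the way the branch is looked up are assumptions.

```python
import subprocess

# Branches the hook refuses to commit on (the file summary mentions main/master).
PROTECTED_BRANCHES = {"main", "master"}


def current_branch() -> str:
    """Return the checked-out branch name, or "" on a detached HEAD."""
    result = subprocess.run(
        ["git", "symbolic-ref", "--quiet", "--short", "HEAD"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def is_blocked(branch: str) -> bool:
    """True when a commit on this branch should be rejected."""
    return branch in PROTECTED_BRANCHES


def hook_exit_code(branch: str) -> int:
    """Exit status a hook would return: non-zero aborts the commit."""
    if is_blocked(branch):
        print(f"Direct commits to '{branch}' are blocked; open a PR instead.")
        return 1
    return 0
```

A hook script would call `hook_exit_code(current_branch())` and exit with the result. A client-side hook like this is only advisory, which is why the server-side branch protection remains a separate follow-up.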

Docker:
- Dockerfile: install base deps only by default, opt-in via EXTRAS build arg
- docker-compose: pass EXTRAS=celery,postgres,s3 to api/worker, make .env optional, drop obsolete version
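
One way to picture the opt-in behaviour of the EXTRAS build arg is as a mapping from its value to a pip install target. The sketch below assumes the Dockerfile installs the local project with pip and treats EXTRAS as a comma-separated list of extras; the Dockerfile itself is not shown here, so `pip_install_target` is a hypothetical helper, not the actual build logic.

```python
def pip_install_target(extras: str) -> str:
    """Map an EXTRAS value to a pip requirement for the local project.

    An empty value means base dependencies only; otherwise the names are
    passed through as pip extras, e.g. ".[celery,postgres,s3]".
    """
    names = [name.strip() for name in extras.split(",") if name.strip()]
    return f".[{','.join(names)}]" if names else "."


print(pip_install_target(""))                    # default image
print(pip_install_target("celery,postgres,s3"))  # value passed by docker-compose
```

Under that assumption, the default build resolves to a plain `.` install and skips the heavy optional packages entirely.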

Real-PDF example:
- examples/real_pdf_analysis/ downloads Apple Inc. FY24-Q4 consolidated financial statements PDF and runs full pipeline with mock LLM
- README documents how to point at other filings and switch to real LLMs
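
The download step of such an example might look like the sketch below. It is a minimal illustration using only the standard library, not the code in run_example.py; `download_pdf`, `looks_like_pdf`, and the caching behaviour are assumptions, and the URL is left as a parameter because the actual filing URL is not given in this conversation.

```python
import urllib.request
from pathlib import Path

PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: PDF files start with the %PDF- signature."""
    return data.startswith(PDF_MAGIC)


def download_pdf(url: str, dest: Path, timeout: float = 60.0) -> Path:
    """Fetch url into dest unless a valid copy is already cached there."""
    if dest.exists() and looks_like_pdf(dest.read_bytes()):
        return dest
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    if not looks_like_pdf(data):
        # Servers sometimes answer with an HTML error page instead of the file.
        raise ValueError(f"{url} did not return a PDF")
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    print(f"[download] OK ({len(data)} bytes)")
    return dest
```

Checking the signature before writing keeps a failed download from being fed into the pipeline as if it were a report.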

Cleanup:
- .gitignore: ignore .mypy_cache, *.egg-info, example outputs/fixtures
- Untrack src/financial_report_agent.egg-info

Copilot AI review requested due to automatic review settings on May 21, 2026 02:15

Copilot AI left a comment

Pull request overview

This PR packages several repo-wide “project hygiene” improvements: adding standard open-source governance docs and templates, slimming the default Docker image by making extras opt-in, and adding a runnable end-to-end example that exercises the pipeline on a real public-company PDF using the mock LLM.

Changes:

  • Added governance/security/community documentation and GitHub templates (PR template, issue templates, CODEOWNERS) plus a local pre-commit hook and branch protection guide.
  • Updated CI to focus on main, add concurrency cancellation, enable pip caching, and use Buildx with GHA cache for Docker builds.
  • Made Docker default to base dependencies (extras opt-in via EXTRAS build arg) and added a real-PDF end-to-end example with outputs ignored by git.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/financial_report_agent.egg-info/top_level.txt | Removes tracked build artifact metadata from the repo. |
| src/financial_report_agent.egg-info/SOURCES.txt | Removes tracked build artifact metadata from the repo. |
| src/financial_report_agent.egg-info/requires.txt | Removes tracked build artifact metadata from the repo. |
| src/financial_report_agent.egg-info/PKG-INFO | Removes tracked build artifact metadata from the repo. |
| src/financial_report_agent.egg-info/dependency_links.txt | Removes tracked build artifact metadata from the repo. |
| SECURITY.md | Adds a security policy and reporting guidance. |
| scripts/git-hooks/pre-commit | Adds a local pre-commit hook to block commits to main/master and run local CI. |
| README.md | Links to the new real-PDF example and contributing/licensing info. |
| LICENSE | Adds MIT license text. |
| examples/real_pdf_analysis/run_example.py | Adds a runnable script that downloads a public PDF and runs the pipeline. |
| examples/real_pdf_analysis/README.md | Documents how to run the real-PDF example and interpret outputs. |
| examples/real_pdf_analysis/.gitignore | Ignores generated example outputs and downloaded PDFs. |
| docs/BRANCH_PROTECTION.md | Documents canonical branch protection settings and a gh CLI recipe. |
| Dockerfile | Makes extras opt-in via EXTRAS build arg to slim default images. |
| docker-compose.yml | Updates compose to pass EXTRAS, make .env optional, and drop obsolete version:. |
| CONTRIBUTING.md | Adds contribution workflow, local CI expectations, and project conventions. |
| CODE_OF_CONDUCT.md | Adds a short Code of Conduct referencing Contributor Covenant. |
| .gitignore | Ignores mypy cache, egg-info, and example runtime artifacts. |
| .github/workflows/ci.yml | Updates CI triggers, adds concurrency + pip cache, and modernizes Docker build caching. |
| .github/pull_request_template.md | Adds a PR template aligned with the new workflow. |
| .github/ISSUE_TEMPLATE/feature_request.yml | Adds a feature request issue form. |
| .github/ISSUE_TEMPLATE/config.yml | Configures issue creation and links security reporting. |
| .github/ISSUE_TEMPLATE/bug_report.yml | Adds a bug report issue form. |
| .github/CODEOWNERS | Adds default code owner configuration. |

Comments suppressed due to low confidence (1)

docker-compose.yml:46

  • worker sets S3_ENDPOINT=http://minio:9000 but does not depend on the minio service. This can cause the worker to start before MinIO is available and fail early (depending on connection retry behavior). Add minio to worker.depends_on or otherwise ensure the worker waits/retries appropriately.
      - CELERY_BROKER_URL=redis://redis:6379/0
      - STORAGE_BACKEND=postgres
      - DATABASE_URL=postgresql://jetbot:jetbot@postgres:5432/jetbot
      - S3_ENDPOINT=http://minio:9000
    volumes:
      - app-data:/app/data
    depends_on:
      - redis
      - postgres
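
The gap described in this comment can be detected mechanically. The sketch below is an illustrative check, not part of the PR: it takes an already-parsed compose `services` mapping (list-style `environment` entries are assumed) and reports services whose environment URLs point at another service that is missing from `depends_on`. The `missing_dependencies` helper and the trimmed-down `services` dict are hypothetical.

```python
from urllib.parse import urlparse


def missing_dependencies(services: dict) -> dict:
    """Find services whose env URLs name another service not in depends_on."""
    missing = {}
    for name, spec in services.items():
        declared = set(spec.get("depends_on", []))
        referenced = set()
        for entry in spec.get("environment", []):
            _, _, value = entry.partition("=")
            host = urlparse(value).hostname  # None for non-URL values
            if host in services and host != name:
                referenced.add(host)
        gaps = sorted(referenced - declared)
        if gaps:
            missing[name] = gaps
    return missing


# Trimmed-down stand-in for the worker service quoted above.
services = {
    "worker": {
        "environment": [
            "CELERY_BROKER_URL=redis://redis:6379/0",
            "STORAGE_BACKEND=postgres",
            "DATABASE_URL=postgresql://jetbot:jetbot@postgres:5432/jetbot",
            "S3_ENDPOINT=http://minio:9000",
        ],
        "depends_on": ["redis", "postgres"],
    },
    "redis": {},
    "postgres": {},
    "minio": {},
}

print(missing_dependencies(services))  # {'worker': ['minio']}
```

Note that `depends_on` in its short form only orders container start-up; it does not wait for the dependency to be ready, so retry logic in the worker is still worth having.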


    language: str,
) -> str:
    # Force the mock LLM so the example is fully offline / deterministic.
    os.environ.setdefault("LLM_DEFAULT_MODEL", "mock:mock")
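
For context on the quoted lines: `os.environ.setdefault` only writes the value when the variable is not already set, so the mock model is "forced" only in an environment where LLM_DEFAULT_MODEL is absent. The short sketch below shows the difference from plain assignment; the pre-existing value is a made-up placeholder.

```python
import os

KEY = "LLM_DEFAULT_MODEL"

# Case 1: the variable is already set, e.g. exported in the shell or a .env file.
os.environ[KEY] = "openai:some-model"  # hypothetical pre-existing value
os.environ.setdefault(KEY, "mock:mock")
after_setdefault = os.environ[KEY]  # the pre-existing value is kept

# Case 2: plain assignment always overrides whatever was there.
os.environ[KEY] = "mock:mock"
after_assignment = os.environ[KEY]

print(after_setdefault)
print(after_assignment)
```

Which behaviour is preferable depends on whether the example should respect a user's configured model or always run offline.
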
Comment thread SECURITY.md
If you discover a security vulnerability, please **do not** open a public
GitHub issue. Instead, report it privately via GitHub's
[Private vulnerability reporting](https://docs.github.com/en/code-security/security-advisories/guidance-on-reporting-and-writing-information-about-vulnerabilities/privately-reporting-a-security-vulnerability)
feature on this repository, or email the maintainers.
Comment thread CODE_OF_CONDUCT.md
- Be respectful and welcoming.
- No harassment, discrimination, or personal attacks.
- Assume good intent; disagree on technical content, not on people.
- Report unacceptable behavior to the maintainers via the address listed in
Comment thread docs/BRANCH_PROTECTION.md
Comment on lines +38 to +56
```bash
gh api -X PUT \
repos/magic-alt/jetbot/branches/main/protection \
-F required_status_checks.strict=true \
-F 'required_status_checks.contexts[]=lint-and-test' \
-F enforce_admins=true \
-F required_pull_request_reviews.required_approving_review_count=1 \
-F required_pull_request_reviews.dismiss_stale_reviews=true \
-F required_pull_request_reviews.require_code_owner_reviews=true \
-F required_linear_history=true \
-F allow_force_pushes=false \
-F allow_deletions=false \
-F required_conversation_resolution=true \
-F restrictions= # empty restrictions block all direct pushes
```

> Note: The `restrictions` block must be sent as an empty object/array to
> mean "nobody may push directly". Adjust per the GitHub REST API docs if
> running this through a tool that requires a different JSON shape.
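
An alternative to spelling out each field on the command line is to build the request body as JSON and pass it to `gh api` via `--input`. The sketch below assumes the field names of GitHub's "Update branch protection" REST endpoint and mirrors the values in the recipe above; `branch_protection_payload` is a hypothetical helper. It sends `restrictions` as JSON null, which the REST documentation describes as disabling push restrictions. That may differ from what the recipe's trailing comment intends, so maintainers should check it against the API docs.

```python
import json


def branch_protection_payload(check_name: str = "lint-and-test") -> dict:
    """Build a request body mirroring the settings in the recipe above."""
    return {
        "required_status_checks": {"strict": True, "contexts": [check_name]},
        "enforce_admins": True,
        "required_pull_request_reviews": {
            "required_approving_review_count": 1,
            "dismiss_stale_reviews": True,
            "require_code_owner_reviews": True,
        },
        # Assumption: None serializes to JSON null (no per-user push restrictions).
        "restrictions": None,
        "required_linear_history": True,
        "allow_force_pushes": False,
        "allow_deletions": False,
        "required_conversation_resolution": True,
    }


print(json.dumps(branch_protection_payload(), indent=2))
```

The printed JSON could then be piped into `gh api -X PUT repos/magic-alt/jetbot/branches/main/protection --input -`, which keeps booleans, integers, and null typed exactly as written.
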
Comment thread .github/workflows/ci.yml
Comment on lines 3 to +7
 on:
   push:
-    branches: [master, main]
+    branches: [main]
   pull_request:
-    branches: [master, main]
+    branches: [main]

@magic-alt merged commit ddfb33e into main on May 21, 2026
6 checks passed
@magic-alt deleted the feature/oss-setup-and-real-pdf-example branch on May 21, 2026 04:32
magic-alt added a commit that referenced this pull request Jun 6, 2026
…xample

chore: open-source governance, slim Docker default, real-PDF example