[codex] Clarify optional OCR and Spark surfaces#142

Open
sidmohan0 wants to merge 3 commits into dev from codex/dfpy-75-optional-surfaces

@sidmohan0
Contributor

Summary

  • Add a dedicated optional-surfaces docs page for OCR and Spark install profiles, limitations, and deferred-overhaul status.
  • Keep README, docs index, SDK docs, CLI docs, and roadmap centered on lightweight text PII screening while preserving OCR/Spark references.
  • Document exact extras for local OCR, URL-based OCR, Donut OCR, SparkService, and Spark PII UDF helpers.
  • Add a runtime dependency safety test proving public OCR/Spark service references do not require optional OCR/Spark imports on the core path.
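The last bullet describes a test that checks whether optional imports stay off the core path. The actual test in `tests/test_runtime_dependency_safety.py` is not shown here, so the following is only a minimal sketch of one way such a check could work. It assumes a lazily resolved public name (module-level `__getattr__`, PEP 562) and a fresh interpreter per probe. The names `demo_core`, `demo_optional_dep`, and `HeavyService` are hypothetical stand-ins, not DataFog identifiers.

```python
import os
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in names; none of these come from the DataFog codebase.
CORE_PKG = "demo_core"
OPTIONAL_DEP = "demo_optional_dep"


def build_demo_tree(root: Path) -> None:
    """Write a tiny package whose public service name resolves lazily (PEP 562)."""
    (root / f"{OPTIONAL_DEP}.py").write_text("VALUE = 'heavy'\n")
    pkg = root / CORE_PKG
    pkg.mkdir()
    (pkg / "__init__.py").write_text(textwrap.dedent(
        f"""
        def __getattr__(name):
            # Import the optional dependency only when the service is requested.
            if name == "HeavyService":
                import {OPTIONAL_DEP}
                return {OPTIONAL_DEP}.VALUE
            raise AttributeError(name)
        """
    ))


def optional_dep_loaded(root: Path, statement: str) -> bool:
    """Run `statement` in a fresh interpreter and report whether the optional dep was imported."""
    probe = f"import sys; {statement}; print({OPTIONAL_DEP!r} in sys.modules)"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env={**os.environ, "PYTHONPATH": str(root)},
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


def test_core_import_is_lazy() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_tree(root)
        # Importing the core package alone must not pull in the optional module.
        assert not optional_dep_loaded(root, f"import {CORE_PKG}")
        # Touching the optional service is what triggers the import.
        assert optional_dep_loaded(root, f"import {CORE_PKG}; {CORE_PKG}.HeavyService")
```

Running each probe in a subprocess keeps `sys.modules` clean, so modules imported earlier in the same pytest session cannot mask a regression.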

Stacking note

This branch is stacked on PR #141 (codex/dfpy-74-core-import-hardening), so the current GitHub diff also includes PR #139. It targets dev so that CI runs; once the earlier PRs merge, the diff should collapse to the incremental DFPY-75 docs and test changes.

Validation

  • DATAFOG_NO_TELEMETRY=1 DO_NOT_TRACK=1 .venv312/bin/python -m pytest tests/test_runtime_dependency_safety.py tests/test_no_network_core.py -q
  • .venv312/bin/python -m pre_commit run --files .gitignore README.md docs/index.rst docs/python-sdk.rst docs/cli.rst docs/optional-surfaces.rst docs/roadmap.rst tests/test_runtime_dependency_safety.py --show-diff-on-failure
  • git diff --check
  • DATAFOG_NO_TELEMETRY=1 DO_NOT_TRACK=1 .venv312/bin/python -m sphinx -b html docs docs/_build/html
  • DATAFOG_NO_TELEMETRY=1 DO_NOT_TRACK=1 .venv312/bin/python -m pytest -m "not slow" -q

Refs DFPY-75
