Prod: enhance documentation and CI for training workflows and templates #34

Merged: saadqbal merged 15 commits into main from develop on May 11, 2026

@saadqbal saadqbal commented May 11, 2026

Summary

Related

Type of change

  • Feature
  • Bug fix
  • Tech-debt / refactor
  • Docs
  • Security / hardening
  • Breaking change

Test plan

Screenshots / recordings

Deployment notes

Checklist

  • Tests added / updated and passing locally
  • Docs updated if behavior or config changed
  • No secrets / credentials in the diff
  • For security-sensitive paths: appropriate reviewer requested

Note

Low Risk
Low-risk, documentation-only changes; the main risk is broken links or navigation, or Mintlify build failures caused by the large new MDX page and the updated ignore rules.

Overview
Adds a new page to the Join a Use Case docs, "How training works", detailing the end-to-end training/inference lifecycle and the per-use-case preprocessing and metrics so runs can be reproduced locally.

Updates docs to reflect newly available ingestion templates for keypoint detection and semantic segmentation (replacing “template coming soon” language) and adjusts onboarding copy to explicitly include a dataset-ingestion step before creating a use case.

Hardens the Mintlify dev/build experience by explicitly ignoring .github/** in .mintignore, wires the new page into docs.json navigation, and fixes a stray character at the start of evals.json.

Reviewed by Cursor Bugbot for commit d0d10b9. Bugbot is set up for automated code reviews on this repo.

saadqbal and others added 14 commits May 6, 2026 18:28
chore: sync templates + faqs from main → develop (re-authored as @saadqbal)
ci: add FR-pass comment + FR gate callers
Documents the training and inference pipeline for all nine supported
use cases so a user evaluating tracebloc can reproduce a run locally
and compare metrics against what the platform reports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The file started with U+200B (UTF-8 e2 80 8b) before the opening
bracket, which broke JSON parsing and caused mint dev to fail with a
YAML parser error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
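
The failure described in the commit above (a U+200B character before the opening bracket) can be reproduced and repaired with standard-library Python. This is a minimal sketch, not the fix the author applied: the helper names and the sample content are hypothetical, and stripping U+FEFF (the byte-order mark) alongside U+200B is an added assumption.

```python
import json

# Characters that are invisible in most editors but invalid before a JSON value.
# U+200B is the one named in the commit; U+FEFF (BOM) is included as an assumption.
INVISIBLE_PREFIXES = "\u200b\ufeff"


def strip_leading_invisibles(text: str) -> str:
    """Return text without zero-width characters at the very start."""
    return text.lstrip(INVISIBLE_PREFIXES)


def parses_as_json(text: str) -> bool:
    """Report whether the standard JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical file content: a zero-width space followed by valid JSON.
broken = "\u200b[{\"name\": \"example\"}]"
fixed = strip_leading_invisibles(broken)
```

Here `parses_as_json(broken)` is false and `parses_as_json(fixed)` is true, because Python's JSON decoder only skips spaces, tabs, and newlines before a value.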
Lives between hyperparameters and model optimization, where users are
already configuring a run and want to understand what the platform
does with their model and data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mint dev was parsing .github/pull_request_template.md as MDX and
failing on the HTML comments. The file claims .github is auto-ignored
but some CLI versions still scan it; listing it explicitly is harmless
and unblocks local preview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expand the page from a high-level overview into a reference users can
match runs against. Each of the nine supported use cases (image
classification, object detection, semantic segmentation, keypoint
detection, text classification, tabular classification, tabular
regression, time series forecasting, time-to-event prediction) now has
a consistent plain-English breakdown of preprocessing, train/val split,
training step, validation step, cycle metrics, and inference output,
including the platform-side defaults and the details that reproduction
depends on (mask handling for semantic segmentation, image-size pinning
for object detection vs. YOLO, augmentation pipeline behavior,
preprocessing state frozen in cycle 1, scaled-vs-original-target metric
scales, etc.).

Also adds a shared "Experiment parameters" table grounded in the SDK's
actual starting defaults (SGD, lr=0.001, batch_size=16, epochs=10,
dynamic per-dataset validation_split) and a tightened
"Reproducing a run locally" checklist.

.mintignore: consolidate the .github/ entries into a single block so
the dev server stops tripping over GitHub PR-template HTML comments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
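
The starting defaults quoted in the commit above (SGD, lr=0.001, batch_size=16, epochs=10, dynamic validation_split) can be written down as a small record and compared against a local configuration. This is only a sketch: the class, field names, and `diff_from_defaults` helper are hypothetical and are not taken from the tracebloc SDK.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ExperimentDefaults:
    """Starting defaults quoted in the commit message; field names are hypothetical."""

    optimizer: str = "SGD"
    learning_rate: float = 0.001
    batch_size: int = 16
    epochs: int = 10
    # The commit describes validation_split as dynamic per dataset,
    # so no fixed number is assumed here.
    validation_split: Optional[float] = None


def diff_from_defaults(local: dict) -> dict:
    """Return {name: (default, local_value)} for every setting that differs."""
    defaults = asdict(ExperimentDefaults())
    return {
        name: (defaults[name], value)
        for name, value in local.items()
        if name in defaults and defaults[name] != value
    }


# Hypothetical local run that changed only the learning rate.
local_run = {"optimizer": "SGD", "learning_rate": 0.01, "batch_size": 16, "epochs": 10}
mismatches = diff_from_defaults(local_run)
```

For this hypothetical run, `mismatches` contains a single entry for `learning_rate`, which is the kind of drift a reproduction checklist is meant to catch.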
Clarify that reproducing a tracebloc run locally — even with everything
matched — will produce small numerical variation, with the major
sources called out: hardware/CUDA differences, GPU non-determinism,
library versions, data-loader worker timing, federated averaging
between cycles, stateful layer behavior, and mixed-precision rounding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
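
Because the commit above expects small numerical variation even when every setting matches, a local comparison is better expressed as a tolerance check than as exact equality. The sketch below assumes a relative tolerance of 1%; that threshold, the function name, and the metric values are all illustrative and do not come from the platform.

```python
import math


def metrics_match(platform: dict, local: dict, rel_tol: float = 1e-2) -> dict:
    """Map each shared metric name to True if the two values agree within rel_tol."""
    return {
        name: math.isclose(platform[name], local[name], rel_tol=rel_tol)
        for name in platform.keys() & local.keys()
    }


# Made-up numbers for illustration only.
platform_metrics = {"accuracy": 0.912, "loss": 0.301}
local_metrics = {"accuracy": 0.909, "loss": 0.352}
result = metrics_match(platform_metrics, local_metrics)
```

With these made-up values the accuracy figures agree within tolerance and the loss figures do not, which would point to a real configuration difference, not to run-to-run noise.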
* docs: add "Bring your data" step to homepage journey

The homepage's "How It Works" jumped from "Create your AI workspace"
straight to "Define a use case", with "Select datasets" framed as a
sub-action of use-case definition. That inverted the actual journey
(ingest first, then define a use case from your prepared datasets)
and contradicted /create-use-case, which gets it right.

Adds an explicit "Bring your data" step between workspace creation
and use-case definition, and rewords Step 2 + the Next Steps card so
"Select datasets" no longer reads as a UI action performed during
use-case creation.

Closes #28

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: fix missed inversion in homepage hero card

The hero CardGroup at the top of the homepage said "Define datasets,
set metrics, invite contributors" — same inversion as the "How It
Works" steps and the "Next Steps" card, framing datasets as something
configured during use-case creation. The previous commit fixed the
two visible spots but missed this one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The setup-guide's "What's Next" sent data owners straight from "client
running" to "create a use case", with "select datasets" framed as a
sub-action of use-case creation. Same inversion pattern as the
homepage bug fixed in #29 — found during a broader audit.

Reword the data-owner bullet to route through ingestion first, then
into use-case creation, with "pick from prepared datasets" replacing
"select datasets" to match the homepage wording.

Closes #30

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
keypoint_detection and semantic_segmentation templates exist on
tracebloc/data-ingestors develop (both with READMEs, scripts, sample
data, sample CSVs) but three places on the docs site still claimed
they were "coming soon" or "still in flight" and pointed users at
support@tracebloc.io instead of the actual templates.

- prerequisites.mdx: replace "Template coming soon" with [Example]
  links for both Keypoint Detection and Semantic Segmentation, matching
  the format used by image_classification and object_detection.
- templates.mdx: add the 2 missing rows to the Available templates
  table; delete the "still in flight" sentence underneath.

Grep `coming soon|still in flight|template.*soon` across develop
returns zero hits after this change (the legitimate "if your use case
isn't yet supported, contact us" fallbacks on prerequisites.mdx:6 and
define.mdx:14,30 are different copy and stay).

Closes #32

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
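
The grep check described in the commit above can be approximated in Python for use in a script or test. This is a sketch under stated assumptions: the pattern is copied from the commit message, while the function name and the sample text are hypothetical.

```python
import re

# Pattern copied from the commit message; matching is case-sensitive.
STALE_COPY = re.compile(r"coming soon|still in flight|template.*soon")


def find_stale_lines(text: str) -> list:
    """Return (line_number, line) pairs whose text matches the stale-copy pattern."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE_COPY.search(line)
    ]


# Hypothetical page content, not taken from the docs repository.
sample = "Keypoint Detection: [Example]\nSemantic Segmentation: Template coming soon\n"
hits = find_stale_lines(sample)
```

An empty result over every docs file would correspond to the "zero hits" outcome the commit reports.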
@saadqbal saadqbal merged commit 9a94c72 into main May 11, 2026
5 of 7 checks passed

3 participants