
docs: add "How training works" page#26

Open
divyasinghds wants to merge 6 commits into develop from docs/how-training-works

Conversation

@divyasinghds
Contributor

@divyasinghds divyasinghds commented May 7, 2026

Summary

  • Adds tools-help/how-training-works.mdx — a transparency page that walks through what the tracebloc client does to your data and model in each of the nine supported use cases.
  • Each accordion covers: input format, preprocessing (incl. split strategy), training/validation step, default loss/optimizer, cycle-level metrics with the underlying library calls (sklearn, torchmetrics, lifelines), and inference output.
  • Closes with a 6-step "reproduce locally" checklist so an evaluating user can run the same pipeline on the same data and compare metrics number-for-number.
  • Wired into the Tools & Help group in docs.json.
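As a rough sketch of the kind of cycle-level metric calls the page documents for the classification use cases (the sklearn functions are real; the wrapper and its name are illustrative, not the tracebloc client's actual code):

```python
# Hypothetical sketch of cycle-level classification metrics as the page
# describes them; the sklearn calls are real, the wrapper is illustrative.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def cycle_classification_metrics(y_true, y_pred):
    """Compute the metrics a classification accordion might list for one cycle."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
    }

metrics = cycle_classification_metrics([0, 1, 1, 0], [0, 1, 0, 0])
```

Running the same calls locally on the same predictions is what lets the "compare metrics number-for-number" claim hold.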

Replaces #25 (renamed from "pipeline reference" → "How training works"; rebased onto develop).

Test plan

  • mint dev renders the page and the new nav entry appears under Tools & Help
  • All nine accordions expand correctly
  • mint broken-links passes
  • An owner of core/metrics/ and core/domains/ in tracebloc-client spot-checks that the per-use-case metric lists and loss formulas are accurate (notably the segmentation boundary metrics and the Cox loss description)

🤖 Generated with Claude Code


Note

Low Risk
Documentation-only changes; the main risk is incorrect or out-of-date training/metric descriptions misleading users, not runtime behavior.

Overview
Adds a new join-use-case/how-training-works.mdx doc that describes in detail the platform’s training/inference pipeline and per-use-case preprocessing, loss/optimizer behavior, metrics, and inference outputs, plus guidance for reproducing runs locally.

Wires the new page into the Join a Use Case navigation in docs.json, explicitly ignores .github/ in .mintignore to prevent Mintlify dev server MDX parsing issues, and cleans up a minor formatting/encoding issue at the start of evals.json.

Reviewed by Cursor Bugbot for commit 6eb7094.

Documents the training and inference pipeline for all nine supported
use cases so a user evaluating tracebloc can reproduce a run locally
and compare metrics against what the platform reports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@divyasinghds divyasinghds self-assigned this May 7, 2026
@LukasWodka
Contributor

👋 Heads-up — Code review queue is at 12 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)

divyasinghds and others added 3 commits May 7, 2026 15:39
The file started with U+200B (UTF-8 e2 80 8b) before the opening
bracket, which broke JSON parsing and caused mint dev to fail with a
YAML parser error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
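The zero-width-space problem described in that commit can be checked for with a short script (the helper below is illustrative, not part of the repo):

```python
# Detect and strip a leading U+200B (zero-width space) or BOM that breaks
# JSON parsing, as the commit describes for evals.json.
import json

def clean_leading_invisibles(text: str) -> str:
    """Remove zero-width spaces and BOMs from the start of a JSON document."""
    return text.lstrip("\u200b\ufeff")

broken = "\u200b{\"ok\": true}"   # starts with U+200B, like the broken file
try:
    json.loads(broken)
    parsed_raw = True
except json.JSONDecodeError:
    parsed_raw = False            # the invisible prefix makes parsing fail

fixed = json.loads(clean_leading_invisibles(broken))
```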
Lives between hyperparameters and model optimization, where users are
already configuring a run and want to understand what the platform
does with their model and data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mint dev was parsing .github/pull_request_template.md as MDX and
failing on the HTML comments. The file claims .github is auto-ignored
but some CLI versions still scan it; listing it explicitly is harmless
and unblocks local preview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expand the page from a high-level overview into a reference users can
match runs against. Each of the nine supported use cases (image
classification, object detection, semantic segmentation, keypoint
detection, text classification, tabular classification, tabular
regression, time series forecasting, time-to-event prediction) now has
a consistent plain-English breakdown of preprocessing, train/val split,
training step, validation step, cycle metrics, and inference output —
including the platform-side defaults and reproduction-load-bearing
details (mask handling for SS, OD-vs-YOLO image-size pinning,
augmentation pipeline behavior, frozen-in-cycle-1 preprocessing state,
scaled-vs-original-target metric scales, etc.).

Also adds a shared "Experiment parameters" table grounded in the SDK's
actual starting defaults (SGD, lr=0.001, batch_size=16, epochs=10,
dynamic per-dataset validation_split) and a tightened
"Reproducing a run locally" checklist.

.mintignore: consolidate the .github/ entries into a single block so
the dev server stops tripping over GitHub PR-template HTML comments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clarify that reproducing a tracebloc run locally — even with everything
matched — will produce small numerical variation, with the major
sources called out: hardware/CUDA differences, GPU non-determinism,
library versions, data-loader worker timing, federated averaging
between cycles, stateful layer behavior, and mixed-precision rounding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
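Of the variation sources that commit lists, only RNG seeding is fully controllable from user code; the rest (hardware, CUDA, library versions, worker timing) remain. A minimal stand-in, not tracebloc code, showing the seeded vs. unseeded difference:

```python
# Stand-in for a run with a random component: with no seed, repeat runs
# jitter; with a fixed seed, they match exactly. The other sources the
# commit names are outside this kind of control.
import random

def noisy_metric(seed=None):
    rng = random.Random(seed)
    return round(0.85 + rng.uniform(-0.01, 0.01), 6)  # pretend cycle metric

seeded_a = noisy_metric(seed=42)
seeded_b = noisy_metric(seed=42)   # identical to seeded_a
```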
