@kennyyu kennyyu commented Nov 12, 2025

This adds tracing annotations to make it easier to identify where time is spent in SFT workloads. To run it:

  • Set enable_trace=true when launching the job, for example with sl_basic:
      python -m tinker_cookbook.recipes.sl_basic enable_trace=true
  • While the job is running, generate the trace:
      python -m tinker_cookbook.utils.trace /tmp/tinker-examples/sl_basic/trace_events.jsonl /tmp/sl_basic.json
  • Then visualize the generated file with Perfetto:
[Screenshot, 2025-11-21 7:44:13 AM]
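
For readers unfamiliar with this kind of conversion step, here is a rough sketch of what turning a JSONL event log into a single JSON file for Perfetto can look like. It is not the code in tinker_cookbook.utils.trace. It assumes (and this is only an assumption) that each line of the JSONL file is already an event in the Chrome Trace Event format, and it wraps the events in a top-level "traceEvents" object. The function name jsonl_to_trace is hypothetical.

```python
import json
from pathlib import Path


def jsonl_to_trace(jsonl_path: str, out_path: str) -> int:
    """Hypothetical helper: collect one JSON event per line into a single
    Chrome-trace-style JSON file. Returns the number of events written.

    Assumption: every complete line is already a valid trace event object.
    """
    events = []
    for line in Path(jsonl_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            # The job may still be appending to the file, so the last line
            # can be partially written. Skip anything that does not parse.
            continue
    Path(out_path).write_text(json.dumps({"traceEvents": events}))
    return len(events)
```

Skipping unparseable lines is a deliberate choice here, since the trace is generated while the job is still running and the final line may be incomplete.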

async def submit_batch(epoch_idx: int, batch_idx: int) -> SubmittedBatch:
    step = epoch_idx * n_batches + batch_idx
    context = get_scope_context()
    context.attributes["step"] = step
Review comment from a collaborator:

nit: it could be nice to have a helper for this, e.g.
    context.set_attribute('step', step)
or even
    set_scope_attribute('step', step)
which calls get_scope_context() internally.
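
A minimal sketch of what the suggested helpers might look like. The ScopeContext class and the contextvars-based get_scope_context below are hypothetical stand-ins so the example runs on its own; the real get_scope_context in this PR may be implemented differently. Only the names set_attribute and set_scope_attribute come from the suggestion above.

```python
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ScopeContext:
    """Hypothetical stand-in for the scope context object in the diff."""

    attributes: dict[str, Any] = field(default_factory=dict)

    def set_attribute(self, key: str, value: Any) -> None:
        # Method form of: context.attributes[key] = value
        self.attributes[key] = value


# Stand-in storage for the current scope (assumption, not the PR's code).
_scope: ContextVar[ScopeContext] = ContextVar("scope_context")


def get_scope_context() -> ScopeContext:
    """Return the current scope context, creating one if none is set."""
    try:
        return _scope.get()
    except LookupError:
        ctx = ScopeContext()
        _scope.set(ctx)
        return ctx


def set_scope_attribute(key: str, value: Any) -> None:
    """Module-level helper that looks up the scope context internally."""
    get_scope_context().set_attribute(key, value)
```

With helpers like these, the two lines in submit_batch would reduce to a single call, set_scope_attribute("step", step).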

kennyyu commented Nov 21, 2025

The failing check is due to pre-existing type check/uv errors.

@kennyyu kennyyu merged commit 5f5ce26 into main Nov 21, 2025
1 of 2 checks passed
3 participants