Auto-share session traces to a private HF dataset by DarshanCode2005 · Pull Request #193 · huggingface/ml-intern

DarshanCode2005 · 2026-04-29T22:06:00Z

Resolves #191

What

Every ml-intern session is now uploaded to your own private Hugging Face
dataset (default {hf_user}/ml-intern-traces) in the Claude Code JSONL
format that the HF Agent Trace Viewer auto-detects. The dataset is
created private; you can flip it to public from inside the CLI.

The existing upload to smolagents/ml-intern-sessions is unchanged. That
dataset still feeds the backend KPI scheduler.

Why

Users wanted a way to browse, share, and debug their own runs through the
HF trace viewer without exposing them publicly by default.

Changes

Config

agent/config.py: added share_traces: bool = True and
personal_trace_repo_template: str = "{hf_user}/ml-intern-traces".
configs/cli_agent_config.json and configs/frontend_agent_config.json:
surface the same defaults.

Uploader

agent/core/session_uploader.py: rewrote the subprocess uploader.
- New to_claude_code_jsonl(trajectory) converter. Maps litellm messages
  to user / assistant / tool_use / tool_result blocks. Deterministic
  SHA-1 UUIDs keyed by session_id::role::idx so re-uploads keep the
  parent chain stable. System prompts are skipped.
- New CLI flags: --format {row, claude_code}, --token-env,
  --private.
- Per-format status tracking on the local trajectory: upload_status
  for the org dataset and personal_upload_status for the user repo,
  so a failure on one path does not clobber the other.
- Token resolution: --token-env overrides the org fallback chain
  (HF_SESSION_UPLOAD_TOKEN, HF_TOKEN, HF_ADMIN_TOKEN). The
  personal upload uses HF_TOKEN directly.
- private is now passed into create_repo instead of being hardcoded.
  Existing repos keep whatever visibility the user set.
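
To illustrate the deterministic-ID idea in the converter bullet above, here is a minimal sketch, not the PR's actual code. It assumes `uuid.uuid5` (which hashes with SHA-1) over a fixed namespace; the namespace choice, the helper names `event_uuid` and `to_jsonl_sketch`, and the exact event fields are illustrative assumptions.

```python
import json
import uuid

# Assumption: any fixed namespace works; the PR does not say which one is used.
TRACE_NAMESPACE = uuid.NAMESPACE_URL


def event_uuid(session_id: str, role: str, idx: int) -> str:
    """Derive a stable UUID from the session_id::role::idx key (uuid5 uses SHA-1)."""
    return str(uuid.uuid5(TRACE_NAMESPACE, f"{session_id}::{role}::{idx}"))


def to_jsonl_sketch(session_id: str, messages: list) -> str:
    """Emit one JSON line per non-system message, chaining each event to the previous one."""
    lines = []
    parent = None
    for idx, msg in enumerate(messages):
        role = msg.get("role")
        if role == "system":
            continue  # system prompts are skipped, as the description states
        uid = event_uuid(session_id, role, idx)
        lines.append(json.dumps({
            "uuid": uid,
            "parentUuid": parent,  # illustrative field names
            "sessionId": session_id,
            "type": role,
            "message": {"role": role, "content": msg.get("content")},
        }))
        parent = uid
    return "\n".join(lines)
```

Because the IDs depend only on the session ID, role, and message index, converting the same trajectory twice yields identical lines, which is what keeps the parent chain stable across re-uploads.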

Session wiring

agent/core/session.py:
- Added _personal_trace_repo_id(). Returns None when sharing is
  disabled, the user is anonymous, or the template is missing.
- Added _spawn_uploader() helper to keep subprocess args in one place.
- save_and_upload_detached now spawns two detached subprocesses: one
  for the org dataset (row format, existing token chain) and one for
  the per-user private dataset (Claude Code format, HF_TOKEN).
- retry_failed_uploads_detached now accepts personal_repo_id and
  runs both retry passes in parallel.
agent/core/agent_loop.py: pass
personal_repo_id=session._personal_trace_repo_id() into the startup
retry so personal uploads that failed in a prior session get
re-attempted on next launch.
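
The three "returns None" conditions listed for _personal_trace_repo_id() can be sketched as a small pure function. This is an illustration under assumptions: the real method lives on the session object, and the parameter names here are hypothetical.

```python
from typing import Optional


def personal_trace_repo_id(
    share_traces: bool,
    hf_user: Optional[str],
    template: Optional[str] = "{hf_user}/ml-intern-traces",
) -> Optional[str]:
    """Return the per-user repo id, or None when the personal upload should be skipped."""
    # Skip when sharing is disabled, the user is anonymous, or the template is missing.
    if not share_traces or not hf_user or not template:
        return None
    return template.format(hf_user=hf_user)
```

Returning None rather than raising lets callers treat "no personal repo" as a normal state and simply skip the second upload.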

CLI

agent/main.py: new /share-traces slash command.
- /share-traces shows the dataset URL and current visibility.
- /share-traces public flips to public via
  HfApi.update_repo_settings.
- /share-traces private flips back to private.
- Idempotent create_repo so the first flip works even before any
  session has been saved.
- Uses the user's own HF_TOKEN.
agent/utils/terminal_display.py: added the command to the help text.

Docs

README.md: new "Sharing Traces" section covering default behavior,
the slash command, the opt-out flag, and how to override the template.

Edge cases handled

- Anonymous user (no resolvable HF username): personal upload is skipped.
- HF_TOKEN missing: personal upload is skipped; org upload is unaffected.
- Pre-existing local session logs created before this change: the personal
  retry path skips files that have never been tagged, so old sessions are
  not suddenly re-uploaded to a newly created personal repo.
- Heartbeat saves: the same trajectory is rewritten every minute. Both
  upload paths run on every save; the Hub deduplicates by content hash,
  so the cost is small.

github-actions · 2026-04-30T14:53:45Z

Claude finished @lewtun's task in 3m 18s

PR Review: Auto-share session traces to a private HF dataset

No blocking issues — 3 P1
Verdict: ready to merge

What I checked

session_uploader.py: JSONL converter, token resolution, upload/retry logic, status field design
session.py: subprocess wiring, personal repo resolution, concurrent upload paths
agent_loop.py: startup retry pass for personal repo
main.py: /share-traces command, token flow
config.py: new fields, defaults

P1 — Concurrent write race between org and personal uploaders

save_and_upload_detached spawns two subprocesses that each read-modify-write the same local JSON file with no lock. Both processes load the file at startup, then write back their respective status key:

session_uploader.py:374-377 — org uploader writes upload_status = "success"
session_uploader.py:374-377 (personal) — writes personal_upload_status = "success"

If the personal uploader reads the file before the org uploader finishes writing, the personal uploader's final json.dump will overwrite with a stale copy that still has upload_status: "pending". On the next session start, retry_failed_uploads at session.py:583-598 re-uploads the session to the org dataset unnecessarily.

The org Hub dataset receives spurious duplicate uploads; the Hub is content-hash idempotent so no data is corrupted, but the retry behavior is incorrect.

P1 — Personal upload silently skipped when user authenticated via `huggingface-cli login`

_spawn_uploader at session.py:503 passes --token-env HF_TOKEN, so the uploader subprocess resolves the personal upload token via os.environ.get("HF_TOKEN") only (session_uploader.py:44). Users who ran huggingface-cli login (which saves the credential to ~/.cache/huggingface/token but does not export HF_TOKEN) will silently fail personal uploads — the local file is immediately marked personal_upload_status: "failed" at session_uploader.py:323-325.

The resolved token is available in session.hf_token (set from resolve_hf_token() in main.py:934) but is never propagated to the subprocess environment. The /share-traces command correctly uses session.hf_token at main.py:854, so the inconsistency is apparent.
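
One way to close the gap described here is to let the uploader fall back to a cached login when the named environment variable is unset. The following is a sketch under assumptions, not the project's implementation: `resolve_token` and the `cached_lookup` hook are hypothetical names, and in practice the hook could be something like `huggingface_hub.get_token`.

```python
from typing import Callable, Mapping, Optional

# Org fallback chain as listed in the PR description.
ORG_FALLBACK = ("HF_SESSION_UPLOAD_TOKEN", "HF_TOKEN", "HF_ADMIN_TOKEN")


def resolve_token(
    env: Mapping[str, str],
    token_env: Optional[str] = None,
    cached_lookup: Optional[Callable[[], Optional[str]]] = None,
) -> Optional[str]:
    """Pick a token from the environment, then from a cached login if one is provided."""
    # An explicit --token-env value overrides the org fallback chain.
    names = (token_env,) if token_env else ORG_FALLBACK
    for name in names:
        value = env.get(name)
        if value:
            return value
    # Hypothetical hook covering credentials saved by a CLI login.
    if cached_lookup is not None:
        return cached_lookup() or None
    return None
```

Passing the environment as a mapping keeps the function easy to test; a caller would normally pass `os.environ`.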

P1 — All trace events share the session start timestamp

to_claude_code_jsonl at session_uploader.py:142-143 assigns the session start time to a single timestamp variable, then uses it for every emitted event (session_uploader.py:164, 202, 225). The Claude Code JSONL schema uses per-event timestamps to drive the timeline view in the HF Agent Trace Viewer; with all events sharing the same timestamp they appear simultaneous and the viewer cannot reconstruct turn ordering from time alone.

The per-message timestamp information is not currently stored in the trajectory dict. A minimal fix would be to store datetime.now().isoformat() at the time each message is added to the context manager and include it in get_trajectory()'s messages serialisation.
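
The minimal fix suggested above can be sketched as follows. This is an illustration only: `MessageLog` and `event_timestamp` are hypothetical stand-ins for the context manager and the converter, and the `timestamp` key name is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


class MessageLog:
    """Hypothetical stand-in for the context manager that stores messages."""

    def __init__(self) -> None:
        self.messages = []

    def add_message(self, role: str, content: str, now: Optional[datetime] = None) -> None:
        # Stamp each message at append time so events can be ordered later.
        stamp = (now or datetime.now(timezone.utc)).isoformat()
        self.messages.append({"role": role, "content": content, "timestamp": stamp})


def event_timestamp(message: dict, session_start: str) -> str:
    """Prefer the per-message stamp; fall back to the session start for older logs."""
    return message.get("timestamp") or session_start
```

The fallback keeps trajectories saved before such a change convertible, at the cost of those older events still sharing one timestamp.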

Notes

The create_repo exception-eating pattern at session_uploader.py:354-363 swallows auth/permission errors, making silent failures harder to diagnose. Non-blocking since upload_file will also fail and be marked correctly.
Using Python's builtin name format as a parameter (session_uploader.py:285, 409) is a minor code smell but has no runtime impact in this context.

github-actions · 2026-04-30T15:20:33Z

Claude finished @lewtun's task in 2m 41s

PR Review: Auto-share session traces to a private HF dataset

Gather context and read changed files
Check if previously flagged P1 issues were addressed
Look for new P0 issues
Post findings

No blocking issues — 0 new findings
Verdict: ready to merge

What I checked

session_uploader.py: concurrent write paths, token resolution, JSONL converter
session.py: _spawn_uploader, _personal_trace_repo_id, save_and_upload_detached, retry logic
agent_loop.py: startup retry wiring for personal repo
main.py: /share-traces command, token flow, update_repo_settings call
config.py: new fields and defaults

Re-review notes

The two new commits (a83ad51, 78f8cd2) are both merge-from-main syncs — no changes to the trace-sharing code. The three P1 findings from the prior review are unaddressed; per review policy they are not re-flagged here, and the author may defer them to a follow-up.

No new P0 findings were identified.

github-actions · 2026-05-01T09:27:34Z

Claude finished @lewtun's task in 2m 56s

PR Review: Auto-share session traces to a private HF dataset

Gather context and read changed files
Check what new commits have been pushed since last review
Look for new P0 issues (re-review: suppressing P1s per policy)
Post findings

No new P0 findings
Verdict: ready to merge

What I checked

backend/dependencies.py: new _dev_user_from_env() function — token resolution, exception handling, fallback behavior, caching behavior
agent/core/hf_access.py: fetch_whoami_v2 implementation (no internal caching confirmed)
agent/config.py, agent/core/session_uploader.py, configs/*.json, README.md: dataset rename propagation
session_manager.py + routes/agent.py: how user_id flows from get_current_user into sessions (to confirm the rename doesn't break the repo path derivation)

Re-review notes

Two new real-code commits since the last review (a83ad51):

5e70584 — Use HF token owner for local dev auth introduces _dev_user_from_env() (backend/dependencies.py:141). The function correctly falls back to DEV_USER at every failure point. One note (P1, not re-flagged under re-review policy): _dev_user_from_env() calls fetch_whoami_v2 twice per request — once at line 152, then again inside _fetch_user_plan at line 169 — with no caching. In contrast, the production path caches _validate_token results for 5 minutes. This means dev mode makes 2 uncached HTTP round-trips to api.huggingface.co on every request; if the HF API is slow or unavailable, requests will block up to the 5-second fetch_whoami_v2 timeout before falling back to DEV_USER. Not blocking since the fallback is correct and offline dev still works, but a simple functools.lru_cache or module-level (token, resolved_user) pair would eliminate the redundancy.
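
The caching suggestion in this note could look roughly like the sketch below. It is an assumption-laden illustration: `ttl_cached` is a hypothetical helper, the wrapped `fetch` callable stands in for the whoami lookup, and the 300-second default mirrors the 5-minute production cache mentioned above.

```python
import time
from typing import Any, Callable, Dict, Tuple


def ttl_cached(
    fetch: Callable[[str], Any],
    ttl_seconds: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
) -> Callable[[str], Any]:
    """Wrap a per-token lookup so repeated calls within the TTL reuse the last result."""
    cache: Dict[str, Tuple[float, Any]] = {}

    def wrapper(token: str) -> Any:
        now = clock()
        hit = cache.get(token)
        if hit is not None and now - hit[0] < ttl_seconds:
            return hit[1]  # still fresh: skip the HTTP round-trip
        value = fetch(token)
        cache[token] = (now, value)
        return value

    return wrapper
```

Unlike a plain `functools.lru_cache`, a TTL lets a changed username or plan show up again after the window expires.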

14e2690 — Rename personal session trace dataset (ml-intern-traces → ml-intern-sessions) propagates cleanly across all 6 touch points. Any testers who already created the {user}/ml-intern-traces dataset during branch testing will have their old traces orphaned (new sessions write to the new name), but since this is pre-merge that's an acceptable disruption.

The three P1 findings from the first review remain outstanding (concurrent write race, huggingface-cli login silent failure, shared event timestamps). Per review policy they are not re-flagged here.

github-actions · 2026-05-01T09:34:14Z

Claude finished @lewtun's task in 3m 36s

PR Review: Auto-share session traces to a private HF dataset

Gather context and read changed files
Check what new commits have been pushed since last review
Look for new P0 issues (re-review: suppressing P1s per policy)
Post findings

No new P0 findings
Verdict: ready to merge

What I checked

session_uploader.py: dataset_card_readme, _upload_dataset_card, integration into upload_session_as_file retry loop, token and repo_id propagation
tests/unit/test_session_uploader.py: coverage of README content and format-gated upload behavior
session.py: heartbeat save path — confirmed personal_upload_status is wiped on each heartbeat, meaning _upload_dataset_card runs on every save (pre-existing design, acknowledged in PR description as acceptable)

Re-review notes

One new real-code commit since the last review (14e2690):

4cc7717 — Add session dataset card metadata adds dataset_card_readme() and _upload_dataset_card() to session_uploader.py, and two unit tests. The implementation is clean. Note that _upload_dataset_card sits inside the for attempt in range(max_retries) retry loop (session_uploader.py:445), so a transient failure in the README upload consumes retry budget that would otherwise go to the session JSONL upload. This is worth watching for rate-limit scenarios but is not a blocking concern given the Hub's per-user rate headroom.
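
For readers wondering what keeping the card upload out of the retry budget might look like, here is one possible arrangement. It is a sketch under assumptions, not what the PR does: `upload_with_retries` and both callables are hypothetical, and the merged code keeps the card upload inside the loop.

```python
from typing import Callable


def upload_with_retries(
    upload_session: Callable[[], None],
    upload_card: Callable[[], None],
    max_retries: int = 3,
) -> bool:
    """Retry only the session upload; treat the dataset card as best-effort."""
    succeeded = False
    for _attempt in range(max_retries):
        try:
            upload_session()
            succeeded = True
            break
        except Exception:
            continue  # a real uploader would likely back off before retrying
    if succeeded:
        try:
            upload_card()
        except Exception:
            pass  # a card failure does not change the session upload result
    return succeeded
```

With this shape, a transient README failure never uses up an attempt that the session JSONL upload could have used.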

The three P1 findings from the first review (concurrent write race, huggingface-cli login silent failure, shared event timestamps) remain outstanding. Per review policy they are not re-flagged here.

lewtun · 2026-05-01T09:37:09Z

                json.dump(data, f, indent=2)
            return False

-        # Scrub secrets (HF tokens, API keys, etc.) from messages + events


Why was this removed? We should retain some best-effort scrubbing

github-actions · 2026-05-01T09:58:49Z

Claude finished @lewtun's task in 7m 16s

PR Review: Auto-share session traces to a private HF dataset

Gather context and read changed files
Investigate lewtun's comment about removed scrubbing at session_uploader.py:68
Check new commits since last review (10e7279)
Post findings

No new P0 findings
Verdict: ready to merge

What I checked

session_uploader.py: _update_upload_status fcntl lock correctness, _resolve_token chain for HF_TOKEN, scrubbing coverage in _write_row_payload and _write_claude_code_payload
session.py: _spawn_uploader token propagation via _ML_INTERN_PERSONAL_TOKEN env var
context_manager/manager.py: per-message timestamp assignment and litellm.Message extra-field preservation in model_dump()
hf_tokens.py: resolve_hf_token + get_cached_hf_token() covering huggingface-cli login path
tests/unit/test_session_uploader.py: new coverage for token precedence, atomic status update, per-message timestamps
tests/unit/test_dangling_tool_calls.py: timestamp preservation test

Re-review notes

10e7279 — Fix session trace upload review issues addresses all three P1 findings from the first review:

Concurrent write race (P1, first review): fixed — _update_upload_status re-reads the file under fcntl.LOCK_EX before writing, and _read_session_file uses LOCK_SH. The main-process atomic rename (tmp.replace(filepath)) is compatible because the status update opens a fresh file descriptor to the renamed inode.
huggingface-cli login silent failure (P1, first review): fixed — _spawn_uploader now passes self.hf_token into the subprocess via _ML_INTERN_PERSONAL_TOKEN; _resolve_token("HF_TOKEN") reads that first, then falls through to resolve_hf_token(…, include_cached=True) which calls huggingface_hub.get_token() covering the cached-login case (hf_tokens.py:19-22).
Shared event timestamps (P1, first review): fixed — ContextManager.add_message stamps each message at append time (manager.py:243-244); litellm.Message uses extra="allow" so the field survives model_dump() and is picked up in to_claude_code_jsonl at session_uploader.py:172.
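
The locking approach described for the concurrent write race can be sketched as below. This is a simplified, Unix-only illustration (fcntl is not available on Windows); `update_status` is a hypothetical name and the real _update_upload_status may differ in detail.

```python
import fcntl
import json
from pathlib import Path


def update_status(path: Path, key: str, value: str) -> None:
    """Re-read the session file under an exclusive lock, then write one status key."""
    with open(path, "r+", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other holders release the lock
        try:
            data = json.load(f)  # fresh copy, so the other uploader's key is preserved
            data[key] = value
            f.seek(0)
            json.dump(data, f, indent=2)
            f.truncate()  # drop leftover bytes if the new content is shorter
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The key point is that the read happens after the lock is acquired, so neither uploader can write back a copy it loaded before the other one finished.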

Addressing @lewtun's comment (`session_uploader.py:68`)

The scrubbing was not removed — it was refactored out of upload_session_as_file into the _scrub() helper introduced at line 68 in the new file. It is called in both upload paths:

_write_row_payload:258-260 — scrubs messages, events, and tools (same coverage as the original inline code in main)
_write_claude_code_payload:282 — scrubs messages before passing to to_claude_code_jsonl (events/tools are not included in the Claude Code JSONL output)

Additionally, save_trajectory_local scrubs the trajectory before writing to disk (session.py:401-407), so the local file already has secrets redacted before any uploader subprocess reads it.

github-actions · 2026-05-01T10:08:46Z

Claude finished @lewtun's task in 1m 26s

PR Review: Auto-share session traces to a private HF dataset

Gather context and read changed files
Check what new commits have been pushed since last review
Look for new P0 issues (re-review: suppressing P1s per policy)
Post findings

No new P0 findings
Verdict: ready to merge

What I checked

session_uploader.py: _scrub_session_for_upload correctness, both write paths (_write_row_payload, _write_claude_code_payload) using the new helper, shallow copy safety
tests/unit/test_session_uploader.py: new scrubbing tests for both formats — coverage of HF token, Anthropic key, and GitHub token patterns across messages, events, and tools
Dataset card README wording accuracy

Re-review notes

One new real-code commit since the last review (10e7279):

ef14549 — Preserve secret scrubbing before trace uploads addresses @lewtun's inline comment at session_uploader.py:68. The fix introduces _scrub_session_for_upload (session_uploader.py:256-262) that scrubs messages, events, and tools in one place, and both _write_row_payload and _write_claude_code_payload now call it. The _write_claude_code_payload path previously only scrubbed messages; now events and tools are also scrubbed (they are not passed to to_claude_code_jsonl but the helper is now symmetric and future-proof). The shallow copy is correct — scrubbed = dict(data) followed by key replacement does not mutate the caller's data.
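
The shallow-copy pattern mentioned here can be illustrated with a short sketch. The regular expressions and the `[REDACTED]` marker are hypothetical placeholders rather than the project's actual patterns, and `scrub` is a simplified stand-in for the real helper.

```python
import re
from typing import Any

# Hypothetical patterns for illustration; the real scrubber's rules are not shown in this thread.
SECRET_PATTERNS = [
    re.compile(r"hf_[A-Za-z0-9]{20,}"),
    re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),
    re.compile(r"gh[pousr]_[A-Za-z0-9]{20,}"),
]


def scrub(value: Any) -> Any:
    """Return a copy of value with secret-looking substrings replaced."""
    if isinstance(value, str):
        for pattern in SECRET_PATTERNS:
            value = pattern.sub("[REDACTED]", value)
        return value
    if isinstance(value, list):
        return [scrub(item) for item in value]
    if isinstance(value, dict):
        return {key: scrub(item) for key, item in value.items()}
    return value


def scrub_session_for_upload(data: dict) -> dict:
    """Shallow-copy the session, then replace the three scrubbed keys on the copy."""
    scrubbed = dict(data)
    for key in ("messages", "events", "tools"):
        if key in scrubbed:
            scrubbed[key] = scrub(scrubbed[key])
    return scrubbed
```

Because only the top-level dict is copied and the scrubbed keys are rebound to new objects, the caller's data is left untouched.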

Two new tests verify end-to-end that HF, Anthropic, and GitHub secrets are replaced in both the row and claude_code output files.

The dataset card README update accurately describes the automated scrubbing and appropriately qualifies the warning as "no comprehensive redaction" — this is a correct and honest description.

The three P1 findings from the first review (concurrent write race, huggingface-cli login silent failure, shared event timestamps) were all addressed in 10e7279 and acknowledged in the prior review. No further concerns.

github-actions · 2026-05-01T10:16:05Z

Claude finished @lewtun's task in 1m 8s

PR Review: Auto-share session traces to a private HF dataset

Gather context and read changed files
Check new commits since last review
Look for new P0 issues (re-review: suppressing P1s per policy)
Post findings

No new P0 findings
Verdict: ready to merge

What I checked

session_uploader.py:348-417: dataset_card_readme template change — removed self-referential Repository: link (which was interpolated via f-string), added hardcoded demo and CLI links
tests/unit/test_session_uploader.py:26-30: new assertions verify the added links and that the old link pattern is absent
session_uploader.py:426: call site still passes repo_id — parameter is now unused in the template but callers are unaffected

Re-review notes

One new real-code commit since the last review (ef14549):

c906a38 — Link ML Intern demo in dataset card replaces the per-repo self-link (https://huggingface.co/datasets/{repo_id}) in the dataset card README with two hardcoded links to the ML Intern demo Space and CLI repo. The old link was functional (interpolated via f-string) but redundant since a reader is already on the dataset page. The repo_id parameter at session_uploader.py:348 is now unused inside the template; the call site at session_uploader.py:426 still passes it without error. Not flagging as the function signature is a minor cosmetic cleanup candidate at most (P2).

The three P1 findings from the first review (concurrent write race, huggingface-cli login silent failure, shared event timestamps) were all addressed in 10e7279. No further concerns.

lewtun

Thanks a lot for the nice PR @DarshanCode2005! LGTM now with some fixes I pushed.

DarshanCode2005 · 2026-05-01T10:39:39Z

Thanks @lewtun. Would it be alright if I emailed you for some career advice?

DarshanCode2005 and others added 7 commits April 30, 2026 03:25

feat: add share_traces toggle and per-user trace repo template

d6c9c3d

feat: support Claude Code JSONL format and per-target auth

e5f078a

feat: dual-upload sessions to private user trace dataset

4f2e0d0

chore: retry personal trace uploads on booting

f1bbf00

feat: add /share-traces command to flip dataset visibility

d4cf5a0

docs: document HF trace auto-share and /share-traces

c7ec35f

Merge branch 'main' into share-trace-on-HF

78f8cd2

Merge branch 'main' into share-trace-on-HF

a83ad51

lewtun and others added 2 commits May 1, 2026 11:12

Use HF token owner for local dev auth

5e70584

Co-authored-by: Codex <codex@openai.com>

Rename personal session trace dataset

14e2690

Co-authored-by: Codex <codex@openai.com>

Add session dataset card metadata

4cc7717

Co-authored-by: Codex <codex@openai.com>

lewtun reviewed May 1, 2026

Fix session trace upload review issues

10e7279

Co-authored-by: OpenAI Codex <codex@openai.com>

Preserve secret scrubbing before trace uploads

ef14549

Co-authored-by: OpenAI Codex <codex@openai.com>

Link ML Intern demo in dataset card

c906a38

Co-authored-by: OpenAI Codex <codex@openai.com>

lewtun approved these changes May 1, 2026

lewtun merged commit 0bd7547 into huggingface:main May 1, 2026
2 checks passed

lewtun merged 14 commits into huggingface:main from DarshanCode2005:share-trace-on-HF
