docs: update TELEMETRY.md to reflect current event set #893
The event list in TELEMETRY.md was missing three events that the telemetry layer actually emits: model_load, dataset, and extension_entry. A handful of per-event metadata fields were also not documented (runtime_kernel, runtime_environment, platform_os), and there was a numpy_version typo. This patch brings the document in line with the events declared in tabpfn_common_utils.telemetry.core.events, and adds a per-event breakdown of extra metadata so users can see exactly which fields are attached to each event type. No behavioural change.
Code Review
This pull request updates the TELEMETRY.md file to provide a more comprehensive and accurate description of the data collected by the project. It introduces several new events (model_load, dataset, extension_entry), expands the metadata collected for all events (e.g., OS, runtime environment, library versions), and restructures the per-event metadata section. A review comment suggests including the fit_mode parameter in the metadata description for the session event to ensure full transparency.
- `install_id` – unique, random and anonymous installation ID

### Extra metadata (per-event)
- `fit_called` / `predict_called`: `task` (classification or regression), `num_rows` (*rounded*), `num_columns` (*rounded*), `duration_ms`
The session event also appears to collect the fit_mode parameter, as seen in the initialization of TabPFNClassifier and TabPFNRegressor (e.g., log_model_init_params(self, {"fit_mode": self.fit_mode})). It should be included in the "Extra metadata (per-event)" section for completeness and transparency.
Suggested change:
- `session`: `fit_mode`
- `fit_called` / `predict_called`: `task` (classification or regression), `num_rows` (*rounded*), `num_columns` (*rounded*), `duration_ms`
Pull request overview
Updates the repository’s telemetry documentation to reflect the currently emitted event set and associated metadata, aligning TELEMETRY.md with the telemetry layer’s present behavior.
Changes:
- Expands the documented telemetry events to include `model_load`, `dataset`, and `extension_entry`, and clarifies existing event descriptions.
- Updates the “Metadata (all events)” list to include additional fields and fixes the `numpy_vesion` → `numpy_version` typo.
- Reorganizes event-specific fields into an “Extra metadata (per-event)” section for clarity.
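Summary
- Documents the missing events: `model_load`, `dataset`, and `extension_entry`.
- Documents the missing metadata fields: `runtime_kernel`, `runtime_environment`, `platform_os`.
- Fixes the `numpy_vesion` → `numpy_version` typo.

Brings `TELEMETRY.md` into line with the events declared in `tabpfn_common_utils.telemetry.core.events`.

Test plan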
tabpfn_common_utils/src/tabpfn_common_utils/telemetry/core/events.py
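A cross-check of the document against the event list could be sketched roughly as below. This is only an illustration: the event names are copied by hand from this conversation rather than read from `events.py`, and `missing_events` is a hypothetical helper, not part of `tabpfn_common_utils`.

```python
import re

# Assumption: event names taken from this PR discussion, not imported from events.py.
EXPECTED_EVENTS = [
    "session",
    "fit_called",
    "predict_called",
    "model_load",
    "dataset",
    "extension_entry",
]


def missing_events(doc_text, expected=EXPECTED_EVENTS):
    """Return the expected event names that never appear in backticks in doc_text."""
    # Collect every backtick-quoted identifier mentioned in the document text.
    documented = set(re.findall(r"`([a-z_]+)`", doc_text))
    return [name for name in expected if name not in documented]


# Hypothetical excerpt standing in for the contents of TELEMETRY.md.
sample = (
    "- `session`: `fit_mode`\n"
    "- `fit_called` / `predict_called`: `task`, `num_rows`, `num_columns`\n"
)
print(missing_events(sample))
```

In practice the sample string would be replaced by the contents of `TELEMETRY.md`, and an empty result would suggest every listed event is mentioned at least once.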