Added heatmap and bounded/normalization to metrics #559
Pull request overview
This PR updates the results UI to use a heatmap (instead of a radar chart) for comparing metrics across runs, and introduces backend/FE support for “bounded” metric metadata to drive consistent [0,1]-based color mapping.
Changes:
- Replaced the radar chart option with a heatmap visualization and updated Plotly layout/trace builders accordingly.
- Added `BOUNDED` to backend metric metadata (`BaseMetric.get_metadata`) and annotated several metrics with boundedness.
- Improved model comparison table readability by truncating long model/metric names with tooltips.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| DashAI/front/src/utils/i18n/locales/en/models.json | Adds heatmap label and removes radar label. |
| DashAI/front/src/utils/i18n/locales/es/models.json | Adds heatmap label and removes radar label. |
| DashAI/front/src/pages/results/constants/layoutMaking.jsx | Updates Plotly layout behavior for heatmap vs bar. |
| DashAI/front/src/pages/results/constants/graphsMaking.jsx | Removes radar trace creation; adds heatmapMaking trace builder. |
| DashAI/front/src/pages/results/components/ResultsGraphsSelection.jsx | Switches chart toggle from radar to heatmap. |
| DashAI/front/src/pages/results/components/ResultsGraphsPlot.jsx | Renders heatmap traces when selected. |
| DashAI/front/src/pages/results/components/ResultsGraphs.jsx | Fetches metric metadata and builds the heatmap trace. |
| DashAI/front/src/components/models/ModelComparisonTable.jsx | Truncates long names with tooltip for readability. |
| DashAI/back/metrics/base_metric.py | Introduces BOUNDED and exposes it via get_metadata. |
| DashAI/back/metrics/classification_metric.py | Sets default BOUNDED for classification metrics. |
| DashAI/back/metrics/classification/log_loss.py | Normalizes log loss by log(num_classes) and marks as bounded. |
| DashAI/back/metrics/classification/cohen_kappa.py | Doc formatting change for reference URL. |
| DashAI/back/metrics/translation/bleu.py | Marks BLEU as bounded. |
| DashAI/back/metrics/translation/chrf.py | Marks CHRF as bounded. |
| DashAI/back/metrics/translation/ter.py | Marks TER as unbounded. |
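To make the metadata change in the table concrete, here is a minimal sketch of how a class-level `BOUNDED` flag could be exposed through `get_metadata`. It is an illustration under stated assumptions, not the code in this PR: the classes are stand-ins for the DashAI ones, and the `MAXIMIZE` attribute, the metadata key names, and the default value chosen for classification metrics are all assumed.

```python
class BaseMetric:
    """Stand-in for DashAI's BaseMetric; only the metadata part is sketched."""

    MAXIMIZE: bool = False  # assumed flag: True when higher values are better
    BOUNDED: bool = False   # True when the metric naturally lies in [0, 1]

    @classmethod
    def get_metadata(cls) -> dict:
        # Key names are assumptions, not the actual DashAI payload.
        return {"maximize": cls.MAXIMIZE, "bounded": cls.BOUNDED}


class ClassificationMetric(BaseMetric):
    # Assumed default; the PR only says a default is set for this family.
    BOUNDED = True


class Bleu(BaseMetric):
    MAXIMIZE = True
    BOUNDED = True   # marked bounded in this PR


class Ter(BaseMetric):
    MAXIMIZE = False
    BOUNDED = False  # marked unbounded in this PR


if __name__ == "__main__":
    for metric in (ClassificationMetric, Bleu, Ter):
        print(metric.__name__, metric.get_metadata())
```

Keeping the flag as a class attribute lets subclasses override it in a single line, and the frontend can read it without instantiating a metric.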
Comments suppressed due to low confidence (1)
DashAI/front/src/pages/results/constants/graphsMaking.jsx:36
`graphsMaking`'s public API has changed (it now expects `(graphsToView, run, metrics, values, runIndex, theme)` and only builds bar traces). There are still call sites in the repo that import this module and call `graphsMaking` with the previous signature (e.g. `front/src/components/pipelines/results/ResultsGraphs.jsx`), which will produce incorrect Plotly trace data at runtime. Please either update those call sites in this PR or keep `graphsMaking` backward-compatible (e.g. via an adapter/overload or a separate export for the pipeline view).
function graphsMaking(graphsToView, run, metrics, values, runIndex, theme) {
graphsToView.bar = graphsToView.bar || [];
const colors = getTraceColors(theme);
const color = colors[runIndex % colors.length];
const runLabel = run.run_name || run.name || `Run ${runIndex + 1}`;
graphsToView.bar.push({
type: "bar",
name: runLabel,
x: metrics,
y: values,
marker: { color, opacity: 0.85 },
});
This needs some work, as we discussed in today's meeting.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
DashAI/back/metrics/classification/log_loss.py:90
- PR description mentions normalizing LogLoss to a bounded [0,1] score, but `score()` still returns raw `sklearn.metrics.log_loss(...)` (unbounded, range [0,+∞)). Either implement the normalization (and adjust any metadata like boundedness accordingly) or update the documentation/PR description so the frontend doesn't assume a bounded value.
Returns
-------
float
Log Loss, lower is better.
"""
from sklearn.metrics import log_loss
true_labels, _ = prepare_to_metric(true_labels, probs_pred_labels)
return log_loss(true_labels, probs_pred_labels)
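One possible shape for the normalization this comment asks about is sketched below, in pure Python so it runs without scikit-learn. The function name `normalized_log_loss`, its signature, and the clamping step are assumptions for illustration, not the PR's implementation. Note that dividing by `log(num_classes)` alone does not guarantee a value in [0, 1]: a model that does worse than uniform guessing scores above 1, so the sketch clamps the ratio.

```python
import math
from typing import Sequence


def normalized_log_loss(
    true_labels: Sequence[int],
    probs: Sequence[Sequence[float]],
    eps: float = 1e-15,
) -> float:
    """Mean log loss divided by log(num_classes), clamped to [0, 1].

    0 means perfect predictions, 1 means as bad as (or worse than)
    predicting a uniform distribution. Lower is better.
    """
    if len(true_labels) == 0 or len(true_labels) != len(probs):
        raise ValueError("true_labels and probs must be non-empty and equal in length")
    num_classes = len(probs[0])
    if num_classes < 2:
        raise ValueError("at least two classes are required")

    total = 0.0
    for label, row in zip(true_labels, probs):
        # Clip the probability of the true class so log() stays finite.
        p = min(max(row[label], eps), 1.0)
        total -= math.log(p)

    raw = total / len(true_labels)
    # log(num_classes) is the loss of a uniform prediction; clamp anything worse.
    return min(raw / math.log(num_classes), 1.0)


if __name__ == "__main__":
    print(normalized_log_loss([0, 1], [[0.9, 0.1], [0.2, 0.8]]))
```

If clamping is not acceptable because it hides how much worse than uniform a model is, the alternative is to keep the raw value and leave the metric marked as unbounded.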
…localization messages

This pull request introduces a new heatmap visualization for model metric comparisons in the results UI, replacing the previous radar chart. It also standardizes the concept of "bounded" metrics (i.e., metrics with a natural [0, 1] range) in both backend and frontend, enabling better color-coding and interpretation of metric values. Several metrics are updated to explicitly declare their boundedness, and the frontend now fetches this metadata to drive the new heatmap's color mapping and annotations.
Visualization improvements:
- Added a new heatmap chart for comparing metrics across runs, replacing the radar chart. The heatmap uses backend-provided `bounded` and `maximize` metadata to color-code cells, making it easier to interpret metric performance.
- Improved model comparison table: metric and model names are now truncated with tooltips for readability.
Backend metric metadata:
- Introduced a new `BOUNDED` attribute to `BaseMetric` and its subclasses, indicating whether a metric is naturally in [0, 1]. This is now included in the metric metadata returned to the frontend.
- Updated several metrics to explicitly set `BOUNDED`, including `LogLoss`, `Bleu`, `Chrf`, and `Ter`, with appropriate comments on their value ranges.
Metric normalization and documentation:
- Modified `LogLoss` to normalize its output to [0, 1] by dividing by `log(num_classes)`, and updated its documentation to clarify the meaning of the normalized score.
- Minor documentation formatting fix in `cohen_kappa.py` for readability.
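As a rough illustration of how `bounded` and `maximize` metadata could drive the heatmap colors, the sketch below maps one metric's values across runs to a score in [0, 1], where 1 is best. The helper `color_scores` is hypothetical, and the min-max fallback for unbounded metrics is an assumption; the PR's frontend logic lives in JSX and may differ.

```python
from typing import List


def color_scores(values: List[float], bounded: bool, maximize: bool) -> List[float]:
    """Map one metric's values across runs to [0, 1], where 1 means best."""
    if bounded:
        # Bounded metrics are already on a [0, 1] scale; clamp defensively.
        scaled = [min(max(v, 0.0), 1.0) for v in values]
    else:
        # Assumed fallback: min-max scale across the runs being compared.
        lo, hi = min(values), max(values)
        span = hi - lo
        scaled = [0.5 if span == 0 else (v - lo) / span for v in values]
    # Flip "lower is better" metrics so a high score always means good.
    return scaled if maximize else [1.0 - s for s in scaled]


if __name__ == "__main__":
    print(color_scores([0.25, 0.75], bounded=True, maximize=True))   # accuracy-like
    print(color_scores([2.0, 4.0], bounded=False, maximize=False))   # error-like
```

One consequence of the fallback is that colors for unbounded metrics are only comparable within the set of runs shown, whereas bounded metrics keep the same color for the same value across comparisons.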