Skip to content

feat(timeseries): add idxmin / idxmax to TimeSeries#3115

Open
jbbqqf wants to merge 1 commit into
unit8co:masterfrom
jbbqqf:feat/2696-add-idxmin-idxmax
Open

feat(timeseries): add idxmin / idxmax to TimeSeries#3115
jbbqqf wants to merge 1 commit into
unit8co:masterfrom
jbbqqf:feat/2696-add-idxmin-idxmax

Conversation

@jbbqqf
Copy link
Copy Markdown

@jbbqqf jbbqqf commented May 9, 2026

Fixes #2696.

Summary

TimeSeries.min(axis=0) and .max(axis=0) return a single-row series whose timestamp is just the first entry of the original time index, not the timestamp of the actual minimum / maximum. As @eschibli pointed out in #2696, this is misleading — users may not read the min/max docstring before trusting the index. @madtoinou agreed in the same thread that the corresponding time-step is well-defined per component, even in the multivariate case.

This PR adds TimeSeries.idxmin() and TimeSeries.idxmax(), which return a pandas.Series indexed by component name with values being the time-index entry (Timestamp or int for RangeIndex-based series) at which each component attains its minimum / maximum — mirroring pandas.DataFrame.idxmin/idxmax. The change is purely additive (no break in existing behavior).

Other Information

Why a pd.Series and not a TimeSeries?
Each component can have a different argmin / argmax (this is exactly the multivariate dilemma raised in the issue thread). A single-row TimeSeries cannot represent that — it would have to pick one component's argmax index for the whole row. Returning a pd.Series (which is what users were already falling back to via series.pd_dataframe().idxmax()) sidesteps the problem cleanly and matches pandas semantics.

Stochastic series. For probabilistic series we reduce samples with the median first, so the returned index does not depend on n_samples.

min / max docstrings. Added a one-line note pointing readers to the new helpers when they want the actual argmin/argmax timestamp; the existing behaviour (and signature) is unchanged.

CHANGELOG. Updated under Unreleased > For users of the library > Improved.

Reproduce BEFORE/AFTER yourself (copy-paste)

# --- one-time setup ---
git clone https://github.com/unit8co/darts.git /tmp/darts-2696 && cd /tmp/darts-2696
uv sync --group dev

# --- BEFORE (origin/master) ---
git checkout origin/master
uv run python -c "
import pandas as pd
from darts import TimeSeries
idx = pd.date_range('2020-01-01', periods=5, freq='D')
ts = TimeSeries(times=idx, values=[3.0, 1.0, 4.0, 0.0, 2.0], components=['a'])
print('min(axis=0).time_index[0] =', ts.min(axis=0).time_index[0])
print('idxmin =', ts.idxmin())
"
# Expected: prints 2020-01-01 (misleading: actual min is at 2020-01-04),
#           then AttributeError: 'TimeSeries' object has no attribute 'idxmin'

# --- AFTER (this PR) ---
git fetch https://github.com/jbbqqf/darts.git feat/2696-add-idxmin-idxmax
git checkout FETCH_HEAD
uv run python -c "
import pandas as pd
from darts import TimeSeries
idx = pd.date_range('2020-01-01', periods=5, freq='D')
ts = TimeSeries(times=idx, values=[3.0, 1.0, 4.0, 0.0, 2.0], components=['a'])
print('min(axis=0).time_index[0] =', ts.min(axis=0).time_index[0])  # unchanged
print('idxmin =', ts.idxmin().to_dict())
"
# Expected: same first line, then idxmin = {'a': 2020-01-04} — the real argmin timestamp.

The two commands are identical except for the checked-out ref.

What I ran locally

  • uv run pytest darts/tests/test_timeseries.py::TestSimpleStatistics -v → 14 / 14 passed (10 pre-existing + 4 new).
  • uv run pytest darts/tests/test_timeseries.py -k 'min or max' → all green.
  • uv run ruff check darts/timeseries.py darts/tests/test_timeseries.py → all checks passed.
  • Re-ran test_idxmin_idxmax_univariate_datetime against origin/master (with the test cherry-picked but the implementation reverted) — fails with AttributeError: 'TimeSeries' object has no attribute 'idxmin'. The new tests therefore both fail on master and pass on this branch.

Edge cases tested

# Scenario Input Expected Verified by
1 Univariate, datetime index 5 floats with argmin=3, argmax=2 idxmin → 2020-01-04, idxmax → 2020-01-03 test_idxmin_idxmax_univariate_datetime
2 Multivariate, different argmin per component 2 components with disjoint min/max positions per-component values in returned pd.Series test_idxmin_idxmax_multivariate
3 RangeIndex time index Int-indexed series returned values are int, not Timestamp test_idxmin_idxmax_range_index
4 Stochastic series (50 samples around a target curve) argmin of the median curve at t=2 returns 2020-01-03 regardless of sample noise test_idxmin_idxmax_stochastic

Risk / blast radius

Purely additive — two new methods on TimeSeries, no change to min/max semantics, no signature changes elsewhere. The two min/max docstrings gain a .. note:: block but their behavior and return shape are identical to before.


PR drafted with assistance from Claude Code (Anthropic). Implementation, regression tests, and the BEFORE/AFTER reproducer above were reviewed manually against unit8co/darts@master and the discussion in #2696. The reproducer block was used during development; reviewers can paste it verbatim to verify.

`TimeSeries.min(axis=0)` and `.max(axis=0)` return a single-row series
whose timestamp is just the *first* entry of the original time index,
not the timestamp of the actual min / max value. The original issue
documents how this is misleading; the maintainer has agreed (unit8co#2696)
that adding `idxmin` / `idxmax` is the cleanest fix.

This commit:
- Adds `TimeSeries.idxmin()` and `TimeSeries.idxmax()` returning a
  `pandas.Series` indexed by component name with values being the
  time-index entries (Timestamp or int) at which each component attains
  its min / max — mirroring `pandas.DataFrame.idxmin/idxmax` semantics.
  This sidesteps the multivariate dilemma flagged in the issue (each
  component can have a different argmin/argmax) cleanly and is purely
  additive (no breaking change).
- For stochastic series we reduce samples with the median first so the
  returned index does not depend on `n_samples`.
- Adds a one-line note to `min()` / `max()` docstrings pointing readers
  to the new helpers.
- Adds 4 regression tests covering the univariate, multivariate,
  RangeIndex, and stochastic cases. The univariate test fails on
  master (`AttributeError: 'TimeSeries' object has no attribute
  'idxmin'`) and passes on this branch.
- CHANGELOG entry under Unreleased > Improved.

Refs: unit8co#2696

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jbbqqf jbbqqf requested a review from dennisbader as a code owner May 9, 2026 22:32
Copy link
Copy Markdown
Collaborator

@jakubchlapek jakubchlapek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jbbqqf, thanks for this PR.
The code looks fine, just had a few documentation changes I left in the comments. I also debated adding the axis parameter to the idxmin/max methods, but can't see a real usecase in which we wouldn't use multi series instead. You can let me know what you think.

Comment thread darts/timeseries.py
Comment on lines +4767 to +4771
.. note::
With ``axis=0`` the returned timestamp is the first entry of the
original ``time_index`` and **does not** correspond to the
timestamp of the actual minimum value. Use :func:`idxmin` to get
the timestamp at which each component attains its minimum.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd say this is redudnant with the rest of the docstring as the behavior is already mentioned above (`If we reduce over time (axis=0), the series will have length one and will use the first entry of the
original ``time_index```)

Comment thread darts/timeseries.py
Comment on lines +4802 to +4807
.. note::
With ``axis=0`` the returned timestamp is the first entry of the
original ``time_index`` and **does not** correspond to the
timestamp of the actual maximum value. Use :func:`idxmax` to get
the timestamp at which each component attains its maximum.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same case as above for min

Comment thread darts/timeseries.py
Comment on lines +4833 to +4838

Useful as a companion to :func:`min` because ``min(axis=0)`` returns a
single-row series whose timestamp is the *first* time index entry of
the original series, not the entry of the actual minimum (see
`issue #2696 <https://github.com/unit8co/darts/issues/2696>`_).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd say this is unnecesssary

Comment thread darts/timeseries.py
Comment on lines +4875 to +4880

Useful as a companion to :func:`max` because ``max(axis=0)`` returns a
single-row series whose timestamp is the *first* time index entry of
the original series, not the entry of the actual maximum (see
`issue #2696 <https://github.com/unit8co/darts/issues/2696>`_).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Comment thread darts/timeseries.py
Comment on lines +4857 to +4859
# Reduce samples first so the result is independent of stochasticity;
# using the median (rather than mean) keeps the returned index value
# an actual observed value when n_samples == 1.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary, already mentioned in the docstring

Comment thread CHANGELOG.md

**Improved**

- Added `TimeSeries.idxmin()` and `TimeSeries.idxmax()`, returning a `pandas.Series` (indexed by component) of the time index value at which each component attains its minimum / maximum. Also clarified the docstrings of `min()` / `max()` to point users to these new helpers when they want the actual argmin/argmax timestamp. Closes [#2696](https://github.com/unit8co/darts/issues/2696).
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Added `TimeSeries.idxmin()` and `TimeSeries.idxmax()`, returning a `pandas.Series` (indexed by component) of the time index value at which each component attains its minimum / maximum. Also clarified the docstrings of `min()` / `max()` to point users to these new helpers when they want the actual argmin/argmax timestamp. Closes [#2696](https://github.com/unit8co/darts/issues/2696).
- Added `TimeSeries.idxmin()` and `TimeSeries.idxmax()`, returning a `pandas.Series` (indexed by component) of the time index value at which each component attains its minimum / maximum. Closes [#3115](https://github.com/unit8co/darts/pull/3115).

Comment on lines +3300 to +3358
def test_idxmin_idxmax_univariate_datetime(self):
# univariate, deterministic, datetime index — covers issue #2696
# where TimeSeries.min(axis=0) returns the first timestamp instead of
# the timestamp of the actual minimum.
idx = pd.date_range("2020-01-01", periods=5, freq="D")
values = np.array([3.0, 1.0, 4.0, 0.0, 2.0])
ts = TimeSeries(times=idx, values=values, components=["a"])

# idxmin / idxmax return a pd.Series indexed by component name.
idxmin = ts.idxmin()
idxmax = ts.idxmax()
assert list(idxmin.index) == ["a"]
assert list(idxmax.index) == ["a"]
# The actual minimum is at position 3 (2020-01-04) and the maximum at
# position 2 (2020-01-03). This is the load-bearing assertion: it
# fails on master where users had to fall back to pd_dataframe()
# because TimeSeries provided no idx{min,max}.
assert idxmin["a"] == pd.Timestamp("2020-01-04")
assert idxmax["a"] == pd.Timestamp("2020-01-03")

def test_idxmin_idxmax_multivariate(self):
# different argmin/argmax per component
idx = pd.date_range("2020-01-01", periods=3, freq="D")
values = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
ts = TimeSeries(times=idx, values=values, components=["a", "b"])

idxmin = ts.idxmin()
idxmax = ts.idxmax()
# argmin returns first occurrence of the minimum, mirroring numpy.
assert idxmin["a"] == pd.Timestamp("2020-01-02")
assert idxmin["b"] == pd.Timestamp("2020-01-01")
assert idxmax["a"] == pd.Timestamp("2020-01-01")
assert idxmax["b"] == pd.Timestamp("2020-01-03")

def test_idxmin_idxmax_range_index(self):
# RangeIndex-based series should return integer indices.
values = np.array([5.0, 3.0, 8.0, 1.0])
ts = TimeSeries(
times=pd.RangeIndex(start=10, stop=14, step=1),
values=values,
components=["x"],
)
assert ts.idxmin()["x"] == 13
assert ts.idxmax()["x"] == 12

def test_idxmin_idxmax_stochastic(self):
# For a stochastic series we reduce samples with the median first so
# the answer does not depend on how many samples were drawn.
rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=4, freq="D")
# Component "a" has its median minimum at t=2.
median_target = np.array([5.0, 3.0, 1.0, 2.0])
values = np.stack(
[median_target + rng.normal(0, 0.01, size=4) for _ in range(50)],
axis=-1,
)[:, None, :]
ts = TimeSeries(times=idx, values=values, components=["a"])
assert ts.idxmin()["a"] == pd.Timestamp("2020-01-03")

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified the tests a bit :)

Suggested change
def test_idxmin_idxmax_univariate_datetime(self):
# univariate, deterministic, datetime index — covers issue #2696
# where TimeSeries.min(axis=0) returns the first timestamp instead of
# the timestamp of the actual minimum.
idx = pd.date_range("2020-01-01", periods=5, freq="D")
values = np.array([3.0, 1.0, 4.0, 0.0, 2.0])
ts = TimeSeries(times=idx, values=values, components=["a"])
# idxmin / idxmax return a pd.Series indexed by component name.
idxmin = ts.idxmin()
idxmax = ts.idxmax()
assert list(idxmin.index) == ["a"]
assert list(idxmax.index) == ["a"]
# The actual minimum is at position 3 (2020-01-04) and the maximum at
# position 2 (2020-01-03). This is the load-bearing assertion: it
# fails on master where users had to fall back to pd_dataframe()
# because TimeSeries provided no idx{min,max}.
assert idxmin["a"] == pd.Timestamp("2020-01-04")
assert idxmax["a"] == pd.Timestamp("2020-01-03")
def test_idxmin_idxmax_multivariate(self):
# different argmin/argmax per component
idx = pd.date_range("2020-01-01", periods=3, freq="D")
values = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
ts = TimeSeries(times=idx, values=values, components=["a", "b"])
idxmin = ts.idxmin()
idxmax = ts.idxmax()
# argmin returns first occurrence of the minimum, mirroring numpy.
assert idxmin["a"] == pd.Timestamp("2020-01-02")
assert idxmin["b"] == pd.Timestamp("2020-01-01")
assert idxmax["a"] == pd.Timestamp("2020-01-01")
assert idxmax["b"] == pd.Timestamp("2020-01-03")
def test_idxmin_idxmax_range_index(self):
# RangeIndex-based series should return integer indices.
values = np.array([5.0, 3.0, 8.0, 1.0])
ts = TimeSeries(
times=pd.RangeIndex(start=10, stop=14, step=1),
values=values,
components=["x"],
)
assert ts.idxmin()["x"] == 13
assert ts.idxmax()["x"] == 12
def test_idxmin_idxmax_stochastic(self):
# For a stochastic series we reduce samples with the median first so
# the answer does not depend on how many samples were drawn.
rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=4, freq="D")
# Component "a" has its median minimum at t=2.
median_target = np.array([5.0, 3.0, 1.0, 2.0])
values = np.stack(
[median_target + rng.normal(0, 0.01, size=4) for _ in range(50)],
axis=-1,
)[:, None, :]
ts = TimeSeries(times=idx, values=values, components=["a"])
assert ts.idxmin()["a"] == pd.Timestamp("2020-01-03")
def test_idxmin_idxmax_univariate_datetime(self):
idx = pd.date_range("2020-01-01", periods=5, freq="D")
values = np.array([3.0, 1.0, 4.0, 0.0, 2.0])
ts = TimeSeries(times=idx, values=values, components=["a"])
idxmin = ts.idxmin()
idxmax = ts.idxmax()
assert list(idxmin.index) == ["a"]
assert list(idxmax.index) == ["a"]
# minimum at position 3 (2020-01-04), maximum at position 2 (2020-01-03)
assert idxmin["a"] == pd.Timestamp("2020-01-04")
assert idxmax["a"] == pd.Timestamp("2020-01-03")
def test_idxmin_idxmax_multivariate(self):
idx = pd.date_range("2020-01-01", periods=3, freq="D")
values = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
ts = TimeSeries(times=idx, values=values, components=["a", "b"])
idxmin = ts.idxmin()
idxmax = ts.idxmax()
# first occurrence of the minimum, mirroring numpy
assert idxmin["a"] == pd.Timestamp("2020-01-02")
assert idxmin["b"] == pd.Timestamp("2020-01-01")
assert idxmax["a"] == pd.Timestamp("2020-01-01")
assert idxmax["b"] == pd.Timestamp("2020-01-03")
def test_idxmin_idxmax_range_index(self):
values = np.array([5.0, 3.0, 8.0, 1.0])
ts = TimeSeries(
times=pd.RangeIndex(start=10, stop=14, step=1),
values=values,
components=["x"],
)
assert ts.idxmin()["x"] == 13
assert ts.idxmax()["x"] == 12
def test_idxmin_idxmax_stochastic(self):
idx = pd.date_range("2020-01-01", periods=3, freq="D")
# sample 0 min at t=1, sample 1 min at t=2 — medians [3.0, 2.0, 1.5], min at t=2
values = np.array([[[2.0, 4.0]], [[1.0, 3.0]], [[3.0, 0.0]]]) # (3, 1, 2)
ts = TimeSeries(times=idx, values=values, components=["a"])
assert ts.idxmin()["a"] == pd.Timestamp("2020-01-03")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TimeSeries.max(axis=0), .min(axis=0) return a meaningless index

2 participants