Make running trials in timeline plot visible #731

nabenabe0928 · 2023-12-12T12:30:39Z

Contributor License Agreement

This repository (optuna-dashboard) and Goptuna share common code.
This pull request may therefore be ported to Goptuna.
Make sure that you understand the consequences concerning licenses, and check the box below if you accept the terms before creating this pull request.

I agree this patch may be ported to Goptuna by other Goptuna contributors.

Reference Issues/PRs

This PR makes running trials visible in the timeline plot.
Currently, running trials are drawn with a run duration of zero.
As a result, they do not appear in the timeline plot even though they exist.
This PR makes them appear in the plot.
In addition, I revised the code so that, when running trials were killed before they completed, the maximum time in the plot falls back to the latest completed/pruned/failed time.

What does this implement/fix? Explain your changes.

This PR makes the following changes:

Running trials now appear on the timeline plot.
The maximum of the x-axis is either the current date and time (if the running trials have not been running for too long) or the latest completion time.
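
The x-axis rule above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `TrialTimes` shape, the `pickAxisMax` name, and the `hasLiveRunning` flag (assumed to be computed elsewhere) are all hypothetical.

```typescript
// Hypothetical, trimmed-down trial shape used only for this sketch.
interface TrialTimes {
  datetime_start?: Date
  datetime_complete?: Date
}

// Choose the right edge of the x-axis.
// `hasLiveRunning` is assumed to be true when some RUNNING trial still looks alive.
function pickAxisMax(
  trials: TrialTimes[],
  hasLiveRunning: boolean,
  now: Date
): Date {
  const completeTimes: number[] = []
  for (const t of trials) {
    if (t.datetime_complete !== undefined) {
      completeTimes.push(t.datetime_complete.getTime())
    }
  }
  // Live running trials keep growing, so the axis must reach the current time.
  // With no finished trial at all, there is nothing else to anchor on either.
  if (hasLiveRunning || completeTimes.length === 0) {
    return now
  }
  // Otherwise clamp to the latest finish so that stale RUNNING trials
  // cannot stretch the axis.
  return new Date(Math.max(...completeTimes))
}
```

Passing `now` in as an argument, rather than calling `new Date()` inside, keeps the function deterministic and easy to test.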

codecov · 2023-12-12T12:35:18Z

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (c38d0d9) 67.20% compared to head (1c0477e) 68.22%.
Report is 64 commits behind head on main.

Files	Patch %	Lines
optuna_dashboard/_app.py	97.22%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #731      +/-   ##
==========================================
+ Coverage   67.20%   68.22%   +1.02%     
==========================================
  Files          35       35              
  Lines        2293     2329      +36     
==========================================
+ Hits         1541     1589      +48     
+ Misses        752      740      -12

The previous version could end up with a maxRunDuration of zero when there were waiting or failed trials but no complete or pruned trials. That made hasRunning evaluate to false, even though in this case we should be able to say that we have running trials. I fixed this issue in this commit.
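
To illustrate the edge case, here is a minimal sketch of one possible guard. It is not necessarily what the commit does: the `looksAlive` name and its signature are assumptions, and `factor = 5` only mirrors the `maxRunDuration * 5` heuristic that appears in the diff.

```typescript
// Decide whether a RUNNING trial still looks alive.
// All times are epoch milliseconds.
function looksAlive(
  startMs: number,
  nowMs: number,
  maxRunDurationMs: number,
  factor = 5
): boolean {
  // No complete or pruned trial yet: there is no reference duration, and
  // `elapsed < 0 * factor` could never be true, so treat the trial as alive.
  if (maxRunDurationMs <= 0) {
    return true
  }
  // Otherwise the trial counts as alive only while it has run for less than
  // `factor` times the longest finished run.
  return nowMs - startMs < maxRunDurationMs * factor
}
```

Without the first branch, a study whose trials are all waiting, failed, or running would always report that nothing is running.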

c-bata · 2023-12-13T01:14:16Z

@keisuke-umezawa Could you review this PR?

nabenabe0928 · 2023-12-13T06:16:00Z

I checked the behavior with the following script (open the dashboard while it is running to check that the green bar grows over time):

import time

import optuna


def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", -5, 5)
    time.sleep(30)
    return x


if __name__ == "__main__":
    study = optuna.create_study(storage="sqlite:///demo.db")
    study.optimize(objective, n_trials=100)

Another check (open the dashboard after the run finishes to confirm that, after a while, the green bar no longer squashes the other bars):

import numpy as np
import optuna
from optuna.distributions import CategoricalDistribution


def create_study_with_various_trials(storage: str) -> None:
    rng = np.random.RandomState(42)

    def _objective_intermediate_values(trial: optuna.Trial) -> float:
        x = trial.suggest_categorical("x", choices=["a", "b", "c"])
        rnd = rng.random()
        if rnd < 0.25:
            trial.report(trial.number, step=0)
            trial.report(trial.number + 1, step=1)
        elif rnd < 0.5:
            trial.report(trial.number, step=0)
            raise optuna.TrialPruned()
        elif rnd < 0.75:
            raise optuna.TrialPruned()
        else:
            raise ValueError("Unexpected Error.")

        return 0.0

    study = optuna.create_study(study_name="various-trials", storage=storage)
    study.optimize(_objective_intermediate_values, n_trials=40, catch=(Exception,))
    study.enqueue_trial({"x": "a"})
    study.ask({"x": CategoricalDistribution(["a", "b", "c"])})


if __name__ == "__main__":
    create_study_with_various_trials(storage="sqlite:///demo.db")

keisuke-umezawa · 2023-12-24T06:46:47Z

optuna_dashboard/ts/components/GraphTimeline.tsx

+      const start = t.datetime_start?.getTime() ?? new Date().getTime()
+      const now = new Date().getTime()
+      // This is an ad-hoc handling to check if the trial is running.
+      return now - start < maxRunDuration * 5


[question] Why do we need these lines for the ad-hoc handling?

To me, it could simply be const hasRunning = trials.some((t) => t.state === runningKey). Is there a reason you need these lines?

We could remove them, but then RUNNING trials that were not terminated properly become a problem (typically a preemption kill, which often happens in our internal cluster). Such trials are not really running, yet their bars would keep growing and squash the other trials until they are no longer visible in the timeline plot.
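
A rough worked example of the squashing effect, with made-up numbers (finished trials spanning 10 minutes and a preempted RUNNING trial that is 3 days old). The `finishedShare` helper is hypothetical and exists only for this illustration.

```typescript
// Fraction of the x-axis width left for the finished trials (0 to 1).
// All arguments are epoch milliseconds.
function finishedShare(
  firstStartMs: number,
  lastFinishMs: number,
  axisMaxMs: number
): number {
  const axisSpan = axisMaxMs - firstStartMs
  if (axisSpan <= 0) {
    return 1 // degenerate axis: nothing to squash
  }
  return Math.min(1, (lastFinishMs - firstStartMs) / axisSpan)
}

const MINUTE_MS = 60 * 1000
const DAY_MS = 24 * 60 * MINUTE_MS

// The stale RUNNING trial stretches the axis to "now", 3 days after the first start.
const squashedShare = finishedShare(0, 10 * MINUTE_MS, 3 * DAY_MS)
// Clamping the axis to the latest finish gives the finished trials the full width.
const clampedShare = finishedShare(0, 10 * MINUTE_MS, 10 * MINUTE_MS)
console.log(squashedShare, clampedShare)
```

With these numbers the finished trials get about 0.23% of the axis width when the stale trial is drawn up to the current time, and the full width once the axis is clamped.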

keisuke-umezawa

I left some questions and comments. Could you check them?

keisuke-umezawa · 2023-12-24T06:54:20Z

optuna_dashboard/ts/components/GraphTimeline.tsx

-    const completes = bars.map((b, i) => b.datetime_complete ?? starts[i])
+    const runDurations = bars.map((b) => {
+      const start = b.datetime_start?.getTime() ?? new Date().getTime()
+      const complete = b.datetime_complete?.getTime() ?? start


How about this, since we can use the current max time for running trials?

const complete = b.datetime_complete?.getTime() ?? maxDatetime.getTime()

The fallback to start is necessary for waiting trials.

keisuke-umezawa · 2023-12-24T06:56:29Z

optuna_dashboard/ts/components/GraphTimeline.tsx

+      return Math.max(
+        1,
+        !isRunning ? complete - start : maxDatetime.getTime() - start
+      )
+    })


[question] Why do we need Math.max here? If we fix the code as I suggested above, could these lines be just complete - start?

To me, it seems that we can use the following lines:

const runDurations = bars.map((b, i) => {
  const start = starts[i].getTime()
  const complete = b.datetime_complete?.getTime() ?? maxDatetime.getTime()
  return complete - start
})

The lower bound of 1 is necessary so that a bar is still recognizable when its run duration is 0.
Otherwise, I think your code is better.

Oh, actually, const complete = b.datetime_complete?.getTime() ?? maxDatetime.getTime() is incorrect when trialState === WAITING.
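
The points in this thread (the fallback to start for waiting trials, maxDatetime for running trials, and the lower bound of 1) can be combined into a minimal sketch like the one below. The `Bar` shape, the state strings, and the `runDurationMs` name are assumptions for illustration; the actual code in GraphTimeline.tsx may be organized differently.

```typescript
// Hypothetical, trimmed-down bar shape used only for this sketch.
type BarState = "Running" | "Waiting" | "Complete" | "Pruned" | "Fail"

interface Bar {
  state: BarState
  datetime_start?: Date
  datetime_complete?: Date
}

// Width of one timeline bar in milliseconds.
function runDurationMs(b: Bar, maxDatetime: Date, now: Date): number {
  // Waiting trials have no start time yet, so fall back to the current time.
  const start = b.datetime_start?.getTime() ?? now.getTime()
  // Without a completion time, fall back to start (zero width), which is the
  // right default for waiting trials.
  const complete = b.datetime_complete?.getTime() ?? start
  // Running trials extend to the right edge of the plot instead.
  const end = b.state === "Running" ? maxDatetime.getTime() : complete
  // The lower bound of 1 keeps zero-width bars recognizable.
  return Math.max(1, end - start)
}
```

Using maxDatetime as the fallback for every trial without a completion time would stretch waiting trials across the plot, which is why the fallback depends on the state.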

OK, I understand that you want to show waiting trials and, in some cases, running trials as well.

Optuna itself uses the current time as the completion time when a waiting or running trial does not have one.
ref: https://github.com/optuna/optuna/blob/master/optuna/visualization/_timeline.py#L85-L86
Are you planning to backport this to Optuna itself?

I did the backport here

keisuke-umezawa

Sorry, I approved this by mistake. I changed the review to "request changes" 🙇

keisuke-umezawa

Thank you for working on it!
I just added a nit comment. Once you fix it, you can merge this PR.
Btw, this introduces some differences from Optuna itself, so I recommend backporting it to Optuna.

keisuke-umezawa · 2023-12-25T07:24:07Z

optuna_dashboard/ts/components/GraphTimeline.tsx

+      return Math.max(
+        1,
+        !isRunning ? complete - start : maxDatetime.getTime() - start
+      )
+    })


OK, I understand that you want to show waiting trials and, in some cases, running trials as well.

Optuna itself uses the current time as the completion time when a waiting or running trial does not have one.
ref: https://github.com/optuna/optuna/blob/master/optuna/visualization/_timeline.py#L85-L86
Are you planning to backport this to Optuna itself?

nabenabe0928 · 2024-01-16T06:12:31Z

@keisuke-umezawa I made a backport to Optuna! Are there any other things I should do?

keisuke-umezawa · 2024-01-21T06:40:52Z

@nabenabe0928
The implementation on the Optuna side is not finalized yet, so there will be some differences between Optuna and optuna-dashboard later. But I trust that you will backport again from Optuna to optuna-dashboard, so I will merge this PR.

nabenabe0928 added 2 commits December 12, 2023 13:26

Make running trials in timeline plot visible

52c090a

Apply formatter

059f475

nabenabe0928 added 2 commits December 12, 2023 14:19

Refactor the code

656994f

nabenabe0928 force-pushed the bug-fix/make-running-trials-in-timeline-plot-visible branch from b8f9ae5 to 895e327 Compare December 12, 2023 13:41

Debug the maxDateTime

7a1f791

nabenabe0928 marked this pull request as ready for review December 12, 2023 14:10

c-bata assigned keisuke-umezawa Dec 13, 2023

keisuke-umezawa reviewed Dec 24, 2023

keisuke-umezawa approved these changes Dec 24, 2023

keisuke-umezawa requested changes Dec 24, 2023

nabenabe0928 added 2 commits December 24, 2023 08:48

Address some comments by umezawa

df16f73

Refactor based on umezawa's comment

ad25ad9

keisuke-umezawa approved these changes Dec 25, 2023

Change the waiting trials' start time

1c0477e

nabenabe0928 mentioned this pull request Dec 27, 2023

Backport the change of the timeline plot in Optuna Dashboard optuna/optuna#5168

Merged

nabenabe0928 requested a review from keisuke-umezawa January 15, 2024 09:05

keisuke-umezawa merged commit 6752b17 into optuna:main Jan 21, 2024
11 checks passed

nabenabe0928 mentioned this pull request Mar 4, 2024

Running trials not showing in the Timeline graph #820

Closed
