[tune] experiment_analysis.py nan #9826

Closed
javadan opened this issue Jul 30, 2020 · 5 comments
Labels: bug, triage

javadan commented Jul 30, 2020

I have a case where I can't get an Analysis object for some experiment directories.

It looks for episode_reward_mean, but that metric is sometimes nan because it's an average.

When debugging, I end up in the code below when looking for the best config or best logdir.
idx = df[metric].idxmax() sets idx to nan.


    def _retrieve_rows(self, metric=None, mode=None):
        assert mode is None or mode in ["max", "min"]
        rows = {}
        for path, df in self.trial_dataframes.items():
            if mode == "max":
                idx = df[metric].idxmax()
            elif mode == "min":
                idx = df[metric].idxmin()
            else:
                idx = -1
            rows[path] = df.iloc[idx].to_dict()

        return rows



    def get_best_config(self, metric, mode="max"):
        """Retrieve the best config corresponding to the trial.

        Args:
            metric (str): Key for trial info to order on.
            mode (str): One of [min, max].
        """
        rows = self._retrieve_rows(metric=metric, mode=mode)
        if not rows:
            # only nans encountered when retrieving rows
            logger.warning("Not able to retrieve the best config for {} "
                           "according to the specified metric "
                           "(only nans encountered).".format(
                               self._experiment_dir))
            return None
        all_configs = self.get_all_configs()
        compare_op = max if mode == "max" else min
        best_path = compare_op(rows, key=lambda k: rows[k][metric])
        return all_configs[best_path]

Could _retrieve_rows return a default value like None, rather than throwing an exception on df.iloc[idx]?
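
For illustration, here is a minimal sketch of what a NaN-tolerant version of the row lookup could look like. This is not the fix from Ray itself; `retrieve_rows` is a hypothetical standalone function that takes the `trial_dataframes` dict as an argument, and it skips any trial whose metric column is entirely nan instead of raising.

```python
import pandas as pd


def retrieve_rows(trial_dataframes, metric=None, mode=None):
    """Hypothetical NaN-tolerant variant of _retrieve_rows (not Ray's code)."""
    assert mode is None or mode in ["max", "min"]
    rows = {}
    for path, df in trial_dataframes.items():
        if mode is None:
            # No ordering requested: take the last reported row.
            rows[path] = df.iloc[-1].to_dict()
            continue
        # Drop nan entries first so idxmax/idxmin never see an all-nan column.
        valid = df[metric].dropna()
        if valid.empty:
            # The metric is nan in every row: skip this trial.
            continue
        idx = valid.idxmax() if mode == "max" else valid.idxmin()
        # idxmax/idxmin return an index label, so look it up with .loc.
        rows[path] = df.loc[idx].to_dict()
    return rows


# Made-up example data: one usable trial and one all-nan trial.
frames = {
    "trial_a": pd.DataFrame({"episode_reward_mean": [float("nan"), 1.0, 3.0]}),
    "trial_b": pd.DataFrame({"episode_reward_mean": [float("nan"), float("nan")]}),
}
rows = retrieve_rows(frames, metric="episode_reward_mean", mode="max")
```

With this sketch, an experiment where every trial is all-nan yields an empty dict, which is the case the `if not rows:` branch in get_best_config already handles.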

amogkam (Contributor) commented Jul 30, 2020

@javadan which version of Ray is this? I believe this error was fixed in PR #9381, which will be included in the upcoming 0.8.7 release. In the meantime, you can install Ray from the latest wheels (https://docs.ray.io/en/master/installation.html#latest-snapshots-nightlies).

javadan (Author) commented Jul 30, 2020

Thanks, it's 0.8.6. I'll try the latest wheels now.

javadan (Author) commented Jul 30, 2020

Hmm, so I updated to ray-0.9.0.dev0 and am now getting an error in exploration.py and a few other files:
AttributeError: module 'gym.spaces' has no attribute 'Space'
I guess that's always a risk when running the latest snapshots.
Do you think I need to roll back to a specific gym version?
I think they must have changed their package layout at some point, from gym to gym.spaces or vice versa.
How would I check which version of gym RLlib expects?
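
One possible way to check this from Python is to read the installed package metadata with the standard library's `importlib.metadata` (Python 3.8+). The sketch below is only a starting point: the helper names are hypothetical, and whether the installed ray distribution actually declares a gym requirement in its metadata is an assumption that may not hold for every build.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def requirement_name(requirement):
    """Extract the lowercased project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1).lower() if match else ""


def declared_requirements(package, dependency):
    """List the requirement strings of `package` that name `dependency`.

    Returns None if `package` is not installed, and an empty list if it is
    installed but declares no such requirement.
    """
    try:
        requires = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return None
    return [r for r in requires if requirement_name(r) == dependency.lower()]


# Assumes the distributions are named "ray" and "gym" in the environment.
print("gym installed:", installed_version("gym"))
print("ray declares:", declared_requirements("ray", "gym"))
```

If ray lists no gym requirement, the second call returns an empty list, and the expected version would have to be looked up elsewhere, for example in the project's requirements files.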

amogkam (Contributor) commented Jul 30, 2020

Good question. I think downgrading your gym version should work? @ericl @sven1977 any advice here?

javadan (Author) commented Jul 30, 2020

I worked around it for now by globally replacing gym.spaces with gym.
Good enough for now. Thanks.

javadan closed this as completed Jul 30, 2020