Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ImportError: cannot import name makedirs from sklearn.utils.fixes #574

Closed
gykovacs opened this issue Jun 4, 2019 · 9 comments · Fixed by #576
Closed

ImportError: cannot import name makedirs from sklearn.utils.fixes #574

gykovacs opened this issue Jun 4, 2019 · 9 comments · Fixed by #576

Comments

@gykovacs
Copy link

@gykovacs gykovacs commented Jun 4, 2019

Description

importing imblearn.datasets with sklearn 0.21.2 causes ImportError in both Python 3.5, 3.6, 3.7

Steps/Code to Reproduce

Example:

import imblearn.datasets

Expected Results

No error is thrown.

Actual Results

Traceback (most recent call last):
File "", line 1, in
File "/home/gykovacs/anaconda3/envs/sv/lib/python3.7/site-packages/imblearn/datasets/init.py", line 8, in
from ._zenodo import fetch_datasets
File "/home/gykovacs/anaconda3/envs/sv/lib/python3.7/site-packages/imblearn/datasets/_zenodo.py", line 60, in
from sklearn.utils.fixes import makedirs
ImportError: cannot import name 'makedirs' from 'sklearn.utils.fixes' (/home/gykovacs/anaconda3/envs/sv/lib/python3.7/site-packages/sklearn/utils/fixes.py)

Versions

Linux-5.0.0-15-generic-x86_64-with-debian-buster-sid
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0]
NumPy 1.16.4
SciPy 1.3.0
Scikit-Learn 0.21.2
Imbalanced-Learn 0.4.3

@hayesall

This comment has been minimized.

Copy link
Contributor

@hayesall hayesall commented Jun 4, 2019

Good find, I just reproduced it. It looks like the problem lines are here:

if download_if_missing and not available:
makedirs(zenodo_dir, exist_ok=True)

I didn't find what sklearn.utils.fixes.makedirs did, or when it was removed. But it looks like it's doing something similar to pathlib.Path.mkdir, e.g. zenodo_dir.mkdir(exist_ok=True)

edit:

Found the signature removed in scikit-learn:2bd87f6e, copied below:

if 'exist_ok' in signature(os.makedirs).parameters:
    makedirs = os.makedirs
else:
    def makedirs(name, mode=0o777, exist_ok=False):
        """makedirs(name [, mode=0o777][, exist_ok=False])
        Super-mkdir; create a leaf directory and all intermediate ones.  Works
        like mkdir, except that any intermediate path segment (not just the
        rightmost) will be created if it does not exist. If the target
        directory already exists, raise an OSError if exist_ok is False.
        Otherwise no exception is raised.  This is recursive.
        """

        try:
            os.makedirs(name, mode=mode)
        except OSError as e:
            if (not exist_ok or e.errno != errno.EEXIST
                    or not os.path.isdir(name)):
                raise
@gykovacs

This comment has been minimized.

Copy link
Author

@gykovacs gykovacs commented Jun 5, 2019

Shouldn't this piece of code define makedirs either way? I mean either as os.makedirs or as the function in the else branch?

@chkoar

This comment has been minimized.

Copy link
Member

@chkoar chkoar commented Jun 5, 2019

Good catch.

Shouldn't this piece of code define makedirs either way? I mean either as os.makedirs or as the function in the else branch?

No. The code that posted by @hayesall used to be in scikit-learn. I suppose that this was a trick to make a unified makedirs function that will work across Python versions. Since it does not exist anymore this line fails. CIs should probably catch this bug as we test against the scikit-learn's master.

Also, in the README we state that imbalanced-learn 0.4 is the last version to support Python 2.7. So, we could either use the Path class from the pathlib or use directly the os.makedirs function.

@gykovacs

This comment has been minimized.

Copy link
Author

@gykovacs gykovacs commented Jun 5, 2019

Yeah, sorry, I overlooked @hayesall's comment. Would it make sense then to just replace
from sklearn.utils.fixes import makedirs by import os.makedirs as makedirs?

@chkoar

This comment has been minimized.

Copy link
Member

@chkoar chkoar commented Jun 5, 2019

I believe that would make the CI to pass the tests.

@gykovacs

This comment has been minimized.

Copy link
Author

@gykovacs gykovacs commented Jun 5, 2019

As far as I see, the tests are passing now, even though scikit-learn 0.21.2 comes from both the pypi and conda repositories. Regarding TravisCI, I see that sklearn 0.20 is wired in the config file, however, I could not find any wired version for CircleCI. Wouldn't it make sense to let the CI jobs work with most recent versions?

@chkoar

This comment has been minimized.

Copy link
Member

@chkoar chkoar commented Jun 5, 2019

@gykovacs well, I believe that it is debatable in what versions we should support. It is true that it is hard coded to use 0.20. Although, we test against the upstream master allowing failures, though. It make sense to test against the latest version of scikit-learn but this might break people code that will upgrade imbalanced-learn and not scikit-learn As I can see CircleCI should use the latest release of scikit-learn.

@gykovacs

This comment has been minimized.

Copy link
Author

@gykovacs gykovacs commented Jun 5, 2019

Agree, this is a strategic question. However, in fact, if someone installs sklearn and imblearn from PyPi, as of now, this bug will be present.

One important thing to highlight is that imblearn pulls the latest sklearn, and does not support it: if someone installs only imblearn from PyPi, it won't work with the sklearn it pulls as a dependency.

Another option would be to wire the support of sklearn 0.20 as a dependency.

I am still wondering why CircleCI passes if it pulls the lastest sklearn.

@gykovacs

This comment has been minimized.

Copy link
Author

@gykovacs gykovacs commented Jun 5, 2019

I just checked, the CircleCI does not pass, so it works fine. The issue is shown up in the following feature branch testing:
https://ci.appveyor.com/project/glemaitre/imbalanced-learn/build/job/t44c3dnibfx8jo7a

UPDATE:
The last time CircleCI passed was a month ago:
https://ci.appveyor.com/project/glemaitre/imbalanced-learn/history

So basically, as of now, the master does not pass because of the sklearn update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.