Fix "ResourceWarning: unclosed file" when running tests with Python 3 #3410

Open
ogrisel opened this Issue Jul 17, 2014 · 31 comments

7 participants
Owner

ogrisel commented Jul 17, 2014

Running the tests highlights that we do not close file objects explicitly in many places in the code. In Python 3.4+ this raises warnings such as:

scikit-learn/sklearn/datasets/base.py:1: ResourceWarning: unclosed file [...]

This is probably a matter of wrapping the code that manipulates such recently opened file objects in a with statement, that is, replacing:

f = open(some_filename)
# do something with f

by:

with open(some_filename) as f:
    # do something with f

Run the tests with nosetests -v sklearn to find them all.
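
As a minimal sketch of the two patterns above, the following compares a function that never closes its file with one that uses a with statement. The function names and the temporary file are hypothetical and only serve the illustration; they are not taken from the scikit-learn code base.

```python
import os
import tempfile


def read_without_closing(path):
    # Old pattern: the file object is never closed explicitly, so Python 3
    # may report "ResourceWarning: unclosed file" when it is finalized.
    f = open(path)
    return f.read()


def read_with_context_manager(path):
    # New pattern: the with statement closes the file when the block exits,
    # even if an exception is raised inside it.
    with open(path) as f:
        content = f.read()
    # f.closed is True here although close() was never called by hand.
    return content, f.closed


# Hypothetical temporary file, created only for this demonstration.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("some data\n")

print(read_with_context_manager(demo_path))
os.remove(demo_path)
```

Both functions return the same content; the difference is only in when the underlying file handle is released.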

Hello, I'm an absolute beginner. The issue looks simple to me and something a beginner like me could fix.

Could someone please explain the "running the tests" part?

Sorry for being a n00b, I'm just getting started!

Owner

MechCoder commented Jul 17, 2014

Hi, thanks for looking into this. Nose helps us test our software, so you need to install nose first: https://nose.readthedocs.org/en/latest/

After doing that, as @ogrisel has said, run nosetests -v sklearn in Python 3 (I, at least, am unable to reproduce this in Python 2). The -v flag makes the output verbose, so you can see which tests are being run.

Hey, I ran nosetests.

Ran 1164 tests in 160.732s
FAILED (SKIP=14, errors=2)

Errors:

1. ERROR: sklearn.cluster.bicluster.tests.test_utils.test_get_submatrix
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/lib/python2.7/dist-packages/sklearn/cluster/bicluster/tests/test_utils.py", line 43, in test_get_submatrix
    assert_true(np.all(X != -1))
  File "/usr/lib/python2.7/unittest/case.py", line 422, in assertTrue
    if not expr:
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/base.py", line 183, in __bool__
    raise ValueError("The truth value of an array with more than one "
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().


2. ERROR: sklearn.cluster.tests.test_spectral.test_spectral_lobpcg_mode
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/lib/python2.7/dist-packages/sklearn/cluster/tests/test_spectral.py", line 66, in test_spectral_lobpcg_mode
    random_state=0, eigen_solver="lobpcg")
  File "/usr/lib/python2.7/dist-packages/sklearn/cluster/spectral.py", line 268, in spectral_clustering
    eigen_tol=eigen_tol, drop_first=False)
  File "/usr/lib/python2.7/dist-packages/sklearn/manifold/spectral_embedding_.py", line 303, in spectral_embedding
    largest=False, maxiter=2000)
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/lobpcg/lobpcg.py", line 411, in lobpcg
    activeBlockVectorBP, retInvR=True)
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/lobpcg/lobpcg.py", line 148, in b_orthonormalize
    gramVBV = sla.cholesky(gramVBV)
  File "/usr/lib/python2.7/dist-packages/scipy/linalg/decomp_cholesky.py", line 81, in cholesky
    check_finite=check_finite)
  File "/usr/lib/python2.7/dist-packages/scipy/linalg/decomp_cholesky.py", line 30, in _cholesky
    raise LinAlgError("%d-th leading minor not positive definite" % info)
LinAlgError: 2-th leading minor not positive definite
Owner

ogrisel commented Jul 18, 2014

@dhawaljoh this is completely unrelated to this issue, please open a new issue to discuss failing tests there instead. Also please use triple-backticks or <pre> blocks to quote error messages in markdown formatted comments / issue descriptions. Otherwise it's not readable. I fixed your comment for you.

Thanks, @ogrisel. I'll make sure I know my markdown before posting again!

I tried running:

nosetests -v --py3where=PY3WHERE sklearn

That gave me the same output as in my previous comment.

How do I run the test so as to detect the unclosed file warnings? I read the help for nosetests, but, I'm not too sure.

@MechCoder could you help me out as to how to proceed?

Member

lesteve commented Jul 19, 2014

Just looking at your traceback above, it seems that you are running Python 2.7, whereas this issue is specific to Python 3 (more precisely, ResourceWarnings are only emitted by Python 3), so you probably want to set up a Python 3 environment.

When you run the tests with Python 3, you will see warnings like these:

ResourceWarning: unclosed file <_io.TextIOWrapper name='/home/lesteve/python/scikit-learn/sklearn/datasets/descr/boston_house_prices.rst' mode='r' encoding='UTF-8'>.

and

ResourceWarning: unclosed file <_io.BufferedReader name='/home/lesteve/python/scikit-learn/sklearn/datasets/tests/data/svmlight_classification.txt'>

and the goal is to remove all of them.
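
To check a single piece of code without running the whole test suite, something along these lines may help. It is a rough sketch that assumes CPython (where an unreferenced file object is finalized promptly) and relies on the fact that ResourceWarning is ignored by the default warning filters, so it has to be enabled explicitly. The helper name count_resource_warnings and the two demo functions are hypothetical.

```python
import gc
import os
import tempfile
import warnings


def count_resource_warnings(func, *args):
    """Call func(*args) and return the number of ResourceWarnings recorded."""
    gc.collect()  # flush garbage that predates this call
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default, so enable every warning here.
        warnings.simplefilter("always")
        func(*args)
        gc.collect()  # finalize file objects that are no longer referenced
    return sum(issubclass(w.category, ResourceWarning) for w in caught)


def leaky(path):
    f = open(path)  # never closed explicitly
    f.read()


def clean(path):
    with open(path) as f:  # closed when the block exits
        f.read()


# Hypothetical temporary file, created only for this demonstration.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
print("leaky:", count_resource_warnings(leaky, demo_path))
print("clean:", count_resource_warnings(clean, demo_path))
os.remove(demo_path)
```

On other Python implementations the finalization timing can differ, so the count for the leaky function is not guaranteed there.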

Owner

ogrisel commented Jul 19, 2014

@dhawaljoh First install Python 3.3 or 3.4, either from your Linux distribution's packages if they are recent enough, by building from source, or from a PPA such as https://launchpad.net/~fkrull/+archive/ubuntu/deadsnakes for your version of Ubuntu.

Then install nosetests for your Python 3 interpreter:

$ /path/to/python3 -m pip install nose

If you don't have pip, get it by running get-pip.py with your python3 interpreter.

This should install a nosetests-3.3 or nosetests-3.4 command in the same bin folder where python3 is installed. Run the tests with that command.

Owner

ogrisel commented Jul 19, 2014

To find the version and architecture of a Linux system, you can use:

$ lsb_release -a
$ uname -a

I broke dependencies and lost the GUI on my Ubuntu. I've migrated to Windows for now; I'll get back to this as soon as I can.

Hello again,
I have a virtual env set up and have nosetests installed.
I have also installed sklearn, but when I do:

nosetests -v sklearn

I get an error:

ERROR: Failure: ImportError (No module named 'sklearn')

Though I can use sklearn when not in virtualenv.

Owner

MechCoder commented Aug 7, 2014

What did you do to install scikit-learn?

sudo apt-get install python-sklearn

and on the virtual env -

sudo pip3 install -U scikit-learn
Owner

larsmans commented Aug 11, 2014

Install nose in the virtualenv.

@larsmans, I have.

nosetests -V

returns:

nosetests version 1.3.3
Owner

ogrisel commented Aug 11, 2014

@dhawaljoh Make sure that the nosetests command found in your PATH uses the Python from your virtualenv. You need to re-install nose in your virtualenv and use the nosetests command that comes with it. In particular, as I already mentioned in this previous comment, the nosetests command installed for a Python 3 interpreter tends to have an explicit suffix that matches the version of Python, e.g. nosetests-3.4.

@ogrisel, I installed nosetests by specifying the path to Python3 on my virtualenv and now have a nosetests-3.4 file in the same directory as Python3 on the virtualenv.

I'm confused. My understanding is that since I want the nosetests to use python3, I should be executing nosetests-3.4 -v sklearn from my virtualenv. Right?

But when I do it, it gives me the same error as before, i.e:
ERROR: Failure: ImportError (No module named 'sklearn')

So, I opened up the python interpreter and keyed in import sklearn which throws an error: ImportError: No module named 'sklearn'

So, I did: sudo apt-get install python-sklearn

But that tells me: python-sklearn is already the newest version.

I've uninstalled sklearn and reinstalled, but in vain.

Owner

ogrisel commented Aug 12, 2014

You need to use the pip command from your venv to install both sklearn and nose in the same venv, and then use the nosetests command from that venv.

Do pip show nose and pip show scikit-learn with the pip command from your venv to see which version is installed and where.

Installing stuff in the global Python is useless if you want to use the venv. Alternatively you can install everything in your Python 3 home directly (without using any venv) as I already detailed in this comment. Use the absolute path to each command (pip, python, nosetests) to make sure that you are using the one you expect and not another unrelated version that happens to be in your PATH.

Owner

ogrisel commented Aug 12, 2014

In a Linux terminal, you can find the path of the executable of a command by running:

which nosetests-3.4

and:

which pip3

and:

which python3

If you want to use virtualenvs, you need to make sure that the venv is properly activated and that all 3 which commands return files from within your venv and not from your system python.

Installing sklearn using apt-get will only install it on your system python, not in your venv. You need to install sklearn with the pip3 from your venv to install it in your venv.

To run the tests on sklearn from your venv you need to run the nosetests-3.4 command from the same venv.
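
The same check can be sketched from inside Python, which shows what the interpreter itself resolves. The helper name locate and the report keys are hypothetical; the calls used (sys.executable, sys.prefix, importlib.util.find_spec, shutil.which) are from the standard library. Run it with the python3 from your venv.

```python
import importlib.util
import shutil
import sys


def locate(module_names, command_names):
    """Report where this interpreter finds the given modules and commands."""
    report = {"python": sys.executable, "prefix": sys.prefix}
    for name in module_names:
        # find_spec returns None when a top-level module cannot be imported.
        spec = importlib.util.find_spec(name)
        report["module:" + name] = spec.origin if spec else None
    for name in command_names:
        # shutil.which mirrors the shell's `which`: first match in PATH or None.
        report["command:" + name] = shutil.which(name)
    return report


for key, value in locate(["sklearn", "nose"], ["nosetests-3.4", "pip3"]).items():
    print(key, "->", value)
```

If a module shows None, or a path points outside the venv directory, that package or command is not the one installed in the venv.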

Owner

ogrisel commented Aug 12, 2014

You should read the virtualenv user documentation.

Thank you so much for your patience, @ogrisel and everybody else.
I didn't have python3-dev on my venv, so my scipy hadn't installed correctly. I got it running.

So, on running nosetests-3.4 -v sklearn

I get:

Ran 3328 tests in 179.922s

OK (SKIP=19)

No warnings were raised.

Bump?

Owner

MechCoder commented Aug 24, 2014

No warnings were raised.

Are you sure? I followed the exact instructions that @ogrisel gave. I get warnings like

Doctest: sklearn.datasets.base.load_boston ... /home/manoj/scikit-learn/sklearn/datasets/base.py:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/home/manoj/scikit-learn/sklearn/datasets/data/boston_house_prices.csv' mode='r' encoding='UTF-8'>

in between.
Try redirecting the console output to a text file, and then just search for the resource warnings.
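
Once the output is saved, the search can be sketched in a few lines of Python. The regular expression and the function name unclosed_files are assumptions based on the warning format quoted in this thread; adjust them if your output looks different.

```python
import re

# Matches the warning format quoted above and captures the file name.
PATTERN = re.compile(r"ResourceWarning: unclosed file <[^>]*name='([^']+)'")


def unclosed_files(log_text):
    """Return the sorted, de-duplicated file names reported as unclosed."""
    return sorted(set(PATTERN.findall(log_text)))


# Sample text copied from the warnings quoted earlier in this thread.
sample = (
    "ResourceWarning: unclosed file <_io.TextIOWrapper "
    "name='/home/lesteve/python/scikit-learn/sklearn/datasets/descr/"
    "boston_house_prices.rst' mode='r' encoding='UTF-8'>\n"
    "ResourceWarning: unclosed file <_io.BufferedReader "
    "name='/home/lesteve/python/scikit-learn/sklearn/datasets/tests/data/"
    "svmlight_classification.txt'>\n"
)

for name in unclosed_files(sample):
    print(name)
```

To use it on a real run, read the saved log into a string and pass it to the function; note that the warning names the file that was left open, while the line before "ResourceWarning" in the output points at the code location.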

Owner

MechCoder commented Aug 24, 2014

The warnings appear in the middle of the output, not at the end, if that is what you are confused about.

Umm, I got this working, finally!

I follow the instructions on this page to make the required changes, right?

Owner

MechCoder commented Aug 29, 2014

yes!

Contributor

calvingiles commented Aug 31, 2014

I am going to have a look at this at the sklearn sprint

Contributor

calvingiles commented Aug 31, 2014

I created a pull request. Not sure why it didn't appear here. #3612.

Contributor

calvingiles commented Aug 31, 2014

Workaround for PIL warnings suggested in python-pillow/Pillow#835

@ogrisel ogrisel closed this in dd9e512 Aug 31, 2014

ogrisel added a commit that referenced this issue Aug 31, 2014

Merge pull request #3612 from calvingiles/fix-resource-warnings
[MRG+1] Fixed ResourceWarnings from inside scikit-learn. Fixes #3410.

@ogrisel ogrisel reopened this Aug 31, 2014

Owner

ogrisel commented Aug 31, 2014

Reopened to track the resolution of the PIL related part.

yarikoptic added a commit to yarikoptic/scikit-learn that referenced this issue Sep 8, 2014

Merge tag '0.15.2' into releases
Release 0.15.2

* tag '0.15.2': (95 commits)
  FIX appveyor.yml path
  Release 0.15.2
  DOC Fixes in 0.15.2
  ENH upload Windows wheels to rackspace
  FIX heisenfailure on 32 bit python + speedup
  FIX heisenfailure in test_lasso_lars_path_length
  FIX Windows CI: use prebuilt numpy / scipy
  utils.testing: add assert_greater_equal and assert_less_equal
  FIX define CC and CXX for travis
  FIX update sklearn.__all__ to include all end-user submodules
  Changed f=open() to with open() as f to eliminate ResourceWarnings. Fixes #3410.
  fix_gnb_proba
  ENH: Remove unused copy in k-means
  ENH: minor speed-up in k-means
  DOC: more explicit title
  FIX: Fix the imputation example
  Repeated word: 'the the' -> 'the'
  DOC: Added np.sqrt since default is 'squared=False'
  Update gaussian_process.py
  DOC deprecation of **kwargs in neighbors per 0.18
  ...

yarikoptic added a commit to yarikoptic/scikit-learn that referenced this issue Sep 8, 2014

Merge branch 'releases' into dfsg
* releases: (95 commits)
  FIX appveyor.yml path
  Release 0.15.2
  DOC fixes in 0.15.2
  ENH upload Windows wheels to rackspace
  FIX heisenfailure on 32 bit python + speedup
  FIX heisenfailure in test_lasso_lars_path_length
  FIX Windows CI: use prebuilt numpy / scipy
  utils.testing: add assert_greater_equal and assert_less_equal
  FIX define CC and CXX for travis
  FIX update sklearn.__all__ to include all end-user submodules
  Changed f=open() to with open() as f to eliminate ResourceWarnings. Fixes #3410.
  fix_gnb_proba
  ENH: Remove unused copy in k-means
  ENH: minor speed-up in k-means
  DOC: more explicit title
  FIX: Fix the imputation example
  Repeated word: 'the the' -> 'the'
  DOC: Added np.sqrt since default is 'squared=False'
  Update gaussian_process.py
  DOC deprecation of **kwargs in neighbors per 0.18
  ...

Conflicts:
	sklearn/externals/joblib/__init__.py
	sklearn/externals/joblib/numpy_pickle.py
	sklearn/externals/joblib/parallel.py
	sklearn/externals/joblib/pool.py

yarikoptic added a commit to yarikoptic/scikit-learn that referenced this issue Sep 8, 2014

Merge branch 'dfsg' into debian
* dfsg: (95 commits)
  FIX appveyor.yml path
  Release 0.15.2
  DOC fixes in 0.15.2
  ENH upload Windows wheels to rackspace
  FIX heisenfailure on 32 bit python + speedup
  FIX heisenfailure in test_lasso_lars_path_length
  FIX Windows CI: use prebuilt numpy / scipy
  utils.testing: add assert_greater_equal and assert_less_equal
  FIX define CC and CXX for travis
  FIX update sklearn.__all__ to include all end-user submodules
  Changed f=open() to with open() as f to eliminate ResourceWarnings. Fixes #3410.
  fix_gnb_proba
  ENH: Remove unused copy in k-means
  ENH: minor speed-up in k-means
  DOC: more explicit title
  FIX: Fix the imputation example
  Repeated word: 'the the' -> 'the'
  DOC: Added np.sqrt since default is 'squared=False'
  Update gaussian_process.py
  DOC deprecation of **kwargs in neighbors per 0.18
  ...

kashif added a commit to kashif/scikit-learn that referenced this issue Sep 12, 2014

IssamLaradji added a commit to IssamLaradji/scikit-learn that referenced this issue Oct 13, 2014

IssamLaradji added a commit to IssamLaradji/scikit-learn that referenced this issue Oct 13, 2014

Merge pull request #3612 from calvingiles/fix-resource-warnings
[MRG+1] Fixed ResourceWarnings from inside scikit-learn. Fixes #3410.

SaurabhJha added a commit to SaurabhJha/scikit-learn that referenced this issue Oct 15, 2014

IssamLaradji added a commit to IssamLaradji/scikit-learn that referenced this issue Nov 22, 2014

hugovk commented Sep 23, 2017 edited

The Pillow warnings mentioned in python-pillow/Pillow#835 no longer occur with the latest Pillow release.
