Change to ensure_directory() func in HyperSpy 1.6.1 breaks kikuchipy.io._io.save() #251

Closed
hakonanes opened this issue Nov 29, 2020 · 5 comments

hakonanes commented Nov 29, 2020

Edit: In HyperSpy 1.6.1, hyperspy.misc.io.tools.ensure_directory() (https://github.com/hyperspy/hyperspy/blob/RELEASE_next_minor/hyperspy/misc/io/tools.py#L80) changed from just creating the necessary parent directories to making the passed path itself a directory, even when it is a filename with an extension! This is the cause of all the IO tests failing.

For a reason I haven't identified, most IO tests fail on GitHub Actions' virtual machines (called runners). The tests pass locally. These tests are marked with @pytest.mark.xfail, so the test suite still reports green even though they fail.

They did not fail previously on Travis CI, and, without anything having changed, they also passed on GitHub Actions a couple of days ago.

I haven't found anything on the internet about this, so I am clueless.


I consider this not fixed until we release a patch or minor version.

@hakonanes hakonanes added maintenance This relates to package maintenance tests This relates to the tests labels Nov 29, 2020
@hakonanes hakonanes added this to the future milestone Nov 29, 2020
@hakonanes hakonanes self-assigned this Nov 29, 2020
@hakonanes
Member Author

The error message:

filename = PosixPath('/tmp/pytest-of-runner/pytest-0/test_load_manufacturer0/patterns_temp.h5')
signal = <EBSD, title: , dimensions: (3, 10|5, 5)>, add_scan = None
scan_number = 1, kwargs = {}, ver_signal = '0.3.dev0'
man_ver_dict = {'manufacturer': 'kikuchipy', 'version': '0.3.dev0'}, mode = 'w'

    def file_writer(
        filename: str,
        signal,
        add_scan: Optional[bool] = None,
        scan_number: int = 1,
        **kwargs,
    ):
        """Write an :class:`~kikuchipy.signals.EBSD` or
        :class:`~kikuchipy.signals.LazyEBSD` signal to an existing,
        but not open, or new h5ebsd file.
    
        Only writing to kikuchipy's h5ebsd format is supported.
    
        Parameters
        ----------
        filename
            Full path of HDF file.
        signal : kikuchipy.signals.EBSD or kikuchipy.signals.LazyEBSD
            Signal instance.
        add_scan
            Add signal to an existing, but not open, h5ebsd file. If it does
            not exist it is created and the signal is written to it.
        scan_number
            Scan number in name of HDF dataset when writing to an existing,
            but not open, h5ebsd file.
        kwargs
            Keyword arguments passed to :meth:`h5py:Group.require_dataset`.
        """
        # Set manufacturer and version to use in file
        from kikuchipy.release import version as ver_signal
    
        man_ver_dict = {"manufacturer": "kikuchipy", "version": ver_signal}
    
        # Open file in correct mode
        mode = "w"
        if os.path.isfile(filename) and add_scan:
            mode = "r+"
        try:
>           f = h5py.File(filename, mode=mode)

/home/runner/work/kikuchipy/kikuchipy/kikuchipy/io/plugins/h5ebsd.py:776: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <[AttributeError("'File' object has no attribute '_id'") raised in repr()] File object at 0x7f012c9a5790>
name = b'/tmp/pytest-of-runner/pytest-0/test_load_manufacturer0/patterns_temp.h5'
mode = 'w', driver = None, libver = None, userblock_size = None, swmr = False
rdcc_nslots = None, rdcc_nbytes = None, rdcc_w0 = None, track_order = False
kwds = {}, fapl = <h5py.h5p.PropFAID object at 0x7f012cba0630>

    def __init__(self, name, mode=None, driver=None,
                 libver=None, userblock_size=None, swmr=False,
                 rdcc_nslots=None, rdcc_nbytes=None, rdcc_w0=None,
                 track_order=None,
                 **kwds):
        """Create a new file object.
    
        See the h5py user guide for a detailed explanation of the options.
    
        name
            Name of the file on disk, or file-like object.  Note: for files
            created with the 'core' driver, HDF5 still requires this be
            non-empty.
        mode
            r        Readonly, file must exist
            r+       Read/write, file must exist
            w        Create file, truncate if exists
            w- or x  Create file, fail if exists
            a        Read/write if exists, create otherwise (default)
        driver
            Name of the driver to use.  Legal values are None (default,
            recommended), 'core', 'sec2', 'stdio', 'mpio'.
        libver
            Library version bounds.  Supported values: 'earliest', 'v108',
            'v110',  and 'latest'. The 'v108' and 'v110' options can only be
            specified with the HDF5 1.10.2 library or later.
        userblock
            Desired size of user block.  Only allowed when creating a new
            file (mode w, w- or x).
        swmr
            Open the file in SWMR read mode. Only used when mode = 'r'.
        rdcc_nbytes
            Total size of the raw data chunk cache in bytes. The default size
            is 1024**2 (1 MB) per dataset.
        rdcc_w0
            The chunk preemption policy for all datasets.  This must be
            between 0 and 1 inclusive and indicates the weighting according to
            which chunks which have been fully read or written are penalized
            when determining which chunks to flush from cache.  A value of 0
            means fully read or written chunks are treated no differently than
            other chunks (the preemption is strictly LRU) while a value of 1
            means fully read or written chunks are always preempted before
            other chunks.  If your application only reads or writes data once,
            this can be safely set to 1.  Otherwise, this should be set lower
            depending on how often you re-read or re-write the same data.  The
            default value is 0.75.
        rdcc_nslots
            The number of chunk slots in the raw data chunk cache for this
            file. Increasing this value reduces the number of cache collisions,
            but slightly increases the memory used. Due to the hashing
            strategy, this value should ideally be a prime number. As a rule of
            thumb, this value should be at least 10 times the number of chunks
            that can fit in rdcc_nbytes bytes. For maximum performance, this
            value should be set approximately 100 times that number of
            chunks. The default value is 521.
        track_order
            Track dataset/group/attribute creation order under root group
            if True. If None use global default h5.get_config().track_order.
        Additional keywords
            Passed on to the selected file driver.
    
        """
        if swmr and not swmr_support:
            raise ValueError("The SWMR feature is not available in this version of the HDF5 library")
    
        if isinstance(name, _objects.ObjectID):
            with phil:
                fid = h5i.get_file_id(name)
        else:
            if hasattr(name, 'read') and hasattr(name, 'seek'):
                if driver not in (None, 'fileobj'):
                    raise ValueError("Driver must be 'fileobj' for file-like object if specified.")
                driver = 'fileobj'
                if kwds.get('fileobj', name) != name:
                    raise ValueError("Invalid value of 'fileobj' argument; "
                                     "must equal to file-like object if specified.")
                kwds.update(fileobj=name)
                name = repr(name).encode('ASCII', 'replace')
            else:
                name = filename_encode(name)
    
            if track_order is None:
                track_order = h5.get_config().track_order
            if mode is None:
                mode = h5.get_config().default_file_mode
                if mode is None and os.environ.get('H5PY_DEFAULT_READONLY', ''):
                    mode = 'r'
    
            with phil:
                fapl = make_fapl(driver, libver, rdcc_nslots, rdcc_nbytes, rdcc_w0, **kwds)
>               fid = make_fid(name, mode, userblock_size,
                               fapl, fcpl=make_fcpl(track_order=track_order),
                               swmr=swmr)

/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/h5py/_hl/files.py:406: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

name = b'/tmp/pytest-of-runner/pytest-0/test_load_manufacturer0/patterns_temp.h5'
mode = 'w', userblock_size = None
fapl = <h5py.h5p.PropFAID object at 0x7f012cba0630>, fcpl = None, swmr = False

    def make_fid(name, mode, userblock_size, fapl, fcpl=None, swmr=False):
        """ Get a new FileID by opening or creating a file.
        Also validates mode argument."""
    
        if userblock_size is not None:
            if mode in ('r', 'r+'):
                raise ValueError("User block may only be specified "
                                 "when creating a file")
            try:
                userblock_size = int(userblock_size)
            except (TypeError, ValueError):
                raise ValueError("User block size must be an integer")
            if fcpl is None:
                fcpl = h5p.create(h5p.FILE_CREATE)
            fcpl.set_userblock(userblock_size)
    
        if mode == 'r':
            flags = h5f.ACC_RDONLY
            if swmr and swmr_support:
                flags |= h5f.ACC_SWMR_READ
            fid = h5f.open(name, flags, fapl=fapl)
        elif mode == 'r+':
            fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
        elif mode in ['w-', 'x']:
            fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
        elif mode == 'w':
>           fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)

/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/h5py/_hl/files.py:179: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???

h5py/_objects.pyx:54: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???

h5py/_objects.pyx:55: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   OSError: Unable to create file (unable to open file: name = '/tmp/pytest-of-runner/pytest-0/test_load_manufacturer0/patterns_temp.h5', errno = 21, error message = 'Is a directory', flags = 13, o_flags = 242)

h5py/h5f.pyx:108: OSError

During handling of the above exception, another exception occurred:

self = <kikuchipy.io.plugins.tests.test_h5ebsd.Testh5ebsd object at 0x7f012c942070>
tmp_path = PosixPath('/tmp/pytest-of-runner/pytest-0/test_load_manufacturer0')

    def test_load_manufacturer(self, tmp_path):
        file = tmp_path / "patterns_temp.h5"
        s = EBSD((255 * np.random.rand(10, 3, 5, 5)).astype(np.uint8))
>       s.save(file)

/home/runner/work/kikuchipy/kikuchipy/kikuchipy/io/plugins/tests/test_h5ebsd.py:126: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/runner/work/kikuchipy/kikuchipy/kikuchipy/signals/ebsd.py:1431: in save
    _save(filename, self, overwrite=overwrite, **kwargs)
/home/runner/work/kikuchipy/kikuchipy/kikuchipy/io/_io.py:386: in _save
    writer.file_writer(filename, signal, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

filename = PosixPath('/tmp/pytest-of-runner/pytest-0/test_load_manufacturer0/patterns_temp.h5')
signal = <EBSD, title: , dimensions: (3, 10|5, 5)>, add_scan = None
scan_number = 1, kwargs = {}, ver_signal = '0.3.dev0'
man_ver_dict = {'manufacturer': 'kikuchipy', 'version': '0.3.dev0'}, mode = 'w'

    def file_writer(
        filename: str,
        signal,
        add_scan: Optional[bool] = None,
        scan_number: int = 1,
        **kwargs,
    ):
        """Write an :class:`~kikuchipy.signals.EBSD` or
        :class:`~kikuchipy.signals.LazyEBSD` signal to an existing,
        but not open, or new h5ebsd file.
    
        Only writing to kikuchipy's h5ebsd format is supported.
    
        Parameters
        ----------
        filename
            Full path of HDF file.
        signal : kikuchipy.signals.EBSD or kikuchipy.signals.LazyEBSD
            Signal instance.
        add_scan
            Add signal to an existing, but not open, h5ebsd file. If it does
            not exist it is created and the signal is written to it.
        scan_number
            Scan number in name of HDF dataset when writing to an existing,
            but not open, h5ebsd file.
        kwargs
            Keyword arguments passed to :meth:`h5py:Group.require_dataset`.
        """
        # Set manufacturer and version to use in file
        from kikuchipy.release import version as ver_signal
    
        man_ver_dict = {"manufacturer": "kikuchipy", "version": ver_signal}
    
        # Open file in correct mode
        mode = "w"
        if os.path.isfile(filename) and add_scan:
            mode = "r+"
        try:
            f = h5py.File(filename, mode=mode)
        except OSError:
>           raise OSError("Cannot write to an already open file.")
E           OSError: Cannot write to an already open file.
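
The traceback above shows the underlying failure (errno 21, 'Is a directory') being replaced by the unrelated message "Cannot write to an already open file." The following is only a sketch of how a writer could keep the original cause in its message, not kikuchipy's actual fix. `open_for_writing` is a hypothetical name, and the built-in `open` stands in for `h5py.File` so the example runs without extra dependencies.

```python
import tempfile
from pathlib import Path


def open_for_writing(filename):
    """Hypothetical writer helper: open `filename` and keep the real cause on failure."""
    try:
        return open(filename, "wb")
    except OSError as err:
        # Include the original error text instead of guessing at the cause
        raise OSError(f"Cannot write to {filename}: {err}") from err


message = None
cause = None
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "patterns_temp.h5"
    target.mkdir()  # simulate the file path having been turned into a directory
    try:
        open_for_writing(target)
    except OSError as err:
        message = str(err)
        cause = err.__cause__

print(message)
```

With a message like this, the "Is a directory" part would have pointed at the directory creation rather than at an open file handle.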

@hakonanes
Member Author

Apparently, similar errors occur when building the docs as well: https://readthedocs.org/projects/kikuchipy/builds/12447541/

@hakonanes
Member Author

hakonanes commented Nov 29, 2020

So in HyperSpy 1.6.1, hyperspy.misc.io.tools.ensure_directory() changed from just creating the necessary parent directories to making the passed path itself a directory, even when it is a filename with an extension! This is the cause of all the IO tests failing.
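
The effect can be reproduced without HyperSpy or h5py. This is a minimal sketch using only the standard library. The two helpers are hypothetical stand-ins for the old and new behaviour as described in this issue, not HyperSpy's actual implementations, and `try_write` is likewise an illustrative name.

```python
import tempfile
from pathlib import Path


def ensure_parent_directory(path):
    """Hypothetical old-style helper: create only the parent directories."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)


def ensure_path_as_directory(path):
    """Hypothetical new-style helper: create the passed path itself as a directory."""
    Path(path).mkdir(parents=True, exist_ok=True)


def try_write(path):
    """Return None on success, or the OSError raised when opening `path` for writing."""
    try:
        with open(path, "w") as f:
            f.write("data")
    except OSError as err:
        return err
    return None


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "a" / "patterns_temp.h5"
    ensure_parent_directory(good)
    good_error = try_write(good)  # parent exists, so the file can be created

    bad = Path(tmp) / "b" / "patterns_temp.h5"
    ensure_path_as_directory(bad)
    bad_is_dir = bad.is_dir()  # the "file" path is now a directory
    bad_error = try_write(bad)  # opening a directory for writing fails

print(good_error, bad_is_dir, type(bad_error).__name__)
```

On Linux the second write fails with errno 21 ('Is a directory'), the same error h5py reports in the traceback; other platforms may raise a different OSError subclass.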

@hakonanes
Member Author

This means that one cannot save EBSD data sets when using HyperSpy 1.6.1, which was just released. Thus, we should rush to get v0.3 out!

@hakonanes hakonanes changed the title Some IO tests fail on GitHub Actions Change to ensure_directory() func in HyperSpy 1.6.1 breaks EBSD.save() Nov 29, 2020
@hakonanes hakonanes added bug Something isn't working and removed tests This relates to the tests maintenance This relates to package maintenance labels Nov 29, 2020
@hakonanes hakonanes modified the milestones: future, v0.3.0 Nov 29, 2020
@hakonanes hakonanes changed the title Change to ensure_directory() func in HyperSpy 1.6.1 breaks EBSD.save() Change to ensure_directory() func in HyperSpy 1.6.1 breaks kikuchipy.io._io.save() Nov 29, 2020
@hakonanes hakonanes pinned this issue Dec 17, 2020
@hakonanes
Member Author

hakonanes commented Jan 22, 2021

kikuchipy==0.3.0 (and 0.3.1) is now available via PyPI (and will be on conda-forge shortly), so this shouldn't be a problem anymore. I'll close this, but leave it pinned for a couple of weeks.

@hakonanes hakonanes unpinned this issue Feb 1, 2021