Support for loading files from Azure/ any other fsspec compatible file stores #795

Closed
neerajd12 opened this issue Dec 6, 2022 · 3 comments

Comments


neerajd12 commented Dec 6, 2022

Python version

('python=3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC '
'10.4.0]')
'os=Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35'
'numpy=1.23.4'

Code

MDF version

3.20

Code snippet 1

   import adlfs
   from asammdf import MDF

   fs = adlfs.AzureBlobFileSystem(account_name="account_name", sas_token="sas_token")
   MDF(fs)  # raises TypeError: a filesystem object is not a path

Traceback.

TypeError                                 Traceback (most recent call last)
Cell In [9], line 2
      1 file = fs.open('test/mdf/test.mdf', "rb")
----> 2 MDF(fs)

File /opt/conda/lib/python3.10/site-packages/asammdf/mdf.py:292, in MDF.__init__(self, name, version, channels, **kwargs)
    289     do_close = True
    291 else:
--> 292     name = original_name = Path(name)
    293     if not name.is_file() or not name.exists():
    294         raise MdfException(f'File "{name}" does not exist')

File /opt/conda/lib/python3.10/pathlib.py:960, in Path.__new__(cls, *args, **kwargs)
    958 if cls is Path:
    959     cls = WindowsPath if os.name == 'nt' else PosixPath
--> 960 self = cls._from_parts(args)
    961 if not self._flavour.is_supported:
    962     raise NotImplementedError("cannot instantiate %r on your system"
    963                               % (cls.__name__,))

File /opt/conda/lib/python3.10/pathlib.py:594, in PurePath._from_parts(cls, args)
    589 @classmethod
    590 def _from_parts(cls, args):
    591     # We need to call _parse_args on the instance, so as to get the
    592     # right flavour.
    593     self = object.__new__(cls)
--> 594     drv, root, parts = self._parse_args(args)
    595     self._drv = drv
    596     self._root = root

File /opt/conda/lib/python3.10/pathlib.py:578, in PurePath._parse_args(cls, args)
    576     parts += a._parts
    577 else:
--> 578     a = os.fspath(a)
    579     if isinstance(a, str):
    580         # Force-cast str subclasses to str (issue #21127)
    581         parts.append(str(a))

TypeError: expected str, bytes or os.PathLike object, not AzureBlobFileSystem

Code snippet 2

   import adlfs
   from asammdf import MDF

   fs = adlfs.AzureBlobFileSystem(account_name="account_name", sas_token="sas_token")
   file = fs.open('apitest/mdf/test.mdf', "rb")
   MDF(file)  # raises MdfException: AzureBlobFile is not a supported input type

Traceback.

---------------------------------------------------------------------------
MdfException                              Traceback (most recent call last)
Cell In [11], line 2
      1 file = fs.open('test/mdf/test.mdf', "rb")
----> 2 MDF(file)

File /opt/conda/lib/python3.10/site-packages/asammdf/mdf.py:265, in MDF.__init__(self, name, version, channels, **kwargs)
    262         do_close = True
    264     else:
--> 265         raise MdfException(
    266             f"{type(name)} is not supported as input for the MDF class"
    267         )
    269 elif isinstance(name, zipfile.ZipFile):
    271     archive = name

MdfException: <class 'adlfs.spec.AzureBlobFile'> is not supported as input for the MDF class

Description

The MDF class fails to identify files streamed from cloud stores as files. I've tested this with a file on Azure Blob Storage.

A simple fix that works on my fork of this repo (neerajd12@3bff61a) is to add the check below ahead of the BytesIO check in MDF.__init__:

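from fsspec.spec import AbstractBufferedFile

# Treat any fsspec buffered file as an already-open binary stream.
if isinstance(name, AbstractBufferedFile):
    original_name = None
    file_stream = name
    do_close = False
if isinstance(name, BytesIO):  # the BytesIO check follows the new one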

This works for any file system supported by fsspec.

Hope this helps anyone using Azure, AWS, etc.
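
For anyone on a release without this check, one possible workaround is to copy the remote file into an in-memory BytesIO first, since the snippet above shows that MDF already checks for BytesIO input. This is only a minimal sketch: `buffer_remote_file` is a hypothetical helper name, not part of asammdf or fsspec, and the BytesIO object below merely stands in for the handle returned by `fs.open(...)`.

```python
from io import BytesIO


def buffer_remote_file(file_obj, chunk_size=1024 * 1024):
    """Copy a binary file-like object into an in-memory BytesIO.

    Hypothetical helper for illustration; it only relies on read().
    """
    buffer = BytesIO()
    while True:
        chunk = file_obj.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        buffer.write(chunk)
    buffer.seek(0)  # rewind so the consumer reads from the start
    return buffer


# Stand-in for fs.open('test/mdf/test.mdf', "rb"); any object with read() works.
remote = BytesIO(b"MDF     example payload")
local = buffer_remote_file(remote, chunk_size=8)
```

With a real adlfs handle, the result could presumably be passed on as `MDF(buffer_remote_file(file))`. Note that this loads the whole file into memory, so it is likely unsuitable for very large measurements.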

@danielhrisca
Owner

@neerajd12 please try the development branch code

@neerajd12
Author

Thanks @danielhrisca, I tested with it and it seems to work.

@neerajd12
Author

@danielhrisca, is there any plan to release this work?
