Automatically detect different legacy file formats #10

Closed
TomNicholas opened this issue Mar 8, 2024 · 4 comments
Labels
Kerchunk Relating to the kerchunk library / specification itself

Comments

@TomNicholas
Collaborator

We should design the backends so that the user doesn't have to specifically state which type of file they are trying to open. I believe pangeo-forge has some logic for detecting the file format automatically.

Original kerchunk gripe here: fsspec/kerchunk#376

@TomNicholas added the Kerchunk label on Mar 8, 2024
@TomNicholas
Collaborator Author

The closest thing pangeo-forge has to this is here. That code abstracts over the different kerchunk openers, but I don't think it automatically detects the correct one to use.

@TomNicholas TomNicholas mentioned this issue Mar 10, 2024
15 tasks
@TomNicholas
Collaborator Author

This was partially closed by #43, which can auto-detect the difference between netCDF3 and netCDF4 at least.

@TomNicholas
Collaborator Author

Copying Ryan's suggestion here so we don't forget to try it:

You can do this without opening the file with netCDF at all. Both file types have a "magic" at the beginning of the file.

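# FileType is assumed to be an enum defined elsewhere in the project,
# with at least the members `netcdf3` and `hdf5`.
def guess_file_type(fp) -> FileType:
    # Read the leading magic bytes, then rewind so the caller can reuse fp.
    magic = fp.read(4)
    fp.seek(0)
    if magic[:3] == b"CDF":
        # Classic netCDF files start with b"CDF" followed by a version byte.
        return FileType.netcdf3
    elif magic == b"\x89HDF":
        # HDF5 files (including netCDF4) start with b"\x89HDF\r\n\x1a\n".
        return FileType.hdf5
    else:
        raise ValueError(f"Unknown file type - magic {magic}")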
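
For anyone wanting to try the suggestion in isolation, here is a minimal self-contained sketch. The `FileType` enum below is a hypothetical stand-in for whatever the project defines, and `HDF5_SIGNATURE` is a name introduced here for illustration. Unlike the snippet above, it compares against the full 8-byte HDF5 signature and restores the caller's original file position rather than always rewinding to 0. It assumes a seekable binary file object and does not handle HDF5 files whose signature sits after a user block (at offset 512, 1024, and so on).

```python
import io
from enum import Enum, auto
from typing import BinaryIO


class FileType(Enum):
    # Hypothetical stand-in for the project's own enum.
    netcdf3 = auto()
    hdf5 = auto()


# Full 8-byte HDF5 format signature (name chosen for this sketch).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def guess_file_type(fp: BinaryIO) -> FileType:
    """Classify a seekable binary file object by its leading bytes."""
    start = fp.tell()
    fp.seek(0)
    magic = fp.read(8)
    fp.seek(start)  # restore the caller's position
    if magic[:3] == b"CDF":
        return FileType.netcdf3
    if magic == HDF5_SIGNATURE:
        return FileType.hdf5
    raise ValueError(f"Unknown file type - magic {magic!r}")


# In-memory examples; real use would pass a file opened in "rb" mode.
nc3 = io.BytesIO(b"CDF\x01" + b"\x00" * 28)
h5 = io.BytesIO(HDF5_SIGNATURE + b"\x00" * 24)
print(guess_file_type(nc3), guess_file_type(h5))
```

Because netCDF4 files are HDF5 files on disk, the `hdf5` branch covers both; telling them apart would need more than the magic bytes.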

@TomNicholas
Collaborator Author

I think this was closed by #143
