Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds support for file-handlers and file-type in Map and TimeSeries #5193

Closed
wants to merge 6 commits into from

Conversation

devanshshukla99
Copy link
Contributor

@devanshshukla99 devanshshukla99 commented Apr 7, 2021

Description

Fixes #2690

Adds support for file-object and file-type in sunpy.map.Map & sunpy.timeseries.TimeSeries.

Currently, the support for sunpy.timeseries.TimeSeries is broken

@pep8speaks
Copy link

pep8speaks commented Apr 7, 2021

Hello @devanshshukla99! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers!

Comment last updated at 2021-07-24 08:00:42 UTC

@nabobalis nabobalis added map Affects the map submodule No Backport A PR that isn't to be backported to any release branch. (To be used as a flag to other maintainers) timeseries Affects the timeseries submodule labels Apr 7, 2021
@devanshshukla99 devanshshukla99 marked this pull request as ready for review April 8, 2021 07:13
@devanshshukla99 devanshshukla99 requested review from a team as code owners April 8, 2021 07:13
@nabobalis nabobalis removed request for a team April 8, 2021 10:28
changelog/5193.doc.rst Outdated Show resolved Hide resolved
changelog/5193.trivial.rst Outdated Show resolved Hide resolved
changelog/5193.feature.rst Outdated Show resolved Hide resolved
sunpy/io/ana.py Outdated Show resolved Hide resolved
sunpy/io/ana.py Outdated Show resolved Hide resolved
sunpy/map/map_factory.py Outdated Show resolved Hide resolved
sunpy/map/map_factory.py Outdated Show resolved Hide resolved
sunpy/map/map_factory.py Outdated Show resolved Hide resolved
sunpy/timeseries/tests/test_timeseries_factory.py Outdated Show resolved Hide resolved
sunpy/timeseries/timeseries_factory.py Outdated Show resolved Hide resolved
@devanshshukla99 devanshshukla99 marked this pull request as draft April 8, 2021 11:32
@devanshshukla99
Copy link
Contributor Author

@nabobalis I can't find why test py37-figure-devdeps is failing even though tox is reporting no bug, could you please help me regarding this?

@nabobalis
Copy link
Contributor

The hash has changed. You can ignore it.

@devanshshukla99 devanshshukla99 marked this pull request as ready for review April 10, 2021 11:04
@Cadair Cadair self-requested a review April 14, 2021 16:57
sunpy/io/ana.py Outdated Show resolved Hide resolved
sunpy/map/map_factory.py Outdated Show resolved Hide resolved
@nabobalis nabobalis added the Needs Review Needs reviews before merge label Apr 20, 2021
@nabobalis nabobalis removed the request for review from Cadair May 10, 2021 19:50
sunpy/io/ana.py Outdated Show resolved Hide resolved
sunpy/map/mapbase.py Outdated Show resolved Hide resolved
sunpy/io/file_tools.py Outdated Show resolved Hide resolved
@devanshshukla99 devanshshukla99 force-pushed the fix-2690 branch 2 times, most recently from db7ee52 to 5782674 Compare May 11, 2021 21:44
@devanshshukla99
Copy link
Contributor Author

Sure add a test for detect_filetype.

@nabobalis is there any hdf5 test file in sunpy?

@nabobalis
Copy link
Contributor

Sure add a test for detect_filetype.

@nabobalis is there any hdf5 test file in sunpy?

https://github.com/sunpy/sunpy/blob/main/sunpy/data/test/goes_truncated_test_goes15.nc
and
https://github.com/sunpy/sunpy/blob/main/sunpy/data/test/goes_truncated_test_goes16.nc
should work

@nabobalis
Copy link
Contributor

So just to double check, we want any places where the code cannot use an open file handler to error instead of use the filename and then open the file.

@devanshshukla99
Copy link
Contributor Author

devanshshukla99 commented May 20, 2021

As discussed during the dev meeting, it'd be good to support (and test!) file-like objects that do not actually correspond to a real on-disk file. For example, socket.socket.makefile() returns a file-like object for a socket.

So, if I understand correctly, we want to support socket file-obj in sunpy.io.read_file for reading remote files?

If so, don't we'll still have to download the file locally for processing considering socket file-obj aren't seekable?

@devanshshukla99 devanshshukla99 force-pushed the fix-2690 branch 5 times, most recently from 8b762c8 to 2b27902 Compare June 1, 2021 20:48
@Cadair Cadair self-requested a review July 7, 2021 16:46
@@ -98,6 +102,11 @@ def test_patterns(self):
assert isinstance(maps, list)
assert ([isinstance(amap, sunpy.map.GenericMap) for amap in maps])

# File-like object
with open(a_fname, "rb") as fd:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you pass random binary data to map?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it returns a ValueError

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test for this in this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not currently, should I add one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you could.

@@ -59,7 +61,7 @@ def read_file(filepath, filetype=None, **kwargs):

Parameters
----------
filepath : path-like
filepath : `str` or file-like
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a pathlib.Path not work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no it works,
will update👍

@devanshshukla99 devanshshukla99 marked this pull request as draft July 24, 2021 09:05
@nabobalis
Copy link
Contributor

As discussed during the dev meeting, it'd be good to support (and test!) file-like objects that do not actually correspond to a real on-disk file. For example, socket.socket.makefile() returns a file-like object for a socket.

Any work on this? Or we could open an issue to support this in future?

@devanshshukla99
Copy link
Contributor Author

As discussed during the dev meeting, it'd be good to support (and test!) file-like objects that do not actually correspond to a real on-disk file. For example, socket.socket.makefile() returns a file-like object for a socket.

Any work on this? Or we could open an issue to support this in future?

Nothing till now.

So, if I understand correctly, we'll have to read the socket fileobj then pass the data to appropriate readers?
(considering the file-object returned by socket is not seekable)

@nabobalis
Copy link
Contributor

So, if I understand correctly, we'll have to read the socket fileobj then pass the data to appropriate readers?
(considering the file-object returned by socket is not seekable)

Sure, his use case was makefile which returns a file-like handler.

@devanshshukla99
Copy link
Contributor Author

So, if I understand correctly, we'll have to read the socket fileobj then pass the data to appropriate readers?
(considering the file-object returned by socket is not seekable)

Sure, his use case was makefile which returns a file-like handler.

Looks doable!
Will do👍

@nabobalis nabobalis marked this pull request as ready for review August 25, 2021 16:23
@nabobalis
Copy link
Contributor

nabobalis commented Sep 15, 2021

I think we can bump the socket request from Albert to an issue and leave this PR as it for review.

@devanshshukla99, could you rebase this PR for us?

@nabobalis
Copy link
Contributor

Hi @devanshshukla99, sorry for the lack of reviews on this PR.

I am trying to get someone to have a look but could you also rebase this if you have time? If not, I can do so.

@devanshshukla99
Copy link
Contributor Author

Hi @devanshshukla99, sorry for the lack of reviews on this PR.

I am trying to get someone to have a look but could you also rebase this if you have time? If not, I can do so.

Apologies from my side too
Will do 👍🏻

Copy link
Member

@Cadair Cadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work on this @devanshshukla99, sorry it took so long for me to review it. I think it's close, I just think we need to remove the last bit of name checking stuff for file handlers.

@@ -44,6 +45,9 @@ def read(filename, debug=False, **kwargs):
--------
>>> data = sunpy.io.ana.read(filename) # doctest: +SKIP
"""
if isinstance(filename, io.IOBase):
raise TypeError("Reader does not support file-handler")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise TypeError("Reader does not support file-handler")
raise TypeError("The ANA reader does not support file handles.")

@@ -26,6 +27,9 @@ def read(filepath, **kwargs):
pairs : `list`
A list of (data, header) tuples.
"""
if isinstance(filepath, io.IOBase):
raise TypeError("Reader does not support file-handler")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise TypeError("Reader does not support file-handler")
raise TypeError("The ANA reader does not support file handles.")

@@ -77,19 +79,24 @@ def read_file(filepath, filetype=None, **kwargs):
-----
Other keyword arguments are passed to the reader used.
"""
filepath = str(pathlib.Path(filepath))
_filepath = filepath.name if isinstance(filepath, io.IOBase) else filepath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all objects which implement io.IOBase have a .name attribute:

>>> import io
>>> isinstance(io.BytesIO(), io.IOBase)
True
>>> io.BytesIO().name
AttributeError: '_io.BytesIO' object has no attribute 'name'

I don't think we should do this at all when dealing with io objects, I think it would be better to work out the type from the magic bytes at the start of the file and not any reference to the filename. (Which already happens if this filename doesn't match anything I think).

@@ -44,6 +44,15 @@ def test_read_file_fits(self):
assert all([isinstance(p[1],
sunpy.io.header.FileHeader) for p in pairs])

# Test file-object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a test here to check it works as expected with other file objects, something like this:

with open(AIA_171_IMAGE, 'rb') as fd:
    fits_bytes = fd.read()

sunpy.io.read_file(io.BytesIO(fits_bytes))
...

Comment on lines +84 to +85
eitmap = sunpy.map.Map(a_fname, filetype=pathlib.Path(a_fname).suffix[1:])
assert isinstance(eitmap, sunpy.map.GenericMap)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You aren't really testing this properly here as you are giving it a filename which it can detect the type for. You would be better off giving it a random filename and the type.

(Although it will also use magic bytes to detect file type if the filename doesn't pan out so I am not sure exactly how to make this test fail.)

@dstansby dstansby removed the Needs Review Needs reviews before merge label Jan 19, 2022
@nabobalis nabobalis marked this pull request as draft January 24, 2022 21:38
@dstansby dstansby added the Needs Adoption PRs that were abandoned but should be picked up again and merged in label Jan 25, 2022
@nabobalis nabobalis closed this May 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
map Affects the map submodule Needs Adoption PRs that were abandoned but should be picked up again and merged in No Backport A PR that isn't to be backported to any release branch. (To be used as a flag to other maintainers) timeseries Affects the timeseries submodule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make sure that Map and TimeSeries are able to take file handlers and a filetype
6 participants