AttributeError: '_io.TextIOWrapper' object has no attribute 'startswith' #529

Closed
gawbul opened this issue Feb 7, 2021 · 4 comments
Labels
question: Further information is requested

@gawbul

gawbul commented Feb 7, 2021

Getting the following error when I'm trying to use a package that depends on fsspec:

INFO:root:<_io.TextIOWrapper name='/Users/stephenmoss/.astropy/cache/download/url/141581d04d4001254d07601dfa7d983b/contents' encoding='UTF-8'>
Traceback (most recent call last):
  File "openomics_test.py", line 34, in <module>
    gencode = GENCODE(path="ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/",
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/openomics/database/sequence.py", line 67, in __init__
    super(GENCODE, self).__init__(path=path, file_resources=file_resources, col_rename=col_rename,
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/openomics/database/sequence.py", line 17, in __init__
    super(SequenceDataset, self).__init__(**kwargs)
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/openomics/database/base.py", line 39, in __init__
    self.data = self.load_dataframe(file_resources, npartitions=npartitions)
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/openomics/database/sequence.py", line 74, in load_dataframe
    df = read_gtf(file_resources[gtf_file], npartitions=npartitions)
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/openomics/utils/read_gtf.py", line 349, in read_gtf
    result_df = parse_gtf_and_expand_attributes(
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/openomics/utils/read_gtf.py", line 290, in parse_gtf_and_expand_attributes
    result = parse_gtf(
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/openomics/utils/read_gtf.py", line 195, in parse_gtf
    chunk_iterator = dd.read_table(
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/dask/dataframe/io/csv.py", line 659, in read
    return read_pandas(
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/dask/dataframe/io/csv.py", line 464, in read_pandas
    paths = get_fs_token_paths(urlpath, mode="rb", storage_options=storage_options)[
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/fsspec/core.py", line 619, in get_fs_token_paths
    path = cls._strip_protocol(urlpath)
  File "/Users/stephenmoss/.pyenv/versions/3.8.7/lib/python3.8/site-packages/fsspec/implementations/local.py", line 148, in _strip_protocol
    if path.startswith("file://"):
AttributeError: '_io.TextIOWrapper' object has no attribute 'startswith'

@andersy005
Collaborator

@gawbul, it appears that your script is passing a stream wrapped in io.TextIOWrapper as the path to fsspec, whereas fsspec expects the path to be a string. Could you provide a reproducible example or share more details about the contents of openomics_test.py?
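
For illustration, here is a minimal sketch of that failure mode using only the standard library. `strip_protocol` is a hypothetical stand-in for the string check shown at the bottom of the traceback, not fsspec's actual implementation, and the temporary file is made up for the example:

```python
import os
import tempfile


def strip_protocol(path):
    # Hypothetical stand-in: like the check in the traceback, it assumes `path` is a str.
    if path.startswith("file://"):
        path = path[len("file://"):]
    return path


# Create a small throwaway file so there is something to open.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("chr1\n")
    name = tmp.name

# A path string works as intended.
stripped = strip_protocol("file://" + name)

# An open text handle is an io.TextIOWrapper, which has no str methods.
with open(name) as handle:
    try:
        strip_protocol(handle)
        error_message = None
    except AttributeError as exc:
        error_message = str(exc)

os.remove(name)
print(error_message)
```

Passing the file name rather than the open handle avoids the error in this sketch.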

andersy005 added the "question" (Further information is requested) label Feb 7, 2021
@gawbul
Author

gawbul commented Feb 7, 2021

Hi, @andersy005 👋

Thanks for the swift reply 🙏

Yeah, I was just thinking the same myself. I'm doing a code review of the OpenOmics package and trying to get to the bottom of things at present. I think this is likely an issue with the package or one of its dependencies.

I'll let you know how I get on, but I might be able to close this. I'm just doing some debugging to understand where the path is being set to a stream object.

I think it is somewhere in here https://github.com/BioMeCIS-Lab/OpenOmics/blob/master/openomics/database/base.py.

@gawbul
Author

gawbul commented Feb 7, 2021

Yeah, it appears to be here https://github.com/BioMeCIS-Lab/OpenOmics/blob/master/openomics/database/base.py#L89-L90.

As per https://docs.python.org/3/library/gzip.html:

For text mode, a GzipFile object is created, and wrapped in an io.TextIOWrapper instance with the specified encoding, error handling behavior, and line ending(s).

I think the code was expecting a standard file object to be returned, but in text mode the gzip library returns an io.TextIOWrapper, as quoted above, and that stream then ends up being passed along as the path.
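
A quick standard-library sketch of the behaviour described in the gzip docs, for reference. The file name and contents are made up for the example, and keeping the path string around (rather than the opened handle) is only one possible way to satisfy a string-based API, not necessarily how OpenOmics should fix it:

```python
import gzip
import io
import os
import tempfile

# Write a tiny gzip-compressed text file to a temporary directory.
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "example.gtf.gz")
with gzip.open(gz_path, "wt", encoding="utf-8") as f:
    f.write("chr1\tHAVANA\tgene\n")

# Text mode: the GzipFile is wrapped in an io.TextIOWrapper.
with gzip.open(gz_path, "rt", encoding="utf-8") as text_handle:
    is_text_wrapper = isinstance(text_handle, io.TextIOWrapper)
    has_startswith = hasattr(text_handle, "startswith")
    first_line = text_handle.readline()

# Binary mode: a plain gzip.GzipFile comes back instead.
with gzip.open(gz_path, "rb") as binary_handle:
    is_gzip_file = isinstance(binary_handle, gzip.GzipFile)

# Neither handle is a str; the path itself is what a path-based API can use.
path_is_str = isinstance(gz_path, str)

print(is_text_wrapper, has_startswith, is_gzip_file, path_is_str)
```

Either kind of handle is a stream rather than a string, so anything that calls string methods on it will fail in the same way as the traceback above.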

@gawbul
Author

gawbul commented Feb 7, 2021

Happy to close. Thanks for your help 👍
