wfdb.io.dl_database raises 404 error for folder containing numerics only #175

Closed
tompollard opened this issue Jul 8, 2019 · 4 comments

@tompollard
Member

Attempting to download the mimic3wdb/30 folder with wfdb.io.dl_database raises a "resource not found" error for: http://physionet.org/physiobank/database/mimic3wdb/30/3000031/3000031.hea

This error was reported at: MIT-LCP/physionet#114. It seems to be an issue with WFDB rather than the dataset itself.

From a quick look, I assume the problem has something to do with the fact that there are no waveforms in the folder (just numerics). Steps to reproduce:

import wfdb

x = 'mimic3wdb/30/'
y = '/Users/tompollard/sand'
wfdb.io.dl_database(x,y)

Error:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-12-4c6624b7e581> in <module>
----> 1 wfdb.io.dl_database(x,y)

/usr/local/lib/python3.7/site-packages/wfdb/io/record.py in dl_database(db_dir, dl_dir, records, annotators, keep_subdirs, overwrite)
   1631             allfiles.append(rec+'.hea')
   1632             dir_name, baserecname = os.path.split(rec)
-> 1633             record = rdheader(baserecname, pb_dir=posixpath.join(db_dir, dir_name))
   1634 
   1635             # Single segment record

/usr/local/lib/python3.7/site-packages/wfdb/io/record.py in rdheader(record_name, pb_dir, rd_segments)
    989     # Read the header file. Separate comment and non-comment lines
    990     header_lines, comment_lines = _header._read_header_lines(base_record_name,
--> 991                                                              dir_name, pb_dir)
    992 
    993     # Get fields from record line

/usr/local/lib/python3.7/site-packages/wfdb/io/_header.py in _read_header_lines(base_record_name, dir_name, pb_dir)
    726     else:
    727         header_lines, comment_lines = download._stream_header(file_name,
--> 728                                                               pb_dir)
    729 
    730     return header_lines, comment_lines

/usr/local/lib/python3.7/site-packages/wfdb/io/download.py in _stream_header(file_name, pb_dir)
     88 
     89     # Raise HTTPError if invalid url
---> 90     response.raise_for_status()
     91 
     92     # Get each line as a string

/usr/local/lib/python3.7/site-packages/requests/models.py in raise_for_status(self)
    938 
    939         if http_error_msg:
--> 940             raise HTTPError(http_error_msg, response=self)
    941 
    942     def close(self):

HTTPError: 404 Client Error: Not Found for url: http://physionet.org/physiobank/database/mimic3wdb/30/3000031/3000031.hea
@Lucas-Mc
Collaborator

Hey @tompollard, it appears that WFDB is looking for a file called 3000031.hea when the file is actually called 3000031n.hea. It looks like WFDB needs to determine the file names explicitly rather than inferring them implicitly, as it does in this case.

Lucas-Mc added the bug label Apr 27, 2020
@tompollard
Member Author

@Lucas-Mc see: https://www.physionet.org/content/mimic3wdb/

> Each recording comprises two records (a waveform record and a matching numerics record) in a single record directory (“folder”) with the name of the record.
>
> ...
>
> The numerics records (designated by the letter n appended to the record name) are not divided into segments, since the storage savings that would be achieved by doing so would be relatively little.
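
The naming convention quoted above can be sketched in a few lines. This is only an illustration of the rule, not code from WFDB: `record_names_in_dir` is a hypothetical helper, and it assumes the record directory is named after the record, as the PhysioNet description says.

```python
import posixpath

def record_names_in_dir(record_dir):
    """Return the (waveform, numerics) record names expected in a record directory.

    Hypothetical helper for illustration; it is not part of the wfdb package.
    """
    # The directory is named after the record, e.g. '.../3000031/' -> '3000031'
    base = posixpath.basename(record_dir.rstrip('/'))
    # The numerics record has the letter 'n' appended to the record name
    return base, base + 'n'

waveform, numerics = record_names_in_dir('mimic3wdb/30/3000031/')
print(waveform + '.hea', numerics + '.hea')
```

For the folder in this issue, only the second of these header files exists, so a lookup that assumes the first name will hit the 404 shown in the traceback.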

@Lucas-Mc
Collaborator

See #200 for the beginnings of fixing this issue! 👍

@Lucas-Mc
Collaborator

Lucas-Mc commented May 4, 2020

Hey @tompollard, I am running this locally and no error has occurred so far, which means the code now generates the correct nested record list. This appears to have been fixed by both #200 and #205. I'm going to close this issue now since the error reported here is fixed, but I will open a new issue if another error occurs while this command finishes. Thanks for the assistance!
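
For readers wondering what a "nested record list" might look like, here is a rough sketch. It is not the code from #200 or #205. It assumes that a RECORDS entry ending in `/` names a subdirectory with its own RECORDS file, and it uses a hypothetical in-memory dictionary (`listing`, with illustrative contents) in place of downloading those files.

```python
import posixpath

def expand_records(db_dir, records_files):
    """Flatten a nested RECORDS listing into full record paths.

    records_files maps a directory to the lines of its RECORDS file.
    It is a hypothetical stand-in for fetching each file from the server.
    """
    expanded = []
    for entry in records_files.get(db_dir, []):
        if entry.endswith('/'):
            # Assumed convention: a trailing slash marks a subdirectory
            # that carries its own RECORDS file, so recurse into it
            sub_dir = posixpath.join(db_dir, entry.rstrip('/'))
            expanded.extend(expand_records(sub_dir, records_files))
        else:
            # A plain entry is a record name relative to this directory
            expanded.append(posixpath.join(db_dir, entry))
    return expanded

# Illustrative contents only, not copied from the real RECORDS files
listing = {
    'mimic3wdb/30': ['3000031/'],
    'mimic3wdb/30/3000031': ['3000031n'],
}
print(expand_records('mimic3wdb/30', listing))
```

Resolving names this way takes the record name from the listing itself, so the header path ends in 3000031n.hea rather than a name guessed from the directory.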

Lucas-Mc closed this as completed May 4, 2020