Built-in download of data #108
This was referenced Sep 13, 2016
tompollard added a commit that referenced this issue Sep 15, 2016 (9317a82)
I added a basic recipe for downloading MIMIC-III from PhysioNet with two variables. I didn't do anything with the checksums or look into support for non-Unix systems, so I'm leaving this issue open in case we want to deal with these things later.
herroannekim commented Jun 19, 2017
Will this issue be resolved soon? I think I have a related problem. After running
herroannekim commented Jun 19, 2017
On PhysioNet, there's no ADMISSIONS.csv, but there are an admissiondrug.csv and an admissionDx.csv.
@herroannekim there are several projects on PhysioNet, so could you please say which project you are looking at (e.g. provide the URL)?
jeblundell commented Jul 10, 2016
Once #103 is finalised, I think it's worth incorporating download of the data from PhysioNetWorks. The reason I suggest waiting for that to be merged is so we can just do it as a build target.
On PNW there's an automated download command:
wget --user YOURUSERNAME --ask-password -A csv.gz -m -p -E -k -K -np http://MIMICURL
We can literally just indent that and put "data:" before it and then we can do "make data" to get the data. I'd suggest we also include variables for where to store the data and (obviously) the username. An alternative approach is to include each csv.gz as a build target.
I'd suggest also throwing in "make verify" or modifying mimic-check to run the MD5 checksums.
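As a rough sketch of the "make data" and "make verify" idea, the download and checksum steps could be wrapped in shell functions that a Makefile recipe then calls. The variable names (PHYSIONET_USER, DATA_DIR, MIMIC_URL, CHECKSUM_FILE), the function names, and the checksum file name are all assumptions made for illustration, not names from the repository or from PhysioNet. http://MIMICURL is the same placeholder as in the command above. The verify step assumes GNU md5sum is available, which is not the case on every platform.

```shell
#!/bin/sh
# Sketch only: variable names and defaults are assumptions, not taken from the repository.
PHYSIONET_USER="${PHYSIONET_USER:-YOURUSERNAME}"  # PhysioNet username
DATA_DIR="${DATA_DIR:-data}"                      # where the csv.gz files are stored
MIMIC_URL="${MIMIC_URL:-http://MIMICURL}"         # placeholder URL, as in the wget command
CHECKSUM_FILE="${CHECKSUM_FILE:-checksums.md5}"   # hypothetical md5sum-format checksum list

# Mirror the csv.gz files into DATA_DIR (-P sets wget's directory prefix).
# With DRY_RUN=1, only print the command instead of running it.
download_data() {
    set -- wget --user "$PHYSIONET_USER" --ask-password \
        -A csv.gz -m -p -E -k -K -np -P "$DATA_DIR" "$MIMIC_URL"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        mkdir -p "$DATA_DIR" && "$@"
    fi
}

# Check every file listed in CHECKSUM_FILE; returns non-zero on any mismatch.
# wget -m recreates the server's host/path layout under DATA_DIR, so pass the
# directory that actually contains the files if it differs from DATA_DIR.
verify_data() {
    ( cd "${1:-$DATA_DIR}" && md5sum -c "$CHECKSUM_FILE" )
}
```

Running `DRY_RUN=1 download_data` prints the wget command without contacting the server, which is a cheap way to check the variables before a long download. A "data" target could then call download_data and a "verify" target could call verify_data.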
Two issues:
(1) Is it alright to actually include the URL itself? I can't really see why not as it's protected with username/password, but thought I'd check anyway
(2) Again, we run into the cross-platform issue. Given that @alistairewj was pondering including a gzip binary, it might be worth considering including a wget binary too for this reason. I forget whether OS X comes with it as standard. I suppose an objection is that we're creeping various binaries into this, which is not ideal. Maybe a better approach is to only include wget or curl and download gzip etc. from official sources.
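On the cross-platform point, one option short of bundling binaries is to detect at run time which download tool is already installed. The following is a minimal sketch of that check; the function name pick_downloader is made up for illustration. Note that curl has no equivalent of wget's recursive -m / -A mirroring, so a curl code path would need an explicit list of files to fetch.

```shell
#!/bin/sh
# Print the first available download tool (wget preferred, then curl).
# Returns non-zero and prints a hint on stderr if neither is installed.
pick_downloader() {
    for tool in wget curl; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "neither wget nor curl found; install one from an official source" >&2
    return 1
}
```

A build recipe could call pick_downloader first and stop with a clear message if it fails, rather than failing partway through with a "command not found" error.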