This repository has been archived by the owner on Jan 3, 2018. It is now read-only.

added HDF5 lesson to intermediate R curriculum #687

Closed
wants to merge 4 commits into from

@emhart emhart commented Aug 28, 2014

Per Greg's request I've added in the lesson for working with HDF5 in R and an associated data set.

@jdblischak
Contributor

Thanks for the PR, @emhart. Could you please send out a message to the r-discuss mailing list to let R instructors know there is a new lesson to review?

theme: united
---

HDF5 is a format that allows the storage of large heterogeneous data sets with self-describing metadata. It support compression, parallel I/O, and easy data slicing, which means large files don't need to be completely read into RAM (a real benefit to `R`). Plus, it has wide support in many programming languages, `R` included. To access HDF5 files, you'll first need to install the base [HDF5 libraries](http://www.hdfgroup.org/HDF5/release/obtain5.html#obtain). It might also be useful to install [HDFview](http://www.hdfgroup.org/products/java/hdfview/), which will allow you to explore the contents of an HDF5 file easily. HDF5 as a format can essentially be thought of as a file system from which you load slices at a time. HDF5 files consist of groups (directories) and datasets (files). The datasets hold the actual data, while the groups provide structure to that data, as you'll see in our example.
Contributor

'It support' --> 'It supports'

Contributor Author

Fixed this typo

Contributor

Did you commit the typo fix? It should show the updated version once you do so. Currently the file in the PR still has the typo.

@emhart
Contributor Author

emhart commented Sep 5, 2014

Thanks for all the feedback, everyone. I'll take the suggestions into consideration and add some more commits. I will also definitely work on a section about making your own HDF5 files.

As for using a smaller file, @sje30, I take your point. I was hoping to provide learners with a real-world file as an example. Perhaps I can start off with writing and reading a file, and then go on to using the larger file.

@sje30
Contributor

sje30 commented Sep 6, 2014

Great -- also, check out:

https://github.com/sje30/waverepo/blob/master/paper/waverepo_paper.Rnw

if you want a real example of combining hdf5+R+knitr to make a published
paper:
http://www.gigasciencejournal.com/content/3/1/3

Stephen

@jdblischak
Contributor

Thanks for the reviews, @chendaniely and @sje30.

@emhart, this PR received good reviews. Do you have time to address their suggestions in the next few days? The haste is due to the imminent breakup of the bc repo (see #759 and Greg's blog post). If not, we can close this issue and you can send a new PR in the future.

@emhart
Contributor Author

emhart commented Oct 3, 2014

I'll send a new PR this weekend, @jdblischak. Will that make it before the breakup?

@jdblischak
Contributor

Thanks, @emhart. If you send in the final changes in the next few weeks you should be fine. You have two choices:

  1. You can continue to update the current PR. If you choose this option, you'll need to undo your last merge commit. Since you are adding a new file, there is no need to update all the other files in the repo in your PR.
  2. Create a new feature branch, cherry-pick your commits, and then send a PR from the new feature branch.

Please let me know if I need to explain more or if you need any help with this.
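
For anyone following along, option 2 above could look roughly like the sketch below. It is only an illustration run in a throwaway repository: the branch names (`upstream-base`, `old-pr`, `hdf5-lesson`), the file names, and the "unrelated change" commit (standing in for the unwanted merge commit) are all hypothetical and are not taken from this PR.

```shell
set -e

# Scratch repository so the sketch runs on its own (assumption: in
# practice you would do this inside your fork of the real repository).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "learner@example.com"
git config user.name "Learner"

# Base history standing in for the upstream branch the PR targets.
echo "base" > README.md
git add README.md
git commit -q -m "initial commit"
git branch upstream-base   # hypothetical stand-in for the upstream branch

# The old PR branch: one commit we want, one we want to leave behind.
git checkout -q -b old-pr
echo "HDF5 lesson" > hdf5-lesson.Rmd
git add hdf5-lesson.Rmd
git commit -q -m "add HDF5 lesson"
lesson_commit=$(git rev-parse HEAD)
echo "noise" > unrelated.txt
git add unrelated.txt
git commit -q -m "unrelated change"

# New feature branch from the clean base, carrying over only the lesson commit.
git checkout -q -b hdf5-lesson upstream-base
git cherry-pick "$lesson_commit" > /dev/null

# The new branch now holds the base commit plus the lesson commit only.
git log --oneline
```

From there, the new branch would be pushed to the fork and a fresh PR opened from it; the exact remote and branch names depend on your own setup.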

@emhart
Contributor Author

emhart commented Oct 15, 2014

Closing and then reissuing a new PR from a new fork with updated lessons.
