Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve job.data, project.data (H5Store) examples. #235

Merged
merged 14 commits into from
Dec 19, 2019
Merged

Conversation

bdice
Copy link
Member

@bdice bdice commented Oct 25, 2019

Description

I created this PR to start addressing this issue: glotzerlab/signac-docs#50

@klywang Do you have specific suggestions on how to improve this? (I don't think I've "solved" the issue yet, since I haven't provided explicit examples for job.data as requested.) Feel free to edit this PR directly.

Motivation and Context

glotzerlab/signac-docs#50

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change1

1The change breaks (or has the potential to break) existing functionality.

Checklist:

If necessary:

  • I have updated the API documentation as part of the package doc-strings.
  • I have created a separate pull request to update the framework documentation on signac-docs and linked it here.
  • I have updated the changelog.

@bdice bdice changed the title Point to H5Store for job.data, project.data examples. Improve job.data, project.data (H5Store) examples. Oct 25, 2019
@vyasr
Copy link
Contributor

vyasr commented Nov 6, 2019

@klywang any feedback on this PR?

@bdice bdice marked this pull request as ready for review November 8, 2019 13:26
@codecov
Copy link

codecov bot commented Nov 8, 2019

Codecov Report

Merging #235 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #235   +/-   ##
=======================================
  Coverage   65.38%   65.38%           
=======================================
  Files          42       42           
  Lines        5697     5697           
=======================================
  Hits         3725     3725           
  Misses       1972     1972
Impacted Files Coverage Δ
signac/contrib/job.py 90.63% <ø> (ø) ⬆️
signac/contrib/project.py 90.88% <ø> (ø) ⬆️
signac/core/h5store.py 90.28% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2a57549...7ab8b48. Read the comment docs.

@csadorf
Copy link
Contributor

csadorf commented Nov 12, 2019

Why is this PR not assigned to anybody?

@vyasr
Copy link
Contributor

vyasr commented Nov 16, 2019

This PR was in draft mode until recently. When moved out of draft mode, what is our policy for assigning a maintainer in charge? I'm guessing the bot didn't add any reviewers, so nobody assigned themself to the issue.

@bdice bdice self-assigned this Nov 16, 2019
@bdice bdice requested a review from klywang November 16, 2019 05:37
@bdice
Copy link
Member Author

bdice commented Nov 16, 2019

I assigned myself to this PR and requested review from @klywang since she originally reported the documentation challenges. I marked it as "ready for review" (no longer draft mode) in hopes that it would get reviewed by someone, but I suppose the bot didn't assign reviewers.

@klywang
Copy link
Contributor

klywang commented Nov 17, 2019

Is there a specific reason you one should use H5Store over job.data or project.data? If there isn't a good reason, it would help to have an short example that shows its usage with the context manager. For example:

with job.data
net_energy = job.data.net_energy[:]
Just enough so that someone who wants to use the functionality can glance at the docs and know how to use it.

@bdice
Copy link
Member Author

bdice commented Nov 18, 2019

@klywang The purpose of linking to H5Store is that the job.data is an H5Store. I was hesitant to duplicate examples for H5Store in job.data and project.data, just to keep the code organized (changes to H5Store would have to be reflected in several examples, not just job.data and project.data but also job.stores, etc. -- hard to know exactly where to draw the line). That said, I see why it would be user-friendly to have a snippet like that in the docs. I'll adapt your example and ping you for review.

@csadorf
Copy link
Contributor

csadorf commented Nov 18, 2019

@klywang The purpose of linking to H5Store is that the job.data is an H5Store.

Also might be worthwhile to try to make that relationship more explicit in the documentation to avoid confusion.

@klywang
Copy link
Contributor

klywang commented Nov 19, 2019

@csadorf
Poor wording on my end. I meant, under what circumstances should a user use H5Store or job.data\ project.data? I thought the documentation clearly communicated the fact that these features were H5Stores. The primary source of confusion was that there were a lot of different ways to do a similar operation and it was difficult to understand which version I should use, especially before I knew how to use it.

@csadorf
Copy link
Contributor

csadorf commented Nov 20, 2019

@csadorf
Poor wording on my end. I meant, under what circumstances should a user use H5Store or job.data\ project.data? I thought the documentation clearly communicated the fact that these features were H5Stores. The primary source of confusion was that there were a lot of different ways to do a similar operation and it was difficult to understand which version I should use, especially before I knew how to use it.

@klywang I see. So it's more about when to use job.data, project.data, or a direct instance of H5Store. Could you maybe propose a paragraph that would have helped you to understand that better/earlier?

@bdice
Copy link
Member Author

bdice commented Nov 20, 2019

(I would also include job.stores in that description if possible)

@klywang
Copy link
Contributor

klywang commented Nov 20, 2019

For job.data
job.data lets you read and write data associated with your job. It should be used for large array-like data, which can't be stored efficiently in the job document. For examples and usage, see [link for topic guide].

For Topic Guides > Jobs > Job Data Store
Some revised examples for the Topic Guide already exist, but I rewrote them to streamline the rest of these examples.

Writing data
To write data to a store:

job.data['velocities'] = np.ones([10, 3, 2])

Data may also be written in subgroups:

job.data['net_energy/frame_10'] = np.ones([10, 3])

Reading data
To access data as an attribute:

with job.data:
    velocities = job.data.velocities[:]
    net_energy = job.data.net_energy.frame_10[:]

To access data as a key:

with job.data:
    velocities = job.data['velocities'][:]
    net_energy = job.data['net_energy/frame_10'][:]

Other uses
To view keys associated with job.data or in a subgroup:

print(list(job.data.keys())
[Output]: ['velocities', 'net_energy']

print(list(job.data.net_energy.keys()))
[Output]: ['frame_10']

To iterate over keys in job.data or a subgroup:

for frame_n in job.data.net_energy.keys():
    with job.data:
        frame_energy = job.data.net_energy[frame_n][:]

Data written with job.data is stored in a file named signac_data.h5 in the associated job folder. For cases where job-associated data may be accessed from multiple sources at the same time or other instances where multiple files may be preferred to one large file, job.stores should be used instead of job.data.

For more information about supported data types and the file format, please see documentation for the H5Store.

Also, note for myself -- there is a typo in the H5Store doc that I will fix.
P.S. Thank you @vyasr @bdice for the input!

@vyasr
Copy link
Contributor

vyasr commented Nov 25, 2019

@atravitz @b-butler I've requested some additional reviews on here, please read through @klywang's suggested improvements to our documentation in her most recent comment on this thread and provide some commentary. Thanks!

@atravitz
Copy link
Collaborator

As someone who has only just recently started using job.data, this is incredibly clear and helpful! One minor thing: it might be better to name frame_10 something else. The way it currently reads is that it's an alternate way of accessing a frame, rather than being a sub-key.

@vyasr vyasr assigned vyasr and unassigned bdice Dec 4, 2019
@vyasr
Copy link
Contributor

vyasr commented Dec 4, 2019

@klywang could you incorporate @atravitz's suggestion and then add your snippet to the relevant places in the documentation (i.e. update topic guides, then add links in the docstrings to point to the topic guides instead of just to the H5StoreManager and related classes).

@bdice bdice added this to the v1.3.0 milestone Dec 4, 2019
Copy link
Member

@b-butler b-butler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klywang suggestions LGTM with @atravitz's correction.

@bdice
Copy link
Member Author

bdice commented Dec 11, 2019

@klywang We'll be making a release on Friday. If you have a chance to make the documentation changes by then (we also have a sprint on Thursday), that would be great!

atravitz
atravitz previously approved these changes Dec 11, 2019
Copy link
Collaborator

@atravitz atravitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@bdice
Copy link
Member Author

bdice commented Dec 13, 2019

It doesn't look like @klywang (or anyone) pushed any commits with the drafted text that is in her comment above. I'm confused about whether @b-butler and @atravitz meant to approve the file changes as they are currently (which is incomplete), or approve the concept of @klywang's changes. The changes @klywang suggested look great to me, but we need to ...actually implement them in the docs. 😕

I'll be making a release of signac tomorrow, could someone from (@glotzerlab/signac-committers, @klywang) take that on and copy-paste the snippets into the right places in the docs? If so, please reply on this PR. Otherwise I will try to apply the changes myself before making the release. It seems like we're generally in agreement over their intended content.

@bdice bdice requested a review from a team as a code owner December 13, 2019 03:45
@atravitz
Copy link
Collaborator

atravitz commented Dec 13, 2019 via email

@atravitz atravitz dismissed their stale review December 13, 2019 13:54

adding code snippets

@b-butler
Copy link
Member

@bdice I meant to approve the suggestion. I should have been more clear about that. Otherwise, like you said the PR is incomplete.

@atravitz
Copy link
Collaborator

Update: @klywang is adding these in, then I'll review

@vyasr
Copy link
Contributor

vyasr commented Dec 13, 2019

Thanks for the update @atravitz!

@klywang klywang requested a review from a team as a code owner December 13, 2019 18:21
Copy link
Collaborator

@atravitz atravitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few fixes to make it more consistent with the other doc strings

signac/contrib/job.py Outdated Show resolved Hide resolved
signac/contrib/project.py Outdated Show resolved Hide resolved
@klywang
Copy link
Contributor

klywang commented Dec 13, 2019

Topic guide updates are here

Copy link
Contributor

@klywang klywang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can wrap this up now! Everything looks good.

@bdice bdice merged commit 2c0d9e4 into master Dec 19, 2019
@bdice bdice deleted the docs/h5store-examples branch December 19, 2019 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants