Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tutorial about reading and exploring an NWB File #1453

Merged
merged 37 commits into from May 13, 2022

Conversation

weiglszonja
Copy link
Collaborator

@weiglszonja weiglszonja commented Apr 15, 2022

Add new tutorial about reading, exploring and doing basic visualisations with an existing NWB File.

This tutorial demonstrates how to:

  • download an NWB file from DANDI or stream from S3
  • read the NWB file
  • access stimulus data
  • plot images from OpticalSeries data
  • plot spiking activity relative to stimulus onset
  • access trials
  • plot stimuli matched on trials stim_on_time column
  • tools for exploring an NWB file: nwbwidgets, HDFView

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked running flake8 from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

@bendichter
Copy link
Contributor

this looks great, @weiglszonja! One issue with how this is currently set up is that it requires the NWB files to exist locally, which is tricky, because the server that runs this won't have access to that file. Maybe we could also offer an s3 streaming version and show the rest through that.

@bendichter
Copy link
Contributor

in fact, I think we could combine the s3 streaming tutorial with this one

@oruebel
Copy link
Contributor

oruebel commented Apr 15, 2022

in fact, I think we could combine the s3 streaming tutorial with this one

If only for find-ability, I think it will be useful to keep a separate tutorial around for S3, even if that tutorial is then very brief and only shows how to open a file and then points to this tutorial for details. Other than that, I think using S3 for this tutorial is fine, but I think it would be useful to at least briefly discuss the main difference between downloading a file and using it via S3.

@weiglszonja
Copy link
Collaborator Author

I noticed there is a section here about reading NWB files with PyNWB but is currently empty. Do you think @oruebel it would make sense to place a link there that redirects people to this tutorial?

@bendichter
Copy link
Contributor

@weiglszonja yes, once this tutorial is ready we should route people to it. We should also use this section to route users to more advanced and specialized read tutorials as they become available. In general, the nwb-overview documentation is for providing a bit of context and then routing users to more in-depth tutorials on specific repos.

@oruebel
Copy link
Contributor

oruebel commented Apr 20, 2022

replace all code sections with inline code-blocks

As far as I know, using inline code-blocks has the disadvantage that the code-blocks are not being executed and as such are not being tested.

@weiglszonja
Copy link
Collaborator Author

@oruebel
Yes, the reason why I did this is because dandi is a dependency that is not being installed automatically when the CI test is running for this tutorial. The streaming tutorial uses inline code-blocks too, so I thought I'd apply them here to avoid this issue. Do you have a suggestion how to resolve this issue?

@oruebel
Copy link
Contributor

oruebel commented Apr 20, 2022

I did this is because dandi is a dependency that is not being installed automatically

I see, thanks for the clarification. I think the S3 tutorial we didn't end up making DANDI a dependency because it was such a brief tutorial. Since you are significantly expanding this now here, I think it is worth revisiting this question since, ideally, we want to be able to have all these tutorial be run to ensure they keep working in the future. The options we have then are:

  1. Add dandi to the requirements-doc.txt so that we can use it in the docs

  2. Download a file via python directly in the tutorial so that it can then be used locally

  3. Have code that requires web access in code-blocks (so they are not being run) and then work of small local files that are either added to the repo or dummy files that are generated on the fly

  4. has the problem that it makes the tutorials less readable and works around the problem. Because of this, I think 1. or 2. are probably preferred but have the caveat that they require web access (so tests could fail if web access fails) and they generate traffic on DANDI. @rly @bendichter thoughts?

@weiglszonja
Copy link
Collaborator Author

I'm not sure how frequently this readthedocs build is happening in the background, but it will definitely slow it down if we go for option 1. Also is it feasible given that streaming requires a special setup with the ROS3 driver? (let me know @CodyCBakerPhD if I'm missing something here)

If we go for option 3, is it possible to hide code snippets? So the tutorial would look as it is (showing how to download a file or stream it from S3) but in the background it would actually read a small file added to the repo?

@CodyCBakerPhD
Copy link
Collaborator

CodyCBakerPhD commented Apr 25, 2022

Also is it feasible given that streaming requires a special setup with the ROS3 driver? (let me know @CodyCBakerPhD if I'm missing something here)

What I've had to do in the past so setup CI to perform streaming would be difficult to do on the readthedocs environment side, and also pretty costly to do every time a new build is done. If you just want to test the code blocks inside the docs to make sure they always continue running, I'd suggest just adding them as tests to the main testing suite that I assume is already setup to test other streaming behavior.

If we go for option 3, is it possible to hide code snippets? So the tutorial would look as it is (showing how to download a file or stream it from S3) but in the background it would actually read a small file added to the repo?

We did something like that a while back on the NWBCT - I no longer have the sphinx builds from that, but this snippet is what did it: catalystneuro/nwb-conversion-tools@8e9e6dc#diff-5adb8bffbac6c004685f555404d3586533e8c94f26c82fbe3592f81d977ae513R165-R176

Basically it adds an arrow that expands the code block when you click it, but otherwise minimizes it.

@weiglszonja
Copy link
Collaborator Author

Thank you @rly, looks good! 👍

# .. seealso::
# You can learn more about streaming data in the :ref:`streaming` tutorial.
#
# Then, we will use the ``DandiAPIClient`` to obtain the S3 URL that points to the NWB File
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use intersphinx to link to the API documentation for DANDIAPIClient?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh sure, I'll add it!

@bendichter
Copy link
Contributor

@weiglszonja it looks like you are becoming a sphinx expert!

@bendichter bendichter self-requested a review May 5, 2022 14:42
bendichter
bendichter previously approved these changes May 5, 2022
CHANGELOG.md Outdated Show resolved Hide resolved
oruebel
oruebel previously approved these changes May 5, 2022
Copy link
Contributor

@oruebel oruebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for all the hard work in putting this tutorial together.

@rly
Copy link
Contributor

rly commented May 11, 2022

In two spots in this tutorial, it is noted "# Reverse the last dimension because the data were stored in BGR instead of RGB"

@bendichter Could we correct this in the dandiset and release a new version where this transformation is not necessary?

Co-authored-by: Oliver Ruebel <oruebel@users.noreply.github.com>
@bendichter
Copy link
Contributor

I don't have permission to merge this. @weiglszonja, do you?

@bendichter
Copy link
Contributor

@rly I generally don't make changes to datasets that I wasn't involved in converting and publishing to begin with

@bendichter
Copy link
Contributor

also for this particular dataset I know they have Python and MATLAB code associated with it, and I would not want to make changes that might break their code

@weiglszonja
Copy link
Collaborator Author

@bendichter I don't have merge permission

@bendichter bendichter merged commit 3b886ca into NeurodataWithoutBorders:dev May 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants