spaceranger 2.1 update #7141

Merged

@stephenwilliams22 stephenwilliams22 commented Apr 10, 2023

This pull request updates Load10X_Spatial and Read10X_h5 to accommodate the new output of spaceranger 2.1 (which is still in development).

This PR:

  • Allows for the new raw_probe_bc_matrix.h5 to be loaded by Load10X_Spatial
    • This file contains individual probe level counts and information
  • Adds probe meta-data to the Spatial assay of the seurat object
  • Adds a new function called Read10x_probe_metadata

The structure of raw_probe_bc_matrix.h5 is as follows:

/matrix                  Group
/matrix/barcodes         Dataset {4987}
/matrix/data             Dataset {17581240/Inf}
/matrix/features         Group
/matrix/features/feature_type Dataset {21178}
/matrix/features/filtered_probes Dataset {21178}
/matrix/features/gene_id Dataset {21178}
/matrix/features/gene_name Dataset {21178}
/matrix/features/genome  Dataset {21178}
/matrix/features/id      Dataset {21178}
/matrix/features/name    Dataset {21178}
/matrix/features/probe_region Dataset {21178}
/matrix/features/target_sets Group
/matrix/features/target_sets/Visium\ Mouse\ Transcriptome\ Probe\ Set Dataset {19779}
/matrix/filtered_barcodes Dataset {4987}
/matrix/indices          Dataset {17581240/Inf}
/matrix/indptr           Dataset {4988}
/matrix/shape            Dataset {2}
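
In the listing above, indptr has one more entry than barcodes (4988 vs 4987), which is what a compressed sparse column layout looks like, with one column per barcode. The toy sketch below shows how data, indices, indptr and shape would combine into a probe-by-barcode matrix under that assumption, and assuming zero-based indices as in other 10x HDF5 outputs. The values are invented and far smaller than the real file.

```r
library(Matrix)

# Invented stand-ins for /matrix/data, /matrix/indices, /matrix/indptr
# and /matrix/shape. A real file would hold 21178 probes x 4987 barcodes.
data    <- c(5, 1, 2, 7)   # non-zero counts
indices <- c(0, 2, 1, 2)   # zero-based row (probe) index of each count
indptr  <- c(0, 2, 2, 4)   # column (barcode) boundaries, length = n_barcodes + 1
shape   <- c(3, 3)         # probes x barcodes

counts <- sparseMatrix(
  i = indices + 1,  # sparseMatrix expects one-based row indices by default
  p = indptr,       # column pointers stay zero-based
  x = data,
  dims = shape
)

print(counts)
```

The second barcode column is empty here because its two indptr entries are equal, which is how all-zero barcodes are represented in this layout.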

I added an example raw_probe_bc_matrix.h5 to the visium test folder.

This will probably require a fresh roxygenize, but I don't know your workflow and didn't want to overstep.

@stephenwilliams22 stephenwilliams22 marked this pull request as draft April 10, 2023 17:28
@stephenwilliams22 stephenwilliams22 marked this pull request as ready for review April 10, 2023 17:36
@stephenwilliams22 stephenwilliams22 marked this pull request as draft April 10, 2023 17:37
@stephenwilliams22 stephenwilliams22 marked this pull request as ready for review April 10, 2023 19:42
Co-authored-by: Shaun Jackman <sjackman@gmail.com>
@AustinHartman
Contributor

Hey @stephenwilliams22, apologies for the slow reply and thanks for the PR. This looks good, I'd just like to test a bit more on my end and have a couple of questions.

  1. Could you send a full set of spaceranger 2.1 output files? (feel free to share over email if that makes more sense: ahartman@nygenome.org)
  2. Around when will spaceranger 2.1 be released? We don't have a Seurat CRAN release planned for a little while, but could do an intermediate release containing these updates.

@stephenwilliams22
Contributor Author

Hey @AustinHartman, now I need to apologize for the late reply!
Unfortunately, because of legal hurdles, a full set of SR 2.1 outputs will only be available upon the SR 2.1 release, which is coming very soon (within a month). SR 2.1 and public data will most likely be out before anyone tries to analyze it in Seurat, so I'll give you a link to all outputs as soon as it's available.

An intermediate CRAN release would be a HUGE help! This is the way that most 10x users I've spoken with use Seurat.

@AustinHartman
Contributor

Thanks @stephenwilliams22, that makes sense. I'll plan on waiting until SR 2.1's release so I can do a bit of testing. Assuming that goes well, I'll merge to develop and aim for a CRAN release with a few additional updates shortly after.

@stephenwilliams22
Contributor Author

@AustinHartman I just wanted to let you know that SR 2.1 datasets are live on the 10x website. Please let me know if you run into any bumps in the road! Data found here

@AustinHartman AustinHartman self-requested a review May 24, 2023 17:16
Comment on lines 549 to 554
file_path <- file.path(data.dir, filename)
infile <- hdf5r::H5File$new(filename = file_path, mode = 'r')
if("matrix/features/probe_region" %in% hdf5r::list.objects(infile)) {
  probe_metadata <- Read10x_probe_metadata(data.dir)
  Misc(object = object[['Spatial']], slot = "probe_metadata") <- probe_metadata
}
Contributor

Hey @stephenwilliams22, I'm having a hard time understanding these lines. It seems like with filename at its default value, the probe info won't be loaded, since file_path will point to the filtered h5. Should there be a separate variable (like probes.filename) for raw_probe_bc_matrix.h5, which contains the probe information (and also 'matrix/features/probe_region')? Let me know what you think or if I'm misunderstanding.

Contributor Author

Hey @AustinHartman. I wrote it this way so that, if we ever decide to include the probe metadata in the raw or filtered feature_bc_matrix.h5, it will work as is. As written, if you don't use filename=raw_probe_bc_matrix.h5, the check if("matrix/features/probe_region" %in% hdf5r::list.objects(infile)) will catch that the slot isn't there and skip making the metadata. If you'd rather I update the code so it doesn't run file_path <- file.path(data.dir, filename) and infile <- hdf5r::H5File$new(filename = file_path, mode = 'r'), I'm happy to do that as well. As it stands, the code works for raw_probe_bc_matrix.h5, raw_feature_bc_matrix.h5, and filtered_feature_bc_matrix.h5, and will only write probe_metadata for raw_probe_bc_matrix.h5.
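
For readers following along, the behaviour described in this comment could be exercised roughly as below. This is a sketch under assumptions: the wrapper name load_probe_level_object is hypothetical, data.dir is assumed to point at a spaceranger 2.1 output directory, and the only calls used are Load10X_Spatial and Misc as they appear in this thread.

```r
# Hypothetical wrapper around the calls discussed in this thread.
# Seurat and SeuratObject are only needed when the function is called.
load_probe_level_object <- function(data.dir,
                                    filename = "raw_probe_bc_matrix.h5") {
  object <- Seurat::Load10X_Spatial(data.dir = data.dir, filename = filename)
  # Expected to be NULL when the loaded file has no
  # matrix/features/probe_region dataset (e.g. the filtered matrix).
  probe_metadata <- SeuratObject::Misc(
    object = object[["Spatial"]],
    slot = "probe_metadata"
  )
  list(object = object, probe_metadata = probe_metadata)
}
```

Calling it with filename = "filtered_feature_bc_matrix.h5" should, per the comment above, load the object and simply leave probe_metadata empty.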

Contributor

Got it. In that case, do you think filename should be passed to Read10x_probe_metadata in the event that the raw_probe_bc_matrix.h5 has a different name? It seems possible for the if condition to pass and Read10x_probe_metadata to fail with File not found.

Contributor Author

Yes, that's a good idea. I'll make the fix.
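
A minimal sketch of the kind of change agreed on here, assuming the reader accepts the same filename as the loader so that a renamed probe file is still found. The function name read_probe_metadata, its default argument, and the column names are illustrative and are not taken from the merged code.

```r
# Illustrative reader: takes the filename explicitly instead of assuming
# the probe matrix is always called raw_probe_bc_matrix.h5.
read_probe_metadata <- function(data.dir,
                                filename = "raw_probe_bc_matrix.h5") {
  file_path <- file.path(data.dir, filename)
  if (!file.exists(file_path)) {
    stop("File not found: ", file_path)
  }
  if (!requireNamespace("hdf5r", quietly = TRUE)) {
    stop("Please install hdf5r to read HDF5 files")
  }
  infile <- hdf5r::H5File$new(filename = file_path, mode = "r")
  on.exit(infile$close_all(), add = TRUE)
  # Files without probe-level information are skipped, not treated as errors.
  if (!"matrix/features/probe_region" %in% hdf5r::list.objects(infile)) {
    return(NULL)
  }
  data.frame(
    probe.name   = infile[["matrix/features/name"]][],
    probe.region = infile[["matrix/features/probe_region"]][],
    stringsAsFactors = FALSE
  )
}
```

With this shape, the loader would call read_probe_metadata(data.dir, filename) using the same filename it already opened, so the if condition and the reader cannot disagree about which file is being read.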

@stephenwilliams22
Contributor Author

@AustinHartman Feel free to run through this one more time. I ran roxygen2::roxygenise(), but if that's not the way you all like to update docs, please let me know.

@AustinHartman
Contributor

Great, thanks for the PR! Hoping to have a CRAN release somewhat soon.

@AustinHartman AustinHartman merged commit 6b07da6 into satijalab:develop May 25, 2023
1 check passed
@stephenwilliams22
Contributor Author

Awesome!
