Update QC report based on CITE-seq post-processing #336

Closed
sjspielman opened this issue May 30, 2023 · 3 comments
Labels
QC Relevant to the HTML QC report made available to users

Comments

@sjspielman
Member

sjspielman commented May 30, 2023

Following changes made to incorporate CITE-seq post-processing into the pipeline, we'll need to update the QC report.

qc_report.rmd

  • Update relevant phrasing throughout
    • It would be nice to be able to have logic for whether certain text appears, for example here:
      The raw counts from all cells that remain after filtering low quality cells are then normalized prior to selection of highly variable genes and dimensionality reduction.
  • The string for scpca_filter_method metadata has now changed, so think about that...
  • Some changes need to happen to reflect that the _processed.rds object is filtered on both RNA and ADTs.
  • Update the basic_statistics table appropriately. It needs an if statement to check for CITE-seq, plus rows for the additional stats.
    basic_statistics <- basic_statistics |>
      mutate(
        "Method used to filter low quality cells" = format(processed_meta$scpca_filter_method),
        "Cells after filtering low quality cells" = format(dim(processed_sce)[2], big.mark = ',', scientific = FALSE),
        "Normalization method" = format(processed_meta$normalization),
        "Minimum genes per cell cutoff" = format(processed_meta$min_gene_cutoff)
      )
  • TBD: Should this plot only show RNA-based filtering? As written, it will include ADT-based filtering as well, which seems reasonable to me if there are ADT counts! Just need to update text (and maybe make the text conditional)
    # add column to coldata labeling cells to keep/remove based on filtering method
    filtered_coldata_df <- colData(filtered_sce) |>
    as.data.frame() |>
    tibble::rownames_to_column("barcode") |>
    dplyr::mutate(scpca_filter = ifelse(barcode %in% colnames(processed_sce),
    "Keep",
    "Remove"))
    ggplot(filtered_coldata_df, aes(x = detected, y = subsets_mito_percent, color = scpca_filter)) +
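
To make the conditional-text and conditional-row ideas above concrete, here is a rough sketch, not the actual report code. It uses base R only so it runs on its own; `build_basic_statistics()`, the `has_cite` flag, the extra row label, and all values in `processed_meta` are made up for illustration.

```r
# Hypothetical helper: build the filtering rows of the table, adding
# an ADT-specific row only when the library has CITE-seq data.
build_basic_statistics <- function(processed_meta, n_cells, has_cite) {
  stats <- data.frame(
    "Method used to filter low quality cells" = format(processed_meta$scpca_filter_method),
    "Cells after filtering low quality cells" = format(n_cells, big.mark = ",", scientific = FALSE),
    "Normalization method" = format(processed_meta$normalization),
    "Minimum genes per cell cutoff" = format(processed_meta$min_gene_cutoff),
    check.names = FALSE # keep the human-readable column names as-is
  )
  if (has_cite) {
    # Placeholder row; which CITE-seq stats to report is still to be decided
    stats[["Filtering also based on ADT counts"]] <- "Yes"
  }
  stats
}

# Illustrative stand-in values, not real pipeline output
processed_meta <- list(
  scpca_filter_method = "example_method",
  normalization = "example_normalization",
  min_gene_cutoff = 200
)

basic_statistics <- build_basic_statistics(processed_meta, n_cells = 1234, has_cite = TRUE)

# Conditional report text: pick the sentence once, then print it inline in the Rmd
norm_text <- if (nrow(basic_statistics) > 0 && "Filtering also based on ADT counts" %in% names(basic_statistics)) {
  "The raw RNA and ADT counts from all cells that remain after filtering low quality cells are then normalized."
} else {
  "The raw counts from all cells that remain after filtering low quality cells are then normalized."
}
```

In the report itself, `norm_text` could be dropped into the prose with an inline R expression, so the sentence changes with the library type without duplicating whole paragraphs.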

cite_qc.rmd

  • Also select target type
    antibody_tags <- as.data.frame(rowData(cite_exp)) |>
    • TBD: We end up creating this column if users didn't provide it. Do we want to tweak the implementation to add a metadata indicator if this information was user-provided?
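
If we did want an indicator, one possible shape is sketched here. This is an assumption-heavy illustration: the column name `target_type`, the default value `"target"`, the helper `add_target_type()`, and the indicator name `target_type_user_provided` are all hypothetical, and a plain data frame stands in for the rowData.

```r
# Hypothetical helper: make sure a target-type column exists and record
# whether it came from the user or was filled in with a default.
add_target_type <- function(antibody_tags, default_type = "target") {
  user_provided <- "target_type" %in% colnames(antibody_tags)
  if (!user_provided) {
    # Column was not supplied, so create it with the assumed default
    antibody_tags$target_type <- default_type
  }
  list(
    antibody_tags = antibody_tags,
    target_type_user_provided = user_provided # could be stored in metadata
  )
}

# Stand-in for as.data.frame(rowData(cite_exp)) with no target type supplied
tags <- data.frame(
  mean_counts = c(10, 20),
  row.names = c("ADT_A", "ADT_B")
)
result <- add_target_type(tags)
```

Here `result$target_type_user_provided` is `FALSE`, and every row of `result$antibody_tags` gets the default type.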

CC @allyhawkins @jashapiro, tagging (this is a pun) if you have thoughts on some of the TBD items above, or anything else! This might not be an exhaustive list of changes, but it's enough to get started, and more will gel once I get going.

@sjspielman sjspielman self-assigned this May 30, 2023
@sjspielman sjspielman added the QC Relevant to the HTML QC report made available to users label May 30, 2023
@jashapiro
Member

  • TBD: Should this plot only show RNA-based filtering? As written, it will include ADT-based filtering as well, which seems reasonable to me if there are ADT counts! Just need to update text (and maybe make the text conditional)

I think it is fine to show all of the filtering that was applied. I will note, though, that I think we can just use the values in filtered_sce rather than checking whether they are present in processed_sce. We do put those labels in both SCE files, correct?
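
A small sketch of what I mean, with a plain data frame standing in for the colData of filtered_sce. It assumes the keep/remove labels are already stored in a column called `scpca_filter`; that column name, the barcodes, and the values are illustrative only.

```r
# Stand-in for as.data.frame(colData(filtered_sce)); assumes the
# keep/remove label is already stored as a column (name is hypothetical)
filtered_coldata_df <- data.frame(
  detected = c(1500, 300, 2200),
  subsets_mito_percent = c(3.1, 42.0, 5.5),
  scpca_filter = c("Keep", "Remove", "Keep"),
  row.names = c("AAAC", "AAAG", "AAAT")
)

# Stand-in for colnames(processed_sce)
processed_barcodes <- c("AAAC", "AAAT")

# Current approach: recompute the label from barcode membership
recomputed <- ifelse(
  rownames(filtered_coldata_df) %in% processed_barcodes,
  "Keep",
  "Remove"
)

# If the stored labels are reliable, the two should agree, and the
# plot can map color to the stored column directly
labels_agree <- identical(recomputed, filtered_coldata_df$scpca_filter)
```

If `labels_agree` holds on real data, the report would no longer need processed_sce just to color this plot.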

  • TBD: We end up creating this column if users didn't provide it. Do we want to tweak the implementation to add a metadata indicator if this information was user-provided?

I assume you mean the target column here. Adding an indicator is fine, but I'm not sure it is needed: the reason to include this info is to indicate how it was used, which I think the column alone covers.

@sjspielman
Member Author

I think it is fine to show all of the filtering that was applied. I will note, though, that I think we can just use the values in filtered_sce rather than checking whether they are present in processed_sce. We do put those labels in both SCE files, correct?

💯, will clean this up!

@sjspielman
Member Author

Closed with #343
