batch effects and PCs blog posts #40

Open
wants to merge 14 commits into
base: master

Conversation

@paupaiz commented Nov 21, 2023

looking forward to your feedback!

@stemangiola linked an issue Nov 21, 2023 that may be closed by this pull request

@william-hutchison (Collaborator) left a comment

Hi @paupaiz, the blog posts look great! The only problem is that I am unable to get BlogDown to render them. I have had a little look and I think there are a few reasons for this:

  • The PCs_SingleCell post requires the manual download of data. Instead, this data can be placed within the post's directory and included in the pull request.
  • The BatchEffects post requires the MergedNML object created in the PCs_SingleCell post. Instead this should be recreated within the BatchEffects directory so that each post can work independently.
  • The two new post directories should be placed in the tidyomicsBlog/blog/content/post directory.

Hopefully the blog posts will be good to go once these changes are made. To check, you can run the commands blogdown::build_site(build_rmd = 'newfile') then blogdown::serve_site() to render the site locally with your changes. You may need to set the working directory to tidyomicsBlog/blog for BlogDown to find the blog files.
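As a rough sketch, the local check described above could be wrapped in a small helper. The function name `preview_blog` and the default path are illustrative assumptions; adjust the path to wherever the repository is cloned.

```r
# Hypothetical helper: build new Rmd posts and preview the site locally.
# `blog_dir` is assumed to point at the blog folder inside the cloned repo.
preview_blog <- function(blog_dir = "tidyomicsBlog/blog") {
  # BlogDown looks for the site files in the working directory.
  setwd(blog_dir)

  # Render only Rmd files that do not yet have rendered output.
  blogdown::build_site(build_rmd = "newfile")

  # Serve the site locally so the new posts can be inspected in a browser.
  blogdown::serve_site()
}

# Usage (requires the blogdown package):
# preview_blog()
```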

Let me know if you have any questions or if I have misunderstood anything. Thank you for your work.

@paupaiz (Author) commented Dec 1, 2023

@william-hutchison Thank you! I will implement your feedback next week after finals 😅

@paupaiz (Author) commented Dec 31, 2023

@william-hutchison happy holidays! Could you please enable LFS in the repo (Settings > Options > Git LFS)? I get this error:

batch response: @paupaiz can not upload new objects to public fork paupaiz/tidyomicsBlog
error: failed to push some refs to 'https://github.com/paupaiz/tidyomicsBlog.git'

Git LFS is the only way to place the data in the post's directory.

@stemangiola (Collaborator)

Hello all,

We should be parsimonious with Rds object size and try to keep each object below the standard GitHub file size limit, because LFS is free only for a limited amount of space.

We should try to make the object as light as possible and use xz compression. If it is still too big, think about deleting some information while keeping the blog post interesting and information-rich.

Well done both. Let's try to solve this issue soon, and we will have the January post already!
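For reference, the xz suggestion can be tried with base R alone. This is a minimal sketch on a made-up matrix standing in for real data; actual savings depend on the object.

```r
# Illustrative object standing in for a real dataset (hypothetical data).
set.seed(1)
counts <- matrix(rpois(200 * 50, lambda = 2), nrow = 200, ncol = 50)

# Write the same object with the default gzip compression and with xz.
gz_path <- tempfile(fileext = ".rds")
xz_path <- tempfile(fileext = ".rds")
saveRDS(counts, gz_path)                   # default compression (gzip)
saveRDS(counts, xz_path, compress = "xz")  # often smaller, slower to write

# Compare on-disk sizes in bytes.
sizes <- file.size(c(gz_path, xz_path))
names(sizes) <- c("gzip", "xz")
print(sizes)

# readRDS() detects the compression type automatically.
restored <- readRDS(xz_path)
stopifnot(identical(restored, counts))
```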

@paupaiz (Author) commented Jan 30, 2024

Hi team, sorry for the delay. Handling and compressing data this large was new territory for me. I compressed the files for the PCA tutorial as you suggested. For the batch effects post, the data is publicly available on Latch (similar to AWS), so anyone can download it. Thank you!

@william-hutchison (Collaborator)

Hi @paupaiz, thanks for the update! I have a few more requests and then the blog posts should be good to go:

  • The data required by PCs_SingleCell.Rmd is now included in compressed form, but as far as I can tell it is not loaded or decompressed within the blog post's Rmd file. It should be possible to do this, but maybe an easier solution would be to create a Seurat object on your computer and subset the data as much as possible. Once the Seurat object is small enough, you can place it in the post directory as an Rds file which can then be loaded in the Rmd.
  • The MergedNML object required by BatchEffects.Rmd is not accessible to the Rmd file (although thank you for providing the download link). If the file is too large to load, maybe you could try subsetting the data as described above?
  • The posts are currently in the tidyomicsBlog/blog/content directory, but will need to be moved to the tidyomicsBlog/blog/content/post directory
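The subsetting approach in the first two points might look roughly like the sketch below. The helper name `shrink_seurat`, the 500-cell default, and the file names in the comments are illustrative assumptions, not part of the posts.

```r
library(Seurat)

# Hypothetical helper: keep a random subset of cells and save a compact Rds.
# `n_cells` and `out_path` are placeholders to adapt to the real data.
shrink_seurat <- function(seurat_obj, out_path, n_cells = 500) {
  set.seed(42)  # make the random subset reproducible
  keep <- sample(colnames(seurat_obj), min(n_cells, ncol(seurat_obj)))
  small_obj <- subset(seurat_obj, cells = keep)
  saveRDS(small_obj, out_path, compress = "xz")
  invisible(small_obj)
}

# On the author's machine (hypothetical file names):
# full_obj <- readRDS("full_object.rds")
# shrink_seurat(full_obj, "small_object.rds")

# Inside the post's Rmd, using a path relative to the post directory:
# small_obj <- readRDS("small_object.rds")
```

Because the Rds file sits next to the Rmd, the post can be rendered without any manual download step.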

Because we hope to automatically render the blog page every time someone adds a new blog post, everything each Rmd file needs to run will need to already be available without manual intervention. To check if the blog is able to render successfully, you can set your working directory to tidyomicsBlog/blog then run blogdown::build_site(build_rmd = 'newfile') and blogdown::serve_site(). You can see more information about the blog side of things here https://github.com/tidyomics/tidyomicsBlog/blob/master/CONTRIBUTING.md#how-can-i-add-a-post-to-the-blog

Let me know if you have any questions and thanks again for your work!

@stemangiola (Collaborator)

@william-hutchison, it would be good to add all these in the guidelines, so a new person that contributes knows pretty much what to do and not to do.

@william-hutchison (Collaborator)

@stemangiola good point. I have updated the blog contribution guide in #45.

@william-hutchison (Collaborator)

Hi @paupaiz, do you have any updates on your progress? The posts you wrote look great and we would love for them to be visible to others. Let me know if I can provide any assistance. Thanks again

@paupaiz (Author) commented Mar 2, 2024

@william-hutchison submitting revision today :) thanks for following up!

@paupaiz (Author) commented Mar 8, 2024

Hi @william-hutchison @stemangiola, I've implemented your feedback. Had to change the datasets so they are small now and in the correct paths. Thank you for your patience!

@william-hutchison (Collaborator)

Amazing, thank you @paupaiz! The posts are now rendering correctly and look great.

If I could request one final change, would you be able to update the code with a few examples from our tidy ecosystem where possible? A few possible changes could be:

  • Replace Seurat's subset to remove cells (e.g. high mitochondrial) with tidyseurat's filter.
  • Replace Seurat's VlnPlot, FeatureScatter and DimPlot with tidyseurat's ggplot.

You can find examples of this here: https://github.com/stemangiola/tidyseurat.
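To illustrate the kind of change requested, here is a hedged sketch of both replacements. It assumes the Seurat object already has a `percent.mt` metadata column (for example from `Seurat::PercentageFeatureSet()`); the helper names and the 5% cutoff are illustrative, not taken from the posts.

```r
library(Seurat)
library(dplyr)
library(ggplot2)
library(tidyseurat)

# Hypothetical helper: drop high-mitochondrial cells with dplyr::filter()
# instead of Seurat's subset(). Assumes a `percent.mt` metadata column.
filter_high_mito <- function(seurat_obj, max_percent_mt = 5) {
  seurat_obj |> filter(percent.mt < max_percent_mt)
}

# Hypothetical helper: a ggplot2 violin plot in place of VlnPlot().
# `feature` is the name of a metadata column such as "nFeature_RNA".
plot_qc_violin <- function(seurat_obj, feature = "nFeature_RNA") {
  seurat_obj |>
    ggplot(aes(x = orig.ident, y = .data[[feature]])) +
    geom_violin()
}
```

The same pattern should extend to scatter and embedding plots by swapping `geom_violin()` for `geom_point()` and mapping the relevant columns.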

I have also made a few minor changes:

  • Added you as author
  • Changed dates
  • Removed warnings and messages
  • Extracted bundled data
  • Removed seurat object save

Thanks again and let me know if I can help with anything.

@stemangiola (Collaborator)

Amazing! Can someone attach here the HTML of the vignette so I can give feedback?

Thanks!

@william-hutchison (Collaborator)

> Amazing! Can someone attach here the HTML of the vignette so I can give feedback?
>
> Thanks!

Sure, they are a little bit hard to read in this format without any styling, but hopefully this is enough for you to provide feedback.

new_posts.zip

Development

Successfully merging this pull request may close these issues.

Create a post in the tidyomics blog
3 participants