Proposed contribution task for Outreachy applicants: Link xenTro10 (UCSC genome) to UCB_Xtro_10.0 (NCBI assembly) #48

Closed
hpages opened this issue Oct 14, 2022 · 8 comments

Comments

@hpages

hpages commented Oct 14, 2022

This task depends on issues #46 and #47 being completed first (i.e. PRs accepted and merged, and issues closed). Although it's not a requirement that the 3 tasks be completed by the same applicant, it will be a more interesting learning experience if they are.

The purpose of "linking" a UCSC genome to the NCBI assembly that it is based on, is to support the map.NCBI argument of the getChromInfoFromUCSC() function. Try getChromInfoFromUCSC("hg19", map.NCBI=TRUE). See what happens? Now try getChromInfoFromUCSC("xenTro10", map.NCBI=TRUE). See what the problem is? Check the documentation of the map.NCBI argument in ?getChromInfoFromUCSC to learn more about what this argument does.
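The comparison above can be run as a short script. This is only a sketch: `try_map_ncbi` is a hypothetical helper (not part of GenomeInfoDb) that turns an error into a message, so both calls can sit in the same script. It needs the package installed and internet access.

```r
# Hypothetical helper: call getChromInfoFromUCSC() with map.NCBI=TRUE and
# return NULL (after printing the error message) instead of stopping.
try_map_ncbi <- function(genome) {
    if (!requireNamespace("GenomeInfoDb", quietly = TRUE)) {
        message("GenomeInfoDb is not installed")
        return(NULL)
    }
    tryCatch(
        GenomeInfoDb::getChromInfoFromUCSC(genome, map.NCBI = TRUE),
        error = function(e) {
            message(genome, ": ", conditionMessage(e))
            NULL
        }
    )
}

hg19_info <- try_map_ncbi("hg19")      # a genome that is already linked
xt10_info <- try_map_ncbi("xenTro10")  # NULL as long as the genome is not linked

# Compare the columns of the two results (NULL means the call failed).
if (!is.null(hg19_info)) print(colnames(hg19_info))
if (!is.null(xt10_info)) print(colnames(xt10_info))
```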

Linking a UCSC genome to its NCBI assembly is done by defining an NCBI_LINKER object in the registration file for the UCSC genome (xenTro10.R in this case). There's some very succinct information about what NCBI_LINKER should look like in the README.TXT file located in GenomeInfoDb/inst/registered/UCSC_genomes/. Don't hesitate to look at other registration files to see examples of how NCBI_LINKER is defined.
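For orientation only, an NCBI_LINKER definition might look roughly like the sketch below. The component names shown here (`assembly_accession`, and `special_mappings` in the comment) are assumptions to be verified against README.TXT and the existing registration files, and the accession string is a deliberate placeholder.

```r
# Sketch only: component names are assumptions to verify against README.TXT
# and the other registration files in UCSC_genomes/.
NCBI_LINKER <- list(
    # Placeholder: replace with the accession of the UCB_Xtro_10.0 assembly.
    assembly_accession = "<accession of UCB_Xtro_10.0>"
    # Further components (e.g. special_mappings, for sequences whose UCSC and
    # NCBI names cannot be matched automatically) may be needed.
)
```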

IMPORTANT NOTES TO OUTREACHY APPLICANTS:

  • Make sure to complete all the Preliminary tasks listed here before you start working on this task. In particular, make sure that you have R 4.2 and that you are set up to use the devel version of Bioconductor (currently 3.16).
  • Only one applicant can work on this task. If you choose to work on this task, please make sure to assign yourself so other applicants know that the task is already being worked on. If later on you change your mind, please unassign yourself. It's ok to change your mind!
  • To work on this task, please fork the GenomeInfoDb repository. Then do your work on that fork.
  • Always test your changes before you commit them to your fork. This consists of installing the modified package, starting R, loading the package, and playing around with the new functionality. This process is called "ad hoc manual testing". Once everything behaves and looks as expected, run R CMD build and R CMD check on the package. Note that R CMD check should always be run on the source tarball produced by R CMD build.
  • R CMD check might produce some NOTEs and even some WARNINGs. These are ok if they existed before your changes. You can check that by taking a look at the daily report produced by our automated builds here: https://bioconductor.org/checkResults/devel/bioc-LATEST/ Make sure to not introduce new NOTEs or WARNINGs!
  • Once your work is ready to be merged, please submit a PR (Pull Request).
  • Remember to record your contribution on Outreachy at https://www.outreachy.org/outreachy-december-2022-internship-round/communities/bioconductor/refactor-the-bsgenomeforge-tools/contributions/.
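The install/build/check sequence described in the notes above can be driven from R itself, as in this minimal sketch. It assumes your clone of the fork lives in a directory named `GenomeInfoDb` under the current working directory (`pkg_dir` is an assumption, adjust as needed); running the same `R CMD` commands directly in a terminal is equivalent.

```r
r_bin   <- file.path(R.home("bin"), "R")
pkg_dir <- "GenomeInfoDb"   # assumed: path to your local clone of the fork

if (dir.exists(pkg_dir)) {
    # 1. Install the modified package, then start a fresh R session and
    #    load it for ad hoc manual testing.
    system2(r_bin, c("CMD", "INSTALL", pkg_dir))

    # 2. Build the source tarball, then check the tarball (not the directory).
    system2(r_bin, c("CMD", "build", pkg_dir))
    tarball <- list.files(pattern = "^GenomeInfoDb_.*\\.tar\\.gz$")
    if (length(tarball) > 0) {
        system2(r_bin, c("CMD", "check", tail(tarball, 1)))
    }
} else {
    message("No directory named ", pkg_dir, " here; adjust pkg_dir")
}
```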
@hpages hpages changed the title from "Proposed contribution tasks for Outreachy applicants: Link xenTro10 (UCSC genome) to UCB_Xtro_10.0 (NCBI assembly)" to "Proposed contribution task for Outreachy applicants: Link xenTro10 (UCSC genome) to UCB_Xtro_10.0 (NCBI assembly)" on Oct 17, 2022
@Simplecodez

Good day sir, please assign me to this task.

@hpages

hpages commented Oct 27, 2022

@Simplecodez Don't forget to record your contribution #46 on Outreachy!

@Simplecodez

Okay sir. Thank you

@Simplecodez

Sir, just created a PR for this task. I will be anticipating your feedback.
Thank you.

@hpages

hpages commented Oct 28, 2022

All right @Simplecodez , I merged that (PR #74). Thank you!

Did you try to do getChromInfoFromUCSC("xenTro10", map.NCBI=TRUE)? Does it look like it's working properly?

You can also try getChromInfoFromUCSC("xenTro10", map.NCBI=TRUE, assembled.molecules.only=TRUE) to produce a smaller output. It will be easier to inspect visually.
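One way to inspect the smaller output is sketched below. It assumes the columns bound from NCBI carry an `NCBI.` prefix in their names (worth confirming with `colnames()` on your own result), and it requires the package and internet access; on any failure the result is simply `NULL`.

```r
# Fetch the mapped chrom info for assembled molecules only, or NULL on failure.
xt10_small <- if (requireNamespace("GenomeInfoDb", quietly = TRUE)) {
    tryCatch(
        GenomeInfoDb::getChromInfoFromUCSC("xenTro10", map.NCBI = TRUE,
                                           assembled.molecules.only = TRUE),
        error = function(e) {
            message(conditionMessage(e))
            NULL
        }
    )
} else NULL

if (!is.null(xt10_small)) {
    # Assumption: NCBI-derived columns are named with an "NCBI." prefix.
    ncbi_cols <- grep("^NCBI\\.", colnames(xt10_small), value = TRUE)
    print(ncbi_cols)
    if (length(ncbi_cols) > 0) {
        # Rows where every NCBI column is NA were not matched to an NCBI sequence.
        unmapped <- rowSums(!is.na(xt10_small[, ncbi_cols, drop = FALSE])) == 0
        print(xt10_small[unmapped, , drop = FALSE])
    }
}
```

An empty data frame from the last `print()` would mean every assembled molecule was matched to an NCBI sequence.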

@Simplecodez

Simplecodez commented Oct 28, 2022

> All right @Simplecodez , I merged that (PR #74). Thank you!

Thank you sir.

> Did you try to do getChromInfoFromUCSC("xenTro10", map.NCBI=TRUE)? Does it look like it's working properly?

Yes sir, I tried that and it returned the expected result

> You can also try getChromInfoFromUCSC("xenTro10", map.NCBI=TRUE, assembled.molecules.only=TRUE) to produce a smaller output. It will be easier to inspect visually.

I will try this now

@Simplecodez

> All right @Simplecodez , I merged that (PR #74). Thank you!
>
> Did you try to do getChromInfoFromUCSC("xenTro10", map.NCBI=TRUE)? Does it look like it's working properly?
>
> You can also try getChromInfoFromUCSC("xenTro10", map.NCBI=TRUE, assembled.molecules.only=TRUE) to produce a smaller output. It will be easier to inspect visually.

I just tried this now and everything works well.
Thank you.

@hpages

hpages commented Oct 28, 2022

Perfect ✔️

Next task in your group is issue #56. Whenever you are ready, go there and ask to be assigned.

Also don't forget to record your contributions on Outreachy at https://www.outreachy.org/outreachy-december-2022-internship-round/communities/bioconductor/refactor-the-bsgenomeforge-tools/contributions/

@hpages hpages closed this as completed Oct 28, 2022