SIG: Bioconductor Infrastructure for Base Modifications #35
I am a new PhD student at the Walter and Eliza Hall Institute in Melbourne, Australia. My project centres on methods and tools for analysing DNA methylation in long reads from Oxford Nanopore sequencers. My formal background is in statistics, but I mainly work on developing software and have a keen interest in efficient, user-friendly computational methods and visualisation.
Researchers who are interested in base modifications of all kinds. I am interested in DNA, but the structure developed should equally support RNA modifications.
Should it be held during Developer Day
Description of the topic
(Will update this section after I do some more research and take suggestions)
I think there are things to keep in mind for this:
As far as I'm aware, there is no specialised, widely supported Bioconductor structure for storing base modification information that also makes common queries straightforward. The basic query would be the methylation proportion in a specific region; objects should carry metadata that separates groups so the query can be made per group, and should report coverage at each locus. It would also be useful to query within-read methylation patterns, to inspect correlation between methylation sites within molecules. Compactness of representation will also be important, so sparse or on-disk representations are worth considering; features and query performance probably take second place to storage size.
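To make the queries above concrete, here is a minimal, language-agnostic sketch written in Python. It is not an existing Bioconductor class or API: the `ModCall` record, the `group` label, and the functions `region_summary` and `read_patterns` are all hypothetical names invented for illustration, and the toy data are made up. It assumes one record per read per candidate site and ignores compact storage entirely.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModCall:
    """One modification call on one read (hypothetical record layout)."""
    read_id: str
    group: str      # assumed sample/group label stored as metadata
    chrom: str
    pos: int        # 1-based position of the candidate base
    modified: bool  # True if the base was called as modified


def region_summary(calls, chrom, start, end):
    """Per-group modification proportion and per-locus coverage in [start, end]."""
    counts = defaultdict(lambda: [0, 0])              # group -> [modified, total]
    coverage = defaultdict(lambda: defaultdict(int))  # group -> pos -> depth
    for c in calls:
        if c.chrom == chrom and start <= c.pos <= end:
            counts[c.group][0] += c.modified
            counts[c.group][1] += 1
            coverage[c.group][c.pos] += 1
    return {
        g: {"proportion": m / n, "coverage": dict(coverage[g])}
        for g, (m, n) in counts.items()
    }


def read_patterns(calls, chrom, start, end):
    """Within-read pattern in [start, end]: read_id -> {pos: modified}."""
    patterns = defaultdict(dict)
    for c in calls:
        if c.chrom == chrom and start <= c.pos <= end:
            patterns[c.read_id][c.pos] = c.modified
    return dict(patterns)


# Toy data, invented for illustration only.
calls = [
    ModCall("read1", "tumour", "chr1", 100, True),
    ModCall("read1", "tumour", "chr1", 120, True),
    ModCall("read2", "tumour", "chr1", 100, False),
    ModCall("read3", "normal", "chr1", 100, False),
    ModCall("read3", "normal", "chr1", 120, True),
    ModCall("read3", "normal", "chr1", 500, True),
]

summary = region_summary(calls, "chr1", 90, 130)
patterns = read_patterns(calls, "chr1", 90, 130)
```

Keeping calls at per-read resolution is what allows the within-read query; a structure that stores only per-locus counts would answer the region query more compactly but could not recover patterns within molecules.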
I'd like to establish a set of queries of interest and a general abstract idea of what data structure(s) might be appropriate.
From the discussion it sounds like
For a general description of modifications, if we want to look toward other projects for consistency, the NCBI C++ toolkit has an analogous
For a dictionary of DNA or RNA nucleotide modifications have a look at the
Here's my notes from the day (copied below):
RNAmodR is basically structured like this:
General ideas and experiences: