Exploration, use and implementation of Python and R tools for DNA sequencing. (Course : Analyse de Séquences)

gmagannaDevelop/BioGenesis

BioGenesis

This README will contain at least a section in English (main) and another in French.

Disclaimer

This project is of academic scope and individually maintained. It may not follow best practices or suit your particular purpose. Use it at your own risk.

If it is helpful in any way, please let me know! If you would like to contribute, hit me up (PRs welcome!).

So, what does it do, exactly?

A lot of things? None of them?

This project has changed radically. At first I thought it would be a good idea to explore the tools related to my introductory "Analyse de Séquences" course. Afterwards I found out that other courses of my first semester were closely related (what a surprise, huh? Bioinformatics lectures are related? Who would have thought!), so I decided to use it as a centralised repository for various things, all bioinformatics-related.

I should probably remove or modify the table of contents, given that it is no longer accurate. Note that each course will probably have its own branch. This seems to be the most sensible approach for now.

English

Exploration, use and implementation of Python and R tools for DNA sequencing. (Course : Analyse de Séquences)

Main components

The repository contains both R and Python code. These are the package / virtual environment managers that I use, along with their respective config files. I am also experimentally adding support for Julia.

  • Python is managed via poetry.

    • pyproject.toml
  • R is managed via renv.

    • renv.lock (json format)
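Since renv.lock is plain JSON, it can be inspected from Python without R installed. The sketch below is only an illustration: the lock content, package names and versions are hypothetical placeholders and not the actual contents of this repository's renv.lock. It assumes the usual renv layout, with a top-level "Packages" object whose entries carry a "Version" field.

```python
import json

# Hypothetical, abbreviated renv.lock content (placeholder packages and versions).
EXAMPLE_LOCK = """
{
  "R": {"Version": "4.0.3"},
  "Packages": {
    "ape": {"Package": "ape", "Version": "5.4-1", "Source": "Repository"},
    "seqinr": {"Package": "seqinr", "Version": "4.2-5", "Source": "Repository"}
  }
}
"""


def pinned_packages(lock_text):
    """Return {package name: pinned version} from the text of an renv.lock file."""
    lock = json.loads(lock_text)
    # A lock file without a "Packages" section yields an empty mapping.
    return {name: record["Version"] for name, record in lock.get("Packages", {}).items()}


if __name__ == "__main__":
    for name, version in sorted(pinned_packages(EXAMPLE_LOCK).items()):
        print(f"{name}=={version}")
```

To run it against a real lock file, read the file's text and pass it to `pinned_packages` instead of `EXAMPLE_LOCK`.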

I chose these tools because they enable many things which I consider desirable in a project:

  • Reproducibility: unlike conda, poetry virtual environments are reproducible.

  • Isolation: No more cluttering your global Python/R libraries.
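A quick way to check the isolation point from inside Python is to compare the interpreter's prefixes. This is a minimal sketch, assuming the environment was created by the standard venv module or a recent virtualenv; the function name is my own, and conda environments are not detected this way.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter lives inside a virtual environment.

    Inside a venv/virtualenv, sys.prefix points at the environment directory,
    while sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"Isolated environment: {sys.prefix}")
    else:
        print("Running on the global interpreter; installs would go to global libraries.")
```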

Table of Contents

  • Introduction genomes / sequence databases
  • Sequence Alignment
  • Motif Lookup
  • Phylogenetic trees
  • Annotation

Extra notes

If you plan on accessing any service provided online by the NIH (like BLAST), you should consider getting an API key. You can find more info in the NCBI Insights post "New API Keys for the E-utilities".
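As a rough illustration of where the key goes, the E-utilities accept it as an `api_key` query parameter. The sketch below only builds an ESearch URL and sends nothing over the network; the helper name, the `NCBI_API_KEY` environment variable and the search term are assumptions made for the example, not part of this repository.

```python
import os
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, api_key=None):
    """Build an ESearch URL, adding the api_key parameter only when a key is given."""
    params = {"db": db, "term": term}
    if api_key:
        params["api_key"] = api_key
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical environment variable, so the key never ends up in version control.
    key = os.environ.get("NCBI_API_KEY")
    print(esearch_url("nucleotide", "BRCA1[Gene]", api_key=key))
```

Reading the key from the environment rather than hard-coding it keeps it out of the repository history.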

Français

Exploration, use and development of tools in the Python and R programming languages for sequence analysis. (Course: M1 Analyse de Séquences)

Main components

This repository contains R and Python code. I use two package / virtual environment managers, one for each language. They are listed below along with their main configuration files (that is, if you have the managers installed and you have the files, you will be able to recreate the environments needed to run the code in this repository).

  • Python is managed via poetry.

    • pyproject.toml
  • R is managed via renv.

    • renv.lock (JSON format)

I chose them over other tools because they have features that I consider desirable in a project, notably:

  • Reproducibility: poetry virtual environments are reproducible, unlike those of conda.

  • Isolation: No cluttering of your global libraries (system or user level). Each project is contained in a separate folder.
