bio-century/Sequence_Analysis

Sequence Analysis (in Progress)

Table of Contents

Abstract

This collection of code examples aims at analysing (DNA) sequence data with Python. It started as a small private "learning by doing" project at a basic level; the focus is on gaining coding experience within a limited amount of free time, sharing ideas and getting inspiration rather than finding the most elegant coding solution. The underlying data can be obtained freely from many scientific bioinformatics databases (see my website www.bio-century.net for further information). More parts are to come...
Ideas and (fully executable) example code snippets are presented in the Jupyter Notebook (.ipynb) file format. Jupyter Notebook, or an equivalent extension in the IDE of your choice, is therefore required to run or modify them.

Gallery

The SequenceAnalysis.ipynb file contains examples of basic sequence analysis, e.g. sequence identification, classification of mutations (silent, missense and nonsense),

graphical representation of genes in sequences,

and color-coding of the different segments of a tRNA.
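To make the mutation classes mentioned above concrete, here is a minimal sketch of how a single-codon substitution could be classified. It is not taken from the notebook: the names `CODON_TABLE` and `classify_mutation` are hypothetical, and the sketch assumes DNA codons (T rather than U) translated with the standard genetic code.

```python
# Illustrative sketch only; names are hypothetical and not from SequenceAnalysis.ipynb.
# Standard genetic code, built from the usual TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def classify_mutation(ref_codon: str, mut_codon: str) -> str:
    """Classify a codon substitution as 'silent', 'missense' or 'nonsense'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    mut_aa = CODON_TABLE[mut_codon.upper()]
    if ref_aa == mut_aa:
        return "silent"      # same amino acid (or stop codon stays a stop codon)
    if mut_aa == "*":
        return "nonsense"    # a premature stop codon is introduced
    # Simplification: any other amino acid change, including loss of a stop
    # codon, is reported as missense in this sketch.
    return "missense"


print(classify_mutation("GAA", "GAG"))  # Glu -> Glu
print(classify_mutation("GAG", "GTG"))  # Glu -> Val
print(classify_mutation("GAA", "TAA"))  # Glu -> stop
```

The lookup table keeps the classification itself down to two comparisons; insertions, deletions and frameshifts are outside the scope of this sketch.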

Two highlights are the implementation and visualization of the Needleman-Wunsch algorithm for sequence alignment and the graphical user interface for showing multiple sequences of interest (SOIs) within a target sequence.
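For readers unfamiliar with Needleman-Wunsch, the following is a minimal sketch of the global alignment algorithm. It is not the notebook's implementation: the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration.

```python
# Illustrative sketch only; function name and scoring scheme are assumptions.
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming matrix; the first row and column hold
    # the cost of aligning a prefix against gaps only.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("ACGT", "AGT")
print(total)   # 2
print(top)     # ACGT
print(bottom)  # A-GT
```

When several alignments share the optimal score, this traceback returns only one of them (it prefers the diagonal step); a visualization like the one in the notebook could instead display the whole score matrix.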

Getting Started

All you need is a running Jupyter Notebook distribution of some sort and a Python installation that fulfils the requirements listed in the Requirements section. VS Code with its .ipynb extension is strongly recommended.

Inspirations & ToDo's

This is room for your ideas, which are very much appreciated! Please be patient regarding their implementation, since resources (time and personnel) are limited.

  • Progress in groundwork towards next-generation sequencing (NGS)
  • Next Idea 1
  • Next Idea 2
  • ...

Folder structure

Sequence Analysis Repo
|
|   LICENSE
|   README.md
|   SequenceAnalysis.html                                  html-transformed output of the .ipynb-file for representation purposes
|   SequenceAnalysis.ipynb                                 Main .ipynb-file explaining tasks and giving example code to solve them
|
+---ExternalPackages
|   |   TerminalColors.py                                   External package defining the colors used to print sequences in the Jupyter-Notebook-terminal
|   \---__pycache__ +++                                     (COLLAPSED): Auto-generated pycache
|
+---Figures +++                                             (COLLAPSED): Example images for clarification
|
+---Figures_scientific
|
+---Icons +++                                               (COLLAPSED): Icons / Logos of bio-century.net
|
+---ModulesExternal
|
+---ModulesOwn                                              Functions / methods developed for Sequence Analysis
|   |   A_Groundwork.py
|   |   B_SimpleTabbedGUI.py
|   |   D_KmerAnalysis.py
|   |
|   +---A_Groundwork_Data
|   +---B_SimpleTabbedGUI_Data
|   +---D_KmerAnalysis
|   \---__pycache__                                         (COLLAPSED): Auto-generated pycache
|
+---requirements
|   \---requirements.txt                                    List of the Python packages required (see section Requirements)
|
\---_themes +++                                             (COLLAPSED): Themes for the simple GUI to make the window look nicer

Requirements

Listed in ./requirements/requirements.txt

License

This work is published under the GPL-2.0 license.

Contributors & Acknowledgments

Many thanks to the comber.io admin for inspiration, code reviews and for initializing the bio-century.net website.

Sources

Sources are given directly in the respective code sections.

Contact

info@bio-century.net
