Dibby

Dibby is a Python toolkit that aims to determine the positions of disulphide bridges in a known protein from tandem mass spectrometry data. Dibby first matches in-silico generated fragments to the measured peaks, and then aggregates evidence from the matched fragments to determine the positions of disulphide bonds.
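
To make the matching step more concrete, here is a minimal sketch of how theoretical fragment m/z values might be paired with measured peaks within a relative tolerance. It is not Dibby's actual algorithm; the function name `match_peaks` and the 10 ppm default are assumptions made for illustration only.

```python
from bisect import bisect_left


def match_peaks(theoretical_mzs, observed_mzs, tolerance_ppm=10.0):
    """Pair each theoretical m/z with every observed m/z within tolerance_ppm.

    Hypothetical helper for illustration; Dibby's own matching is more involved.
    """
    observed = sorted(observed_mzs)
    matches = []
    for theo in theoretical_mzs:
        # Convert the relative (ppm) tolerance into an absolute m/z window.
        window = theo * tolerance_ppm / 1e6
        # Binary search for the first observed peak inside the window.
        i = bisect_left(observed, theo - window)
        while i < len(observed) and observed[i] <= theo + window:
            matches.append((theo, observed[i]))
            i += 1
    return matches
```

Sorting the observed peaks once and using a binary search keeps the lookup cheap even when many theoretical fragments are tested against one spectrum.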

Dibby's most interesting feature is its powerful fragment matching algorithm. It is able to identify even complicated multiply-linked fragments, or fragments with internal disulphide bonds.
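
As a rough illustration of why linked fragments need special handling, the sketch below computes the neutral mass of several peptides joined by disulphide bonds. It relies on the general chemical fact that forming one disulphide bond removes two hydrogen atoms; the function name `linked_mass` is hypothetical and is not taken from Dibby's source.

```python
# Monoisotopic mass of a hydrogen atom in daltons.
HYDROGEN_MASS = 1.00782503207


def linked_mass(peptide_masses, n_bonds):
    """Neutral mass of peptides joined by n_bonds disulphide bonds.

    Each bond forms from two thiol groups and removes two hydrogen atoms.
    Hypothetical helper for illustration only.
    """
    return sum(peptide_masses) - 2 * HYDROGEN_MASS * n_bonds
```

With `n_bonds` set to the number of bonds inside a single peptide, the same formula covers fragments with internal disulphide bonds.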

An example output for lysozyme digested with trypsin can be seen below. Three of the four bonds have been identified; only bidirectional green edges count as bond identifications.

Dibby started as a project for my bachelor thesis in the bioinformatics program at Charles University, Prague. Some details about Dibby can be found in chapters 2 and 3 of said thesis.

Preparing data for Dibby

We designed Dibby with the following experimental setup in mind, though in theory it should be adaptable to other setups as well:

LC-MS/MS, protease used for digestion can be configured
ESI
HCD fragmentation
high-accuracy Orbitrap analyser

The data from the mass spectrometer should be exported in the MGF format. We do not recommend using trypsin for digestion, due to the common issue of disulphide bond scrambling.
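
For readers unfamiliar with MGF, the following is a minimal sketch of reading such a file: each spectrum sits between `BEGIN IONS` and `END IONS`, with `KEY=value` parameter lines followed by whitespace-separated m/z and intensity pairs. The function `parse_mgf` is a hypothetical illustration, not the reader Dibby uses, and it ignores less common parts of the format.

```python
def parse_mgf(lines):
    """Parse MGF text lines into a list of spectra (hypothetical helper).

    Each spectrum is a dict with 'params' (str -> str) and
    'peaks' (a list of (mz, intensity) tuples).
    """
    spectra = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                # Peak line: m/z, intensity, and optionally a charge column.
                mz, intensity = line.split()[:2]
                current["peaks"].append((float(mz), float(intensity)))
    return spectra
```

Because the function accepts any iterable of lines, it can be given an open file object directly, for example `parse_mgf(open(path))` for a path of your choosing.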

Running the analysis

The analysis has three steps: matching precursors, matching fragments, and producing the visualization. They need to be done in this order; the first two stages produce .pickle files to cache the results.
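
The caching idea behind the .pickle files can be sketched as follows: a stage stores its result on disk, and a later run or stage loads it instead of recomputing it. This is only a generic illustration of the pattern; the helper `load_or_compute` is hypothetical, and Dibby's own file naming and contents may differ.

```python
import pickle
from pathlib import Path


def load_or_compute(cache_path, compute):
    """Return the cached result at cache_path, computing and saving it if absent.

    Hypothetical helper; `compute` is any zero-argument callable.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    # Create the cache folder on first use, then store the result.
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump(result, f)
    return result
```

Note that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.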

To perform the analysis, follow these steps:

Run src/precursor_matching.py from the command line. You do not pass the paths to the FASTA file of the analyzed protein and to the MGF file with the measured data directly; instead, you specify the name of the protein. The data will then be loaded automatically, based on that name, from data/fasta/___ and data/mgf/___ respectively, so do not forget to supply these files. For other parameters, pass --help to the script, or check the source code.
Run src/fragment_matching.py. For the parameters, pass --help to the script, or check the source code. The script will automatically look for the pickle file generated by the previous stage, based on the passed parameters.
Run src/visualize_bonds.py. Make sure a folder named out/plots exists at the root of the project; the output plots will be saved there.
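
Before running the steps above, a small pre-flight check can save a failed run. The sketch below creates the out/plots folder if needed and reports which input files are missing. The helper `prepare_project` is hypothetical and not part of Dibby; the input paths are passed in by the caller, since the exact file names depend on your protein.

```python
from pathlib import Path


def prepare_project(root, input_files):
    """Create out/plots under root and return the input files that are missing.

    Hypothetical helper; input_files are paths relative to the project root.
    """
    root = Path(root)
    missing = [str(p) for p in input_files if not (root / p).is_file()]
    # The visualization step saves its plots here, so make sure it exists.
    (root / "out" / "plots").mkdir(parents=True, exist_ok=True)
    return missing
```

An empty list returned from the helper means every listed input was found.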

This workflow is a work in progress, and will change as a part of a bigger rewrite of Dibby. We do not recommend using Dibby in production yet.

The near future

We will further research the viability of this approach to disulphide bond mapping. Should it prove useful, we will rewrite Dibby in a more performant language, and redesign the whole analysis workflow during the transition. We also hope to provide a better and more transparent scoring system for the fragment matches, based on probabilistic scores.
