Scripts for plotting nucleotide imbalance from DNA sequences

The scripts

  • nucleotide_difference_imbalance_plot_stylized_like_Figure_8_of_Morrill_et_al_2016.py

FASTA sequence for a region of a chromosome/contig/scaffold --> plot of G vs. C and A vs. T imbalance

See Figure 8, panel B of Morrill et al. 2016 (PMID: 27026700).
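
To make the idea of a G vs. C and A vs. T imbalance concrete, here is a minimal sketch that counts, per stretch of sequence, how many more of one base there are than the other. It is not the script's actual implementation: the function name `windowed_difference`, the `window` parameter, the use of non-overlapping windows, and the example sequence are all assumptions made for illustration.

```python
def windowed_difference(seq, base_a, base_b, window):
    """Return count(base_a) - count(base_b) for each non-overlapping window.

    Hypothetical helper for illustration only; the real script may
    compute and scale the imbalance differently.
    """
    seq = seq.upper()
    diffs = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        diffs.append(chunk.count(base_a) - chunk.count(base_b))
    return diffs


example = "GGGCATTTAACCGG"  # made-up sequence, not real data
g_vs_c = windowed_difference(example, "G", "C", window=7)
a_vs_t = windowed_difference(example, "A", "T", window=7)
print(g_vs_c)  # [2, 0]
print(a_vs_t)  # [-1, 1]
```

Positive values mean the first base outnumbers the second in that window; negative values mean the reverse.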

A demonstration of this script is available in my set of sequence-analysis demonstration notebooks here. To run it actively, launch a binder session by clicking on one of the launch binder badges here and select the link to 'Demo of script to plot nt imbalance for sequence span' from the list of notebooks. The notebook can also be viewed statically, nicely displayed, here.

Example output:

example_imbalance_plot

  • two_nucleotides_in_proximity_difference_imbalance_plot.py

FASTA sequence for a region of a chromosome/contig/scaffold --> plot of imbalance of the sum of two bases vs the sum of the other two

This is like nucleotide_difference_imbalance_plot_stylized_like_Figure_8_of_Morrill_et_al_2016.py; however, it lets you specify any combination of two bases to see whether the amount of those in proximity is out of balance relative to the sum of the other two bases. For example, G and C vs. A and T.

Usage is similar to nucleotide_difference_imbalance_plot_stylized_like_Figure_8_of_Morrill_et_al_2016.py, so look over the demo notebook for that script first. Then, to use two_nucleotides_in_proximity_difference_imbalance_plot.py, add an argument naming the two nucleotides to group, such as GC, when calling it on the command line. Alternatively, specify dibase_text1, such as dibase_text1="GC", when calling the main function of the script in a Jupyter notebook cell or in IPython. You only need to specify one pair; the script or function deduces the other two bases. Example invocations of the script or main function:

  • from command line:
python two_nucleotides_in_proximity_difference_imbalance_plot.py sequence.fa 20000 GC
  • from importing main function when in Jupyter or IPython:
%matplotlib inline
from two_nucleotides_in_proximity_difference_imbalance_plot import two_nucleotides_in_proximity_difference_imbalance_plot
two_nucleotides_in_proximity_difference_imbalance_plot("sequence.fa", 20000, dibase_text1="GC", return_plot=True);
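
As a rough illustration of how the other two bases can be deduced from the pair you supply, and how the two sums can then be compared, here is a small sketch. The helper names `other_two_bases` and `pair_difference` are hypothetical and are not taken from the script; the sketch also assumes a plain A/C/G/T alphabet and compares counts over a whole string rather than over a region of a FASTA file.

```python
def other_two_bases(pair):
    """Given two distinct bases such as "GC", return the remaining two ("AT").

    Hypothetical helper; the script's own logic may differ.
    """
    pair = pair.upper()
    if len(pair) != 2 or len(set(pair)) != 2 or not set(pair) <= set("ACGT"):
        raise ValueError("expected two distinct bases from A, C, G, T")
    return "".join(base for base in "ACGT" if base not in pair)


def pair_difference(seq, pair):
    """Return (count of the two given bases) - (count of the other two)."""
    seq = seq.upper()
    rest = other_two_bases(pair)
    in_pair = sum(seq.count(base) for base in pair.upper())
    in_rest = sum(seq.count(base) for base in rest)
    return in_pair - in_rest


print(other_two_bases("GC"))            # AT
print(pair_difference("GGCCAT", "GC"))  # 2
```

Validating the pair up front means a malformed argument, such as a repeated base, fails early instead of producing a misleading count.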

Related items created by myself

?

Related items by others

?