base_shift_and_eventalignment.md


Base shift calculation

  • The user can shift the base sequence to the left by n bases by passing the argument --base_shift -n to the plot and plot_pileup commands.
  • This helps align each signal level with the base that contributes most to it. A negative value shifts the base sequence to the left.
  • The base shift concept was motivated by the idea, presented in the pore_model document, of the most contributing base to the current level.
  • A base_shift that aligns the signal with the base (color) can be calculated programmatically. The calculation is implemented here.
  • However, the user is advised to use --profile (documented here), which sets --base_shift automatically.
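As a rough illustration of how a base shift could be derived from a pore model, the sketch below picks the k-mer position whose base identity changes the mean current level the most and negates its index. This is a minimal sketch under stated assumptions, not the calculation that squigualiser implements: the function names, the "spread of per-base means" criterion, and the toy 3-mer model are all hypothetical.

```python
from collections import defaultdict
from itertools import product
from statistics import mean


def most_contributing_index(model):
    """Return the 0-based k-mer position whose base changes the level most.

    `model` maps each k-mer string to its mean current level.
    The criterion (spread between per-base mean levels) is an assumption
    made for this sketch.
    """
    k = len(next(iter(model)))
    spreads = []
    for pos in range(k):
        # Group the model levels by the base found at this position.
        groups = defaultdict(list)
        for kmer, level in model.items():
            groups[kmer[pos]].append(level)
        means = [mean(levels) for levels in groups.values()]
        spreads.append(max(means) - min(means))
    return max(range(k), key=spreads.__getitem__)


def estimate_base_shift(model):
    """Base shift is taken as the negated index of the most contributing base."""
    return -most_contributing_index(model)


# Hypothetical toy 3-mer model in which the middle base dominates the level.
weight = {"A": 0, "C": 1, "G": 2, "T": 3}
toy_model = {
    "".join(kmer): 1 * weight[kmer[0]] + 10 * weight[kmer[1]] + 2 * weight[kmer[2]]
    for kmer in product("ACGT", repeat=3)
}

print(estimate_base_shift(toy_model))  # -1
```

With a real pore model, the same idea would be applied to the published k-mer table; in practice --profile avoids the need to do this by hand.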

Example 1

  • At site A21 there is a heterozygous variant A/T.

  • The first pileup (Fig. 1) has an incorrect base shift (-5) and the second pileup (Fig. 2) has the correct base shift (-6).

  • Hence, in the second pileup (Fig. 2) the variant (blue box, bed annotation) and the signal differences are aligned with each other.

  • If the base shift were any value other than -6, the difference in the signals would not align with the variant base.

  • This variant example is documented here.

image

Figure 1

image

Figure 2

Example 2

  • Fig. 3 and 4 have the same dna_r10.4.1_e8.2_400bps signal pileup with a base shift of 0 and -6 respectively.
  • The base colors in Fig. 3 do not align well with the signal.
  • In Fig. 4, however, the signal moves from low to high wherever a T base is encountered.
  • These signal pileups were generated using f5c eventalign.
  • f5c used the dna_r10.4.1_e8.2_400bps pore model to align these signals. The most contributing base index of this model is -6, and hence the appropriate base shift in this scenario is also -6.

image

Figure 3: base shift 0 link

image

Figure 4: base shift -6 link
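To make the effect of the shift in Fig. 3 and 4 concrete, the following sketch shows which base label would sit under each signal position for a given base shift. It is only an illustration of "shifting the base sequence to the left", assuming one label per aligned position; the function name `shifted_labels` and the padding character are hypothetical and are not part of squigualiser.

```python
def shifted_labels(bases: str, base_shift: int, pad: str = "-") -> str:
    """Return the base labels displayed at each position after the shift.

    A negative base_shift moves the sequence to the left, so position i
    shows the base originally at i - base_shift. Positions left without
    a base are filled with `pad`.
    """
    labels = []
    for i in range(len(bases)):
        j = i - base_shift
        labels.append(bases[j] if 0 <= j < len(bases) else pad)
    return "".join(labels)


print(shifted_labels("ACGTAC", 0))   # ACGTAC
print(shifted_labels("ACGTAC", -2))  # GTAC--
```

With a shift of -6, as in Fig. 4, each signal segment is labelled with the base six positions to its right in the unshifted sequence.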

Example 3

  • Consider forward and reverse mapped dna_r10.4.1_e8.2_400bps pileups for the same genomic region.
  • Fig. 5 has a base shift of 0 for both tracks.
  • In Fig. 6, note that the reverse mapped pileup has a base shift of -2, while the forward mapped pileup has -6. This is because, for reverse mapped reads, the signal sequencing direction is from right to left (more information).
  • In Fig. 6, both pileups have the signals going from low to high wherever a T base is encountered.

image

Figure 5: base shift 0, 0 link

image

Figure 6: base shift -6, -2 link

Example 4

Fig. 7 shows the forward and reverse mapped pileups generated using f5c eventalign for dna_r9.4.1_450bps data. F5c used the 6mer model (more information).

image

Figure 7: base shift -2, -3
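The forward and reverse shift pairs in Examples 3 and 4 (-6 and -2, then -2 and -3) are both consistent with mirroring the most contributing base within the k-mer. The sketch below encodes that relation. It is an observation drawn from the values in this document, not squigualiser's code; the function name is hypothetical, and the k-mer length of 9 for dna_r10.4.1_e8.2_400bps is an assumption (the 6-mer for dna_r9.4.1_450bps is stated in Example 4).

```python
def reverse_base_shift(kmer_length: int, forward_shift: int) -> int:
    """Mirror a forward-strand base shift onto the reverse strand.

    If the most contributing base sits at index i (0-based) of a k-mer read
    left to right, it sits at index k - 1 - i when the k-mer is read right
    to left. With shift = -index, that gives -(k - 1) - forward_shift.
    """
    if not -(kmer_length - 1) <= forward_shift <= 0:
        raise ValueError("forward_shift must lie between -(k - 1) and 0")
    return -(kmer_length - 1) - forward_shift


print(reverse_base_shift(9, -6))  # -2, matches Fig. 6 (assumed 9-mer model)
print(reverse_base_shift(6, -2))  # -3, matches Fig. 7 (6-mer model)
```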

Example 5

  • Fig. 8 shows the forward mapped pileups generated using f5c eventalign for rna_r9.4.1_70bps data. F5c used the rna 5mer model (more information).
  • Note that squigualiser always plots RNA reads in their correct sequencing direction. Reverse mapped RNA reads are skipped; such reads exist if a genome was used as the reference instead of a transcriptome.

image

Figure 8: base shift -3