Controlling granularity #4

Open
egoltsman opened this issue Oct 9, 2020 · 4 comments
Comments

@egoltsman

Hi Erik,
I'm using edyeet to induce a graph (with seqwish) on a small set of sequences that contain mostly large indels (~4-6 kb). In this case edyeet seems to be trying too hard to do base-level alignment where it should have either terminated or opened a large gap. In the first case below, there is a 5 kb inverted duplication at pos 7544324 on Accn1 (I know because it was synthetically introduced), but the aligner attempts to extend the alignment past the breakpoint following the initial ~50 kb match.
Similarly, in the second case a 5 kb inversion occurs at pos 7,573,027, but instead of terminating the alignment, edyeet pushes through a region of virtually no identity. This leads to tiny graph segments and structures that later get called as bogus variants. I tried raising the -p cutoff to 95%, but then the entire 50 kb block containing the inversion is not reported, so the cutoff seems to apply across the entire block. Is there anything else you could suggest tweaking that works at a local level, something like the gap extension vs. mismatch penalty in Smith-Waterman?
Thanks!

Accn1   75071545        7500000 7550000 +       Accn2   75021975        7490030 7539604 49550   50000   23      id:f:0.99538    ma:i:49550      mm:i:13 ni:i:423        nd:i:11 ns:i:14 ed:i:461        al:i:50011      se:f:0.00921797 cg:Z:44326=10D4998=5I2=7I1=7I1=6I1=1I2=3I1=1I2=1I2=1I2=1X1=1X1I1=1I2=1X1=11I1=8I2=2I1=3I1=7I1=7I1=1I1=2I1=4I1=2I1=1I1=1I1=3I1=3I2=3I1=11I1=1I1=3I1=2I1=2I2=5I1=6I1=9I1=3I1=1I1=1I1=9I1=3I1=6I2=2I2=3I1=1I1=4I3=4I2=3I1=1I1=1I2=2I1=1I3=1I1=1D2=1I1=2I1=3I1=1X2=3I3=3I1=2I4=5I2=1I1=1I1=1I3=4I2=5I2=2I1=2I1=2I3=5I1=5I1=1I2=3I3=13I1=1I2=4I1=3I3=6I1=4I2=2I2=2I1=3I2=5I2=1I1=1I1=2I1=5I5=7I1=1I1=2I1=1I1=1I2=1X5I1=1I1=2I1=2I2=5I2=3I2=7I1=1I3=1I2=13I5=1X2=2I1=2I2=3I3=1X1=2I3=1X3I1=2I2=1I3=3I2=3I1=2I4=3I1=1I2=2I1=3I1=3I2=2I2=1X7I1=1I2=1I1=1I1=6I1=1I1=5I1=5I1=3I1=5I1=1X1=1I1=1I1=2I5=1X2I1=1I2=1I1=2X5I1=1I1=1I1=1I2=1I2=1I1=1I2=14I
Accn1   75071545        7550000 7600000 +       Accn2   75021975        7545041 7594371 47271   50000   14      id:f:0.958261   ma:i:47271      mm:i:1450       ni:i:609        nd:i:609        ns:i:670        ed:i:3338       al:i:50609      se:f:0.0659566cg:Z:23026=1X2D1=1X1=1D1=1I1=2D1=1X1D2=1X1=1D1=1D1=1D1=1X1D3=1X2=1D1=1X1D1=1D1=1X3=1X1D1=1X1D4=1X1=2I3=2D1=1D4=1I3=2X4=4X1=2I1=1I1=1X2=1X2=1I1=1X2=1I2=1I1=1I1=3X1=1X1=2I2=2X2=1X3=3X1D3=2X1=1X1I1=2X1=2X1=1X2I1=2X6=1X1=1X2=1D3=1I1=1X2I1=3I5=3X2=1X1D2=1X2=1X1=1X1D1=1D2=1I1=1I4=1X1=1D1=1D2=2I1=3X1I2=1I1=1D1=1X1=1D1=1X1I1=1D2=1X1D1=1I2=2X4=1X3=2X1I4=2X2I3=2I3=1X1=1X1D2=3X1=1D1=1X1I2=1D2=1I2=1D1=1D1=1I1=1X1=1X1=2D1=1D2=2D6=1X3D1=1X1=2X2=1I1=1X1=1I3=1X1=1X2=1X2D1=2D1=1D3=1X1=1D2=2X1D1=1X1=1I5=1I2=3X3=1X1=1D1=1I2=2X1=1I1=3X1=1X1I1=1X1=1D2=1X1I1=2X2=2X1=1X1=1I1=1X1D1=1I2=1X1I1=1X1=1X1=1X1I2=1X3=2X1=5D4=1X1=2X1=1I6=1X1D1=1I2=2X2=2X2I1=2X2=1X1=1D1=1D1=1X1D1=1I2=1X1I1=2X3=1X2=1X1=1I2=1D1=1I2=1D4=1D1=5D3=2D1=1X1=1X1=1X1=2X2=1X2D1=1D3=1I1=1X3=2X1D2=1X1=1X2=1X1D1=1X2D1=1X4=2X1=3X2D2=1X1=1D2=1X2=1X3=1X1D2=1X1D1=1D1=1X2=1D1=1X1D4=1X1D1=1X2=1D2=2D1=1X2=1I3=2X1I1=2X3=2D3=2X1=1I2=1X1D3=1X1=1X1=2X1D1=1D1=2D2=2X1D2=1D1=1X1=2X1=1I2=1D1=2D1=1D3=1I1=2D3=3X1=2X1D4=1I1=1I2=2I5=1X1=1I2=1X1D3=1X2D2=1X2=2X1D2=2X3=1X1=1X1D1=1X1=2D2=4D1=1X2=1X1=1I2=1D1=2D1=1D2=1X1=2D3=1D3=1D1=1I2=2X1=1X1D1=1X1=2X1=2X1I2=1I1=1X1=1X1=2I3=1D2=2X2=1D3=1I1=1X1=2X2=2D1=1D2=1X1=1X1=1D1=3D1=1D2=2X1=2X1=1X3=1I2=1X1=2X2=2X1=1D1=3X3D1=1D2=1X2=1I1=1X3=1X2=1X2=1X1=2D5=1X1=1X1=1X1D2=1X1=1D1=1X1D4=1I2=1X2=1D4=3X1D2=1D1=1X1=3X1=2X3=2X1I2=2D3=2X1=1X1I2=1X1D1=1I2=1I1=1X1I3=3X1=4X1D1=1D1=1X1=3X3D1=1X3=1X1=2D1=1X3=1X1D1=1X2=1D1=1X1=3X1=1X1=2D1=2X3=1D2=1X2=1X1=1D1=3D1=2D1=1X1D4=1X1D2=1X1=1X1I3=1X2=1X1=1X3=2X1D2=1X1D2=1I2=2X2=1D1=1X1=3X2I1=2X2=1X2=4D5=1D2=1X2=1X1I2=2X1=1X1D3=3X2=1X2=1D2=1X1=1D1=1I1=1X1=1X1=1X1=1X1=1X1I2=1I2=2X3=2I4=1X1=1D4=2I1=1X1I2=1X2=1X2=2X2=1X1=2X1=1X1=1D2=4I2=1I3=1X1D4=1X1=1X2=2I3=1X1=1X1=1D1=1X1=1X2=1I2=1D1=1X1=1X2D3=2D2=1X2=1X1I2=1I1=1X1=1X1I3=3X2=1X1=2X1=1X1=1X1
=2D1=1D1=2D1=1D3=2X1D3=1X1=1X2=1D2=1I1=1X1=1I2=1I2=1I1=1D1=1X3=3X1=1X1=1X4=1D2=1X1=3X1=1X1=2X3=1X1=1D2=3D6=1X1=1X1=1X2=1D2=2X1=1X1I1=1D1=1D3=1X2=1X1=3X1=1I2=1I4=1X1=3I1=1I1=1X1=1X1=1X2=1D1=1X1I2=1D3=2D1=1X1=1I1=1D1=1D2=1X2=1X3=1D1=2X1D1=3I2=1X2=1X1=1X1=1I2=1I1=1I1=1X3=1X2=1X1=1X1=1X1=1X1=1X2=1D3=1D1=3I3=1X1=2X2=1X1=2X2=4X2=1X1I1=1X1=1I1=1X1=1X1D1=1I3=2X1=1I1=1I1=1D2=2X1=1X3=1X1=1X1=1X3=1X2D1=2D2=2X2=3X2=1X1D1=1D2=2X2=1X1=1D2=1I2=1D1=1X1D1=2D1=1D2=2X4=1X2D1=1D1=1I2=1X1=1I2=1X2=1I1=1I2=1X4=2D2=1I1=1X1=1X3=2X2=1X2=1D2=1X1=4X1=1I2=1X1=1X1=2X1D2=2D4=1D2=1D2=3D3=2X1D1=1I1=1D2=1X2=2X2=1I1=1I1=1X1=1D1=1X2=1X1=1I3=1X1=1X2=2X2=1X1=2X1D1=1D1=3X1=1I3=1X1=1I2=1I1=1I2=1I2=1X2=1X1=1I4=1I1=1I2=1I1=1X4=3I2=3X2=1D2=1D1=1I1=1X3=1D1=1X3=1X1I2=1I1=1I1=1I2=1X2D1=2D2=1X1=1I2=1X1=1D4=1I1=3X1=2X1=1X1D1=1X2=1X4=1X1=1X1I2=1X1=1I1=1I3=2X2=1X1=2X2D3=1X2D4=3X3=2I1=1X2=1X1=2I2=1X1=1X4=1X1=1X1D2=3X2=1X4=1I1=1X3I2=2X1I2=2X1=1I1=2I5=1X1D1=1X1=1I2=1I2=1I1=1X2=1X1=1I1=1I3=2I2=1X1=2I1=2I2=1I3=3I1=2X1=1I3=1X2I1=1I2=3X2=1X2I1=1X1=1I1=1D4=1I2=1X2I1=1X1=1X1=2I1=1I1=1I4=1X1=1X1=1I1=2I1=3I3=2X2=1I2=1I1=2I3=1X3=1X1=1I2=1X2=1X3D2=1I1=1D2=1X4=6X4=2X2I2=1I4=1X2=2X1D1=1D1=1X1=1D1=1D1=3X3=1X1=1X1I1=1X2=1I1=1X5=4D4=1X1=1D1=1X1=2X1=6X1=2X2=1X1I1=1X4=1X2D6=2X1D4=2X1=1D2=1X1=1X1=2D2=2D1=1X5=2X1I1=1I4=2X4=1X1I2=2X1=1D2=1X1I2=1I1=2D2=2D4=1X2=1X1I1=1X1I1=1I1=1X1=1I1=1I1=1I1=3X3=1I1=1X2=1I1=1I1=1X1=1X1=1X1D4=2X3I5=1X2I4=5X1=1X1=1X2=1X1D4=2X1=4X1=1I1=2X1=1X5=1I1=1X2=1X1=2X1=2X1=4X1D1=2X1=1D1=2X2=1X1=1X1D2=1D1=1D3=1X4=1X1=1D1=1X2=1X1D2=1D2=1D1=2X1D1=2D2=1D1=1X1D1=1X1=1D2=1X1D1=1X1=1D1=2D3=1X2D1=1D1=1D5=1I2=1X1D2=1I1=2X1=2X4=1D1=1X1D3=2D2=1X1=2D2=1X1I1=1X1=2X1I1=2X2=1I3=1X2=3X1=1X2=1X1=1X1I1=1X2=1D2=1X1=3X1=1X4=2X1=6X2=1I1=1X1I2=1D1=3X1=1I1=2D2=1X1=1I1=2X1=1X1=1X1=1X4=1X1D1=1X2=2X1=3X1=2X1D1=1X1D1=1X1=1X1=1X1=3X1=1D1=1X1=1X2=2D2=1X1D1=1D3=1X1I4=1I3=1D1=1X1=1D4=2D2=1X2D1=1D1=1D1=1D4=1X1D1=1D3=1X1=1X1I2=1X1I2=1X1D2=1X2=1I1=1I4=1X1=2X1=1X2=1I1=2X2=2X2=2X1=1D2=1X1=2X1=1X4=1D1=1D2=1X2=1X1I2=1X1D2=2X1=1D4=1X2I4=1I1=1I1=1I1=1X1=1I1
=3I5=1X1I1=1I3=1D4=1X1D3=1X2=1I1=3I2=1X1=1X1=1X1=2X1I1=1X1=1X1=1X1=1X1=2X1I1=3X1I1=2X2=1X1=1X1=1I3=1X1=1X1=1X1=2X1=1D1=1X2=2X1=1D1=1X2I1=1I2=1I4=7X4=1X2D1=1D1=3X1=1X2=1I2=1X1=1X1D1=1X2=1X1=3X2=1X3=1D2=2X1=2X1D1=1X1=1X1D2=2I1=1X2=2I3=1X3=2I2=2X1=2X1=1X1=1I2=1D1=1D5=1I1=1X1=3I3=1I1=1I1=1X1I1=1X1=1I2=1X1=1X2I3=2X1I1=3I3=2I2=1X1I2=1X1=1X1=1I4=1X3=1I1=1I2=1X1I1=1X2=2X2=2X1=4X2I1=2X1=2X1=1X2=1X1=1D5=1X1=2X1=1D1=4X1=2X4=1X1I2=1X1=1X1=5X4=1X2D5=2X3D4=1X1=2X1=1I1=1D1=1D2=1X1=1D3=3X1=1D2=1D1=2X1=1D1=2D1=1X1=1D1=1X4=2I2=2X1I2=1X1D3=2X1I2=1X1D4=2X5=2X1D1=1D4=1X1=2I2=1I1=1X1I1=1X3=2X1=1I3=2X1I6=1X2I4=1X1=1X1=1D1=2X1=6X1=2X1=1X1=1I1=1X4=4I5=1X2=2I3=1I2=1X1=1X1I1=1X1D1=1X1D1=1X1=1I1=1X1=1X1=1D1=1X4=1X2=1X2D2=1D2=6X4=1X2=1I1=1D2=1X1=3I1=1X3=1X2=1D1=1X3=2D1=1D2=1D2=2X3=1X2D1=2D2=2D1=1X4=1D1=1D1=2D1=1X1=1X1=1X2D3=1D3=1I1=1D1=1X1=1X2=3X2=1X1=2D2=3D1=1D1=2X1=1D2=2D1=1D2=2X1=1D3=4D5=1D1=1X2=2D1=1D3=1D1=1X1D1=1X1I5=2D2=2X1D2=2X1D2=1X1=4D4=1X2=3X2=1X1=1X1I4=1X1=1X2=2D1=1X2=1X3=2D1=3X4=2X2=1I1=1D2=1D2=3I1=2I3=1X1=1D1=1D2=2X1=1D4=1X2=1X1=2X1=1X1=2X1D1=1X1I4=1I1=1X2=1D1=1X2=2I1=1X2I3=1X1=2D3=2D2=1X1=1I3=1X1=1D1=1I2=1I2=3X3=1X2=1X1=1X1=1D2=1X1=1D3=1I1=1X3=1X1D2=1X2=1X2=2D1=1D3=1X2=5X1=2D1=1D2=2D2=1X1=1X3=1X1=1D2=1X1=1I1=1X2=1D2=2X2=1X1D2=1I1=1D1=2X1I3=3I2=1I2=1I4=2I2=2X1=2X2=1I1=1D1=4X1=1X3=1I1=1X2=2X3=1X1=1X1=1D2=2I4=1X2=1X1=1D2=1D3=1X1D2=1D1=1X2=2X3=2I1=1X1=1I1=1X1I2=1D1=1X3=1D2=1I1=2I3=1X2=4X1=3I2=2I2=1X1I1=3I3=1X1=1X1=1X3=1X1=2X2=1I1=1D2=2X1=1D2=1D1=2X1=1D1=1I1=1X1=1X1D2=4X2=2X1=1X2=2X1=1X3=2X1D1=1D2=1I2=1X1=1X1=1X1=1X1=1X2=1X3=1X2=2D3=2X1=1D2=1X2=3D1=2X2=2I2=1X2=1X2=1I1=1I2=1X1D1=2I3=1I2=1X1D1=1I2=1X1=1X1=1X1=1X1=1D1=1D2=2D2=1D3=3X1=1X1D2=1X3=1I1=1I1=1X1D1=2X2=1I2=1X1=1X1=1X6=3I2=1I1=1X3=2X1=1X1=3X1=1X2=1I4=1X1=1X1=3X3=1X1=1I2=1D1=1D3=1X1D1=1D3=1I1=1X1=1X3=2X1I3=1I1=2I1=1I1=1I1=1X1=2X1=1X1I1=1X2=3X3=1X1D1=1X2=1D1=1X2=1X1D2=1X1=1I2=1X1=1I1=3X1I2=1D1=1X1=1X1=1D2=1I2=1X1=1X3=1X1I4=1X1I4=1D1=4D2=1X1=1I1=2X1=1X2=2X2=1X2=1X2=1X1=2D1=1D4=1X1I4=2D3=2X2=1D2=1X1D1=1X1=1X1=1X1=1X1=1D1=1X1=1I3=1I1=1
X2=3X3=1X1I1=2X2=1X1D2=1X2=1I5=3X1=3X1I2=1D1=3X1I1=1I1=1I2=2X3=1D1=1X1I2=2X1I3=1X1=1X2=1X3=1X1D1=1X2=1X1I4=1X1I1=2I2=1X1=1X3I3=1I1=1I3=2X1=1I1=1X1I1=3X1=1X2=1I1=1X1=1X1=1I2=1X2=1X2I3=1X1=5X1=2X1I2=5X1=1X1I1=1D1=1D1=2X3=3I2=1X1D1=2X3=2I2=1I1=1D2=1D1=2X1=3X1=1X1=1I2=3X2=1I2=1I2=1X4=1D2=1X1=2I1=1X2=2X1=1I1=1X5=1X1I1=1I2=1X2=1X3=1X1=1D2=1X2=1I1=3X1I1=2I1=3X2=2X1=1I2=1X3=1D1=2X1=2X2=1X2=4I1=1I1=1X2=1X1=2X2=2X1I5=1D1=1D1=1D3=2X1=2D3=1X1=2I1=1X1=1X1=2X1I2=1D2=1X1=2X2I1=1I3=1I3=1X1I1=1I4=2X1I2=1X1I1=1X1=1I1=4I2=1X1=2X1=2I1=1I3=2X2=2X1I2=1X2=1X2I3=1X1I2=1D1=1X5=2D2=1D1=1D4=3X1=2X1I3=2I3=1D1=1I2=1I1=3X1=2X1=1I1=1I2=2X1I2=1X2=2X3I1=1X1=1I3=1X1I3=1I1=2D4=3X1=2X1=1X1I3=1D2=1X2=2I1=1I2=1X1=1X1I4=1X1I1=1I2=1X1=1X1=2I2=1X2=1I1=1X2=1X3=1X1=1I1=5X1=2I4=1X1=1X1I1=1X2I2=1X1=1X2=2X1I3=1X1=1D3=1X2=1I1=1X1D1=1D2=1X1=3X1=3X3=1I1=1X1D1=1I2=1X1=2X1D1=1X1=1X1=1I1=1I2=1X3=1X1=1I1=3X1I4=1X1=2X1=1X2I1=1X2=1I1=1I1=1X1I1=1I2=1I3=1D1=2X1=1X5=2X1D1=2X1I2=2X1D2=1X1=3X1I1=1I1=1D2=1X2=1D1=1I1=1I1=1I1=2X1=1I1=1I1=1D3=3X1=1X1=3D1=2X2=1D2=1X1=1I2=3X2=1D6=1X1D1=2X2=1X1I1=1I3=1I1=1X1=3I2=1X1I1=1X3=1D1=1X2=1X1=1X1D1=1X1=1X1I1=2I5=2I3=1I1=1X1D1=2X2=2X3=1X1=1D1=1I1=2X1I1=1I1=2I2=2X1=1I3=2D3=2X2D4=2X1D3=1X4=2X2=1D1=1X1I2=1I1=1X1D2=1X1I1=1I2=1X1=1X1I1=2X1=2X1=2D1=1D5=2D2=1X2=1X2I2=1X2=1X1=1X1=2X1I5=1D1=1X4D1=1D3=1I2=1X1=1X6=2X1=1X2D1=2X1=2X1=1X1D1=2X3=3X1=1X2=1I2=2X2=2D1=1X1=1I1=1X1D1=1D1=1D2=1D2=1X1=1D2=1X2=1X1=1D1=2D1=4X4=2X3=1D4=1I2=2I2=2D1=1X4=2X2=2I2=1X2=1X1I1=1I1=1I1=1X3=1X1I2=1X1I3=1X1=1I1=3X2=4I1=1X2I21303=670I
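
For anyone wanting to see where inside a block the identity collapses, here is a minimal Python sketch that walks a `cg:Z:` string like the ones above and reports identity per window of alignment columns. It is a post-hoc diagnostic only, not an edyeet option. It assumes the extended CIGAR uses only the `=`, `X`, `I` and `D` operations, and that `I` consumes the query (Accn1 here) while `D` consumes the target, as in the usual PAF convention. The function name and the `window` parameter are made up for illustration.

```python
import re

# Extended CIGAR operations seen in the cg:Z: tags above.
CIGAR_OP = re.compile(r"(\d+)([=XID])")

def windowed_identity(cigar, query_start=0, window=1000):
    """Return [(query_pos, identity)] for consecutive windows of alignment columns.

    query_pos is the query coordinate at the first column of each window;
    identity is matches / columns within that window.
    """
    windows = []
    matches = columns = 0
    qpos = win_qstart = query_start
    for length, op in CIGAR_OP.findall(cigar):
        for _ in range(int(length)):
            if columns == 0:
                win_qstart = qpos      # remember where this window begins
            columns += 1
            if op == "=":
                matches += 1
            if op != "D":              # '=', 'X' and 'I' advance the query
                qpos += 1
            if columns == window:
                windows.append((win_qstart, matches / columns))
                matches = columns = 0
    if columns:                        # trailing partial window
        windows.append((win_qstart, matches / columns))
    return windows

if __name__ == "__main__":
    # Toy CIGAR, not taken from the records above.
    for pos, ident in windowed_identity("10=5X5I", query_start=100, window=10):
        print(pos, round(ident, 3))
```

Run on a full record with its query start (column 3 of the PAF line), this should show the windows where identity drops even though the block-level `id:f` value stays high, which is the effect described above.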
@ekg
Owner

ekg commented Oct 10, 2020 via email

@egoltsman
Author

Thanks! But what is different about running pggb vs. running edyeet followed by seqwish and smoothxg? From the description on the pggb page, it seems to run exactly these three tools.

@ekg
Owner

ekg commented Oct 10, 2020 via email

@egoltsman
Author

Ok, gotcha. Are there perhaps more aggressive smoothing settings in smoothxg that you would tweak to help get this properly represented in the graph?
