gridss_annotate_insertions_repeatmaster Creates Mismatching Chromosome Names #268

Closed
DarioS opened this issue Oct 24, 2019 · 0 comments

Even though I have the same chromosome name format for the RepeatMasker annotation:

   SW  perc perc perc  query      position in query           matching       repeat              position in  repeat
score  div. del. ins.  sequence    begin     end    (left)    repeat         class/family         begin  end (left)   ID

  463   1.3  0.6  1.7  chr1        10001   10468 (248945954) +  (TAACCC)n      Simple_repeat            1  463    (0)      1
 4005  11.3 21.5  1.3  chr1        10469   11447 (248944975) C  TAR1           Satellite/telo       (399) 1712    483      2
  535  21.2 15.9  3.1  chr1        11485   11676 (248944746) C  L1MC5a         LINE/L1              (510) 5667   5447      3
  263  29.4  1.9  1.0  chr1        11678   11780 (248944642) C  MER5B          DNA/hAT-Charlie       (74)  104      1      4
  309  23.0  3.7  0.0  chr1        15265   15355 (248941067) C  MIR3           SINE/MIR             (119)  143     49      5

and the VCF file:

##contig=<ID=chr1,length=248956422>
##contig=<ID=chr2,length=242193529>
##contig=<ID=chr3,length=198295559>
##contig=<ID=chr4,length=190214555>

the names no longer match:

Warning message:
In .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr1, chr8, chr2, chr10, chr11, chr9, chr14, chr3, chr15, chr7, chr19, chr13, chr16, chr4, chr5, chr12, chrX, chr17, chr6, chr22, chr18, chr21, chr20, chrY, chrM
  - in 'y': 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 3, 4, 5, 6, 7, 8, chr1_GL383518v1_alt, chr1_GL383519v1_alt, chr1_GL383520v2_alt, chr1_KI270759v1_alt

because GRIDSS mangles them on line 64 of the annotation script, which forces them to NCBI style with seqlevelsStyle(grrm) = "NCBI". The mismatch is not treated as an error, so processing continues and a VCF file is still saved to disk.
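
To illustrate the kind of check that would catch this before anything is written, here is a minimal standalone sketch in Python. It is not part of GRIDSS and not the R script's logic. The function names (vcf_contig_names, repeatmasker_query_names, check_names) are hypothetical, and it assumes that every sequence name in the RepeatMasker annotation should also appear as a contig in the VCF header.

```python
import re


def vcf_contig_names(header_lines):
    """Collect contig IDs from '##contig=<ID=...>' VCF header lines."""
    names = []
    for line in header_lines:
        match = re.match(r"##contig=<ID=([^,>]+)", line)
        if match:
            names.append(match.group(1))
    return names


def repeatmasker_query_names(out_lines):
    """Collect query sequence names (5th column) from RepeatMasker .out lines."""
    names = set()
    for line in out_lines:
        fields = line.split()
        # Data rows start with an integer score; the two header rows do not.
        if len(fields) >= 5 and fields[0].isdigit():
            names.add(fields[4])
    return names


def check_names(vcf_names, rm_names):
    """Raise instead of continuing when annotation names are absent from the VCF.

    Assumption: the annotation and the VCF describe the same reference, so
    any annotation name missing from the VCF header indicates a style mismatch.
    """
    missing = sorted(set(rm_names) - set(vcf_names))
    if missing:
        raise ValueError(f"sequence names not in VCF header: {missing}")


if __name__ == "__main__":
    vcf = vcf_contig_names([
        "##contig=<ID=chr1,length=248956422>",
        "##contig=<ID=chr2,length=242193529>",
    ])
    rm = repeatmasker_query_names([
        "  463   1.3  0.6  1.7  chr1  10001  10468 (248945954) +  (TAACCC)n  Simple_repeat  1  463  (0)  1",
    ])
    check_names(vcf, rm)          # passes: both sides use chr1
    try:
        check_names(vcf, {"1"})   # what the names look like after NCBI-style renaming
    except ValueError as err:
        print(err)
```

With the names from this report, the first check passes and the second raises, so the run would stop rather than produce a VCF annotated against non-matching sequence names.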
