Sequence variation descriptions generated by the Position Converter are NOT checked by Mutalyzer's Name Checker.
The Position Converter converts the positions in a variant description from chromosomal positions for a specific human genome build to positions relative to a RefSeq transcript reference sequence, or vice versa. It does not take mismatches between these reference sequences into account. The Position Converter uses a local database containing the mapping information from the seq_gene.md.gz file at the MapView ftp site for human genome builds hg18 (NCBI 36) and hg19 (GRCh37). The specified version of the RefSeq transcript accession number has to be present in the local mapping database. The latest RefSeq version might not be available if the NCBI has not yet updated the seq_gene.md.gz file.
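The idea of a position-only conversion can be illustrated with a minimal Python sketch. It is not Mutalyzer's implementation: the exon table, the function names and the coordinates are hypothetical, the transcript is assumed to lie on the forward strand, and positions are counted from the first transcribed nucleotide rather than from the start codon. As in the description above, only coordinates are translated; the sequences themselves are never compared.

```python
from typing import List, Optional, Tuple

# Hypothetical exon table: (genomic_start, genomic_end), 1-based and inclusive,
# listed in transcript order for a forward-strand transcript.
Exon = Tuple[int, int]


def genomic_to_transcript(pos: int, exons: List[Exon]) -> Optional[int]:
    """Return the 1-based position in the spliced transcript, or None if pos is not exonic."""
    offset = 0  # number of transcript nucleotides in the exons already passed
    for start, end in exons:
        if start <= pos <= end:
            return offset + (pos - start) + 1
        offset += end - start + 1
    return None


def transcript_to_genomic(tpos: int, exons: List[Exon]) -> Optional[int]:
    """Return the genomic position for a 1-based transcript position, or None if out of range."""
    if tpos < 1:
        return None
    remaining = tpos
    for start, end in exons:
        length = end - start + 1
        if remaining <= length:
            return start + remaining - 1
        remaining -= length
    return None


# Illustrative coordinates only, not taken from any real transcript.
example_exons = [(100, 109), (200, 204)]
print(genomic_to_transcript(200, example_exons))  # first base of the second exon
print(transcript_to_genomic(11, example_exons))
```

Because the sketch relies purely on exon coordinates, any mismatch or indel between the transcript and the genome would go unnoticed, which is why the converted description should still be checked.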
Please note that chromosomal variant descriptions will only return transcript descriptions when the transcripts are mapped within 5000 nucleotides of the submitted chromosomal positions. Otherwise, the Position Converter will return: "No transcripts found in mutation region". To convert genomic positions from one build to another, or between organisms, the UCSC tool LiftOver is an alternative.
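The 5000-nucleotide rule can be pictured as a simple window test. The following Python sketch is an assumption about how such a filter might look, not Mutalyzer's code: the transcript names and coordinates are made up, and treating a distance of exactly 5000 nucleotides as "within" the window is a guess.

```python
from typing import Dict, List, Tuple

WINDOW = 5000  # nucleotides, as stated in the text above


def transcripts_near(pos: int,
                     transcripts: Dict[str, Tuple[int, int]],
                     window: int = WINDOW) -> List[str]:
    """Return the names of transcripts whose genomic span lies within `window` of pos.

    `transcripts` maps a name to (genomic_start, genomic_end), 1-based inclusive.
    """
    hits = []
    for name, (start, end) in transcripts.items():
        if start - window <= pos <= end + window:
            hits.append(name)
    return sorted(hits)


# Hypothetical mapping data for illustration only.
mapping = {"TX_A": (10000, 20000), "TX_B": (40000, 50000)}
found = transcripts_near(25000, mapping)
print(found if found else "No transcripts found in mutation region")
```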
We expect to support conversions to and from the next human genome build, hg38 (GRCh38), following a Mutalyzer update this fall. For background, see Bio-IT's interview with Deanna Church about the past, present and future of the reference genome, and the Genome Reference Consortium.
Why you should check the Position Converter output
NC_000009.11:g.134385436A>G for hg19 returns:
Found transcripts in mutation region:
Checking NM_007171.3:c.752A>G with the Name Checker results in the error: "A not found at position 954, found G instead". Reason: transcript is mapped on reverse strand. The Position Converter only converts the position, but not the substitution part of the description.
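When only the position is converted, the bases of a substitution on a reverse-strand transcript have to be complemented separately before the description is checked. The Python sketch below shows that single step under stated assumptions: the function name is hypothetical, only simple substitutions of the form X>Y are handled, and it is not part of Mutalyzer.

```python
import re

# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# A simple substitution such as "A>G"; other variant types are not handled here.
SUBSTITUTION = re.compile(r"^([ACGT])>([ACGT])$")


def complement_substitution(change: str) -> str:
    """Return the substitution as it reads on the opposite strand."""
    match = SUBSTITUTION.match(change)
    if match is None:
        raise ValueError(f"not a simple substitution: {change!r}")
    ref, alt = match.groups()
    return f"{ref.translate(COMPLEMENT)}>{alt.translate(COMPLEMENT)}"


print(complement_substitution("A>G"))  # T>C
```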
Checking NM_001077365.1:c.699+53A>G with the Name Checker results in the error: "Intronic position given for a non-genomic reference sequence". Reason: the transcript sequence only contains exons, no intronic sequences. Therefore, intronic positions cannot be checked.
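Descriptions that will trigger this error can be recognised before submission. The Python sketch below is a rough heuristic, not the Name Checker's parser: it assumes that an intronic position in a c. description shows up as a digit, a plus or minus sign, and another digit (as in 699+53), and the function name is hypothetical.

```python
import re


def has_intronic_position(description: str) -> bool:
    """Heuristically detect an intronic offset (e.g. 699+53 or 700-12) in a c. description."""
    _, separator, change = description.partition(":c.")
    if not separator:
        return False  # not a coding DNA description
    # A sign directly between two digits marks an offset into an intron;
    # a leading '-' or '*' (UTR positions) is not preceded by a digit.
    return re.search(r"\d[+-]\d", change) is not None


for text in ("NM_001077365.1:c.699+53A>G", "NM_007171.3:c.752A>G"):
    print(text, has_intronic_position(text))
```

A description flagged this way can only be checked against a reference that contains intronic sequence, such as a slice of the chromosomal reference sequence.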
A slice of the chromosomal reference sequence, UD_132118472510, generated by the Reference File Loader can be used to check all transcript descriptions. Transcript POMT1_v002 corresponds to NM_007171.3, but the POMT1_v002:c.752A>G description is approved by the Name Checker. This discrepancy is caused by the presence of different alleles of SNP rs2296949 in the chromosomal reference sequence NC_000009.11 and NM_007171.3. The latter contains the G allele at position 752 to be consistent with the RefSeqGene record NG_008896.1.
The Position Converter depends on mapping information from the NCBI. When a RefSeq transcript can only be aligned to the reference genome with indels, its exon boundaries will be calculated incorrectly. When converting chromosomal positions to transcript positions, we recommend using the Name Checker to check positions at exon boundaries.
In Mutalyzer builds before 2.0.beta-11, the Position Converter used mapping information from the UCSC genome browser for human genome builds hg18 (NCBI 36) and hg19 (GRCh37). Because the UCSC uses Blat to map RefSeq transcripts, small exons (3-6 nucleotides) were mapped incorrectly, leading to incorrect conversions.