Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: transliterate_vals

Description

Transliteration is an ultra-fast search and replace (or search and delete) of characters in values. It is useful for tasks such as converting sequences from RNA to DNA or removing indels from patterns.

Usage

... | transliterate_vals [options]

Options

[-?          | --help]               #  Print full usage description.
[-k <list>   | --keys=<list>]        #  List of values to transliterate
[-s <string> | --search=<string>]    #  String of chars to locate and replace
[-r <string> | --replace=<string>]   #  String of chars for replacing
[-d <string> | --delete=<string>]    #  String of chars to delete
[-I <file!>  | --stream_in=<file!>]  #  Read input from stream file  -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output to stream file  -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

To convert RNA sequence to DNA:

transliterate_vals -k SEQ -s Uu -r Tt
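
The effect on a single record can be sketched in Python. This is only an illustration, not the Biopieces implementation: it assumes a record is a plain dict and that the search and replace strings pair up character by character, and `transliterate_record` is a hypothetical helper name.

```python
def transliterate_record(record, keys, search, replace):
    """Return a copy of record with characters swapped in the chosen keys only."""
    # str.maketrans pairs search[i] with replace[i]; both must have equal length.
    table = str.maketrans(search, replace)
    out = dict(record)
    for key in keys:
        if key in out:
            out[key] = out[key].translate(table)
    return out


record = {"SEQ_NAME": "test", "SEQ": "AUGCuuga"}
print(transliterate_record(record, ["SEQ"], "Uu", "Tt"))
# {'SEQ_NAME': 'test', 'SEQ': 'ATGCttga'}
```

Values under keys that are not listed, such as SEQ_NAME here, are left untouched.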

To remove indels from patterns:

transliterate_vals -k PATTERN -d '._~-'
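
Deleting a set of characters can be sketched the same way in Python. This is a minimal illustration under the assumption that every character in the delete string is taken literally; `delete_chars` is a hypothetical helper, not part of Biopieces.

```python
def delete_chars(value, chars):
    """Remove every occurrence of each character in chars from value."""
    # The third argument of str.maketrans lists characters to drop.
    table = str.maketrans("", "", chars)
    return value.translate(table)


print(delete_chars("AC.GT_-A~C", "._-~"))
# ACGTAC
```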

To visualize FASTQ quality scores, consider this FASTQ entry in the file test.fq:

@ILLUMINA-52179E_0004:2:1:1045:16499#TTAGGC/1
CTTGGTGCCCGTCACGCGCACTGCGTCGCCCTGAATGCTCGCCTGNNCCT
+ILLUMINA-52179E_0004:2:1:1045:16499#TTAGGC/1
ceceeee\e``cd^^Yb`b`cc``c\accccZT`YTbYb`Y\VZYBBa\Y

Using transliterate_vals we can do:

read_fastq -i test.fq |
transliterate_vals -k SCORES -s "[@-h]" -r "           ..........ooooooooooOOOOOOOOOO" |
write_fastq -x

Thus, with the scores in Phred+64 encoding (@ is Q0 and h is Q40):

  • Q30-Q40 is replaced with O
  • Q20-Q29 is replaced with o
  • Q10-Q19 is replaced with .
  • Q0-Q9 is replaced with a blank

And this outputs:

@ILLUMINA-52179E_0004:2:1:1045:16499#TTAGGC/1
CTTGGTGCCCGTCACGCGCACTGCGTCGCCCTGAATGCTCGCCTGNNCCT
+
OOOOOOOoOOOOOOOoOOOOOOOOOoOOOOOooOooOoOOooooo  Ooo
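
The same binning can be reproduced with a short Python sketch. It assumes Phred+64 encoding and bins of ten quality points (with Q40 folded into the top bin), which is inferred from the output shown here rather than taken from the Biopieces source; `visualize_scores` is a hypothetical helper name.

```python
def visualize_scores(scores, offset=64):
    """Map each quality character to ' ', '.', 'o' or 'O' by bins of ten."""
    symbols = " .oO"
    out = []
    for ch in scores:
        q = max(ord(ch) - offset, 0)          # Phred score, clamped at 0
        out.append(symbols[min(q // 10, 3)])  # Q40 falls into the top bin
    return "".join(out)


# Raw string, because the quality line contains backslashes.
scores = r"ceceeee\e``cd^^Yb`b`cc``c\accccZT`YTbYb`Y\VZYBBa\Y"
print(visualize_scores(scores))
# OOOOOOOoOOOOOOOoOOOOOOOOOoOOOOOooOooOoOOooooo  Ooo
```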

See also

transliterate_seq

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

mail@maasha.dk

August 2007

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

transliterate_vals is part of the Biopieces framework.

http://www.biopieces.org
