Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: replace_vals

Description

If you need to replace the values for a given key in all records, e.g. to replace an ID with a description, you can use replace_vals. To replace a single value, use the -s and -r switches; to replace many different values, specify them in a table file given as the argument to the -f switch. The table file is read, skipping lines starting with #, into a hash where the search strings are the keys and the replace strings are the values. The column delimiter of the table file can be changed with the -d switch.
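The table-loading step described above can be sketched in Python as follows. This is an illustration only, not the replace_vals source: the function name load_table and its parameter names are hypothetical, and skipping blank lines is an assumption that the description does not confirm. The defaults mirror the documented defaults for -S, -R and -d.

```python
import re

def load_table(lines, search_col=1, replace_col=2, delimiter=r"\s+"):
    """Build a search -> replace dict from table lines (columns are 1-based)."""
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip comment lines starting with '#'. Skipping blank lines as well
        # is an assumption of this sketch.
        if not line.strip() or line.startswith("#"):
            continue
        fields = re.split(delimiter, line)
        table[fields[search_col - 1]] = fields[replace_col - 1]
    return table

# Whitespace-delimited rows with a comment line, as in a typical table file.
default_table = load_table(["# id  description", "test1   foo", "bar     test2"])

# The same rows with a comma as the column delimiter.
csv_table = load_table(["test1,foo", "bar,test2"], delimiter=",")
```

Both calls give the same mapping; only the delimiter used to split the columns differs.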

Usage

... | replace_vals -k <key> [options]

Options

[-?          | --help]                #  Print full usage description.
[-k <string> | --key=<string>]        #  Key whose values should be replaced.
[-s <string> | --search=<string>]     #  Search string.
[-r <string> | --replace=<string>]    #  Replacement string.
[-f <file!>  | --file=<file!>]        #  File with table of search/replace columns.
[-S <uint>   | --search_col=<uint>]   #  Column with search strings   -  Default=1
[-R <uint>   | --replace_col=<uint>]  #  Column with replace strings  -  Default=2
[-d <string> | --delimiter=<string>]  #  Table delimiter              -  Default='\s+'
[-I <file!>  | --stream_in=<file!>]   #  Read input from stream file  -  Default=STDIN
[-O <file>   | --stream_out=<file>]   #  Write output to stream file  -  Default=STDOUT
[-v          | --verbose]             #  Verbose output.

Examples

Consider the following FASTA entries in the file test.fna:

>test1
AAGTGTATGAGCCCAGTCGCCCTA
>test2
CGGGAACCTGATCAGCTGTCTACA

To replace the value test2 of the SEQ_NAME key with foo, do:

read_fasta -i test.fna | replace_vals -k SEQ_NAME -s test2 -r foo

SEQ_NAME: test1
SEQ: AAGTGTATGAGCCCAGTCGCCCTA
SEQ_LEN: 24
---
SEQ_NAME: foo
SEQ: CGGGAACCTGATCAGCTGTCTACA
SEQ_LEN: 24
---

To replace multiple different values we need to specify these in a table file. Consider the following table in the file test.tab:

test1   foo
bar     test2

By default the search strings are in the first column (search_col default is 1) and the replace strings are in the second column (replace_col default is 2). Using the -f switch causes the table file to be read and a hash to be built with the elements of the first column as keys and the elements of the second column as values. Thus we can replace like this:

read_fasta -i test.fna | replace_vals -k SEQ_NAME -f test.tab

SEQ_NAME: foo
SEQ: AAGTGTATGAGCCCAGTCGCCCTA
SEQ_LEN: 24
---
SEQ_NAME: test2
SEQ: CGGGAACCTGATCAGCTGTCTACA
SEQ_LEN: 24
---

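It is possible to change the search_col and replace_col with the -S and -R switches. Here the second column holds the search strings and the first column the replace strings, so test2 is replaced with bar and test1 is left unchanged: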

read_fasta -i test.fna | replace_vals -k SEQ_NAME -f test.tab -S 2 -R 1

SEQ_NAME: test1
SEQ: AAGTGTATGAGCCCAGTCGCCCTA
SEQ_LEN: 24
---
SEQ_NAME: bar
SEQ: CGGGAACCTGATCAGCTGTCTACA
SEQ_LEN: 24
---
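The two table-based examples above can be reproduced with a small Python sketch. It is not the replace_vals implementation: the function name apply_table is hypothetical, records are modelled as plain dicts, and the sketch assumes that a value is replaced only when the whole value equals a search string.

```python
def apply_table(records, key, table):
    """Return copies of the records with record[key] replaced via the table."""
    result = []
    for record in records:
        record = dict(record)  # copy, so the input records stay untouched
        # Assumption: exact, whole-value match against the search strings.
        if key in record and record[key] in table:
            record[key] = table[record[key]]
        result.append(record)
    return result

records = [
    {"SEQ_NAME": "test1", "SEQ": "AAGTGTATGAGCCCAGTCGCCCTA", "SEQ_LEN": 24},
    {"SEQ_NAME": "test2", "SEQ": "CGGGAACCTGATCAGCTGTCTACA", "SEQ_LEN": 24},
]

# The rows of test.tab as (column 1, column 2) pairs.
rows = [("test1", "foo"), ("bar", "test2")]

forward = {col1: col2 for col1, col2 in rows}  # like the defaults -S 1 -R 2
swapped = {col2: col1 for col1, col2 in rows}  # like -S 2 -R 1

forward_names = [r["SEQ_NAME"] for r in apply_table(records, "SEQ_NAME", forward)]
swapped_names = [r["SEQ_NAME"] for r in apply_table(records, "SEQ_NAME", swapped)]
```

With the default columns only test1 has a matching search string, so forward_names is ["foo", "test2"]; with the columns swapped only test2 matches, so swapped_names is ["test1", "bar"].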

See also

read_fasta

merge_records

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

mail@maasha.dk

December 2011

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

replace_vals is part of the Biopieces framework.

http://www.biopieces.org
