Overview:
pslMap is an alignment tool that combines two alignments that share a common
sequence to produce another alignment. This is an implementation of the
TransMap alignment algorithm on PSL files.
Given alignments of
    a to b
    b to c
it produces an alignment of a to c by projecting through b. This differs from
liftOver in that it does a base-by-base mapping, which may insert or delete
bases within a block. The mappings produced by liftOver are block-level,
which may expand or contract the size of blocks.
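
The base-by-base projection can be illustrated with a small sketch. This is
not the pslMap implementation; it assumes both alignments are on the positive
strand and are given as lists of ungapped blocks. The names Block and
project_blocks are hypothetical and exist only for this illustration.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """One ungapped aligned block (hypothetical type for this sketch)."""
    q_start: int  # 0-based start in the query sequence
    t_start: int  # 0-based start in the target sequence
    size: int     # number of aligned bases


def project_blocks(a_to_b: List[Block], b_to_c: List[Block]) -> List[Block]:
    """Project a->b blocks through b->c blocks to get a->c blocks.

    Assumes positive-strand alignments only. Each a->b block is intersected,
    in b coordinates, with each b->c block; only the bases of b covered by
    both alignments yield an a->c block.
    """
    out = []
    for ab in a_to_b:
        ab_start, ab_end = ab.t_start, ab.t_start + ab.size  # range in b
        for bc in b_to_c:
            bc_start, bc_end = bc.q_start, bc.q_start + bc.size  # range in b
            start = max(ab_start, bc_start)
            end = min(ab_end, bc_end)
            if start < end:
                out.append(Block(
                    q_start=ab.q_start + (start - ab_start),  # offset into a
                    t_start=bc.t_start + (start - bc_start),  # offset into c
                    size=end - start,
                ))
    return out
```

For example, projecting the single block Block(0, 100, 50) through
[Block(100, 1000, 20), Block(125, 1020, 25)] gives
[Block(0, 1000, 20), Block(25, 1020, 25)]. The original 50-base block is split
because bases 120-125 of b have no counterpart in c, which is the kind of
within-block deletion a block-level mapping cannot represent.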
Description:
pslMap [options] inPsl mapFile outPsl
The pslMap program takes the PSL format alignments in the inPsl file
and projects them through the overlapping mapFile alignments, which
can be in PSL or chain format. The resulting alignments are written to the
outPsl file in PSL format.
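
For readers unfamiliar with PSL, the sketch below reads the fields of one
record that matter for mapping. It assumes the standard 21-column,
tab-separated PSL layout with no header lines. The names Psl and
parse_psl_line are hypothetical and are not part of pslMap.

```python
from typing import List, NamedTuple


class Psl(NamedTuple):
    """Subset of PSL columns used when mapping (hypothetical type)."""
    strand: str
    q_name: str
    q_size: int
    q_start: int
    q_end: int
    t_name: str
    t_size: int
    t_start: int
    t_end: int
    block_sizes: List[int]
    q_starts: List[int]
    t_starts: List[int]


def _int_list(field: str) -> List[int]:
    # PSL list columns are comma-separated with a trailing comma, e.g. "50,"
    return [int(x) for x in field.split(",") if x]


def parse_psl_line(line: str) -> Psl:
    """Parse one tab-separated PSL record (assumes 21 columns, no header)."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 21:
        raise ValueError(f"expected 21 PSL columns, got {len(f)}")
    return Psl(
        strand=f[8],
        q_name=f[9], q_size=int(f[10]), q_start=int(f[11]), q_end=int(f[12]),
        t_name=f[13], t_size=int(f[14]), t_start=int(f[15]), t_end=int(f[16]),
        block_sizes=_int_list(f[18]),
        q_starts=_int_list(f[19]),
        t_starts=_int_list(f[20]),
    )
```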
The target side of inPsl must be the same set of sequences as the query side
of mapFile. Input alignments with target sequence names that don't match any
mapping file query sequence names are discarded. If the matching target and
query name have different sequence sizes, an error will be generated. The
options -swapMap and -swapIn can be used to swap the query and target sides of
either alignment if they are not in the required orientation.
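
The matching rules above can be sketched as follows. This is a simplified
illustration, not the pslMap code. Aln, select_mappable and swap_sides are
hypothetical names, and swap_sides only exchanges names and sizes, whereas a
real swap must also exchange coordinates and strands.

```python
from typing import Dict, List, NamedTuple


class Aln(NamedTuple):
    """Names and sizes of the two sides of an alignment (hypothetical type)."""
    q_name: str
    q_size: int
    t_name: str
    t_size: int


def swap_sides(aln: Aln) -> Aln:
    """Exchange query and target, in the spirit of -swapIn and -swapMap."""
    return Aln(aln.t_name, aln.t_size, aln.q_name, aln.q_size)


def select_mappable(in_alns: List[Aln], map_alns: List[Aln]) -> List[Aln]:
    """Keep input alignments whose target is a query sequence of the map.

    Inputs with no matching map query name are dropped. A matching name
    with a different sequence size raises an error.
    """
    map_q_sizes: Dict[str, int] = {}
    for m in map_alns:
        map_q_sizes.setdefault(m.q_name, m.q_size)

    kept = []
    for a in in_alns:
        size = map_q_sizes.get(a.t_name)
        if size is None:
            continue  # no mapping alignment for this target: discard
        if size != a.t_size:
            raise ValueError(
                f"{a.t_name}: inPsl target size {a.t_size} "
                f"!= mapFile query size {size}"
            )
        kept.append(a)
    return kept
```

If the map alignments have the shared sequence on the target side, applying
swap_sides to each of them first puts them in the orientation that
select_mappable expects.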
Special handling is provided to support mapping proteins to a genome. If the
input PSL is a protein to DNA PSL, the protein coordinates will be converted
to CDS coordinates in the output PSL. That is, each coordinate in the protein
will be multiplied by three. This is used when mapping proteins to the genome
by mapping protein to mRNA alignments with mRNA to genome alignments. Unlike
most protein to genome alignment processes, the resulting CDS to genome
alignment is able to represent amino acids that are coded for by spliced
codons.
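
The coordinate conversion can be sketched as below. The text above states
only that protein coordinates are multiplied by three. Scaling the block
sizes as well, so that blocks are measured in bases rather than amino acids,
is an assumption of this sketch. QuerySide and protein_to_cds are
hypothetical names.

```python
from typing import List, NamedTuple


class QuerySide(NamedTuple):
    """Query-side fields of a PSL record (hypothetical type)."""
    q_size: int
    q_start: int
    q_end: int
    block_sizes: List[int]
    q_starts: List[int]


def protein_to_cds(q: QuerySide) -> QuerySide:
    """Convert amino acid coordinates to CDS (nucleotide) coordinates.

    Every protein coordinate is multiplied by three. Block sizes are
    scaled too, which is an assumption of this sketch.
    """
    return QuerySide(
        q_size=q.q_size * 3,
        q_start=q.q_start * 3,
        q_end=q.q_end * 3,
        block_sizes=[s * 3 for s in q.block_sizes],
        q_starts=[s * 3 for s in q.q_starts],
    )
```

For example, a 100 amino acid protein aligned over residues 10-30 in blocks
of 5 and 15 residues becomes a 300 base CDS aligned over bases 30-90 in
blocks of 15 and 45 bases.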
Examples:
Mapping cDNAs between organisms using syntenic chains of genomic alignments:
# map a PSL file of mouse cDNA to genomic (mm8) alignments,
# mmCDna.mm8.psl, to hg18
chainDir=/cluster/data/mm8/bed/blastz.hg18/axtChain
# create a subset of syntenic chains for
netFilter -syn $chainDir/mm8.hg18.net.gz >mm8.hg18.syn.net
netChainSubset -wholeChains mm8.hg18.syn.net $chainDir/mm8.hg18.all.chain.gz mm8.hg18.syn.chain
# since target of the chains is mm8, we must use -swapMap option
pslMap -chainMapFile -swapMap mmCDna.mm8.psl mm8.hg18.syn.chain mmCDna.hg18.psl
Citation:
Jingchun Zhu, J. Zachary Sanborn, Mark Diekhans, Craig B. Lowe, Tom H. Pringle, and David Haussler.
Comparative genomics search for losses of long-established genes on the human lineage.
PLoS Computational Biology, 3:e247, Dec 2007.
http://dx.doi.org/10.1371/journal.pcbi.0030247