cdd2cog v0.2 NCBI FASTA header
adapted to new NCBI FASTA header format for CDD RPS-BLAST+ output
aleimba committed Feb 16, 2017
1 parent 5a3a41c commit 1b2388f
Showing 2 changed files with 8 additions and 4 deletions.
7 changes: 5 additions & 2 deletions cdd2cog/README.md
@@ -23,6 +23,7 @@ cdd2cog
perl cdd2cog.pl -r rps-blast.out -c cddid.tbl -f fun.txt -w whog

## Description
For troubleshooting and a working example please see issue [#1](https://github.com/aleimba/bac-genomics-scripts/issues/1).

The script assigns COG ([cluster of orthologous
groups](http://www.ncbi.nlm.nih.gov/COG/)) categories to proteins.
@@ -77,7 +78,7 @@ Several files are needed from NCBI's FTP server to run the RPS-BLAST+ and `cdd2c

## Usage

### RPS-BLAST
### RPS-BLAST+

rpsblast -query protein.fasta -db Cog -out rps-blast.out -evalue 1e-2 -outfmt 6
rpsblast -query protein.fasta -db Cog -out rps-blast.out -evalue 1e-2 -outfmt '7 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovs'
@@ -164,4 +165,6 @@ For [citation](https://github.com/aleimba/bac-genomics-scripts#citation), [insta

## Changelog

* v0.1 (01.08.2013)
* v0.2 (2017-02-16)
* Adapted to new NCBI FASTA header format for CDD RPS-BLAST+ output
* v0.1 (2013-08-01)
5 changes: 3 additions & 2 deletions cdd2cog/cdd2cog.pl
@@ -179,7 +179,8 @@ =head2 C<cdd2cog.pl>
=head1 VERSION
0.1 01-08-2013
0.2 update: 2017-02-16
0.1 2013-08-01
=head1 AUTHOR
@@ -298,7 +299,7 @@ =head1 LICENSE
}
$Skip = $line[0];

my $pssm_id = $1 if $line[1] =~ /^gnl\|CDD\|(\d+)/; # get PSSM-Id from the subject hit
my $pssm_id = $1 if $line[1] =~ /^CDD\:(\d+)/; # get PSSM-Id from the subject hit
my $cog = $CDDid{$pssm_id}; # get the COG# according to the PSSM-Id as listed in 'cddid.tbl'
$Cog_Stats{$cog}++; # increment hit-number for specific COG
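
Note on the subject-ID change above: the commit switches the match from the old `gnl|CDD|<PSSM-Id>` subject ID to the new `CDD:<PSSM-Id>` form. As a rough sketch only, and not what this commit does, a single pattern could accept both formats so that older RPS-BLAST+ result files remain parseable. The helper name `pssm_id_from_subject` and the sample IDs are hypothetical and used purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of cdd2cog.pl): extract the PSSM-Id from an
# RPS-BLAST+ subject ID in either the old or the new NCBI header format.
sub pssm_id_from_subject {
    my ($subject) = @_;

    # Old format: gnl|CDD|<digits>    New format: CDD:<digits>
    return $1 if $subject =~ /^(?:gnl\|CDD\||CDD:)(\d+)/;

    return;    # undef in scalar context when neither format matches
}

# Sample subject IDs are made up for illustration.
for my $subject ('gnl|CDD|223731', 'CDD:223731', 'unexpected') {
    my $pssm_id = pssm_id_from_subject($subject);
    printf "%s => %s\n", $subject, defined $pssm_id ? $pssm_id : 'no PSSM-Id';
}
```

Returning `undef` on a failed match, rather than declaring the variable with `my ... if ...`, lets the caller skip unrecognised hits explicitly before looking anything up in `%CDDid`.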
