Commit

Massive reorganization.
daisieh committed Jul 16, 2013
1 parent 8536a32 commit edfa2b5
Showing 23 changed files with 111 additions and 165 deletions.
27 changes: 18 additions & 9 deletions README
@@ -1,20 +1,29 @@
These are the scripts I write as I do my work. You're free to use them but please cite me if you do.
This repo contains visualization and analysis scripts that I've written to help me perform
phylogenomic analyses. Almost all are written in Perl. Some require the Bioperl package.

Notes:
Many of the scripts require subfuncs.pl and circlegraph.pl, so unless you are
planning to always call the scripts from the main directory, you might want to add
this directory to your $PERL5LIB variable.
Scripts that require the Bioperl package will either call the Bio package directly or require
subfunctions that are in the file bioperl_subfuncs.pl. It might be easiest if you include
this directory in your $PERL5LIB path so that the various subfunction files are easy for
the interpreter to find.

CircleGraph.pm is an object-oriented package that makes it easier to draw graphs along
a circle. It makes postscript objects.
a circle. It makes postscript objects and uses the Postscript::Simple package.

subfuncs.pl is a random hodgepodge of utility functions.
subfuncs.pl is a random hodgepodge of utility functions. As I write more of them, I may
move some to more specific subfunction files to reduce overhead.

bioperl_subfuncs.pl has helper functions that require Bioperl packages.

circlegraph.pl has helper functions that draw specific kinds of circular graphs.

The subdirs contain scripts for various tasks:
- converting: helper scripts to convert files from one format to another
- analysis: scripts that perform various analyses on datasets.
introgression: scripts to calculate statistics for introgression analysis.
selection: scripts that use Bioperl, PAML, HyPhy to analyze selection on genes.
- converting: scripts to convert files from one format to another or to combine files
in various ways.
- filtering: scripts that reduce complexity.
- parsing: uses Bioperl to slice and dice sequence files in various ways.
- plastome: scripts that take input about the plastome and draw circle graphs to
represent that data.
- selection_analysis: scripts that use Bioperl, PAML, HyPhy to analyze selection on genes.
- reporting: scripts that take vcf or sequence files and report statistics about them.
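The README note about $PERL5LIB can also be handled from inside a script by extending @INC before the `require`. The following is a minimal sketch under stated assumptions: the temporary directory stands in for the repository's top-level directory, and `demo_subfuncs.pl` and `demo_helper` are hypothetical names, not files or functions from this repo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Stand-in for the repository's top-level directory (hypothetical contents).
my $repo_dir = tempdir(CLEANUP => 1);
my $helper   = File::Spec->catfile($repo_dir, "demo_subfuncs.pl");
open my $fh, ">", $helper or die "cannot write $helper: $!";
print $fh "sub demo_helper { return 'loaded' }\n1;\n";
close $fh;

# Similar in effect to adding the directory to $PERL5LIB,
# but only for this process.
unshift @INC, $repo_dir;
require "demo_subfuncs.pl";

print demo_helper(), "\n";
```

In a real script one might instead write `use FindBin; use lib "$FindBin::Bin/..";` so that the directory is resolved relative to the script's own location.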
File renamed without changes. (14 files)
138 changes: 0 additions & 138 deletions plastome/vcf2fasta.pl

This file was deleted.

File renamed without changes. (4 files)
62 changes: 62 additions & 0 deletions reporting/vcf_to_depth.pl
@@ -0,0 +1,62 @@
use File::Basename;
use Getopt::Long;
use Pod::Usage;
require "subfuncs.pl";

if (@ARGV == 0) {
    pod2usage(-verbose => 1);
}

my $runline = "running " . basename($0) . " " . join (" ", @ARGV) . "\n";

my $samplefile = 0;
my $help = 0;
my $outfile = "";

GetOptions ('samples|input|vcf=s' => \$samplefile,
            'outputfile=s' => \$outfile,
            'help|?' => \$help) or pod2usage(-msg => "GetOptions failed.", -exitval => 2);

if ($help) {
    pod2usage(-verbose => 1);
}

print $runline;

my @samples = @{sample_list ($samplefile)};
if ($samplefile =~ /(.*?)\.vcf$/) {
    # A single .vcf file was given: use its name without the extension.
    @samples = ($1);
}

foreach my $sample (@samples) {
    my $name = basename($sample);
    my $vcf_file = $sample . ".vcf";
    if ($outfile ne "") {
        # The same $outfile is passed for every sample.
        vcf_to_depth ($vcf_file,$outfile);
    } else {
        vcf_to_depth ($vcf_file);
    }
}


__END__

=head1 NAME

vcf_to_depth

=head1 SYNOPSIS

vcf_to_depth -samples samplefile [-outputfile outfile]

=head1 OPTIONS

  -samples|input|vcf: name of sample or list of samples to convert
  -outputfile:        optional: name of the output depth file

=head1 DESCRIPTION

Generates a summary file with the read depths of each position in the input vcf file(s).

=cut
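The option spec `'samples|input|vcf=s'` used above declares three interchangeable names for one string option. A minimal sketch of how that behaves, assuming Getopt::Long's default configuration; `parse_args` and the file names are hypothetical and only serve the illustration:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical wrapper: parses an argument list with the same option
# spec as vcf_to_depth.pl and returns the two values it collects.
sub parse_args {
    my @args = @_;
    my ($samplefile, $outfile) = ("", "");
    GetOptionsFromArray(\@args,
        'samples|input|vcf=s' => \$samplefile,
        'outputfile=s'        => \$outfile,
    ) or die "GetOptions failed.\n";
    return ($samplefile, $outfile);
}

# All three aliases fill the same variable.
my ($s1)      = parse_args('-samples', 'sampleA.vcf');
my ($s2)      = parse_args('--vcf', 'sampleA.vcf');
my ($s3, $o3) = parse_args('-input', 'sampleA.vcf', '-outputfile', 'sampleA.depth');

print join("\t", $s1, $s2, $s3, $o3), "\n";
```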
31 changes: 31 additions & 0 deletions subfuncs.pl
@@ -501,5 +501,36 @@ sub meld_sequence_files {
return meld_matrices (\@matrixnames, \%matrices);
}


=head1

B<String $out_file vcf_to_depth ( String $vcf_file, [String $out_file] )>

Generates a summary file with the read depths of each position in the input vcf file.

$vcf_file: A vcf file.
$out_file: Optional name of the output file; defaults to the vcf file's name with a .depth extension.

=cut



sub vcf_to_depth {
    my $vcf_file = shift;
    my $out_file = shift;

    $vcf_file =~ /(.+)(\.vcf)/;
    my $basename = $1;

    # Compare as a string: a numeric == test would treat any
    # non-numeric file name as 0 and discard it.
    if (!defined $out_file || $out_file eq "") {
        $out_file = "$basename.depth";
    }

    system ("awk 'BEGIN {OFS=\"\\t\"} /^.*?\\t(.*?)\\t/ {print \$1,\$2,\$8}' $vcf_file | awk 'BEGIN {OFS=\"\\t\"} {sub(/DP=/,\"\",\$3);sub(/;.*/,\"\",\$3);print \$1,\$2,\$3}' - > $out_file");

    return $out_file;
}



# must return 1 for the file overall.
1;
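The awk pipeline in vcf_to_depth prints the chromosome, position, and the DP value taken from the INFO column. The same extraction might be sketched in pure Perl as below. This is an illustration under assumptions, not the commit's code: `depth_from_vcf_line` is a hypothetical helper, the example lines are made up, and unlike the awk version it skips header lines and finds DP anywhere in INFO rather than only at the start.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of subfuncs.pl): returns "chrom<TAB>pos<TAB>depth"
# for one VCF data line, or undef for header lines and lines without a DP tag.
sub depth_from_vcf_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/;             # skip meta and header lines
    my @fields = split /\t/, $line;
    return if @fields < 8;               # INFO is the eighth column
    my ($chrom, $pos, $info) = @fields[0, 1, 7];
    # Match DP as a whole key anywhere in the semicolon-separated INFO field.
    return unless $info =~ /(?:^|;)DP=(\d+)/;
    return join "\t", $chrom, $pos, $1;
}

# Made-up example lines for illustration only.
my @example = (
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=14;AF=0.5",
    "chr1\t101\t.\tC\t.\t50\tPASS\tAC=2;DP=9",
);

foreach my $line (@example) {
    my $row = depth_from_vcf_line($line);
    print "$row\n" if defined $row;
}
```

Doing the parsing in Perl avoids the shell quoting in the `system` call and does not depend on awk being installed, at the cost of reading the file line by line in the interpreter.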
18 changes: 0 additions & 18 deletions test_tree.pl

This file was deleted.
