{bio} [Celera Assembler aka Whole-Genome Shotgun Assembler ] (REVIEW) #960

Closed
wants to merge 8 commits into from
@@ -0,0 +1,38 @@
# This file is an EasyBuild reciPY as per https://github.com/hpcugent/easybuild
# Author: Pablo Escobar Lopez
# Swiss Institute of Bioinformatics
# Biozentrum - University of Basel

easyblock = 'MakeCp'

name = 'Celera_Assembler'
version = '8.1'

homepage = 'http://wgs-assembler.sourceforge.net/'
description = """ Celera Assembler is a de novo whole-genome shotgun (WGS) DNA
sequence assembler. It reconstructs long sequences of genomic DNA from fragmentary
data produced by whole-genome shotgun sequencing. """

toolchain = {'name': 'goolf', 'version': '1.4.10'}

source_urls = [('http://sourceforge.net/projects/wgs-assembler/files/wgs-assembler/wgs-%(version)s/', 'download')]
sources = ['wgs-%(version)s.tar.bz2']

dependencies = [('bzip2', '1.0.6')]

parallel = 1

premakeopts = 'cd kmer && make install && cd .. && '
premakeopts += 'cd samtools && make && cd .. && '
premakeopts += 'cd src && '
Contributor

I know that CA bundles its own versions of some bioinformatics packages that are already partially available in EB. I wonder which is more intuitive for the user?

The CA scripts have hard-coded relative paths to the internal versions of tools like SAMtools, so using the external tools already provided by EB on the system would take some additional work. I would stick with how it is now; I just wanted to mention it.

Member Author

My first approach was to define kmer and samtools as dependencies, but the Makefiles for Celera Assembler look for the samtools libs in its own source folder. I thought that compiling it the way the developers suggest would be easier than patching the Makefiles.

Contributor

I agree, disregard my previous comment!
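
For reference, a minimal sketch of what the three `premakeopts` lines add up to. It assumes the easyblock simply prepends `premakeopts` to its `make` call; the helper `compose_build_command` is hypothetical, and any extra flags (for example a `-j` option derived from `parallel`) are left out.

```python
def compose_build_command(prefix, make_cmd='make'):
    """Join a shell prefix and the make call (assumed behaviour, not EasyBuild code)."""
    return prefix + make_cmd


# The same strings as in the easyconfig under review.
premakeopts = 'cd kmer && make install && cd .. && '
premakeopts += 'cd samtools && make && cd .. && '
premakeopts += 'cd src && '

# Shows the single shell line: bundled kmer first, then samtools, then src.
print(compose_build_command(premakeopts))
```

Under that assumption, the trailing `&& ` on the last line is what makes the final `make` run inside `src`, after the bundled kmer and samtools have been built.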


files_to_copy = ["Linux-amd64/*"]

sanity_check_paths = {
    'files': ["bin/%s" % x for x in ["bogus", "convert-fasta-to-v2.pl", "fastqAnalyze", "markUniqueUnique",
                                     "pacBioToCA", "uidclient"]],
Contributor

There are Perl scripts in there, why not add a Perl dependency here?

Member Author

[escobar]@login11:bin$ head -1 convert-fasta-to-v2.pl 
#!/usr/bin/perl

The shebang is hard-coded to /usr/bin/perl, so I don't think adding Perl as a dependency makes any difference. I can add it anyway if you prefer.

Contributor

Hmm, at least patch that to #!/usr/bin/env perl?

Member

+1 on making sure /usr/bin/env perl is used
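
A minimal sketch of the shebang rewrite suggested above, for illustration only. The function name `fix_perl_shebang` is hypothetical, and how such a step would be wired into the easyconfig (a patch file, a `sed` call, or something else) is left open. It only touches a first line that is exactly `#!/usr/bin/perl`; shebangs carrying options such as `-w` are left alone, since `env` does not reliably pass extra arguments through on Linux.

```python
import re

# Matches a first line that is exactly "#!/usr/bin/perl" (no interpreter options).
PERL_SHEBANG = re.compile(r'^#!\s*/usr/bin/perl\s*$')


def fix_perl_shebang(path):
    """Rewrite a hard-coded perl shebang in place; return True if the file changed."""
    with open(path) as handle:
        lines = handle.readlines()
    # Leave empty files and files with any other first line untouched.
    if not lines or not PERL_SHEBANG.match(lines[0]):
        return False
    lines[0] = '#!/usr/bin/env perl\n'
    with open(path, 'w') as handle:
        handle.writelines(lines)
    return True
```

Calling it on a script such as `bin/convert-fasta-to-v2.pl` would make the script pick up whichever `perl` comes first in `$PATH`, which is what makes a Perl dependency in the easyconfig meaningful.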

    'dirs': ["lib"],
}

moduleclass = 'bio'

Contributor

@pescobar: remove empty line and address other comments