Errors in concoct tutorial: issue with script? #180

Closed
jonoave opened this issue Dec 5, 2017 · 10 comments

@jonoave

jonoave commented Dec 5, 2017

Hello,

I installed concoct by creating a virtual environment with anaconda and then running "conda install concoct". I was going through the concoct tutorial here: https://concoct.readthedocs.io/en/latest/complete_example.html, using the CONCOCT-test-data, and encountered 2 errors:

  1. Generate coverage table.
    Upon running the steps there, I get this error:
Traceback (most recent call last):
File "/home/user1/anaconda3/envs/concoct_env/scripts/gen_input_table.py", line 132, in <module>
   raise Exception("Nr of names in samplenames should be equal to nr of given bamfiles")
Exception: Nr of names in samplenames should be equal to nr of given bamfiles
  2. Generate linkage table
Traceback (most recent call last):
File "/home/user1/anaconda3/envs/concoct_env/scripts/bam_to_linkage.py", line 50, in <module>
    import pysam
File "/home/user1/anaconda3/envs/concoct_env/lib/python2.7/site-packages/pysam/__init__.py", line 5, in <module>
    from pysam.libchtslib import *
ImportError: /home/user1/anaconda3/envs/concoct_env/lib/./libcurl.so.4: undefined symbol: SSL_CTX_set_alpn_protos

Is this an issue with my installation or changes in the python packages? Thanks!

@alneberg
Member

alneberg commented Dec 5, 2017

Hello @jonoave,

I need to know slightly more in order to understand exactly what's going on. Did you investigate whether the error message in 1 makes sense, namely, do you have the same number of names in samplenames as the number of .bam files? If you do, can you paste the exact command?

For 2 there seems to be some information missing in the error message, can you please paste the full message?
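
The count comparison behind that exception can be checked by hand before calling the script. Below is a minimal sketch, assuming the BAM files sit one level below a common directory; the function names and the glob pattern are illustrative and are not taken from gen_input_table.py.

```python
from pathlib import Path


def find_bam_files(root, pattern="*/*.bam"):
    """Return the BAM paths under root that match pattern (assumed layout), sorted."""
    return sorted(Path(root).glob(pattern))


def check_counts(sample_names, bam_files):
    """Raise ValueError unless there is exactly one sample name per BAM file."""
    if len(sample_names) != len(bam_files):
        raise ValueError(
            "%d sample names but %d BAM files" % (len(sample_names), len(bam_files))
        )
```

If `find_bam_files("map")` returns an empty list, no BAM files were produced in the first place, and any non-empty list of sample names will fail the comparison.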

@jonoave
Author

jonoave commented Dec 5, 2017

Hello @alneberg,

Thanks for your quick reply. As I said, I was following the tutorial exactly to get a feel for concoct, and everything was run on the concoct test data. I'm still quite new to this, so I'm not fully familiar with what one would typically expect from the input/output of each step.

  1. For error 1, I have no idea what it really means. I looked at the number of folders in Concoct-test-data-0.3.2/map and my generated outputs in Concoct-complete-example/map.
    a. There are the same number of folders (i.e. same number of samples)
    b. OK, I think I see the problem. In my output folders, there are no bam index files (.bai).

I reran the previous step (Map the Reads onto the Contigs), and I think the last step, the one using samtools, is not executed properly for any of the samples:

for f in $CONCOCT_TEST/reads/*_R1.fa; do
    mkdir -p map/$(basename $f);
    cd map/$(basename $f);
    bash $CONCOCT/scripts/map-bowtie2-markduplicates.sh -ct 1 -p '-f' $f $(echo $f | sed s/R1/R2/) pair $CONCOCT_EXAMPLE/contigs/velvet_71_c10K.fa asm bowtie2;
    cd ../..;
done

The output is similar for each sample:

Using: /home/user1/anaconda3/bin/bowtie2
Using: /home/user1/anaconda3/envs/concoct_env/bin/samtools
Using: /home/user1/anaconda3/bin/genomeCoverageBed
100000 reads; of these:
  100000 (100.00%) were paired; of these:
    61498 (61.50%) aligned concordantly 0 times
    38072 (38.07%) aligned concordantly exactly 1 time
    430 (0.43%) aligned concordantly >1 times
    ----
    61498 pairs aligned concordantly 0 times; of these:
      21002 (34.15%) aligned discordantly 1 time
    ----
    40496 pairs aligned 0 times concordantly or discordantly; of these:
      80992 mates make up the pairs; of these:
        76320 (94.23%) aligned 0 times
        3170 (3.91%) aligned exactly 1 time
        1502 (1.85%) aligned >1 times
61.84% overall alignment rate
[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files
Usage: samtools sort [options...] [in.bam]
Options:
  -l INT     Set compression level, from 0 (uncompressed) to 9 (best)
  -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]
  -n         Sort by read name
  -t TAG     Sort by value of TAG. Uses position as secondary index (or read name if -n is set)
  -o FILE    Write final output to FILE rather than standard output
  -T PREFIX  Write temporary files to PREFIX.nnnn.bam
      --input-fmt-option OPT[=VAL]
               Specify a single input file format option in the form
               of OPTION or OPTION=VALUE
  -O, --output-fmt FORMAT[,OPT[=VAL]]...
               Specify output format (SAM, BAM, CRAM)
      --output-fmt-option OPT[=VAL]
               Specify a single output file format option in the form
               of OPTION or OPTION=VALUE
      --reference FILE
               Reference sequence FASTA FILE [null]
  -@, --threads INT
               Number of additional threads to use [0]
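
The missing outputs can be listed per sample rather than by inspecting each folder. This is a rough sketch, assuming one subdirectory per sample under map/ as created by the loop above; `missing_outputs` is a hypothetical helper, not part of CONCOCT.

```python
from pathlib import Path


def missing_outputs(map_dir):
    """Map each sample directory name to a list of problems found in it."""
    problems = {}
    sample_dirs = sorted(p for p in Path(map_dir).iterdir() if p.is_dir())
    for sample_dir in sample_dirs:
        issues = []
        bams = sorted(sample_dir.glob("*.bam"))
        if not bams:
            issues.append("no .bam file")
        for bam in bams:
            if bam.stat().st_size == 0:
                issues.append("%s is empty" % bam.name)
            # An index may be named x.bam.bai or x.bai depending on the tool that wrote it.
            has_index = (
                Path(str(bam) + ".bai").exists() or bam.with_suffix(".bai").exists()
            )
            if not has_index:
                issues.append("%s has no .bai index" % bam.name)
        if issues:
            problems[sample_dir.name] = issues
    return problems
```

An empty dictionary from `missing_outputs("map")` would mean every sample folder has at least one non-empty BAM file with an index next to it.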

@alneberg
Member

alneberg commented Dec 5, 2017

Aha, this is probably related to the pull request #179. The command-line syntax of samtools sort has changed since the tutorial was created. The pull request will be incorporated as soon as we release a new version of concoct, but I think it would be faster for you to edit the script yourself according to the pull request.

Please come back regarding problem 2 which I think is caused by something else.
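
For reference, the two calling conventions can be contrasted as follows. This is only an illustration based on the usage message above and on the older `samtools sort <in.bam> <out.prefix>` form; it is not the content of pull request #179, and the function names are made up.

```python
def legacy_sort_cmd(in_bam, out_prefix):
    """Older form: the output is a positional prefix and samtools appends '.bam'."""
    return ["samtools", "sort", in_bam, out_prefix]


def current_sort_cmd(in_bam, out_bam, tmp_prefix=None):
    """Newer form: the output file is named with -o, temporary files with -T."""
    cmd = ["samtools", "sort", "-o", out_bam]
    if tmp_prefix is not None:
        cmd += ["-T", tmp_prefix]
    cmd.append(in_bam)
    return cmd
```

With the newer samtools, the legacy argument list is what triggers the `[bam_sort] Use -T PREFIX / -o FILE` message and the usage text.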

@jonoave
Author

jonoave commented Dec 5, 2017

Hi @alneberg,

Thanks for the reply. For problem 2, I've updated the original post with the missing line.

@alneberg
Member

alneberg commented Dec 5, 2017

Hmm, yes, problem 2 seems to be a more involved problem that I don't think I'm able to solve. I think it's related to the installation of the libcurl package used by pysam, or possibly something even deeper (the version of OpenSSL).

From some googling, this issue seems related:

https://stackoverflow.com/questions/40339325/undefined-symbol-ssl-ctx-set-alpn-protos#40351810

But I can't promise it's the exact same issue.

Hope you can find help elsewhere for this issue.

Johannes
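
One way to narrow this down is to ask whether the OpenSSL library found by the default lookup exports the symbol named in the ImportError. The sketch below uses only the standard `ctypes` module; `library_exports` is a hypothetical helper, and the library it finds is not necessarily the one that libcurl resolves inside the conda environment, so a result here is a hint rather than a diagnosis.

```python
import ctypes
import ctypes.util


def library_exports(lib_name, symbol):
    """Return True/False if the library exports symbol, or None if it cannot be loaded."""
    path = ctypes.util.find_library(lib_name)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # Attribute lookup on a CDLL object fails when the symbol is absent.
    return hasattr(lib, symbol)


print(library_exports("ssl", "SSL_CTX_set_alpn_protos"))
```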

@jonoave
Author

jonoave commented Dec 12, 2017

Hi @alneberg ,

I've fixed the script markDuplicates.sh for issue 1 and I've also managed to sort out issue 2 (i.e. no error with pysam). But both steps still return the error:

raise Exception("Nr of names in samplenames should be equal to nr of given bamfiles")

likely due to the missing .bai files.

So I reran the previous step of the tutorial, "Map the Reads onto the Contigs". The samtools usage message is gone, but now the Java help text is printed instead:

Using: /home/user1/anaconda3/bin/bowtie2
Using: /home/user1/anaconda3/envs/concoct_env/bin/samtools
Using: /home/user1/anaconda3/bin/genomeCoverageBed
100000 reads; of these:
  100000 (100.00%) were paired; of these:
    79893 (79.89%) aligned concordantly 0 times
    19907 (19.91%) aligned concordantly exactly 1 time
    200 (0.20%) aligned concordantly >1 times
    ----
    79893 pairs aligned concordantly 0 times; of these:
      11360 (14.22%) aligned discordantly 1 time
    ----
    68533 pairs aligned 0 times concordantly or discordantly; of these:
      137066 mates make up the pairs; of these:
        132076 (96.36%) aligned 0 times
        4090 (2.98%) aligned exactly 1 time
        900 (0.66%) aligned >1 times
33.96% overall alignment rate
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=1g; support was removed in 8.0
Usage: java [-options] class [args...]
           (to execute a class)
   or  java [-options] -jar jarfile [args...]
           (to execute a jar file)
where options include:
    -d32          use a 32-bit data model if available
    -d64          use a 64-bit data model if available
    -server       to select the "server" VM
                  The default VM is server,
                  because you are running on a server-class machine.


    -cp <class search path of directories and zip/jar files>
    -classpath <class search path of directories and zip/jar files>
                  A : separated list of directories, JAR archives,
                  and ZIP archives to search for class files.
    -D<name>=<value>
                  set a system property
    -verbose:[class|gc|jni]
                  enable verbose output
    -version      print product version and exit
    -version:<value>
                  Warning: this feature is deprecated and will be removed
                  in a future release.
                  require the specified version to run
    -showversion  print product version and continue
    -jre-restrict-search | -no-jre-restrict-search
                  Warning: this feature is deprecated and will be removed
                  in a future release.
                  include/exclude user private JREs in the version search
    -? -help      print this help message
    -X            print help on non-standard options
    -ea[:<packagename>...|:<classname>]
    -enableassertions[:<packagename>...|:<classname>]
                  enable assertions with specified granularity
    -da[:<packagename>...|:<classname>]
    -disableassertions[:<packagename>...|:<classname>]
                  disable assertions with specified granularity
    -esa | -enablesystemassertions
                  enable system assertions
    -dsa | -disablesystemassertions
                  disable system assertions
    -agentlib:<libname>[=<options>]
                  load native agent library <libname>, e.g. -agentlib:hprof
                  see also, -agentlib:jdwp=help and -agentlib:hprof=help
    -agentpath:<pathname>[=<options>]
                  load native agent library by full pathname
    -javaagent:<jarpath>[=<options>]
                  load Java programming language agent, see java.lang.instrument
    -splash:<imagepath>
                  show splash screen with specified image
See http://www.oracle.com/technetwork/java/javase/documentation/index.html for more details.

I think this is because picard has also changed since your tutorial was written: there is no longer a separate MarkDuplicates.jar, and everything is bundled into picard.jar. Do you have a newer version of the script that fixes this issue?

Thanks.
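
The difference described above can be sketched as two argument lists. This assumes the KEY=VALUE argument style used by Picard releases of that period; the function names and file paths are placeholders, and this is not the command line used by the tutorial script.

```python
def legacy_mark_duplicates_cmd(jar_path, in_bam, out_bam, metrics_file):
    """Older layout: one jar per tool, e.g. a standalone MarkDuplicates.jar."""
    return [
        "java", "-jar", jar_path,
        "INPUT=" + in_bam,
        "OUTPUT=" + out_bam,
        "METRICS_FILE=" + metrics_file,
    ]


def bundled_mark_duplicates_cmd(picard_jar, in_bam, out_bam, metrics_file):
    """Bundled layout: picard.jar takes the tool name as its first argument."""
    return [
        "java", "-jar", picard_jar, "MarkDuplicates",
        "INPUT=" + in_bam,
        "OUTPUT=" + out_bam,
        "METRICS_FILE=" + metrics_file,
    ]
```

The only structural change is the jar path and the extra `MarkDuplicates` token; the tool arguments stay in the same position after it.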

@JuanmaMedina

Hello all!

I ran into the same problem as described in issue 2. It seems to be caused by an older version of openssl.

For what it's worth, I managed to solve issue 2 by upgrading openssl 1.0.1 to openssl 1.0.2, for example with:
conda update --all

I am sorry for not being able to help with issue 1; I am currently adapting the test dataset to a single sample.

Thanks for the feedback and the comments!
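
To see which OpenSSL version an environment's Python is linked against, the standard `ssl` module can be queried. A small sketch, relying on the fact that `SSL_CTX_set_alpn_protos` first appeared in OpenSSL 1.0.2; `has_alpn_symbol` is a hypothetical helper, and the version reported is the one used by Python's own `ssl` module, which is assumed (not guaranteed) to match the one libcurl loads.

```python
import ssl


def has_alpn_symbol(version_info):
    """Return True if the version tuple is at least 1.0.2."""
    return tuple(version_info[:3]) >= (1, 0, 2)


print(ssl.OPENSSL_VERSION)
print(has_alpn_symbol(ssl.OPENSSL_VERSION_INFO))
```

Running this before and after `conda update --all` inside the environment would show whether the upgrade actually changed the linked version.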

@alneberg
Member

Sorry for the delay, @jonoave. I agree completely with your understanding of the problem. However, this script is only used for this tutorial, and therefore it has not been updated in quite some time. The main point of this script is to create .bam files that can be consumed by the input table script. Quite often, I'd say you can use the bam files produced by samtools sort directly and skip MarkDuplicates.
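
Skipping MarkDuplicates could look roughly like the sketch below: sort, then index, and stop there. It assumes a samtools version that accepts `-o` (as in the usage message earlier in the thread) and an unsorted BAM file as input; `sort_and_index` and its `run` parameter are hypothetical and exist only to keep the example easy to dry-run.

```python
import subprocess


def sort_and_index(in_bam, out_bam, run=subprocess.check_call):
    """Sort in_bam into out_bam and index it; `run` executes one command list."""
    run(["samtools", "sort", "-o", out_bam, in_bam])
    run(["samtools", "index", out_bam])
    return out_bam
```

Passing `run=print` shows the two commands without executing them, which is a cheap way to check the paths before running on real data.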

Hi @JuanmaMedina!

Great that you've managed to solve it. Do you know from where the original openssl1.0.1 package was installed? Was it as a direct or indirect dependency for CONCOCT?

@s4251484

s4251484 commented Sep 4, 2018

Hi Johannes @alneberg

First of all, thanks for this amazing tool!
I'm practising CONCOCT via docker using the test data, and I encounter the same samtools problem (no bam file is generated). I'm totally new to this and have no idea how to fix the script markduplicate.sh; I can't figure out how to make changes to the script, which is inside the docker container/image.

Any suggestions?

@alneberg
Member

alneberg commented Sep 4, 2018

Hi @s4251484, I'm afraid the development of CONCOCT is lagging, to say the least. There have been proposed changes to the markduplicate script, as you can see here:

https://github.com/BinPro/CONCOCT/pull/179/files

Hopefully those changes should be easy enough to understand. I haven't tested them yet though.

alneberg closed this as completed Aug 1, 2019