
Test_backend folder missing for test_suite.py #13

Closed
jeffkimbrel opened this issue Aug 24, 2017 · 10 comments

@jeffkimbrel

I'm checking to see if dRep installed correctly, so I am running the test_suite.py script from the tests folder. I get an error:

FileNotFoundError: [Errno 2] No such file or directory: '/XXX/XXX/drep/tests/../tests/test_backend/ecoli_wd'

Indeed, this isn't in the tests folder. Is there a workaround to get the test scripts to work?

Thanks.
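
(For reference, a hedged sketch of the invocation described above; the exact command and working directory are assumptions based on the description, and the /XXX/XXX/ path in the error is simply the local install location.)

# Assumed way of running the bundled test suite from the cloned repository
$ cd drep/tests
$ python test_suite.py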

@jeffkimbrel
Author

I see this is similar to #11. I'll close this and wait to see if test_suite.py gets fixed in the future.

@MrOlm
Owner

MrOlm commented Aug 25, 2017

Hi Jeff,

Sorry about this inconvenience. All that's needed is a blank folder in the tests folder called test_backend.

You can either make it yourself, or download the new version (1.1.2) which will include it.

Best,
-Matt
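
A minimal sketch of the two options described above, assuming the repository was cloned into a local drep/ directory (the upgrade command is an assumption, not taken from this thread):

# Create the empty folder that test_suite.py expects
$ mkdir -p drep/tests/test_backend

# Or upgrade to 1.1.2, which ships the folder
$ pip install --upgrade drep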

@jeffkimbrel
Author

Thanks, that worked to get it started. I did get an error, however:

Traceback (most recent call last):
  File "test_suite.py", line 633, in <module>
    test_short()
  File "test_suite.py", line 617, in test_short
    cluster_test()
  File "test_suite.py", line 581, in cluster_test
    verifyCluster.run()
  File "test_suite.py", line 328, in run
    self.skipsecondary_test()
  File "test_suite.py", line 367, in skipsecondary_test
    assert compare_dfs(db1, db2), "{0} is not the same!".format('Mdb')
AssertionError: Mdb is not the same!

It isn't clear to me at which calculation this failed, but it was step 4, after functional test 1 passed.

Let me know if I should open this up in a new issue.

Jeff

@MrOlm
Owner

MrOlm commented Aug 25, 2017 via email

@jeffkimbrel
Author

Yep, it looks like some dependencies are missing, but the missing ones are listed as optional. I suppose the test Python script uses these dependencies?

$ dRep bonus test --check_dependencies
Loading work directory
Checking dependencies
mash.................................... all good        (location = /usr/local/bin/mash)
nucmer.................................. all good        (location = .../scripts/MUMmer3.23/nucmer)
checkm.................................. !!! ERROR !!!   (location = None)
ANIcalculator........................... !!! ERROR !!!   (location = None)
prodigal................................ all good        (location = .../scripts/prodigal)
centrifuge.............................. !!! ERROR !!!

@jeffkimbrel reopened this Sep 7, 2017
@MrOlm
Owner

MrOlm commented Sep 7, 2017

Hmm, yes, some of those dependencies are used by the test suite, but with the ones you do have working I wouldn't really expect it to crash there.

Is the program working when you use it on your own data?

@jeffkimbrel
Author

jeffkimbrel commented Sep 7, 2017

It seems like I keep running into problems with checkM not being fully set up. I am trying to run all of this on my laptop, so I can't really use checkM because it requires >16 GB of memory. I actually do have it installed, but I don't have all of the data files downloaded.

Running on my own files...

$ dRep dereplicate_wf dRep_test -g bins/*.fasta --skipCheckM

When I use the --skipCheckM flag I can get through the Filtering and Clustering steps, but it fails on the Choose step when it attempts to run checkM anyway (disregarding the flag).

Also, I manage Python environments using Anaconda rather than pyenv. My default Python is 3.4.5. I wonder if that is also messing something up... I think checkM is the only thing requiring Python 2.x, correct?

Thanks for your help.

@MrOlm
Owner

MrOlm commented Sep 8, 2017 via email

@jeffkimbrel
Author

I have figured out how to run checkM on NERSC... is it possible to use an "external" checkM results dataset with dRep?

My goal is pretty much the advertised purpose: I have tons of metagenomes that would be too computationally expensive to combine and co-assemble, so I want to take bins from either single metagenomes or co-assembled replicates and "merge" the bins.

@MrOlm
Owner

MrOlm commented Sep 29, 2017

Yes, there is a way to use "external" checkM results.

When using the dereplicate_wf (probably what you want), there's an option --Chdb which can be used to provide external checkM results.

They need to be in the --tab_table format, though. The checkM command to generate this is:

checkm qa --tab_table -o 2

An example of how it should look is attached.

Finally, make sure that for the "Bin Id" column you have the name of the genome WITH the file extension, and WITHOUT the path to the genome (as is the case in the example sheet provided).

Best,
-Matt

Chdb.csv.zip
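
A hedged sketch of the full workflow described in the comment above. Only --tab_table, -o 2, -g, and --Chdb come from this thread; the checkM positional arguments, the -f output flag, the lineage_wf step, and all file names are assumptions added to make the example concrete:

# Run checkM on the genome bins (assumed lineage_wf step; adjust threads, extension, and paths)
$ checkm lineage_wf -t 8 -x fasta bins/ checkm_out/

# Export the results in --tab_table format with output format 2, written to a file
$ checkm qa checkm_out/lineage.ms checkm_out/ --tab_table -o 2 -f Chdb.csv

# Hand the external checkM results to dRep instead of letting it run checkM itself
# (the "Bin Id" column must hold genome file names with extension, without any path)
$ dRep dereplicate_wf dRep_test -g bins/*.fasta --Chdb Chdb.csv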

@MrOlm closed this as completed Oct 19, 2017