pandas.io.common.EmptyDataError: No columns to parse from file #14

Closed
hzafeng opened this issue Oct 3, 2017 · 10 comments

Comments

@hzafeng

hzafeng commented Oct 3, 2017

Hello,
I can now run the first step of dRep with CheckM, but when I reach the second module, "Cluster", I get the following error:


..:: dRep Step 2. Cluster ::..

Step 1. Parse Arguments
Step 2. Perform MASH (primary) clustering
2a. Run pair-wise MASH clustering
[====================] 100.00%
Traceback (most recent call last):
  File "/home/zjs/tools/drep/bin/dRep", line 26, in <module>
    controller.parseArguments(args)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/controller.py", line 144, in parseArguments
    self.dereplicate_wf_operation(**vars(args))
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/controller.py", line 86, in dereplicate_wf_operation
    drep.d_workflows.dereplicate_wrapper(kwargs['work_directory'],**kwargs)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/d_workflows.py", line 36, in dereplicate_wrapper
    drep.d_cluster.d_cluster_wrapper(wd, **kwargs)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/d_cluster.py", line 288, in d_cluster_wrapper
    Cdb, Mdb, Ndb = cluster_genomes(Bdb, data_folder, **kwargs)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/d_cluster.py", line 104, in cluster_genomes
    Mdb = all_vs_all_MASH(Bdb, data_folder, **kwargs)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/d_cluster.py", line 632, in all_vs_all_MASH
    table = pd.read_csv(file,sep='\t',header = None)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 645, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 388, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 729, in __init__
    self._make_engine(self.engine)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 922, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 1389, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 538, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:5896)
pandas.io.common.EmptyDataError: No columns to parse from file


If I use the "Skipmash" parameter I can get past this step, but at the final step the same error occurs again:

[zjs@www drep]$ /home/zjs/tools/drep/bin/dRep evaluate ./drep_out/ -e all
will compare winners
[====================] 100.00%
Traceback (most recent call last):
  File "/home/zjs/tools/drep/bin/dRep", line 26, in <module>
    controller.parseArguments(args)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/controller.py", line 161, in parseArguments
    self.evaluate_operation(**vars(args))
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/controller.py", line 81, in evaluate_operation
    drep.d_evaluate.d_evaluate_wrapper(kwargs['work_directory'],**kwargs)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/d_evaluate.py", line 29, in d_evaluate_wrapper
    Wmdb, Wndb = compare_winners(wd,**kwargs)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/d_evaluate.py", line 67, in compare_winners
    Wmdb = dClust.all_vs_all_MASH(Bdb,data_folder)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/drep/d_cluster.py", line 632, in all_vs_all_MASH
    table = pd.read_csv(file,sep='\t',header = None)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 645, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 388, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 729, in __init__
    self._make_engine(self.engine)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 922, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/zjs/.pyenv/versions/3.5.1/lib/python3.5/site-packages/pandas/io/parsers.py", line 1389, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 538, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:5896)
pandas.io.common.EmptyDataError: No columns to parse from file


Could you help me with this issue?

@MrOlm
Owner

MrOlm commented Oct 3, 2017

Hello,

Are both Mash and MUMmer properly installed?

To check, please let me know the output from the following command:

$ dRep bonus test --check_dependencies

-Matt

@hzafeng
Author

hzafeng commented Oct 5, 2017

Hello, Matt,

This is the output of the dependency check:
[zjs@www maxbin]$ /home/zjs/tools/drep/bin/dRep bonus test --check_dependencies
Loading work directory
Checking dependencies
mash.................................... all good (location = /home/zjs/hhf/soft/Mash/bin/mash)
nucmer.................................. all good (location = /home/zjs/hhf/soft/MUMmer3.23/nucmer)
checkm.................................. all good (location = /home/zjs/tools/pitchfork/deployment/bin/checkm)
ANIcalculator........................... all good (location = /home/zjs/hhf/soft/ANIcalculator_v1/ANIcalculator)
prodigal................................ all good (location = /usr/local/bin/prodigal)
centrifuge.............................. all good (location = /home/zjs/hhf/soft/centrifuge-1.0.3-beta/centrifuge)

I still get the error "pandas.io.common.EmptyDataError: No columns to parse from file".

@MrOlm
Owner

MrOlm commented Oct 5, 2017

Hmmm... that pandas error is what happens when it tries to read an empty file into a dataframe. So this isn't really a problem with pandas, but a sign that something in dRep failed along the way.
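
For illustration, here is a minimal sketch of how a zero-byte file produces this error and how a caller could detect it. It assumes a recent pandas, where the exception is importable from `pandas.errors` (the traceback in this thread shows it under `pandas.io.common`); the helper name `read_tsv_or_none` is hypothetical and not part of dRep.

```python
import os
import tempfile

import pandas as pd
from pandas.errors import EmptyDataError


def read_tsv_or_none(path):
    """Read a headerless TSV; return None if the file has nothing to parse."""
    try:
        return pd.read_csv(path, sep="\t", header=None)
    except EmptyDataError:
        # Raised when the file contains no columns at all (e.g. zero bytes).
        return None


# Demonstration with a zero-byte file standing in for an empty MASH table.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "MASH_table.tsv")
    open(empty_path, "w").close()
    print("empty file ->", read_tsv_or_none(empty_path))
```

Returning `None` here is only one possible design; in a pipeline it may be more useful to raise an error naming the upstream tool whose output was empty.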

Would you mind sending me the log file? It's located in the log folder and is called logger.log

Thanks,
-Matt

@hzafeng
Author

hzafeng commented Oct 6, 2017

logger.log

@MrOlm
Owner

MrOlm commented Oct 6, 2017

Thank you for the log. This is strange; I'm not sure yet what the problem is.

Could you show me a list of what is in the folder:

...test_drep_out/data/MASH_files/

As well as what is in the folder:

...test_drep_out/data/MASH_files/sketches/

?

Also, please let me know if any of the files in either of those folders is empty. I'm trying to figure out where Mash is messing up... thanks!
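
A quick way to check both folders for empty files is sketched below. The function name `find_empty_files` and the example path are assumptions for illustration (the real work directory prefix is elided in this thread), and the search is recursive so it also covers the sketches subfolder.

```python
from pathlib import Path


def find_empty_files(folder):
    """Return sorted paths of zero-byte regular files under folder (recursive)."""
    root = Path(folder)
    return sorted(p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0)


if __name__ == "__main__":
    # Hypothetical path; substitute the real dRep work directory.
    mash_dir = Path("test_drep_out/data/MASH_files")
    if mash_dir.is_dir():
        for path in find_empty_files(mash_dir):
            print("empty:", path)
    else:
        print("folder not found:", mash_dir)
```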

-Matt

@hzafeng
Author

hzafeng commented Oct 6, 2017

Thank you for your prompt reply.

The MASH_files/ folder contains only empty entries:
[zjs@www MASH_files]$ ll
total 0
-rwxrwxrwx. 1 root root 0 Oct 6 10:45 MASH_table.tsv
drwxrwxrwx. 1 root root 0 Oct 6 10:45 sketches_

The sketches folder is also empty.

Thank you

If you need any other information, you can contact me by email so I can reply immediately.

I will copy this message and send it to your Gmail.

@MrOlm
Owner

MrOlm commented Oct 6, 2017

Hello,

OK, so it seems that Mash isn't working properly. This could be an issue with dRep, or an issue with the program Mash itself. Could you see if Mash is working properly on its own? For example, try making a sketch:

mash sketch maxbinout_W0P1.004.fasta MASH_files/maxbinout_W0P1.004.fasta.msh

Let me know if that command works and actually generates a file.
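
The same check can be scripted. This is only a sketch under one assumption: it uses Mash's `-o` option to set the output prefix (Mash appends `.msh` to it), since positional arguments to `mash sketch` are treated as input files. The helper names `build_sketch_command` and `sketch_and_verify` are hypothetical.

```python
import os
import shutil
import subprocess


def build_sketch_command(fasta, out_prefix, mash="mash"):
    """Build a `mash sketch` command; Mash appends '.msh' to out_prefix."""
    return [mash, "sketch", "-o", out_prefix, fasta]


def sketch_and_verify(fasta, out_prefix, mash="mash"):
    """Run mash sketch and report whether a non-empty .msh file was produced."""
    if shutil.which(mash) is None:
        raise FileNotFoundError(f"{mash} not found on PATH")
    result = subprocess.run(
        build_sketch_command(fasta, out_prefix, mash),
        capture_output=True,
        text=True,
    )
    sketch = out_prefix + ".msh"
    ok = (
        result.returncode == 0
        and os.path.isfile(sketch)
        and os.path.getsize(sketch) > 0
    )
    # stderr is returned so the actual Mash error message can be reported.
    return ok, result.stderr
```

For example, `sketch_and_verify("maxbinout_W0P1.004.fasta", "MASH_files/maxbinout_W0P1.004.fasta")` would return whether the sketch file was written into MASH_files/, together with whatever Mash printed to stderr.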

Thanks,
-Matt

@hzafeng
Author

hzafeng commented Oct 7, 2017

Hello,
I can run Mash on its own like this:
mash sketch maxbinout_W0P1.004.fasta
And it generates a file named:
maxbinout_W0P1.004.fasta.msh

But it fails when I run this command:
mash sketch maxbinout_W0P1.004.fasta MASH_files/maxbinout_W0P1.004.fasta.msh

The Mash version is v1.1.1.

@MrOlm
Owner

MrOlm commented Oct 9, 2017

Hello,

That's very strange that it's able to make the file normally, but not when it's written to that specific folder... What is the error that it gives? Unfortunately, I've never encountered this issue before, but it sounds like an issue with permissions. May I ask what operating system you are using?

Using the "Skip_mash" parameter may be a way to get around this issue. Using that parameter makes the program take longer to run, but it will be just as accurate as if Mash was run, and in some cases even more accurate.

I'm worried that the underlying "permissions" issue may prevent other programs (like MUMmer, the program used in secondary clustering) from working as well. But if for some reason it's a Mash-specific issue, using the "Skip_mash" parameter is a great idea.
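
One way to test the suspected permissions problem independently of Mash is to try writing a small file into the folder and confirm the bytes actually land. This is a minimal sketch; the function name `can_write_to` is an assumption for illustration and not part of dRep.

```python
import os
import tempfile


def can_write_to(folder):
    """Return True if a file can be created in folder and data written to it."""
    payload = b"write-test"
    try:
        with tempfile.NamedTemporaryFile(dir=folder) as handle:
            handle.write(payload)
            handle.flush()
            # Confirm the data reached the file, not just that it was created.
            return os.fstat(handle.fileno()).st_size == len(payload)
    except OSError:
        # Covers a missing folder, denied permission, read-only mounts, etc.
        return False


if __name__ == "__main__":
    print(can_write_to("."))
```

If this returns False for the MASH_files folder but True for the folder where Mash works, the problem is likely with the filesystem or its permissions rather than with Mash or dRep.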

Best,
-Matt

@hzafeng
Author

hzafeng commented Oct 10, 2017

Hello, Matt,

Oh, so for now I can only run dRep with skipmash.

I have been using CentOS, and I plan to try it on Ubuntu.

Thanks for your great help!

HU

@MrOlm MrOlm closed this as completed Oct 19, 2017