AttributeError: 'DataFrame' object has no attribute 'sort_values' #65

Open
sminot opened this issue Sep 11, 2017 · 12 comments

@sminot

sminot commented Sep 11, 2017

Using the docker image for aweimann/traitar:release (38eaf28de0a1), I got the following error:

Traceback (most recent call last):
  File "/usr/local/bin/hmmer2filtered_best", line 15, in <module>
    aggregate_domain_hits(filtered_df, args.out_best_f)
  File "/usr/local/lib/python2.7/dist-packages/traitar/hmmer2filtered_best.py", line 52, in aggregate_domain_hits
    filtered_df.sort_values(by = ["target name", "query name"], inplace = True)
  File "/usr/lib/python2.7/dist-packages/pandas/core/generic.py", line 1815, in __getattr__
    (type(self).__name__, name))
AttributeError: 'DataFrame' object has no attribute 'sort_values'

FIX: My guess is that this is a problem with the version of Pandas in the image. So I updated pandas in the container to pandas==0.20.3 with pip. I also updated numexpr to 2.4.6 for compatibility with that version of pandas.

RESULT: After making those two updates, traitar ran with no errors.
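The traceback is consistent with a pandas release that predates `DataFrame.sort_values`, which was introduced in pandas 0.17.0 as the replacement for `DataFrame.sort`. Besides upgrading pandas, the call could be made tolerant of both APIs. The following is only a sketch, not traitar's actual code: `sort_by_columns` is a hypothetical helper, and unlike the original `inplace = True` call it returns the sorted frame.

```python
def sort_by_columns(df, columns):
    """Sort a DataFrame by the given columns on both old and new pandas."""
    if hasattr(df, "sort_values"):
        # pandas >= 0.17.0 provides sort_values(by=...)
        return df.sort_values(by=columns)
    # Older pandas only has the legacy sort(columns=...)
    return df.sort(columns=columns)


# Hypothetical use at the failing line in hmmer2filtered_best.py:
# filtered_df = sort_by_columns(filtered_df, ["target name", "query name"])
```

Checking for the method rather than comparing version strings keeps the helper independent of how a given pandas build reports its version.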

SUGGESTION: Pin the versions of all Python packages that are known to work (pip freeze > requirements.txt), then install from that list in the Dockerfile (pip install -r requirements.txt). That way the build no longer depends on whichever versions happen to be the most recent at build time.
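A pinned list can also be checked against an environment before a run. The sketch below assumes a requirements file containing only simple `name==version` lines; `parse_pins` and `find_mismatches` are hypothetical helpers, and the "installed" versions are made-up illustration data rather than the contents of the actual image.

```python
def parse_pins(requirements_text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank or unpinned lines
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


# The two versions reported as working in this thread.
requirements = """
pandas==0.20.3
numexpr==2.4.6
"""
pins = parse_pins(requirements)

# Hypothetical environment in which an older pandas is installed.
problems = find_mismatches(pins, {"pandas": "0.16.2", "numexpr": "2.4.6"})
```

In a real environment the `installed` mapping could be built from the output of `pip freeze` using the same `parse_pins` helper.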

@abremges
Collaborator

Thank you, @sminot, greatly appreciated! 👍

@aweimann
Owner

@sminot sorry for my late reply; I was on holiday. Thank you for the great suggestions! 👍

@lhor

lhor commented Oct 24, 2017

After following the installation I'm having the exact same issue reported by @sminot. Updating pandas and numexpr didn't solve the problem.

@lhor

lhor commented Oct 26, 2017

@sminot Would you please provide the output of pip freeze from your working image? Thanks!

@palomo11

Hi,
Any news about this?

After installing traitar in a virtualenv and installing an older version of pandas:

virtualenv traitar-env 
source /home/name/traitar-env/bin/activate

PATH=$PATH:/home/name/traitar-env/bin/

source ~/.bashrc
pip install pandas==0.19
(traitar-env) python2.7
Python 2.7.14 
 
>>> import pandas
>>> print('The pandas version is {}.'.format(pandas.__version__))
The pandas version is 0.19.0.

I got this error:

/home/name/scripts/traitar-env/bin/hmmer2filtered_best.py:50: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
 filtered_df.sort(columns = ["target name", "query name"], inplace = True)

/bin/sh: -c: line 0: syntax error near unexpected token `('

/bin/sh: -c: line 0: `domtblout2gene_generic.py traitar_bins/annotation/pfam/summary.dat  <(ls traitar_bins/annotation/pfam/*_filtered_best.dat) /home/name/traitar-env/lib/python2.7/site-packages/traitar/data/models/phypat.tar.gz'

Traceback (most recent call last):
 File "/home/name/traitar-env/bin/predict.py", line 92, in <module>
   annotate_and_predict(pt_models, args.annotation_matrix,  args.out_dir, args.voters) 
 File "/home/name/traitar-env/bin/predict.py", line 67, in annotate_and_predict
   m = ps.read_csv(summary_f, sep="\t", index_col = 0)
 File "/home/name/traitar-env/lib/python2.7/site-packages/pandas/io/parsers.py", line 645, in parser_f
   return _read(filepath_or_buffer, kwds)
 File "/home/name/traitar-env/lib/python2.7/site-packages/pandas/io/parsers.py", line 388, in _read
   parser = TextFileReader(filepath_or_buffer, **kwds)
 File "/home/name/traitar-env/lib/python2.7/site-packages/pandas/io/parsers.py", line 729, in __init__
   self._make_engine(self.engine)
 File "/home/name/traitar-env/lib/python2.7/site-packages/pandas/io/parsers.py", line 922, in _make_engine
   self._engine = CParserWrapper(self.f, **self.options)
 File "/home/name/traitar-env/lib/python2.7/site-packages/pandas/io/parsers.py", line 1389, in __init__
   self._reader = _parser.TextReader(src, **kwds)
 File "pandas/parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:4019)
 File "pandas/parser.pyx", line 665, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:7967)
IOError: File traitar_bins/annotation/pfam/summary.dat does not exist

Any idea what is going wrong or how to solve it?

@aweimann
Owner

Thank you for the reminder. Your error indicates that the bash process substitution is not working. This is because /bin/sh is being used instead of /bin/bash, even though /bin/bash is hard-coded. I'm not really sure why this would happen. Can you please make sure /bin/bash is the standard bash?

Many thanks,

Aaron
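For context, `<(...)` is a bash feature that a plain POSIX `/bin/sh` does not have to support, which matches the `syntax error near unexpected token` message above. How traitar launches its commands is not shown in this thread, so the following is only a sketch of one way to force bash from Python: `subprocess` with `shell=True` uses `/bin/sh` on POSIX systems unless `executable` is set. `run_in_bash` is a hypothetical helper, and it assumes bash is installed at `/bin/bash`.

```python
import subprocess


def run_in_bash(command):
    """Run a command line under bash and return its standard output as text."""
    return subprocess.check_output(
        command,
        shell=True,
        executable="/bin/bash",  # without this, shell=True uses /bin/sh on POSIX
        universal_newlines=True,
    )


# Process substitution, the same construct used in the failing command.
output = run_in_bash("cat <(echo hello)")
```

Running the same command string without `executable="/bin/bash"` is a quick way to check whether the system's `/bin/sh` accepts process substitution.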

@fungs
Collaborator

fungs commented Feb 26, 2018

@aweimann: To circumvent the Bash process substitution syntax (in case you want to get rid of the Bash requirement), you could read the list via stdin.
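A minimal sketch of that idea, assuming the script receives one file path per line on standard input. `read_paths` is a hypothetical helper, not part of traitar, and the file names are made up for illustration.

```python
import io


def read_paths(stream):
    """Return the non-empty lines of a text stream, one file path per line."""
    return [line.strip() for line in stream if line.strip()]


# Stand-in for sys.stdin so the example runs on its own.
fake_stdin = io.StringIO(
    u"sample1_filtered_best.dat\n\nsample2_filtered_best.dat\n"
)
paths = read_paths(fake_stdin)
```

In the real script the call would be `read_paths(sys.stdin)`, and the list could be supplied through an ordinary pipe, which `/bin/sh` also supports. How the script's other arguments would change is left open here.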

@aweimann
Owner

aweimann commented Mar 2, 2018

Thanks @fungs, I will look into that. @palomo11, sorry this is taking a bit longer, and thanks for your continued interest!

@palomo11

palomo11 commented Mar 2, 2018

Hi @aweimann, it finally worked! I followed your suggestion, and after changing to /bin/bash it ran fine.

@nick-youngblut

I'm not sure if this repo is maintained anymore, but pandas==0.20.3 is super old, and numexpr==2.4.6 isn't even available on conda-forge anymore. Are there any plans to update traitar?

Should we consider this package no longer maintained?

I could fork this package and try to update the pandas code (and maybe add some unit tests too), but @aweimann will you accept the pull request?

@aweimann
Owner

Sorry, I haven't had much time to look after the repo, but I will take a look in the next few days.

@nick-youngblut

Thanks for the quick response! Let me know if I can help. Traitar still seems to be the state of the art, given that the code from Farrell et al., 2018 seems to no longer be available (and the paper was never published in a journal, as far as I can tell).

7 participants