This repository has been archived by the owner. It is now read-only.

Bug 1265609: Fix pandas not getting installed #6292

Merged
openshift-bot merged 1 commit into openshift:master from dinhxuanvu:python-pandas on Oct 30, 2015

5 participants
@dinhxuanvu
Contributor

dinhxuanvu commented Oct 27, 2015

The pandas package fails to install properly when it is listed in setup.py
for Python 2 & 3 applications. The issue is possibly due to a bug in numpy,
a dependency of pandas. This commit allows pandas to be installed using
'pip install -e .' instead of 'python setup.py install' to work around
the numpy bug.

Bug <1265609>
Link <https://bugzilla.redhat.com/show_bug.cgi?id=1265609>

Signed-off-by: Vu Dinh <vdinh@redhat.com>

Contributor

dinhxuanvu commented Oct 27, 2015

@tiwillia @Miciah @dobbymoodge Please review when you have a chance. Thanks. [test]

Contributor

openshift-bot commented Oct 27, 2015

Contributor

dinhxuanvu commented Oct 27, 2015

@soltysh @rhcarvalho Would you mind taking a look at this bug and its fix as well? I would really appreciate your advice on this.

Here are my findings and explanation (excerpted from an email I sent) regarding the bug:

I have attached copies (see [A] & [B]) of the verbose command output from both "python setup.py install" and the new "pip install -e .". Also, here is the link [1] to the known numpy dependency bug that is probably associated with the pandas bug.

[A] http://pastebin.test.redhat.com/323063
[B] http://pastebin.test.redhat.com/323066
[1] numpy/numpy#2434
[2] http://python-packaging-user-guide.readthedocs.org/en/latest/pip_easy_install/
[3] http://stackoverflow.com/questions/3220404/why-use-pip-over-easy-install

Essentially, the 'pip install' option uses pip itself to install the three main dependencies of the pandas package, including numpy (please refer to documents [A] & [B]), while 'python setup.py install' uses setuptools (easy_install), I think (see [2]). As far as I can see, pip is simply better at resolving dependencies. 'python setup.py install' appears to go straight to installing the pandas package without carefully verifying its dependencies, and it ends up failing. The known numpy bug listed above [1] is possibly contributing to the error as well.

In my view, the pip install option is safe: I have tested it with several popular packages, such as django, and it works fine. In fact, pip install is the preferable option; I found several online discussions pointing out its advantages (see [3]). We can simply replace python install with pip install completely if necessary.

Thanks in advance!
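
A minimal sketch of how the comparison described above could be reproduced locally. The project name "demoapp" and the scratch directory are assumptions made for illustration; the script only writes a setup.py that lists pandas as a dependency and prints the two install commands under discussion, without running either.

```shell
#!/bin/bash
# Reproduction sketch: create a throwaway project whose setup.py
# depends on pandas. "demoapp" is a hypothetical project name.
make_demo_project() {
    local dir
    dir=$(mktemp -d) || return 1
    cat > "$dir/setup.py" <<'EOF'
from setuptools import setup

setup(
    name='demoapp',
    version='0.1',
    install_requires=['pandas'],
)
EOF
    echo "$dir"
}

demo_dir=$(make_demo_project)
echo "Project created in: $demo_dir"
# The commands are printed rather than executed, so the sketch needs
# neither network access nor a writable Python environment.
echo "Old path (setuptools/easy_install fetches the dependencies):"
echo "  ( cd $demo_dir; python setup.py install )"
echo "New path (pip fetches the dependencies):"
echo "  ( cd $demo_dir; pip install -e . )"
```

Running each printed command in a fresh virtualenv and comparing the verbose output would be one way to obtain logs like [A] and [B].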

( cd $OPENSHIFT_REPO_DIR; python ${OPENSHIFT_REPO_DIR}/setup.py develop $OPENSHIFT_PYTHON_MIRROR )
else
echo "Running pip install.."
( cd $OPENSHIFT_REPO_DIR; pip install -e . )

@soltysh

soltysh Oct 28, 2015

Member

You probably want to pass $OPENSHIFT_PYTHON_MIRROR and $OPENSHIFT_PIP_TRUSTED_HOST, as we do a couple of lines earlier.
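
A rough sketch of what this suggestion amounts to. It assumes both variables expand to ready-made pip options (for example an index option and a --trusted-host option) or to nothing; the example values below are invented, and the command is printed instead of executed.

```shell
#!/bin/bash
# Assumption: each variable holds a complete pip option string such as
# "-i <index url>" or "--trusted-host <host>", or is empty.
build_pip_install_cmd() {
    # Left unquoted on purpose: empty variables disappear and multi-word
    # option strings split into separate arguments.
    echo pip install -e . $OPENSHIFT_PYTHON_MIRROR $OPENSHIFT_PIP_TRUSTED_HOST
}

# Example values, made up for illustration only.
OPENSHIFT_PYTHON_MIRROR="-i https://mirror.example.com/simple"
OPENSHIFT_PIP_TRUSTED_HOST="--trusted-host mirror.example.com"
build_pip_install_cmd
```

With both variables empty the function prints plain "pip install -e .", so the same line would work whether or not a mirror is configured.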

Member

soltysh commented Oct 28, 2015

One nit, and the change itself LGTM.

As for the transition: from my research, it's perfectly OK to move from 'python setup.py develop' to 'pip install -e .'. Even more, it's highly encouraged. Additionally, we are already using 'pip install -r requirements.txt' a couple of lines earlier, so this will unify the entire install process.
The only question is: do we want such a big change? @danmcp
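
To illustrate the "unified" install process mentioned here, a small sketch that prints the steps a build would take once both go through pip. The function name and the scratch directory are hypothetical, and nothing is actually installed.

```shell
#!/bin/bash
# Print the install steps for the given repo directory, in order.
# Nothing is executed; this only shows that both steps use pip.
plan_install_steps() {
    local repo="$1"
    if [ -f "$repo/requirements.txt" ]; then
        echo "pip install -r requirements.txt"
    fi
    if [ -f "$repo/setup.py" ]; then
        echo "pip install -e ."
    fi
}

# Scratch directory standing in for the application repository.
repo=$(mktemp -d)
touch "$repo/requirements.txt" "$repo/setup.py"
plan_install_steps "$repo"
```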

Contributor

dinhxuanvu commented Oct 28, 2015

@soltysh Thanks for the review and consultation; I certainly appreciate it. I agree with moving to pip install since it's newer and more reliable (and, as you said, we already use it). If @danmcp is fine with this, I will make pip install the default and remove the conditional statement and the python install path from the code as well.

Member

danmcp commented Oct 28, 2015

@soltysh @dinhxuanvu I am only OK with this in general if we are 100% confident that it causes no regressions for existing apps. It seems like a big change.

Contributor

dinhxuanvu commented Oct 28, 2015

Member

abhgupta commented Oct 28, 2015

@soltysh @danmcp The reason we are considering using pip to install these dependencies is that python setup.py install was failing for at least one package. Additionally, we didn't want to make a one-off change in the code that hard-coded the specific package (pandas).

An alternative approach (not sure if it's required or recommended) would be to always try python setup.py install first and, if it fails, fall back to pip install. Does that make sense?
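
The fallback idea could be sketched roughly as follows. The function name is hypothetical, and the demo uses the stand-in commands "false" and "true" so that it runs without touching any Python installation.

```shell
#!/bin/bash
# Run the primary installer; if it exits non-zero, run the fallback.
install_with_fallback() {
    local primary="$1" fallback="$2"
    # Unquoted so that "python setup.py install" splits into a command
    # and its arguments.
    if $primary; then
        echo "primary installer succeeded"
    else
        echo "primary installer failed, falling back"
        $fallback
    fi
}

# Real-world use would be something like:
#   install_with_fallback "python setup.py install" "pip install -e ."
# Stand-in commands keep the sketch side-effect free:
install_with_fallback false true
```

One caveat with this design: a failed first attempt may leave partially installed files behind before the fallback runs, which is one reason an explicit opt-in can be easier to reason about.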

Member

danmcp commented Oct 28, 2015

@abhgupta I think I would prefer an environment variable that lets people opt in to pip, with the default staying the same.

Member

abhgupta commented Oct 28, 2015

@danmcp I would prefer that as well: keep python setup.py install as the default, which the user can override with an env var. Keeping the existing install option as the default will ensure we don't break existing applications.
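
A minimal sketch of the opt-in approach agreed on here. USE_PIP_INSTALL is a hypothetical variable name chosen for this example, not one defined by the cartridge; the chosen command is printed, not run.

```shell
#!/bin/bash
# USE_PIP_INSTALL is a hypothetical name used only in this sketch.
choose_installer() {
    if [ "${USE_PIP_INSTALL:-false}" = "true" ]; then
        echo "pip install -e ."
    else
        # Existing behavior stays the default for every current app.
        echo "python setup.py install"
    fi
}

choose_installer                               # variable unset: default
( USE_PIP_INSTALL=true; choose_installer )     # explicit opt-in
```

Because an unset variable selects the old path, existing applications would see no change unless their owners opt in.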

Member

abhgupta commented Oct 28, 2015

@dinhxuanvu I believe we have settled on an approach.

Member

soltysh commented Oct 28, 2015

👍

Contributor

dinhxuanvu commented Oct 29, 2015

@abhgupta @tiwillia I have changed the code to follow the approach that we discussed above. Please review when you can. [test]

@@ -1,6 +1,14 @@
#!/bin/bash
# Utility functions for use in the cartridge scripts.
function export_pip_install() {
if marker_present "pip_install"; then

@soltysh

soltysh Oct 29, 2015

Member

A nit, but can you fix the indentation? It reads better that way 😉

Member

soltysh commented Oct 29, 2015

LGTM. Don't forget to update the docs with info about the marker!
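
For the docs, enabling the marker might be shown along these lines. The marker is a file named 'pip_install' in the .openshift/markers/ directory of the application git repository; the helper name and the scratch directory below are assumptions for illustration.

```shell
#!/bin/bash
# Create the pip_install marker under the given repository root
# (defaults to the current directory).
enable_pip_marker() {
    local repo_root="${1:-.}"
    mkdir -p "$repo_root/.openshift/markers"
    touch "$repo_root/.openshift/markers/pip_install"
}

# Demonstrated against a scratch directory instead of a real repository.
scratch=$(mktemp -d)
enable_pip_marker "$scratch"
ls "$scratch/.openshift/markers"
```

In a real application the new file would then be added, committed, and pushed so that the next build picks it up.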

Bug 1265609: Fix pandas not getting installed
The pandas package fails to install properly if it's included in
setup.py for Python 2 & 3 applications.

After this commit, the python cartridge control file checks whether the
'pip_install' marker is present. If it is, the control file uses pip to
install packages. If not, the standard 'python setup.py install' is
executed instead. Also, the environment variable
"OPENSHIFT_PYTHON_USE_PIP" is set to 'enable' if the marker file exists.
Otherwise, it is set to 'disable'.

To have pip install used instead of the default, a file named
'pip_install' needs to be created in the .openshift/markers/ directory
inside the application git repository.

Bug <1265609>
Link <https://bugzilla.redhat.com/show_bug.cgi?id=1265609>

Signed-off-by: Vu Dinh <vdinh@redhat.com>
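
The behavior described in this commit message can be sketched as below. This is not the merged code: marker_present is a stand-in for the cartridge helper of the same name (assumed to test for a file under .openshift/markers/), installer_for_build is a hypothetical helper, and the install commands are printed instead of executed.

```shell
#!/bin/bash
# Stand-in for the cartridge helper of the same name; assumed to check
# for a file under .openshift/markers/ in the application repository.
marker_present() {
    [ -f "${OPENSHIFT_REPO_DIR}/.openshift/markers/$1" ]
}

# Export the flag as the commit message describes.
export_pip_install() {
    if marker_present "pip_install"; then
        export OPENSHIFT_PYTHON_USE_PIP="enable"
    else
        export OPENSHIFT_PYTHON_USE_PIP="disable"
    fi
}

# Hypothetical helper: pick the installer from the exported flag.
installer_for_build() {
    export_pip_install
    if [ "$OPENSHIFT_PYTHON_USE_PIP" = "enable" ]; then
        echo "pip install -e ."
    else
        echo "python setup.py install"
    fi
}

# Demo against a scratch directory standing in for the app repo.
OPENSHIFT_REPO_DIR=$(mktemp -d)
installer_for_build    # marker absent
mkdir -p "$OPENSHIFT_REPO_DIR/.openshift/markers"
touch "$OPENSHIFT_REPO_DIR/.openshift/markers/pip_install"
installer_for_build    # marker present
```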

Contributor

dinhxuanvu commented Oct 29, 2015

@soltysh Thanks for the comment. Fixed and [test]

Contributor

openshift-bot commented Oct 29, 2015

Evaluated for online test up to 9d05226

Member

abhgupta commented Oct 30, 2015

[merge]

Contributor

openshift-bot commented Oct 30, 2015

Evaluated for online merge up to 9d05226

Contributor

dinhxuanvu commented Oct 30, 2015

@tiwillia The merge failed, and it looks like an EC2 timeout error. Would you mind taking a look to verify that it's not related to my changes, and re-merging it if possible? Thanks

Contributor

openshift-bot commented Oct 30, 2015

Online Merge Results: SUCCESS (https://ci.dev.openshift.redhat.com/jenkins/job/merge_pull_requests/6607/) (Image: devenv_5690)

openshift-bot added a commit that referenced this pull request Oct 30, 2015

@openshift-bot openshift-bot merged commit 75aad7f into openshift:master Oct 30, 2015

1 of 2 checks passed

Online Merge Results: Testing
Online Test Results: Tested

@dinhxuanvu dinhxuanvu deleted the dinhxuanvu:python-pandas branch Nov 2, 2015
