Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix csgraph py3k build #243

Merged
merged 3 commits into from
Jun 9, 2012
Merged

Fix csgraph py3k build #243

merged 3 commits into from
Jun 9, 2012

Conversation

rgommers
Copy link
Member

@rgommers rgommers commented Jun 6, 2012

Tests were also not installed with regular python setup.py install.

@rgommers
Copy link
Member Author

rgommers commented Jun 6, 2012

Now the tests install, there are also four failures left (release blocker): http://projects.scipy.org/scipy/ticket/1669

@jakevdp
Copy link
Member

jakevdp commented Jun 6, 2012

Thanks for checking into this!
It looks like some of the issues are related to the mixing of integer and long types. Does numpy in python 3 use different indexing types that python 2?
I'm going to have to get a python 3 build in order to start debugging this.

@rgommers
Copy link
Member Author

rgommers commented Jun 6, 2012

I'm guessing a 32-bit vs. 64-bit issue instead of Py2 vs Py3. Before the tests were only run for my in-place build, which is 32-bit Python 2.6. The 3.x builds on OS X are 64-bit.

@rgommers
Copy link
Member Author

rgommers commented Jun 6, 2012

It may be better to merge this and investigate the issue further on the ticket I opened, to not mix everything up.

@rgommers
Copy link
Member Author

rgommers commented Jun 6, 2012

I'll have a look at the 2.4 issues.

@jakevdp
Copy link
Member

jakevdp commented Jun 6, 2012

I saw the 2.4 fixes in #244. Just to be sure - the remaining 2.4 issues have no relation to csgraph, correct?
I'm running on a 64 bit system and have no problems. But my system seems to use int32 by default:

>>> import numpy
>>> numpy.dtype(float)
dtype('float64')
>>> numpy.dtype(int)
dtype('int32')

perhaps this has something to do with it? I'm not sure whether or not that is normal.

@rgommers
Copy link
Member Author

rgommers commented Jun 7, 2012

Correct, the remaining 2.4 issues are in the integrate module.

The 32/64-bits of Python can indeed be different from that of the underlying OS. On my machine

$ python2.6 -c "import numpy as np; print(np.dtype(int))"
int32
$ python2.7 -c "import numpy as np; print(np.dtype(int))"
int64
$ python3.2 -c "import numpy as np; print(np.dtype(int))"
int64

The Cython functions should work with all types of inputs, but it looks like you only considered 32-bit. There's some recent PRs that used Cython fused types to handle this issue, but I'm not an expert on that. You could ask on the mailing list.

@jakevdp
Copy link
Member

jakevdp commented Jun 7, 2012

The strange thing is I used the same typedefs (float64/int32) in the cython code for the scikit-learn Ball Tree implementation, and that's been in use for over a year with no problems reported. I'll try to see if I can figure out where it's going wrong.

@charris
Copy link
Member

charris commented Jun 7, 2012

Looks like you might be using Py_ssize_t somewhere. It isn't in 2.4 but numpy defines it to be int for python call compatibility.

@rgommers
Copy link
Member Author

rgommers commented Jun 7, 2012

Py_ssize_t is indeed used all over the place in the Cython-generated C files. But can you elaborate on why that's an issue, and how to avoid it? It's not explicitly used in any of the .pyx files.

@charris
Copy link
Member

charris commented Jun 8, 2012

Py_ssize_t was new in 2.5. I'm not sure what Cython does, but numpy defines it to an int for 2.4, which is 32 bits on 64 bit windows systems. It should only be used when calling the python c-api, so shouldn't be a problem if none of the .pyx/.pyd files uses it. I was just making a guess as to what might be going on.

rgommers added a commit that referenced this pull request Jun 9, 2012
@rgommers rgommers merged commit 1e8404c into scipy:master Jun 9, 2012
@rgommers
Copy link
Member Author

rgommers commented Jun 9, 2012

I need this to test release builds, so I merged it. Will also merge the 2.4 fixes.

@rgommers
Copy link
Member Author

@jakevdp I see you changed ITYPE_t to Py_ssize_t in jakevdp/pyDistances#1.

I asked this on ticket 1669 too, but can someone explain why

ctypedef np.int32_t ITYPE_t

in parameters.pxi should work for 64-bit systems?

Finally, in 511eda0 @pv changed int32_t to intp_t to make things work on 64-bit systems. However, this same change in parameters.pxi makes Cython crash.

If anyone has either a simple explanation or a reference on how to define integer types portably, that would be great.

@pv
Copy link
Member

pv commented Jun 10, 2012

I think the point with int32 is that currently Scipy's sparse matrices are always int32.

Of course, int32 will work also on 64-bit systems, if one is also careful with the types of the Numpy arrays involved.

@jakevdp
Copy link
Member

jakevdp commented Jun 11, 2012

I think @pv is correct: because all these functions need to work efficiently with the indices of sparse matrices, the ITYPE typedef needs to match what's being used for sparse indices. A mismatch would at best require array copies and conversions, and at worst cause the code to crash. I believe @pv is correct that sparse matrices use int32 on all systems. I'll dig through the source to double-check this.

@jakevdp
Copy link
Member

jakevdp commented Jun 11, 2012

The relevant piece is in the INSTANTIATE_ALL definition in scipy/sparse/sparsetools/sparsetools.i. As long as the int here is always int32, the code should be fine. The minimum spanning tree error reported in http://projects.scipy.org/scipy/ticket/1669, however, leads me to believe that the indices are stored as long in this case.

A quick and dirty fix for this particular error would be to use array.astype to explicitly set the type of the index arrays before any C function call, but this may lead to potentially costly array copies if types are mismatched on some systems.

Bottom line: we need to somehow assure that the int type used in scipy/sparse/csgraph/parameters.pxi is the same as the int type defined in the instantiations in scipy/sparse/sparsetools/sparsetools.i for all systems. Is there an easy way to do this?

@rgommers
Copy link
Member Author

@jakevdp: the fixes Pauli made, linked on http://projects.scipy.org/scipy/ticket/1669, already fix the issue. There's only one failure left now, which doesn't seem to be related to ITYPE.

@pv
Copy link
Member

pv commented Jun 12, 2012

All the failures should be fixed by c89abe6 in master

@jakevdp
Copy link
Member

jakevdp commented Jun 12, 2012

Thanks for all the work on this Pauli and Ralf!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants