Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pandas fix #50

Merged
9 commits merged into from Mar 10, 2011
Merged

Pandas fix #50

9 commits merged into from Mar 10, 2011

Conversation

charris
Copy link
Member

@charris charris commented Mar 8, 2011

This fixes the pandas segfault and all except one of the pandas tests pass. I don't think it is complete, however. In particular the error return (-1) is just passed back up the call chain whereas it should perhaps be caught at the first level where it is the return value. Mark, any comments about that?

@charris
Copy link
Member Author

charris commented Mar 8, 2011

I'm thinking raising an error might be the right thing to do as long as creating an object array still worked. As is, "setting an array element with a sequence" gets thrown and we could do that more directly.

@charris
Copy link
Member Author

charris commented Mar 8, 2011

Another thing I noticed in the discover_dimensions function is that it doesn't check for the array attribute. The Series object in pandas has that attribute and it may be that what pandas is doing is a workaround for the fact that the old code didn't handle that case either.

@charris
Copy link
Member Author

charris commented Mar 8, 2011

And here's the code of python's PySequence_Check

int
PySequence_Check(PyObject *s)
{
    if (s && PyInstance_Check(s))
         return PyObject_HasAttrString(s, "__getitem__");
    if (PyObject_IsInstance(s, (PyObject *)&PyDict_Type))
         return 0;
    return s != NULL && s->ob_type->tp_as_sequence &&
    s->ob_type->tp_as_sequence->sq_item != NULL;
}

@mwiebe
Copy link
Member

mwiebe commented Mar 8, 2011

Is Pandas using old-style classes? That's what PyInstance_Check is for, and it goes away in Python 3. If not, then it looks like it is implementing the sequence protocol incorrectly as I suspected.

@charris
Copy link
Member Author

charris commented Mar 8, 2011

They all seem to be derived from object somewhere down the line.

There are now four errors exposed in the numpy tests which I think result from passing the -1 error return back up the chain, it seems it need to be handled when first seen.

======================================================================
ERROR: test_new_iterator.test_iter_object_arrays_basic
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose/case.py", line 186, in runTest
    self.test(*self.arg)
  File "/usr/lib64/python2.7/site-packages/numpy/core/tests/test_new_iterator.py", line 907, in     test_iter_object_arrays_basic
    a = np.array([[1,2,3], None, obj, None], dtype='O')
TypeError: object of type 'NoneType' has no len()

======================================================================
ERROR: Ticket #1078: segfaults when creating an array with a sequence of 0d
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/numpy/core/tests/test_regression.py", line 1134, in test_array_from_sequence_scalar_array
    a = np.array((np.ones(2), np.array(2)))
TypeError: len() of unsized object

======================================================================
ERROR: Ticket #1081: weird array with strange input...
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/numpy/core/tests/test_regression.py", line 1148, in    test_array_from_sequence_scalar_array2
    t = np.array([np.array([]), np.array(0, object)] )
TypeError: len() of unsized object

======================================================================
ERROR: Ticket #239
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/numpy/core/tests/test_regression.py", line 298, in test_object_array_shape
    assert_equal(np.array([[1,2],3,4],dtype=object).shape, (3,))
TypeError: object of type 'int' has no len()

----------------------------------------------------------------------
Ran 3257 tests in 17.647s

@mwiebe
Copy link
Member

mwiebe commented Mar 8, 2011

You probably need to investigate which C code is triggering these exceptions - it might be the part copying the values into the array, not discover_dimensions.

@charris
Copy link
Member Author

charris commented Mar 8, 2011

Yeah, I'll come back to it this evening.

@charris
Copy link
Member Author

charris commented Mar 8, 2011

Here's the strange thing. The code differs from the original only in the addition of the checks. I put print statements in them and those paths are never taken. Yet the new code produces errors like

$charris@f13 ~$ python -c"import numpy; numpy.array([[1,1],1])"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: object of type 'int' has no len()

Hmm...

@charris
Copy link
Member Author

charris commented Mar 9, 2011

Never mind, cut-n-paste error fixed.

@charris
Copy link
Member Author

charris commented Mar 9, 2011

I've implemented the error pass through for everything but KeyError, but I haven't uploaded it yet. I think it may be too finicky about other people's code. On the other hand, just getting an object might be confusing also.

processing except KeyError. In the latter case an object is forced.
@charris
Copy link
Member Author

charris commented Mar 9, 2011

OK, it's up ;) I guess we can see how it goes.

@mwiebe
Copy link
Member

mwiebe commented Mar 9, 2011

Looks good to me. Might be worth checking if this affects Ticket #1715 as well.

@charris
Copy link
Member Author

charris commented Mar 9, 2011

Ticket #1715 is a bit of a mystery. I get an array with the three rgb components and I have no idea why since the img object has neither
len nor getitem
So the constructor is doing something useful, but I have no idea why.

@mwiebe
Copy link
Member

mwiebe commented Mar 9, 2011

Interesting - I get a flipped image shape (603, 1024, 3) instead of (1024, 603, 3) as would seem natural, but otherwise looks good for me too.

@charris
Copy link
Member Author

charris commented Mar 9, 2011

I'm thinking that I have the decoder installed and that the image class must be checked for during the construction of the scalar. Probably Nil's needs to check this in the case where and exception is raised.

fangerer pushed a commit to hpyproject/numpy-hpy that referenced this pull request Jul 7, 2022
Merge in ~STEPAN.SINDELAR_ORACLE.COM/numpy-hpy from mq/nonzero_compress to labs-hpy-port

* commit 'a8645e2467ddbfc0a78d0f08cff9219214ab2511':
  Migrate PyArray_Nonzero and HPyArray_Compress
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants