Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterator support (TypeError: argument of type 'dict' is not iterable) #13

Open
hanslovsky opened this Issue Jun 24, 2017 · 10 comments

Comments

Projects
None yet
3 participants
@hanslovsky
Copy link

hanslovsky commented Jun 24, 2017

This is exciting and looks very promising! I was able to download the binaries and run them on my machine (Arch Linux). I was able to import numpy and tested basic functionality, e.g. this works:

import numpy
img = numpy.zeros( (3, 4) ) + 3
random = np.random.rand()

I was not able to generate a random int, though:

>>> np.random.randint(3)
Traceback (most recent call last):
  File "mtrand.pyx", line 970, in mtrand.RandomState.randint (numpy/random/mtrand/mtrand.c:15903)
TypeError: argument of type 'dict' is not iterable

This is a migration to a separate issue from #2 . Following up on @Stewori's comment I looked for PyDict_Iter but my search was not successful. My best guess would be to get the iterator from the list returned by PyDict_Items when requested. Maybe that happens already and it will be resolved automatically once ListIter is supported.

I am not very familiar with the Python C API but I would definitely have a look into it. If you could point me to where the iterator methods for the python objects are defined, that would make it a lot easier for me!

You can probably add Arch Linux to the systems JyNI has been tested on. I did not build it myself, though.

If you are interested in why we would like to use JyNI: I created a compatibility layer between numpy and imglib2 based on pyjnius: imglib2-imglyb. This library is the core for modern ImageJ and with that compatibility layer we try to get the best of both worlds. The python code of imglyb needs only slight modificatoins to work with JyNI, and shipping JyNI with Fiji would probably be much easier than shipping a whole CPython distribution, so we are very interested in what happens with JyNI.

@Stewori

This comment has been minimized.

Copy link
Owner

Stewori commented Jun 24, 2017

In CPython 2.7 dictobject.c there is a section starting with the comment /* Dictionary iterator types */. It defines several PyObject-types for iteration about dicts (i.e. for items, keys, values).
Since JyNI leaves dictionaries on Jython side and only wraps them by views on C-side, we should probably do it the same way with corresponding iterators. Maybe you know that Jython's idea of PyDictionary is two-split into PyStringMap and PyDictionary. These were somewhat unified under AbstractDict (an effort I undertook exactly to make things easier in JyNI). Unfortunately the unification does not include dict iterators except ValuesIter, see static class ValuesIter extends PyIterator { in org.python.core.AbstractDict. Then there is ItemsIter (class ItemsIter extends PyIterator {) in org.python.core.PyDictionary) and StringMapIter, StringMapValuesIter, KeysIter and ItemsIter in org.python.core.PyStringMap. The challenge here is to sort things out and determine in which way to back the native iterator types by the mentioned Jython ones. Won't be trivial!
(Have to leave now, but hope this can get you started a bit. Will reply to the pyjinius/imglib thoughts later...)

@hanslovsky

This comment has been minimized.

Copy link
Author

hanslovsky commented Jun 30, 2017

I had a look at dictobject.c. If I understand correctly, all the commented out lines define the iterator behaviour on the CPython side, and we would need to replace those lines with wrappers around the respective Jython iterators.

As this is a side project for me, I cannot promise that I will make any progress with this but I will come back to it when I have some spare time!

@Stewori

This comment has been minimized.

Copy link
Owner

Stewori commented Jul 3, 2017

If I understand correctly, all the commented out lines define the iterator behaviour on the CPython side, and we would need to replace those lines with wrappers around the respective Jython iterators.

Yes, that is exactly right. You can use
PyObject* JyNI_PyObject_FromJythonPyObject(jobject jythonPyObject)
and
jobject JyNI_JythonPyObject_FromPyObject(PyObject* op)
to convert between Java/JNI jobject representation of Jython's Java-side PyObjects and CPython-style native PyObject* (remember to handle refcounting of PyObject* objects obtained from JyNI_PyObject_FromJythonPyObject properly).
The main issue on this front is to get the case distinction between Jython's PyDictionary and PyStringMap right. Maybe it would be better to adjust this stuff in Jython to be a bit JyNI-friendlier (but that would tie this to a future Jython 2.7.2 release).
In https://github.com/Stewori/JyNI/blob/master/JyNI-C/src/JyNI.c#L1001 there are entries required for every builtin type that must be mapped between Jython and CPython side (unlike types that exist only for Jython or only for CPython; these are handled automatically). We would have to add corresponding entries for iterator types.

@Stewori

This comment has been minimized.

Copy link
Owner

Stewori commented Jul 3, 2017

Regarding ImageJ:
It is great to see some interest in the JyNI project.

shipping JyNI with Fiji would probably be much easier than shipping a whole CPython distribution

JyNI requires Jython, which is unfortunately not that lightweight in terms of shipping. Or does Fiji ship Jython anyway?
That said, I read through the thread at http://forum.imagej.net/t/jupyter-notebook-for-imagej/5421/16. I think the multiprocessing issue might be easy to fix, see http://bugs.jython.org/issue2600.
SciPy support requires buffer protocol support in JyNI, which is doable, but will need a serious amount of work.

@hanslovsky

This comment has been minimized.

Copy link
Author

hanslovsky commented Jul 3, 2017

Maybe it would be better to adjust this stuff in Jython to be a bit JyNI-friendlier (but that would tie this to a future Jython 2.7.2 release).

Do you think that this is likely to happen? If so, I would prefer to wait.

Thanks for pointing to JyNI.c:1001.

@hanslovsky

This comment has been minimized.

Copy link
Author

hanslovsky commented Jul 3, 2017

JyNI requires Jython, which is unfortunately not that lightweight in terms of shipping. Or does Fiji ship Jython anyway?

Yes, Fiji does ship Jython, so that dependency would not be an issue. Shipping the CPython extensions (numpy, etc.) will be more of a headache. Maybe we will be able to use conda for that.

Currently, as you might have seen in the ImageJ forum thread you referenced, there is imglyb which uses PyJNIus to start a JVM from a running python process for shared memory between python and Java. We can use this to launch ImageJ/Fiji from that Python process but I definitely see benefits in having the JVM as the main process, e.g. awt in python on OSX, so for us it is definitely worth looking into JyNI. That being said, the code changes required for imglyb to work with JyNI are very minor, so I think it is definitely worth to pursue both the Jython and CPython solutions at the same time.

@Stewori

This comment has been minimized.

Copy link
Owner

Stewori commented Jul 3, 2017

Do you think that this is likely to happen? If so, I would prefer to wait.

That depends on us. I can push such a change to Jython if it's reasonable. It is unlikely that someone else will take hands on this dictionary stuff. The question is if we see a way to unify iterator types of PyStringMap and PyDictionary (i.e. in terms of AbstractDict). I currently don't have time to look at this stuff in detail, but if you see a good way to do it, tell me. Otherwise we will need a case distinction between these backend types in native code. (Should be doable as well.)

darjus pushed a commit to jythontools/jython that referenced this issue Sep 25, 2017

Refined type hierarchy of dictionary iterators to be cleaner regardin…
…g AbstractDict and to better match CPython's variant. Simplifies fixing of Stewori/JyNI#13.
@Stewori

This comment has been minimized.

Copy link
Owner

Stewori commented Sep 26, 2017

With https://hg.python.org/jython/rev/ddb684156587 I refined things in Jython a bit.

  • dict iterators for keys, values and items now have distinct classes; this allows us to match them directly with CPython counterparts
  • iterators of the same kind from PyDictionary and PyStringMap now always have a common ancestor in AbstractDict, so JyNI won't need to distinguish these cases when mapping types

@Stewori Stewori changed the title TypeError: argument of type 'dict' is not iterable Iterator support (TypeError: argument of type 'dict' is not iterable) Jul 20, 2018

@CalumFreeman

This comment has been minimized.

Copy link
Contributor

CalumFreeman commented Aug 24, 2018

At some point you said that some iterators were implemented to some degree, the sequence iterator in particular. There are a number of iterators in different files, the sequence iterator is the only one with code that seems to manage GC in the JyNI way, so the others may have memory leaks, but I'm not sure.

File Type to be iterated Status (as of 24/08/2018)
classobject.c PyInstance Not Implemented
descrobject.c PyDictProxy_Type Relies on PyObject_GetIter
dictobject.c PyDict (key, value, item) ML on my branch
iterobject.c PySeqIter Working
listobject.c PyList (list, listrev) possible ML
setobject.c PySet ML on my branch
setobject.c PyFrozenSet(uses set) ML on my branch
tupleobject.c PyTuple possible ML
weakrefobject.c _PyWeakref_ProxyType Relies on PyObject_GetIter
weakrefobject.c _PyWeakref_CallableProxyType Relies on PyObject_GetIter

To implement more, and deal with #32, I think I need to understand GC better. I think the focus of my confusion is around the core question: where do and don't we need to increment reference counts.

What I (think I) know:

I think C variables created without malloc stick around until they are out of scope. But there is something to do with PyObjects where malloc is used so free would be needed, but I don't know the details. PyObjects also seem to store a refcnt so I think GC deals with them. When GC occurs it maps the tree of references and uses that to work out what it can free. I don't think I need to know the details of exactly how it does that (although I do know some from talks and papers on jyni.org) so long as I know/trust that things with references pointing to them won't be GC'd.

What I don't know (about PySeqIter and ref counting):

I'm not really sure what a JyObject is, or what _JyObject_New does

JyObject looks like it is a wrapper for storing a jweak, some attributes and some flags. jweak is typedef'd to jobject, this suggests it's a weak reference but I'm not sure if it is. Does it have something to do with converting between JythonPyObjects and PyObjects? I'm guessing _JyObject_new initialises one from a type and the info in builtinTypes, is there a reference count for the JyObject, if so: where is it stored and is it set to 1 when created. I suspect there isn't a reference count and that it isn't managed by GC at all(since the macro AS_JY_NO_GC is used). Presumably it is expected that when you're done with it you clean it up manually?

how are java objects created in C managed

ie how does java know there is a ref from C to the object? do we need to tell it/is there a ref count somewhere?

What is JyNIDebugOp(JY_NATIVE_FINALIZE, it, -1); doing?

Is it just there for debugging so you can know when the object is going to be finalized in debug mode, it seems to have something to do with JyReferenceMonitor, is it checking there are no outstanding references so it can throw an error if something tried to deallocate an object that is still referenced?

Sorry to give you a whole bunch of questions(and I realise it's almost a question per line of code), but understanding what is going on is the main problem at the moment.

@Stewori

This comment has been minimized.

Copy link
Owner

Stewori commented Aug 24, 2018

These are very good questions and I'll try to answer them. However maybe not all right now.

A JyObject is an extended header over a PyObject. The additional header resides in memory really right before the PyObject, so accessing it is just an issue of memory arithmetics. The macro AS_JY contains this arithmetics. Look at https://arxiv.org/pdf/1404.6390.pdf figure 6. The JyObject would be the pointer to the left. As you already mentioned, this extended header contains the jobject and some flags and optionally an entry to a various purpose attribute list, see JyAttributes.c. Not every PyObject has a JyObject header. So an attempt to access it in invalid case can segfault or yield undefined behavior. There is some way to check this (HAS_JY?). AS_JY_WITH_GC and AS_JY_NO_GC are just shortcuts if you know for sure whether a PyGC_Head exists.

JyNIDebugOp is one of several macros and functions that feed JyReferenceMonitor with data. JyReferenceMonitor is a class that aims to keep track of memory leaks in JyNI. By default it is inactive. The monitoring must be turned on explicitly if required. See JyNIGCDemo.py and test_JyNI_gc.py for usage examples. Also see https://arxiv.org/pdf/1607.00825.pdf section 3.4.

Regarding GC handling, in JyNI there operates kind of a native part of GC that cares about modeling the native reference graph on Java side with JyGCHeads. That's in gcmodule.c. This only tangents native objects that hold native references to other native objects. E.g. PyTuple does that. E.g. PyInt does not. Now consider PyDictionary where things are less obvious. Primitive types like PyInt never hold references to anything so it is no surprise. In CPython PyDictionary is subject to native reference cycle search and in general a very relevant object type to gcmodule. In JyNI this is not the case. Here it behaves almost like PyInt because it stays fully on Java side and holds no references on native level. I think that iterators all behave like PyDictionary. But I might not be aware of every detail right now.

So much for now. We can iterate on details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.