Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to correctly set default argument "none" for pybind11::array_t ? #1953

Closed
WeinaJi opened this issue Oct 15, 2019 · 8 comments
Closed

How to correctly set default argument "none" for pybind11::array_t ? #1953

WeinaJi opened this issue Oct 15, 2019 · 8 comments

Comments

@WeinaJi
Copy link

WeinaJi commented Oct 15, 2019

Dear Pybind,

I am working with tag v2.4.3. I would like to bind a structure containing an array of int and set the array's default value to "None" in the constructor.

My simple unit test is

#include "pybind11_tests.h"
#include <pybind11/numpy.h>
#include <iostream>
using array_i = pybind11::array_t<int, pybind11::array::c_style | pybind11::array::forcecast>;

struct myObj_array {
    array_i ids;
};

TEST_SUBMODULE(defaultnone, m) {
    py::class_<myObj_array>(m, "myObj_array_int")
        .def(py::init([](array_i py_ids) {
                 std::cout << "ndim " << py_ids.ndim() << std::endl;
                 if (py_ids.ndim())
                     return myObj_array{py_ids};
                 else
                     return myObj_array{array_i(0)};
             }), py::arg("py_ids") = (array_i*) nullptr
        );

And my python script is

import pytest
from pybind11_tests import defaultnone as m

def test_default_array(doc):
    array_int = m.myObj_array_int()

for which I got the error message

>       array_int = m.myObj_array_int()
E       TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E           1. pybind11_tests.defaultnone.myObj_array_int(py_ids: numpy.ndarray[int32] = None)
E       
E       Invoked with:

However, if I change the c++ code to bind an array of float or double and set the default to nullptr, the script above is running fine without error.

Is None argument allowed for pybind11::array, and why is there difference between type int and float/double ?

Many thanks in advance!
Regards

@WeinaJi WeinaJi closed this as completed May 26, 2020
@ervinYZ
Copy link

ervinYZ commented Jul 9, 2020

I have the same problem for setting default argument "none" for pybind11::list and pybind11::dict.

@YannickJadoul
Copy link
Collaborator

@WeinaJi That is a strange inconsistency indeed.

However, the nullptr seems to be converted to nan, for an array with float/double, so that seems wrong to me.

If you want to accept None, I would suggest to have an argument that's std::optional<array_i>, because pybind11 will not consider None to be a valid value for a array, I believe.

The same goes for list and dict, @ervinYZ. isinstance(None, list) evaluates as false, so you need to accept std::optional<py::list>.

@YannickJadoul
Copy link
Collaborator

Reopening to have the conversion from (py::array_t<double>*) nullptr or None to nan investigated, though:

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

using array = py::array_t<double, pybind11::array::c_style | pybind11::array::forcecast>;

PYBIND11_MODULE(example, m) {
        m.def("f", [](array a) { py::print(a); }, py::arg("a") = py::none());
}
>>> import example
>>> example.f([])
[]
>>> example.f()
nan

@YannickJadoul YannickJadoul reopened this Jul 9, 2020
@bstaletic
Copy link
Collaborator

You're taking the following constructor:

    array_t(const object &o) : array(raw_array_t(o.ptr()), stolen_t{}) {
        if (!m_ptr) throw error_already_set();
    }

The raw_array_t() is what produces nan. Then you just propagate nan through all the base constructors, until you initialize the py::handle base with it.

    static PyObject *raw_array_t(PyObject *ptr) {
        if (ptr == nullptr) {
            PyErr_SetString(PyExc_ValueError, "cannot create a pybind11::array_t from a nullptr");
            return nullptr;
        }
        return detail::npy_api::get().PyArray_FromAny_(
            ptr, dtype::of<T>().release().ptr(), 0, 0,
            detail::npy_api::NPY_ARRAY_ENSUREARRAY_ | ExtraFlags, nullptr);
    }

Since we know we just took a reference from PyNone(), ptr can't be NULL. Also ExtraFlags is py::array::forcecast. Therefore:

    static PyObject *raw_array_t(PyObject *ptr) {
        return py::detail::npy_api::get().PyArray_FromAny_(
            ptr, py::dtype::of<double>().release().ptr(), 0, 0,
            py::detail::npy_api::NPY_ARRAY_ENSUREARRAY_ | py::array::forcecast, nullptr);
    }

Now... PyArray_FromAny_() is this monstrocity, ptr is py::none().ptr(), py::dtype::of().release().ptr()` is... Wait a moment.

 

Two hours later...

 

Wow, that was a fun rabbit hole. I'm pretty sure this None -> nan is not pybind, but numpy itself. Bear with me.

After a lot of inlining and manual evaluation, here's how the above looks like:

static PyObject *raw_array_t(PyObject *ptr) {
    auto& npy_api_ref = py::detail::npy_api::get();
    return npy_api_ref.PyArray_FromAny_(
    		ptr,
    		npy_api_ref.PyArray_DescrFromType_(12), // 12 == NPY_DOUBLE_
    		0,
    		0,
    		0x50,
    		nullptr);
}
int main() {
	Py_Initialize();
        PyObject* h2 = ::raw_array_t(Py_None);
        PyObject* sys = PyImport_ImportModule("sys");
        PyObject* out = PyObject_GetAttrString(sys, "stdout");
	PyObject* write = PyObject_GetAttrString(out, "write");
	PyObject* t = PyTuple_New(1);
	PyTuple_SetItem(t, 0, PyObject_Str(h2));
	PyObject_Call(write, t, NULL);
	Py_Finalize();
}

At this point:

  1. This isn't even C++ any more. The only thing left of C++ is that one & in the npy_apr_ref and a call to py::detail::npy_api::get(). Everything else is good old C.
  2. From the above, you can see that the only thing left to pybind11 to do here is... fetch a reference to the NumPy API.

We can do better!

 

Another 40 minutes later...

 

#include <Python.h>

enum functions {
	API_PyArray_DescrFromType = 45,
	API_PyArray_FromAny = 69,
};
struct npy_api {
	PyObject *(*PyArray_FromAny_) (PyObject *, PyObject *, int, int, int, PyObject *);
	PyObject *(*PyArray_DescrFromType_)(int);
};
static struct npy_api lookup() {
	PyObject* multiarray = PyImport_ImportModule("numpy.core.multiarray");
	PyObject* array_api = PyObject_GetAttrString(multiarray, "_ARRAY_API");
	void **api_ptr = (void **) PyCapsule_GetPointer(array_api, NULL);
	struct npy_api api;
	api.PyArray_DescrFromType_ = api_ptr[API_PyArray_DescrFromType];
	api.PyArray_FromAny_ = api_ptr[API_PyArray_FromAny];
	return api;
}
static PyObject *raw_array_t(PyObject *ptr) {
	const struct npy_api npy_api_ref = lookup();
	return npy_api_ref.PyArray_FromAny_(
			ptr,
			// 12 == NPY_DOUBLE_
			npy_api_ref.PyArray_DescrFromType_(12),
			0,
			0,
			0x50,
			NULL);
}
int main() {
	Py_Initialize();
	PyObject* h2 = raw_array_t(Py_None);
	PyObject* sys = PyImport_ImportModule("sys");
	PyObject* out = PyObject_GetAttrString(sys, "stdout");
	PyObject* write = PyObject_GetAttrString(out, "write");
	PyObject* t = PyTuple_New(1);
	PyTuple_SetItem(t, 0, PyObject_Str(h2));
	PyObject_Call(write, t, NULL);
	Py_Finalize();
}

This time, it really is pure C. Now is this a bug? I don't know. Perhaps NumPy considers this expected behaviour.
In case someone is worried about magic numbers being wrong:

 

A much faster way of confirming this is regular numpy behaviour is python -c 'print(__import__("numpy").array(None, dtype = float))'.

@bstaletic
Copy link
Collaborator

I shall now perform an honorary closing of this issue.

@ervinYZ
Copy link

ervinYZ commented Jul 10, 2020

@WeinaJi That is a strange inconsistency indeed.

However, the nullptr seems to be converted to nan, for an array with float/double, so that seems wrong to me.

If you want to accept None, I would suggest to have an argument that's std::optional<array_i>, because pybind11 will not consider None to be a valid value for a array, I believe.

The same goes for list and dict, @ervinYZ. isinstance(None, list) evaluates as false, so you need to accept std::optional<py::list>.

Thanks for the prompt reply. I found the same suggestion that I should use std::optional in the pybind doc. It works perfect.

But I have to say the current way to allow/prohibiting none argument is confusing, since I can set and compile the py::arg("list").none(true) or py::arg("list") = py::none() but it does nothing.

@YannickJadoul
Copy link
Collaborator

But I have to say the current way to allow/prohibiting none argument is confusing, since I can set and compile the py::arg("list").none(true) or py::arg("list") = py::none() but it does nothing.

Yes, that's kind of true. The use of none is documented, though: https://pybind11.readthedocs.io/en/stable/advanced/functions.html#allow-prohibiting-none-arguments

@yxlao
Copy link

yxlao commented Aug 30, 2020

With C++17's std::optional, here's an example that should work. For earlier versions of C++, you might want a backported optional.h and implement your own optional_caster similar to the one in pybind11/stl.h.

Say you want this function:

def add(a, b=None):
    # Assuming a, b are int.
    if b is None:
        return a
    else:
        return a + b

Here's the equivalent C++ implementation:

m.def("add",
    [](int a, std::optional<int> b) {
        if (!b.has_value()) {
            return a;
        } else {
            return a + b.value();
        }
    },
    py::arg("a"), py::arg("b") = py::none()
);

In python, this function can be called with:

add(1)
add(1, 2)
add(1, b=2)
add(1, b=None)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants