Python bindings for MoveTK #16

aniketmitra001 · 2020-09-25T11:54:07Z

Create Python bindings for MoveTK

aniketmitra001 · 2020-09-25T11:57:35Z

@heremaps/movepy Creating this feature request for creating python bindings for MoveTK

aniketmitra001 · 2020-09-25T13:13:16Z

@craigrbarnes some relevant links

CGAL uses SWIG for python bindings
The python bindings for GUDHI
The project that computes persistent homology uses PyBind11
Binder is a tool for automatic generation of python bindings for C++11 projects using Pybind11
Binding template functions
Tutorial on PyBind11

The possible options are

Boost.Python
SWIG
PyBind11

PyBind11 seems to be heavily influenced by Boost.Python and seems to be targeted towards c++11, c++14 and C++17 language features with support for iterators , ranges, lambdas....

Similarly, the advantage of PyBind11 over SWIG seems to be that bindings with PyBind11 results with much simpler and easier to debug code. Also SWIG's auto generated code seems to be not optimal for performance . Last year a RFC was submitted to migrate the bindings in tensorflow from SWIG to PyBind11

@craigrbarnes , @mstroila your thoughts on this?

aniketmitra001 · 2020-09-27T22:40:01Z

@craigrbarnes @mstroila

I have made an attempt to write a binding for MoveTKPoint.

#include "movetk/geom/CGALTraits.h"
#include "movetk/geom/BoostGeometryTraits.h"
#include "movetk/geom/GeometryInterface.h"
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

namespace py = pybind11;



struct BoostGeometry2{

    const static size_t dimensions = 2;
    typedef movetk_support::BoostGeometryTraits<long double, dimensions> Boost_Geometry_Backend;
    typedef movetk_core::MovetkGeometryKernel<typename Boost_Geometry_Backend::Wrapper_Boost_Geometry> MovetkGeometryKernel;
    typedef movetk_core::MakePoint<typename BoostGeometry2::MovetkGeometryKernel> MakePoint;
    typedef MovetkGeometryKernel::MovetkPoint MovetkPoint;
    typedef MovetkGeometryKernel::NT NT;
};

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MakePoint>(m, "make_point")
        .def(py::init<>())
        .def("__call__",[](typename BoostGeometry2::MakePoint& f,
                std::array<typename BoostGeometry2::NT, BoostGeometry2::dimensions> arr) {
                    return f(std::cbegin(arr), std::cend(arr));
        });
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "movetk_point")
            .def(py::init<>())
            .def("__repr__",[](const typename BoostGeometry2::MovetkPoint& self){
                auto it =  self.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
}

import movetk_geometry as geometry
import numpy as np
mp = geometry.make_point()
pt = mp(np.array([1,1]))
pt

Discussion Points:

MoveTK is a geometry backend agnostic library. This means that the choice of the dimension, coordinate type and geometry library (Boost.Geometry or CGAL) is delegated to the application developer. Since, in the library these types are template parameters, the compiler expects these types to be defined while building the project. While writing python bindings with PyBind11, this means that the types have to be defined in the .cpp file. Therefore in the binding code , we will need to define, BoostGeometryKernel2D, BoostGeometryKernel3D, CGALGeometryKernel2D,... So the user if the python module will be restricted by the choice of the dimension and geometry backend defined in the binding code. Moreover we can limit the number type to long double. So the trade off with ease of use seems to come at the cost of lack of flexibility. We need to check if there can be a workaround to this
The above example uses type conversion i.e use native C++ type in this case an std::array on the C++ side and a numpy.array on the Python side. The downside is that leads to a copy of data on every Python <-> C++ transition. We need to think of an alternative.

aniketmitra001 · 2020-09-30T00:45:00Z

@craigrbarnes @mstroila

Following is a rewrite of the above example to create a MoveTKPoint using Buffer Protocols so that we can avoid copying of data from Python to C++

This code compiles without issues but reading data from the memory representation of a numpy array does not seem to work. Please see the output of the python code below

#include "movetk/geom/BoostGeometryTraits.h"
#include "movetk/geom/GeometryInterface.h"
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

namespace py = pybind11;



struct BoostGeometry2 {
    const static size_t dimensions = 2;
    typedef movetk_support::BoostGeometryTraits<long double, dimensions> Boost_Geometry_Backend;
    typedef movetk_core::MovetkGeometryKernel<typename Boost_Geometry_Backend::Wrapper_Boost_Geometry> MovetkGeometryKernel;
    typedef movetk_core::MakePoint<typename BoostGeometry2::MovetkGeometryKernel> MakePoint;
    typedef movetk_core::MakeSegment<typename BoostGeometry2::MovetkGeometryKernel> MakeSegment;
    typedef MovetkGeometryKernel::NT NT;
    typedef MovetkGeometryKernel::MovetkPoint MovetkPoint;
    typedef MovetkGeometryKernel::MovetkSegment MovetkSegment;
    typedef typename movetk_core::movetk_basic_iterator<const NT> CoordinateIterator;
};

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def(py::init([](py::buffer const buf) {
                py::buffer_info info = buf.request();
                if (info.format != py::format_descriptor<float>::format() ||
                    info.ndim != 1)
                    throw std::runtime_error("Incompatible buffer format!");
                typename BoostGeometry2::CoordinateIterator first(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr));
                typename BoostGeometry2::CoordinateIterator beyond(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr) + info.ndim);
                typename BoostGeometry2::MovetkPoint pt(first, beyond);
                return pt;

            }))
            .def_buffer([](typename BoostGeometry2::MovetkPoint &m) -> py::buffer_info {
                auto it = m.begin();
                return py::buffer_info(
                        std::addressof(*it),
                        BoostGeometry2::dimensions,
                        sizeof(typename BoostGeometry2::NT)
                );
            })
            .def("__repr__", [](const typename BoostGeometry2::MovetkPoint &m) {
                auto it = m.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
}

import movetk_geometry as mg
pt =  mg.make_point(np.array([1,1]).astype(np.float32))
pt 
[ 0.000000,0.000000 ]

aniketmitra001 · 2020-09-30T19:43:26Z

Further investigations on Buffer protocols and what works....

There are two ways in which this works

The binding code exposes a C++ type (in this case a MovetkPoint) as a buffer object making it possible to cast the C++ type as a numpy.array and them it would be possible to completely avoid copy operations numpy.array(MoveTKPoint_instance, copy = False) . A way to achieve this would be to write the following binding code

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def_buffer([](typename BoostGeometry2::MovetkPoint &m) -> py::buffer_info {
                auto it = m.begin();
                return py::buffer_info(
                        std::addressof(*it),
                        BoostGeometry2::dimensions,
                        sizeof(typename BoostGeometry2::NT)
                );
            });
}

The above snippet exposes a MovetkPoint as a buffer object which now allows it to be cast as a numpy.array in python.
Setting values to this buffer object from pythons possible if we add a "setitem" method to the binding code as follows

   .def("__setitem__",[](typename BoostGeometry2::MovetkPoint &m , 
                               std:size_t idx, typename BoostGeometry2::NT value){
      ....
})

However, the caveat is that we consider a MoveTKPoint as an atomic unit, which can only be constructed by passing iterators (on the input data) to the constructor of MoveTKPoint

    template<class CoordinateIterator,
                typename = movetk_core::requires_random_access_iterator<CoordinateIterator> >
        Wrapper_Boost_Point(CoordinateIterator first,
                            CoordinateIterator beyond) {
            //ASSERT_NUMBER_TYPE(Kernel, first);
            pt.template set<0>(*first);
            pt.template set<1>(*(first + 1));
            if constexpr (Kernel::dim == 3) {
                pt.template set<2>(*(first + 2));
            }

        }

we do not allow the individual coordinates of the point object to be set by passing an index and value. This follows from the design principle of CGAL.

Open Discussion Point:

The caveat here is that once we have cast MovetkPoint as a numpy.array how do we pass iterators over the data from python to the binding code of MovetkPoint

The second option is to pass a python buffer object to the binding code as follows

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def(py::init([](py::buffer const buf) {
                py::buffer_info info = buf.request();
                if (info.format != py::format_descriptor<float>::format() ||
                    info.ndim != 1)
                    throw std::runtime_error("Incompatible buffer format!");
                typename BoostGeometry2::CoordinateIterator first(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr));
                typename BoostGeometry2::CoordinateIterator beyond(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr) + info.ndim);
                typename BoostGeometry2::MovetkPoint pt(first, beyond);
                return pt;

            }))
            .def("__repr__", [](const typename BoostGeometry2::MovetkPoint &m) {
                auto it = m.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
}

Open Discussion Point:

But as mentioned in the previous comment it seems that when numpy.array passed as a buffer object does not work. Possible causes could be (a) value in the address pointed to by info.ptr is incorrect (b) the address itself that info.ptr points to is incorrect

What Works?

If we finalise that it will only be possible to create a MoveTKPoint from python by using a numpy.array , then we can substitute .def(py::init([](py::buffer const buf) with .def(py::init([](py::array_t<typename BoostGeometry2::NT, BoostGeometry2::dimensions> const buf)

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def(py::init([](py::array_t<typename BoostGeometry2::NT,
                    BoostGeometry2::dimensions> const buf) {
                py::buffer_info info = buf.request();
                typename BoostGeometry2::CoordinateIterator first(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr));
                typename BoostGeometry2::CoordinateIterator beyond(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr) + BoostGeometry2::dimensions);
                typename BoostGeometry2::MovetkPoint pt(first, beyond);
                return pt;
            }))
            .def("__repr__", [](const typename BoostGeometry2::MovetkPoint &m) {
                auto it = m.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
 }

which works fine as shown below

import numpy as np
import movetk_geometry as mg
mg.make_point(np.array([1,1]).astype(np.float32))
[ 1.000000,1.000000 ]

mg.make_point(np.array([1,1]))
[ 1.000000,1.000000 ]

mg.make_point(np.array([1,3]))
[ 1.000000,3.000000 ]

aniketmitra001 · 2020-10-04T12:55:35Z

Added a notebook MovetkGeometry.ipynb

Binding code is movetk_geometry.cpp

The geometry backend , number type and dimensions of the geometry kernel is set as Cmake flags while building the project and is fixed for the whole project.

Inorder to change the geometry backend the Cmake flag WITH_CGAL_BACKEND has to be set to ON. The default geometry backend is Boost.Geometry
The number type and the dimensions have to be changed in GeometryBackendTraits.h before building the project

Build Status

Passes on Mac
Fails on Linux with the following error

In file included from /miniconda/include/pybind11/pytypes.h:12:0,
                 from /miniconda/include/pybind11/cast.h:13,
                 from /miniconda/include/pybind11/attr.h:13,
                 from /miniconda/include/pybind11/pybind11.h:44,
                 from /usr/src/movetk/py/movetk_geometry.cpp:25:
/miniconda/include/pybind11/detail/common.h:112:10: fatal error: Python.h: No such file or directory
 #include <Python.h>
          ^~~~~~~~~~
compilation terminated.

aniketmitra001 · 2020-10-05T22:30:33Z

Added a notebook MovetkGeometry.ipynb

Binding code is movetk_geometry.cpp

The geometry backend , number type and dimensions of the geometry kernel is set as Cmake flags while building the project and is fixed for the whole project.

Inorder to change the geometry backend the Cmake flag WITH_CGAL_BACKEND has to be set to ON. The default geometry backend is Boost.Geometry

The number type and the dimensions have to be changed in GeometryBackendTraits.h before building the project

Build Status

Passes on Mac

Fails on Linux with the following error
In file included from /miniconda/include/pybind11/pytypes.h:12:0,
                 from /miniconda/include/pybind11/cast.h:13,
                 from /miniconda/include/pybind11/attr.h:13,
                 from /miniconda/include/pybind11/pybind11.h:44,
                 from /usr/src/movetk/py/movetk_geometry.cpp:25:
/miniconda/include/pybind11/detail/common.h:112:10: fatal error: Python.h: No such file or directory
 #include <Python.h>
          ^~~~~~~~~~
compilation terminated.

Fixed build issue on Fedora by installing the python3-devel package . See this Issue #23 for details

aniketmitra001 created this issue from a note in Release v1.0 (In progress) Sep 25, 2020

aniketmitra001 assigned craigrbarnes and aniketmitra001 Sep 25, 2020

aniketmitra001 added the enhancement New feature or request label Sep 25, 2020

aniketmitra001 assigned mstroila Sep 25, 2020

aniketmitra001 modified the milestone: Cr Sep 25, 2020

aniketmitra001 linked a pull request Sep 25, 2020 that will close this issue

python-bindings for MoveTK #17

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python bindings for MoveTK #16

Python bindings for MoveTK #16

aniketmitra001 commented Sep 25, 2020

aniketmitra001 commented Sep 25, 2020

aniketmitra001 commented Sep 25, 2020 •

edited

aniketmitra001 commented Sep 27, 2020

aniketmitra001 commented Sep 30, 2020 •

edited

aniketmitra001 commented Sep 30, 2020

aniketmitra001 commented Oct 4, 2020

aniketmitra001 commented Oct 5, 2020

Python bindings for MoveTK #16

Python bindings for MoveTK #16

Comments

aniketmitra001 commented Sep 25, 2020

aniketmitra001 commented Sep 25, 2020

aniketmitra001 commented Sep 25, 2020 • edited

aniketmitra001 commented Sep 27, 2020

aniketmitra001 commented Sep 30, 2020 • edited

aniketmitra001 commented Sep 30, 2020

aniketmitra001 commented Oct 4, 2020

aniketmitra001 commented Oct 5, 2020

aniketmitra001 commented Sep 25, 2020 •

edited

aniketmitra001 commented Sep 30, 2020 •

edited