Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python bindings for MoveTK #16

Open
aniketmitra001 opened this issue Sep 25, 2020 · 7 comments · May be fixed by #17
Open

Python bindings for MoveTK #16

aniketmitra001 opened this issue Sep 25, 2020 · 7 comments · May be fixed by #17
Assignees
Labels
enhancement New feature or request

Comments

@aniketmitra001
Copy link
Collaborator

Create Python bindings for MoveTK

@aniketmitra001 aniketmitra001 created this issue from a note in Release v1.0 (In progress) Sep 25, 2020
@aniketmitra001 aniketmitra001 added the enhancement New feature or request label Sep 25, 2020
@aniketmitra001
Copy link
Collaborator Author

@heremaps/movepy Creating this feature request for creating python bindings for MoveTK

@aniketmitra001
Copy link
Collaborator Author

aniketmitra001 commented Sep 25, 2020

@craigrbarnes some relevant links

The possible options are

  • Boost.Python
  • SWIG
  • PyBind11

PyBind11 seems to be heavily influenced by Boost.Python and seems to be targeted towards c++11, c++14 and C++17 language features with support for iterators , ranges, lambdas....

Similarly, the advantage of PyBind11 over SWIG seems to be that bindings with PyBind11 results with much simpler and easier to debug code. Also SWIG's auto generated code seems to be not optimal for performance . Last year a RFC was submitted to migrate the bindings in tensorflow from SWIG to PyBind11

@craigrbarnes , @mstroila your thoughts on this?

@aniketmitra001 aniketmitra001 modified the milestone: Cr Sep 25, 2020
@aniketmitra001 aniketmitra001 linked a pull request Sep 25, 2020 that will close this issue
@aniketmitra001
Copy link
Collaborator Author

@craigrbarnes @mstroila

I have made an attempt to write a binding for MoveTKPoint.

#include "movetk/geom/CGALTraits.h"
#include "movetk/geom/BoostGeometryTraits.h"
#include "movetk/geom/GeometryInterface.h"
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

namespace py = pybind11;



struct BoostGeometry2{

    const static size_t dimensions = 2;
    typedef movetk_support::BoostGeometryTraits<long double, dimensions> Boost_Geometry_Backend;
    typedef movetk_core::MovetkGeometryKernel<typename Boost_Geometry_Backend::Wrapper_Boost_Geometry> MovetkGeometryKernel;
    typedef movetk_core::MakePoint<typename BoostGeometry2::MovetkGeometryKernel> MakePoint;
    typedef MovetkGeometryKernel::MovetkPoint MovetkPoint;
    typedef MovetkGeometryKernel::NT NT;
};

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MakePoint>(m, "make_point")
        .def(py::init<>())
        .def("__call__",[](typename BoostGeometry2::MakePoint& f,
                std::array<typename BoostGeometry2::NT, BoostGeometry2::dimensions> arr) {
                    return f(std::cbegin(arr), std::cend(arr));
        });
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "movetk_point")
            .def(py::init<>())
            .def("__repr__",[](const typename BoostGeometry2::MovetkPoint& self){
                auto it =  self.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
}
import movetk_geometry as geometry
import numpy as np
mp = geometry.make_point()
pt = mp(np.array([1,1]))
pt

Discussion Points:

  • MoveTK is a geometry backend agnostic library. This means that the choice of the dimension, coordinate type and geometry library (Boost.Geometry or CGAL) is delegated to the application developer. Since, in the library these types are template parameters, the compiler expects these types to be defined while building the project. While writing python bindings with PyBind11, this means that the types have to be defined in the .cpp file. Therefore in the binding code , we will need to define, BoostGeometryKernel2D, BoostGeometryKernel3D, CGALGeometryKernel2D,... So the user if the python module will be restricted by the choice of the dimension and geometry backend defined in the binding code. Moreover we can limit the number type to long double. So the trade off with ease of use seems to come at the cost of lack of flexibility. We need to check if there can be a workaround to this
  • The above example uses type conversion i.e use native C++ type in this case an std::array on the C++ side and a numpy.array on the Python side. The downside is that leads to a copy of data on every Python <-> C++ transition. We need to think of an alternative.

@aniketmitra001
Copy link
Collaborator Author

aniketmitra001 commented Sep 30, 2020

@craigrbarnes @mstroila

Following is a rewrite of the above example to create a MoveTKPoint using Buffer Protocols so that we can avoid copying of data from Python to C++

This code compiles without issues but reading data from the memory representation of a numpy array does not seem to work. Please see the output of the python code below

#include "movetk/geom/BoostGeometryTraits.h"
#include "movetk/geom/GeometryInterface.h"
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

namespace py = pybind11;



struct BoostGeometry2 {
    const static size_t dimensions = 2;
    typedef movetk_support::BoostGeometryTraits<long double, dimensions> Boost_Geometry_Backend;
    typedef movetk_core::MovetkGeometryKernel<typename Boost_Geometry_Backend::Wrapper_Boost_Geometry> MovetkGeometryKernel;
    typedef movetk_core::MakePoint<typename BoostGeometry2::MovetkGeometryKernel> MakePoint;
    typedef movetk_core::MakeSegment<typename BoostGeometry2::MovetkGeometryKernel> MakeSegment;
    typedef MovetkGeometryKernel::NT NT;
    typedef MovetkGeometryKernel::MovetkPoint MovetkPoint;
    typedef MovetkGeometryKernel::MovetkSegment MovetkSegment;
    typedef typename movetk_core::movetk_basic_iterator<const NT> CoordinateIterator;
};

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def(py::init([](py::buffer const buf) {
                py::buffer_info info = buf.request();
                if (info.format != py::format_descriptor<float>::format() ||
                    info.ndim != 1)
                    throw std::runtime_error("Incompatible buffer format!");
                typename BoostGeometry2::CoordinateIterator first(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr));
                typename BoostGeometry2::CoordinateIterator beyond(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr) + info.ndim);
                typename BoostGeometry2::MovetkPoint pt(first, beyond);
                return pt;

            }))
            .def_buffer([](typename BoostGeometry2::MovetkPoint &m) -> py::buffer_info {
                auto it = m.begin();
                return py::buffer_info(
                        std::addressof(*it),
                        BoostGeometry2::dimensions,
                        sizeof(typename BoostGeometry2::NT)
                );
            })
            .def("__repr__", [](const typename BoostGeometry2::MovetkPoint &m) {
                auto it = m.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
}
import movetk_geometry as mg
pt =  mg.make_point(np.array([1,1]).astype(np.float32))
pt 
[ 0.000000,0.000000 ]

@aniketmitra001
Copy link
Collaborator Author

Further investigations on Buffer protocols and what works....

There are two ways in which this works

  1. The binding code exposes a C++ type (in this case a MovetkPoint) as a buffer object making it possible to cast the C++ type as a numpy.array and them it would be possible to completely avoid copy operations numpy.array(MoveTKPoint_instance, copy = False) . A way to achieve this would be to write the following binding code
PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def_buffer([](typename BoostGeometry2::MovetkPoint &m) -> py::buffer_info {
                auto it = m.begin();
                return py::buffer_info(
                        std::addressof(*it),
                        BoostGeometry2::dimensions,
                        sizeof(typename BoostGeometry2::NT)
                );
            });
}

The above snippet exposes a MovetkPoint as a buffer object which now allows it to be cast as a numpy.array in python.
Setting values to this buffer object from pythons possible if we add a "setitem" method to the binding code as follows

   .def("__setitem__",[](typename BoostGeometry2::MovetkPoint &m , 
                               std:size_t idx, typename BoostGeometry2::NT value){
      ....
})

However, the caveat is that we consider a MoveTKPoint as an atomic unit, which can only be constructed by passing iterators (on the input data) to the constructor of MoveTKPoint

    template<class CoordinateIterator,
                typename = movetk_core::requires_random_access_iterator<CoordinateIterator> >
        Wrapper_Boost_Point(CoordinateIterator first,
                            CoordinateIterator beyond) {
            //ASSERT_NUMBER_TYPE(Kernel, first);
            pt.template set<0>(*first);
            pt.template set<1>(*(first + 1));
            if constexpr (Kernel::dim == 3) {
                pt.template set<2>(*(first + 2));
            }

        }

we do not allow the individual coordinates of the point object to be set by passing an index and value. This follows from the design principle of CGAL.

Open Discussion Point:

  • The caveat here is that once we have cast MovetkPoint as a numpy.array how do we pass iterators over the data from python to the binding code of MovetkPoint
  1. The second option is to pass a python buffer object to the binding code as follows
PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def(py::init([](py::buffer const buf) {
                py::buffer_info info = buf.request();
                if (info.format != py::format_descriptor<float>::format() ||
                    info.ndim != 1)
                    throw std::runtime_error("Incompatible buffer format!");
                typename BoostGeometry2::CoordinateIterator first(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr));
                typename BoostGeometry2::CoordinateIterator beyond(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr) + info.ndim);
                typename BoostGeometry2::MovetkPoint pt(first, beyond);
                return pt;

            }))
            .def("__repr__", [](const typename BoostGeometry2::MovetkPoint &m) {
                auto it = m.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
}

Open Discussion Point:

  • But as mentioned in the previous comment it seems that when numpy.array passed as a buffer object does not work. Possible causes could be (a) value in the address pointed to by info.ptr is incorrect (b) the address itself that info.ptr points to is incorrect

What Works?

If we finalise that it will only be possible to create a MoveTKPoint from python by using a numpy.array , then we can substitute .def(py::init([](py::buffer const buf) with .def(py::init([](py::array_t<typename BoostGeometry2::NT, BoostGeometry2::dimensions> const buf)

PYBIND11_MODULE(movetk_geometry, m) {
    py::class_<typename BoostGeometry2::MovetkPoint>(m, "make_point", py::buffer_protocol())
            .def(py::init<>())
            .def(py::init([](py::array_t<typename BoostGeometry2::NT,
                    BoostGeometry2::dimensions> const buf) {
                py::buffer_info info = buf.request();
                typename BoostGeometry2::CoordinateIterator first(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr));
                typename BoostGeometry2::CoordinateIterator beyond(
                        static_cast<const typename BoostGeometry2::NT *>(info.ptr) + BoostGeometry2::dimensions);
                typename BoostGeometry2::MovetkPoint pt(first, beyond);
                return pt;
            }))
            .def("__repr__", [](const typename BoostGeometry2::MovetkPoint &m) {
                auto it = m.begin();
                return "[ " + std::to_string(*it) + "," + std::to_string(*(it + 1)) + " ]\n";
            });
 }

which works fine as shown below

import numpy as np
import movetk_geometry as mg
mg.make_point(np.array([1,1]).astype(np.float32))
[ 1.000000,1.000000 ]

mg.make_point(np.array([1,1]))
[ 1.000000,1.000000 ]

mg.make_point(np.array([1,3]))
[ 1.000000,3.000000 ]

@aniketmitra001
Copy link
Collaborator Author

Added a notebook MovetkGeometry.ipynb

Binding code is movetk_geometry.cpp

The geometry backend , number type and dimensions of the geometry kernel is set as Cmake flags while building the project and is fixed for the whole project.

  • Inorder to change the geometry backend the Cmake flag WITH_CGAL_BACKEND has to be set to ON. The default geometry backend is Boost.Geometry
  • The number type and the dimensions have to be changed in GeometryBackendTraits.h before building the project

Build Status

  • Passes on Mac
  • Fails on Linux with the following error
In file included from /miniconda/include/pybind11/pytypes.h:12:0,
                 from /miniconda/include/pybind11/cast.h:13,
                 from /miniconda/include/pybind11/attr.h:13,
                 from /miniconda/include/pybind11/pybind11.h:44,
                 from /usr/src/movetk/py/movetk_geometry.cpp:25:
/miniconda/include/pybind11/detail/common.h:112:10: fatal error: Python.h: No such file or directory
 #include <Python.h>
          ^~~~~~~~~~
compilation terminated.

@aniketmitra001
Copy link
Collaborator Author

Added a notebook MovetkGeometry.ipynb

Binding code is movetk_geometry.cpp

The geometry backend , number type and dimensions of the geometry kernel is set as Cmake flags while building the project and is fixed for the whole project.

  • Inorder to change the geometry backend the Cmake flag WITH_CGAL_BACKEND has to be set to ON. The default geometry backend is Boost.Geometry
  • The number type and the dimensions have to be changed in GeometryBackendTraits.h before building the project

Build Status

  • Passes on Mac
  • Fails on Linux with the following error
In file included from /miniconda/include/pybind11/pytypes.h:12:0,
                 from /miniconda/include/pybind11/cast.h:13,
                 from /miniconda/include/pybind11/attr.h:13,
                 from /miniconda/include/pybind11/pybind11.h:44,
                 from /usr/src/movetk/py/movetk_geometry.cpp:25:
/miniconda/include/pybind11/detail/common.h:112:10: fatal error: Python.h: No such file or directory
 #include <Python.h>
          ^~~~~~~~~~
compilation terminated.

Fixed build issue on Fedora by installing the python3-devel package . See this Issue #23 for details

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Release v1.0
  
In progress
Development

Successfully merging a pull request may close this issue.

3 participants