Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving conversion performance by avoiding repeated imports #9

Closed
martins-mozeiko opened this issue Aug 2, 2018 · 15 comments
Closed
Labels
enhancement New feature or request

Comments

@martins-mozeiko
Copy link
Contributor

Feature request

Improve conversion performance for python message objects

Feature description

Every time python message is serialized or unserialized then generated code walks over all message fields and tries to import type which provides converter function pointers. Here and here.

These imports are local - meaning that once function exits, the references are decremented and next time it is called same imports happen again. Doing this for large array of elements or large amount of messages makes serialization/unserialization code pretty slow - especially on weaker platforms like raspberry pi. I have profiled Python code and I can see Python import functionality at top of the profile - they are called hundreds and thousands of times for my use case.

I suggest to call target C functions directly without going through Python. This would require each generated module to export @(spec.base_type.pkg_name)_@(module_name)__convert_from_py and @(spec.base_type.pkg_name)_@(module_name)__convert_to_py functions from shared library.

Is this change reasonable?

Implementation considerations

Generator would need to change following places:

  1. create corresponding header file which can be included and target function called directly

  2. build system will to be adjusted to correctly link dependent shared libraries together

Then for all non-primitive fields change generated code to:

  1. include header file that declares conversion function

  2. call C function directly instead of getting function pointer through Python capsule API

Example

Lets look how header field for sensor_msgs/CompressedImage message is converted. Currently it has following code in sensor_msgs_compressed_image__convert_from_py function:

  {  // header
    PyObject * field = PyObject_GetAttrString(_pymsg, "header");
    if (!field) {
      return false;
    }
    typedef bool (* convert_from_py_signature)(PyObject *, void *);
    convert_from_py_signature convert_from_py = NULL;
    // get conversion function
    {
      PyObject * msg_module = PyImport_ImportModule("std_msgs.msg._header");
      if (!msg_module) {
        Py_DECREF(field);
        return false;
      }
      PyObject * msg_class = PyObject_GetAttrString(msg_module, "Header");
      Py_DECREF(msg_module);
      if (!msg_class) {
        Py_DECREF(field);
        return false;
      }
      PyObject * msg_metaclass = PyObject_GetAttrString(msg_class, "__class__");
      Py_DECREF(msg_class);
      if (!msg_metaclass) {
        Py_DECREF(field);
        return false;
      }
      PyObject * convert_from_py_ = PyObject_GetAttrString(msg_metaclass, "_CONVERT_FROM_PY");
      Py_DECREF(msg_metaclass);
      if (!convert_from_py_) {
        Py_DECREF(field);
        return false;
      }
      convert_from_py = (convert_from_py_signature)PyCapsule_GetPointer(convert_from_py_, NULL);
      Py_DECREF(convert_from_py_);
      if (!convert_from_py) {
        Py_DECREF(field);
        return false;
      }
    }
    if (!convert_from_py(field, &ros_message->header)) {
      Py_DECREF(field);
      return false;
    }
    Py_DECREF(field);
  }

I propose to generate following code instead:

  {  // header
    PyObject * field = PyObject_GetAttrString(_pymsg, "header");
    if (!field) {
      return false;
    }
    if (!std_msgs_header__convert_from_py(field, &ros_message->header)) {
      Py_DECREF(field);
      return false;
    }
    Py_DECREF(field);
  }

This would require to add something like #include "std_msgs/msg/python_header.h" at beginning of file. And this would happen for all non-primitive fields.

std_msgs_header__convert_from_py would be function exported by std_msgs_s__rosidl_typesupport_c.cpython-35m-x86_64-linux-gnu.so file.

@mikaelarguedas
Copy link
Member

that sounds like a good plan to me 👍

This is a follow-up we mentioned but didnt get around implementing: ros2/rclpy#36 (comment)
It will definitely improve performance significantly.

One blocker for implementing that solution at the time was that Windows was unable to link .pyd library to another .pyd lib so we had to go through capsules (that was not an issue on Macos or Linux as they are just reguler shared libraries).
If this is still a problem with recent Visual Studio, an alternative solution is to find a way to make the imports nonlocal so that we pay the import cost only once.

@mikaelarguedas mikaelarguedas added the enhancement New feature or request label Aug 2, 2018
@martins-mozeiko
Copy link
Contributor Author

Ok, I'll check how his is possible on Windows with dll files.

Technically dll files supports delay loaded dll files if built by Visual Studio compiler (not so sure about gcc on Windows). That will make import resolving only at first time function is called, not when dll is loaded. This is one option to solve this.

I was also thinking of having imports just once - to store pointers globally. This is actually what I've already tried and it worked fine for us. Not sure if this is good solution. I can see this failing if some module/dll file is unloaded, then recompiled and new module/dll loaded with different functions now. Do you expect this kind of situation?

@mikaelarguedas
Copy link
Member

Technically dll files supports delay loaded dll files if built by Visual Studio compiler (not so sure about gcc on Windows). That will make import resolving only at first time function is called, not when dll is loaded. This is one option to solve this.

That would be great! We are targetting MSVC on windows at the moment so if it works with MSVC and not gcc on Windows it's not a blocker for getting it merged.
Hopefully this feature will not be part of the list of 'slight differences' between dll and pyd on Windows.

I was also thinking of having imports just once - to store pointers globally. This is actually what I've already tried and it worked fine for us.

I thinks that now that type support libraries are loaded at runtime this can work 👍 this was not the case before so we needed to delay the loading to figure out which library should be loaded first.

Do you expect this kind of situation?

Not currently especially as all the loaded libraries are message packages I think it's fine not to support this use case

@martins-mozeiko
Copy link
Contributor Author

martins-mozeiko commented Aug 3, 2018

Ok, I've tried delay loaded dll files on Windows, and they work fine for generated Python conversion code in pyd modules. I'll do a bit more testing, implement Linux part and will submit pull request.

There are no issues with dll vs pyd. pyd file is normal dll file. There is nothing special about it. Python simply expects pyd files to export specific symbol when it is loaded for import to work (PyInit_$ModuleName).

@mikaelarguedas
Copy link
Member

Great news, thanks for looking into it !

@martins-mozeiko
Copy link
Contributor Author

martins-mozeiko commented Aug 4, 2018

I have one issue on Linux - they way how shared libraries for python modules are placed in folders, they are not in the same place.

For example - std_msgs are built in install/lib/python3.5/site-packages/std_msgs/std_msgs_s__rosidl_typesupport_c.cpython-35m-x86_64-linux-gnu.so location. And it because it depends on builtin_interfaces it needs builtin_interfaces_s__rosidl_typesupport_c.cpython-35m-x86_64-linux-gnu.so file. But this file is located in install/lib/python3.5/site-packages/builtin_interfaces/builtin_interfaces_s__rosidl_typesupport_c.cpython-35m-x86_64-linux-gnu.so location.

This would mean I need to update LD_LIBRARY_PATH with location to each folder which contains python module. In this example it would be install/lib/python3.5/site-packages/std_msgs and install/lib/python3.5/site-packages/builtin_intefaces.

I'm not sure where can I update it. It seems LD_LIBRARY_PATH is already being changed from install/share/std_msgs/hook/ld_library_path.sh script. Is this generated by colcon? I'm not sure I understand how I can update it. Will I need to patch colcon?

You can see my current progress in this commit: https://github.com/martins-mozeiko/rosidl_python/commit/954ecafe3df64c8acf12c7977b8e524f0a896bd9
On Windows everything works (building & running). On Linux build succeeds. It does work for isolated install case (because rpath is set in shared libraries), but it does not work for merge-install case for reason mentioned above - shared libraries cannot find each another.

@martins-mozeiko
Copy link
Contributor Author

Currently I took pythonpath.py as example and created similar code that creates python_ld_library_path.sh script which updates LD_LIBRARY_PATH in same way as colcon_library_path already is doing this for "lib" folder.

For example, for std_msgs it looks like this:

colcon_prepend_unique_value LD_LIBRARY_PATH "$COLCON_CURRENT_PREFIX/lib/python3.5/site-packages/std_msgs"

Where you would prefer this change?

  • in new colcon package (colcon_python_module_library_path?)
  • in new file in colcon_library_path package
  • in new file, next to pythonpath.py, in colcon-core/environment
  • in pythonpath.py add new hook to returned list of hooks
  • ... other options?

@mikaelarguedas
Copy link
Member

My feeling is that this should work regardless of the build tool used for building the workspace.
So I would not make changes in colcon but rather in ament_package's templates that are used by all ament_cmake packages.

As adding a python install directory to the library_path is something specific to some packages. I think we should provide a cmake utility to do so, so that it doesn't get appended unconditionally.

@dirk-thomas may have more feedback as he is more familiar with this part of the code

@dirk-thomas
Copy link
Member

I don't think that adding a per-package library directory is going to work out. On Windows the length of that environment variable would easily exceed the maximum length with a large number of message packages.

@martins-mozeiko
Copy link
Contributor Author

martins-mozeiko commented Aug 7, 2018

This is not needed on Windows. Only on Linux/macOS. On Windows /delayload takes care of resolving symbols at runtime, not dll load-time.

My modifications in 954ecaf already work correctly on Windows without need to change anything else. Only changes to LD_LIBRARY_PATH are needed on Linux (and similar on macOS).

@dirk-thomas
Copy link
Member

The CMake can call ament_environment_hooks to add arbitrary hooks. The template can be provided by the package generating the library (I guess rosidl_generator_py?).

I would still try to place the libraries if anyhow possible into a common "LD_LIBRARY_PATH" location before adding these package specific paths to the env var.

@martins-mozeiko
Copy link
Contributor Author

martins-mozeiko commented Aug 7, 2018

Ok, I'll check how ament_environment_hooks works.

I cannot change location of current libraries, because Python requires them to be in specific folders for imports to work. Unless we set PYTHONPATH to lib folder and I'm not sure if this is good approach.

I could split library into two parts - so each message package would generate two libraries. One in regular lib folder with convert_from/to_py functions code, but still linking to python .so file for Python API. This way they will be able to link to each other without adjusting ld library path. Then second library (what currently already exists) will link to this new library for calling conversion functions.

If introducing new library is OK, then I can also look into this option.

@mmozeiko
Copy link

mmozeiko commented Aug 9, 2018

Here is alternative with splitting generated message library into two libraries:
https://github.com/martins-mozeiko/rosidl_python/commit/6bd75b87bc8d24539ed0fede6225c2d86b91e3b6

For example, for std_msgs it creates install/lib/libstd_msgs__python_rosidl_typesupport_c.so and install/lib/python3.5/site-packages/std_msgs/std_msgs_s__rosidl_typesupport_c.cpython-35m-x86_64-linux-gnu.so file. The convert_to_py and convert_from_py functions are in this new libstd_msgs__python_...so file. This file also can link to other message files like builtin_interfaces without need to adjust any environment variables - because it is placed in standard "lib" folder. The original python module .so file (.pyd on Windows) links to new libstd_msgs...so file to be able to use convert_from/to_py function pointers into Python capsule API.

This approach works on Windows and Linux without need to adjust any environment variables.
Let me know what you think of this change. I was not sure about naming of new library, if needed I can change it.

@sloretz sloretz added the in review Waiting for review (Kanban column) label Aug 9, 2018
@dirk-thomas
Copy link
Member

Here is alternative with splitting generated message library into two libraries:

On a first look it looks good to me. Please create a PR for it. Also include steps to compare the performance in the PR description so that others can run those to evaluate the difference.

This approach works on Windows and Linux without need to adjust any environment variables.

The patch might need some updates in order to work on macOS - just assuming - haven't tried building it yet.

@dirk-thomas dirk-thomas added in progress Actively being worked on (Kanban column) and removed in review Waiting for review (Kanban column) labels Aug 10, 2018
@mikaelarguedas mikaelarguedas added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Sep 17, 2018
mikaelarguedas pushed a commit that referenced this issue Sep 18, 2018
* Call conversion functions directly

This makes generated convert to and from function to call C functions
direclty instead of going through Python capsule API. Conversion
functions are placed in separate libraries that link to each other.
Python module libraries links to these new libraries.

See #9 for more details.

* generate single _python lib per package

* use visibility macros from visibility_control.h

* use dylib extension for libraries on macOS

* spurious line change

* remove duplicated include

* blank line fixups

* revert unnecessary code relocation

* rename variables and use lib folder instad of Lib on macos

* create lowercase_field_type for readability and reuse

* remove debug prints

* rename functions to use double underscore between package name and field type
@mikaelarguedas
Copy link
Member

fixed by #10

@mikaelarguedas mikaelarguedas removed the in review Waiting for review (Kanban column) label Sep 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants