
Execution time of robot_wrapper.display is dependent on the call frequency? #1068

Closed
thomasfla opened this issue Feb 16, 2020 · 7 comments

@thomasfla
Member

To give a little bit of context, I'm planning on making the connection with the viewer non-blocking by using shared memory and the multiprocessing module. While doing so, I noticed some strange behavior that someone might be able to explain:

I ran some tests and realized that the execution time of Python calls to robot_wrapper.display depends on the call frequency.

If called only a few times, display takes about 1 ms in the worst case on my machine; however, calling it in a loop with a reasonable delay of 50 ms between calls increases this time to 17 ms!

import cProfile
import time

def d(r, t=0):
    # robot and q_viewer are assumed to be set up beforehand
    for i in range(r):
        robot.display(q_viewer)
        time.sleep(t)

cProfile.run('d(1,0.05)')

gives

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.051    0.051 <ipython-input-3-e24e84bd22c0>:1(d)
        1    0.000    0.000    0.051    0.051 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 gepetto_visualizer.py:114(<listcomp>)
        1    0.000    0.000    0.000    0.000 gepetto_visualizer.py:115(<listcomp>)
       13    0.000    0.000    0.000    0.000 gepetto_visualizer.py:9(getViewerNodeName)
        1    0.000    0.000    0.001    0.001 gepetto_visualizer.py:95(display)
        1    0.000    0.000    0.000    0.000 graphical_interface_idl.py:577(applyConfigurations)
        1    0.000    0.000    0.000    0.000 graphical_interface_idl.py:580(refresh)
        1    0.000    0.000    0.001    0.001 robot_wrapper.py:248(display)
        1    0.000    0.000    0.051    0.051 {built-in method builtins.exec}
        1    0.050    0.050    0.050    0.050 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'invoke' of '_omnipy.PyObjRefObject' objects}

but cProfile.run('d(1000,0.05)') gives

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.016    0.016   66.605   66.605 <ipython-input-3-e24e84bd22c0>:1(d)
        1    0.000    0.000   66.605   66.605 <string>:1(<module>)
     1000    0.072    0.000    0.091    0.000 gepetto_visualizer.py:114(<listcomp>)
     1000    0.135    0.000    0.135    0.000 gepetto_visualizer.py:115(<listcomp>)
    13000    0.019    0.000    0.019    0.000 gepetto_visualizer.py:9(getViewerNodeName)
     1000    0.074    0.000   16.502    0.017 gepetto_visualizer.py:95(display)
     1000    0.006    0.000    0.200    0.000 graphical_interface_idl.py:577(applyConfigurations)
     1000    0.006    0.000   16.001    0.016 graphical_interface_idl.py:580(refresh)
     1000    0.006    0.000   16.508    0.017 robot_wrapper.py:248(display)
        1    0.000    0.000   66.605   66.605 {built-in method builtins.exec}
     1000   50.082    0.050   50.082    0.050 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     2000   16.191    0.008   16.191    0.008 {method 'invoke' of '_omnipy.PyObjRefObject' objects}

Even more bizarre: when I call d(100, 0.), the CPU spends 0.28 ms per display call on average, but d(1000, 0.) gives an average of 3 to 6 ms per display call.
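
For reference, here is a minimal sketch of how I measure the same average without the profiler overhead (time_display is a hypothetical helper; robot and q_viewer are assumed to be set up as above):

import time

def time_display(r):
    # hypothetical helper: average cost of one display call, without profiler overhead
    t0 = time.perf_counter()
    for _ in range(r):
        robot.display(q_viewer)
    return (time.perf_counter() - t0) / r

print(time_display(100) * 1e3, "ms per call")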

@jmirabel
Contributor

Using gui.setRefreshIsSynchronous(False) before the benchmark should give more stable behavior. It makes the call to refresh asynchronous with respect to the rendering loop.
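
For reference, a minimal sketch of where this call fits with a RobotWrapper (assuming the viewer client is already initialized, as in the benchmark further down):

robot.initDisplay(loadModel=True)
# make refresh() return immediately instead of waiting for the render loop
robot.viewer.gui.setRefreshIsSynchronous(False)
robot.display(robot.q0)  # subsequent display calls no longer block on rendering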

@thomasfla
Member Author

Many thanks @jmirabel, this solved my timing issues.
In the meantime, I also implemented a very simple multiprocessing class to call the viewer from another process:

from multiprocessing import Process
from multiprocessing.sharedctypes import Array
from ctypes import c_double
import pinocchio as pin
import time

class NonBlockingViewer():
    def __init__(self, robot, timeIntervals=0.05):
        self.timeIntervals = timeIntervals
        # a shared c_double array holding the latest configuration
        self.shared_q_viewer = Array(c_double, robot.nq, lock=False)
        self.p = Process(target=self.display_process, args=(robot, self.shared_q_viewer))
        self.p.daemon = True  # let the display process die with the parent
        self.p.start()

    def display_process(self, robot, shared_q_viewer):
        '''This runs in a separate process.'''
        q_viewer = robot.q0.copy()
        while True:
            # copy the shared configuration and forward it to the viewer
            for i in range(robot.nq):
                q_viewer[i] = shared_q_viewer[i]
            robot.display(q_viewer)
            time.sleep(self.timeIntervals)

    def display(self, q):
        # non-blocking: just write the configuration into shared memory
        for i in range(len(self.shared_q_viewer)):
            self.shared_q_viewer[i] = q[i]

def benchmark():
    pin.switchToNumpyMatrix()
    modelPath = "/opt/openrobots/lib/python3.5/site-packages/../../../share/example-robot-data/robots"
    urdf = modelPath + "/solo_description/robots/solo.urdf"
    robot = pin.RobotWrapper.BuildFromURDF(urdf, modelPath, pin.JointModelFreeFlyer())
    robot.initDisplay(loadModel=True)

    def job(n, t, disp):
        for i in range(n):
            disp(robot.q0)
            robot.q0[7:] += 0.01
            time.sleep(t)

    # blocking, synchronous refresh (default)
    t = time.time()
    job(10000, 0.01, robot.display)
    print(time.time() - t)

    # blocking, asynchronous refresh
    robot.viewer.gui.setRefreshIsSynchronous(False)
    t = time.time()
    job(10000, 0.01, robot.display)
    print(time.time() - t)

    # non-blocking, via shared memory
    nbv = NonBlockingViewer(robot, 0.0)
    t = time.time()
    job(10000, 0.01, nbv.display)
    print(time.time() - t)

benchmark()

The benchmark gives, for 10k calls delayed by 10 ms:

default:          333.11 s
asynchronous gui: 106.87 s
shared memory:    101.55 s

So basically, the shared memory only saves about 0.5 ms per display call.
This is still useful in my case, since I'm calling the display from a real-time thread when prototyping a 1 kHz controller.

@jmirabel
Contributor

A few remarks:

  • the accuracy of time.sleep is not very high, and its error is comparable to the actual computation cost of the function you want to benchmark. Running the following script gives me ~101.30 seconds:
import time

def job(n,t):
    for _ in range(n):
        time.sleep(t)

t = time.time()
job(10000, 0.01)
print(time.time() - t)

  • from your benchmark, I understand that the asynchronous gui takes about 0.6 ms per call. This is reasonable because there is inter-process communication involved. However, 0.155 ms for the shared memory benchmark is extremely high: you only copy a configuration vector.

Putting the two together, I think the 1.55 extra seconds (in the 101.55 s) are due almost entirely to the poor accuracy of time.sleep.
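
For illustration, subtracting the sleep-only baseline measured above from the benchmark totals gives per-call estimates consistent with this (a back-of-the-envelope sketch using only the numbers reported in this thread):

n = 10000
baseline  = 101.30  # seconds for n bare time.sleep(0.01) calls, measured above
async_gui = 106.87  # asynchronous gui benchmark total
shared    = 101.55  # shared memory benchmark total

print((async_gui - baseline) / n * 1e3, "ms per call")  # ~0.56 ms
print((shared - baseline) / n * 1e3, "ms per call")     # ~0.03 ms, i.e. almost nothing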

@thomasfla
Member Author

You are right, the shared memory method is basically the reference point (comparable to doing nothing), so it should not really be part of the benchmark.
As you point out, I can still deduce the time of the display call with the asynchronous gui, which is, as you said, closer to 0.6 ms than 0.5 ms.

@jcarpent
Contributor

> Using gui.setRefreshIsSynchronous(False) before the benchmark should give more stable behavior. It makes the call to refresh asynchronous with respect to the rendering loop.

Why not set this value by default in gepetto-gui?

@jmirabel
Contributor

> Why not set this value by default in gepetto-gui?

I asked myself the same thing. I think the only reason was for code like:

gui.refresh()
gui.captureFrame("image")

When not synchronous, you don't know which image will be rendered to the file. However, after thinking twice about it and checking the source code, I think you are still guaranteed to render the latest one.

I will change the default for some time on my laptop to see if I encounter any other issues.

@jcarpent
Contributor

Thanks @jmirabel for solving this issue.
