add caching results for repeated invocations to speedup repeated lookups #33

tfoote · 2013-02-20T08:31:53Z

This is now the slowest element when calling rosdep which is embedded inside rospack.

This is a follow-on to ros-infrastructure/rosdep#218

dirk-thomas · 2013-02-20T18:54:31Z

Please clarify what kind of repeated calls you refer to, i.e. which function/use case.

jbohren · 2013-02-20T20:49:22Z

Do you mean the 4242 calls to os.walk here?
https://f.cloud.github.com/assets/447804/174243/aa6f70d4-7b00-11e2-91dc-3b89958e9162.png
https://github.com/ros/rospkg/blob/master/src/rospkg/rospack.py#L57

dirk-thomas · 2013-02-20T21:57:01Z

After talking to @tfoote: the 4242 calls are fine, as they perform the filesystem traversal. This issue is about subsequent invocations of the same query, which could use a cache. The cache would persist results to gain speed, at the cost of potentially returning wrong results if the filesystem has changed in the meantime.

The cache would need to be implemented cross-language to be usable by the different tools involved. I will mark this issue as "untargeted" for now, since it is not currently scheduled for implementation.
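To make the trade-off above concrete, here is a minimal sketch of a cache that persists results between invocations. It is not rospkg code: `cached_lookup`, the cache file location, and the choice of JSON (picked only because tools in other languages could read it) are all assumptions for illustration.

```python
import json
import os
import tempfile


def cached_lookup(cache_file, key, compute):
    """Return compute(key), reusing a result persisted by an earlier invocation.

    Hypothetical helper: nothing here detects filesystem changes, so a stale
    entry is returned as-is. That is exactly the downside described above.
    """
    try:
        with open(cache_file) as f:
            cache = json.load(f)
    except (OSError, ValueError):
        cache = {}  # missing or unreadable cache file: start empty

    if key not in cache:
        cache[key] = compute(key)
        # Write to a temporary file and rename it, so that another process
        # reading the cache never sees a partially written file.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(cache_file) or '.')
        with os.fdopen(fd, 'w') as f:
            json.dump(cache, f)
        os.replace(tmp, cache_file)
    return cache[key]
```

A second call with the same key, even from a new process, reads the stored value and never calls `compute`. A real implementation would also need an invalidation rule, which this sketch deliberately leaves out.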

jbohren · 2013-02-20T23:53:25Z

Still, great progress towards making the legacy buildsystem as fast as it was in Fuerte.

tkruse · 2013-02-21T00:33:12Z

I see two ways to improve performance of rospack find_by_path. One is to avoid parsing the package.xml when that is not necessary. This can be done if this line

 resource_name = root.findtext('name')

is replaced with:

 resource_name = d

under the assumption that a package folder has the same name as the name tag in its package.xml. By then restructuring the preceding if statement, parsing can be avoided (for a gain of 10% or so).

Also, I guess we often search /opt/ros/groovy, yet we only care about /opt/ros/groovy/stacks and /opt/ros/groovy/share. If the other folders in /opt/ros/groovy had ignore files (or were ignored by some other means), that would reduce file traversal, e.g. if ".catkin" had a line (IGNORE=...). That gave me a 30% boost or so.
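The traversal-pruning idea can be sketched with plain `os.walk`. This is only an illustration: the marker file name `CRAWL_IGNORE` and the function `find_package_dirs` are hypothetical, and it uses a per-directory marker file rather than the `IGNORE=...` line in ".catkin" suggested above.

```python
import os

IGNORE_MARKER = 'CRAWL_IGNORE'  # hypothetical marker file name, illustration only


def find_package_dirs(root, manifest='package.xml'):
    """Return directories under root that contain a manifest file.

    Subtrees whose top directory contains IGNORE_MARKER are skipped entirely.
    """
    found = []
    for d, dirs, files in os.walk(root, topdown=True):
        if IGNORE_MARKER in files:
            # With topdown=True, emptying dirs in place stops os.walk
            # from descending into this subtree.
            del dirs[:]
            continue
        if manifest in files:
            found.append(d)
            del dirs[:]  # assumption: packages are not nested in packages
    return found
```

The saving comes from `del dirs[:]`: every pruned directory removes all the directory listings beneath it from the crawl.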

dirk-thomas · 2013-02-21T00:41:00Z

The assumption that the folder name equals the package name cannot be made, so the proposed improvement is not possible.

Could you describe in more detail in which use cases we search /opt/ros/groovy for packages? As far as I can see only the subfolders share and stacks are in the package path.

tkruse · 2013-02-21T08:19:57Z

I think the name tag still has to be discussed. I will do so in the Buildsystem SIG.

Regarding the folders, I guess I assumed wrong.

roehling · 2013-02-21T10:53:48Z

Digging further through the code, I found that the OS detection code is quite inefficient for Ubuntu. This does not really show in the benchmarks because the time is actually spent in the external lsb_release binary. I rewrote that particular class in ros-infrastructure/rospkg#28, which again halves the runtime of rosdep on my system.

The os.walk() does not seem to matter that much once the OS cache is hot.
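One way to avoid the cost of spawning an external binary for OS detection is to read the release information from a file. The sketch below parses the `KEY=value` format of `/etc/lsb-release` as found on Ubuntu. Whether ros-infrastructure/rospkg#28 works this way is not stated in this thread, so treat `read_lsb_release` as a hypothetical helper rather than a description of that change.

```python
def read_lsb_release(path='/etc/lsb-release'):
    """Parse KEY=value lines from an lsb-release style file into a dict.

    Returns an empty dict if the file is missing or unreadable, so that a
    caller can fall back to another detection method.
    """
    info = {}
    try:
        with open(path) as f:
            for line in f:
                key, sep, value = line.strip().partition('=')
                if sep:  # ignore lines without '='
                    info[key] = value.strip('"')
    except OSError:
        pass
    return info
```

On Ubuntu the resulting dict typically includes keys such as `DISTRIB_ID` and `DISTRIB_CODENAME`, read without starting a subprocess.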

tkruse · 2013-02-21T22:16:44Z

Just another idea, which might not do much, since I don't know the whole call graph. Currently, rospkg.rospack.RosPack and rospkg.rospack.RosStack instances each have their own cache, I believe, which is filled by crawling the same files and folders. So if both packages and stacks are needed, crawling is done twice.

If any client uses both in the same process, then the code could be restructured to crawl only once, filling both caches. As an example, rosdep.lookup.get_resources_that_need() invokes functions that fill both rospack and rosstack caches, I believe.
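A minimal sketch of the single-crawl idea follows. It is not the rospkg implementation: `crawl_once` is a hypothetical function, and it assumes that a package.xml with an `<export><metapackage/></export>` entry belongs in the stack cache while every other package.xml belongs in the package cache. Legacy manifest.xml and stack.xml files are left out.

```python
import os
import xml.etree.ElementTree as ET


def crawl_once(root):
    """Walk root a single time and fill both a package and a stack cache.

    Returns (packages, stacks), each mapping a name to its directory.
    """
    packages, stacks = {}, {}
    for d, dirs, files in os.walk(root, topdown=True):
        if 'package.xml' not in files:
            continue
        tree = ET.parse(os.path.join(d, 'package.xml'))
        name = (tree.findtext('name') or '').strip()
        # Assumption: metapackages play the role of stacks.
        is_meta = tree.find('./export/metapackage') is not None
        (stacks if is_meta else packages)[name] = d
        del dirs[:]  # assumption: no packages nested inside a package
    return packages, stacks
```

Both lookup objects could then be seeded from the same result, so the directory tree is read once per process instead of once per cache.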

dirk-thomas · 2015-01-09T19:19:17Z

I will close this as "wontfix" since the API allows options for caching already () and crawling (at least once) is an inherent design decision in ROS 1.

Please comment with specific functions which should provide caching and this can be reconsidered.

tfoote · 2015-01-09T19:19:36Z

An example of using the caching API is here: ros/ros_comm#542

tfoote mentioned this issue Feb 20, 2013

Groovy rosmake / rosdep is very slow ros-infrastructure/rosdep#218

Closed

dirk-thomas closed this as completed Jan 9, 2015

dirk-thomas removed this from the untargeted milestone Dec 8, 2015
