-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add caching results for repeated invocations to speedup repeated lookups #33
Comments
Please clarify what kind of repeated calls you refer to, i.e. which function/use case. |
Do you mean the 4242 calls to os.walk here? |
After taking to @tfoote the 4242 calls are fine as they perform the filesystem traversal. The point this issue is about is subsequent invocations of the same query could utilize a cache. The cache would persist results to gain speed for the downside of potentially returning wrong results if the filesystem has changed in the mean time. The cache would need to be implemented cross-language to be usable by the different tools involved. I will mark this issue as "untargeted" for now as this is nothing which is currently scheduled to be implemented. |
Still, great progress towards making the legacy buildsystem as fast as it was in Fuerte. |
I see two ways to improve performance of rospack find_by_path. One is to avoid parsing the package.xml when that is not necessary. This can be done if this line
is replaced with:
under the assumption that a package folder is named as the name tag in the package.xml. Then restructuring the if statement before that, parsing can be avoided. (for a 10% gain or so). Also I guess we apply search often on /opt/ros/groovy, yet we only carebaout /opt/ros/groovy/stacks and /opt/ros/groovy/share. if the other folders in /opt/ros/groovy had ignore files (or would be ignored by some other means), that would reduce some file traversal. E.g. if ".catkin" had a line (IGNORE=...). That gave me a 30% boost or so. |
The assumption that the folder name equals the package name can not be made, so the proposed improvement is not possible. Could you describe in more detail in which use cases we search |
I think the name tag still has to be discussed. Will do so in Buildsystem sig. Regarding the folders, i guess I assumed wrong. |
Digging further through the code, I found that the OS detection code is quite inefficient for Ubuntu. This does not really show in the benchmarks because the time is actually spent in the external lsb_release binary. I rewrote that particular class in ros-infrastructure/rospkg#28, which again halves the runtime of rosdep on my system. The os.walk() does not seem to matter that much once the OS cache is hot. |
Just another idea, which might not do much, since I don't know the whole call graphs. Currently rospkg.rospackRosPack and rospkg.rospack.RosStack instances each have their own cache I believe, which is filled by crawling the same files and folders. So if both packages and stacks are needed, crawling is done twice. If any client uses both in the same process, then the code could be restructured to crawl only once, filling both caches. As an example, rosdep.lookup.get_resources_that_need() invokes functions that fill both rospack and rosstack caches, I believe. |
I will close this as "wontfix" since the API allows options for caching already () and crawling (at least once) is an inherent design decision in ROS 1. Please comment with specific functions which should provide caching and this can be reconsidered. |
An example of using the caching API is here: ros/ros_comm#542 |
This is now the slowest element when calling rosdep which is embedded inside rospack.
This is a follow onto ros-infrastructure/rosdep#218
The text was updated successfully, but these errors were encountered: