Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add caching results for repeated invocations to speedup repeated lookups #33

Closed
tfoote opened this issue Feb 20, 2013 · 11 comments
Closed

Comments

@tfoote
Copy link
Member

tfoote commented Feb 20, 2013

This is now the slowest element when calling rosdep which is embedded inside rospack.

This is a follow onto ros-infrastructure/rosdep#218

@dirk-thomas
Copy link
Member

Please clarify what kind of repeated calls you refer to, i.e. which function/use case.

@jbohren
Copy link
Contributor

jbohren commented Feb 20, 2013

@dirk-thomas
Copy link
Member

After taking to @tfoote the 4242 calls are fine as they perform the filesystem traversal. The point this issue is about is subsequent invocations of the same query could utilize a cache. The cache would persist results to gain speed for the downside of potentially returning wrong results if the filesystem has changed in the mean time.

The cache would need to be implemented cross-language to be usable by the different tools involved. I will mark this issue as "untargeted" for now as this is nothing which is currently scheduled to be implemented.

@jbohren
Copy link
Contributor

jbohren commented Feb 20, 2013

Still, great progress towards making the legacy buildsystem as fast as it was in Fuerte.

@tkruse
Copy link
Contributor

tkruse commented Feb 21, 2013

I see two ways to improve performance of rospack find_by_path. One is to avoid parsing the package.xml when that is not necessary. This can be done if this line

 resource_name = root.findtext('name')

is replaced with:

 resource_name = d

under the assumption that a package folder is named as the name tag in the package.xml. Then restructuring the if statement before that, parsing can be avoided. (for a 10% gain or so).

Also I guess we apply search often on /opt/ros/groovy, yet we only carebaout /opt/ros/groovy/stacks and /opt/ros/groovy/share. if the other folders in /opt/ros/groovy had ignore files (or would be ignored by some other means), that would reduce some file traversal. E.g. if ".catkin" had a line (IGNORE=...). That gave me a 30% boost or so.

@dirk-thomas
Copy link
Member

The assumption that the folder name equals the package name can not be made, so the proposed improvement is not possible.

Could you describe in more detail in which use cases we search /opt/ros/groovy for packages? As far as I can see only the subfolders share and stacks are in the package path.

@tkruse
Copy link
Contributor

tkruse commented Feb 21, 2013

I think the name tag still has to be discussed. Will do so in Buildsystem sig.

Regarding the folders, i guess I assumed wrong.

@roehling
Copy link
Contributor

Digging further through the code, I found that the OS detection code is quite inefficient for Ubuntu. This does not really show in the benchmarks because the time is actually spent in the external lsb_release binary. I rewrote that particular class in ros-infrastructure/rospkg#28, which again halves the runtime of rosdep on my system.

The os.walk() does not seem to matter that much once the OS cache is hot.

@tkruse
Copy link
Contributor

tkruse commented Feb 21, 2013

Just another idea, which might not do much, since I don't know the whole call graphs. Currently rospkg.rospackRosPack and rospkg.rospack.RosStack instances each have their own cache I believe, which is filled by crawling the same files and folders. So if both packages and stacks are needed, crawling is done twice.

If any client uses both in the same process, then the code could be restructured to crawl only once, filling both caches. As an example, rosdep.lookup.get_resources_that_need() invokes functions that fill both rospack and rosstack caches, I believe.

@dirk-thomas
Copy link
Member

I will close this as "wontfix" since the API allows options for caching already () and crawling (at least once) is an inherent design decision in ROS 1.

Please comment with specific functions which should provide caching and this can be reconsidered.

@tfoote
Copy link
Member Author

tfoote commented Jan 9, 2015

An example of using the caching API is here: ros/ros_comm#542

@dirk-thomas dirk-thomas removed this from the untargeted milestone Dec 8, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants