add rationale why ament_index pkgs don't have explicit performance tests #65
Conversation
Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>
I believe with these justifications, this package can be considered level 1 😄 If that's indeed the case, do you mind updating the QD accordingly?
I would rather get this PR approved and merged first and then do a follow-up PR to bump the level (which needs a separate round of review if all criteria are satisfied).
While this high-level description is accurate, I don't see how it explains why performance testing is not necessary. REP-2004 does not specify when testing for performance is a requirement, but it does depict performance testing as a form of regression testing, which IMHO leaves little room for arguing against it.
Perhaps if we were to say that the `ament_index` file structure will not change, in addition to IO access being the main bottleneck, we could argue that changing the query logic itself won't impact performance.
@@ -121,7 +121,13 @@ A description of how coverage statistics are summarized from this page, can be f

### Performance [4.iv]

`ament_index_cpp` does not conduct performance tests.
An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths.
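To illustrate the quoted complexity claim, here is a hypothetical sketch, not the actual `ament_index_cpp` API: the function names are invented, and the directory layout follows the resource index design document. A lookup probes each colon-separated prefix path once, which is where the `O(n)` in the number of prefix paths comes from.

```python
# Hypothetical sketch of a resource lookup over prefix paths (assumed names;
# not the real ament_index_cpp implementation).
import os


def get_search_paths(env="AMENT_PREFIX_PATH"):
    """Split the prefix-path environment variable into individual prefixes."""
    value = os.environ.get(env, "")
    return [p for p in value.split(os.pathsep) if p]


def find_resource(resource_type, resource_name, prefixes):
    """Linear scan: one candidate path is checked per prefix, so O(n)."""
    for prefix in prefixes:  # O(n) in the number of prefix paths
        candidate = os.path.join(
            prefix, "share", "ament_index", "resource_index",
            resource_type, resource_name)
        if os.path.exists(candidate):
            return candidate
    return None
```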
@dirk-thomas meta: one could argue that we should have tests that back up that claim.
How would you create a test to actually do that? What kind of thresholds would you use?
I'd probably benchmark queries with increasing values of `n` and check if the result of a linear regression reasonably (within one or two standard deviations) explains the measurements. We would need a big index to see through OS noise though, I give you that.
And actually, now that I think about it, perhaps that's the only kind of performance test that would make sense here.
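The benchmark idea described above could look roughly like the following sketch. The workload is a stand-in linear-time scan with invented names, not a real `ament_index_cpp` query, and the least-squares fit is written out in pure Python; an actual test would measure the real API against a generated index.

```python
# Sketch: time a linear-time query for increasing n, fit t = a*n + b by
# ordinary least squares, and check how well the linear model explains the
# measurements (R^2). Names and workload are hypothetical.
import timeit


def query(prefixes, needle):
    # Stand-in for an O(n) index lookup over prefix paths.
    return any(p == needle for p in prefixes)


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return a, b, r_squared


def benchmark(sizes, repeats=200):
    """Average per-call query time for each index size in `sizes`."""
    times = []
    for n in sizes:
        prefixes = [f"/opt/prefix_{i}" for i in range(n)]
        t = timeit.timeit(lambda: query(prefixes, "missing"), number=repeats)
        times.append(t / repeats)
    return times


sizes = [1000, 2000, 4000, 8000, 16000]
slope, intercept, r2 = fit_line(sizes, benchmark(sizes))
# A regression gate could then require, say, r2 above 0.9 and the slope
# below some agreed time budget per prefix.
```

This also hints at the cost concern raised below: getting a stable `r2` requires many repeats and large values of `n`.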
Measuring enough different values of `n` to figure out whether the complexity is linear as promised would be possible. (Even though this sounds extremely costly to me.)
An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths.
The time complexity to query information scales either linearly with the number of resource types or with the number of resources per type (depending on which dimension is requested).
If the content of a specific resource is retrieved, the time complexity is linear in the size of the content, as is the memory usage in that case, since the content is returned to the caller.
The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead.
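As a rough illustration of those complexity claims, here is a hypothetical sketch (invented helper names; the directory layout follows the resource index design document, but this is not the real `ament_index_cpp` implementation): listing resources of a type scales with the number of entries per type, and reading a resource's content is linear in the content size.

```python
# Hypothetical helpers mirroring the complexity claims in the quoted text.
import os


def list_resources(resource_type, prefixes):
    """Names of all resources of a type: one directory listing per prefix."""
    names = set()
    for prefix in prefixes:  # O(number of prefix paths)
        type_dir = os.path.join(
            prefix, "share", "ament_index", "resource_index", resource_type)
        if os.path.isdir(type_dir):
            names.update(os.listdir(type_dir))  # O(resources per type)
    return sorted(names)


def read_resource(resource_type, name, prefixes):
    """Content of a resource: time and memory linear in the content size."""
    for prefix in prefixes:
        candidate = os.path.join(
            prefix, "share", "ament_index", "resource_index",
            resource_type, name)
        if os.path.isfile(candidate):
            with open(candidate) as f:
                return f.read()  # dominated by filesystem read cost
    return None
```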
@dirk-thomas meta: while I agree the logic overhead is likely negligible compared to that of IO, the performance of `ament_index` queries has more to do with the way the index is structured than with the code itself.
That is why I tried to describe the complexity of the different kinds of queries available.
Arguably, if there were a more efficient way the resource index could be structured to make queries more efficient (while still satisfying the requirements), that might be something better discussed on the design document (https://github.com/ament/ament_cmake/blob/master/ament_cmake_core/doc/resource_index.md).
"Arguably if there would be a more efficient way the resource index could be structured to make queries more efficient"

Sure, my point being that to say "The performance of the implementation is defined by the performance of the underlying filesystem functions" is not entirely correct.
Can you elaborate why?
Re-reading this thread, I get the impression that we're using the term "performance" for different things (algorithmic complexity vs. runtime cost). Since you do mention time complexity, perhaps rephrasing as:
Current: "The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead."
Suggested: "The runtime cost of the implementation is dominated by the runtime cost of the underlying filesystem API, and the implemented logic doesn't add any significant overhead."
would constrain interpretation. WDYT?
Sounds good to me - applied in both locations in d0dbd70.
Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>
"While this high-level description is accurate, I don't see how it explains performance testing is not necessary. REP-2004 does not specify when testing for performance is a requirement, but it does depict performance testing as a form of regression testing, which IMHO leaves little room for arguing against it."
I can't speak for the intention of the REP since I wasn't closely involved in its creation. This is basically something the team has to decide: is it worth / necessary / useful to cover this API with performance tests (and if the answer is yes: what exactly should the performance test actually check for - at least to me that is not obvious).
"That is why I tried to describe the complexity of the different kinds of queries available. Arguably, if there were a more efficient way the resource index could be structured to make queries more efficient (while still satisfying the requirements), that might be something better discussed on the design document (https://github.com/ament/ament_cmake/blob/master/ament_cmake_core/doc/resource_index.md)."
Agreed. I cannot say whether it's worth it or not (though probably it isn't compared to the rest of the system), but I honestly can't come up with any other way to justify dropping performance regression testing than to say we won't be changing the main algorithm any time soon.
Another thought about whether to test this for performance or not: the problem is how REP-2004 frames performance testing. Perhaps we shouldn't be trying to justify testing or not testing for performance regressions in every package, but instead put together a set of concrete use cases and benchmark critical paths.
What's the problem? It just says you need a policy. Your policy can be that you will not do performance testing or block releases based on their results. You just need to justify it. Do you have a specific part of the REP in mind that is problematic?
Quote: "On what basis can we justify that this package should not test for performance degradation? This patch describes"
FWIW I'm playing devil's advocate here on purpose, trying to make the same inquiries a third party that's interested in this Quality Declaration would make. Especially because IIUC this patch will be adapted to many other packages. Do not block on my concerns if you think that's nothing to worry about or that it is sufficient as it stands.
I think the core argument is that the answer to "is performance a reasonable concern for use in a production system?" is "no" for this package, due to the way it is implemented and the places it is used.
Ok, fair enough. Let's hinge on "this is not likely to be part of a critical path at runtime".
I think this justification is ok. The two main takeaways I get are that `ament_index_cpp` does not need performance tests because:
- the algorithms implemented here are not at high risk for performance degradation
- (more importantly) `ament_index_cpp` is not used in performance-critical parts of the ecosystem right now

If in the future new algorithms are introduced, the first point could be reviewed, and if in the future we start using this in performance-critical parts of the stack then we can reconsider it.
I hope the current statement covers the important reason well enough. Thanks for the feedback and discussion!
This is a first draft of a rationale for why performance tests aren't necessary for `ament_index_*` packages.