fix: use the range to limit offers, not definitions #2018
Codecov Report
Base: 64.33% // Head: 64.28% // Decreases project coverage by 0.05%.
Additional details and impacted files
@@ Coverage Diff @@
## main #2018 +/- ##
==========================================
- Coverage 64.33% 64.28% -0.05%
==========================================
Files 780 780
Lines 16537 16573 +36
Branches 1076 1081 +5
==========================================
+ Hits 10639 10654 +15
- Misses 5448 5468 +20
- Partials 450 451 +1
I'm worried about the performance implications of this.
@@ -55,15 +57,21 @@ public ContractOfferServiceImpl(ParticipantAgentService agentService, ContractDe
     @NotNull
     public Stream<ContractOffer> queryContractOffers(ContractOfferQuery query, Range range) {
         var agent = agentService.createFor(query.getClaimToken());

-        return definitionService.definitionsFor(agent, range)
+        var offers = definitionService.definitionsFor(agent)
I know that there aren't many ways to do this, but could this lead to long-blocked threads?
I mean, for every page we have to fetch all the definitions. That might not be a big issue, even if there are a lot of them, since they are really small.
But on a request for the last page, the service will fetch all the definitions, query/evaluate all the policies, and fetch all the assets, and only then are `skip` and `limit` applied, as they are located after a `flatMap`. So the time needed to fetch the last page is actually the same as the time needed to fetch the whole catalog in a single call.
E.g. (speculative): having 100,000 definitions (a pretty standard value) means that there will be 200,000 queries to the policy store (access and contract) and 100,000 to the asset index. Let's assume that the stores are really fast and the policy evaluation takes no time: if we count 1 ms for every definition, a single page request could take on the order of a minute or two (100,000 × 1 ms = 100 s).
I don't remember if the issue with the "full catalog" fetch was with this operation or with serdes/transmission, but I think this implementation could lead to poor performance and potentially to OOM exceptions.
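To make the concern above concrete, here is a minimal sketch (not the connector's code) that counts how many definitions have to be resolved before `skip` and `limit` can produce a page. The class name, the method name, and the string "offers" are hypothetical stand-ins; the counter stands in for the policy and asset lookups done per definition.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class LastPageCost {

    /**
     * Returns how many definitions were resolved to serve the page
     * [from, from + size). Everything here is an illustrative stand-in.
     */
    public static int definitionsResolvedForPage(int definitionCount, int offersPerDefinition,
                                                 int from, int size) {
        AtomicInteger resolved = new AtomicInteger();

        IntStream.range(0, definitionCount)
                .boxed()
                // Stand-in for the per-definition policy lookups and asset query.
                .flatMap(definitionId -> {
                    resolved.incrementAndGet();
                    return IntStream.range(0, offersPerDefinition)
                            .mapToObj(i -> "offer-" + definitionId + "-" + i);
                })
                // Paging happens only after the definitions were expanded into offers.
                .skip(from)
                .limit(size)
                .collect(Collectors.toList());

        return resolved.get();
    }

    public static void main(String[] args) {
        int first = definitionsResolvedForPage(1000, 1, 0, 10);
        int last = definitionsResolvedForPage(1000, 1, 990, 10);
        System.out.println("definitions resolved for first page: " + first);
        System.out.println("definitions resolved for last page:  " + last);
    }
}
```

Under these assumptions the last page touches every definition, because the stream has to walk past all earlier offers before it reaches the requested slice.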
> but could this lead to long-blocked threads?

We could convert everything from the API down to async to avoid blocking threads. I would see that as a separate topic, though.

> ... skip and limit are applied, as they are located after a flatMap

Well, from a logical standpoint, that is the only place where `skip` and `limit` can be applied, because we need the flat list of assets/offers first.
Maybe we could first collect all the `Assets`, slice out the page, and then convert them into offers? That would still rely on the store not materializing the stream prematurely, but it would save calls to the `PolicyDefinitionStore`.
Also, the way the `range` was applied before was flat-out wrong, because it limited the number of `contractDefinitions`, so some action needed to be taken.
The problem with the full list was that Jetty ran into timeouts (and potentially body-size limits) when using anything other than in-mem.

> I mean, for every page we have to fetch all the definitions ...

We could add a cache for the definitions and policies (see #1173), if that needs to be optimized.
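The "slice out the page, then convert" idea can be sketched roughly as follows. This is an illustration only: `PageThenConvert`, `toOffer`, and the counter are hypothetical, assets are plain strings, and `to` is assumed to be exclusive, which may differ from the actual `Range` semantics.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class PageThenConvert {

    // Counts simulated policy-store lookups so the saving is observable.
    public static final AtomicInteger policyLookups = new AtomicInteger();

    // Hypothetical conversion step; in the real service this would need policy lookups.
    static String toOffer(String assetId) {
        policyLookups.incrementAndGet();
        return "offer-for-" + assetId;
    }

    /** Pages over the assets first, then converts only the selected slice into offers. */
    public static List<String> offersPage(List<String> assetIds, int from, int to) {
        return assetIds.stream()
                .skip(from)
                .limit(Math.max(0, to - from)) // assumption: 'to' is exclusive
                .map(PageThenConvert::toOffer) // runs only for assets inside the page
                .collect(Collectors.toList());
    }
}
```

Because `map` sits after `skip` and `limit`, the conversion runs once per asset in the page rather than once per asset in the catalog. The assets themselves still have to be enumerated up to the page, which is the part that depends on the store not materializing the stream early.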
The implementation was updated in commit ea75591, as was the PR description.
...ct/src/main/java/org/eclipse/dataspaceconnector/contract/offer/ContractOfferServiceImpl.java
I think this is probably the most efficient way to do this at the moment.
The next thing we could investigate is actually supporting streaming in the `SqlExecutor` (because right now we are reading everything into an in-memory list on which the stream is opened).
I will open an issue about it.
* fix: use the range to limit offers, not definitions * merge aftermath
What this PR changes/adds
In the current implementation of the `ContractOfferServiceImpl`, the `Range` parameter is used to limit `ContractDefinitions` rather than the resulting `ContractOffers`. This is a bug and could lead to more `contractOffers` than specified by `Range`.
This PR dynamically composes the `Stream<Asset>` based on the `from` and `to` parameters of the `Range`, skipping as many `Assets` as necessary, and applying the `skip` and `limit` parameters to the respective database query.
In doing that, it was necessary to add a method `AssetIndex.count(QuerySpec)` to obtain the number of Assets referenced by a particular `AssetSelectorExpression`.
Why it does that
To guarantee the correct/expected behaviour of the paging.
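The range composition described under "What this PR changes/adds" could look roughly like the sketch below. It is not the merged implementation: `RangeComposition` and `AssetSource` are hypothetical, `count()` only plays the role of `AssetIndex.count(QuerySpec)`, `query(offset, limit)` stands in for the paged database query, and `to` is assumed to be exclusive.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeComposition {

    /** Hypothetical stand-in for the assets selected by one contract definition. */
    public interface AssetSource {
        long count();                                // role of AssetIndex.count(QuerySpec)
        List<String> query(long offset, long limit); // role of a paged database query
    }

    /** Simple in-memory source, used for illustration only. */
    public static AssetSource inMemory(List<String> assets) {
        return new AssetSource() {
            @Override
            public long count() {
                return assets.size();
            }

            @Override
            public List<String> query(long offset, long limit) {
                int start = (int) Math.min(offset, assets.size());
                int end = (int) Math.min(offset + limit, assets.size());
                return assets.subList(start, end);
            }
        };
    }

    /** Collects the assets in [from, to) across all sources, in source order. */
    public static List<String> page(List<AssetSource> sources, long from, long to) {
        List<String> result = new ArrayList<>();
        long skip = from;
        long remaining = Math.max(0, to - from);

        for (AssetSource source : sources) {
            if (remaining == 0) {
                break; // page is full, later sources are never queried
            }
            long available = source.count();
            if (skip >= available) {
                skip -= available; // whole source lies before the page: count only, no query
                continue;
            }
            List<String> chunk = source.query(skip, remaining);
            result.addAll(chunk);
            remaining -= chunk.size();
            skip = 0; // later sources start at their first asset
        }
        return result;
    }
}
```

With three sources holding 3, 2, and 4 assets, a request for the range 3 to 5 skips the first source after a count, reads both assets from the second, and never touches the third.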
Further notes
Closes #2008