Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use the range to limit offers, not definitions #2018

Merged

Conversation

paullatzelsperger
Copy link
Member

@paullatzelsperger paullatzelsperger commented Sep 26, 2022

What this PR changes/adds

In the current implementation of the ContractOfferSerivceImpl, the Range parameter is used to limit ContractDefinitions rather than the resulting ContractOffers. This is a bug and could lead to more contractOffers than specified by Range.

This PR dynamically composes the Stream<Asset> based on the from and to parameters of the Range, skipping as many Assets as necessary, and applying the skip and limit parameters to the respective database query.

In doing that, it was necessary to add a method AssetIndex.count(QuerySpec) to obtain the number of Assets referenced by a particular AssetSelectorExpression.

Why it does that

To guarantee the correct/expected behaviour of the paging.

Further notes

Closes #2008

Checklist

  • added appropriate tests?
  • performed checkstyle check locally?
  • added/updated copyright headers?
  • documented public classes/methods?
  • added/updated relevant documentation?
  • assigned appropriate label? (exclude from changelog with label no-changelog)
  • formatted title correctly? (take a look at the CONTRIBUTING and styleguide for details)

@paullatzelsperger paullatzelsperger added the bug Something isn't working label Sep 26, 2022
@github-actions github-actions bot added this to In progress in Connector Sep 26, 2022
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 26, 2022 17:02 Inactive
@codecov-commenter
Copy link

codecov-commenter commented Sep 26, 2022

Codecov Report

Base: 64.33% // Head: 64.28% // Decreases project coverage by -0.04% ⚠️

Coverage data is based on head (a283da7) compared to base (4e8252e).
Patch coverage: 50.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2018      +/-   ##
==========================================
- Coverage   64.33%   64.28%   -0.05%     
==========================================
  Files         780      780              
  Lines       16537    16573      +36     
  Branches     1076     1081       +5     
==========================================
+ Hits        10639    10654      +15     
- Misses       5448     5468      +20     
- Partials      450      451       +1     
Impacted Files Coverage Δ
...ceconnector/assetindex/azure/CosmosAssetIndex.java 66.00% <0.00%> (-23.19%) ⬇️
...taspaceconnector/sql/assetindex/SqlAssetIndex.java 0.00% <0.00%> (ø)
...lipse/dataspaceconnector/spi/asset/AssetIndex.java 0.00% <ø> (ø)
...nnector/spi/contract/offer/ContractOfferQuery.java 0.00% <0.00%> (ø)
...ector/contract/offer/ContractOfferServiceImpl.java 95.12% <92.30%> (-4.88%) ⬇️
.../contract/offer/ContractDefinitionServiceImpl.java 100.00% <100.00%> (ø)
...lplane/defaults/assetindex/InMemoryAssetIndex.java 92.85% <100.00%> (+0.12%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@paullatzelsperger paullatzelsperger marked this pull request as draft September 26, 2022 17:15
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 26, 2022 17:54 Inactive
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 26, 2022 18:12 Inactive
@paullatzelsperger paullatzelsperger marked this pull request as ready for review September 26, 2022 18:12
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 26, 2022 18:16 Inactive
@paullatzelsperger paullatzelsperger changed the title fix: use the range to limit offers, not definitions fix: use the range to limit offers, not definitions Sep 27, 2022
Copy link
Member

@ndr-brt ndr-brt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried about the performances implications of this

@@ -55,15 +57,21 @@ public ContractOfferServiceImpl(ParticipantAgentService agentService, ContractDe
@NotNull
public Stream<ContractOffer> queryContractOffers(ContractOfferQuery query, Range range) {
var agent = agentService.createFor(query.getClaimToken());

return definitionService.definitionsFor(agent, range)
var offers = definitionService.definitionsFor(agent)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that there aren't many ways to do this, but could this lead to long-blocked threads?
I mean, for every page we have to fetch all the definitions - that could be not a big issue, also in the case if there are a lot of them, since they are really little.
But, on the last page request here the service will fetch all the definitions, query/evaluate all the policies, fetch all the assets, and then skip and limit are applied, as they are located after a flatMap, so the time needed to fetch the last page is actually the same time needed to fetch all the catalog in a single call.

e.g. (speculative) having 100.000 (pretty standard value) definitions means that there will be 200.000 queries to the policy store (access and contract) and 100.000 to the asset index. let's assume that the stores are really fast and the policy evaluation take no-time, if we count 1ms for every definition the single page request could take a time around 1 minute.

I don't remember if the issue with the "full catalog" fetch was with this operation of with serdes/transmission, but I think this implementation could lead poor performances and potentially to OOM exceptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but could this lead to long-blocked threads?

We could convert everything down from the API to async to avoid blocking threads. That I would see as a separate topic though.

... skip and limit are applied, as they are located after a flatMap

well, from a logical standpoint, that is the only place where skip and limit can be applied, because we need the flat list of assets/offers first.
Maybe we could collect first all the Assets, slice out the page and then convert them into offers? That would still rely on the store not materializing the stream prematurely, but would save calls to the PolicyDefinitionStore.

Also, the way the range was applied before, was flat-out wrong, because it limited the number of contractDefinitions, so some action needed to be taken.

The problem with the full list was that jetty ran into timeouts (and potentially body-size limits) when using anything other than in-mem.

I mean, for every page we have to fetch all the definitions ...

We could add a cache for the definitions and policies (see #1173), if that needs to be optimized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation was updated in commit ea75591, as was the PR descriptions

@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 27, 2022 08:39 Inactive
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 27, 2022 12:00 Inactive
@paullatzelsperger paullatzelsperger marked this pull request as draft September 27, 2022 12:00
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 27, 2022 13:18 Inactive
@paullatzelsperger paullatzelsperger marked this pull request as ready for review September 27, 2022 13:22
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 28, 2022 09:34 Inactive
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev September 28, 2022 12:39 Inactive
Copy link
Member

@ndr-brt ndr-brt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably the most efficient way to do this at the moment.
The next thing where we could investigate is actually supporting streaming in the SqlExecutor (because now we are reading everything in a list in memory on which the stream is opened).
Will open an issue about.

Connector automation moved this from In progress to Review in progress Sep 28, 2022
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev October 4, 2022 06:50 Inactive
@paullatzelsperger paullatzelsperger temporarily deployed to Azure-dev October 6, 2022 13:10 Inactive
@juliapampus juliapampus merged commit c4d4741 into eclipse-edc:main Oct 7, 2022
Connector automation moved this from Review in progress to Done Oct 7, 2022
ndr-brt pushed a commit to Think-iT-Labs/edc-connector that referenced this pull request Feb 1, 2023
* fix: use the range to limit offers, not definitions

* merge aftermath
ndr-brt pushed a commit to Think-iT-Labs/edc-connector that referenced this pull request Feb 1, 2023
* fix: use the range to limit offers, not definitions

* merge aftermath
ndr-brt pushed a commit to Think-iT-Labs/edc-connector that referenced this pull request Feb 1, 2023
* fix: use the range to limit offers, not definitions

* merge aftermath
ndr-brt pushed a commit to Think-iT-Labs/edc-connector that referenced this pull request Feb 1, 2023
* fix: use the range to limit offers, not definitions

* merge aftermath
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Connector
  
Done
Development

Successfully merging this pull request may close these issues.

Potential bug: paging is not applied correctly during a catalog request
5 participants