Error with TransferTerminationMessage when terminated by Provider #688

Open
tmberthold opened this issue Jan 9, 2024 · 2 comments
Labels: component/edc-ce, component/edc-ui, kind/bug, scope/mds, task/analyze

Comments

@tmberthold
Member

tmberthold commented Jan 9, 2024

Description - What happened?

When the Provider aborts a transfer because of an internal error, sending the TransferTerminationMessage fails with a follow-up error.
As a result, the consumer's EDC-UI keeps showing rotating bars for the transfer process indefinitely.

Expected Behavior

A TransferTerminationMessage sent by the Provider after it cancels a transfer because of an internal error should also terminate the transfer process at the Consumer, including its graphical representation in the UI.

Observed Behavior

If the provider immediately cancels a data transfer requested by a consumer because it encounters an internal error in its setup, such as an unreachable data plane or a failure to build the HttpDataSource, it notifies the consumer of the termination with a TransferTerminationMessage. While sending that message, however, the provider also logs a failure: "404 - Transferprocess with corellationId ... not found". As a consequence, the consumer remains in the Requested status for the transfer, although the provider has long since reached the Terminated status.

It is unclear whether the 404 response comes from the Consumer in reply to the termination message from the Provider, or whether it is another internal error of the Provider itself.
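
One way the two symptoms could fit together, purely as an illustration: if the receiver of the termination message cannot resolve the ID carried in the message, it answers with 404 and leaves its local state untouched, so the transfer stays in Requested. The sketch below models only that idea. All names (`SketchState`, `ConsumerTransferStore`, `onTerminationMessage`) are hypothetical and not taken from the EDC code base, and the sketch does not claim that this is what actually happens in the EDC.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; none of these names come from the EDC code base.
enum SketchState { REQUESTED, STARTED, TERMINATED }

class ConsumerTransferStore {
    // Transfer state keyed by the ID under which the consumer stored the transfer.
    private final Map<String, SketchState> byCorrelationId = new HashMap<>();

    void register(String correlationId) {
        byCorrelationId.put(correlationId, SketchState.REQUESTED);
    }

    SketchState stateOf(String correlationId) {
        return byCorrelationId.get(correlationId);
    }

    // Returns an HTTP-like status: 200 if the transfer was found and terminated, 404 otherwise.
    int onTerminationMessage(String correlationId) {
        if (!byCorrelationId.containsKey(correlationId)) {
            return 404; // unknown ID: the local state is left untouched
        }
        byCorrelationId.put(correlationId, SketchState.TERMINATED);
        return 200;
    }
}
```

In this model, a termination message carrying an ID the consumer does not know yields 404 while the registered transfer remains in `REQUESTED`. Checking which ID the provider sends, and which ID the consumer has stored, may therefore be a useful first debugging step.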

Steps to Reproduce

There are no direct steps to reproduce this, because it must be preceded by an internal error on the provider's side while initiating the transfer requested by a consumer. It is therefore a follow-up error.

Context Information

Additional Findings

Occurred in two independent transfer terminations on the provider side:

  1. Provider: Internal error because Dataplane could not be found -> Provider termination -> Consumer stays in Requested
  2. Provider: Internal error when processing the HttpDataSource -> Provider termination -> Consumer stays in Requested

Hypothesis / Possible Root Cause

  • SO hints that the bug could be caused by an issue in the state machine.
    • The Requested state has ID 500, which has no relation to HTTP
    • An error 500 could therefore also come from the state machine
    • It is one of our EDCs - we can start debugging

Timeline / Priority

We want to fix this issue before the launch of MDS 2.1 so that on-premise customers of MDS do not have to do a second update at a later stage; this will ensure a smoother customer experience.

Screenshots

In the consumer's EDC-UI, the transfer process keeps showing rotating bars indefinitely, as in this example of a transfer that was started 20 days ago and has since been terminated by the Provider.
image

Workaround idea

As a workaround, the Consumer could consider a transfer terminated if its transfer process shows no status change for a certain period of time x.
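
A minimal sketch of this timeout check, assuming the consumer can read the current state and the time of the last state change of a transfer process. The class `StaleTransferCheck`, its method, and the plain string `"REQUESTED"` are hypothetical names chosen for illustration and are not part of the EDC or the EDC-UI.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper illustrating the timeout idea; not part of the EDC or EDC-UI.
final class StaleTransferCheck {
    private StaleTransferCheck() {}

    // True if the transfer is still in the Requested state and its last
    // state change is older than the timeout x.
    static boolean shouldTreatAsTerminated(String state, Instant lastStateChange,
                                           Instant now, Duration timeout) {
        if (!"REQUESTED".equals(state)) {
            return false; // only transfers stuck in Requested are considered
        }
        Duration idle = Duration.between(lastStateChange, now);
        return idle.compareTo(timeout) > 0;
    }
}
```

The value of x is a trade-off: too short a timeout would mark slow but healthy transfers as terminated, while too long a timeout leaves the UI spinning. Passing `now` in as a parameter keeps the check easy to test.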

Stakeholders

@ip312 @jkbquabeck

@tmberthold added the task/analyze label Jan 9, 2024
@tmberthold added the kind/bug, component/edc-ce and component/edc-ui labels Apr 9, 2024
@jkbquabeck
Collaborator

@SebastianOpriel @AbdullahMuk Who can work on the workaround and how much time will the workaround take?

@SebastianOpriel
Member

@ununhexium ?

@AbdullahMuk added the scope/mds label Apr 17, 2024
@AbdullahMuk added the clean-backlog label May 2, 2024
@ununhexium removed the clean-backlog label May 8, 2024
5 participants