[BUG] Delete problems in a cohort #7303

Closed
mandy-chessell opened this issue Jan 2, 2023 · 5 comments
Labels: bug (Something isn't working), no-issue-activity (Issues automatically marked as stale because they have not had recent activity)

Comments

@mandy-chessell
Contributor

Existing/related issue?

No response

Current Behavior

When anchored elements are deleted, downstream elements should also be deleted.

For example, if a comment in a hierarchy of comments is deleted, all comment replies to this comment should be deleted.

Although it works OK in a local repository, in a cohort we are seeing some of these anchored elements being left behind.

Expected Behavior

The behaviour in a cohort should be the same as in a local repository.

Steps To Reproduce

In a cohort (like the lab environment) use Asset Consumer to create a comment in one repository linked to an asset. From another repository, add a comment reply. This situation is created in the building an asset catalog lab.

Then delete the comment connected directly to the asset. The comment reply remains.

This is consistent behaviour - there is no timing issue.

Environment

- Egeria: first seen in 3.14

Any Further Information?

Observations from looking at the code and thinking about recent changes ...

If this were a timing issue and the delete events were slow to get around the cohort, then a getEntityDetail call issued at the enterprise connector could conceivably just return the reference copies which are at the version before the delete (the home copy would not be returned because it is deleted). This would mean the entity could be returned from this call until the delete events are propagated and processed. If this were a problem, it is a timing issue that has been in the code probably from v0.1.

This timing window could easily be closed by changing the enterprise connector to use isEntityKnown(). This would return the deleted copies as well as the copies of the previous version. Once the consolidated entity is assembled (which would favour the latest version, i.e. the deleted home copy), the enterprise connector could check its status and throw EntityNotKnownException if it is in deleted status.
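The check described above could be sketched roughly as follows. This is a minimal, self-contained model rather than Egeria code: `EntityCopy`, `Status`, `consolidate` and the local `EntityNotKnownException` class are hypothetical stand-ins, and the real OMRS types and method signatures are not reproduced here. It assumes each cohort member returns whatever copy it holds (including deleted ones) and that a higher version number means a more recent copy.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: none of these types are the real OMRS classes.
class EnterpriseDeleteCheckSketch {

    enum Status { ACTIVE, DELETED }

    /** Simplified stand-in for one copy of an entity returned by a cohort member. */
    static final class EntityCopy {
        final String guid;
        final long version;
        final Status status;

        EntityCopy(String guid, long version, Status status) {
            this.guid = guid;
            this.version = version;
            this.status = status;
        }
    }

    /** Local stand-in for the exception named in the issue. */
    static class EntityNotKnownException extends Exception {
        EntityNotKnownException(String message) {
            super(message);
        }
    }

    /** Consolidate the copies by favouring the one with the highest version. */
    static Optional<EntityCopy> consolidate(List<EntityCopy> copies) {
        return copies.stream().max(Comparator.comparingLong((EntityCopy c) -> c.version));
    }

    /**
     * Assumes 'copies' came from an isEntityKnown-style query, so deleted
     * copies are included alongside stale reference copies.
     */
    static EntityCopy getEntityDetail(String guid, List<EntityCopy> copies)
            throws EntityNotKnownException {
        EntityCopy latest = consolidate(copies)
                .orElseThrow(() -> new EntityNotKnownException("No copy of " + guid + " found"));

        // The latest copy wins; if it is the deleted home copy, hide the entity.
        if (latest.status == Status.DELETED) {
            throw new EntityNotKnownException("Entity " + guid + " has been deleted");
        }
        return latest;
    }
}
```

With a stale ACTIVE reference copy at version 3 and a DELETED home copy at version 4, this sketch reports the entity as not known, even though the delete event has not yet reached the repository holding the reference copy.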

This is a simple change that is probably worth doing to close a timing window - and probably give the impression that this problem is fixed.

However, the situation of the undeleted reference copies is persistent. My thought, looking at the code, is that there is a problem in the processing of the delete events, and that it is related to the storing of HomeClassifications in a different repository to the home copy of the entity.

Before this change, the receipt of a delete request resulted in a purge of all reference copies in the cohort. The home repository was then left to deal with possible restore or purge requests.

With the possibility that the reference copy could contain home classifications, the management of events may need to be more sophisticated.

These are just some of my thoughts - need to gather some diagnostics to be more certain. This is just a capture of my initial thoughts and to ensure we do not forget the problem.

@mandy-chessell mandy-chessell added the bug Something isn't working label Jan 2, 2023
@mandy-chessell mandy-chessell self-assigned this Jan 2, 2023
@davidradl
Member

@mandy-chessell I wonder what the performance impact is, and whether it is worth closing this timing window. I am thinking the fix is worth doing. Also, when you say "With the possibility that the reference copy could contain home classifications, the management of events may need to be more sophisticated", I am wondering when this case could cause an issue. Most of the time, if you are deleting a reference copy, then you want to get rid of it irrespective of what classifications are on it. I assume anchor processing and dealing with mementos would be done at a higher layer prior to an OMRS delete of the reference copy.

@mandy-chessell
Contributor Author

@davidradl This bug is reporting the fact that orphaned entities are left in the repository. This is not a timing issue. The anchor processing is removing the home copies - the issue here is to clean up the reference copies.

The suggested fix for the timing issue described above can be made with no impact on performance - the same number of REST calls would be made.

The home classifications cannot be deleted with the reference copy as part of the delete request because they are still needed in case a restore call is made.
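To make the constraint concrete, here is a rough sketch of one possible policy for a repository holding a reference copy. It is an assumption made for illustration, not a description of how Egeria handles these events: `ReferenceCopyStore`, its event methods and the classification name in the usage note are all hypothetical. In this model, a delete event purges a reference copy that carries no home classifications, but only hides one that does, so that a later restore can bring those classifications back.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of one event-handling policy; not Egeria's implementation.
class ReferenceCopyStore {

    /** A reference copy plus any classifications whose home is this repository. */
    static final class ReferenceCopy {
        final Set<String> homeClassifications = new HashSet<>();
        boolean deleted = false;
    }

    private final Map<String, ReferenceCopy> copies = new HashMap<>();

    void saveReferenceCopy(String guid, String... homeClassifications) {
        ReferenceCopy copy = new ReferenceCopy();
        copy.homeClassifications.addAll(Arrays.asList(homeClassifications));
        copies.put(guid, copy);
    }

    /** Delete event from the entity's home repository. */
    void onDeleteEvent(String guid) {
        ReferenceCopy copy = copies.get(guid);
        if (copy == null) {
            return;
        }
        if (copy.homeClassifications.isEmpty()) {
            copies.remove(guid);   // nothing to preserve: purge the reference copy
        } else {
            copy.deleted = true;   // keep the classifications in case of a restore
        }
    }

    /** Restore event: make a soft-deleted reference copy visible again. */
    void onRestoreEvent(String guid) {
        ReferenceCopy copy = copies.get(guid);
        if (copy != null) {
            copy.deleted = false;
        }
    }

    /** Purge event: the entity is gone for good, so drop everything. */
    void onPurgeEvent(String guid) {
        copies.remove(guid);
    }

    /** True if queries should return this reference copy. */
    boolean isVisible(String guid) {
        ReferenceCopy copy = copies.get(guid);
        return copy != null && !copy.deleted;
    }

    /** True if anything is still stored for this GUID, visible or not. */
    boolean isStored(String guid) {
        return copies.containsKey(guid);
    }
}
```

For example, after `saveReferenceCopy("asset-1", "Confidentiality")` and `onDeleteEvent("asset-1")`, the copy is no longer visible but is still stored; `onRestoreEvent("asset-1")` makes it visible again with its classification intact, while `onPurgeEvent("asset-1")` removes it entirely.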

@mandy-chessell
Contributor Author

This issue is proving difficult to debug. It is easy to recreate when the servers are running at normal speed, but does not fail in the debugger. The event handling change that I mentioned above was actually already implemented and appears to work fine.

I have implemented the change to the enterprise metadata collection to use isEntityKnown rather than getEntityDetail, and that has simplified the executors as well as removing a potential problem.

I need to put this to one side to focus on the CTS, but will return to it next week.

mandy-chessell added a commit to mandy-chessell/egeria that referenced this issue Jan 16, 2023
Signed-off-by: Mandy Chessell <mandy.e.chessell@gmail.com>
mandy-chessell added a commit that referenced this issue Jan 17, 2023
Detect deleted entities quicker (#7303)
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Mar 18, 2023
@mandy-chessell mandy-chessell removed the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Mar 19, 2023
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label May 19, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 8, 2023