New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unable to Delete File DID via Undertaker #5154
Labels
Comments
@bjwhite-fnal with which Rucio version did you test this? |
This would have been 1.26, though I'm not sure which minor version I was using back in January. |
I have to test this a bit more, but to me it looks like this should be working in 1.29 - I will add a unittest to confirm, but the parts where we exclude files are not there anymore as far as I see. |
ThePhisch
added a commit
to ThePhisch/rucio
that referenced
this issue
Feb 2, 2023
Added tests to cover more of the undertaker functionality. This was triggered by issue rucio#5154. `test_file_did_deletion` tests how the undertaker handles a file DID that is not attached to any dataset. It asserts that the epoch tombstone is set on the file replica, regardless of whether rules exist on the file DID (parametrised via a pytest fixture). Furthermore, the file DID must not be deleted, as this is the Reaper's job. `test_file_dids_in_dataset` tests how the undertaker handles an expired file DID that belongs to a non-expired dataset among with other files. It asserts that the expired file is detached and its replicas receive the epoch tombstone. Furthermore, the other files are unchanged. `test_file_protected_by_rule` sets up a system in which two files belong to an expired dataset. At the same time, one of these files is protected by a rule. The expected outcome is that the unprotected file DID is detached from the dataset and its replica is tombstoned. On the other hand, the protected file DID is detached from the dataset, and its replicas protected by the dataset rule are epoch-tombstoned while its replicas protected by the external rule remain unchanged. Finally, it is asserted that the file DIDs remain whereas the dataset DID is deleted. `test_file_protected_by_dataset` has a similar setup to `test_file_protected_by_rule`, but instead of a rule acting directly on the protected file, the protected file is shared between an expired DID and a non-expired DID. We expect the unprotected file DID to be detached and its replica to receive the epoch tombstone. Furthermore, we expect the protected file DID to be detached from the expired dataset but remain attached to the non-expired dataset; with the protected replica remaining untouched and the unprotected replica being epoch-tombstoned. As with the pre-existing `test_removal_all_replicas2`, the tests have been parametrised to test both `use_temp_tables: True` and `False`. The pre-existing test `TestUndertaker::test_undertaker` was changed to include an assertion that the expired dataset DIDs were deleted by the undertaker. In addition, we assert that the replicas have the epoch-tombstone (marked by `rucio.db.sqla.constants.OBSOLETE`), instead of asserting `is not None` (which is more generic and could also include non-epoch-tombstones).
ThePhisch
added a commit
to ThePhisch/rucio
that referenced
this issue
Feb 2, 2023
In `code/did.py`, the process for DID-deletion was extended (both for `temp_tables` and without). This was necessary because DIDs without rules still require their replicas to be epoch-tombstoned (without the `delete_rule`-process to handle that). Took extra care not to add any SQL calls into the loop over all DIDs. === In `core/rule.py`, the case that no locks exist for a given rule was handled. Since this would mean that the function `__delete_lock_and_update_replica` (which sets the epoch tombstone) would not be called, the epoch tombstone had to be set manually by updating the Replica. The Replica is updated by joining the ReplicationRule table to the DataIdentifierAssociation table to the RSEFileAssociation table via the scopenames. We select the RSEFileAssociation for which exists a rule that indirectly protects it (rule assigned to dataset which is the parent to a file which has a replica). We constrain this by asserting that the rule needs to match a given rule_id, the replica must be saved on an RSE matched by the rule's RSE Expression and the replica must have a lock_cnt of 0. The structure of the SQLAlchemy Query has been taken from other places in the code. Choices such as using `execution_options(synchronize_session=False)` stem from reusing pre-existing code.
ThePhisch
added a commit
to ThePhisch/rucio
that referenced
this issue
Feb 8, 2023
Added tests to cover more of the undertaker functionality. This was triggered by issue rucio#5154. `test_file_did_deletion` tests how the undertaker handles a file DID that is not attached to any dataset. It asserts that the epoch tombstone is set on the file replica, regardless of whether rules exist on the file DID (parametrised via a pytest fixture). Furthermore, the file DID must not be deleted, as this is the Reaper's job. `test_file_dids_in_dataset` tests how the undertaker handles an expired file DID that belongs to a non-expired dataset among with other files. It asserts that the expired file is detached and its replicas receive the epoch tombstone. Furthermore, the other files are unchanged. `test_file_protected_by_rule` sets up a system in which two files belong to an expired dataset. At the same time, one of these files is protected by a rule. The expected outcome is that the unprotected file DID is detached from the dataset and its replica is tombstoned. On the other hand, the protected file DID is detached from the dataset, and its replicas protected by the dataset rule are epoch-tombstoned while its replicas protected by the external rule remain unchanged. Finally, it is asserted that the file DIDs remain whereas the dataset DID is deleted. `test_file_protected_by_dataset` has a similar setup to `test_file_protected_by_rule`, but instead of a rule acting directly on the protected file, the protected file is shared between an expired DID and a non-expired DID. We expect the unprotected file DID to be detached and its replica to receive the epoch tombstone. Furthermore, we expect the protected file DID to be detached from the expired dataset but remain attached to the non-expired dataset; with the protected replica remaining untouched and the unprotected replica being epoch-tombstoned. As with the pre-existing `test_removal_all_replicas2`, the tests have been parametrised to test both `use_temp_tables: True` and `False`. The pre-existing test `TestUndertaker::test_undertaker` was changed to include an assertion that the expired dataset DIDs were deleted by the undertaker. In addition, we assert that the replicas have the epoch-tombstone (marked by `rucio.db.sqla.constants.OBSOLETE`), instead of asserting `is not None` (which is more generic and could also include non-epoch-tombstones).
ThePhisch
added a commit
to ThePhisch/rucio
that referenced
this issue
Feb 8, 2023
In `code/did.py`, the process for DID-deletion was extended (both for `temp_tables` and without). This was necessary because DIDs without rules still require their replicas to be epoch-tombstoned (without the `delete_rule`-process to handle that). Took extra care not to add any SQL calls into the loop over all DIDs. === In `core/rule.py`, the case that no locks exist for a given rule was handled. Since this would mean that the function `__delete_lock_and_update_replica` (which sets the epoch tombstone) would not be called, the epoch tombstone had to be set manually by updating the Replica. The Replica is updated by joining the ReplicationRule table to the DataIdentifierAssociation table to the RSEFileAssociation table via the scopenames. We select the RSEFileAssociation for which exists a rule that indirectly protects it (rule assigned to dataset which is the parent to a file which has a replica). We constrain this by asserting that the rule needs to match a given rule_id, the replica must be saved on an RSE matched by the rule's RSE Expression and the replica must have a lock_cnt of 0. The structure of the SQLAlchemy Query has been taken from other places in the code. Choices such as using `execution_options(synchronize_session=False)` stem from reusing pre-existing code.
Related old PR for reference #6063 |
rdimaio
added a commit
to rdimaio/rucio
that referenced
this issue
Mar 5, 2024
rdimaio
added a commit
to rdimaio/rucio
that referenced
this issue
Mar 5, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The ICARUS Rucio deployment has been attempting to delete a DID for an individual file. However, the Undertaker fails to remove the DID, since (as per Martin) it was originally designed only to be used on datasets. It would be useful to allow the Undertaker to remove file DIDs also.
Thanks!
The text was updated successfully, but these errors were encountered: