Scope of storage capabilities is ambiguous for SEs with tape storage #26

Open
slithy opened this issue Jun 15, 2023 · 6 comments

slithy commented Jun 15, 2023

Scope of storage.read and storage.stage

As discussed at the WLCG DOMA BDT meeting (April 2023), CHEP (May 2023) and ATLAS S&C week (June 2023):

In the common-jwt-profile document, the scope of the storage.read and storage.stage capabilities should be amended:

  • The document says that storage.read only applies to data on disk, implying that storage.stage provides both stage and read capabilities for data on tape. However, discussion at the DOMA meeting above concluded that stage and read are separate capabilities which can be authorised independently.
  • I also understand that dCache has implemented stage and read as separate capabilities (correct me if I am wrong).
  • At CHEP, I spoke to the StoRM team (see their poster about tape REST API + token authorisation). They also told me that they had implemented stage and read as separate capabilities.
  • Usually stage and read requests are separated in time. Staging a file can take hours or days; reading it comes afterwards.
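
To make the "separate capabilities" reading concrete, here is a minimal sketch of how a storage element might evaluate a token's scope claim if storage.read and storage.stage are authorised independently. The `capability:/path` scope form is the one used by the profile; the function names (`parse_scopes`, `is_authorised`), the example paths, and the choice to ignore scopes that carry no path are assumptions for illustration only, not taken from dCache, StoRM or EOS.

```python
from posixpath import normpath


def parse_scopes(scope_claim: str) -> list[tuple[str, str]]:
    """Split a space-separated scope claim into (capability, path) pairs."""
    pairs = []
    for item in scope_claim.split():
        capability, _, path = item.partition(":")
        # Assumption: only storage.* scopes with an explicit path are considered.
        if capability.startswith("storage.") and path:
            pairs.append((capability, normpath(path)))
    return pairs


def is_authorised(scope_claim: str, capability: str, target: str) -> bool:
    """True if a scope grants exactly this capability on target or one of its parents."""
    target = normpath(target)
    for cap, base in parse_scopes(scope_claim):
        if cap != capability:
            # Independent capabilities: storage.stage does not imply storage.read,
            # and storage.read does not imply storage.stage.
            continue
        if target == base or target.startswith(base.rstrip("/") + "/"):
            return True
    return False
```

With a claim such as `storage.stage:/atlas/tape storage.read:/atlas/disk`, a stage request under `/atlas/tape` would be accepted while a read of the same path would not, which is the behaviour the DOMA discussion converged on.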

Related questions

  • Does storage.stage grant permission to abort a stage request?
  • Does storage.stage grant permission to evict a staged file from the buffer?
  • Similarly, what about capabilities for pinning/unpinning files in the buffer? (EOSCTA does not support this, but dCache does).
  • At least for EOS, the scope of permissions that can be set in the namespace is more fine-grained than the proposed capabilities. As well as rwx and p ("prepare" or stage permission), there are also the "forbid change", "forbid update" and "forbid deletion" ACL permissions. The current set of token capabilities does not allow the same fine-grained control.
  • At the DOMA meeting there was some discussion about whether claims in a token can override ACLs in the namespace. In particular, if a directory is set to prohibit deletion, should a token be able to override this? SE managers seem uneasy with this idea.
  • What happens if there is a bulk request where some files are authorised and others not? Does the entire request fail?

slithy commented Jun 15, 2023

Pull request to address the first point above: #27

abh3 commented Jun 15, 2023

Some additional ambiguity. Say you have a system that may trigger a stage when a client attempts a read. However, the client does not have "stage" as a claim. What happens next is ambiguous. If you want a transparent system then read implies stage. However, if you want to prevent clients from staging files simply because they want to read them then you really want them to have a stage claim.
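
The two readings can be written down as a single site-level switch. This is only a sketch of the decision, assuming a hypothetical `read_implies_stage` setting and a hypothetical `decide_read` helper; neither exists in the profile or in any storage system I know of.

```python
from enum import Enum


class Decision(Enum):
    READ = "serve from disk"
    STAGE_THEN_READ = "trigger stage, then serve"
    DENY = "deny"


def decide_read(capabilities: set[str], on_disk: bool, read_implies_stage: bool) -> Decision:
    """Decide what a read request does, given the token's capabilities for the path."""
    if "storage.read" not in capabilities:
        return Decision.DENY
    if on_disk:
        return Decision.READ
    # File is only on tape. In the "transparent" reading, read alone is enough
    # to trigger the recall; in the strict reading a stage claim is also needed.
    if read_implies_stage or "storage.stage" in capabilities:
        return Decision.STAGE_THEN_READ
    return Decision.DENY
```

The ambiguity is exactly the case `decide_read({"storage.read"}, on_disk=False, ...)`: the outcome depends on a setting the token cannot see.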

As for stage followed by abort and evict: it would seem reasonable that if the client staged the file, the client should also be able to abort the stage and evict the file. However, that is not clear when you consider the transparency point raised above.

Pin and unpin certainly should be separate from "stage", as they represent additional resource usage. However, as above, if a client has pin privileges, does unpin apply only to files the client pinned, or to all files?

I am in favor of letting a site implement restrictions that are more severe than those in a token, with the site's policy overriding the token's claims. Not doing so essentially says a site has surrendered complete control to the token issuer. I doubt many sites would accept that.
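
Put as code, this position treats the token as necessary but not sufficient. A rough sketch, assuming a hypothetical per-path deny table kept by the site (the rule format, the operation names and the `site_forbids` / `authorise` helpers are all made up for illustration):

```python
from posixpath import normpath


def site_forbids(deny_rules: dict[str, set[str]], operation: str, path: str) -> bool:
    """True if a site rule forbids the operation on path or on one of its parents."""
    path = normpath(path)
    for base, operations in deny_rules.items():
        base = normpath(base)
        under_base = path == base or path.startswith(base.rstrip("/") + "/")
        if under_base and operation in operations:
            return True
    return False


def authorise(token_grants: bool, deny_rules: dict[str, set[str]],
              operation: str, path: str) -> bool:
    """Allow only if the token grants the operation and no site rule forbids it."""
    return token_grants and not site_forbids(deny_rules, operation, path)
```

For example, with `deny_rules = {"/atlas/raw": {"delete"}}`, a token that grants deletion under `/atlas` would still be refused for anything below `/atlas/raw`, while reads there are unaffected.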

As for bulk requests, I've seen it implemented in three ways: the two that you mention, and a third where the request fails on the first failure encountered, even when subsequent requests would succeed (i.e. partial failure). The reasoning is that recovery is much easier in the third scenario.
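
A small sketch of the three behaviours, to make the difference visible. The mode names (`all_or_nothing`, `per_file`, `fail_fast`) and the `bulk_authorise` helper are hypothetical labels, and reading the first two modes as "reject everything" versus "handle each file on its own" is my interpretation of the question rather than anything the profile defines.

```python
from typing import Callable


def bulk_authorise(paths: list[str], allowed: Callable[[str], bool],
                   mode: str) -> tuple[list[str], list[str]]:
    """Return (accepted, not_processed) paths under the chosen strategy."""
    if mode == "all_or_nothing":
        # One unauthorised file rejects the whole request.
        if all(allowed(p) for p in paths):
            return list(paths), []
        return [], list(paths)
    if mode == "per_file":
        # Each file is judged on its own; authorised files go ahead.
        accepted = [p for p in paths if allowed(p)]
        rejected = [p for p in paths if not allowed(p)]
        return accepted, rejected
    if mode == "fail_fast":
        # Stop at the first unauthorised file; everything from that point on
        # is left unprocessed, even files that would have been authorised.
        accepted = []
        for index, p in enumerate(paths):
            if not allowed(p):
                return accepted, list(paths[index:])
            accepted.append(p)
        return accepted, []
    raise ValueError(f"unknown mode: {mode}")
```

In the fail-fast case the client knows that everything before the failing entry succeeded and everything from it onward was untouched, which is presumably why recovery is simpler.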

@paulmillar
Contributor

> I also understand that dCache has implemented stage and read as separate capabilities (correct me if I am wrong).

dCache currently has only partial support for storage.stage. It treats storage.stage as a synonym for storage.read (as per the spec) but storage.stage does not authorise staging of that file. Instead, the existing stage authorisation processes are enforced.

@paulmillar
Contributor

> At the DOMA meeting there was some discussion about whether claims in a token can override ACLs in the namespace. In particular, if a directory is set to prohibit deletion, should a token be able to override this? SE managers seem uneasy with this idea.

I find this comment rather strange.

As I understand it, the point of explicit AuthZ is to delegate AuthZ decisions (for some subtree within the namespace) to the VO. If the token says the bearer is authorised to delete a particular file then the storage system should honour that statement and delete the file when so requested.

Having the storage system overriding the VO's AuthZ decision dilutes the benefits from adopting explicit AuthZ.

@beer4duke

> Having the storage system overriding the VO's AuthZ decision dilutes the benefits from adopting explicit AuthZ.

Experiments define some SLAs directly with each storage endpoint, for example: never allow RAW data deletion at T0.

Storage endpoints are ultimately responsible for the integrity of the data they host. Letting the VO's AuthZ decision override a storage endpoint's SLAs takes that responsibility away from the endpoint for all of its data.

But in the case of a deletion incident, we all know that the storage endpoint will be blamed as usual and will have to spend expensive operations time restoring as much as it can.

I would not call this diluting the decision, but rather a mutually beneficial safety net.

@paulmillar
Contributor

I think this is an important point, and one that should be clarified and stated explicitly.

A specific example scenario would be:

If a site is under some kind of commitment (SLA/MoU) to never delete certain data and a request comes in to delete said data, with a token (from the VO) that authorises that operation, what is the correct behaviour of the storage?

I think this generalises naturally to a broader question: are sites under any kind of MoU or SLA that could be in conflict with that site supporting tokens with explicit AuthZ statements?
