Accessing ATLAS data from EOS/Tier2 through UChicago AF #507

Open
MoAly98 opened this issue Nov 25, 2022 · 6 comments

Comments

@MoAly98

MoAly98 commented Nov 25, 2022

Hello!

I have some files on EOS that are accessible only through custom permissions within ATLAS. With help from @oshadura I managed to open the file with uproot.open(), but I still cannot access it through the uproot ServiceX backend. My setup is:

from func_adl_servicex import ServiceXSourceUpROOT
eos_file    = "root://eosatlas.cern.ch//eos/atlas/atlascerngroupdisk/phys-higgs/HSG8/tH_v34_minintuples_v0/mc16a_nom/410470_AFII_user.vvecchio.27456668._000001.output.root"
ds = ServiceXSourceUpROOT([eos_file], treename="nominal_Loose")
data = (ds
    .Select("lambda e: {'lep_pt': e.leptons_pt,'lep_eta': e.leptons_eta,}")
    .AsAwkwardArray().value()
)

I end up with

Row: 26; Column: 4
Failed to transform input file root://eosatlas.cern.ch//eos/atlas/atlascerngroupdisk/phys-higgs/HSG8/tH_v34_minintuples_v0/mc16a_nom/410470_AFII_user.vvecchio.27456668._000001.output.root: file not found ([ERROR] Server responded with an error: [3010] Unable to give access - user access restricted - unauthorized identity used ; Permission denied ) 

I am not explicitly running any distributed code, so I believe this should have worked unless there is a permission issue with the proxy being used to access the file.

Are there any further checks I should do to understand/resolve this?
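
One quick check, sketched here rather than taken from the thread: the transformer reports the failure as "file not found", but the bracketed number in the message is the XRootD server error code. Code 3010 is the protocol's "not authorized" code, whereas a genuinely missing file would report 3011. The helper name `classify_xrootd_error` and the two-entry lookup table are illustrative assumptions, not part of ServiceX or uproot.

```python
import re

# Small subset of XRootD server error codes (kXR_* values from the protocol).
XROOTD_ERRORS = {
    3010: "kXR_NotAuthorized",
    3011: "kXR_NotFound",
}

def classify_xrootd_error(message: str) -> str:
    """Return the kXR_* name for the first '[NNNN]' code in an error message."""
    match = re.search(r"\[(\d{4})\]", message)
    if match is None:
        return "unknown"
    return XROOTD_ERRORS.get(int(match.group(1)), "unknown")

# Error text copied from the failure reported above.
msg = (
    "file not found ([ERROR] Server responded with an error: [3010] "
    "Unable to give access - user access restricted - "
    "unauthorized identity used ; Permission denied )"
)
print(classify_xrootd_error(msg))
```

If this prints `kXR_NotAuthorized`, the file was located and the request was rejected for the identity used, which points at the credential rather than the path.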

@gordonwatts
Collaborator

Ok - I think I understand the problem. Let me try to rephrase and let me know if I've understood correctly:

  • The files are not accessible to the general ATLAS user (e.g. I could not access them on EOS)
  • ServiceX proxies all accesses to data through a single cert, generated from a particular single user (almost certainly not a special person with super-user abilities)
  • Access fails because the general security cert does not have permissions to access EOS.

Possible Workarounds

What could be done right now without changes to ServiceX?

  1. Make the files accessible to any member of the ATLAS collaboration. This should make them accessible to ServiceX.

Modifications to ServiceX

@BenGalewsky probably can point out a specific story

  • Forward user tokens with the ServiceX query so they can be used for access.

@BenGalewsky
Contributor

BenGalewsky commented Dec 21, 2022

One variation on (1): ServiceX uses a captive service account to access ATLAS resources. We could add the owner of that account as a member of the private ATLAS group. Every user of that ServiceX instance would then have access to the files.

Maybe another pattern would be to deploy a private ServiceX at the AF that uses the account of a member of the private group as the service account.

But yes, passing tokens all the way through ServiceX is a major (and ultimately necessary) change. See #321 for a skeletal story.

@MoAly98
Author

MoAly98 commented Dec 23, 2022

Hi @BenGalewsky and @gordonwatts -- Thanks a lot for your replies :) You've got the story right, thanks a lot for summarising!

I'm sure you know how painful it would be to try and convince conveners to give the entire collaboration full access to a group disk in ATLAS, but I can ask if this is possible. I think it could potentially be easier to ask for access for the service account, so I can suggest both solutions to the group conveners. Am I right to assume the account is associated with Ilija? Can you provide me with an account name that would need access?

@bbockelm

Hi!

I think we can reasonably request access for a service account on a one-by-one basis. It might be difficult for a personal account, however.

For token-based access, I've put in a request for the ATLAS EOS folks to have this enabled.

Brian

@vokac

vokac commented Dec 27, 2022

This atlascerngroupdisk is a "non-Grid" (non-Rucio) storage area for local groups, and permissions are usually set to allow reading for all ATLAS users and writing for a specific group. I can read the file mentioned above with just a normal ATLAS user account and X.509 proxy (I'm not a member of the atlas-eos-access-phys-higgs e-group).

It is possible to check the permissions with

[vokac@lxplus.cern.ch]~% eos ls -l /eos/atlas/atlascerngroupdisk/ | grep phys-higgs
drwxr-x--+   1 root     zp       177290217647715 Sep  8 19:17 phys-higgs
[vokac@lxplus.cern.ch]~% eos acl -l /eos/atlas/atlascerngroupdisk/phys-higgs        
egroup:atlas-eos-access-phys-higgs:rwx

(For the full picture it is also necessary to understand identity mapping, e.g. the ATLAS EOS grid-mapfile, but that's basically the same for all ATLAS users.)
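
As a reading aid for the ACL output above, here is a small sketch that splits entries of the form `kind:name:perms` and lists which names carry read permission. It only handles text shaped like the line shown in this comment. The function names `parse_acl` and `readers` are hypothetical helpers, splitting on commas as well as whitespace is an assumption about how several entries would be listed, and identity mapping is not modelled at all.

```python
import re

def parse_acl(text: str) -> list:
    """Parse entries like 'egroup:NAME:rwx' into (kind, name, perms) tuples."""
    entries = []
    # Assumption: entries are separated by commas and/or whitespace.
    for item in re.split(r"[,\s]+", text.strip()):
        if not item:
            continue
        kind, name, perms = item.split(":", 2)
        entries.append((kind, name, perms))
    return entries

def readers(entries: list) -> list:
    """Return the names of all ACL entries whose permissions include 'r'."""
    return [name for _kind, name, perms in entries if "r" in perms]

# ACL line copied from the `eos acl -l` output above.
acl = parse_acl("egroup:atlas-eos-access-phys-higgs:rwx")
print(acl)
print(readers(acl))
```

The ACL alone does not decide access: the identity that the storage maps a proxy or token to also matters, as noted above.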

Because atlascerngroupdisk is not space managed by Rucio, its directories and files can have arbitrary permissions. Technically we can define e.g. an IAM policy that allows any ATLAS user to get a token with read privileges for /eos/atlas/atlascerngroupdisk/phys-higgs (storage.read:/atlascerngroupdisk/phys-higgs/ scope), but for writing it would be necessary to synchronize all atlas-eos-access-* groups in the ATLAS IAM or do some fancy path-based token identity mapping on the EOS ATLAS side.

Also, this simple model with tokens would require quite a lot of knowledge on the user side (e.g. a user will be able to get a token with storage.read:/atlascerngroupdisk/phys-higgs/, but not with storage.read:/ or storage.read:/atlascerngroupdisk/more-restricted-group-access/). This may be acceptable for R&D projects, but for production ServiceX may participate in token exchange flows that could hide the complexity of the token content.
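
To illustrate the path-scoped tokens discussed here, the following is a minimal sketch of how a `storage.read:$PATH` scope could be matched against a requested file. It is a simplified reading of WLCG-style path scopes, not EOS or IAM code. The function `scope_allows` is hypothetical, the scope path is assumed to be relative to the storage root (taken to be `/eos/atlas`, inferred from the example scope above), and `HSG8/file.root` and `x.root` are made-up file names.

```python
from posixpath import normpath

def scope_allows(scope: str, operation: str, path: str) -> bool:
    """Return True if a scope like 'storage.read:/a/b/' covers `operation` on `path`.

    `path` is relative to the storage root, e.g. '/atlascerngroupdisk/...'.
    """
    name, sep, scope_path = scope.partition(":")
    if not sep or name != f"storage.{operation}":
        return False
    granted = normpath(scope_path)    # drops the trailing slash, resolves '..'
    requested = normpath(path)
    if granted == "/":
        return True                   # a scope on the root covers every path
    # Match the directory itself or anything below it, never a sibling prefix.
    return requested == granted or requested.startswith(granted + "/")

scope = "storage.read:/atlascerngroupdisk/phys-higgs/"
print(scope_allows(scope, "read", "/atlascerngroupdisk/phys-higgs/HSG8/file.root"))
print(scope_allows(scope, "read", "/atlascerngroupdisk/more-restricted-group-access/x.root"))
print(scope_allows(scope, "modify", "/atlascerngroupdisk/phys-higgs/HSG8/file.root"))
```

Only the first call is accepted in this sketch. That is the knowledge burden described above: the user has to request exactly the scope that covers the files they want.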

@vokac

vokac commented Dec 28, 2022

Anyway, I think that the EOS-5460 issue needs to be resolved first for xroot access with tokens, and meanwhile I would like to discuss with the EOS team the token configuration for storage areas managed by Rucio. We should still start with the EOS testbed, because this instance has not yet been configured in a way that passes the WLCG compliance tests. To be honest, I have also not yet tested IAM scope policies. @bbockelm, has CMS already configured / used IAM with scope policies for storage.*:$PATH, so that we don't have to worry about unexpected / undocumented security features that come from this IAM configuration?

I mean, it may take some time before we are ready to use tokens for EOS ATLAS, but I would like to have something ready in Q1 2023.
