FOI: custom freeze/thaw'ing helpers #6497

Closed
yarikoptic opened this issue Feb 23, 2022 · 4 comments

@yarikoptic
Member

yarikoptic commented Feb 23, 2022

I decided to test the custom freeze/thaw hooks, since https://git-annex.branchable.com/todo/lockdown_hooks/ says they are already there, and messing with adjusted branches is ... messy, especially when you are presumably on a performant file system on HPC and do not want to duplicate content, etc.

Here is the reference for my initial attempt and the hurdle I hit: https://git-annex.branchable.com/bugs/can__39__t_make_annex_happy_in_freeze__47__thaw/ . But I still hope to see the light at the end of the tunnel there.

The overall problem with that solution is that it would need to be configured per user (every user??) or system-wide (on every system in the HPC??), and it would also need to sense, per file system, whether it is actually needed (some mount points might be pure POSIX and some not). So I still don't know the best/cleanest UX to provide on those systems, but first I want to make sure that the freeze/thaw shim workaround works. At worst, it could just be some "post-clone" procedure to run to configure the dataset right after git clone but before git annex init. @mih @bpoldrack -- do we have a way to hook in at that point?

edit1: I went back to the more fragile "edit ACL" approach and it seems to work -- https://git-annex.branchable.com/bugs/can__39__t_make_annex_happy_in_freeze__47__thaw/#comment-02e2c8a0e65109d3fdc6bff220b4815d . So the question now is how to "bolt" it onto DataLad's clone/create etc!?
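
The "post-clone" procedure mentioned above could look roughly like the sketch below. It only wraps the two `git config` calls for `annex.freezecontent-command` and `annex.thawcontent-command`. The function name `configure_freeze_thaw` and the helper file names `freeze-content` and `thaw-content` are assumptions made for illustration, not anything DataLad provides.

```shell
#!/bin/bash
# Sketch only: point a fresh clone at custom freeze/thaw helpers.
# Intended to run after `git clone` and before `git annex init`.
#   $1: path to the cloned repository
#   $2: directory holding the helper scripts (assumed to be named
#       freeze-content and thaw-content)
configure_freeze_thaw() {
    local repo="$1" helpers="$2"
    # %path is substituted by git-annex with the file or directory to act on
    git -C "$repo" config annex.freezecontent-command "$helpers/freeze-content %path"
    git -C "$repo" config annex.thawcontent-command "$helpers/thaw-content %path"
}
```

It would be called as `configure_freeze_thaw ./ds000001 "$HOME/bin-annex"`. This still leaves open who calls it and when, which is the actual question here.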

@mih
Member

mih commented Feb 24, 2022

Here is the issue with the current state of things:

@yarikoptic
Member Author

For posterity

protocol of successful clone/get/drop/removal
(git-annex) [d31548v@discovery7 tmp]$ git clone https://datasets.datalad.org/openneuro/ds000001/.git
Cloning into 'ds000001'...
remote: Enumerating objects: 3326, done.
remote: Counting objects: 100% (3326/3326), done.
remote: Compressing objects: 100% (933/933), done.
remote: Total 3326 (delta 1511), reused 3280 (delta 1489)
Receiving objects: 100% (3326/3326), 327.09 KiB | 1.02 MiB/s, done.
Resolving deltas: 100% (1511/1511), done.
Updating files: 100% (136/136), done.
(git-annex) [d31548v@discovery7 tmp]$ cd ds000001/
(git-annex) [d31548v@discovery7 ds000001]$ git config --add annex.thawcontent-command "$HOME/bin-annex/thaw-content %path"
(git-annex) [d31548v@discovery7 ds000001]$ git config --add annex.freezecontent-command "$HOME/bin-annex/freeze-content %path"
(git-annex) [d31548v@discovery7 ds000001]$ datalad get -J4 sub-*/anat
get(ok): sub-13/anat/sub-13_inplaneT2.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-13/anat/sub-13_T1w.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-07/anat/sub-07_T1w.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-01/anat/sub-01_T1w.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-01/anat/sub-01_inplaneT2.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-07/anat/sub-07_inplaneT2.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-05/anat/sub-05_T1w.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-06/anat/sub-06_inplaneT2.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-16/anat/sub-16_inplaneT2.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-09/anat/sub-09_T1w.nii.gz (file) [from s3-PUBLIC...]
  [22 similar messages have been suppressed; disable with datalad.ui.suppress-similar-results=off]
get(ok): sub-13/anat (directory)
get(ok): sub-07/anat (directory)
get(ok): sub-05/anat (directory)
get(ok): sub-01/anat (directory)
get(ok): sub-06/anat (directory)
get(ok): sub-10/anat (directory)
get(ok): sub-09/anat (directory)
get(ok): sub-16/anat (directory)
get(ok): sub-02/anat (directory)
get(ok): sub-04/anat (directory)
  [6 similar messages have been suppressed; disable with datalad.ui.suppress-similar-results=off]
action summary:
  get (ok: 48)
(git-annex) [d31548v@discovery7 ds000001]$ datalad drop sub-*/
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-01/anat/sub-01_T1w.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-01/anat/sub-01_inplaneT2.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-02/anat/sub-02_T1w.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-02/anat/sub-02_inplaneT2.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-03/anat/sub-03_T1w.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-03/anat/sub-03_inplaneT2.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-04/anat/sub-04_T1w.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-04/anat/sub-04_inplaneT2.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-05/anat/sub-05_T1w.nii.gz (file)
drop(ok): /dartfs-hpc/rc/lab/C/CANlab/labdata/data/tmp/ds000001/sub-05/anat/sub-05_inplaneT2.nii.gz (file)
  [22 similar messages have been suppressed; disable with datalad.ui.suppress-similar-results=off]
action summary:
  drop (notneeded: 96, ok: 32)
(git-annex) [d31548v@discovery7 ds000001]$ ls -ld .git/annex/objects/49/4j/MD5E-s5587170--8342da511e9cf3a2a813e1155e0df485.nii.gz
drwxrwx--- 2 d31548v rc-CANlab-admin 0 Feb 23 22:40 .git/annex/objects/49/4j/MD5E-s5587170--8342da511e9cf3a2a813e1155e0df485.nii.gz
(git-annex) [d31548v@discovery7 ds000001]$ nfs4_getfacl .git/annex/objects/49/4j/MD5E-s5587170--8342da511e9cf3a2a813e1155e0df485.nii.gz

# file: .git/annex/objects/49/4j/MD5E-s5587170--8342da511e9cf3a2a813e1155e0df485.nii.gz
A::OWNER@:rwadxtTnNcy
A:fdi:OWNER@:rwadxtTnNcy
A:fd:GROUP@:rwaDdxtTnNcCoy
A:fdg:rc-CANlab@KIEWIT.DARTMOUTH.EDU:rwadxtTnNcy
(git-annex) [d31548v@discovery7 ds000001]$ rmdir .git/annex/objects/49/4j/MD5E-s5587170--8342da511e9cf3a2a813e1155e0df485.nii.gz
(git-annex) [d31548v@discovery7 ds000001]$ cd ..
(git-annex) [d31548v@discovery7 tmp]$ rm -rf ds000001/
using these two "dirty" scripts
(git-annex) [d31548v@discovery7 tmp]$ cat ~/bin-annex/freeze-content
#!/bin/bash

set -eu

{
#echo "D: freezing [$@] while under `pwd`"
##ls -ld "$@"
# Cumbersome and buggy -- I found no other way to just say "remove write bits"
nfs4_getfacl "$@" | sed -e 's,:rwadxtTnN,:rxtn,g' | nfs4_setfacl -S- "$@"

# This should be first and thus preventing any changes to that path
# 'thaw'ing would just remove that rule
# nfs4_setfacl -a D::EVERYONE@:wadTN 1 -R "$@"
##echo "D: finished freezing ok. Current ACL:"
##nfs4_getfacl "$@"
##ls -ld "$@"
} >&2
(git-annex) [d31548v@discovery7 tmp]$ cat ~/bin-annex/thaw-content
#!/bin/bash

set -eu

{
#echo "thawing [$@] while under `pwd`"
if [ -e "$@" ] ; then
    nfs4_getfacl "$@" | sed -e 's,:rxtn,:rwadxtTnN,g' | nfs4_setfacl -S- -R "$@"
fi

#nfs4_setfacl -x D::EVERYONE@:wadTN -R "$@"
##echo "finished unlocking ok. ACL:"
##nfs4_getfacl "$@"
##ls -ld "$@"
} >&2
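
To see what the two `sed` expressions in these scripts do to an ACL listing without touching a real file system, here is a minimal sketch that isolates them as plain text filters. The function names `freeze_acl` and `thaw_acl` are hypothetical. The input is ACL text in the form printed by `nfs4_getfacl`.

```shell
#!/bin/bash
# Pure text transforms mirroring the sed expressions from the scripts above.
# Both read ACL entries on stdin and write the edited entries to stdout.

# Drop the write-ish permission letters from entries that start with
# exactly "rwadxtTnN".
freeze_acl() {
    sed -e 's,:rwadxtTnN,:rxtn,g'
}

# Reverse substitution: widen any entry starting with "rxtn" back to
# "rwadxtTnN".
thaw_acl() {
    sed -e 's,:rxtn,:rwadxtTnN,g'
}
```

Feeding in the entries from the `nfs4_getfacl` output above shows why the approach is fragile: `A::OWNER@:rwadxtTnNcy` becomes `A::OWNER@:rxtncy`, but `A:fd:GROUP@:rwaDdxtTnNcCoy` passes through unchanged because of the extra `D`, so that entry keeps its write bits. In the other direction, `thaw_acl` would also widen any entry that was `rxtn` before freezing.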

@yarikoptic
Member Author

yarikoptic commented Mar 1, 2022

During the weekly call @christian-monch suggested, and I nodded in agreement, that maybe it is time to add "path-specific" config settings, e.g.

[datalad "path.path-{uuid}"]
  path = /mnt/dartfs/
  pre-git-annex-init-hook = nfs4_acl

Such a section would define settings that apply only to datasets under that path. ConfigManager, when invoked for a particular dataset, should then consult those sections and add the matching settings. The override order might be tricky to define, though :-/
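
One possible answer to the ordering question is "the longest matching path prefix wins". The sketch below illustrates only that rule and is not a proposal for the actual ConfigManager logic. The function name `lookup_path_hook` and the tab-separated `prefix<TAB>hook` input format are assumptions. How the pairs would be read from the config is left open.

```shell
#!/bin/bash
# Hypothetical lookup: print the hook configured for the longest path prefix
# that contains the given dataset path.
#   $1:    dataset path
#   stdin: one "prefix<TAB>hook" pair per line
# Returns 1 (and prints nothing) if no prefix matches.
lookup_path_hook() {
    local target="${1%/}/" best_len=0 best_hook="" prefix hook
    while IFS=$'\t' read -r prefix hook; do
        # Normalize to exactly one trailing slash so that /mnt/dartfs
        # does not match /mnt/dartfs2/...
        prefix="${prefix%/}/"
        if [[ $target == "$prefix"* ]] && (( ${#prefix} > best_len )); then
            best_len=${#prefix}
            best_hook=$hook
        fi
    done
    if [[ -z $best_hook ]]; then
        return 1
    fi
    printf '%s\n' "$best_hook"
}
```

For example, `printf '/mnt/dartfs/\tnfs4_acl\n' | lookup_path_hook /mnt/dartfs/lab/ds` prints `nfs4_acl`, while a dataset under `/home` matches nothing.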

edit: or should we run this idea by @joeyh -- maybe he sees some cute way to set up such "path-specific" configuration handling in datalad or even in git-annex (since this really concerns git-annex).

@mih
Member

mih commented Mar 23, 2022

Closing this FOI now. The story continues in #6538.

@mih mih closed this as completed Mar 23, 2022