iRODS remote functionality - issues with "zone" #1510

Closed
ScottMastro opened this issue Mar 23, 2022 · 1 comment

ScottMastro commented Mar 23, 2022

Hello! Snakemake is quickly becoming one of my favourite tools, and I was pleasantly surprised to find that it supports iRODS. However, I have run into two issues, which I debugged myself and want to share here for the benefit of other users. I describe the two issues separately, but they are related.

Snakemake version
7.3.0 (as far as I can tell, these issues still exist on the main branch)

Issue 1. glob_wildcards requires anti-pattern to work properly

Describe the bug
https://snakemake.readthedocs.io/en/stable/snakefiles/remote_files.html#irods
The example provided in the documentation has the following code:

from snakemake.remote.iRODS import RemoteProvider

irods = RemoteProvider(irods_env_file='setup-data/irods_environment.json',
                       timezone="Europe/Berlin") # all parameters are optional

# please note the comma after the variable name!
# access: irods.remote(expand('home/rods/{f}', f=files))
files, = irods.glob_wildcards('home/rods/{files}')

The last line throws irods.exception.CollectionDoesNotExist. The cause is that the iRODS "zone" is not included in the path when glob_wildcards runs. When I manually add the zone as a prefix to the path, it runs fine (i.e., files, = irods.glob_wildcards('/zone/home/rods/{files}')). However, this does not seem to be the intended usage, as the documentation states:

Please note that the zone folder is not included in the path as it will be taken from the configuration file. The path also must not start with a /.

Proposed solution
The fix is to add the zone as a prefix automatically, as the documentation suggests should happen. The glob_wildcards function is defined in snakemake/remote/__init__.py. The RemoteProvider class in snakemake/remote/iRODS.py can override it as follows:

def glob_wildcards(self, pattern, *args, **kwargs):
    remote_pattern = os.path.join(os.sep, self._irods_session.zone, pattern)
    return super().glob_wildcards(remote_pattern, *args, **kwargs)

This prepends the zone to the pattern before calling the parent implementation. This has resolved the issue for me.
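
To show the effect on the pattern without needing an iRODS connection, here is a minimal standalone sketch. The helper name add_zone_prefix and the zone name tempZone are made up for illustration and are not part of Snakemake. It uses posixpath rather than os.path, on the assumption that iRODS logical paths always use forward slashes regardless of the local operating system.

```python
import posixpath


def add_zone_prefix(zone, pattern):
    """Return an absolute iRODS pattern of the form /<zone>/<pattern>.

    Hypothetical helper for illustration; not part of Snakemake.
    """
    # posixpath.join keeps "/" as the separator on every platform
    return posixpath.join("/", zone, pattern)


# "tempZone" is an assumed zone name
print(add_zone_prefix("tempZone", "home/rods/{files}"))
```

Note that if the pattern itself started with a /, the join would drop the zone prefix, which fits the documentation's requirement that the path must not start with a /.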


Issue 2. Uploading functionality is broken for users without access to complete directory structure

Describe the bug
This issue lies in the RemoteObject class of snakemake/remote/iRODS.py:

    def _upload(self):
        # get current local timestamp
        stat = os.stat(self.local_path)

        # create folder structure on remote
        folders = os.path.dirname(self.remote_path).split(os.sep)[1:]
        collpath = os.sep

        for folder in folders:
            collpath = os.path.join(collpath, folder)

            try:
                self._irods_session.collections.get(collpath)
            except:
                self._irods_session.collections.create(collpath)

The function splits up the directory structure, such that /zone/home/rods becomes ["zone", "home", "rods"]:
folders = os.path.dirname(self.remote_path).split(os.sep)[1:]
Then, for each of these folders, it tries to either "get" it or "create" it. That is, it will try to access /zone, /zone/home and /zone/home/rods in turn. This throws the iRODS exception CAT_NO_ACCESS_PERMISSION and fails if any of these are inaccessible to the user. In my case, I only have access to the subdirectory /zone/home/rods, which means all uploading fails for me on the first iteration of the "for loop". I imagine that having limited access to the "root" /zone directory is not uncommon.
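
The order in which collections are touched can be reproduced with a small standalone sketch. The function name visited_collections and the example path are assumptions for illustration; the loop body mirrors the path handling of _upload but only records the paths instead of contacting iRODS.

```python
import posixpath


def visited_collections(remote_path):
    """List the collection paths the loop would check, in order."""
    # same split as in _upload, written with an explicit "/" separator
    folders = posixpath.dirname(remote_path).split("/")[1:]
    collpath = "/"
    visited = []
    for folder in folders:
        collpath = posixpath.join(collpath, folder)
        visited.append(collpath)
    return visited


# hypothetical remote path of an output file
for path in visited_collections("/zone/home/rods/results/out.txt"):
    print(path)
```

The first path in the list is /zone, so a user without access to the zone root hits the permission error before any collection they own is reached.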

Proposed solution
To fix this for myself, I skip the first two path components (the zone and the folder directly below it) and check whether each remaining collection is inaccessible before attempting to get or create it.
Import the iRODS access exception:
from irods.exception import CollectionDoesNotExist, DataObjectDoesNotExist, CAT_NO_ACCESS_PERMISSION
Skip over directory parts that are not accessible:

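    def denied_access(self, collpath):
        # True only if the user lacks permission to read this collection
        try:
            self._irods_session.collections.get(collpath)
        except CAT_NO_ACCESS_PERMISSION:
            return True
        except CollectionDoesNotExist:
            # a missing collection is not an access problem; _upload creates it
            pass
        return False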

    def _upload(self):
        # get current local timestamp
        stat = os.stat(self.local_path)

        # create folder structure on remote
        folders = os.path.dirname(self.remote_path).split(os.sep)[1:]
        # start below the first two components (e.g. /zone/home), which
        # regular users often cannot access; assumes at least two components
        collpath = os.sep + folders.pop(0) + os.sep + folders.pop(0)

        for folder in folders:
            collpath = os.path.join(collpath, folder)
            if not self.denied_access(collpath):
                try:
                    self._irods_session.collections.get(collpath)
                except CollectionDoesNotExist:
                    self._irods_session.collections.create(collpath)
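
The skip-or-create idea can be tried out without an iRODS server using the sketch below. Everything in it is a stand-in for illustration: NoAccess and Missing play the roles of CAT_NO_ACCESS_PERMISSION and CollectionDoesNotExist, FakeCollections imitates the get/create calls of the session's collection manager, and ensure_collections is a hypothetical helper. It is a variation on the fix above (it checks every level rather than skipping the first two), not the code from Snakemake.

```python
import posixpath


class NoAccess(Exception):
    """Stand-in for irods.exception.CAT_NO_ACCESS_PERMISSION."""


class Missing(Exception):
    """Stand-in for irods.exception.CollectionDoesNotExist."""


class FakeCollections:
    """In-memory imitation of a collection manager (hypothetical)."""

    def __init__(self, existing, denied):
        self.existing = set(existing)
        self.denied = set(denied)
        self.created = []

    def get(self, path):
        if path in self.denied:
            raise NoAccess(path)
        if path not in self.existing:
            raise Missing(path)
        return path

    def create(self, path):
        if path in self.denied:
            raise NoAccess(path)
        self.existing.add(path)
        self.created.append(path)


def ensure_collections(collections, remote_path):
    """Create missing parent collections, skipping inaccessible ones."""
    collpath = "/"
    for folder in posixpath.dirname(remote_path).split("/")[1:]:
        collpath = posixpath.join(collpath, folder)
        try:
            collections.get(collpath)
        except NoAccess:
            continue  # not readable by this user; move on to the next level
        except Missing:
            collections.create(collpath)


# the user can only see /zone/home/rods and below
colls = FakeCollections(
    existing={"/zone/home/rods"},
    denied={"/zone", "/zone/home"},
)
ensure_collections(colls, "/zone/home/rods/results/run1/out.txt")
print(colls.created)
```

In this simulation only the two collections below /zone/home/rods are created, and the inaccessible levels above are passed over without raising.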

ScottMastro added the bug label on Mar 23, 2022

johanneskoester commented Mar 31, 2022

I see, good catch! Could you create a PR with your fix?

johanneskoester self-assigned this on Mar 31, 2022
ScottMastro added a commit to ScottMastro/snakemake that referenced this issue Apr 27, 2022
ScottMastro added a commit to ScottMastro/snakemake that referenced this issue Apr 27, 2022
ScottMastro added a commit to ScottMastro/snakemake that referenced this issue May 2, 2022
Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
johanneskoester added a commit that referenced this issue May 16, 2022
* fix: iRODS functionality - issue #1510

* fix: iRODS functionality - issue #1510

Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>

* fmt with black

* iRODS correctly handles subdirectories  

allows iRODS _upload function to either create (if missing) or ignore (if user has no access) a subdirectory

Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>