cwl failed to locate the output file if it has '+' in its name #1098

Closed
byb121 opened this issue Apr 8, 2019 · 3 comments


byb121 commented Apr 8, 2019

Expected Behavior

Copy and rename an input file.

Actual Behavior

cwltool fails to locate the output file.

Workflow Code

#!/usr/bin/env cwl-runner

class: CommandLineTool

cwlVersion: v1.0

requirements:
  InitialWorkDirRequirement:
    listing:
      - entry: $(inputs.srcfile)

inputs:
  srcfile:
    type: File
    inputBinding:
      position: 1
      shellQuote: true
  newname:
    type: string
    inputBinding:
      position: 2
      shellQuote: true
outputs:
  outfile:
    type: File
    outputBinding:
      glob: $(inputs.newname)

baseCommand: ["cp"]

Full Traceback

/usr/local/bin/cwltool 1.0.20181217162649
Resolved 'cp.cwl' to 'file:///home/yaobo/Downloads/cp.cwl'
[job cp.cwl] initializing from file:///home/yaobo/Downloads/cp.cwl
[job cp.cwl] {
    "srcfile": {
        "class": "File",
        "location": "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
        "size": 0,
        "basename": "colo-829-bl+12_i.fq.gz",
        "nameroot": "colo-829-bl+12_i.fq",
        "nameext": ".gz"
    },
    "newname": "colo-829-bl_i+.fastq.gz"
}
[job cp.cwl] path mappings is {
    "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz": [
        "/home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
        "/tmp/f2d53v_6/colo-829-bl+12_i.fq.gz",
        "File",
        false
    ]
}
[job cp.cwl] command line bindings is [
    {
        "position": [
            -1000000,
            0
        ],
        "datum": "cp"
    },
    {
        "position": [
            1,
            "srcfile"
        ],
        "shellQuote": true,
        "datum": {
            "class": "File",
            "location": "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
            "size": 0,
            "basename": "colo-829-bl+12_i.fq.gz",
            "nameroot": "colo-829-bl+12_i.fq",
            "nameext": ".gz",
            "path": "/tmp/f2d53v_6/colo-829-bl+12_i.fq.gz",
            "dirname": "/tmp/f2d53v_6"
        }
    },
    {
        "position": [
            2,
            "newname"
        ],
        "shellQuote": true,
        "datum": "colo-829-bl_i+.fastq.gz"
    }
]
[job cp.cwl] initial work dir {
    "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz": [
        "/home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
        "/tmp/f2d53v_6/colo-829-bl+12_i.fq.gz",
        "File",
        true
    ]
}
[job cp.cwl] /tmp/f2d53v_6$ cp \
    /tmp/f2d53v_6/colo-829-bl+12_i.fq.gz \
    colo-829-bl_i+.fastq.gz
Could not collect memory usage, job ended before monitoring began.
[job cp.cwl] Job error:
Error collecting output for parameter 'outfile':
cp.cwl:24:3: Traceback (most recent call last):
cp.cwl:24:3: 
cp.cwl:24:3:   File "/usr/local/lib/python3.6/dist-packages/cwltool/command_line_tool.py", line 612, in collect_output_ports
cp.cwl:24:3:     compute_checksum=compute_checksum)
cp.cwl:24:3: 
cp.cwl:24:3:   File "/usr/local/lib/python3.6/dist-packages/cwltool/command_line_tool.py", line 702, in collect_output
cp.cwl:24:3:     with fs_access.open(rfile["location"], "rb") as f:
cp.cwl:24:3: 
cp.cwl:24:3:   File "/usr/local/lib/python3.6/dist-packages/cwltool/stdfsaccess.py", line 41, in open
cp.cwl:24:3:     return open(self._abs(fn), mode)
cp.cwl:24:3: 
cp.cwl:24:3: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/f2d53v_6/colo-829-bl_i%2B.fastq.gz'
[job cp.cwl] completed permanentFail
[job cp.cwl] {}
[job cp.cwl] Removing input staging directory /tmp/tmpb80kmu_k
[job cp.cwl] Removing temporary directory /tmp/tmpidw7qb94
{}
Final process status is permanentFail

Your Environment

  • cwltool version:
    1.0.20181217162649
Member

tom-tan commented Apr 15, 2019

This happens because + is not allowed in the location field. Please escape it using URL encoding.

If URL encoding does not work with cwltool, that is a bug in cwltool; please open an issue for it!


Here are the details.

The spec of File object says that the location field is:

An IRI that identifies the file resource

RFC 3987, which defines the syntax of IRIs, says (see Section 2.1):

...the syntax and use of components and reserved characters is the same as that in [RFC3986].

... and RFC 3986 says:

If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

And + is defined as a delimiter (one of the sub-delims) for URIs and IRIs:

 reserved    = gen-delims / sub-delims

 gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

 sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
             / "*" / "+" / "," / ";" / "="
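
To make the escaping concrete, here is a minimal Python sketch of percent-encoding a file name with the standard library's urllib.parse. The helper names encode_basename and decode_basename are hypothetical, chosen for illustration; they are not part of cwltool.

```python
from urllib.parse import quote, unquote


def encode_basename(name: str) -> str:
    """Percent-encode a file basename (hypothetical helper).

    safe="" tells quote() to escape every reserved character,
    including sub-delims such as "+". Unreserved characters
    (letters, digits, "-", ".", "_", "~") are left untouched.
    """
    return quote(name, safe="")


def decode_basename(encoded: str) -> str:
    """Reverse encode_basename (hypothetical helper)."""
    # unquote() leaves a literal "+" alone; only unquote_plus()
    # would turn it into a space.
    return unquote(encoded)


encoded = encode_basename("colo-829-bl_i+.fastq.gz")
print(encoded)                   # colo-829-bl_i%2B.fastq.gz
print(decode_basename(encoded))  # colo-829-bl_i+.fastq.gz
```

The encoded form is what a location value would carry under this reading of the spec; the name on disk remains the decoded one.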

Author

byb121 commented Apr 15, 2019

Hmm, if + is not allowed in a file object, I'd expect cwltool to complain even when an input file name has +. As you can see from the traceback, this is not the case.

cp.cwl tries to rename a file, so if the new name contains a character that requires URL encoding, I'd expect cwltool to handle it internally or tell me that I'm trying something bad, rather than throwing a FileNotFoundError, which is not helpful.

I think this line of code handles some (if not all) of the char restrictions, and + is in it.

If I encode the + in newname, I get this:

/usr/local/bin/cwltool 1.0.20181217162649
Resolved 'cp.cwl' to 'file:///home/yaobo/Downloads/cp.cwl'
[job cp.cwl] initializing from file:///home/yaobo/Downloads/cp.cwl
[job cp.cwl] {
    "srcfile": {
        "class": "File",
        "location": "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
        "size": 0,
        "basename": "colo-829-bl+12_i.fq.gz",
        "nameroot": "colo-829-bl+12_i.fq",
        "nameext": ".gz"
    },
    "newname": "colo-829-bl_i%2B.fastq.gz"
}
[job cp.cwl] path mappings is {
    "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz": [
        "/home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
        "/tmp/4yjehwa4/colo-829-bl+12_i.fq.gz",
        "File",
        false
    ]
}
[job cp.cwl] command line bindings is [
    {
        "position": [
            -1000000,
            0
        ],
        "datum": "cp"
    },
    {
        "position": [
            1,
            "srcfile"
        ],
        "shellQuote": true,
        "datum": {
            "class": "File",
            "location": "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
            "size": 0,
            "basename": "colo-829-bl+12_i.fq.gz",
            "nameroot": "colo-829-bl+12_i.fq",
            "nameext": ".gz",
            "path": "/tmp/4yjehwa4/colo-829-bl+12_i.fq.gz",
            "dirname": "/tmp/4yjehwa4"
        }
    },
    {
        "position": [
            2,
            "newname"
        ],
        "shellQuote": true,
        "datum": "colo-829-bl_i%2B.fastq.gz"
    }
]
[job cp.cwl] initial work dir {
    "file:///home/yaobo/Downloads/colo-829-bl+12_i.fq.gz": [
        "/home/yaobo/Downloads/colo-829-bl+12_i.fq.gz",
        "/tmp/4yjehwa4/colo-829-bl+12_i.fq.gz",
        "File",
        true
    ]
}
[job cp.cwl] /tmp/4yjehwa4$ cp \
    /tmp/4yjehwa4/colo-829-bl+12_i.fq.gz \
    colo-829-bl_i%2B.fastq.gz
Could not collect memory usage, job ended before monitoring began.
[job cp.cwl] Job error:
Error collecting output for parameter 'outfile':
cp.cwl:24:3: Traceback (most recent call last):
cp.cwl:24:3: 
cp.cwl:24:3:   File "/usr/local/lib/python3.6/dist-packages/cwltool/command_line_tool.py", line 612, in collect_output_ports
cp.cwl:24:3:     compute_checksum=compute_checksum)
cp.cwl:24:3: 
cp.cwl:24:3:   File "/usr/local/lib/python3.6/dist-packages/cwltool/command_line_tool.py", line 702, in collect_output
cp.cwl:24:3:     with fs_access.open(rfile["location"], "rb") as f:
cp.cwl:24:3: 
cp.cwl:24:3:   File "/usr/local/lib/python3.6/dist-packages/cwltool/stdfsaccess.py", line 41, in open
cp.cwl:24:3:     return open(self._abs(fn), mode)
cp.cwl:24:3: 
cp.cwl:24:3: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/4yjehwa4/colo-829-bl_i%252B.fastq.gz'
[job cp.cwl] completed permanentFail
[job cp.cwl] {}
[job cp.cwl] Removing input staging directory /tmp/tmp_h7xfo4k
[job cp.cwl] Removing temporary directory /tmp/tmpm0tlppsi
{}
Final process status is permanentFail
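
Both tracebacks look consistent with the output path being percent-encoded into a location and then opened without being decoded first: + becomes %2B in the first run, and %2B becomes %252B in the second. The Python sketch below reproduces that symptom in isolation. It is only my guess at the mechanism, not cwltool's actual code, and every function name in it is hypothetical.

```python
import os
import tempfile
from urllib.parse import quote, unquote


def to_location(path: str) -> str:
    """Build a file:// location by percent-encoding an absolute path."""
    return "file://" + quote(path)


def naive_path(location: str) -> str:
    """Strip the scheme only; the path stays percent-encoded."""
    return location[len("file://"):]


def decoded_path(location: str) -> str:
    """Strip the scheme and undo the percent-encoding."""
    return unquote(location[len("file://"):])


def check(basename: str) -> tuple:
    """Create basename in a scratch directory and report, in order:
    the name the naive lookup tries, whether the naive lookup finds
    the file, and whether the decoded lookup finds it."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, basename)
        open(path, "w").close()
        location = to_location(path)
        return (
            os.path.basename(naive_path(location)),
            os.path.exists(naive_path(location)),
            os.path.exists(decoded_path(location)),
        )


for name in ("colo-829-bl_i+.fastq.gz", "colo-829-bl_i%2B.fastq.gz"):
    print(name, "->", check(name))
```

In both cases the naive lookup asks for a name that was never written to disk, while the decoded lookup finds the file. That would explain why encoding newname by hand cannot help: the literal % is itself encoded again.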

Member

mr-c commented Oct 4, 2021

Fixed in #1446

mr-c closed this as completed Oct 4, 2021