Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zip URI imports locally in local WDL #610

Closed
yunhailuo opened this issue Oct 31, 2022 · 2 comments
Closed

Zip URI imports locally in local WDL #610

yunhailuo opened this issue Oct 31, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@yunhailuo
Copy link

First, thank you for the great tool!

To start with, I come from @rhpvorderman 's recommendation and try to package LOCAL WDL with http(s) imports. To be clear, I haven't tested zip http(s) main wdl but my reading of the code seems suggesting that could work find.

Key errors

2022-10-31 08:51:02.652 miniwdl-zip test_miniwld_zip.wdl <= /Users/yunhailuo/git/test_miniwdl_zip/wdl/test_miniwld_zip.wdl
main_dir='/Users/yunhailuo/git/test_miniwdl_zip/wdl/'
abspath='https://raw.githubusercontent.com/openwdl/learn-wdl/master/1_script_examples/1_hello_worlds/1_hello/hello.wdl'
abspath2='https_raw.githubusercontent.com/openwdl/learn-wdl/master/1_script_examples/1_hello_worlds/1_hello/hello.wdl'
prefix=''
Traceback (most recent call last):
  File "/Users/yunhailuo/git/miniwdl/.venv/bin/miniwdl", line 8, in <module>
    sys.exit(main())
  File "/Users/yunhailuo/git/miniwdl/.venv/lib/python3.10/site-packages/WDL/CLI.py", line 86, in main
    zip_wdl(**vars(args))
  File "/Users/yunhailuo/git/miniwdl/.venv/lib/python3.10/site-packages/WDL/CLI.py", line 2046, in zip_wdl
    Zip.build(doc, output, logger, meta=meta, inputs=input_dict, archive_format=fmt)
  File "/Users/yunhailuo/git/miniwdl/.venv/lib/python3.10/site-packages/WDL/Zip.py", line 40, in build
    dir_to_zip = build_source_dir(cleanup, top_doc, logger)
  File "/Users/yunhailuo/git/miniwdl/.venv/lib/python3.10/site-packages/WDL/Zip.py", line 85, in build_source_dir
    zip_paths = build_zip_paths(main_dir, wdls, logger)
  File "/Users/yunhailuo/git/miniwdl/.venv/lib/python3.10/site-packages/WDL/Zip.py", line 122, in build_zip_paths
    assert abspath2.startswith(prefix) and prefix.endswith("/")
AssertionError

MCVE

  • Local workflow structures
test_miniwdl_zip
  |- hello1.wdl
  |- wdl
      |- test_miniwdl_zip.wdl
      |- tasks
          |- hello2.wdl
  • Main WDL (all imported WDL are just copies of this hello).
version 1.0 

import "../hello1.wdl"
import "../wdl/tasks/hello2.wdl" as hello_b
import "https://raw.githubusercontent.com/openwdl/learn-wdl/master/1_script_examples/1_hello_worlds/1_hello/hello.wdl"

workflow HelloWorld {
  call hello_b.WriteGreeting
}

Sorry I might missed the discussion thread for #550 . My preliminary understanding on how zip works is

  • Any imports in/under the main WDL directory will be put in final ZIP as they are with minimum rewrite (meaning change to relative if not already relative).
  • Any imports outside the main WDL directory will be put in __outside_wdl from os.path.commonprefix below and rewrite imports accordingly.
  • I couldn't get http(s) imports, which are essentially URI imports excluding file://, zipped locally due to the common prefix error mentioned above.

Guess on intention

I'm guessing the initial design might be download and put http(s) imports under __outside_wdl/http(s)/? I think this is a great design. But what I get lost is this code section: https://github.com/chanzuckerberg/miniwdl/blob/main/WDL/Zip.py#L116-L118 :

  • abspath2 for all http(s) imports will become http(s)_*
  • main_dir goes through the same transformation but my issue here is main_dir is local so it's still /*
  • Then prefix = os.path.commonprefix will always be empty thus failed the assertion

I'm wondering if I misunderstand the design and use zip in a wrong way. Briefly, my development need is I have a local WDL workflow/repo imported some public http(s) wdl and want to package it locally. Is the best practice pushing to http(s) repo before packaging the workflow and there are complications I missed if I don't zip it that way?

@mlin mlin added the bug Something isn't working label Nov 2, 2022
@mlin
Copy link
Collaborator

mlin commented Nov 2, 2022

@yunhailuo Thank you for the detailed report 👍

The assert in question was supposed to be a sanity check for the usual case when the imported WDL does share some common prefix with the importing WDL, but forgot to allow/ignore the case where there's no common prefix at all. I put your example as a regression test and the log shows how the URL is rewritten:

2022-11-02 10:24:06.053 miniwdl-zip DEBUG /tmp/miniwdl_zip_tests_Yvq83R/issue610/wdl/test_miniwdl_zip.wdl
2022-11-02 10:24:06.053 miniwdl-zip DEBUG   import "../hello1.wdl"
2022-11-02 10:24:06.053 miniwdl-zip DEBUG   => import "__outside_wdl/hello1.wdl"
2022-11-02 10:24:06.054 miniwdl-zip DEBUG /tmp/miniwdl_zip_tests_Yvq83R/issue610/wdl/test_miniwdl_zip.wdl
2022-11-02 10:24:06.054 miniwdl-zip DEBUG   import "../wdl/tasks/hello2.wdl" as hello_b
2022-11-02 10:24:06.054 miniwdl-zip DEBUG   => import "tasks/hello2.wdl" as hello_b
2022-11-02 10:24:06.054 miniwdl-zip DEBUG /tmp/miniwdl_zip_tests_Yvq83R/issue610/wdl/test_miniwdl_zip.wdl
2022-11-02 10:24:06.054 miniwdl-zip DEBUG   import "https://raw.githubusercontent.com/openwdl/learn-wdl/9d05365/1_script_examples/1_hello_worlds/1_hello/hello.wdl"
2022-11-02 10:24:06.054 miniwdl-zip DEBUG   => import "__outside_wdl/https_raw.githubusercontent.com/openwdl/learn-wdl/9d05365/1_script_examples/1_hello_worlds/1_hello/hello.wdl"

(The intent of the commonprefix business is to abbreviate the URL-derived subdirectory when all the relevant WDLs are at least on the same server)

@yunhailuo
Copy link
Author

Thank you very much for the quick fix. Tested and work like a charm

❯ unzip -l test_miniwdl_zip.wdl.zip
Archive:  test_miniwdl_zip.wdl.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
      118  01-01-1980 00:00   MANIFEST.json
      212  01-01-1980 00:00   __outside_wdl/hello1.wdl
      212  01-01-1980 00:00   __outside_wdl/https_raw.githubusercontent.com/openwdl/learn-wdl/master/1_script_examples/1_hello_worlds/1_hello/hello.wdl
      212  01-01-1980 00:00   tasks/hello2.wdl
      271  01-01-1980 00:00   test_miniwdl_zip.wdl
---------                     -------
     1025                     5 files
❯ unzip -p test_miniwdl_zip.wdl.zip test_miniwdl_zip.wdl
version 1.0 

import "__outside_wdl/hello1.wdl"
import "tasks/hello2.wdl" as hello_b
import "__outside_wdl/https_raw.githubusercontent.com/openwdl/learn-wdl/master/1_script_examples/1_hello_worlds/1_hello/hello.wdl"

workflow HelloWorld {
  call hello_b.WriteGreeting
}

Close this and look forward to the next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants