Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML macros: add named yields, tokenized macros, tokens for attributes, and fix replacement of toplevel yield #13152

Merged
merged 16 commits into from Jan 19, 2022

Conversation

bernt-matthias
Copy link
Contributor

@bernt-matthias bernt-matthias commented Jan 11, 2022

Originally I wanted to find out what exactly happens if there are multiple yield tags.. hoping that I can somehow control that each yield is replaced with the corresponding child of the expand tag. This did not work. So I added named yields and two other extensions that I were missing in the past:

Named yields

Essentially <yield name="XYZ"/> is replaced by the content of <template name="XYZ"/>. Unnamed yields work as before and named and unnamed yields can be mixed.

An example:

    <xml name="named_yields_example">
      <conditional>
        <param type="select">
          <option value="a">A</option>
          <option value="b">B</option>
          <yield name="more_options"/>
        </param>
        <when value="a">
          <param type="select">
            <yield />
          </param>
        </when>
        <when value="b">
          <param ... />
        </when>
        <yield name="more_whens">
      </conditional>
    </xml>
        <expand macro="named_yields_example">
          <token name="more_options">
            <option value="c">C</option>
          </token>
          <token name="more_whens">
            <when value="c">
              <param type="select">
                <yield />
              </param>
            </when>
          </token>
          <options from_data_table="tophat2_indexes">
            <filter type="sort_by" column="2"/>
            <validator type="no_options" message="No genomes are available for the selected input dataset"/>
          </options>
        </expand>

The choice of token as tag name may be sub-optimal... since tokens are also used to parametrize macro parameters. Should be easy to change if desired.

Allow that the name of nested macros can be controlled by tokens:

For instance

<xml name="example">
    <expand macro="@TOKEN_NAME@"/>`
</xml>

See

def test_loader_specify_nested_macro_by_token():
for a complete example

Allow token expansion also for attributes

<tool>
    <macros>
        <token name="__ATTR_NAME">name</token>
    </macros>
    <another>
        <tag __ATTR_NAME="blah" />
    </another>
</tool>

Here the token can't be surrounded by @ character (i.e. @ATTR_NAME@) since the xml file needs to proper xml (because it is parsed before token/macro expansion). But it seems that for tokens surrounding the token by @ is a (undocumented) convention only, anyway.

Docs have been added here: galaxyproject/planemo#1212 (not all changes are yet documented, but will be when this is merged).

I'm happy if there are more suggestions to make this more expressive / powerful / elegant :). Real recursion would be cool but also dangerous .. also could not think of a way to express a stopping condition.

Fix: If there are multiple top level yield statements only the last was replaced

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

if there are multiple top level yield statements
only the last was replaced
@github-actions github-actions bot added this to the 22.01 milestone Jan 11, 2022
instead of the list of children

mv parsing of token_quote out of the for loop, since otherwise
tokens that are parsed before token_quote will get the default
and only the ones after will get the desired token_quote
use galaxy.util.xml_to_string for export since lxml is not always
available?
yield tags with a name attribute will be replaced

```
<xml>...<yield name="xyz"/>...</xml>
```

by the content of the token with the corresponding name:

```
<expand>...<token name="xyz">CONTENT</token>...</expand>
```

The token must be direct child of the expand tag

like for unnamed yields all yields with the same name will
be replaced by the token's content.

Named yields are replaced befor unnamed yields.

When replacing unnamed yields tokens are ignored.
bernt-matthias added a commit to bernt-matthias/planemo that referenced this pull request Jan 12, 2022
@bernt-matthias bernt-matthias changed the title XML macros: fix replacement of toplevel yield XML macros: add named yields and fix replacement of toplevel yield Jan 12, 2022
got myself used to the unit test and thought it might be nice
to use more xpath

undo a unnecessary change in xml_macros.py
1. allow that the name of nested macros can be controlled by tokens:
eg `<expand macro="@TOKEN_NAME@"/>`

therefore:

- in load_with_references childs of the macros element are
  temporarily removed. otherwise the replacement of the nested macros
  also happens for the expand tags in macros and it seems really
  difficult to iterate a tree excepting a subtree
- in _expand_macro replace tokens first and then consider
  nested macros

see test test_loader_specify_nested_macro_by_token

2. allow token expansion also for attributes (before only attribute
values)

this only works if the token is valid xml i.e. they can't be surrounded
by `@`

see test test_loader_token_attribute_name

3. split unit tests in separate functions

and add a few (e.g. nested tokens ...)
and add unit test for this case
@bernt-matthias bernt-matthias changed the title XML macros: add named yields and fix replacement of toplevel yield XML macros: add named yields, tokenized macros, tokens for attributes, and fix replacement of toplevel yield Jan 14, 2022
Copy link
Member

@mvdbeek mvdbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should run this against the iuc as well.

# that are not included in the macros node
macros_el = _macros_el(root)
if macros_el is not None:
macros_copy = deepcopy(macros_el)
Copy link
Member

@mvdbeek mvdbeek Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deepcopy essentially pickles and unpickles the object, this could be detrimental to startup speed. Would a .copy() be sufficient ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I remember that if I do not deepcopy then the following clear also removes the copied elements. Maybe calling remove(subelement) for each child helps.

In the end this is only a workaround: I was not able to find a way to iterate only the expand elements that are not included in the macros subtree. With lxml it might be possible, but it appears to me that the code needs to be compatible with xml from the stdlib and lxml, or am I wrong?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could import LXML_AVAILABLE and make that conditional on this (if it works).

macro_name = expand_el.get('macro')
assert macro_name is not None, "Attempted to expand macro with no 'macro' attribute defined."

# check for cycles in the nested macro expansion
assert macro_name not in visited, f"Cycle in nested macros: already expanded {visited} can't expand '{macro_name}' again"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this must be a cycle ? Also list lookups can be expensive, maybe make visited a set or an ordered set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this must be a cycle ?

Quite. Essentially this is a DFS of the expansions and if a previously expanded macro is expanded again we have a cycle. Note that visisted is kind of a stack (i.e. the macro names are removed at the end of the function.)

Also list lookups can be expensive, maybe make visited a set or an ordered set?

Then I will go for ordered set, since otherwise the test can not check for the message (or I just test for the prefix "Cycle in nested macros".

lib/galaxy/util/xml_macros.py Outdated Show resolved Hide resolved
bernt-matthias and others added 2 commits January 14, 2022 14:44
Co-authored-by: Marius van den Beek <m.vandenbeek@gmail.com>
Copy link
Member

@mvdbeek mvdbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but let's have a look at the IUC tests in galaxyproject/tools-iuc#4332

@bernt-matthias
Copy link
Contributor Author

Seems that the same 15 tools fail that also failed for #13039

if macros copy is actually an `Element` then this gives a deprecation
warning:

xml_macros.py:41: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
  if macros_copy:
@mvdbeek mvdbeek merged commit 68e5099 into galaxyproject:dev Jan 19, 2022
@github-actions
Copy link

This PR was merged without a "kind/" label, please correct.

@bernt-matthias bernt-matthias deleted the topic/xml-replace-fix-test branch March 14, 2022 15:06
bernt-matthias added a commit to bernt-matthias/planemo that referenced this pull request Apr 20, 2022
bernt-matthias added a commit to bernt-matthias/planemo that referenced this pull request Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants