New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More, Better XML Macros #362

Merged
merged 9 commits into from Jun 23, 2015

Conversation

Projects
None yet
7 participants
@jmchilton
Copy link
Member

jmchilton commented Jun 17, 2015

There are two primary improvements here.

  1. XML macros are now parameterizable. This should allow for generic macros for things like data tables. e88b73e
  2. Galaxy's job_conf.xml now expands macros during loading. This should allow building macros that allow defining categories for destinations for instance and should result in simplifications and duplication reductions for complex job conf files. 483acd5

I think together these changes represent a more generic solution to the problem with Docker volumes being addressed with Intel-HSS#28. These should allow building a generic destination for various Intel clusters with simple parameters for the things that change between destination (maybe just resource allocations).


More on paramerization of XML macros:

This comes in two modes - simple lists of required attributes and per-attribute parameters that may specify defaults.

The following toy example:

<tool>
    <expand macro="inputs" bar="hello" />
    <macros>
        <xml name="inputs" tokens="bar" token_quote="$$">
            <inputs type="the type is $$BAR$$" />
        </xml>
    </macros>
</tool>

will evaluate to <tool><inputs type="the type is hello" /><tool>. The XML macro definition tag can now take in the new attribute tokens - which will be interpreted as a list of parameter names. Each parameter can be specified when expanding the macros as simple attributes specifying the value of the parameter. The parameters will be wrapped in token_quote when creating actual tokens for replacement in the XML file - the default value for token_quote is @.

Parameters can also be defined individual to specify default values. So the following example:

<tool>
    <expand macro="inputs" foo="hello" />
    <expand macro="inputs" foo="world" />
    <expand macro="inputs" />
    <macros>
        <xml name="inputs" token_foo="the_default">
            <inputs>@FOO@</inputs>
        </xml>
    </macros>
</tool>

Would evaluate to <tool><inputs>hello</inputs><inputs>world</inputs><inputs>the_default</inputs></tool>.

@erasche

This comment has been minimized.

Copy link
Member

erasche commented Jun 17, 2015

This is wonderful, 👍/10, I am very excited for this.

Any reason for supporting both "dollar" and "at" parameters?

@martenson

This comment has been minimized.

Copy link
Member

martenson commented Jun 17, 2015

Looks great John. Please update our wiki to reflect these changes after merge. In case you resist I volunteer to update the wiki with your commit messages.

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Jun 17, 2015

@erasche My intuition tells me that if we only picked one there would crop up a situation where that choice would be awkward - though I cannot think of many specific contexts where @ symbols are used like this. So I might be keen to drop the dollar_ and simplify things to just tokens and foo_token - but these tokens operate different than the <token> tag which doesn't modify the name or append/prepend things - so I like being explicit with the names at_tokens and at_token_foo. So if we have at tokens it seems very easy to have other kinds as well.

One way in which I would probably use the two token types - is to stick with @ for global tokens (which it seems like most people do) and then use something else for the contextual tokens.

Though you know what - $s are very commonly used in both cheetah and shell scripts - so that is a terrible choice for a second token type. I think I will redo this and drop the dollar token type and add amp and percent token types - unless people think we should just stick with at tokens for now.

@nsoranzo

This comment has been minimized.

Copy link
Member

nsoranzo commented Jun 17, 2015

What about adding a token_quote attribute to the <xml> element, with @ as default?

This would apply to all tokens in this element, which can be defined with a simple tokens attribute.

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Jun 17, 2015

@nsoranzo - yes this! It is very obvious this is the right thing to do now that you explain it :).

@jmchilton jmchilton force-pushed the jmchilton:xml_macros_3 branch from 483acd5 to 1adabb8 Jun 18, 2015

jmchilton added some commits Jun 16, 2015

Refactor tool loader toward more reusable token expansion.
Refactoring tool macro expansion to evaluate tokens along with XML which will give Galaxy a chance to evaluate xml macro dependent tokens. This in turn will allow parameterized XML macros in the next commit.
Allow parameterization of XML macros.
This comes in two modes - simple lists of required attributes and per-attribute parameters that may specify defaults.

The following toy example:

```
<tool>
    <expand macro="inputs" bar="hello" />
    <macros>
        <xml name="inputs" dollar_tokens="bar">
            <inputs type="the type is $BAR$" />
        </xml>
    </macros>
</tool>
```

will evalute to ``<tool><inputs type="the type is hello" /><tool>``. The XML macro definition tag can now take in two new fixed attributes dollar_tokens and at_tokens - which will be interpreted as a list of parameter names. Each parameter can be specified when expanding the macros as simple attributes specifying the value of the parameter. The "dollar" parameter with name bar would map to a token with name $BAR$. Likewise an "at" parameter with name foo would map to a token with name @foo@.

Both parameter types can be specified on a per-parameter basis to also specify default values. So the following example:

```
<tool>
    <expand macro="inputs" foo="hello" />
    <expand macro="inputs" foo="world" />
    <expand macro="inputs" />
    <macros>
        <xml name="inputs" at_token_foo="the_default">
            <inputs>@foo@</inputs>
        </xml>
    </macros>
</tool>
```

Would evalute to ``<tool><inputs>hello</inputs><inputs>world</inputs><inputs>the_default</inputs>``.

Updated unit tests explain the above much more succinctly - though some like English and words for reasons that escape me.
Enable XML macros for the job conf XML file.
Should allow large de-deduplication of repeated XML definitions when defining job destinations.

Given that the same object type (destination) encodes per-tool information (such as resource allocations), per-filesystem such path configurations for Docker or Pulsar), and per-cluster information (such as cluster/Docker/pulsar  connection information) - non-trivial deployments will likely have a lot of duplicated XML in job_conf.xml. Macros work exactly the same way as with tools and should allow a great decrease in this XML.

@jmchilton jmchilton force-pushed the jmchilton:xml_macros_3 branch from 1adabb8 to e5d90c7 Jun 18, 2015

Modify XML macro parameters to allow arbitrary wrap string.
Previously there were two hard-coded toke types at_tokens an dollar_tokens. Now all tokens for the same macro must have the same wrap string and this string can be set arbitrarily with token_quote attribute. The defualt token_quote value is @.

@jmchilton jmchilton force-pushed the jmchilton:xml_macros_3 branch from e5d90c7 to 711d4bb Jun 18, 2015

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Jun 18, 2015

Okay - requested modifications made. Pull request message and test cases updated.

Any remaining concerns?

@bgruening

This comment has been minimized.

Copy link
Member

bgruening commented Jun 18, 2015

This will change a lot! Thanks John! 👍

@martenson

This comment has been minimized.

Copy link
Member

martenson commented Jun 18, 2015

👍

1 similar comment
@davebx

This comment has been minimized.

Copy link
Contributor

davebx commented Jun 22, 2015

👍

@nsoranzo

This comment has been minimized.

Copy link
Member

nsoranzo commented Jun 22, 2015

Can you add the new lib/galaxy/util/xml_macros.py to .ci/pep8_sources.txt ?

@@ -189,3 +189,41 @@ def load(self, name="tool.xml", preprocess=True):
tag_el = xml.find("another").find("tag")
value = tag_el.get('value')
assert value == "The value.", value

# Test macros XML macros with $ expansions in attributes

This comment has been minimized.

@nsoranzo

nsoranzo Jun 22, 2015

Member

s/$/$$/ or change the code below.

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Jun 22, 2015

@nsoranzo - done. Thanks for the catch. I suppose we should update CONTRIBUTING.md to mention that all new Python files should be added to pep8_sources.txt.

@dannon

This comment has been minimized.

Copy link
Member

dannon commented Jun 22, 2015

Asked on IRC as an informal query, but it never got a response. Why move everything to macros? There's non-macro related stuff that's getting moved to that module like load_tool, etc.

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Jun 22, 2015

@dannon load_tool is renamed to load in the new module. It has no "tool" specific logic per-se except arguably the existence of template macros - which unlike "xml" and "token" macros have no use in job confs currently. If we opt-ed to templatize more of job_conf.xml though - we certainly could allow template macros in the future for consistency.

@nsoranzo

This comment has been minimized.

Copy link
Member

nsoranzo commented Jun 22, 2015

Maybe add an example of macro use to config/job_conf.xml.sample_advanced ?

@dannon

This comment has been minimized.

Copy link
Member

dannon commented Jun 22, 2015

Right, I see that it was renamed from load_tool to load. I don't particularly like this new organization at all since it seems like we've just shoved everything into a macros module when a lot of it seems higher level than macro interpretation, but I'm a -0, so that's fine.

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Jun 22, 2015

@dannon - Are there specific things (line numbers, functions, etc...) you feel are not related to xml_macros in the new xml_macros.py? Would you like imported_macro_paths and template_macro_params back in loader.py?

Add example and test case for job configuration expansion.
First suggestion by @nsoranzo and second was easy once example was added to advanced config.
@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Jun 22, 2015

@nsoranzo Example added - and added a test case since the job configuration unit tests are based on that file anyway.

def test_macro_expansion( self ):
self.__with_advanced_config()
for name in ["foo_small", "foo_medium", "foo_large", "foo_longrunning"]:
print self.job_config.destinations.keys()

This comment has been minimized.

@nsoranzo

nsoranzo Jun 22, 2015

Member

Debug statement? I suppose it can be removed.

This comment has been minimized.

@jmchilton

jmchilton Jun 22, 2015

Member

@nsoranzo fixed. Added that test file to pep8_sources.txt as well.

@nsoranzo

This comment has been minimized.

Copy link
Member

nsoranzo commented Jun 22, 2015

@nsoranzo Example added - and added a test case since the job configuration unit tests are based on that file anyway.

@jmchilton Awesome! Thanks for adding both!

Fix stray print noticed by @nsoranzo.
And add that file to pep8 sources.

@jmchilton jmchilton force-pushed the jmchilton:xml_macros_3 branch from 5168a4d to f90e2e1 Jun 22, 2015

@nsoranzo

This comment has been minimized.

Copy link
Member

nsoranzo commented Jun 22, 2015

👍

@dannon

This comment has been minimized.

Copy link
Member

dannon commented Jun 23, 2015

@jmchilton It's fine. There isn't much I'd pull out, but that the root 'load_tool' actually comes from the macros module seems off; load_tool, raw_tool_xml_tree, set_children, replace, etc., should remain in the more general tool loader class IMO. Merging anyway.

dannon added a commit that referenced this pull request Jun 23, 2015

@dannon dannon merged commit d1179eb into galaxyproject:dev Jun 23, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@jmchilton jmchilton referenced this pull request Sep 30, 2015

Merged

Add hisat2 wrapper. #288

@bgruening bgruening referenced this pull request Oct 1, 2015

Merged

Swap all repeats to multiple='True' where appropriate #276

1 of 1 task complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment