Revise comma handling on templates #2213

diego-plan9 · 2016-10-02T18:20:43Z

A first take at treating commas (ARG_SEP) as regular characters during template parsing, except when they are used inside a function's arguments (i.e., %foo{bar,baz}), as discussed in #2166.

I have added a couple of tests that check commas placed before other elements, since the problem seemed to be that the parser stopped processing the rest of the string whenever it found an unescaped comma. The test suite already includes other tests that deal with separators (`a $, b`, `a , b`, `%foo{bar,baz}`, `%foo{bar$,baz}`, ...), but I'd be happy to add more if needed.

The solution basically splits the handling of the special characters (special_chars, special_char_re, and the newly promoted escapable_chars and terminator_chars) into two cases: one for regular parsing, which does not treat commas in a special way, and one just for parse_argument_list(), which does consider commas a special character (i.e., the same way it worked before these changes). I opted for subclassing, although I'm not sure it's entirely justified; it could easily be moved to a lower abstraction level, i.e., __init__ or parse_expression, if preferred.
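
To make the two cases concrete, here is a minimal sketch of how the special-character regex decides where a run of plain text ends. It is not the code from this PR: the delimiter values are inferred from the examples in this thread, and `build_special_re`, `text_run_end`, `REGULAR_CHARS`, and `ARGUMENT_CHARS` are hypothetical names used only for illustration.

```python
import re

# Assumed delimiter values, inferred from the examples in this thread.
SYMBOL_DELIM = '$'
FUNC_DELIM = '%'
GROUP_OPEN = '{'
GROUP_CLOSE = '}'
ARG_SEP = ','


def build_special_re(chars):
    """Compile a regex matching any character in `chars`, or end of string."""
    return re.compile(r'[%s]|$' % ''.join(re.escape(c) for c in chars))


# Hypothetical groupings: regular parsing vs. parsing inside an argument list.
REGULAR_CHARS = (SYMBOL_DELIM, FUNC_DELIM, GROUP_OPEN, GROUP_CLOSE)
ARGUMENT_CHARS = REGULAR_CHARS + (ARG_SEP,)


def text_run_end(template, chars):
    """Return the index at which the leading run of plain text stops."""
    return build_special_re(chars).search(template).start()


template = 'foo, bar $baz'
print(text_run_end(template, REGULAR_CHARS))   # 9: the run stops at '$'
print(text_run_end(template, ARGUMENT_CHARS))  # 3: the run stops at ','
```

With the comma left out of the regular set, the text run continues past it and only stops at the next real delimiter; inside an argument list the comma still ends the run.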

Add unit tests for the use of the separator special character (comma) outside a function argument.

Remove ARG_SEP from Parser.special_chars, and promote some groups of characters used in parse_expression to class variables. ARG_SEP is still considered an "escapable" character, pending a decision on whether both escaped ('$,') and unescaped (',') syntax would be allowed.

Add ArgumentParser as a subclass of Parser that considers ARG_SEP a special character (i.e., it always needs escaping and terminates a block), and use it for parsing the substring that contains the list of arguments in parse_argument_list().

diego-plan9 · 2016-10-02T18:28:54Z

beets/util/functemplate.py

    special_char_re = re.compile(r'[%s]|$' %
                                 u''.join(re.escape(c) for c in special_chars))
+    escapable_chars = (SYMBOL_DELIM, FUNC_DELIM, GROUP_CLOSE, ARG_SEP)


I deliberately left ARG_SEP in the list of escapable chars in order to allow both escaped and unescaped commas outside a function (-f 'foo, $bar' would work exactly the same as -f 'foo$, $bar').

I'm undecided about it, though: on one hand, it keeps the syntax backwards compatible-ish, but on the other hand it introduces some ambiguity.

Yes, I agree here—I think it's more predictable to make $, work the same way—as an escape—both inside and outside of function arguments.
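
As a rough illustration of the behaviour agreed on here, the sketch below resolves escapes in a run of literal text only. It is a toy, not the parser in functemplate.py: `unescape_literal` is a hypothetical helper, the character values are assumed from the examples above, and symbols and function calls are deliberately not handled.

```python
ESCAPE_CHAR = '$'                       # assumed escape character
ESCAPABLE_CHARS = ('$', '%', '}', ',')  # assumed values of SYMBOL_DELIM,
                                        # FUNC_DELIM, GROUP_CLOSE, ARG_SEP


def unescape_literal(text):
    """Replace each ESCAPE_CHAR + escapable character pair with the character."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else None
        if ch == ESCAPE_CHAR and nxt in ESCAPABLE_CHARS:
            out.append(nxt)  # keep the escaped character, drop the escape
            i += 2
        else:
            out.append(ch)   # ordinary character, copied through unchanged
            i += 1
    return ''.join(out)


# Outside a function argument, both spellings give the same literal text.
print(unescape_literal('foo$, bar'))  # foo, bar
print(unescape_literal('foo, bar'))   # foo, bar
```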

sampsyo · 2016-10-03T18:12:27Z

beets/util/functemplate.py

    special_char_re = re.compile(r'[%s]|$' %
                                 u''.join(re.escape(c) for c in special_chars))
+    escapable_chars = (SYMBOL_DELIM, FUNC_DELIM, GROUP_CLOSE, ARG_SEP)
+    terminator_chars = (GROUP_CLOSE)


I think you want (GROUP_CLOSE,) here to make it a tuple. Otherwise this is just a one-character string, which happens to work in this case!

Yep, that was the idea - thanks for catching it!
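
For readers unfamiliar with this pitfall, a short standalone sketch of the difference (the value of GROUP_CLOSE is assumed from the examples in this thread; `not_a_tuple` and `one_tuple` are illustrative names):

```python
GROUP_CLOSE = '}'  # assumed value
ARG_SEP = ','

not_a_tuple = (GROUP_CLOSE)   # parentheses alone only group: this is a str
one_tuple = (GROUP_CLOSE,)    # the trailing comma is what makes a tuple

print(type(not_a_tuple).__name__)  # str
print(type(one_tuple).__name__)    # tuple

# Membership tests agree for a single character, which is why the
# string version "happens to work".
print('}' in not_a_tuple, '}' in one_tuple)  # True True

# They diverge elsewhere: the empty string is "in" any str, but not in the tuple.
print('' in not_a_tuple, '' in one_tuple)    # True False

# Concatenating with another tuple only works for the real tuple.
print(one_tuple + (ARG_SEP,))                # ('}', ',')
try:
    not_a_tuple + (ARG_SEP,)
except TypeError:
    print('cannot concatenate str and tuple')
```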

sampsyo

This looks great! I have a few small suggestions; and we should also add a changelog entry.

sampsyo · 2016-10-03T18:14:32Z

beets/util/functemplate.py

@@ -512,6 +513,18 @@ def _parse_ident(self):
        return ident


+class ArgumentsParser(Parser):
+    """``Parser`` that considers ``ARG_SEP`` to be a special character.


Perhaps a more descriptive docstring would say "a parser used inside function arguments," i.e., state the purpose first before getting into the implementation details.

sampsyo · 2016-10-03T18:16:58Z

beets/util/functemplate.py

+    special_char_re = re.compile(r'[%s]|$' %
+                                 u''.join(re.escape(c) for c in special_chars))
+    escapable_chars = (SYMBOL_DELIM, FUNC_DELIM, GROUP_CLOSE, ARG_SEP)
+    terminator_chars = (GROUP_CLOSE, ARG_SEP)


I like this subclassing approach, but I'm a little sad that it required us to copy n' paste the definitions from the base class. This seems like it could get us into trouble if we eventually change one set of lists and forget to update the other one.

It's a tough call, but maybe just an in_argument flag on Parser would be simpler?

Use an `in_argument` flag on the Parser constructor to specify whether the parser should treat commas as a special character, including the logic in parse_expression.

diego-plan9 · 2016-10-03T21:13:22Z

beets/util/functemplate.py

+            extra_special_chars = (ARG_SEP,)
+            special_char_re = re.compile(
+                r'[%s]|$' % u''.join(re.escape(c) for c in
+                                     self.special_chars + extra_special_chars))


Take 2, using an in_argument flag! This block introduces some extra cruft in parse_expression, but I wasn't too comfortable directly overwriting the class variables with instance variables in the constructor, i.e.:

    def __init__(self, string, in_argument=False):
        ...
        if in_argument:
            self.special_chars = ...
            self.special_char_re = ...

It might be quite a minor concern, as the Parsers that are used for argument lists only get a single call to parse_expression (and none to the other methods) in practice, so I'd be up for another refactoring if you think the trade-off makes sense.

Yes; this looks great! The small bit of extra cruft is a little bit annoying, but I agree this is the right direction. It's preferable, as you note, to overriding those class variables.
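
A minimal sketch of the flag-based approach discussed in this thread, assuming the delimiter values used in the examples above. `SketchParser` and `next_special` are hypothetical stand-ins, not the real Parser API; only the way the regex is chosen mirrors the snippet under review.

```python
import re


class SketchParser:
    """Hypothetical stand-in for Parser; only the flag handling is shown."""

    special_chars = ('$', '%', '{', '}')  # assumed values, without ARG_SEP
    special_char_re = re.compile(
        r'[%s]|$' % ''.join(re.escape(c) for c in special_chars))

    def __init__(self, string, in_argument=False):
        self.string = string
        self.in_argument = in_argument
        self.pos = 0

    def next_special(self):
        """Return the index of the next special character (or end of string)."""
        if self.in_argument:
            # Build a local regex that also stops at commas, leaving the
            # class-level attributes untouched.
            extra_special_chars = (',',)
            special_char_re = re.compile(
                r'[%s]|$' % ''.join(
                    re.escape(c)
                    for c in self.special_chars + extra_special_chars))
        else:
            special_char_re = self.special_char_re
        return special_char_re.search(self.string, self.pos).start()


print(SketchParser('a, b}').next_special())                    # 4
print(SketchParser('a, b}', in_argument=True).next_special())  # 1
```

Because the extended regex lives in a local variable, creating an argument parser never changes what other Parser instances see, which is the trade-off weighed above against overriding the class variables.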

sampsyo

This looks great! And quick work too. Thank you for tackling that issue. 🎉 ✨

sampsyo · 2016-10-04T00:52:42Z

beets/util/functemplate.py

+            extra_special_chars = (ARG_SEP,)
+            special_char_re = re.compile(
+                r'[%s]|$' % u''.join(re.escape(c) for c in
+                                     self.special_chars + extra_special_chars))


Yes; this looks great! The small bit of extra cruft is a little bit annoying, but I agree this is the right direction. It's preferable, as you note, to overriding those class variables.

diego-plan9 · 2016-10-04T11:24:59Z

Thanks for the quick review, it's my pleasure to squash bugs! I'm wondering if I could trouble you to revise the Path Formats\Syntax Details documentation before merging, as it will probably result in a clearer explanation of the situation?

sampsyo · 2016-10-04T13:11:09Z

I'd be happy to, but I don't think I can push to your fork to update this PR? I can also just add the docs post-merge, but I'd say something like this:

Commas are used as argument separators in function calls. Inside of a function's argument, use `$,` to get a literal `,` character. Outside of any function argument, escaping is not necessary: `,` by itself will produce `,` in the output.

Fix a formatting problem related to Sphinx not allowing spaces at the beginning or end of an inline literal, and remove an extra sentence at the end of the %first template function documentation.

diego-plan9 · 2016-10-04T14:28:53Z

I'd be happy to, but I don't think I can push to your fork to update this PR?

Hmmm, I think it should be possible since the recent-ish GitHub changes (I have an "Allow edit from maintainers" box checked right after the "Lock conversation" one, and was prompted about it when creating the pull request), but honestly I haven't tried out that feature.

Nevertheless, I adjusted the documentation based on your suggestion, and also included a fix for a formatting issue in the %first{} section that I noticed while reviewing the generated HTML file. It seems to be caused by a limitation of reST inline markup:

content may not start or end with whitespace: * text* is wrong,

sampsyo · 2016-10-04T16:00:35Z

Aha; thanks for fixing that!

And for committing the docs. I honestly can't quite figure out what's going on with the "allow edit from maintainers" checkbox—AFAICT, I still don't have ordinary push permissions to your repository, but maybe there's some kind of invisible exception that requires some incantation I don't know?

Anyway, this looks great! Please merge whenever you like.

diego-plan9 · 2016-10-04T16:17:39Z

Thanks! As for the permissions issue, it seems that they are "per-branch" (in this case, just for the template-comma-behaviour branch), so there might be some magic involved - or alternatively, I might have missed something on my end, which is always an option! I'll look into it for future pull requests.

Merging!

sampsyo · 2016-10-04T18:27:05Z

Yay! Thanks again. ✨

I'll look into that new feature more closely in future PRs too.

diego-plan9 added 3 commits October 2, 2016 19:17

Add tests for comma outside functions in templates

518c6b8

Add ArgumentParser, taking into account commas

3e82007

diego-plan9 commented Oct 2, 2016

diego-plan9 changed the title ~~Revise comma handling on templates (#2166)~~ Revise comma handling on templates Oct 2, 2016

sampsyo reviewed Oct 3, 2016

sampsyo requested changes Oct 3, 2016

Use flag instead of subclass for comma in Parser

c5da629

diego-plan9 commented Oct 3, 2016

Add changelog for unescaped commas in Parser

550206a

sampsyo approved these changes Oct 4, 2016

diego-plan9 added 2 commits October 4, 2016 16:17

Revise documentation for commas in Parser

0eb0353

Fix documentation issues for %first

f0a14bf

diego-plan9 merged commit 9dcd4f7 into beetbox:master Oct 4, 2016

diego-plan9 mentioned this pull request Oct 4, 2016

Template doesn't format at all when contains a comma #2166

Closed

diego-plan9 deleted the template-comma-behaviour branch October 6, 2016 12:51
