Add support for new-style string formatting #38

Closed
sprt opened this Issue May 5, 2015 · 4 comments

sprt commented May 5, 2015

No description provided.

Owner

MattDMo commented May 26, 2015

Can you please elaborate? What sort of support would you like to see?

sprt commented May 26, 2015

See https://docs.python.org/2/library/string.html#formatstrings

For example, {} in '{}'.format(foo) isn't properly highlighted but '%s' % foo is.
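To make the request concrete, here is a minimal sketch of the two formatting styles side by side. The variable names are only illustrative; the point is that the `{}` replacement field plays the same role as the `%s` placeholder.

```python
foo = 'world'

# Old-style (printf-style) formatting: the %s placeholder.
old_style = 'hello %s' % foo

# New-style formatting: the {} replacement field this issue is about.
new_style = 'hello {}'.format(foo)

# Replacement fields can also carry a conversion and a format spec.
padded = '[{0!r:>10}]'.format(foo)

print(old_style, new_style, padded)
```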

Owner

MattDMo commented May 26, 2015

OK, I'll see what I can do. Development is kind of at a low ebb at the moment, as I'm pretty busy with my real job, but I'll definitely put this on the list. If you have any contributions to make, feel free to submit a PR.

sprt commented Jun 13, 2015

Okay, I came up with this regex:

(?<![^\{]\{)\{(?:(?:(?:[A-Za-z_]\w*|(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+)))?(?:\.[A-Za-z_]\w*|\[(?:(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+)|[^\]\}}\{{]+)\])*)?(?:\![rsa])?(?:\:(?:(?:.?[<>=\^])?[\x20+-]?\#?0?(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+)?,?(?:\.(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+))?[bcdEeFfGgnosXx%]?))?\}(?!\}[^\}])

This is the script I used to build it:

import re

def merge(*args):
    return r'(?:{})'.format(r'|'.join(args))

# integer
bininteger = r'0[Bb][01]+'
hexinteger = r'0[Xx][0-9A-Fa-f]+'
octinteger = r'0[Oo]?[0-7]+'
decimalinteger = r'[1-9]\d*|0'
integer = merge(decimalinteger, octinteger, hexinteger, bininteger)

# format_spec
type_ = r'[bcdEeFfGgnosXx%]'
precision = integer
width = integer
sign = r'[\x20+-]'
align = r'[<>=\^]'
fill = r'.'
format_spec = r'(?:(?:{fill}?{align})?{sign}?\#?0?{width}?,?(?:\.{precision})?{type_}?)'.format(**locals())

conversion = r'[rsa]'
index_string = r'[^\]\}}\{{]+'
identifier = r'[A-Za-z_]\w*'
element_index = merge(integer, index_string)
attribute_name = identifier
arg_name = r'(?:{})?'.format(merge(identifier, integer))
field_name = r'(?:{arg_name}(?:\.{attribute_name}|\[{element_index}\])*)'.format(**locals())
replacement_field = r'(?<![^\{{]\{{)\{{{field_name}?(?:\!{conversion})?(?:\:{format_spec})?\}}(?!\}}[^\}}])'.format(**locals())

print(replacement_field)

You can see here that it works pretty well. I just copy-pasted the examples from the Python docs.
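For anyone who wants to try this kind of check locally, here is a rough sketch. It does not use the exact pattern printed by the script: it is a simplified stand-in in which integers are reduced to `\d+`, and the names `FIELD` and `is_single_field` are made up for illustration. The sample strings are single replacement fields taken from the examples in the Python docs.

```python
import re

# Simplified stand-in for the generated pattern (assumption: integers are
# reduced to \d+; the overall structure follows the build script).
FIELD = re.compile(
    r'(?<![^{]\{)\{'                          # opening brace that is not part of an escaped {{
    r'(?:(?:[A-Za-z_]\w*|\d+)?'               # arg_name: identifier or integer, optional
    r'(?:\.[A-Za-z_]\w*|\[[^\]{}]+\])*)?'     # chain of .attribute / [index] lookups
    r'(?:![rsa])?'                            # conversion
    r'(?::(?:.?[<>=^])?[ +-]?#?0?(?:\d+)?,?'  # format_spec: fill/align, sign, #, 0, width, comma
    r'(?:\.\d+)?[bcdEeFfGgnosXx%]?)?'         # format_spec: precision and type
    r'\}(?!\}[^}])'                           # closing brace that is not part of an escaped }}
)


def is_single_field(text):
    """Return True if the whole string is one replacement field (hypothetical helper)."""
    return FIELD.fullmatch(text) is not None


samples = [
    '{0}', '{latitude}', '{0.real}', '{0[1]}', '{!r}',
    '{:*^30}', '{:+f}', '{:,}', '{:.2%}',
    '{:%Y-%m-%d %H:%M:%S}', '{0:{fill}{align}16}', '{0:{width}{base}}',
]

for sample in samples:
    print('{!r:28} {}'.format(sample, is_single_field(sample)))
```

`fullmatch` is used rather than `findall` because, on a string with nested fields, `findall` would still pick out the inner `{fill}` and `{align}` on their own even though the outer field as a whole is rejected.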

These are the only strings it can't parse:

'{:%Y-%m-%d %H:%M:%S}'
'{0:{fill}{align}16}'
'{0:{width}{base}}'

but to be honest, they don't look valid according to the grammar. There seem to be some discrepancies between the docs and the implementation, as I had to edit the index_string regex according to this.
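For context, all three strings are accepted by `str.format` at runtime, so a highlighter probably wants to cope with them one way or another. A minimal sketch follows; the argument values are chosen only for illustration. The date format works because the text after the colon is handed to the argument's own `__format__` method, and the other two rely on replacement fields nested inside the format spec.

```python
import datetime

d = datetime.datetime(2010, 7, 4, 12, 15, 58)

# The format spec is passed to datetime.__format__, which treats it like strftime.
date_result = '{:%Y-%m-%d %H:%M:%S}'.format(d)

# Nested replacement fields are substituted first, giving the spec '*^16'.
nested_align = '{0:{fill}{align}16}'.format('text', fill='*', align='^')

# Here the nested fields build the spec '5b' (width 5, binary).
nested_width = '{0:{width}{base}}'.format(5, width=5, base='b')

print(date_result)
print(nested_align)
print(nested_width)
```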
