Enhancement: Add custom formatting specifiers for uppercase, lowercase, etc. in templates #5736

BigRoy · 2023-10-09T21:51:04Z

Is there an existing issue for this?

I have searched the existing issues.

Please describe the feature you have in mind and explain what the current shortcomings are?

It might be worth using a custom formatter in OpenPype so we don't need all the keys like Task task, TASK etc. in the data and can actually explicitly state what we want to have done with the value.

It would allow explicitly forcing lowercase, uppercase or whatever we want as custom formatting. 🙂
Plus, you'd just need to just extend the anatomy formatting class and don't need to actually wrangle all data. It'd then automatically work for all data formatted with anatomy.

Simpler. (And faster potentially since the formatting would only occur when requested instead of preparing the data beforehand).
So uppercase task would e.g. be {task:upper} or {task!u} or whatever we want to use as format specifier or solution.

How would you imagine the implementation of the feature?

Custom value conversions

An example of specifying custom conversion specs for formatting can be found here which shows how to add e.g. {myvar!u} to uppercase a string or {myvar!l} to lowercase a string.

Here's an example of that:

from string import Formatter

class ExtendedFormatter(Formatter):
    """An extended format string formatter

    Formatter with extended conversion symbol
    """
    def convert_field(self, value, conversion):
        """ Extend conversion symbol
        
        Conversions:
        * l: convert to string and low case
        * u: convert to string and up case

        """
        if conversion == "u":
            return str(value).upper()
        elif conversion == "l":
            return str(value).lower()
        # Do the default conversion or raise error if no matching conversion found
        return super(ExtendedFormatter, self).convert_field(value, conversion)


# Example
template = "{task!u}/{asset!l}/v{version:03d}"
data = {
  "task": "anim",
  "asset": "char_SuperHero",
  "version": 1
}
formatter = ExtendedFormatter()
output = formatter.format(template, **data)
print(output)
# ANIM/char_superhero/v001

This requires the conversion character to start with ! and the conversion can only be a single character.

Custom field format specifiers

However, it's also possible to define custom field conversions - where we can use the format spec to implement the same but allowing e.g. longer readable specs like :lower and :upper

from string import Formatter

class MyFormatter(Formatter):
    def format_field(self, value, format_spec):
        if format_spec == "upper":
            return str(value).upper()
        elif format_spec == "lower":
            return str(value).lower()
        
        return super(MyFormatter, self).format_field(value, format_spec)


# Example
template_str = "{task:upper}/{asset:lower}/{subset:lower}/v{version:03d}"
data = {
  "task": "anim",
  "asset": "char_SuperHero",
  "subset": "helloWorld",
  "version": 1
}
myformatter = MyFormatter()
output = myformatter.format(template_str, **data)
print(output)
# ANIM/char_superhero/helloworld/v001

Maybe low and up are nicer because they are shorter maybe?

These could be implemented on the StringTemplate class when formatting.

Custom field format specifiers - uppercase only first character

We currently also support {Task} to only uppercase the first character - we can take that same logic and implement it as well:

from string import Formatter

class MyFormatter(Formatter):
    def format_field(self, value, format_spec):
        if format_spec == "upper":
            return str(value).upper()
        elif format_spec == "lower":
            return str(value).lower()
        elif format_spec == "upperfirst":
            value = str(value)
            if value:
                # Uppercase first character, leave rest as is
                return "".join([value[0].upper(), value[1:]])
            else:
                return value
        
        return super(MyFormatter, self).format_field(value, format_spec)


# Example
template_str = "{task:upper}/{asset:lower}/{family}{task:upperfirst}{variant:upperfirst}/v{version:03d}"
data = {
  "task": "anim",
  "asset": "char_SuperHero",
  "family": "render",
  "variant": "main",
  "version": 1
}
myformatter = MyFormatter()
output = myformatter.format(template_str, **data)
print(output)
# ANIM/char_superhero/renderAnimMain/v001

Are there any labels you wish to add?

I have added the relevant labels to the enhancement request.

Describe alternatives you've considered:

The alternative is basically the hassle we've been riding with currently where all data needs to be prepared with: {task}, {Task}, {TASK}, etc.

It would mean we'd need to preformat all data, which can be slow due to the fact that we're likely formatting a lot of versions that we don't end up using in the data. Also with these keys it's unclear what e.g. would be a key for force lowercase formatting.

Additional context:

Originally proposed on Discord

[cuID:OP-7100]

The text was updated successfully, but these errors were encountered:

MustafaJafar · 2023-10-16T14:01:56Z

Simply Brilliant!
I like the idea of adding extended formatter, I think we may be able to combine the two proposed ideas
e.g.
resolve : User -> user:upperFirst
then use your code above.

MustafaJafar · 2023-10-16T14:29:25Z

I've made a quick prototype.
I believe this function can be imporved!

def my_formatter(my_dict, key):
    # Test if only the first letter is uppercase
    if key[0].isupper() and not key.isupper() :
        return my_dict[key.lower()].capitalize()
        
    # Test whether all letters are uppercase
    elif key.isupper() :
        return my_dict[key.lower()].upper()
    
    # I'm forcing lower case because someone may write stupid thing like "uSER"
    return my_dict[key.lower()]


phonebook = {"user" : "mustafa"}

print( my_formatter(phonebook, "user"))
print( my_formatter(phonebook, "User"))
print( my_formatter(phonebook, "USER"))

BigRoy · 2023-10-16T14:41:55Z

resolve : User -> user:upperFirst
then use your code above.

Yes, we could do that for backwards compatibility for the time being - but of course having to parse the keys all the time would be slower than just formatting what gets requested.

Technically we could override Formatter.parse to convert certain cases as legacy like you described:

from string import Formatter

class MyFormatter(Formatter):
    
    def format_field(self, value, format_spec):
        if format_spec == "upper":
            return str(value).upper()
        elif format_spec == "lower":
            return str(value).lower()
        elif format_spec == "upperfirst":
            value = str(value)
            if value:
                # Uppercase first character, leave rest as is
                return "".join([value[0].upper(), value[1:]])
            else:
                return value
        
        return super(MyFormatter, self).format_field(value, format_spec)

    def parse(self, format_string):
        for (literal_text, field_name, format_spec, conversion) in super(MyFormatter, self).parse(format_string):
            if field_name and not format_spec:
                # Allow legacy uppercase conversions
                if field_name[0].isupper() and not field_name[1:].isupper():
                    # If field name has only uppercase first character 
                    field_name = field_name.lower()
                    format_spec = "upperfirst"
                elif field_name.isupper():
                    # All characters are uppercase
                    field_name = field_name.lower()
                    format_spec = "upper"
            yield (literal_text, field_name, format_spec, conversion)
                
        
formatter = MyFormatter()
text = """
upper: {HELLO} {WORLD}
upperfirst: {Hello} {World}
regular: {hello} {world}
"""
data = {"world": "world", "hello": "hello"}
result = formatter.format(text, **data)
print(result)

Prints:

upper: HELLO WORLD
upperfirst: Hello World
regular: hello world

BigRoy · 2023-10-16T14:45:55Z

Note that the above example I provided would be problematic with formatting e.g. environment variables because the keys there are always all uppercase and thus would be considered the upper format specifier, e.g.:

formatter.format("{AVALON_TASK}", **{"AVALON_TASK": "test"})

This will result in a key error because avalon_task key is not provided.

iLLiCiTiT · 2023-11-09T09:51:07Z

Ad backwards compatibility. I don't think we should handle backwards compatibility in the formatting logic. We should still pass in the same data as we do now, until it's fully deprecated.

I was thinking about implementing callable attributes access in the formatter, instead of using formatting description.
That requires formatter_field_name_split from _string because it is C implementation, but I think it is more "intuitive" to use.

import re
from string import Formatter, _string


class Akwargs:
    def __init__(self, args_str):
        args, kwargs = self._parse(args_str)
        self.args = tuple(args)
        self.kwargs = kwargs

    def _convert_value(self, value):
        if value == "True":
            return True
        if value == "False":
            return False
        if value == "None":
            return None

        try:
            return float(value)
        except ValueError:
            pass

        try:
            return int(value)
        except ValueError:
            pass
        return value.strip('"')

    def _parse(self, args_str):
        parts = args_str.split(",")
        args = []
        kwargs = {}
        for part in parts:
            part = part.strip(" ")
            if not part:
                continue
            if "=" in part:
                key, value = part.split("=")
                kwargs[key] = self._convert_value(value)
            else:
                args.append(self._convert_value(part))
        return args, kwargs


class MyFormatter(Formatter):
    _char_regex = re.compile(r"[a-zA-Z0-9]")

    def _upperfirst(self, value):
        capitalized = ""
        for idx in range(len(value or "")):
            char = value[idx]
            if not self._char_regex.match(char):
                capitalized += char
            else:
                capitalized += char.upper()
                capitalized += value[idx + 1:]
                break
        return capitalized

    def _parse_func_args_kwargs(self, attr_name):
        if "(" not in attr_name:
            return False, tuple(), dict()

        attr_name, args = attr_name.split("(", 1)
        args_kwargs = Akwargs(args.strip(")"))
        return attr_name, True, args_kwargs.args, args_kwargs.kwargs

    def get_field(self, field_name, args, kwargs):
        first, rest = _string.formatter_field_name_split(field_name)

        obj = self.get_value(first, args, kwargs)

        # loop through the rest of the field_name, doing
        #  getattr or getitem as needed
        for is_attr, attr_name in rest:
            if not is_attr:
                obj = obj[attr_name]
                continue
            attr_name, is_callable, func_args, func_kwargs = (
                self._parse_func_args_kwargs(attr_name)
            )
            # Fake 'str' method
            if attr_name == "upperfirst":
                if not is_callable:
                    raise AttributeError(
                        "'str' object has no attribute 'upperfirst'"
                    )
                obj = self._upperfirst(obj)
                continue

            obj = getattr(obj, attr_name)
            if is_callable:
                obj = obj(*func_args, **func_kwargs)
        return obj, first


class TestClass:
    def get_value(self, test=True):
        if test:
            return "Value1"
        return "Value2"

# Example
template_str = "{task.upper()}/{folder[name].lower()}/{family}{task.upperfirst()}/{variant.get_value(test=False)}/v{version:03d}"
data = {
    "task": "anim",
    "asset": "char_SuperHero",
    "folder": {"name": "char_SuperHero"},
    "family": "render",
    "variant": TestClass(),
    "version": 1
}

myformatter = MyFormatter()
output = myformatter.format(template_str, **data)
print(output)

Note: It does not work if called method would have brackets (, ) or dot . in args (e.g. {some_obj.add_offset(offset=1.0, data=dict())}).

BigRoy · 2023-11-09T10:20:59Z

Ad backwards compatibility. I don't think we should handle backwards compatibility in the formatting logic. We should still pass in the same data as we do now, until it's fully deprecated.

Agreed - safest way forward.

I think allowing custom scripts to be triggered will likely be:

More complex in code
More edges cases someone could run into - easier to break
Likely "nice" for developers but harder for admins?
Slower maybe?
But mostly, potentially dangerous if someone just happens to be able to create a string that runs random code via the templating system.

Additionally - I feel like if we're going that far with the formatting we're just as well off doing f-string evaluations or calls to eval - which also, is dangerous (and slow?).

I think we're best off designing it so that it only allows certain calls, not all - so we can document them and test them. If more is needed specifically and it's a logical request - then we implement the extra formatting.

iLLiCiTiT · 2023-11-09T10:37:01Z

More edges cases someone could run into

I don't see any more other edge cases than with custom descriptor...

Likely "nice" for developers but harder for admins?

I would say more admins would more understand python calls than using custom formatting descriptor. Learning curve is same if they don't understand neither.

But mostly, potentially dangerous if someone just happens to be able to create a string that runs random code via the templating system.

It would be dangerous only if you would pass dangerous objects in... Only methods can be called, not random functions as in eval (you can't call sys.exit(0) until "sys" is passed in as {"sys": sys}).

Slower maybe?

We still have to use StringTemplate because of optional keys, which is slow because it's custom implementation of template formatting.

But almost nothing of that is important. We would still need to "insert" the logic to StringTemplate, and keep it's logic. If you think it's easy, and must have now, then do it please.

BigRoy added the type: enhancement Enhancements to existing functionality label Oct 9, 2023

BigRoy changed the title ~~Enhancement: Add custom formatting specifiers for uppercase, lowercase, etc.~~ Enhancement: Add custom formatting specifiers for uppercase, lowercase, etc. in templates Oct 9, 2023

jakubjezek001 assigned Minkiu, antirotor and iLLiCiTiT Oct 10, 2023

MustafaJafar added the community contribution label Oct 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhancement: Add custom formatting specifiers for uppercase, lowercase, etc. in templates #5736

Enhancement: Add custom formatting specifiers for uppercase, lowercase, etc. in templates #5736

BigRoy commented Oct 9, 2023 •

edited by jakubjezek001

MustafaJafar commented Oct 16, 2023 •

edited

MustafaJafar commented Oct 16, 2023

BigRoy commented Oct 16, 2023 •

edited

BigRoy commented Oct 16, 2023

iLLiCiTiT commented Nov 9, 2023 •

edited

BigRoy commented Nov 9, 2023 •

edited

iLLiCiTiT commented Nov 9, 2023 •

edited

Enhancement: Add custom formatting specifiers for uppercase, lowercase, etc. in templates #5736

Enhancement: Add custom formatting specifiers for uppercase, lowercase, etc. in templates #5736

Comments

BigRoy commented Oct 9, 2023 • edited by jakubjezek001

Is there an existing issue for this?

Please describe the feature you have in mind and explain what the current shortcomings are?

How would you imagine the implementation of the feature?

Custom value conversions

Custom field format specifiers

Custom field format specifiers - uppercase only first character

Are there any labels you wish to add?

Describe alternatives you've considered:

Additional context:

MustafaJafar commented Oct 16, 2023 • edited

MustafaJafar commented Oct 16, 2023

BigRoy commented Oct 16, 2023 • edited

BigRoy commented Oct 16, 2023

iLLiCiTiT commented Nov 9, 2023 • edited

BigRoy commented Nov 9, 2023 • edited

iLLiCiTiT commented Nov 9, 2023 • edited

BigRoy commented Oct 9, 2023 •

edited by jakubjezek001

MustafaJafar commented Oct 16, 2023 •

edited

BigRoy commented Oct 16, 2023 •

edited

iLLiCiTiT commented Nov 9, 2023 •

edited

BigRoy commented Nov 9, 2023 •

edited

iLLiCiTiT commented Nov 9, 2023 •

edited