Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use sign %? #429

Closed
GTres opened this issue Sep 7, 2016 · 12 comments · Fixed by #911
Closed

How to use sign %? #429

GTres opened this issue Sep 7, 2016 · 12 comments · Fixed by #911

Comments

@GTres
Copy link

GTres commented Sep 7, 2016

Define Sign %

I define:
from pint import UnitRegistry
ureg=UnitRegistry()
ureg.define('percent = 1 = %')
x = 20 * u('1 %')
print('x: {0}'.format(x))
print('x: {:L}'.format(x))

and the output is:
x: 20
ValueError: Unknown format code 'L' for object of type 'int'

I need:
x: 20%
x: 20\%

@hgrecco
Copy link
Owner

hgrecco commented Sep 13, 2016

The problem is that % is a valid operator for python
For example:

>>> 5 % 3
2

And therefore allowing % will mean that we loose it.

@GTres
Copy link
Author

GTres commented Sep 14, 2016

Thank hgrecco,

@mcgibbon
Copy link

mcgibbon commented Dec 6, 2016

I was able to get around this limitation by subclassing UnitRegistry:

class UnitRegistry(pint.UnitRegistry):

    def __call__(self, input_string, **kwargs):
        if input_string == '%':
            input_string = 'percent'
        return super(UnitRegistry, self).__call__(input_string, **kwargs)


unit_registry = UnitRegistry()
unit_registry.define('percent = 0.01*count = %')

This works so long as you use conventions like 50*unit_registry('%') instead of unit_registry('50%'). You could instead use input_string.replace('%', 'percent') if you want more complex strings:

class UnitRegistry(pint.UnitRegistry):

    def __call__(self, input_string, **kwargs):
        return super(UnitRegistry, self).__call__(input_string.replace('%', 'percent'), **kwargs)


unit_registry = UnitRegistry()
unit_registry.define('percent = 0.01*count = %')

This could also be applied to other pint classes in whatever way is relevant to your specific project. Given "percent" isn't even defined in base pint, it's probably not worth hacking in to the base package itself.

This solution is also probably usable for #449 .

@onealj
Copy link
Contributor

onealj commented Dec 7, 2016 via email

@hgrecco
Copy link
Owner

hgrecco commented Apr 2, 2017

@mcgibbon If you go this way, it is better to change the preprocessor. I was thinking for a while to make the preprocessor puggable (or at least editable). Then you will be able to do something like:

unit_registry = UnitRegistry() 
unit_registry.pre_processor.append('%', 'percent')

@crusaderky
Copy link
Contributor

A more thorough temporary hack, which works with ureg("5%"), ureg.Quantity("5%") and ureg.Unit("%"):

    def __call__(self, input_string, **kwargs):
        """Hack around `pint#429 <https://github.com/hgrecco/pint/issues/429>`_
        to support % sign
        """
        return super().__call__(_fix_percent(input_string), **kwargs)

    def parse_expression(self, input_string, *args, **kwargs):
        """Allow % sign
        """
        return super().parse_expression(_fix_percent(input_string), *args, **kwargs)

    def parse_units(self, input_string, *args, **kwargs):
        """Allow % sign
        """
        return super().parse_units(_fix_percent(input_string), *args, **kwargs)


def _fix_percent(x: str) -> str:
    return x.replace("%%", " per_mille ").replace("%", " percent ")

@jthielen
Copy link
Contributor

jthielen commented Sep 4, 2019

I have a draft implementation of an extensible string preprocessor here: https://gist.github.com/jthielen/57026fee4dc6599a3fb646000b8ca748

But, I do have a question I wanted to ask before getting it ready for a PR: should the preprocessor be extensible on a global level or a registry level (or something else)?

While extensibility on the registry level (i.e., the preprocessor instance being a property of the BaseRegistry class) seems like the cleanest solution from a user point-of-view, given how widely used the current string_preprocessor() function is throughout in the library in a non-registry-dependent way (e.g., through ParserHelper and UnitsContainer), this would likely require some significant internal API changes. Is this desired, or should I go with the easier solution of letting the preprocessor instance be set in util.py directly in place of the current string_preprocessor()? In this case, the preprocessor instance can still be accessible on the registry, it would just end up being a single instance shared throughout the entire library.

@jthielen
Copy link
Contributor

Just bumping my previous question: would anyone have input on the extensible preprocessor being at the global level or the registry level (the latter of which would require several internal API changes)?

Also, I'm not sure if the changes in #880 impact this decision at all.

@crusaderky
Copy link
Contributor

crusaderky commented Sep 22, 2019

@jthielen #880 is not related to this.

I don't see a big problem in doing it per-registry (which is cleaner)? You'd just need to add an instance variable that contains a list of callables.

class UnitRegistry:
    def __init__(...):
        self.preprocessors = []

    def __call__(self, input_string, **kwargs):
        for p in self.preprocessors:
            input_string = p(input_string)
        ...

    def parse_expression(self, input_string, *args, **kwargs):
        for p in self.preprocessors:
            input_string = p(input_string)
        ...

    def parse_units(self, input_string, *args, **kwargs):
        for p in self.preprocessors:
            input_string = p(input_string)
        ...

@jthielen
Copy link
Contributor

jthielen commented Sep 22, 2019

@crusaderky Thank you for weighing on #880, and I apologize for not clarifying the connection I had in mind. Having a global default Registry would seem to make the option of a global preprocessor that isn't extensible per-registry, and yet still accessible on a registry basis, more confusing for end users (making that option less desirable). However, upon further thought, this is likely a minor concern, and it wouldn't even matter if the per-registry preprocessor is the preferred option.

While I wish the implementation would be as simple as just adding additional preprocessing of the input strings in parse_units and parse_expressions like you mention, I unfortunately don't think that would be sufficient. The current string preprocessor is relied upon by ParserHelper.from_string, which in turn makes it used in several non-registry-dependent ways. For example, it is used when comparing strings against a UnitsContainer for equality, or in Context.from_lines(). Based on some experimentation with the test suite, skipping preprocessing in ParserHelper.from_string (and instead relying on parse_expression or parse_units to do so) breaks pint in some pretty severe ways right now. This is why I think this option would require some degree of internal refactoring, either to pass the preprocessor instance to every UnitsContainer and ParserHelper, or modify every usage of ParserHelper.from_string to always have its inputs preprocessed beforehand (on the registry level). The Context and Definition classes would also likely gain preprocessor attributes to make this work.

I agree that the per-registry implementation is cleaner from the end-user perspective, so the internal changes are likely worthwhile. I just picture it being a fair bit of work, so I wanted to verify with others that it is indeed worthwhile before I plunged ahead with it.

@jthielen
Copy link
Contributor

After grappling with this for a while, I think that, unless I'm missing something, the changes required to have a single, per-registry, extensible preprocessor may be a bit too onerous. The utilization of the current string_preprocessor() is just entrenched too firmly to make this something I feel can do in the time I have available in the near future to devote to it.

So, based on past feedback that a per-registry implementation is preferred, I think we are left with @crusaderky's implementation in #429 (comment)? While there are the downsides of having two separate preprocessing steps (user-defined and string_preprocessor()) and user-defined preprocessing only occurring through parse_expression and parse_units (not in contexts or definitions), I think it would capture most use cases, and would still be a substantial improvement over not having any options for user-defined preprocessing.

Does anybody see any problems with moving forward in this way?

@hgrecco
Copy link
Owner

hgrecco commented Dec 3, 2019

Please submit a PR. I agree with your @crusaderky suggestion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants