Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #274 Make all Variable subclasses name valid python identifiers #307

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

quaquel
Copy link
Owner

@quaquel quaquel commented Nov 1, 2023

No description provided.

…dentifier

First step for solving #274. This covers all outcome and parameter classes.

Still remaining are constants and potentially all Point subclasses.
from NamedObject to Variable, which means Constant name now  also must be a valid python identifier
@quaquel quaquel requested a review from EwoutH November 1, 2023 08:34
@quaquel quaquel added the bug label Nov 1, 2023
@quaquel quaquel added this to the 2.5.0 milestone Nov 1, 2023
@quaquel
Copy link
Owner Author

quaquel commented Nov 1, 2023

As expected, this brakes a lot of stuff. I will start fixing all errors this creates over the coming days.

@EwoutH
Copy link
Collaborator

EwoutH commented Nov 2, 2023

Right, I just now understand the implications of this, after seeing all the examples with names with spaces in them which now aren't allowed.

So this is definitely a 3.0 change, and we should add deprecation warnings in the next release (2.5).

Since the scope in which backwards compatibility is broken is huge, I'm also not sure if we really should do this for all internal variable classes.

@coveralls
Copy link

coveralls commented Nov 15, 2023

Coverage Status

coverage: 79.638% (-0.7%) from 80.335%
when pulling d8b9b08 on valid_identifiers
into f3c9674 on master.

@quaquel
Copy link
Owner Author

quaquel commented Nov 15, 2023

All tests now pass, so this is ready for review.

We have to decide also on how to handle 2.5 and 3. I agree that it would be good to keep this for a 3.0 release, but what to still do for 2.5?

@EwoutH
Copy link
Collaborator

EwoutH commented Nov 15, 2023

For 2.5, I would say replace all Error with DeprecationWarning, including that we’re dropping it in 3.0.

I will try to review asap.

@quaquel
Copy link
Owner Author

quaquel commented Nov 15, 2023

Ok, so we need a 3.0 branch into which this can go.

I'll probably make a separate pull request for the deprecation warnings

@EwoutH
Copy link
Collaborator

EwoutH commented Nov 15, 2023

I think we can leave this PR open, merge the warnings in a separate PR, release 2.5, and then directly after that merge this PR to master.

If we need to do 2.5.x bugfixes we can do those on a maintenance branch.

@EwoutH
Copy link
Collaborator

EwoutH commented Dec 13, 2023

I was thinking, can we provide a minimal, best-effort attempt to convert invalid identifiers to valid ones? A very simple option would be to convert spaces to underscores. Or, we could just convert any invalid character to an underscore.

import re

def to_valid_identifier(s):
    # Replace invalid characters with an underscore
    s = re.sub(r'[^0-9a-zA-Z_]', '_', s)

    # Ensure the first character is not a digit
    if s[0].isdigit():
        s = "_" + s

    return s

I think in that case we still want to throw an warning, like:

UserWarning: f"Invalid characters found in {variable_name}. Converting all characters to underscores: {converted_variable_name} is now a valid Python identifier."

@quaquel
Copy link
Owner Author

quaquel commented Dec 13, 2023

I am hesitant about this: when in doubt, refuse the temptation to guess.

Also, by default parameter names map to something in the underlying model. So if you change the name, you also need to set the variable_name kwarg to the original name.

@EwoutH
Copy link
Collaborator

EwoutH commented Dec 13, 2023

Yeah I understand. How would it be as an option that's off by default?

I'm just trying to think about ways for users with huge Vensim models to not require major modifications.

Otherwise we could suggest a valid name in the error message:

if not name.isidentifier():
    raise ValueError(f"'{name}' is not a valid Python identifier. Changing it to '{to_valid_identifier(name)'} would make it one.")

@quaquel
Copy link
Owner Author

quaquel commented Dec 15, 2023

It would be nice to have a convenient way of dealing with this, but I need to figure out how to go about it.

The idea of automagically changing parameter names violates a classic Python principle. So, for me, that is a no-go.

An alternative would be adding a helper function to 2.5 to convert models to the new behavior. This helper function would try to fix all parameters and spit out not just a fixed model but also the revised code so you can replace the existing uncertainty/constant/outcome definitions. The problem is that this function is easy for 95% of all use cases: take the parameter name, split it on whitespace, and join it with an underscore. Set the parameter name to this new name and set the variable name to the original name. You would print this out with __repr__ on the parameter.

There are, however, several issues. First, variable_name may be already used. This is easy: update the name only. The problem is with the remaining cases where, for example, alternative distributions are being used. Or when some of the other optional kwargs have been used to instantiate the parameter instance. These are currently not well covered by __repr__, so spitting out the new correct code is a problem.

Suggesting correct names in the error message won't help much. The fix will be trivially obvious, but is just painful to do. Copy-pasting between the console and the model definition is even more painful than just spending 10 minutes fixing all names.

@EwoutH
Copy link
Collaborator

EwoutH commented Dec 15, 2023

Good points. I would just add the warnings to 2.5, then users know a change is coming, and then we have some more time to think about how to handle it exactly.

@EwoutH EwoutH modified the milestones: 2.5.0, 3.0 Dec 15, 2023
@quaquel
Copy link
Owner Author

quaquel commented Dec 15, 2023

Having a fix function would need to happen in 2.5 because, from 3 onward, you get a straight-up exception (at least atm).

@EwoutH
Copy link
Collaborator

EwoutH commented Dec 15, 2023

Fair, but I wouldn't let it be a blocker for 2.5.0. We can always backport to a 2.5 release branch and do a 2.5.1 path release with it.

@EwoutH
Copy link
Collaborator

EwoutH commented Mar 27, 2024

With 2.5 released we can merge this to the master branch (targeted for 3.0) right?

Copy link
Collaborator

@EwoutH EwoutH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what everything in here exactly does, but I don't see really weird things.

@quaquel
Copy link
Owner Author

quaquel commented May 24, 2024

I'll fix the conflicts hopefully later today and merge it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants