Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proj_create_crs_to_crs_from_pj(): add a ONLY_BEST=YES option (fixes #3533) #3535

Merged
merged 11 commits into from
Jan 12, 2023

Conversation

rouault
Copy link
Member

@rouault rouault commented Jan 2, 2023

Also add a --only-best switch to cs2cs

ONLY_BEST=YES/NO: (PROJ >= 9.2)
Can be set to YES to cause PROJ to error out if the best transformation, known of PROJ, and usable by PROJ if all grids known and usable by PROJ were accessible, cannot be used. Best transformation should be understood as the transformation returned by
:cpp:func:proj_get_suggested_operation if all known grids were accessible (either locally or through network).

@rouault rouault added this to the 9.2.0 milestone Jan 2, 2023
@rouault rouault force-pushed the fix_3533 branch 2 times, most recently from 621ce5e to eeeb301 Compare January 2, 2023 18:34
@hobu
Copy link
Contributor

hobu commented Jan 2, 2023

Should we default to YES and fail transforms with an error message telling people to set it to NO or fetch the required grids if they want to go forward?

Behaving this way would satisfy @oleg-alexandrov's complaint. I find it a bit nanny'ing I guess but it would prevent a PROJ-naive user from shooting their foot off unknowingly.

@rouault
Copy link
Member Author

rouault commented Jan 2, 2023

Should we default to YES and fail transforms

I'm afraid this will break a lot of downstream software (GDAL, QGIS, MapServer, etc.)

@snowman2
Copy link
Contributor

snowman2 commented Jan 2, 2023

Should we default to YES and fail transforms

PROJ 10+? Could add a warning message to let users know of the upcoming change in behavior in the current version.

@rouault
Copy link
Member Author

rouault commented Jan 2, 2023

Could add a warning message

do you mean that if ONLY_BEST is not set and the best transformation is not available, we emit a warning message ? That makes sense, except we have no PJ_LOG_WARNING category currently. We could add one, but it should be between PJ_LOG_ERROR = 1 and PJ_LOG_DEBUG = 2 ... Well, we can always allocate a new PJ_LOG_WARNING = 5 and add non pretty logic in pj_log() to take into account that exception.

@snowman2
Copy link
Contributor

snowman2 commented Jan 2, 2023

do you mean that if ONLY_BEST is not set and the best transformation is not available, we emit a warning message ?

Yes, that sounds correct.

@oleg-alexandrov
Copy link
Contributor

Behaving this way would satisfy @oleg-alexandrov's complaint. I find it a bit nanny'ing I guess but it would prevent a PROJ-naive user from shooting their foot off unknowingly.

At least don't put the gun on silencer, will you? :)

@rouault
Copy link
Member Author

rouault commented Jan 2, 2023

Could add a warning message to let users know of the upcoming change in behavior in the current version.

implemented with 2nd and 3rd commit

Copy link
Member

@kbevers kbevers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this an improvement compared to what we have now. I like the ability to fail when grids arent't available and the mechanism for issueing warnings. I have some points for discussion though.

  1. The warning when a grid isn't present says that in PROJ 10 that will be an error. Can that be changed to say that it might be an error in PROJ 10? I'm worried we won't keep the promise. At least I'd like to see how this change works in the real world before committing to that promise. I think this change in behaviour is the right thing to do but I am not sure all possible problems with have been exhausted.

  2. The numerical values of PJ_LOG_LEVEL. What am I missing that makes it necessary to add a new function for properly reading the log level? Is it really that dangerous to change the them? Surely a simpler solution would be to just change the values so they are in order. I'd argue that comparing the loglevel to the numerical values of the enum members is using the API wrong and that instead users should be doing stuff like loglevel < PJ_LOG_DEBUG. I assume that's what most are doing already. If we just change the enum values that'll still work just fine whereas with the implementation in this PR we introduce a small breaking change. I'd like to avoid that.

  3. We have proj.ini, how about giving the option of setting ONLY_BEST here once and for all? I can imagine that being useful for some users of the CLI apps.

usable by PROJ if all grids known and usable by PROJ were accessible,
cannot be used.
Best transformation should be understood as the transformation returned by
:cpp:func:`proj_get_suggested_operation` if all known grids were
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works alright in the HTML docs but is probably not super great in the man page. It's probably more user-friendly to have an explanation here that doesn't involve a reference to a function in the API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mention to proj_get_suggested_operation removed and sentence rephrased as " Best transformation should be understood as the most accurate transformation available among all relevant for the point to transform, and if all known grids required to perform such transformation were accessible (either locally or through network)."

@snowman2
Copy link
Contributor

snowman2 commented Jan 4, 2023

We have proj.ini, how about giving the option of setting ONLY_BEST here once and for all? I can imagine that being useful for some users of the CLI apps.

May not be bad to have an environment variable as well for the default value that can be overridden in the function/CLI.

Something like: PROJ_ONLY_BEST_DEFAULT

@rouault
Copy link
Member Author

rouault commented Jan 4, 2023

Is it really that dangerous to change the them?

That would indeed be simpler, but this is technically a ABI change which requires to bump the SONAME and to rebuild external code against the updated proj.h. Users could expect all the 9.x series to have the same ABI and SONAME (25 currently). That said, we could also decide to still release master as 9.2 but bump the SONAME to allow changing enumeration values. https://semver.org/ doesn't address API vs ABI (semver/semver#590). From an API point of view, changing enumeration values could be considered as a compatible change. But it is definitely not ABI compatible.

If we just change the enum values that'll still work just fine whereas with the implementation in this PR we introduce a small breaking change

Strictly speaking, that's not really breaking. If people use loglevel < PJ_LOG_DEBUG, they just won't see PJ_LOG_WARNING events, so this doesn't break anything as we didn't emit messages at such level before this PR.

@kbevers
Copy link
Member

kbevers commented Jan 4, 2023

this is technically a ABI change

This discussion on SO argues that adding members to the enum is also breaking the ABI. At least when functions return members of that enum. Which we do. I think that's taking it a bit to far though, functions should be allowed to return new things without the need for a bump of the SONAME...

Strictly speaking, that's not really breaking. If people use loglevel < PJ_LOG_DEBUG, they just won't see PJ_LOG_WARNING events, so this doesn't break anything as we didn't emit messages at such level before this PR.

True. If the values are changed so they are in order the warnings will appear by default in the above case which is probably not a bad thing.

I hope others will chime in on this. I don't think it's a clear cut case that this is an ABI change but I am far from an expert in this.

In case there's no way around the current implementation we need an issue regarding the changes to be applied in 10.0. It would probably be smart to leave gaps between the enum values so there's room to fit potential new members in between existing ones without it being a hassle.

@oleg-alexandrov
Copy link
Contributor

If the ABI will at some point break anyway, one could as well switch from using an enum to an int, then the existing enum values could be tabulated as ints with set values. Then each time one adds a new value one need not break the ABI, but may need to update some validation code maybe. (I guess I am speaking out of ignorance here, but looks like enums are too rigid.)

@rouault
Copy link
Member Author

rouault commented Jan 4, 2023

I don't think it's a clear cut case that this is an ABI change

To summarize my views:

  • Adding a new value to an enumeration: not really an ABI change / grey area, doesn't require SONAME bump (we have for example extended recently the PJ_TYPE enumeration to add PJ_TYPE_DERIVED_PROJECTED_CRS)
  • Changing existing values of an enumeration: true ABI change, requires SONAME bump

(I guess I am speaking out of ignorance here, but looks like enums are too rigid.)

The issue here is not really about enum vs int types, but rather than we'd want to insert a new constant between 2 existing ones, and there's no room to do that without renumbering existing constants.

@snowman2
Copy link
Contributor

snowman2 commented Jan 7, 2023

Here is a possible strategy to consider:

  • version 9.2: add only best option with warning message in the PJ_LOG_DEBUG level
  • version 10: introduce PJ_LOG_WARNING with correct order and update message to use that level
  • version 11: make only best the default

This strategy should give downstream programs plenty of warning and opportunity for them to try out the new behavior and give feedback. It also will remove the need for the new function and keep the versioning happy.

@kbevers
Copy link
Member

kbevers commented Jan 9, 2023

Here is a possible strategy to consider

I think this strategy is good, although I think it will be a very long time until we hit 11. Does this change in behaviour necessarily require a bump in major version? It's not changing the API. If we advertise in warnings that a change is made with, let's say, 10.5 everyone have had reasonable time to adjust. @rouault has GDAL done something similar in the past? I have a feeling that it might.

@rouault
Copy link
Member Author

rouault commented Jan 9, 2023

Does this change in behaviour necessarily require a bump in major version?

Each GDAL feature release (every 6 months) breaks the (C++) ABI due to the addition of new virtual methods (generally not the C ABI though, except point discussed below), so we bump the SONAME and that requires rebuilding reverse dependencies. People apparently survive to this (/me looking at the stack of dead bodies surrounding me :-))

has GDAL done something similar in the past?

we have regularly situations where we introduce new values in enumerations (in C enumerations), that might require little (or not so little) adjustments in user code to deal with the new returned values.

@kbevers
Copy link
Member

kbevers commented Jan 9, 2023

we have regularly situations where we introduce new values in enumerations (in C enumerations), that might require little (or not so little) adjustments in user code to deal with the new returned values

Question was more about change in behaviour that isn't affecting API or ABI. Let's say a change of default compression algoritm or something like that. With the ONLY_BEST option we are implementing a new strategy for selection transformations, not changing the API/ABI.

@rouault
Copy link
Member Author

rouault commented Jan 9, 2023

Let's say a change of default compression algoritm or something like that.

Every GDAL release does such kind of changes without prior notice. But generally not for things like defaulting to ONLY_BEST which is a real breaking change, whish we shouldn't definitely do it before a PROJ 10 or 11. And I agree with your previous remark that a warning message should be less imperative than its current state, as we might decide not to default to it after all.

not changing the API/ABI.

ah, I was concentrated on the PJ_LOG_WARNING issue for which I don't see a clear consensus on what to do.

@snowman2
Copy link
Contributor

the PJ_LOG_WARNING issue for which I don't see a clear consensus on what to do.

I am thinking it may not be bad to separate ONLY_BEST from PJ_LOG_WARNING into separate PRs as there seems to be agreement around the ONLY_BEST functionality. In the ONLY_BEST PR, using PJ_LOG_DEBUG can be sufficient and then the PJ_LOG_WARNING PR can be discussed separately.

@rouault
Copy link
Member Author

rouault commented Jan 10, 2023

3. We have proj.ini, how about giving the option of setting ONLY_BEST here once and for all? I can imagine that being useful for some users of the CLI apps.

That could make sense for a verbose warning, but given for now it is just a one time debug message, there's no pressing for this. This might be useful if/when ONLY_BEST=YES becomes the default

Besides that, I've removed the PJ_LOG_WARNING level and now the message is now emitted with _DEBUG level, so all potentially controversial/tricky aspects have gone.

@rouault
Copy link
Member Author

rouault commented Jan 10, 2023

  1. The warning when a grid isn't present says that in PROJ 10 that will be an error. Can that be changed to say that it might be an error in PROJ 10?

Reworded as "This might become an error in a future PROJ major release"

@snowman2
Copy link
Contributor

Thoughts about adding this link to the warning message: https://proj.org/resource_files.html ?

@kbevers
Copy link
Member

kbevers commented Jan 11, 2023

That could make sense for a verbose warning, but given for now it is just a one time debug message, there's no pressing for this. This might be useful if/when ONLY_BEST=YES becomes the default

This wasn't about warnings. The idea was to give users the option of setting the ONLY_BEST option once and for all. I think it's one of those things you either want or not, so why not have it be a global setting in a config file? You get the behaviour you ask for in proj.ini. Just an idea, it might not be practical in the real world where applications might want to override what's in proj.ini.

…ctly propagate errorIfBestTransformationNotAvailable to alternate operations
@rouault
Copy link
Member Author

rouault commented Jan 11, 2023

last 2 commits:

  • Add link to PROJ doc when ONLY_BEST warning/error triggers, and correctly propagate errorIfBestTransformationNotAvailable to alternate operations
  • Add PROJ_ONLY_BEST_DEFAULT environment variable, and 'only_best_default' proj.ini setting

Copy link
Member

@kbevers kbevers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor details but otherwise I think this looks good.

Thinking a bit ahead (trying to help future me), what do we need to put into the release notes about this change? Obviously mention that there's a new option but in terms of a potential change in behaviour in the future it might also be good to put an early warning out there.

docs/source/apps/cs2cs.rst Outdated Show resolved Hide resolved
docs/source/resource_files.rst Outdated Show resolved Hide resolved
rouault and others added 2 commits January 12, 2023 15:12
src/4D_api.cpp Outdated
msg += " This might become an error in a future PROJ major release. "
"Set the ONLY_BEST option to YES or NO. "
"This warning will no longer be emitted (for the current context).";
P->ctx->warnIfBestTransformationNotAvailable = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking that this warning may be useful to see in different operations. Thoughts about only showing this error once for each PJ* instead of once per PJ_CONTEXT*?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts about only showing this error once for each PJ* instead of once per PJ_CONTEXT*?

I'm slightly concerned, without numeric evidence, that there might be performance issues in situations where people use suboptimal transformations in an intensive way, while being happy with them. Logs might be overwhelmed with the warning. Once per context IMHO has the benefit of making attentive observers of logs that they might sub-optimaly use PROJ, while avoiding causing perf regressions.

Copy link
Contributor

@snowman2 snowman2 Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see that being an issue if we issued a warning every time. But, once per PJ* seems fairly reasonable to me as the same PJ* can be re-used. Additionally, if they are happy with the current mode, they have the option to set ONLY_BEST=OFF mode to improve performance.

However, your concerns are valid. If you want to do it once per PJ_CONTEXT*, I would remove the specific grids from the warning message to make a generic message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh well, I've added a commit to move the warning at the PJ* level rather than a the PJ_CONTEXT* one. We have some time before the next release to see how it flies...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 👍

src/4D_api.cpp Outdated
@@ -2069,6 +2069,7 @@ PJ *proj_create_crs_to_crs_from_pj (PJ_CONTEXT *ctx, const PJ *source_crs, cons
}
}

P->over = forceOver;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To simplify finding future issues with git bisect, thoughts about moving the force over fix to a separate PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts about moving the force over fix to a separate PR?

it is needed in this PR, otherwise tests in gie_self_test break (or I would have to create a PR with this fix only, and rebase that one on top of it. Not sure if that changes much about bisection processes? at least this is a separate commit). To be honest, I don't fully understand why my changes have some interaction with the forceOver stuff. Well probably that the fact that now we look up transformations we cannot instantiate (to be able to warn about their lack) has revealed an underlying bug in the forceOver stuff that didn't pop up before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping it as a separate commit works for me.

data/proj.ini Outdated
; passed to the proj_create_crs_to_crs() method, or with the --only-best
; option of the cs2cs program.
; (added in PROJ 9.2)
only_best_default = off
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to have this unset? The reason is for warnings. If this is set to ON/OFF, the warnings should be disabled similar to the ONLY_BEST option passed into proj_create_crs_to_crs_from_pj, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fixed with latest commit (gizz, all those accumulation of settings gets crazy complicated)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(gizz, all those accumulation of settings gets crazy complicated)

Perhaps a sign that some refactoring is needed in the area? There must be a smart way to do it without plastering the whole code-base with tripwires

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good if all problems had simple solutions :-)
To be honest, we're reaching a point where I'm losing interest spending more time working in this pull request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, we're reaching a point where I'm losing interest spending more time working in this pull request.

Hey now, I was not suggesting that you include a major refactoring in this PR. If this is a pain point during development it is worth considering a solution to the problem. It should obviously be done in a separate issue but I don't want to open the discussion unless there's an actual problem. Hence my question.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only axis of simplification I can imagine would probably be to merge the 2 warn and error boolean variables in a single one with 3 states: no warning (only_best=off), warning (no explicit setting), error (only_best=on).
There isn't nothing that is particularly difficult in the code. There are places in PROJ that are > 10x times harder to make sense of ( CoordinateOperationFactory::Private::createOperationsBoundToGeog() probably holds the record... Or perhaps CoordinateOperationFactory::Private::createOperationsCompoundToGeog():-) )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'm glad we've de-escalated the situation from "crazy complicated" to "nothing .. particularly difficult" :-) My spidey-sense was tingling at the crazy complicated part - that sort of wording usually is an early warning that refactoring is in order. We don't always notice when knee-deep in the code so I just wanted to check whether this was an actual problem or not

@rouault
Copy link
Member Author

rouault commented Jan 12, 2023

ok, everything (but readthedocs) green, merging

@rouault rouault merged commit b9bad5a into OSGeo:master Jan 12, 2023
@snowman2
Copy link
Contributor

Thanks @rouault 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants