ridiculously long instance names #2343
Comments
Let's start by reviewing the tried and potential naming schemes (numbering is purely for disambiguation)
It would seem that the Lean 3c approach presented a complex but practical compromise somewhere between approaches 1 and 2. If there were any remaining issues or annoyances with it, it would be great to document them here. But in the end, I really would like to ensure that we never have to think about instance name collisions again. And discouraging named access to unnamed instances, without outright preventing it, also seems reasonable for the sake of robustness. Thus I'm inclined to at least explore the potential design and impact of the latter two approaches instead of designing yet another heuristic that may or may not work out. Comments welcome. |
I would like to pursue option 6. I don't think it is an obstacle that Mathlib currently references unnamed instances by name. No one is happy about this, and it should be a relatively easy refactor to just avoid this. (I would probably do this by locally modifying Lean to break all of the auto-names in any dumb way, and then fix everything again.) |
Ah -- I have encountered a problem with using entirely inaccessible names: sometimes we need to
Possibly the correct response to this is to improve the deriving handler! |
I'm very happy to hear that!
Should inaccessible declarations be linted at all? |
It would really help if there was a reliable |
My main concern with this approach is that we lose the ability to have conversations on Zulip like:
Without having to first make a PR to give the instance an explicit name if it didn't already have one. |
yes, I am concerned that such options will just push us to never use anonymous instances at all in mathlib, which feels like a failure of the naming heuristic. And with things like |
We have plenty of linters for simp lemmas and type class instances that ensure a property of the whole simp set / instance set. Simp will apply lemmas whether they are inaccessible or not, so they need to be linted either way. And tbh this looks like a bug in the derive handler: `attribute [nolint unusedArguments] instDecidableEqHom` |
…6423) Per leanprover/lean4#2343, we are going to need to change the automatic generation of instance names, as they become too long. This PR ensures that everywhere in Mathlib that refers to an instance by name, that name is given explicitly, rather than being automatically generated. There are four exceptions, which are now commented, with links to leanprover/lean4#2343. This was implemented by running Mathlib against a modified Lean that appended `_ᾰ` to all automatically generated names, and fixing everything. Co-authored-by: Scott Morrison <scott.morrison@gmail.com>
Isn't the correct answer "Yes, the typeclass inference system knows this, it should just work"? |
No, because that doesn't tell them "it's only true under assumptions XYZ" |
OK then I vote for making the PR to give the instance an explicit name if it didn't already have one, because I suspect that this is a pretty marginal concern in practice. It is also a step forward from "Yes, it's Is it clear that one would not be able to link to an inaccessible term? Could the answer be "yes it's |
In case you missed it, @digama0: leanprover-community/mathlib4#6423 (now merged) seems like a big step along this path to never using anonymous instances. |
I don't know (@hargoniX?) but it would be good to ensure doc-gen4 can reasonably handle them before touching the naming scheme |
Noting that my PR leanprover-community/mathlib4#6423 was not intended as a vote for "Mathlib should never ever use anonymous instance". Instead it was to decouple Mathlib from the current (fairly obviously unsatisfactory) scheme, so that we have room to change the current scheme without having to fix a million regressions in Mathlib. If we can come up with a good replacement scheme, and implement it, then of course it's great if Mathlib is happy to go back to relying on the automatically generated names! |
What about choosing a name prefix based on option 2 (linear in the input size) then a short hygiened (i.e. inaccessible) name such as a hash (like the short git commit id) of the name scheme from option 1 as postfix? e.g. This way one can still talk about |
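A minimal sketch of the prefix-plus-hash idea, under loud assumptions: `shortHash` and `proposedName` are hypothetical helpers, Lean's built-in `hash` on `String` stands in for a git-style hash, and the result is an ordinary string, so the hygiene (inaccessibility) part of the proposal is not modelled at all.

```lean
/-- Hypothetical helper: the first `digits` hexadecimal digits of a hash of `s`. -/
def shortHash (s : String) (digits : Nat := 7) : String :=
  -- `hash` is Lean's built-in `Hashable` hash (a `UInt64`), not a git-style hash.
  let hex : List Char := Nat.toDigits 16 (hash s).toNat
  (hex.take digits).foldl (fun acc c => acc.push c) ""

/-- Hypothetical helper: a short readable prefix followed by a hash of the long name. -/
def proposedName (shortPrefix fullName : String) : String :=
  shortPrefix ++ "_" ++ shortHash fullName

-- Both arguments are made-up inputs for illustration.
#eval proposedName "instNormedSpace" "instNormedSpaceContinuousLinearMapToSemiring"
```

The readable prefix keeps the name discussable, while the suffix is what would guard against collisions; how short the suffix can safely be is left open here.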
doc-gen has a custom filter for declarations it is not supposed to render, as long as this filter is made not to trigger on auto-named instances they will keep showing up just fine. |
@kmill what is the situation with #3089? Is it still a draft? Someone suggested to me that we write a linter which checks whether an auto-generated instance name is too long and, if so, fails CI so that the user is forced to supply a more sensible name. This would be another workaround for this issue, but I'd rather see a PR merged which fixes this high-priority mathlib issue properly. It's embarrassing to see members of our community answering questions on the Zulip with links to explicit declarations in the docs which are 1000 characters long. |
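For what the length-check workaround might look like, here is a rough sketch rather than an actual Mathlib linter. The names `nameTooLong` and `#count_long_names` are hypothetical, the 100-character limit is arbitrary, and the command only counts and reports; a real linter would restrict itself to auto-named instances and fail CI instead of logging.

```lean
import Lean
open Lean Elab Command

/-- Hypothetical check: `true` when the printed name exceeds `limit` characters. -/
def nameTooLong (n : Name) (limit : Nat := 100) : Bool :=
  n.toString.length > limit

/-- Hypothetical command: count declarations whose names exceed the given length.
It looks at every constant in the environment, not only at instances. -/
elab "#count_long_names " n:num : command => do
  let limit := n.getNat
  let env ← getEnv
  let count := env.constants.fold (init := (0 : Nat)) fun acc name _ =>
    if nameTooLong name limit then acc + 1 else acc
  logInfo m!"{count} declarations have names longer than {limit} characters"

#count_long_names 100
```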
Implements a new method to generate instance names for anonymous instances that uses a heuristic that tends to produce shorter names. A design goal is to make them relatively unique within projects and definitely unique across projects, while also using accessible names so that they can be referred to as needed, both in Lean code and in discussions.

The new method also takes into account binders provided to the instance, and it adds project-based suffixes. Despite this, the median new name is 73% of its original auto-generated length. (Compare: [old generated names](https://gist.github.com/kmill/b72bb43f5b01dafef41eb1d2e57a8237) and [new generated names](https://gist.github.com/kmill/393acc82e7a8d67fc7387829f4ed547e).)

Some notes:

* The naming is sensitive to what is explicitly provided as a binder vs what is provided via a `variable`. It does not make use of `variable`s since, when names are generated, it is not yet known which variables are used in the body of the instance.
* If the instance name refers to declarations in the current "project" (given by the root module), then it does not add a suffix. Otherwise, it adds the project name as a suffix to protect against cross-project collisions.
* `set_option trace.Elab.instance.mkInstanceName true` can be used to see what name the auto-generator would give, even if the instance already has an explicit name.

There were a number of instances that were referred to explicitly in meta code, and these have been given explicit names.

Removes the unused `Lean.Elab.mkFreshInstanceName` along with the Command state's `nextInstIdx`.

Fixes #2343
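The trace option mentioned in the notes could be used roughly as follows. This is a sketch that assumes a Lean version which includes this change (older versions will reject the option as unknown); the `Vec2` structure and the name `Vec2.instAdd` are made up for illustration.

```lean
-- Hypothetical structure, used only for illustration.
structure Vec2 where
  x : Int
  y : Int

-- Anonymous instance: the trace reports the name the generator chooses.
set_option trace.Elab.instance.mkInstanceName true in
instance : Neg Vec2 where
  neg v := ⟨-v.x, -v.y⟩

-- Explicitly named instance: per the notes above, the trace still reports
-- what the generator would have produced, while `Vec2.instAdd` is the name used.
set_option trace.Elab.instance.mkInstanceName true in
instance Vec2.instAdd : Add Vec2 where
  add v w := ⟨v.x + w.x, v.y + w.y⟩

-- The explicit name can be referenced regardless of the naming scheme.
#check @Vec2.instAdd
```

Giving an explicit name is the approach that keeps code stable across changes to the generator, which is why instances referred to in meta code were named explicitly.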
For the record, here's an explanation of the scheme that I used in #3089:
|
Prerequisites
Description
Lean 4's default instance naming algorithm, combined with mathlib's rich mathematics hierarchies, adds up to auto-generated instance names which are 2500 characters or more.
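As a small illustration of where the length comes from: an instance declared without a name gets one generated from its type, so richer types give longer names. The sketch below uses a hypothetical `Pt` structure (not from Mathlib). The generated names depend on the Lean version, so no expected output is shown.

```lean
-- Hypothetical structure, used only for illustration.
structure Pt where
  x : Nat
  y : Nat

-- No name is given, so Lean generates one from the instance's type.
instance : Add Pt where
  add p q := ⟨p.x + q.x, p.y + q.y⟩

-- A second anonymous instance whose type mentions more structure.
instance : Add (Option Pt) where
  add a b :=
    match a, b with
    | some p, some q => some (p + q)
    | _, _ => none

-- `#synth` prints the instance that typeclass resolution finds,
-- which reveals the auto-generated name.
#synth Add Pt
#synth Add (Option Pt)
```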
Steps to Reproduce
ContinuousLinearMap.instNormedSpaceContinuousLinearMapToSemiringToDivisionSemiringToSemifieldToFieldToNormedFieldIdToNonAssocSemiringContinuousLinearMapToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToAddCommMonoidToAddCommGroupToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToAddCommMonoidToAddCommGroupToModuleToModuleTopologicalSpaceToTopologicalAddGroupAddCommMonoidToContinuousAddToAddGroupToSeminormedAddGroupContinuousLinearMapToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToAddCommMonoidToAddCommGroupToModuleTopologicalSpaceToTopologicalAddGroupAddCommMonoidToContinuousAddToAddGroupToSeminormedAddGroupModuleSmulCommClass_selfToCommMonoidToCommRingToEuclideanDomainToMulActionToMonoidWithZeroToZeroToNegZeroClassToSubNegZeroMonoidToSubtractionMonoidToDivisionAddCommMonoidToMulActionWithZeroContinuousConstSMulToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToSeminormedRingToSeminormedCommRingToNormedCommRingToSMulToZeroToAddMonoidToSMulZeroClassToZeroToSMulWithZeroContinuousSMulToZeroToCommMonoidWithZeroToCommGroupWithZeroBoundedSMulModuleSmulCommClass_selfToMulActionToZeroToNegZeroClassToSubNegZeroMonoidToSubtractionMonoidToDivisionAddCommMonoidToMulActionWithZeroContinuousConstSMulToSMulToZeroToAddMonoidToSMulZeroClassToSMulWithZeroContinuousSMulBoundedSMulInstSeminormedAddCommGroupContinuousLinearMapToSemiringToDivisionSemiringToSemifieldToFieldToNormedFieldIdToNonAssocSemiringContinuousLinearMapToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToAddCommMonoidToAddCommGroupToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToAddCommMonoidToAddCommGroupToModuleToModuleTopologicalSpaceToTopologicalAddGroupAddCommMonoidToContinuousAddToAddGroupToSeminormedAddGroupContinuousLinearMapToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToAddCommMonoidToAddCommGroupToModuleTopologicalSpaceToTopologicalAddGroupAddCommMonoidToContinuousAddToAddGroupToSeminormedAddGroupModuleSmulCommClass_selfToCommMonoidToCommRingToEuclideanDomainToMulActionToMonoidWithZero
ToZeroToNegZeroClassToSubNegZeroMonoidToSubtractionMonoidToDivisionAddCommMonoidToMulActionWithZeroContinuousConstSMulToTopologicalSpaceToUniformSpaceToPseudoMetricSpaceToSeminormedRingToSeminormedCommRingToNormedCommRingToSMulToZeroToAddMonoidToSMulZeroClassToZeroToSMulWithZeroContinuousSMulToZeroToCommMonoidWithZeroToCommGroupWithZeroBoundedSMulModuleSmulCommClass_selfToMulActionToZeroToNegZeroClassToSubNegZeroMonoidToSubtractionMonoidToDivisionAddCommMonoidToMulActionWithZeroContinuousConstSMulToSMulToZeroToAddMonoidToSMulZeroClassToSMulWithZeroContinuousSMulBoundedSMul
Versions
e.g.
Lean (version 4.0.0-nightly-2023-06-20, commit a44dd71ad62a, Release)
(OS independent)
Additional Information
What prompted me to open this issue was the fact that these instance names now pollute search; any reasonably small string of reasonable characters is a substring of the above instance name. Others are worried about blow-up continuing and causing weird problems down the line. Another observation is that this causes problems with the docs: the relevant link for this instance doesn't allow you to look at the equations for the instance -- you get the message
One or more equations did not get rendered due to their size.
In community Lean 3 there was a more reasonable algorithm, which unfortunately I do not know the details of, although I do remember that it was changed from the Lean 3.4.2 algorithm (which was sometimes returning instance names which were too short :-) in the sense that they were missing relevant information or in the wrong namespace or something). It would be great to see this more controlled algorithm in Lean 4 before the exponential growth of auto-generated instance names starts to cause weird problems.