New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shapes: do not read_back entire shape to get aliases uids #13001
Conversation
typing/shape_reduce.ml
Outdated
(* When interested only of in the uids of aliased modules we do not read_back | ||
the entire shape of the module, just enough to unroll the chain of aliases. | ||
*) | ||
let read_back_aliases_uids env (nf : nf) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling is that the code is correct, the comment is somewhat redundant (I think it is clear from the code), and the naming is slightly wrong, as you are not reading back (to generate a term). I would call this reduce_aliases_for_uid
for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning for the name was that functions of the "reduce" take a shape as an input while "read_back" functions take a normal form as input. I am ok with both and change it in 4c3f43f
@gasche I applied your type-change suggestion in 83108d3. While adding new tests to show that we can return aliases to approximate modules I noticed that we didn't mark the resulting shape as approximated for values coming from a first-class-module. I did that in 19a778f. The reduction is "successful", but there is no uid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks nicer. See two questions inline.
b9ba1b9
to
a85ff1c
Compare
@gasche I removed the fix for approximation. I think we do want to have the Right now, when looking for the uid of Reducing the right hand-side might result in an approximated shape, but this is not something we care about when reducing the shape for the uid of M. |
My current (and ever-changing) understanding is that there are two different needs:
Currently we are trying to use a single For (1), I wonder why we are not "just" using the For (2), I think that we should use a representation that allows on-demand computation, for example:
|
a85ff1c
to
c2fc14a
Compare
I don't think that this is correct. In (1) reducing the shapes to build the
Apart from the fact that, right now, tools are only interested in the result uid,
That was with the "original" shapes stored in the typing environment. |
@@ -18,10 +18,10 @@ | |||
(** The result of reducing a shape and looking for its uid *) | |||
type result = | |||
| Resolved of Shape.Uid.t (** Shape reduction succeeded and a uid was found *) | |||
| Resolved_alias of Shape.Uid.t list (** Reduction led to an alias chain *) | |||
| Resolved_alias of Shape.Uid.t * result (** Reduction led to an alias *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible I would rather avoid changing the interface for OCaml 5.2.0, but I am not sure if we are still aiming for 5.2.0 for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible I would rather avoid changing the interface for OCaml 5.2.0
Is it because of the already released alpha and beta versions with bumped magic numbers ? Since the result
type was actually introduced in 5.2 itself, it seems like it could be better to change it right now...
I can also easily rollback to the first iteration on this PR that fixes the performance issue without changing the type if that is required.
I am not sure if we are still aiming for 5.2.0 for this PR
The discussion diverged to other unclear parts of the shape reduction, but I think this PR's original scope is smaller than that and mostly uncontroversial: removing extraneous work done in some cases, with an agreed-on fix (stepped read-back). We do need to re-think shape reduction and it's handling of uids at some point, but that's out-of-scope for that pr. What do you think @gasche ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is more a question of API stability for shape clients. In particular, this change would require a new beta and a patch to odoc. This is still ok-ish, if the PR converges this week, but this is starting to get late in the release cycle for this kind of changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand, thanks for the details, I can rollback the API changes if it make things simpler. Now that you mention odoc, it reminds me that the performance fix might actually be important for their usage of the shapes. (We discussed it a few weeks ago with @panglesd and @Julow, they perform similar actions as we do in Merlin to identify definitions' uids.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
odoc
would indeed would benefit from the improvements of this PR! (The "render source code" feature of odoc
is probably very inefficient by itself, but we will work on it when it is usable by drivers.)
Regarding API changes, we haven't yet released a 5.2 compatible version of odoc, so I think that for us it is still fine to include it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's have a beta 2 and I will amend my patch for odoc for 5.2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good, and will avoid useless reductions indeed.
I don't have a strong opinion about the API changes; the new API seems better than the previous one, however. In any case, I'd say that at least the basic version of this PR should be merged for 5.2, in order to have the performance improvements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am green-ticking on behalf of @Ekdohibs
I won't have the time to look at this again in the next 10 days, but I trust @Ekdohibs' review and would be happy to approve on her behalf. @Octachron, can you say "yes" on improving the interface? |
Changes
Outdated
@@ -118,6 +118,10 @@ _______________ | |||
- #12959, #13055: Avoid an internal error on recursive module type inconsistency | |||
(Florian Angeletti, review by Jacques Garrigue and Gabriel Scherer) | |||
|
|||
- #13001: do not read_back entire shapes to get aliases' uids when building the | |||
usages index | |||
(Ulysse Gérard, review by Gabriel Scherer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also add Nathanaëlle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And let's move the entry in the 5.2 section too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
And for people reading the conversation in a linear way, yes I agree with the change of API. |
should be marked approximated but are not The tests also illustrate that we can get aliases to an "approximated" module.
2aea01b
to
fe55ae0
Compare
Cherry-picked to 5.2.0 in 89301a2 . |
Fully reducing the shapes of large modules (and all their components) is an expensive process. Merlin, when trying to jump to a module, used to do that to get the uid of the module, and thus it's location. This wasteful reduction slowed the query and we introduced "weak_reduction" which does not reduce a module's components when all we are interested in is the module's uid itself.
In #12508, we redesigned this feature along with @Ekdohibs and @gasche in the form of the
reduce_for_uid
function that doesn't leak incorrect shapes anymore. This function performs weak reduction and returns only the resulting uid. When looking for a module alias it returns the list of the aliases uids and the aliased module uid:reduce_for_uid 'path to A_N'
returns[uid_of_A_N; ...; uid_of_A_1; uid_of_X]
.This information allows Merlin to traverse aliases and jump to the actual definition of X, but keep aliasing information which is useful for other use cases like
occurrences
.To return this list I used a call to
read_back
before looking at the shape description to get the head aliases. This was, of course, a mistake since it will perform full reduction of the module's body. This lead to blow up when indexing large codebases and slower locate queries when jumping to an aliased module.This PR fixes that by only forcing step by step the shape reduction to get the head aliases.
@Octachron I think the performance issue I described should not have a large impact during the partial indexation that we are introducing in 5.2 because it only reduce shapes locally. It would take a very pathological compilation unit for the issue to manifest itself in a significant way. Still, it might be better to have this fix in the 5.2 branch.