Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support "exception" under or-patterns (PR#6422) #305

Merged
merged 3 commits into from Dec 9, 2015

Conversation

Projects
None yet
9 participants
@trefis
Copy link
Contributor

commented Nov 23, 2015

Cf. http://caml.inria.fr/mantis/view.php?id=6422

The implementation

Texp_match was modified to merge the value and exception cases, the distinction doesn't make sense if we allow exception both exception and value patterns in the same branch.
Tpat_exception was introduced to allow just that.

The splitting is done later, in Translcore.transl_match, to be able to share the body of branches with both value and exception patterns.
That code should be fairly straightforward.

Slight correction on what was just said: the splitting does still happen (temporarily) in typecore, to be able to run the exhaustivity checks performed by parmatch. This is required because the exhaustivity of value and exception branches should be checked separately.

So where are exception patterns allowed ?

In the same places as before and under or-patterns.
The only other places where an exception pattern might make sense is under Tpat_alias or Tpat_lazy.

The first one is actually problematic, consider:

match ... with
| (SomeConstructor (_, _) | exception SomeException) as x -> ...

which type do you give to x?

As for exceptions under lazy patterns, this might be possible, but can always be added at a later time and is not covered by this pull request.

Last comments

The careful reviewer will notice the apparition of Typedtree.pat_equiv, this was used during developpement to check the correctness of Typedtree.split_pattern; the assertion is still present in Translcore.transl_match but can be removed before merging (along with the pat_equiv function).

I added some comments in parmatch and typecore where I have some doubts about the correctness of my implementation.

@yallop

This comment has been minimized.

Copy link
Member

commented Nov 23, 2015

I'm very pleased to see this proposal.

There's a regression in the behaviour of match patterns without value cases, like this:

match f () with exception Not_found -> ()

This is currently disallowed:

# match f () with exception Not_found -> ();;
Characters 0-41:
  match f () with exception Not_found -> ();;
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: None of the patterns in this 'match' expression match values.

following discussion on Mantis, but with your patch it's accepted.

The first one is actually problematic, consider:

match ... with
| (SomeConstructor (_, _) | exception SomeException) as x -> ...

which type do you give to x?

I don't think we should support aliasing exception patterns in this way.

@trefis trefis force-pushed the trefis:or-exception branch from 181e024 to a5b8943 Nov 23, 2015

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Nov 23, 2015

There's a regression in the behaviour of match patterns without value cases

Ah indeed, I had misread the previous code. Thank you!
I fixed and added a test case.

I don't think we should support aliasing exception patterns in this way.

I agree. (Your comment seem to suggest there is another way though, what would it be?)

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 24, 2015

As a side note, it seems all uses of Typedtree.pat_bound_idents actually only use the first projection of returned tuples. Shouldn't it be simplified accordingly (i.e. do the List.map fst in Typedtree)?

However [inactive] is called only from [fluid] which is called only by
[Translcore.transl_function] where [Tpat_exception] doesn't appear, so it's
not critical.
But I'd be happy if someone could make a decision. *)

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Nov 24, 2015

Contributor

If you're saying that Tpat_exception should not occur here and its not clear what this function should do if it still appeared, you should probably assert false, no?

This comment has been minimized.

Copy link
@trefis

trefis Nov 24, 2015

Author Contributor

Fair point, I will update the commit.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 24, 2015

I only browsed the commit quickly. Is it indeed the case that guards are duplicated (i.e. the code is generated twice for them)?

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Nov 24, 2015

As a side note, it seems all uses of Typedtree.pat_bound_idents actually only use the first projection of returned tuples. Shouldn't it be simplified accordingly (i.e. do the List.map fst in Typedtree)?

Indeed, I didn't check for that.
I could do it in this PR if you want, but I'd also be happy rebasing my PR if the change is done before.

Is it indeed the case that guards are duplicated (i.e. the code is generated twice for them)?

Yes. I might have missed something but I don't know how to handle code such as

exception Found of int

let guarded f =
  match f () with
  | Some i | exception (Found i) when i > 0 -> print_endline "positive!"
  | _ -> print_endline "onoes"

if the guard is shared. Indeed when the condition is not verified the code path differs for the value and exception branches.
The lambda code generated by this PR for the previous example looks like this:

(function f/1209
  (catch
    (catch
      (try (exit -5 (apply f/1209 0a)) with exn/1229
        (catch
          (if (== (field 0 exn/1229) (field 2 (global Test!)))
            (let (i/1210 =a (field 1 exn/1229))
              (if (> i/1210 0) (exit -4 i/1210) (exit 7)))
            (exit 7))
         with (7) (reraise exn/1229)))
     with (-5 val/1228)
      (catch
        (if val/1228
          (let (i/1210 =a (field 0 val/1228))
            (if (> i/1210 0) (exit -4 i/1210) (exit 6)))
          (exit 6))
       with (6) (apply (field 30 (global Pervasives!)) "onoes")))
   with (-4 i/1210)
    (apply (field 30 (global Pervasives!)) "positive")))

@trefis trefis force-pushed the trefis:or-exception branch from a5b8943 to c0400a6 Nov 24, 2015

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 24, 2015

but I'd also be happy rebasing my PR if the change is done before

Committed to trunk.

the code path differs for the value and exception branches

I think @chambart had a proposal to extend "static jumps" in intermediate languages with "first class label parameters" (i.e. you could pass to a jump target a symbolic reference to another target, which it could jump to). This might allow to cover such cases without duplicating the code.

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns), at least for now. If we keep the duplication, I think we need to rename all identifiers, since some optimizations (e.g. in simplif.ml) might assume that all let-bound identifiers in the lambda code are different. (I'm not sure this would break code in practice, but let's be on the safe side.)

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Nov 24, 2015

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns), at least for now. If we keep the duplication, I think we need to rename all identifiers, since some optimizations (e.g. in simplif.ml) might assume that all let-bound identifiers in the lambda code are different.

@lpw25 just proposed another alternative which is to wrap the guard in a function. We suspect this to be easier to implement that the renaming necessary if we keep the code duplicated.
Once flambda is merged it will probably do the inlining (i.e. duplication) for us.

What do you think, should I try this?

PS: I don't like the idea of rejecting guards in this context. It is a minor restriction but just as arbitrary as the one rejecting exceptions under or-patterns and eventually someone is going to complain about it.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 24, 2015

What do you think, should I try this?

People complain about implicit allocations (even though there are already a bunch of them), so this might not be very consensual. Even if flambda could remove the allocation, this won't be guaranteed and this won't be available in bytecode. (For js_of_ocaml, the overhead could be quite big.) I've no strong opinion on it, though.

@trefis trefis force-pushed the trefis:or-exception branch from c0400a6 to baf5f96 Nov 24, 2015

@chambart

This comment has been minimized.

Copy link
Contributor

commented Nov 24, 2015

Duplicating code is something that needs to be done quite carefully. First you effectively need to substitute every variable, but you also must take care of not duplicating any string literal.

exception Exit

let r = ref ""

let guarded f =
  match f () with
  | true | exception Exit when r := "hello"; true -> !r
  | _ -> "other"

let a = guarded (fun () -> raise Exit)
let b = guarded (fun () -> true)
assert(a == b)

This is of course quite an involved example.
Generating a function is probably the easier way of duplicating code, and with flambda you can ensure that a function is effectively inlined, and that the closure allocation is removed.

One other way is to compile the pattern matching in a cps style using catch but I suppose it represent a huge change in the pattern matching compilation. By the way, you already have some way of building some first class label by returning an identifier of the label (encoded as an int) and branching with a swtich rather than an jump.

@hhugo

This comment has been minimized.

Copy link
Contributor

commented Nov 24, 2015

$cat test.ml 
exception String of string
let _ = match input_line stdin with
  | str | exception (String str) -> str
  | exception _ -> ""
$ ocamlc -c test.ml 
>> Fatal error: Bytegen.comp_expr: var val_1210
Fatal error: exception Misc.Fatal_error
Called from unknown location
Called from unknown location
EXIT STATUS 2
@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2015

The lambda code generated by @hhugo example is:

(setglobal Test!
  (let (String/1203 = (makeblock 248 "Test.String" (caml_fresh_oo_id 0)))
    (seq
      (catch
        (catch
          (try
            (exit -2
              (apply (field 65 (global Pervasives!))
                (field 22 (global Pervasives!))))
           with exn/1211
            (if (== (field 0 exn/1211) String/1203)
              (let (str/1206 =a (field 1 exn/1211)) (exit -1 val/1210)) ""))
         with (-2 val/1210) (exit -1 val/1210))
       with (-1 str/1206) str/1206)
      (makeblock 0 String/1203))))

The same code compiled with -g works and produces:

(setglobal Test!
  (let (String/1203 = (makeblock 248 "Test.String" (caml_fresh_oo_id 0)))
    (seq
      (catch
        (catch
          (try
            (exit -2
              (after ../test.ml(2):42-58
                (apply (field 65 (global Pervasives!))
                  (field 22 (global Pervasives!)))))
           with exn/1211
            (let (tag/1212 =a (field 0 exn/1211))
              (if (== tag/1212 String/1203)
                (let (str/1206 =a (field 1 exn/1211)) (exit -1 str/1206))
                (before ../test.ml(4):125-127 ""))))
         with (-2 val/1210) (let (str/1206 =a val/1210) (exit -1 str/1206)))
       with (-1 str/1206) str/1206)
      (makeblock 0 String/1203))))

The problem is very probably related to the reuse of the same identifier, which indeed breaks assumptions in simplif.mf.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2015

By the way, you already have some way of building some first class label by returning an identifier of the label (encoded as an int) and branching with a swtich rather than an jump.

This seems to be an interesting approach, although it probably requires some non-trivial refactoring.
Note that we would need to pass along the exn value and invent a dummy value when the guard is evaluated for non-exceptional cases.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Nov 25, 2015

Thank you!
I actually missed that problem because the only tests I ran with names (instead of wildcards) where from the toplevel, which generates the same lambda code as ocamlc -g...

I added a new commit with the fix and @hhugo 's test. I will squash it with the other commits once it is reviewed and you guys are happy with it (and I'm not sure you will be, I don't know much about that part of the compiler).

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2015

Duplicating affects the sharing of string literals, as @chambart noticed. I don't know how bad this is considered (esp. now that string are mostly immutable, the physical equality on them is no longer specified).

As for the implementation strategy, between duplicating the code (with proper renaming) or introducing a local function (hoping it will be inlined), I think I have a preference for duplicating: since guards are usually very small, this is likely to be ok, and the impact on bytecode and js_of_ocaml (assuming it can deal with this new construction!) will be better.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Nov 25, 2015

I expect that defining a local function with an inline_attribute of Always_inline should be enough to be sure that it is always inlined (which is better than hoping it to be).
I don't know if it is the case at the moment but will apparently definitely be true with flambda, so I don't mind trying that.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2015

It will not be inlined in bytecode and thus not by js_of_ocaml (which does not do any non-trivial inlining for now). If we force inlining anyway, I don't see any problem with duplicating the code (modulo string literals).

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Nov 25, 2015

If we force inlining anyway, I don't see any problem with duplicating the code (modulo string literals).

My plan was to let flambda introduce the sharing before duplication, but indeed that does not work for bytecode (although @chambart's example already works with ocamlc, the problem only happens with the native backend).

@maranget

This comment has been minimized.

Copy link
Contributor

commented Nov 26, 2015

As a principle of OCaml Pattern Matching : thou shalst not duplicate.
Now I look at the PR.

@trefis trefis force-pushed the trefis:or-exception branch from 6fa79c2 to 6802708 Nov 26, 2015

@gasche

This comment has been minimized.

Copy link
Member

commented Nov 27, 2015

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns)

I'm rather of the opinion than rejecting this situation is the best move (and that creating an auxiliary function would be better than duplicating the code). "Mixing return and exception cases under when-guards is not supported.".

(A change proposal is easily shot down if the implementation has a defect. Duplicating code and implicit allocation are defects. The only way out is to accept a design limitation instead. There is little love for when guards above or-patterns anyway.)

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Dec 7, 2015

"Mixing return and exception cases under when-guards is not supported."

Ok, it seems this is the way to go to get this PR merged quickly. We can always revisit later to drop the limitation.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Dec 7, 2015

Ok, it seems this is the way to go to get this PR merged quickly.

What do you mean by quickly? If you mean 4.03 I can do the required change this week, otherwise it might take a few more weeks before I get back on this.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Dec 7, 2015

I'm in favor of including this in 4.03 if you can do the change quickly, and according to @gasche reply, I think he would be on the same line. So unless another core dev objects to it, indeed, this could go in 4.03.

@hhugo

This comment has been minimized.

Copy link
Contributor

commented Dec 7, 2015

Note that I'm still not sure how would js_of_ocaml handle this.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Dec 7, 2015

Note that I'm still not sure how would js_of_ocaml handle this.

Well, this could indeed be a blocking point. We don't want libraries to start using this new feature and become unusable for js_of_ocaml. Can you check with your fellow js_of_ocaml developpers if there is some hope to have this supported in the near future?

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 9, 2015

Let's go, then.

gasche added a commit that referenced this pull request Dec 9, 2015

Merge pull request #305 from trefis/or-exception
Support "exception" under or-patterns (PR#6422)

@gasche gasche merged commit cfeda89 into ocaml:trunk Dec 9, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Dec 9, 2015

I just noticed the following line in the manual (in the chapter about language extensions, section "Exception cases in pattern matching"):

A new form of exception patterns is allowed, only as a toplevel pattern under a match...with pattern-matching (other occurrences are rejected by the type-checker).

Should it be updated?

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Dec 9, 2015

Should it be updated?

Yes, that would be great. The new text should clarify that the restriction has been relaxed in 4.03.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Dec 10, 2015

This PR seems to have introduced a bug in the exhaustiveness checker. See PR#7083.

alainfrisch added a commit that referenced this pull request Dec 11, 2015

Merge pull request #343 from trefis/pr7083
PR7083: fix assertion failure introduced by #305
@gasche

This comment has been minimized.

Copy link
Member

commented Dec 11, 2015

Thomas ( @trefis ) proposed in #343 a potential fix for the exhaustiveness issue. However, a preliminary review seems to suggest (to me) that we don't properly understand the implications of this change (the cause for the bug and the impact of the fix) to assess that it is correct.

I have second thoughts about having merged this PR in trunk. For now the only person that would be able to carefully review the code to check that it is correct is Luc ( @maranget ), and the timeframe to do this before the 4.03 release is too short. I would be glad to help (I'm starting to understand this part of the compiler better, and I think @trefis is as well), but I don't have the time before the release either.

Finally, someone remarked to me recently that this PR is a real language change that we hadn't discussed during the last development meeting (while we somewhat agreed to restrict ourselves, for large changes, to features that had been approved at that time). One last point against this PR.

I am thinking of reverting the present PR ( #305 ). I would like to work on warnings for pattern-matching in the first half of 2016 with @maranget , and @trefis if he has the time to do so. Then would be a better time to reconsider this change and its implications, and make sure we have a correct implementation. I think the feature is nice, but it's not absolutely essential that we rush it into 4.03.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Dec 11, 2015

I'm not sure about the timeframe argument. Are we still considering flambda in 4.03? If so, it will probably take several weeks or perhaps months before the release, and a lot of testing with a lot of code in the wild will hopefully happen, which should allow to find any regression caused by this one as well. I don't see the point of postponing nice additions which are ready to merge. If they are included only in a few months, this will simply delay so much the testing and people motivated by the change now might have less motivation or time to go back to them.

All this could be alleviated if the feature freeze came along the branching for 4.03 so that the trunk remains open for inclusion of ready additions (even if they do not go to 4.03), but I know this is not the chosen policy.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Dec 11, 2015

To add to the previous note: I'm pretty sure the bug would not have been found if the feature had not be included in trunk.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Dec 11, 2015

It might not have been clear in my previous messages but I was actually surprised this was considered for 4.03. So I don't particularly mind if it's delayed, I'm just mildly annoyed by your change of mind :) (there are other things I could have been working on, getting the native toplevel to work with flambda for example).

Are we still considering flambda in 4.03?

The last update I got was "flambda should be merged on Friday the 18th".
Of course this has been moving delayed by a week each week, so :)

@gasche

This comment has been minimized.

Copy link
Member

commented Dec 11, 2015

I'm sorry, indeed it was my mistake to be too hasty merging it. I should have realized the importance of @maranget review and not taken a decision without this oversight.

@maranget

This comment has been minimized.

Copy link
Contributor

commented Dec 11, 2015

I have probably over-reacted on this.
The fix should be relatively simple considering that pressure_variant should probably be called
on value and exception matchings separately, just like unused clause and exhaustivity are called.

Consider that those two matchings are independent as regards semantics. Grouping them in syntax is
let us say "by accident"

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Dec 11, 2015

This is indeed what is done for the other checks.

For pressure_variants I went another way because I didn't consider that variants could be present deep inside exception patterns. That means that contrary to my bold claims #343 is not correct and should be reverted.
I will try to send an updated pull request later today.

gasche added a commit that referenced this pull request Dec 12, 2015

Revert GPR#305 (exception patterns under or-patterns) from trunk
This week we merged several changes from Thomas Refis, to allow the
use of exception patterns under or-patterns, to write code such as

  match foo x with
    | None | exception Not_found -> ...
    | Some -> ...

Unfortunately, I failed to properly assess the impact of this change,
and in particular to make sure that Luc Maranget had properly reviewed
this code -- any change to the pattern-matching machinery should be
reviewed by Luc.

The problem that I had not foreseen and that he would have immediately
realized is that, while adapting the pattern-matching *compiler* is
relatively easy (Thomas inserted a transformation at the right place
to separate exception patterns from the others and handle them
separately, using the staticraise construct used by the
pattern-matching compiler to avoid duplicating the
right-hand-side branch), adapting the pattern-matching warnings
machinery is both more subtle and easier to overlook (it may fail
silently and nobody notices, unlike wrong code production). This part
of the compiler is subtle and best understood by Luc, but he does not
have the time to do a proper review of those changes in the timeframe
for the 4.03 feature freeze (mid-December).

I believe the right move in this case, implemented in the present
commit, is to revert the change from trunk (this is not a feature that
we must *imperatively* have in 4.03), do a proper job of understanding
the changes, and integrate the change when we are confident it is
ready. I hope to do this in 2016, together with Luc Maranget and
Thomas Refis -- hopefully this would allow Thomas and I to be more
confident when changing the pattern-matching machinery in the future.

Revert "Merge pull request #343 from trefis/pr7083"
This reverts commit 22681b8, reversing
changes made to a24e4ed.

Revert "Merge pull request #341 from trefis/or-exception"
This reverts commit f8f68bd, reversing
changes made to 1534fe8.

Revert "Merge pull request #305 from trefis/or-exception"
This reverts commit cfeda89, reversing
changes made to 77cf36c.
@gasche

This comment has been minimized.

Copy link
Member

commented Dec 12, 2015

So (after some further discussion with @trefis) I went ahead and reverted this change from trunk ( d071da2 ). I would like to work on it again in early 2016, so hopefully there will be an updated PR at that time.

314eter added a commit to 314eter/language-ocaml that referenced this pull request Jun 2, 2016

@314eter 314eter referenced this pull request Jun 2, 2016

Merged

Add exception pattern #20

mroch added a commit to toroidal-code/language-ocaml that referenced this pull request Jun 14, 2016

@bobzhang bobzhang referenced this pull request Aug 12, 2016

Closed

Upgrade to OCaml 4.06.1 #633

@bobzhang bobzhang referenced this pull request Apr 18, 2018

Open

Upgrade to 4.06.1 #2755

@bobzhang bobzhang referenced this pull request Nov 13, 2018

Open

to-do list for upgrade #3000

31 of 37 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.