Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor [Location.print_loc] to print error locations more consistently #1925

Merged
merged 3 commits into from Jul 25, 2018

Conversation

Projects
None yet
6 participants
@Armael
Copy link
Member

commented Jul 23, 2018

This PR refactors Locations.print_loc, which is used to print the location of an error (e.g. File "foo.ml", line 5, characters 10-12). Its functionality was being re-implemented in ad-hoc ways in a couple places: in highlight_dumb (responsible for highlighting the error in the toplevel with a "dumb" terminal), and in expect tests (testsuite/tools/expect_test.ml).

With this PR, a single one-size-fits-all implementation is provided and used in these two places as well. As a side effect:

  • Locations printed in the toplevel now use line numbers (no more Characters 789-866)
  • Locations printed in expect tests now display line numbers instead of Line _
  • When File "_none_", line 15 was printed, now only Line 15 is printed (this could be changed but I assumed there was no point of printing the dummy file name)
  • A few incorrect locations present in the testsuite are now correct (thanks to the change in lexing.ml I think)
  • A bunch of tests needed to be updated, hence the somewhat big diff

@trefis trefis self-assigned this Jul 23, 2018

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 23, 2018

I forgot:

  • In a "dumb" toplevel, with the "missing )" message, which reports 2 locations, two location lines are now printed. This makes the error more verbose but more precise -- this design aspect could be discussed, maybe there's a better way?
@bobot

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

When File "none", line 15 was printed, now only Line 15 is printed (this could be changed but I assumed there was no point of printing the dummy file name)

Does this change impact the compiler output? Some editors use regexp to find error lines and look for File. Indeed in that case the information is not useful because the file is not given, but it is prettier.

EDIT: the regexp used by emacs, I think

(python-tracebacks-and-caml "^[ 	]*File \\(\"?\\)\\([^,\" \n	<>]+\\)\\1, lines? \\([0-9]+\\)-?\\([0-9]+\\)?\\(?:$\\|,\\(?: characters? \\([0-9]+\\)-?\\([0-9]+\\)?:\\)?\\([ \n]Warning\\(?: [0-9]+\\)?:\\)?\\)" 2
                             (3 . 4)
                             (5 . 6)
                             (7))
@trefis
Copy link
Contributor

left a comment

I left a few questions, but this overall looks good.
I'm not approving just yet because although I can guess the answer to @bobot's question, I'm not sure it will satisfy everyone.

| _ -> true
in
let line_valid line = line <> -1 in
let chars_valid (startchar, _endchar) = startchar >= 0 in

This comment has been minimized.

Copy link
@trefis

trefis Jul 24, 2018

Contributor

Two questions:

  1. do you plan to change this at some point to also validate endchar? (otherwise why are you taking it as argument)
  2. why take a pair and not two (named) arguments?

This comment has been minimized.

Copy link
@Armael

Armael Jul 24, 2018

Author Member
  1. this is existing code that was moved here, and that was apparently just validating startchar. But I think the goal here is to detect a "ghost" location, in which case chars_valid startchar endchar could become startchar = -1 || endchar = -1 (this would be consistent with highlight_dumb). I'll probably do that.
  2. Good point, let me do that

This comment has been minimized.

Copy link
@Armael

Armael Jul 24, 2018

Author Member

Uh I meant startchar <> -1 && endchar <> -1

(* Nothing has been printed.
Print the position characters as a best effort. *)
Format.fprintf ppf "Characters %i-%i"
loc.loc_start.pos_cnum loc.loc_end.pos_cnum;

This comment has been minimized.

Copy link
@trefis

trefis Jul 24, 2018

Contributor

When do you expect to be in this situation?

I see many test cases which printed only Characters ... changed to print Line X, characters ..., but I'm not sure whether some remains.

Naively, I would think we'd get in this situation only when some preprocessor messed up the locations. If that is the case, maybe you could make that explicit in the comment above.

This comment has been minimized.

Copy link
@Armael

Armael Jul 24, 2018

Author Member

When do you expect to be in this situation?

I'm not completely sure. Indeed, it could happen if some preprocessor (or the compiler but that doesn't seem to be the case currently) badly messed up the locations. I will clear that up in the comment.

@@ -242,7 +350,8 @@ let highlight_dumb ~print_chars ppf lb locs =
end

let show_code_at_location ppf lb locs =
highlight_dumb ~print_chars:false ppf lb locs
try highlight_dumb ~print_chars:false ppf lb locs
with Exit -> ()

This comment has been minimized.

Copy link
@trefis

trefis Jul 24, 2018

Contributor

I think I overlooked something: who raises Exit?

This comment has been minimized.

Copy link
@Armael

Armael Jul 24, 2018

Author Member

highlight_dumb might raise Exit (this is not changed by this patch, it was already possible). Apparently this does not happen currently, but I added this safeguard just in case.

This comment has been minimized.

Copy link
@trefis

trefis Jul 24, 2018

Contributor

I had indeed missed it. :)
The change seems good then.

@trefis

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

Does this change impact the compiler output? Some editors use regexp to find error lines and look for File. Indeed in that case the information is not useful because the file is not given, but it is prettier.

It does, see the compat32 test output for example; which I guess your editor would fail to interpret correctly.

However, as I am an extremely optimistic person, I hope that with the work @Armael started doing (and is planing to continue) we could have the compiler error messages in a format that is easily parsed by tools (e.g. sexp, or json, or something similar).
@Armael: is that something you have on your todo list, does it seem at all realistic?

@trefis

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

In a "dumb" toplevel, with the "missing )" message, which reports 2 locations, two location lines are now printed. This makes the error more verbose but more precise -- this design aspect could be discussed, maybe there's a better way?

I'm not too fond of that indeed. But I probably dislike the previous state even more.
I don't think we should worry about that too much right now, that error message can be reworked at a later point; in particular I believe this only happens for syntax error, so we could hope having something nicer once we switch to menhir.

@bobot

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

However, as I am an extremely optimistic person, I hope that with the work @Armael started doing (and is planing to continue) we could have the compiler error messages in a format that is easily parsed by tools (e.g. sexp, or json, or something similar).

I think that would not solve the question. My point was simpler, just to keep the working highlighting by not changing the format. But the -color option should nowadays solve the highlighting problem (except not by default when called from emacs).

@trefis

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

I think that would not solve the question.

Yes, I answered beside the point. I was assuming everyone lived in an environment where the build system is constantly running, parsing errors from the compiler and communicating them to your editor when asked.
The parsing would have to change, but we could make the format much easier to parse, so it didn't seem to bad.

But apparently there are still people out there doing things like M-x compile, and the output has to respect some GNU-whatever format to be understood by the editor, which is much harder to change.

My point was simpler, just to keep the working highlighting by not changing the format.

Well, the highlighting is nice, but I assume having your cursor moved to the error is also done using the same regexp.

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

To answer @bobot's remark, I agree there's currently no actual benefit of not printing dummy filenames ("" or "_none_"), while it can break editor support or whatnot. I'll change the patch to only discard the filename if it is "//toplevel//".

@trefis

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

Before you do that change, I'm wondering in which situations "_none_" and "" are actually used as filename.

It seems like the location is bogus in these situations, so the "pretty highlighting" that @bobot is referring to doesn't seem very convincing: any highlighting or cursor movement which will result from parsing that location is likely to be somewhat incorrect.

  • For "_none_" at least, that is clearly the case. For example, I see a Location.prerr_warning Location.none warn; in compilenv.ml which seems clearly wrong (unless one expects !input_name to be used as the filename, but that's not what happens).
  • For "", I don't know.
@trefis

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

To elaborate on my previous message:

  • it makes me sad that we'd have to print File "" or File "_none_" and I'd rather avoid that
  • I wonder if every case where we're calling print_loc on such a location is actually a bug
  • that doesn't prevent @Armael from reverting to the previous state where such bogus filenames are still printed

Actually, he probably should revert indeed and we should approach the problem of bad locations from another angle (i.e. instead of filtering them out when printing, we should avoid producing them in the first place).

I'm now done ranting.

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

I fully agree with @trefis remarks.

To elaborate a bit on the provenance of "" and "_none_" filenames (and after some further investigation):

  • "_none_" comes from Location.none, or !Location.input_name
  • "" comes from Lexing.dummy_pos and Lexing.zero_pos.

But I just realized that, according to the comment that I wrote in print_loc, if the filename of a position is "", then !Location.input_name should be used -- and (I think!) !Location.input_name is either a valid filename, //toplevel// or _none_, but cannot be "". So this means File "" cannot happen! I can make that explicit in file_valid.

Then, for _none_, this is the symptom that some part of the compiler emitted a dummy location. There seems to be many places where this can happen, and it's probably something that should be improved, but as @trefis said it's a separate problem.

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

!Location.input_name is either a valid filename, //toplevel// or _none_, but cannot be "". So this means File "" cannot happen!

Well nevermind, expect_test.ml currently breaks this invariant by setting Location.input_name to "" :).

@Armael Armael force-pushed the Armael:location-v1.5 branch 2 times, most recently from d50201e to a226bbc Jul 24, 2018

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

I just pushed new commits that implement the suggestions above, + amended the big commit that updates the testsuite (do not get fooled by github's ordering of the commit, "Update the testsuite" actually comes last..!)

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

The build of the manual is broken (as the CI shows), but I'm waiting for #1863 to be merged before fixing it.

struct type t module M = struct type t end type a = M.t end;;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Definition of module M/1
Line _:
File "_none_", line 1:

This comment has been minimized.

Copy link
@Armael

Armael Jul 24, 2018

Author Member

This is one instance of Location.none emitted by the compiler and bubbling up :).

@xclerc

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

Locations printed in expect tests now display line numbers instead of Line _

Could you elaborate on the rationale behind this change? I thought line numbers
were stripped from expected test messages on purpose, so that the expectation
does not need to be updated after e.g. reformatting the code.

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

Hum, but line numbers are relative to the current phrase only, so they won't change when adding/deleting/changing other phrases before.

@gasche

This comment has been minimized.

Copy link
Member

commented Jul 24, 2018

To clarify: before it was the case that line numbers where relative to the whole file, not the current sentence ( @Armael might remember, we worked on the Line _ change together). I believe that the sentence-relative numbers are a consequence of the change 65c8c79 , which is part of the present PR.

Re. "none": Note that there are several situations in which the compiler may want to emit a message that is hard to attach to a source code line. For example, when there is an error when loading a .cmi while processing an -open Foo directive on the command-line or in OCAMLPARAM. In almost all these cases there should be something nicer than "none" to show, but expect a long tail or tricky-to-make-decisions-about corner cases. I would not expect to have a complete overview of all of these before the present PR can be merged.

(@trefis has handled the PR beautifully so far, please keep doing that.)

@xclerc

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

Sure, but it is not what I meant; I was considering a reformatting
such as the following:

let x = ... in ... error ...

rewritten to

let x =
  ...
in
... error ...
@gasche

This comment has been minimized.

Copy link
Member

commented Jul 24, 2018

I don't see a problem with the line numbers of the specific part you edited changing. They were previously silenced on purpose to avoid the line numbers of all the subsequent tests changing, which would be much more annoying.

In fact, we currently show character ranges, so many kind of those refactorings will already result in a difference.

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

@gasche So wrt "_none_", interestingly, the current situation is that there are actually several "dummy values" (as explained in the comment I added above Location.input_name): "_none_" denotes a dummy filename that will get printed, "" denotes a "dummy input" that will not get printed, and "//toplevel//" denotes the toplevel (and it will not get printed either).

We could indeed imagine having a more refined type for !Location.input_name (and stop using magic string values) that would allows giving more information in the case the input is not an actual file. I would leave the task of clearing that up for an other PR (the present PR only tries to document the present situation).

@xclerc

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

Indeed, it was a bad example; I guess a better example would be the insertion
of line/comment that would change the line number but leave the character
range unchanged.

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

Like @gasche I would think it's fine to have to update the locations of an expect test if you're modifying it (and indeed I remember now that Line _ was printed because the code counting the lines didn't properly reset between phrases, but this is fixed here).

@gasche

This comment has been minimized.

Copy link
Member

commented Jul 24, 2018

I've seen weirder filenames reported than the three you list, look for "command line", "compiler internals" and other weird arguments passed to Location.in_file in the distribution. (They get funny when combined with the -absnames option).

The .cmi example I mentioned is an easy way to see a _none_ in action:

$ echo > foo.cmi
$ echo > test.ml
$ ocamlc -open Foo -i test.ml
File "_none_", line 1:
Error: Corrupted compiled interface
foo.cmi
@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

Do you think I should change the comment then?

@Armael Armael force-pushed the Armael:location-v1.5 branch from a226bbc to 1ce11af Jul 24, 2018

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

Actually, I just removed it. I propose to leave for an other PR the task of changing the type of "filenames" from string to a sum type Filename of string | Toplevel | Dummy of string | None

@trefis

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

I propose to leave for an other PR the task of changing the type of "filenames" from string to a sum type Filename of string | Toplevel | Dummy of string | None

Agreed 100%

@Armael Armael force-pushed the Armael:location-v1.5 branch from 1ce11af to a08b74d Jul 25, 2018

@Armael Armael force-pushed the Armael:location-v1.5 branch from a08b74d to b367c0e Jul 25, 2018

@Armael

This comment has been minimized.

Copy link
Member Author

commented Jul 25, 2018

I just pushed a cleaned-up version of the history, rebased on top of latest trunk. CI should pass now.

@trefis

trefis approved these changes Jul 25, 2018

Copy link
Contributor

left a comment

Looks like all the comments, including @bobot's, were addressed. And the CIs are happy.

LGTM.

@trefis trefis merged commit 94bc8d7 into ocaml:trunk Jul 25, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@shindere

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

Armael added a commit to Armael/ocaml that referenced this pull request Jul 25, 2018

@shindere

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

Just pushed 9a9deee by @Armael to fix the
broken tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.