Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Location.none is broken with the LLVM assembler in 4.03.0+beta2 #7226

Closed
vicuna opened this Issue Apr 13, 2016 · 11 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link
Collaborator

vicuna commented Apr 13, 2016

Original bug ID: 7226
Reporter: antron
Assigned to: @gasche
Status: closed (set by @xavierleroy on 2017-09-24T15:32:57Z)
Resolution: fixed
Priority: normal
Severity: major
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: back end (clambda to assembly)
Monitored by: @gasche @mmottl

Bug description

LLVM's assembler does not (apparently) accept negative column numbers, e.g.

.loc 2 1 -1

The result is that it is not possible to compile on systems that use Clang for assembly, with the -g flag, with a PPX that sometimes uses Location.none, because Location.none has a column number of -1:

/var/folders/wr/52wphbkn0zx5nvbx61v640nm0000gn/T/camlasmef96b3.s:2804:11: error: unexpected token in '.loc' directive
.loc 2 1 -1
^

Steps to reproduce

  1. To check for such an assembler:

foo.s:

.file 1 "none"
.loc 1 1 -1

Then:

clang foo.s
as foo.s # If this is gas, it should accept the -1.

  1. To trigger this failure through OCaml:

ppx_loc_bug.ml:

(* Changes location of all application expressions to Location.none. *)

open Asttypes
open Parsetree
open Ast_mapper

let mapper _ =
{default_mapper with expr = fun mapper e ->
match e with
| {pexp_desc = Pexp_apply _} -> {e with pexp_loc = Location.none}
| _ -> default_mapper.expr mapper e}

let () =
register "ppx_loc_bug" mapper

test.ml:

let () =
print_endline "Hello world!"

Makefile:

PPX = ppx_loc_bug
TEST = test

.PHONY : build
build : clean
ocamlfind opt -linkpkg -package compiler-libs.common -o $(PPX) $(PPX).ml

@echo "  No Location.none: expected to succeed"
ocamlfind opt -g -o $(TEST) $(TEST).ml

@echo "  No -g: expected to succeed"
ocamlfind opt -ppx ./$(PPX) -o $(TEST) $(TEST).ml

@echo "  Location.none + -g: expected to fail on 4.03.0+beta2"
ocamlfind opt -ppx ./$(PPX) -g -o $(TEST) $(TEST).ml

.PHONY : clean
clean :
rm -f $(PPX) $(TEST) *.cm[iox] *.o

Additional information

This was apparently introduced by #212

I propose to solve it by not emitting the column number if it is negative, i.e.

.loc 2 1

If this is reasonable, I can make a PR.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 13, 2016

Comment author: @lefessan

Is it a problem of OCaml, or a problem of the ppx that emits Location.none in a place where OCaml is expecting a true location ?

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 13, 2016

Comment author: @gasche

I see several ways to work around the problem:

(1) change the column-printing code of #212 to not emit the column when it is negative
(2) change the default ghost location value to use a non-negative column number (but this has the risk of breaking people assumptions, so I wouldn't recommend it)
(3) disable column printing when using clang/llvm (I think this is not necessary, and not necessarily easy to do)

Options (1) seems best to me. If you want to submit a PR, that would be very nice indeed. If you want it to be considered for inclusion into 4.03, it should be as non-invasive and "obviously non-regressing" as possible.

(We could also consider reverting #212 for 4.03 to be in the clear, but I'd rather not do that.)

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 13, 2016

Comment author: antron

It's a problem of OCaml.

If you are writing or maintaining a flawed PPX that emits Location.none, and you accidentally only use Location.none in constructs that don't cause emission of .loc directives, you won't notice that you have a problem.

And, if you use Location.none in constructs that do cause emission of .loc, but your OCaml is configured to assemble using the GNU toolchain, you still won't notice it.

If somebody else then tries to use your PPX on OCaml configured with the Clang toolchain, they will get a number of opaque error messages coming from the assembler.

The points are:

  1. There isn't uniform behavior across toolchains, and I am guessing GNU is the more common one. In practice, this means developers on Linux won't easily catch such a problem with their PPX, and will be in danger of unknowingly distributing buggy code.
  2. The error message seen by the user really makes no sense until non-trivial investigation. In particular, there is no reference to a PPX. It can arise in some preprocessed files but not others, depending on which constructs each file contains.
  3. Ast_helper.default_loc is set to Location.none. Combined with a tendency to skip ?loc arguments when using Ast_helper, the interface invites authors to make this mistake. Similarly, Ast_convenience in ppx_tools doesn't (yet) offer an obvious way to pass locations, and it takes some discipline to use [@metaloc] everywhere when using ppx_metaquot.
  4. 4.02 happily accepted Location.none, whether that was supposed to be correct or not. Breaking this now may require a review of existing PPXs.
  5. Sometimes there isn't really a sensible location to pick for a generated construct, and if a PPX author is reasonably certain that an error won't be reported on it, and the user can deal with the stack trace, the author may be tempted to use Location.none.

If it is genuinely important for OCaml that locations of expressions or other constructs are not Location.none, then I think the compiler should be modified to either check for Location.none, or to prevent or strongly discourage construction of ASTs with Location.none.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 13, 2016

Comment author: @gasche

It's not even clear to me that using Location.none if ppx rewriters is a bad choice in the first place. For code that comes from user code, using user code locations is better, but there may be situations where it's reasonable to say that a particular code has "no location".

Let's get over this debate and fix the bug.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 13, 2016

Comment author: antron

Ok, I'll do (1). Thanks.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 13, 2016

Comment author: @lefessan

Except that, when debugging a program, faulty ppx programs have the same impact on debuggers with "gas" (where there is a location, but it is incorrect) as with "clang" (if the location is removed). I think the best solution would be, at least, to issue a warning ("your ppx generates code with incorrect locations"), so that users can send bug reports upstream to ppx developers.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 14, 2016

Comment author: @damiendoligez

@lefessan how hard would it be to emit such a warning? I say let's release 4.03 with the simple patch, and add the warning in trunk.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Apr 14, 2016

Comment author: @gasche

Fixed in 4.03 ( 9c05581 ) and trunk ( 0746566 ), by the patch proposed by antron. Thanks!

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Jun 3, 2016

Comment author: @mmottl

I have just run into apparently the exact same issue with OCaml 4.03+flambda. Not sure why this happens with Flambda turned on. The issue does seem to be fixed with the plain vanilla OCaml 4.03 OPAM switch.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Jun 4, 2016

Comment author: @gasche

I cannot reproduce under either 4.03-flambda or trunk-flambda.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Jun 4, 2016

Comment author: @mmottl

I've created a small test case, reproducible with OPAM 4.03.0+flambda on Mac OS X.

File "foo.ml":

type 'a t = 'a [@@deriving bin_io]

Compile it with:

ocamlfind ocamlopt -c -g -package ppx_bin_prot -o foo.cmx foo.ml

The important bit above is "-g". Without it the problem does not happen, neither does it without Flambda.

@vicuna vicuna closed this Sep 24, 2017

@vicuna vicuna added the back-end label Mar 14, 2019

@vicuna vicuna added this to the 4.03.0 milestone Mar 14, 2019

@vicuna vicuna added the bug label Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.