Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access to Clflags from ppx preprocessors #6497

Closed
vicuna opened this issue Jul 25, 2014 · 25 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link

commented Jul 25, 2014

Original bug ID: 6497
Reporter: @whitequark
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2016-12-07T10:34:29Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 4.02.0+beta1 / +rc1
Target version: 4.02.0+dev
Category: ~DO NOT USE (was: OCaml general)
Monitored by: jmeber

Bug description

I have ran several times into a need to access Clflags, which is obviously impossible, as the ppx preprocessor runs in a different process.

These include:

  • Getting the name of the current module, which is critical to 2 of 4 of the extensions I currently have. I'm currently using Location.input_name and deriving the module name from the file basename, but that would break with -for-pack.

  • Getting the list of -I flags. I have several extensions in mind that would do something with Typedtree of other modules, but it would currently be impractical to deploy them.

  • Getting the state of -g flag, to generate information only useful together with backtraces.

  • Getting the kind of compiler (ocamlc/ocamlnat), for conditional compilation.

So I think the Clflags.(for_pack, include_dirs, debug, native_code) should be exposed to ppx like Location.input_name currently is. I consider the lack of Clflags.for_pack a bug; it would be great to have it fixed in 4.02.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

I forgot one thing in description; I do not think exposing any other Clflags fields makes any sense. Those two should satisfy any needs a preprocessor may have.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @alainfrisch

Indeed, it could make sense to pass more contextual information from the caller (which can be ocamlc, ocamlopt, but also ocaml, ocamldoc, ocamldep, etc). Extra information could be passed in several ways:

  • on the ppx command-line: the API already supports passing extra parameters to ppx (in addition to the input/output filenames), and we would just need to tweak the compilers to pass those informations and existing ppx tools to ignore them. "ppx drivers" (which orchestrate several actual ppx mappers) would need to pass all such argument along.

  • as extra @@@attributes inserted in front of the AST by the compiler (and maybe directly interpreted by the Ast_mapper driver so that ppx have easily access to the information). This might be more robust than command-line arguments (length limitation under Windows, escaping issues, easier to pass structured data if needed and to ignore irrelevant attributes, automatic passing between nested ppx).

  • by changing the binary protocol around ppx, i.e. change the type of the marshaled data to include an extra record holding all the relevant values. A downside is that currently, the exact same format (and magic number) is also used by -pp (i.e. camlp4/camlp5 in practice), which makes it possible, for instance, to send the binary output of camlp4 directly to a ppx, on the command-line (without going through a compiler).

Do you have an opinion?

Concerning the actual set of information to pass, I'm wondering whether we should define exactly which options are passed, or whether we should pass the entire command-line (ppx would need to reparse it, taking OCAMLPARAMS into account, but we would automatically support all current and future flags).

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

Naturally, I would first reach for changing the binary protocol. It's actually simple to keep the ability to pipe camlp4 to ppx, you just need to match on magic number.

I think command-line is not a good idea for the reasons you have listed. I do not have a preference between changing binary protocol and @@@attributes, except for the fact that I really don't like the complexity of the serializer/deserializer associated with @@@attributes (but it could be hidden in the compiler).

I think parsing command-line should be done in exactly one place for consistency. Additionally, OCaml tools frequently have differents sets of valid arguments and it is not clear how Ast_mapper should deal with that; it would have to duplicate the option parsers of different tools or give up on "parsing all options" goal.

I don't see a point in passing all flags. I think the simplest solution is to change the protocol and magic number, and pass a record with the four flags I've mentioned, but there are certain advantages in passing that record via @@@attributes.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @alainfrisch

If we don't pass the full command-line, we should at least send the tool name (either as a concrete sum type or as a string to be more extensible) to support the toplevel, ocamldoc, ocamldep in addition to ocamlc/ocamlopt.

I think that @@@attributes are more coherent with how we pass warnings/errors back from ppx to the compiler. Moreover, they would not require to change the magic number if we ever want to pass more information (e.g. from tools such as ocamldoc, or maybe merlin).

Do you want to give it a try?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

I think sending tool name as a string is OK. In @@@attributes there isn't a choice anyway.

I think @@@attributes are worth a try. Would you write a patch, or should I?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

Actually, if tool name is sent, there is no need to send native_code, as it is just [tool_name = "ocamlopt"].

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @alainfrisch

Would you write a patch, or should I?

I won't have time to work on it in the next days, so feel free to hack happily and provide a patch if we wish!

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @alainfrisch

Should we have a single attributes, e.g.:

[@@@ocaml.ppx.context {tool_name="ocamlc"; load_path=["...";...;"..."]; for_pack=None; debug=false}]

or rather multiple attributes:

[@@@ocaml.ppx.tool_name "ocamlc"]
[@@@ocaml.ppx.load_path ["...";...;"..."]]
[@@@ocaml.ppx.for_pack None] (* or allow to omit in case default is used )
[@@@ocaml.ppx.debug false] (
idem *)

I don't have a strong preference.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

I prefer single attributes. Easier to skip, less noise in debugging output, etc.

Where do we export tool_name? Ast_mapper?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @alainfrisch

Where do we export tool_name? Ast_mapper?

Yes, this makes it clear that the information is for ppx. I assume you have a internal reference in mind, with an accessor function, and the reference (together with other references in Clflags) would be set by Ast_mapper.apply by parsing the top of the AST to find the discussed attribute?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

Yes, it is simple and consistent with existing interfaces.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

What to do if Ast_mapper somehow receives a malformed ppx.context attribute? I like any solution that does not silently ignore that.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @alainfrisch

What about having the ppx report an error (or a warning?) in the resulting AST?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

Do you think replacing the @@@ocaml.ppx.context with @@@ocaml.error is appropriate?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @alainfrisch

Probably rather with [%%ocaml.error] (i.e. the extension node recognized by the compiler).

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 25, 2014

Comment author: @whitequark

Right, I meant [%%ocaml.error].

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 26, 2014

Comment author: @whitequark

I just realized another (related) problem, there is no entry point to read Location.input_name from. You can't do it as:

let mapper argv =
foo := Location.input_name; (* foo = "_no" *)
{ default_mapper with ... }

and you can't do it in mapper.structure, because that can be called several times while nested.

Use case: I want to keep track of full module path for types in ppx_deriving. It's probably possible by adding more global state, but meh. Also, that'll break if ppx drivers will reuse whatever is returned by the mapper generator function applied to command-line options.

Any chance Ast_mapper.mapper will gather a toplevel_structure or a similar field, and signature counterparts?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 28, 2014

Comment author: @alainfrisch

It would be possible to arrange so that Location.input_name is set before the function creating the mapper (from command-line arguments) is executed.

Concerning the "toplevel_structure", the current API can be used to have a specific behavior at toplevel:

let mapper argv =
let super = default_mapper in
let structure m str = (* ... ) in
let mapper = {super with structure} in
{mapper with structure = (fun _ str ->
(
do something special at toplevel, but recurse with mapper *)
mapper.structure mapper str)
}

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 30, 2014

Comment author: @whitequark

@doligez: I'm currently working on this. Expect a PR in a few days.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 31, 2014

Comment author: @whitequark

@alain: how would we output the tool_name field? I'm currently thinking of setting the Ast_mapper.tool_name reference in the corresponding drivers. Don't see a cleaner solution.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 31, 2014

Comment author: @whitequark

#85

Test code:

open Ast_mapper
open Parsetree

let print_env () =
  Printf.printf "\ntool_name: %s\n" !Ast_mapper.tool_name;
  Printf.printf "include_dirs: %s\n" ([%derive.Show: string list] !Clflags.include_dirs);
  Printf.printf "open_module: %s\n" ([%derive.Show: string list] !Clflags.open_module);
  Printf.printf "for_package: %s\n" ([%derive.Show: string option] !Clflags.for_package);
  Printf.printf "debug: %B\n" !Clflags.debug

let () =
  Ast_mapper.register "t" (fun argv ->
    { default_mapper with
      structure = (fun mapper items ->
        Pprintast.structure Format.std_formatter items; print_env (); items) })

How to build:

ocamlfind c -package ppx_deriving -package compiler-libs.common -linkpkg testmap.ml -o testmap

How to test:

$ ocamlfind c -g -for-pack Foo -package num -package unix -package ppx_tools -open Big_int -open Unix -ppx ./testmap  foo.ml 
[@@@ocaml.ppx.context
  {
    tool_name = "ocamlc";
    include_dirs =
      ["/home/whitequark/.opam/local/lib/ppx_tools";
      "/home/whitequark/.opam/local/lib/ocaml/compiler-libs";
      "/home/whitequark/.opam/local/lib/num"];
    open_module = ["Unix"; "Big_int"];
    for_package = (Some "Foo");
    debug = true
  }]
tool_name: ocamlc
include_dirs: ["/home/whitequark/.opam/local/lib/ppx_tools"; "/home/whitequark/.opam/local/lib/ocaml/compiler-libs"; "/home/whitequark/.opam/local/lib/num"]
open_module: ["Unix"; "Big_int"]
for_package: Some ("Foo")
debug: true
@vicuna

This comment has been minimized.

Copy link
Author

commented Aug 7, 2014

Comment author: @alainfrisch

Thanks! I've committed the patch to 4.02 (commit 15062), with some modifications (mostly cosmetics, except for tool_name which is now passed explicitly by the calling tool when it applies ppx processors instead of setting the global reference).

@vicuna

This comment has been minimized.

Copy link
Author

commented Aug 7, 2014

Comment author: @alainfrisch

Commit 15064 fixes the toplevel. It was broken because it received a typedtree containing the extra context attribute when it was expecting a simple Tstr_eval. The attribute is now removed in the same function which adds it (apply_rewriters) and it never flows to the type-checker.

@vicuna

This comment has been minimized.

Copy link
Author

commented Aug 7, 2014

Comment author: @alainfrisch

Also committed to trunk (15068).

@vicuna

This comment has been minimized.

Copy link
Author

commented Aug 7, 2014

Comment author: @whitequark

Excellent, thanks! 15064 would also fix utop (regular toplevel seemed to work for me, but perhaps I am mistaken.)

@vicuna vicuna closed this Dec 7, 2016

@vicuna vicuna added this to the 4.02.0 milestone Mar 14, 2019

@vicuna vicuna added the bug label Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.