Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing parsing/printing/parsing roundtripping shows problems with Pprintast / -dsource #7200

Open
vicuna opened this issue Mar 30, 2016 · 9 comments

Comments

Projects
None yet
1 participant
@vicuna
Copy link

commented Mar 30, 2016

Original bug ID: 7200
Reporter: @alainfrisch
Status: feedback (set by @damiendoligez on 2017-02-23T16:24:26Z)
Resolution: open
Priority: normal
Severity: minor
Category: lexing and parsing
Monitored by: runhang @diml

Bug description

I'm attached a small script which parses all .ml/.mli files from its command-line (recursing into subdirectories) and for each of them does the following:

  1. Parse.
  2. Print with Pprintast.
  3. Parse the result.
  4. Compare the two parse trees after removing locations).

Applying this tool to the OCaml source tree discovers multiple issues with Pprintast (cases where it produces code that cannot be parsed, and cases where it produces valid code which results in a different Parsetree). I'm also attaching the result. (Some files cannot be parsed at all as implementation files because they are to be interpreted by the toplevel; but we cannot always parse as the toplevel since this would reject valid files, those with multiple ";;" in a row.)

Interpretation of issues (I'll edit this list as I go through the output).

  • ~-1 being printed as -1, reparsed as a literal. [FIXED]

  • (function x | (_ as x) -> x) being printed as (function x | _ as x -> x), invalid precedence of "as" vs "|".

  • type expression [>] printed as [], which is invalid. [FIXED]

  • "let foo : type t'. t' = assert false" printed as "let foo : 't' . 't'= fun (type t') -> (assert false : t')", but 't' is not a valid type variable (parsed as a character literal token). Not clear what to do here.

  • " type t = (::) : int -> t" printed as "type t = | ::: int -> t". [FIXED, also in Printyp]

  • "type t = < foo: int [@foo] >" printed as "type t = < foo[@foo ] :int >".

  • "[%foo: < foo : t > ]" printed as "[%foo :< foo :t >]", cannot be parsed because ">]" is recognized as a token.

  • "type foo += private A of int" printed as "type foo += | A of int" (private is dropped).

  • "let f : 'a . < .. > = assert false" printed as "let f : 'a . < .. >= assert false", cannot be parsed because ">=" is recognized as a token.

  • "let module M = (functor (T : sig end) -> struct end)(struct end) in ()"
    printed as "let module M = functor (T : sig end) -> struct end(struct end) in ()".

  • "object method f : int = 42 end" printed as "object method f : int= 42 end", which stores the type annotation in the Pexp_poly node. (Should we fix the Parsetree definition to rely on a Pexp_constraint for that form?).

  • "class c = object inherit ((fun () -> object end) ()) end" printed as "class c = object inherit (fun () -> object end ()) end".

File attachments

@vicuna

This comment has been minimized.

Copy link
Author

commented Mar 30, 2016

Comment author: runhang

dsheets and I are improving -dsource (see: https://github.com/ocamllabs/compiler-hacking/wiki/Things-to-work-on#improve--dsource), and I think this script can help

@vicuna

This comment has been minimized.

Copy link
Author

commented Mar 30, 2016

Comment author: @alainfrisch

Another useful test would be to try extracting the source fragments based on locations in the Parsetree (for various kinds of items) and check that these fragments can be parsed correctly. This would allow to detect problems with locations in the parser.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 13, 2016

Comment author: runhang

"type t = < foo: int [@foo] >" printed as "type t = < foo[@foo ] :int >". - FIXED

""[%foo: < foo : t > ]" printed as "[%foo :< foo :t >]", cannot be parsed because ">]" is recognized as a token." - FIXED

" "type foo += private A of int" printed as "type foo += | A of int" (private is dropped)" - FIXED

""let f : 'a . < .. > = assert false" printed as "let f : 'a . < .. >= assert false", cannot be parsed because ">=" is recognized as a token" - FIXED

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 13, 2016

Comment author: runhang

""let module M = (functor (T : sig end) -> struct end)(struct end) in ()"
printed as "let module M = functor (T : sig end) -> struct end(struct end) in ()"." - FIXED

""class c = object inherit ((fun () -> object end) ()) end" printed as "class c = object inherit (fun () -> object end ()) end"." - FIXED

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 13, 2016

Comment author: @alainfrisch

Cf #539

@vicuna

This comment has been minimized.

Copy link
Author

commented May 17, 2016

Comment author: runhang

class ['a] c () = object
constraint 'a = < .. > -> unit
method m = (fun x -> () : 'a)
end

is printed as

class ['a] c () =
object constraint 'a = < .. > -> unit method m : 'a = fun x -> () end

@vicuna

This comment has been minimized.

Copy link
Author

commented Feb 23, 2017

Comment author: @damiendoligez

@Frisch #539 was merged. Could you give an update on the status of this PR (including @runhang's latest example)?

@vicuna

This comment has been minimized.

Copy link
Author

commented Feb 23, 2017

Comment author: runhang

@doligez that example has been addressed in #915 (#915 [^])

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 9, 2017

Comment author: @damiendoligez

I tried it on the 4.05 branch and Alain's tool is still complaining about a few discrepancies. Output file attached.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.