New script language for ocamltest #12185

Merged: 31 commits, Apr 25, 2023

damiendoligez
Member

@damiendoligez damiendoligez commented Apr 14, 2023

This PR is the result of much brainstorming with @Octachron @shindere and @gasche.

It introduces a new language for test scripts that allows sequences of tests, avoiding the deep nesting of the original language. Nesting is expressed with nested braces instead of the depth counters of the old language.
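To make the contrast between depth counters and braces concrete, here is a minimal OCaml sketch. It is not the translation tool from this PR: the `tree` type, the `forest` and `render` functions, and the test names in `example` are all hypothetical, and environment statements are ignored. It rebuilds a tree from `(depth, name)` pairs, as the star counters list them, and prints it in a brace style where a single child continues the sequence and several children become sibling blocks.

```ocaml
(* A test tree: a test name and the tests that run after it succeeds. *)
type tree = Node of string * tree list

(* Build a forest from (depth, name) pairs given in file order, as the
   star counters of the old syntax would list them. *)
let rec forest depth items =
  match items with
  | (d, name) :: rest when d = depth ->
      let children, rest = forest (depth + 1) rest in
      let siblings, rest = forest depth rest in
      (Node (name, children) :: siblings, rest)
  | _ -> ([], items)

(* Print a tree in a brace style: a single child continues the sequence,
   several children become sibling blocks. *)
let rec render indent (Node (name, children)) =
  let line = indent ^ name ^ ";" in
  match children with
  | [] -> [ line ]
  | [ only ] -> line :: render indent only
  | first :: others ->
      let block child = render (indent ^ "  ") child in
      (line :: (indent ^ "{") :: block first)
      @ List.concat_map (fun c -> (indent ^ "}{") :: block c) others
      @ [ indent ^ "}" ]

(* Illustrative input only: one root with two children, one of which
   has a child of its own. *)
let example = [ (1, "setup"); (2, "bytecode"); (2, "native"); (3, "run") ]

let () =
  match forest 1 example with
  | [ root ], [] -> List.iter print_endline (render "" root)
  | _ -> prerr_endline "expected a single root"
```

In this sketch, the chain `native` then `run` needs no extra nesting level in the output, which is the flattening the new syntax is meant to allow.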

To do:

  • document the new language
  • automatic translation tool from the old language
  • translate all test files

For later PRs:

  • report line numbers in the test file rather than occurrences in the test tree
  • remove support for old syntax
  • remove automatic translation tool

@damiendoligez damiendoligez self-assigned this Apr 14, 2023
@shindere
Contributor

shindere commented Apr 14, 2023 via email

@gasche
Member

gasche commented Apr 14, 2023

Your PR or diff contains no examples of the new syntax! You should take a few representative/interesting tests and translate them to the new syntax as part of the PR, so that people get an idea of what the new syntax looks like and what the benefits are.

@damiendoligez
Member Author

Oh, sorry, I should have mentioned that this is still a work in progress.

@avsm avsm marked this pull request as draft April 15, 2023 14:50
@damiendoligez
Member Author

damiendoligez commented Apr 18, 2023

Translated all test files:

Most files are translated automatically without any need to change the reference files.

Some files (about 150 of them) rely on source locations; I have translated them to TEST_BELOW with enough blank lines to keep the same locations. Their reference files are also preserved.

Some files have character offsets in their reference files. I have translated them to TEST_BELOW with enough blank lines and spaces to keep the same locations and offsets. Their reference files are also preserved. [edit: I will have to change this in order to please check-typo]
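As an illustration of the padding idea, here is a small OCaml sketch. It is not the translation script from this PR: the function name `placeholder` and the exact text of the placeholder comment are assumptions. The only point it shows is that the comment left at the top of the file occupies exactly as many lines as the old header did, so the code after it keeps its line numbers.

```ocaml
(* Build a placeholder comment occupying exactly [old_header_lines] lines,
   so that the code following it keeps its original line numbers.
   The exact placeholder text is an assumption of this sketch. *)
let placeholder ~old_header_lines =
  if old_header_lines < 2 then
    invalid_arg "placeholder: a two-line comment is the minimum";
  [ "(* TEST_BELOW" ]
  @ List.init (old_header_lines - 2) (fun _ -> "")
  @ [ "*)" ]

let () =
  let lines = placeholder ~old_header_lines:5 in
  Printf.printf "%d lines\n" (List.length lines)
```

Preserving character offsets as well would additionally require padding with spaces on the relevant lines, which this sketch does not attempt.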

Then, there are a few special cases:

  • 14 files had comments in their test scripts; I have copied them over.
  • parse-errors/unclosed_paren_module_expr1.ml and a few others rely at the same time on locations and the position of EOF. I have tweaked the whitespace in the test script to fit it in the same number of lines as the old test script.
  • basic-io/wc.ml uses its own source as input. Reference file updated.
  • tool-ocamldoc/Short_description.txt relies on ocamldoc parsing the ocamltest comment and accepting its syntax. That doesn't work with braces. I've split the file in two: Short_description.ml for the test script and Short_description.txt for the contents. Reference file updated.

Now ready for review.

@damiendoligez damiendoligez marked this pull request as ready for review April 18, 2023 15:47
@gasche
Member

gasche commented Apr 18, 2023

Nice!

I'm looking at the translated test scripts, and I see many instances of the following diff that is not nice:

-(* TEST *)
+(* TEST
+{
+}
+*)

As a special case, could you support an empty-script syntax and use it in the translator?

@gasche
Member

gasche commented Apr 18, 2023

The following also occurs several times with one-line tests:

 (* TEST
-   include dynlink
+{
+  include dynlink;
+}
 *)

Could we revisit the idea of not making the toplevel braces mandatory? We wanted this to let the lexer easily distinguish between the two test syntaxes, but this is not necessary anymore with the translation tool.

@gasche
Member

gasche commented Apr 18, 2023

Bonus idea for your translator: when moving to the TEST_BELOW syntax, include a short comment in the whitespace region to explain that the whitespace below the header is the result of an automatic translation tool.

@damiendoligez
Member Author

@gasche these are all good ideas, I'll work on them.

@damiendoligez
Member Author

@gasche all done. I've committed the script I used to translate all the tests. Only one file isn't handled by the script: Short_description.txt, which needed an update to its reference file.

Note that this latest version doesn't accept the old syntax anymore.

Member

@gasche gasche left a comment

The translator inserts blank lines between statements that are not environment actions.

arch_power;

native;

ocamlopt_flags = "-flarge-toc";
ocamlopt.byte;

run;
flags = "-S -function-sections";
function_sections;
{
  arch_arm;

  reference = "${test_source_directory}/func_sections.arm.reference";
  native;
}{
  arch_arm64;

  reference = "${test_source_directory}/func_sections.arm.reference";
  native;
}{
  arch_amd64;

  reference = "${test_source_directory}/func_sections.reference";
  native;
}{
  arch_i386;

  reference = "${test_source_directory}/func_sections.reference";
  native;
}

I think that the output would be more compact and thus easier to read if it 2-indented the environment actions instead:

arch_power;
native;
  ocamlopt_flags = "-flarge-toc";
ocamlopt.byte;
run;
  flags = "-S -function-sections";
function_sections;
{
  arch_arm;
    reference = "${test_source_directory}/func_sections.arm.reference";
  native;
}{
  arch_arm64;
    reference = "${test_source_directory}/func_sections.arm.reference";
  native;
}{
  arch_amd64;
    reference = "${test_source_directory}/func_sections.reference";
  native;
}{
  arch_i386;
    reference = "${test_source_directory}/func_sections.reference";
  native;
}

flags = "-w +33"
{
flags = "-w +33";
}
Member

It looks like the test blocks in ocamltest.org are still using the mandatory-outer-brace syntax, and could benefit from an update.

Member Author

You're right. Done.

@@ -35,6 +36,9 @@ type tsl_item =

type tsl_block = tsl_item list
Member

In the context of the new syntax, the name tsl_block to mean "a toplevel script" is a bit confusing as it could equally mean a block as in C. I would remove this synonym and expand it away in the parser.

Member Author

This is from the old code and I haven't changed it. My plan is to remove it altogether in the follow-up PR that will remove the old syntax and the translation tool, refactor this AST code, and change the log format.

%token TSL_BEGIN_C_STYLE TSL_END_C_STYLE
%token TSL_BEGIN_OCAML_STYLE TSL_END_OCAML_STYLE
%token COMMA
%token <bool> TSL_BEGIN_C_STYLE
Member

bool is meh here, could we use type script_position = [ `Above | `Below ] or something?

Member Author

Done.
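A minimal sketch of the suggestion above, for readers unfamiliar with polymorphic variants. Only the type `script_position` comes from the review comment; the helper functions and the mapping from the opening keyword to a position are assumptions made for illustration, not the code that was committed.

```ocaml
(* Where the test script sits relative to the code of the test file. *)
type script_position = [ `Above | `Below ]

(* Hypothetical mapping from the opening keyword to a position. *)
let position_of_keyword : string -> script_position option = function
  | "TEST" -> Some `Above
  | "TEST_BELOW" -> Some `Below
  | _ -> None

(* Unlike a bare bool, each case names its meaning at the use site. *)
let describe : script_position -> string = function
  | `Above -> "script precedes the code"
  | `Below -> "script follows the code"

let () =
  match position_of_keyword "TEST_BELOW" with
  | Some pos -> print_endline (describe pos)
  | None -> print_endline "not a test header"
```

Compared with `true` and `false`, a pattern match on `` `Above `` and `` `Below `` is self-documenting, and the compiler still checks that both cases are handled.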

%type <Tsl_ast.tsl_block> tsl_block
%start tsl_block_old tsl_block
%type <Tsl_ast.tsl_block> tsl_block_old
%type <Tsl_ast.t> tsl_block
Member

I find the name tsl_block confusing, the symbols could be called tsl_script and tsl_script_old instead.

Member Author

I've reverted the old one to its original tsl_block (which will go away soon) and used tsl_script for the new syntax.

ocamltest/tsl_parser.mly (outdated, resolved)
@shindere
Contributor

shindere commented Apr 19, 2023 via email

@damiendoligez
Member Author

Oh no, please. To me indentation is so strongly associated with nesting
that I'm pretty sure I'll have a hard time not making mistakes if I write
tests or not misunderstanding tests I read.

Same here. If we indent, it gives the impression that they are in the scope of the previous test, which is completely wrong.

I have two alternatives:

  1. change the syntax to add a * in front of test statements to make them stand out
  2. do nothing and use the translator in compact mode for all tests

@gasche
Member

gasche commented Apr 20, 2023

I also thought of proposing to terminate environment updates with , instead of ;.

Maybe you could try with the compact mode and we could see what it looks like?

@shindere
Contributor

shindere commented Apr 20, 2023 via email

@damiendoligez
Member Author

Here's the compact version for all the scripts.

Member

@gasche gasche left a comment

I have reviewed the implementation and believe that it is correct.

Note: we should wait to see what @shindere wants to say on this PR before merging.

ocamltest/main.ml (outdated, resolved)
let tsl_block = tsl_block_of_file_safe test_filename in
let (rootenv_statements, test_trees) = test_trees_of_tsl_block tsl_block in
let tsl_ast = tsl_block_of_file_safe test_filename in
let (rootenv_statements, test_trees) = test_trees_of_tsl_ast tsl_ast in
let test_trees = match test_trees with
| [] ->
Member

The handling of the empty test (empty means "run the default test please") is a bit fragile here as it depends on the behavior of your test_trees_of_tsl_ast function in subtle ways -- it breaks the reasonable assumption that (env, tests) is equivalent to ([], [Node (env, Tests.null, [], tests)]). It would be cleaner to expose a Default case in Tsl_ast.t, and handle that explicitly, something like

let (env, test_trees) = match tsl_ast with
| Default env -> default_test env
| Script ast -> test_trees_of_tsl_ast ast
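The fragment above can be expanded into a self-contained sketch. Everything here other than the constructor names `Default` and `Script` is a hypothetical stand-in: the real `Tsl_ast.t`, the environment type, and the test tree type in ocamltest are richer, and `default_test` is only a placeholder for whatever builds the default tests.

```ocaml
(* Hypothetical stand-ins for the real ocamltest types. *)
type env = (string * string) list
type test_tree = Node of env * string * test_tree list

type ast =
  | Default of env                 (* empty script: run the default tests *)
  | Script of env * test_tree list (* explicit script *)

(* Placeholder for whatever builds the default tests. *)
let default_test (env : env) : env * test_tree list =
  (env, [ Node ([], "default", []) ])

(* The empty-script case is handled explicitly rather than being inferred
   from an empty list of test trees. *)
let interpret : ast -> env * test_tree list = function
  | Default env -> default_test env
  | Script (env, trees) -> (env, trees)

let () =
  let _, trees = interpret (Default [ ("flags", "-g") ]) in
  Printf.printf "%d tree(s)\n" (List.length trees)
```

With a dedicated constructor, a script that legitimately contains no tests can no longer be confused with a request to run the default tests.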

Member Author

It's in fact worse than that, as there is a difference in behavior between env outside the tests and the env part of the first node of the test tree, even if there is only one tree, even if we are not running the default tests. I had a few tests failing because of that at some point.

Member

This is weird.

  1. Can you explain the difference?
  2. Do you see it as a part of the specification of ocamltest that should not change, or as an irregularity to be fixed by a future PR?

Member Author

The Ocaml_actions.find_source_module function is run on the initial environment before starting to interpret the tree. It is run through the Environments.initializers hook system. It seems to be used to compute the all_modules variable from a few other variables.

I didn't dig deeply, so I can't say if this would be easy to change.

Member

I had a look.

I think that we could get rid of those by avoiding environment initializers: config_variables is basically part of the initial empty environment, and find_source_modules could be avoided by calling a builtin function to recompute the all_modules data whenever it is needed by an action, taking the current environment into account (or using the content of the all_modules variable if specified explicitly by the user).

But this is also unimportant and independent of the present change, so I agree with not doing anything about it for now: the initial user-provided environment (the environment-statement prefix before the first test) has a special status, so be it.
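The on-demand alternative described above could look roughly like the following OCaml sketch. It is not ocamltest code: the environment is modelled as a plain string map, and `recompute` is a hypothetical stand-in for whatever logic derives the module list from the other variables.

```ocaml
module Env = Map.Make (String)

(* Return the module list for an action: the user's explicit setting if
   present, otherwise a value recomputed from the current environment.
   [recompute] is a hypothetical stand-in for the derivation logic. *)
let all_modules ~(recompute : string Env.t -> string) (env : string Env.t) =
  match Env.find_opt "all_modules" env with
  | Some explicit -> explicit
  | None -> recompute env

let () =
  let env = Env.add "all_modules" "a.ml b.ml" Env.empty in
  print_endline (all_modules ~recompute:(fun _ -> "") env)
```

Because the lookup happens when an action needs the value, it sees the environment at that point of the script, so no initializer has to run ahead of time on the initial environment.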

Member

(This supports the choice of removing the toplevel braces, as the toplevel node has a special status.)

ocamltest/tsl_semantics.ml (resolved)
@gasche
Member

gasche commented Apr 20, 2023

I forgot to mention that I think the new test syntax is a nice improvement over the previous one, making writing tests more pleasant in practice. An example of before/after on a reasonably complex test script chosen at random from the diff:

Before:

include dynlink
readonly_files = "backtrace_dynlink_plugin.ml"
libraries = ""
* shared-libraries
** native-dynlink
*** setup-ocamlopt.byte-build-env
**** ocamlopt.byte
module = "backtrace_dynlink.ml"
flags = "-g"
**** ocamlopt.byte
program = "backtrace_dynlink_plugin.cmxs"
flags = "-shared -g"
all_modules = "backtrace_dynlink_plugin.ml"
**** ocamlopt.byte
program = "${test_build_directory}/main.exe"
libraries = "dynlink"
all_modules = "backtrace_dynlink.cmx"
***** run
ocamlrunparam += ",b=1"
****** no-flambda
******* check-program-output
****** flambda
reference = "${test_source_directory}/backtrace_dynlink.flambda.reference"
******* check-program-output

After:

 include dynlink;
 readonly_files = "backtrace_dynlink_plugin.ml";
 libraries = "";
 shared-libraries;
 native-dynlink;
 setup-ocamlopt.byte-build-env;
 {
   module = "backtrace_dynlink.ml";
   flags = "-g";
   ocamlopt.byte;
 }{
   program = "backtrace_dynlink_plugin.cmxs";
   flags = "-shared -g";
   all_modules = "backtrace_dynlink_plugin.ml";
   ocamlopt.byte;
 }{
   program = "${test_build_directory}/main.exe";
   libraries = "dynlink";
   all_modules = "backtrace_dynlink.cmx";
   ocamlopt.byte;
   ocamlrunparam += ",b=1";
   run;
   {
     no-flambda;
     check-program-output;
   }{
     reference = "${test_source_directory}/backtrace_dynlink.flambda.reference";
     flambda;
     check-program-output;
   }
 }

In fact, it looks like this example is not doing what we want it to do, as the final test block does not depend on the build of backtrace_dynlink.ml and backtrace_dynlink_plugin.ml, which is probably wrong. I remember that @damiendoligez noticed a bug when discussing an example test script earlier, I don't know if this is the same issue -- probably. The test that we want is probably the following:

include dynlink;
readonly_files = "backtrace_dynlink_plugin.ml";
libraries = "";
shared-libraries;
native-dynlink;
setup-ocamlopt.byte-build-env;

module = "backtrace_dynlink.ml";
flags = "-g";
ocamlopt.byte;
program = "backtrace_dynlink_plugin.cmxs";
flags = "-shared -g";
all_modules = "backtrace_dynlink_plugin.ml";
ocamlopt.byte;
program = "${test_build_directory}/main.exe";
libraries = "dynlink";
all_modules = "backtrace_dynlink.cmx";
ocamlopt.byte;

ocamlrunparam += ",b=1";
run;
{
  no-flambda;
  check-program-output;
}{
  reference = "${test_source_directory}/backtrace_dynlink.flambda.reference";
  flambda;
  check-program-output;
}

If I'm not mistaken, this corresponds to the following current-syntax test:

include dynlink
readonly_files = "backtrace_dynlink_plugin.ml"
libraries = ""
* shared-libraries
** native-dynlink
*** setup-ocamlopt.byte-build-env
**** ocamlopt.byte
module = "backtrace_dynlink.ml"
flags = "-g"
***** ocamlopt.byte
program = "backtrace_dynlink_plugin.cmxs"
flags = "-shared -g"
all_modules = "backtrace_dynlink_plugin.ml"
****** ocamlopt.byte
program = "${test_build_directory}/main.exe"
libraries = "dynlink"
all_modules = "backtrace_dynlink.cmx"
******* run
ocamlrunparam += ",b=1"
******** no-flambda
********* check-program-output
******** flambda
reference = "${test_source_directory}/backtrace_dynlink.flambda.reference"
********* check-program-output
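The dependency problem discussed in this comment can be checked on a toy model. The following OCaml sketch is not part of ocamltest: the `script` record, the `chains` function, and the shortened test names (`setup`, `build-main`, `build-plugin`, `link`, `check`) are hypothetical simplifications of the scripts above, with environment statements left out. It assumes that a test only runs after the tests that precede it on the same path from the root, and that sibling blocks are independent of each other.

```ocaml
(* A sequence of tests, optionally ending in sibling blocks. *)
type script = { tests : string list; blocks : script list }

let leaf tests = { tests; blocks = [] }

(* Every root-to-leaf chain of tests; a test only runs after the ones
   before it on the same chain. *)
let rec chains { tests; blocks } =
  match blocks with
  | [] -> [ tests ]
  | _ ->
      List.concat_map
        (fun b -> List.map (fun c -> tests @ c) (chains b))
        blocks

(* Shape of the script as translated: three sibling build blocks. *)
let as_translated =
  { tests = [ "setup" ];
    blocks =
      [ leaf [ "build-main" ];
        leaf [ "build-plugin" ];
        { tests = [ "link"; "run" ];
          blocks =
            [ leaf [ "no-flambda"; "check" ]; leaf [ "flambda"; "check" ] ] } ] }

(* Shape of the intended script: the builds run in sequence. *)
let intended =
  { tests = [ "setup"; "build-main"; "build-plugin"; "link"; "run" ];
    blocks = [ leaf [ "no-flambda"; "check" ]; leaf [ "flambda"; "check" ] ] }

(* Does every chain that runs the program also build both inputs? *)
let runs_after_builds chain =
  (not (List.mem "run" chain))
  || (List.mem "build-main" chain && List.mem "build-plugin" chain)

let () =
  Printf.printf "as translated: %b\n"
    (List.for_all runs_after_builds (chains as_translated));
  Printf.printf "intended: %b\n"
    (List.for_all runs_after_builds (chains intended))
```

In this model the chains that reach `run` in `as_translated` contain neither build step, while in `intended` they contain both, which matches the concern raised above.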

@gasche
Member

gasche commented Apr 20, 2023

Regarding errors in the current scripts: in addition to backtrace_dynlink.ml, it looks like the test scripts for the lib-dynlink-initializers/test*_main.ml files (10 files in total I think) are also wrong for similar reasons, as well as scripts in lib-dynlink-packed and lib-dynlink-pr4229.

@shindere
Contributor

shindere commented Apr 20, 2023 via email

@gasche
Member

gasche commented Apr 20, 2023

Regarding errors in the test scripts: I believe that we should wait until the present PR is merged, and then fix the scripts in a later PR. (This preserves the property that the current PR is a one-to-one translation of the current test scripts.) (We could also try to fix the bugs in the current-syntax test scripts, but this is too painful.)

@shindere
Contributor

@damiendoligez and I have done an offline review of this PR.

I suggested a few changes which have now been pushed.

I'll approve this PR. As a note to the person who will merge it,
it may be good to squash the commits (@damiendoligez agrees).

… special. Treat them specially when using the new syntax.
fix test warnings/w59.ml
…ailing now: tool-ocamldoc/Short_description.txt, which needs an update to its reference file.
@damiendoligez
Member Author

Rebased to resolve conflict.

@damiendoligez damiendoligez merged commit 0a7c5fe into ocaml:trunk Apr 25, 2023
9 checks passed
@damiendoligez damiendoligez deleted the ocamltest-new-tsl branch April 25, 2023 14:25