Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chained if_then_else application change #8671

Merged
merged 9 commits into from
Feb 22, 2024

Conversation

JaroslavTulach
Copy link
Member

@JaroslavTulach JaroslavTulach commented Jan 4, 2024

Pull Request Description

Attempt to fix #8489 by

  • adding test describing current behavior: c5bef79
  • another test showing desired Group-like behavior: 1c722f4
  • changing the first test to behave like the group one: 3f7c373

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • All code follows the
    Scala,
    Java,
    and
    Rust
    style guides.
  • All code has been tested:
    • Unit tests have been written where possible.

@JaroslavTulach JaroslavTulach added the CI: No changelog needed Do not require a changelog entry for this PR. label Jan 4, 2024
@JaroslavTulach JaroslavTulach self-assigned this Jan 4, 2024
@JaroslavTulach JaroslavTulach marked this pull request as draft January 4, 2024 07:13
@JaroslavTulach JaroslavTulach changed the title Test describing the current behavior of chained if then else application Chained if_then_else application change Jan 4, 2024
@@ -237,6 +237,7 @@ impl<'s> Resolver<'s> {
self.context = Context::Statement;
}
token::Variant::BlockStart(_) => {
self.finish_current_line();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This finish_current_line call fixes the if_then_else_chained_block test, but it breaks then_block and else_block. Do I need to finish only when the macro is already saturated? What is the easiest way to find that out, @kazcw?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot do this unconditionally--in most cases a block should be attached to the preceding line.

We need to introduce a concept of inline context. When we enter a macro segment (by reading the header token), we enter inline context until the next newline. When the newline is reached, if any tokens were read while in inline context, then we finish_current_line and the following block will be attached to the full preceding macro invocation; in any other case, we don't call finish_current_line, allowing the child block to be attached to its preceding line as usual.

I could implement it after my current issue (#8683).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to introduce a concept of inline context. When we enter a macro segment (by reading the header token), we enter inline context until the next newline.

In the next example we enter a macro segment ...

if True then
  False

...by seeing the if token...

When the newline is reached, if any tokens were read while in inline context,

...we have seen if token, some token(s) in between and the then token....

then we finish_current_line and the following block will be attached to the full preceding macro invocation;

...however calling finish_current_line would turn the example into (if True then) False which is not what we want. Or I am not getting the concept of any tokens were read properly...

Do I need to finish only when the macro is already saturated?

...I still tend to think in the term of saturation. The if True then macro isn't saturated as we know there has to be something behind then - unsaturated macros would not finish_current_line.

I could implement it after my current issue (#8683).

Maybe you will have to. However James asked the engine-ers to learn the Rust parser part and I volunteered. The learning is valuable even if it doesn't deliver workable solution. Thanks for your hints.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then is the macro segment header before the newline, so no tokens have been read in the segment before the newline occurs. (When the body of the if segment ends without a newline, we leave its inline context without incident).

I think there's only a semantic difference between our paradigms in one case: IIUC, in the saturation model, a macro segment cannot accept a block as its body unless the macro segment requires a body. In the inline-context model, an everything() pattern (which does match nothing) can match an expression on the same line, or a block, but not both.

Macros use everything() a lot; even when a segment is semantically required to be non-empty we don't generally use a pattern that must be non-empty, because a macro non-match results in interpreting the input as a non-macro. When we have a near miss, like a macro with a missing body in one segment, we allow it to match but wrap it in Invalid if necessary, because recognizing poorly-matched macros as such gives the best analysis and error reporting.

Maybe saturation is a better paradigm, but it's not obvious to me how to reconcile it with the use of optional segment bodies.

Copy link
Member Author

@JaroslavTulach JaroslavTulach Feb 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot do this unconditionally

6f5ec2a introduces start_segment flag and only finishes line when the flag isn't set. That fixes if_then_else_chained_block as well as then_block and else_block tests.

However there are still problems: 96 passed; 8 failed, failures: attributes_in_types, doc_comments, type_constructors, type_def_defaults, type_def_full, type_def_nested, type_methods, type_operator_methods.

I investigated attributes_in_types failure:

  left: `"(BodyBlock #((ArgumentBlockApplication (TypeDef type A #() #()) #((Annotated \"@\" a (Ident z) #() ()) (Annotated \"@\" b () #() ()) (Ident x)))))"`,
 right: `"(BodyBlock #((TypeDef type A #() #((Annotated \"@\" a (Ident z) #(()) (Annotated \"@\" b () #(()) (Ident x)))))))"`

e.g. there is extra ArgumentBlockApplication in the tree. I am not sure how to differentiate this situation and avoid finish_current_line? Your guidance is needed, @kazcw. Thanks in advance for your help.

Btw. I was surprised there is a macro in "type A\n @a z\n @b\n x" at all! But yeah, it is defined here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking in to this in more depth, I think the right solution is to define a new macro pattern type and use it in the if-then-else definition, so I've implemented it that way.

@kazcw
Copy link
Contributor

kazcw commented Feb 21, 2024

Using parse_all_enso_files.sh, I have found that this syntax change affects exactly one of our .enso files.

The interpretation of these two assignments involving if/else expressions has changed:

mixed_values = if setup.is_database then [] else [[My_Type.Value "1", My_Type.Value "2", Value_Type.Mixed]]
+ [[[1], [2], Value_Type.Mixed]]
date_time_values = if setup.test_selection.date_time.not then [] else [[Date.new 2024 1 3, Date.new 2024 1 4, Value_Type.Date]]
+ [[Date_Time.new 2024 1 3 2 30 10 zone=zone, Date_Time.new 2024 1 3 2 30 11 zone=zone, Value_Type.Date_Time]]
+ [[Time_Of_Day.new 2 30 10, Time_Of_Day.new 2 30 11, Value_Type.Time]]

Previously, the lines starting with + were interpreted as part of the else branch; they now apply to the value of the whole if/else expression. It looks like that was the intent of the code, so no change is necessary.

Copy link
Contributor

@kazcw kazcw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll mark this "approved" because @JaroslavTulach can't technically review it, as the PR author. Jaroslav, if this looks good to you it should be ready to merge.

@kazcw
Copy link
Contributor

kazcw commented Feb 21, 2024

The tests don't pass if reinterpreted, so they were probably right before. I updated the syntax in the test file to maintain the original semantics.

Copy link
Member Author

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @kazcw for reimplementing the logic with patterns. Yes, the changes look good to me to the extent I can review them.

items: Vec<syntax::Item<'s>>,
context: Context,
precedence: syntax::operator::Precedence<'s>,
start_segment: bool,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the start_segment idea wasn't the right way to solve the problem. 287e307 (with its Block and NotBlock patterns) is.

@@ -230,7 +230,7 @@ fn export_body<'s>(
/// If-then-else macro definition.
pub fn if_then_else<'s>() -> Definition<'s> {
crate::macro_definition! {
("if", everything(), "then", everything(), "else", everything()) if_body}
("if", everything(), "then", everything(), "else", or(block(), many(not_block()))) if_body}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not completely sure what is supposed to be a difference between everything() and or(x,not(x)) - in binary logic x or not x is always true - e.g. close to everything(). Here we have or(block(), many(not_block()))) which is almost or(block(), not_block())) - e.g. x or not x.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The many is important! The difference is that this will either match a block at the beginning, or a sequence of non-block items--but unlike everything, it won't match a sequence of non-block items followed by a block.

@@ -341,6 +357,23 @@ impl Pattern {
Err(input)
},
},
PatternData::Block => match input.pop_front() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we have some parser combinators, here! @hubertp is currrently trying to get rid of one such library as part of #9089...

@JaroslavTulach
Copy link
Member Author

Thanks a lot again, @kazcw. It is fun to debug the code and see what it does - moreover it is way easier than to try to write the code myself. I've added two more tests to check the chained behavior a bit more: 8348ce8

@kazcw
Copy link
Contributor

kazcw commented Feb 22, 2024

Thanks a lot again, @kazcw. It is fun to debug the code and see what it does - moreover it is way easier than to try to write the code myself. I've added two more tests to check the chained behavior a bit more: 8348ce8

Yeah. I think what I ended up doing is probably more along the lines of what you were getting at with saturation--I was missing the saturation solution because it needed to be saturation of an alternation subpattern, and we hadn't been using the Pattern mechanism much (in most cases simple patterns enable better error reporting).

@kazcw kazcw merged commit ad2f5b0 into develop Feb 22, 2024
27 of 28 checks passed
@kazcw kazcw deleted the wip/jtulach/IfThenElseChain_8489 branch February 22, 2024 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

if then else and chained block syntax
3 participants