Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AST generated from a dict literal containing dict unpacking #4449

Merged
merged 13 commits into from
Jan 22, 2023

Conversation

harupy
Copy link
Contributor

@harupy harupy commented Jan 15, 2023

Related to astral-sh/ruff#1879.

{"foo": 1, **{"bar": 1}} is parsed into the AST below in CPython and RustPython. In CPython, Dict.keys contains two keys (Constant and None), but in RustPython, Dict.keys only contains one key. This PR fixes it.

CPython:

Expand
Module(
  body=[
    Expr(
      value=Dict(
        👇👇👇 Contains two keys, but the second key is None
        keys=[
          Constant(
            value='foo',
            lineno=1,
            col_offset=1,
            end_lineno=1,
            end_col_offset=6),
          None],
        values=[
          Constant(
            value=1,
            lineno=1,
            col_offset=8,
            end_lineno=1,
            end_col_offset=9),
          Dict(
            keys=[
              Constant(
                value='bar',
                lineno=1,
                col_offset=14,
                end_lineno=1,
                end_col_offset=19)],
            values=[
              Constant(
                value=1,
                lineno=1,
                col_offset=21,
                end_lineno=1,
                end_col_offset=22)],
            lineno=1,
            col_offset=13,
            end_lineno=1,
            end_col_offset=23)],
        lineno=1,
        col_offset=0,
        end_lineno=1,
        end_col_offset=24),
      lineno=1,
      col_offset=0,
      end_lineno=1,
      end_col_offset=24)],
  type_ignores=[])

RustPython

Expand
[
    Located {
        location: Location {
            row: 1,
            column: 0,
        },
        end_location: Some(
            Location {
                row: 1,
                column: 24,
            },
        ),
        custom: (),
        node: Expr {
            value: Located {
                location: Location {
                    row: 1,
                    column: 0,
                },
                end_location: Some(
                    Location {
                        row: 1,
                        column: 24,
                    },
                ),
                custom: (),
                node: Dict {
                    👇👇👇 contains one key
                    keys: [
                        Located {
                            location: Location {
                                row: 1,
                                column: 1,
                            },
                            end_location: Some(
                                Location {
                                    row: 1,
                                    column: 6,
                                },
                            ),
                            custom: (),
                            node: Constant {
                                value: Str(
                                    "foo",
                                ),
                                kind: None,
                            },
                        },
                    ],
                    👇👇👇 values look correct?
                    values: [
                        Located {
                            location: Location {
                                row: 1,
                                column: 8,
                            },
                            end_location: Some(
                                Location {
                                    row: 1,
                                    column: 9,
                                },
                            ),
                            custom: (),
                            node: Constant {
                                value: Int(
                                    1,
                                ),
                                kind: None,
                            },
                        },
                        Located {
                            location: Location {
                                row: 1,
                                column: 13,
                            },
                            end_location: Some(
                                Location {
                                    row: 1,
                                    column: 23,
                                },
                            ),
                            custom: (),
                            node: Dict {
                                keys: [
                                    Located {
                                        location: Location {
                                            row: 1,
                                            column: 14,
                                        },
                                        end_location: Some(
                                            Location {
                                                row: 1,
                                                column: 19,
                                            },
                                        ),
                                        custom: (),
                                        node: Constant {
                                            value: Str(
                                                "bar",
                                            ),
                                            kind: None,
                                        },
                                    },
                                ],
                                values: [
                                    Located {
                                        location: Location {
                                            row: 1,
                                            column: 21,
                                        },
                                        end_location: Some(
                                            Location {
                                                row: 1,
                                                column: 22,
                                            },
                                        ),
                                        custom: (),
                                        node: Constant {
                                            value: Int(
                                                1,
                                            ),
                                            kind: None,
                                        },
                                    },
                                ],
                            },
                        },
                    ],
                },
            },
        },
    },
]

@harupy harupy changed the title Fix AST when parsing a dict literal containing dict spreading Fix AST generated from a dict literal containing dict spreading Jan 15, 2023
Comment on lines 1141 to 1175
// let (keys, values) = match pairs.iter().position(|(k,_)| k.is_none()) {
// Some(unpack_idx) => {
// let mut pairs = pairs;
// let (keys, mut values): (_, Vec<_>) = pairs.drain(..unpack_idx).map(|(k, v)| (*k.unwrap(), v)).unzip();
//
// fn build_map(items: &mut Vec<(ast::Expr, ast::Expr)>) -> ast::Expr {
// let location = items[0].0.location;
// let end_location = items[0].0.end_location;
// let (keys, values) = items.drain(..).unzip();
// ast::Expr {
// location,
// end_location,
// custom: (),
// node: ast::ExprKind::Dict { keys, values }
// }
// }
//
// let mut items = Vec::new();
// for (key, value) in pairs.into_iter() {
// if let Some(key) = key {
// items.push((*key, value));
// continue;
// }
// if !items.is_empty() {
// values.push(build_map(&mut items));
// }
// values.push(value);
// }
// if !items.is_empty() {
// values.push(build_map(&mut items));
// }
// (keys, values)
// },
// None => pairs.into_iter().map(|(k, v)| (k, v)).unzip()
// };
Copy link
Contributor Author

@harupy harupy Jan 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What this block does is:

{
  "a": "b",
  **c,
  "d": "e", # <- treat this as Dict so we can process this item with Instruction::DictUpdate
  **f,
}

We can remove this.

@harupy harupy marked this pull request as ready for review January 15, 2023 06:47
keys: &[Option<ast::Expr>],
values: &[ast::Expr],
) -> CompileResult<()> {
emit!(self, Instruction::BuildMap { size: 0 });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation looks like to make performance regression

Copy link
Contributor Author

@harupy harupy Jan 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Reverted.

@harupy harupy requested review from youknowone and removed request for coolreader18 January 16, 2023 00:20
@harupy harupy changed the title Fix AST generated from a dict literal containing dict spreading Fix AST generated from a dict literal containing dict unpacking Jan 16, 2023
@harupy
Copy link
Contributor Author

harupy commented Jan 17, 2023

@youknowone @coolreader18 Would you mind taking a look?

@@ -195,7 +195,7 @@ pub enum ExprKind<U = ()> {
orelse: Box<Expr<U>>,
},
Dict {
keys: Vec<Expr<U>>,
keys: Vec<Option<Expr<U>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found I changed it to the previous form to keep cpython compatibility in #4042.
I'd like to ask Option<Expr<U>> is fit in cpython way.
I thought this is the new cpython way when i was working on #4042 which is changed in 3.10

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youknowone Why did you change it to the previous form?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember well so looking in it again.
I thought CPython's Python.asdl actually changed it to not optional here.
https://github.com/RustPython/RustPython/pull/4042/files#diff-242a44b9d53b6c6212a752e0c97d6cf2f87794f579700aa8015bb99cef63925cR63

But I still can see ast.parse generate None for empty keys in Python 3.10 as you described. I am confused.
Do you know what the '?' means in Python.asdl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ? means optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was the reason I thought CPython changed it from optional key to non-optional one in 3.10. How they still have None in parse.dump?

@@ -195,7 +195,7 @@ pub enum ExprKind<U = ()> {
orelse: Box<Expr<U>>,
},
Dict {
keys: Vec<Expr<U>>,
keys: Vec<Option<Box<Expr<U>>>>,
Copy link
Contributor Author

@harupy harupy Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
keys: Vec<Option<Box<Expr<U>>>>,
keys: Vec<Option<Expr<U>>>,

@youknowone Is Box necessary?

@@ -60,7 +60,7 @@ module Python
| UnaryOp(unaryop op, expr operand)
| Lambda(arguments args, expr body)
| IfExp(expr test, expr body, expr orelse)
| Dict(expr* keys, expr* values)
| Dict(expr?* keys, expr* values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youknowone True, but dict.keys does contain None.

Copy link
Contributor Author

@harupy harupy Jan 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/python/cpython/blob/main/Parser/Python.asdl#L64 used to look like | Dict(expr?* keys, expr* values)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/python/cpython/blob/3325f054e33b318aa56b74472f76a56b8afc0510/Grammar/python.gram#L904

double_starred_kvpair[KeyValuePair*]:
    | '**' a=bitwise_or { _PyPegen_key_value_pair(p, NULL, a) }
                                                     ^^^^
    | kvpair

Copy link
Contributor Author

@harupy harupy Jan 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youknowone Can we move forward with expr?* keys or should we fix compiler/ast/asdl_rs.py as I did in
ea9db0e?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think expr?* is the right thing here. Not only is it probably not valid ASDL syntax (see here, the modifiers are in brackets as opposed to braces, though this might not really be an issue), but it is also kinda ambiguous, is it an Optional list or a list of optionals?

Can't we get the same behavior by modifying asdl_rs and keeping Python.asdl the same as CPythons?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only is it probably not valid ASDL syntax (see here, the modifiers are in brackets as opposed to braces, though this might not really be an issue), but it is also kinda ambiguous, is it an Optional list or a list of optional?

Agreed!

Can't we get the same behavior by modifying asdl_rs and keeping Python.asdl the same as CPythons?

We can. I'm happy to revert the last commit.

@youknowone What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we could do, I prefer to keep Python.asdl as same as CPython. It was your original approach, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updated the code.

@youknowone
Copy link
Member

youknowone commented Jan 21, 2023 via email

@harupy
Copy link
Contributor Author

harupy commented Jan 21, 2023

I will add a comment that part is our local patch not to break it again in future

Makes sense! Feel free to push a commit to this PR if necessary :)

@harupy harupy requested review from youknowone and DimitrisJim and removed request for youknowone and DimitrisJim January 21, 2023 16:50
@harupy
Copy link
Contributor Author

harupy commented Jan 22, 2023

@youknowone @DimitrisJim Updated the code! Would you mind taking another look?

Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! And I am really sorry for confusing review for this PR 😅

@harupy
Copy link
Contributor Author

harupy commented Jan 22, 2023

Not a problem at all!

@youknowone youknowone merged commit 4f38cb6 into RustPython:main Jan 22, 2023
charliermarsh pushed a commit to astral-sh/ruff that referenced this pull request Jan 22, 2023
This PR upgrades RustPython to fix the type of `Dict.keys` to `Vec<Option<Expr>>` (see RustPython/RustPython#4449 for why this change was needed) and unblock #1884.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants