Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor expression handling #292

Merged
merged 14 commits into from Sep 11, 2021

Conversation

pdeffebach
Copy link
Collaborator

This PR re-factors the expression handling. As a result it's now simpler, fewer LOC, and has more abstractions so the code is less fragile.

@pdeffebach
Copy link
Collaborator Author

Okay @bkamins, @nalimilan this PR is ready for a review.

Basically, I was tired of having very complicated code for parsing. I also had the column selection methods hard-coded, which resulted in a lot of brittle code.

I also standardized the parsing, so that @eachrow and the main macros have the same parsing rules. There was a small discrepancy in behavior that I added a test for.

Overall I'm happy with this. But I understand this is a hard thing for you to review since it's very technical.

One point to highlight: I use the following pattern

"""
    get_column_expr(x)

If the input is a valid column identifier, i.e.
a `QuoteNode` or an expression beginning with
`$DOLLAR`, returns the underlying identifier.

If input is not a valid column identifier,
returns nothing.
"""
get_column_expr(x) = nothing
function get_column_expr(e::Expr)
    e.head == :$ && return e.args[1]
    onearg(e, :cols) && return e.args[2]
    return nothing
end
get_column_expr(x::QuoteNode) = x

To check if something is a column expression, I check if the output is === nothing. This means I don't have to have two separate code paths -- one for checking if an expression is a column and another for actually getting the column reference. I would assume that the === nothing stuff propagates perfectly so there is no performance loss, as this is how iteration works.

@bkamins
Copy link
Member

bkamins commented Aug 23, 2021

onearg(e, :cols) && return e.args[2]

is cols deprecated or it will be kept as an alternative to $?

@bkamins
Copy link
Member

bkamins commented Aug 23, 2021

Yes - it is hard to follow as the changes are small and scattered. But it is nice that all tests pass 😄.

What I would recommend. Can you please add a testset testing get_column_expr explicitly? Then feed into it various values - good and bad (and try to think of some corner cases). This will:

  1. make it easier to make sure all is right (and maybe some hard corner cases would come to my mind)
  2. kind-of document the expected behavior via tests (so it is easier to understand what this function achieves)

@pdeffebach
Copy link
Collaborator Author

is cols deprecated or it will be kept as an alternative to $?

No, I need to add back in warning messages.

Okay, I will add a test set just for parsing.

@pdeffebach
Copy link
Collaborator Author

Okay I have added new tests that should give a clearer idea of the rules for expression handling.

Ready for a review!

One thing: The functions replace_syms! and eachrow_replace are very similar. However they are different enough that writing a single function would require some over-engineering at this point in time.

i = @protect $1
@test i == 1

n = @protect "hello"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so without $ only symbol is accepted? What is the result of:

@protect Symbol("x")

?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will give a Symbol, you can't get a column through Symbol("x"). I will add this to the tests to clarify.

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments.

src/parsing.jl Outdated Show resolved Hide resolved

If the input is a valid column identifier, i.e.
a `QuoteNode` or an expression beginning with
`$DOLLAR`, returns the underlying identifier.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this work?

Suggested change
`$DOLLAR`, returns the underlying identifier.
`\$`, returns the underlying identifier.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it shows up as a backslash inside the auto-generated docs. This was the only solution that seemed to not cause runtime errors from interpolation and also not cause problems in the html of the docs.

src/parsing.jl Outdated Show resolved Hide resolved
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
@pdeffebach
Copy link
Collaborator Author

Okay, will merge this after CI and then turn to the AsTable stuff

@pdeffebach
Copy link
Collaborator Author

Actually, I will wait for an approval. With a single approval I will merge.

@pdeffebach pdeffebach merged commit 6ba85a7 into JuliaData:master Sep 11, 2021
@pdeffebach
Copy link
Collaborator Author

Thanks!

@pdeffebach pdeffebach deleted the refactor_expression branch September 11, 2021 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants