Deselecting multiple columns not working as expected #1

Closed
mapi1 opened this issue Jul 21, 2023 · 4 comments

mapi1 commented Jul 21, 2023

I was trying to deselect some columns in a single @select() using - and ran into some errors. Here is an MWE with the cases I tried. I would have expected at least case 1 to work:

using Tidier
using RDatasets

movies = dataset("ggplot2", "movies")

@chain movies begin
    # @select(-Title, -Year) # case 1: Does not do anything
    # @select(-([1, 2]))     # case 2: Errors
    @select(-(1:2))          # case 3: Works, but inconvenient for columns that are not next to each other
end

Am I missing something?

PS: Great work on this package. I immediately felt at home, having worked a lot in R!

@kdpsingh
Member

Thanks for catching this. Case 1 should definitely work, so this looks like it may be a bug. Case 2 isn't supported right now, but we should support something like it. We will look into it. Stay tuned.

And glad you felt at home!

kdpsingh commented Jul 21, 2023

Ok I figured out what is going on for case 1.

This code in Tidier.jl...

@chain movies begin
    @select(-Title, -Year)
end

...is being converted to the following code in DataFrames.jl.

@chain movies begin
    select(Not(:Title), Not(:Year))
end

Because of the way negated selection works in DataFrames.jl, this results in Title being deselected but Year still remaining selected. The correct way to do this in DataFrames.jl is to produce either this in DF.jl v1.5...

@chain movies begin
    select(Not(Cols(:Title, :Year)))
end

...or this in DF.jl v1.6:

@chain movies begin
    select(Not(:Title, :Year))
end
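
For reference, the single-selector idea above can be sketched in plain DataFrames.jl on a toy table. This is only an illustration, not the fix that went into the package: the `df` table and its `Length` and `Rating` columns are made up here, and the vector form `Not([:Title, :Year])` is used because it does not depend on the `Not(Cols(...))` or multi-argument `Not(...)` forms quoted above.

```julia
using DataFrames

# Hypothetical toy table standing in for the movies dataset
df = DataFrame(
    Title  = ["A", "B"],
    Year   = [1999, 2005],
    Length = [90, 120],
    Rating = [6.1, 7.4],
)

# One Not(...) wrapping both column names drops both columns at once
dropped = select(df, Not([:Title, :Year]))

println(names(dropped))

# Version-dependent spellings quoted in the comment above (not run here):
# select(df, Not(Cols(:Title, :Year)))   # DF.jl v1.5
# select(df, Not(:Title, :Year))         # DF.jl v1.6
```

The point of the sketch is that the negation has to wrap the whole set of columns in a single selector, rather than being applied to each column separately.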

I have a few ideas on how to fix this, but they are all a bit tricky, so I will continue to give it some thought. I also want to enable case 2 and have some ideas about how to implement it.

Stay tuned!

kdpsingh commented Jul 21, 2023

One other thing to be aware of is that the following options are also valid in Tidier.jl and currently work fine. I fully agree that these options are inconvenient when the columns aren't next to each other.

@chain movies begin
    @select(-(Title:Year))
end

@chain movies begin
    @select(!(Title:Year))
end

kdpsingh transferred this issue from TidierOrg/Tidier.jl Jul 28, 2023
kdpsingh pushed a commit that referenced this issue Jun 8, 2024
fix(slice docs): slice_sample(n=5)

kdpsingh commented Jun 9, 2024

Sorry it took so long!

This works now, as of #104, with a slight modification to your first example:

@chain movies @select(-(Title, Year)) # with negated columns in a tuple
@chain movies @select(-[Title, Year]) # with negated columns in a vector
@chain movies @select(-[1, 2])
@chain movies @select(-(1:2))

kdpsingh closed this as completed Jun 9, 2024