Make startswith, endswith, occursin work for Symbols #35033

Open
oxinabox opened this issue Mar 6, 2020 · 1 comment
Labels
domain:strings "Strings!" kind:feature Indicates new feature / enhancement requests

Comments

@oxinabox
Contributor

oxinabox commented Mar 6, 2020

For better or worse, tabular data in Julia has settled on using Symbols for column names.
(This definitely has some advantages, but this is not the place to talk about the pros and cons of that decision.)

Because of this, one often finds oneself wanting to do string-like queries against column names.
For example, if you have a bunch of columns like :temperature_london, :temperature_boston, :temperature_bangalore, :temperature_winnipeg as well as some others, and you just want to get all the temperature ones, then you do:

filter(col_name -> startswith(col_name, "temperature_"), String.(names(df)))

At first the explicit call to String doesn't seem like much.
But when you see a file that does this like 5 or even 20 times,
it starts to become annoying.
It adds a lot of visual noise and it doesn't really tell you that much of interest about what the program does.

Thus I propose we make a limited extension of a few of the operations for querying Strings, to make them work on Symbols as well.
In particular: startswith, endswith, and occursin.
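
A minimal sketch of what the proposed behaviour would amount to. The helper names (sym_startswith, sym_endswith, sym_occursin) and the col_names vector are hypothetical, chosen for illustration only: the actual proposal is to add methods to the Base functions themselves, which is avoided here because defining them outside Base would itself be type piracy.

```julia
# Hypothetical helpers, not part of Base: each converts the Symbol to a String
# and forwards to the existing String method.
sym_startswith(s::Symbol, prefix::AbstractString) = startswith(String(s), prefix)
sym_endswith(s::Symbol, suffix::AbstractString) = endswith(String(s), suffix)
sym_occursin(needle::AbstractString, haystack::Symbol) = occursin(needle, String(haystack))

# Assumed stand-in for the Symbol column names of a table (no DataFrame needed).
col_names = [:temperature_london, :temperature_boston, :humidity_london]

# The query from above, without converting every column name up front.
temperature_cols = filter(c -> sym_startswith(c, "temperature_"), col_names)
```

With this approach the result stays a vector of Symbols, so it can be used to index back into the table without converting again.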

(We are not talking in this issue about other string operations; let's keep focused on these ones for now.)


This is part of a series of issues relating to invenia/Wrangling.jl#3
and removing its type piracy.
I will link the others below for cross-referencing.
#35031
#35032

@KristofferC
Sponsor Member

KristofferC commented Mar 7, 2020

(We are not talking in this issue about other string operations; let's keep focused on these ones for now.)

This looks very similar to the f(::Missing) = missing argument. Inevitably, someone will try to query using some function other than startswith, and when it doesn't work they will want to add that to the "blessed list of functions" where f(s::Symbol) = f(string(s)). There is no argument that can be used to say either yes or no to these requests; it's completely ad hoc.
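
To make the pattern concrete, here is a rough sketch of what a "blessed list" looks like in code. The module name SymbolQueries and the constant BLESSED are hypothetical. The functions are defined locally inside the module (shadowing the Base names there) so the sketch does not add methods to Base.

```julia
module SymbolQueries

# The "blessed list": nothing principled decides which names belong here.
const BLESSED = (:startswith, :endswith)

for f in BLESSED
    # Define a module-local function with the same name as the Base one,
    # following the f(s::Symbol) = f(string(s)) pattern.
    @eval $f(s::Symbol, pattern::AbstractString) = Base.$f(string(s), pattern)
end

end # module

SymbolQueries.startswith(:temperature_london, "temperature_")
```

Supporting any further function means editing the list by hand, which is the ad hoc step being objected to.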

The core issue here seems to be about using Symbols as column names. Why isn't the discussion about that?
