Skip to content
This repository has been archived by the owner on Jun 4, 2024. It is now read-only.

V2 filtering_settings #169

Closed
chriddyp opened this issue Oct 24, 2018 · 29 comments
Closed

V2 filtering_settings #169

chriddyp opened this issue Oct 24, 2018 · 29 comments
Labels
dash-type-epic Placeholder for multiple related issues attaining a specific goal

Comments

@chriddyp
Copy link
Member

chriddyp commented Oct 24, 2018

a placeholder issue to discuss the next version of the filtering_settings property. In particular:

  • Its name
  • Its structure: is it a string that the user has to parse or a more complex object?
  • What does the grammar look like? country eq("Canada") or country is Canada or country == Canada or country = Canada, etc

Related issues
#252
#301

chriddyp added a commit that referenced this issue Oct 24, 2018
Many of these props will change in the future, I’ve linked properties
to the following issues:
- #166
- #167
- #168
- #168
- #167
- #169
- #169
- #169
- #170
- #171
chriddyp added a commit that referenced this issue Oct 25, 2018
Many of these props will change in the future, I’ve linked properties
to the following issues:
- #166
- #167
- #168
- #168
- #167
- #169
- #169
- #169
- #170
- #171
chriddyp added a commit that referenced this issue Oct 25, 2018
Many of these props will change in the future, I’ve linked properties
to the following issues:
- #166
- #167
- #168
- #168
- #167
- #169
- #169
- #169
- #170
- #171
@chriddyp
Copy link
Member Author

chriddyp commented Nov 2, 2018

How can we make it easier for backend authors to write queries? num() and string casting (") is more complex to write and so in the docs, we don't support them. This makes the front-end filtering different than the back-end filtering, which is confusing.

For example, in the docs, front-end filtering only supports:

  • Enter eq "Asia" in the "continent" column
  • Enter > num(5000) in the "gdpPercap" column
  • Enter < num(80) in the lifeExp column

whereas backend filtering only supports:

  • Enter eq Asia in the "continent" column
  • Enter > 5000 in the "gdpPercap" column
  • Enter < 80 in the lifeExp column

@jonmmease
Copy link
Contributor

jonmmease commented Nov 3, 2018

Could we provide some way to automatically convert these expressions into expressions in the NumExpr DSL (https://numexpr.readthedocs.io/en/latest/user_guide.html#casting-rules)?

This DSL is directly support by pandas in the pd.query operation (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.query.html), making backend filtering an efficient one-liner. It also opens up the possibility to filter on expression involving multiple columns.

Update:
For that matter, any reason to not adopt the expression language itself for the front-end as well?

@chriddyp
Copy link
Member Author

chriddyp commented Nov 3, 2018

That's a really great idea 👍

@chriddyp
Copy link
Member Author

chriddyp commented Nov 3, 2018

In addition to the NumExpr DSL, I'd like for us to consider how this would scale for additional datatypes beyond str and numbers, including:

  • dates
  • times
  • IP addresses
  • categories vs freeform strings
  • numbers that could be considered strings (like zip codes)

Also, we should consider what the default "operator" is in the UI. Users shouldn't have to always specify an operator like contains or eq or ==: they should be able to just type some text (e.g. canada or 15) and a reasonable filtering operation should occur.

Here's a proposal that I worked on previously. Perhaps this can be updated to be more compatible with the NumExpr DSL


With the introduction of datatypes, the following changes could be made:

  • string and enum
    • Default operator is still "contains"
    • >, >=, <, <= will filter alphanumerically
    • Regex will be supported
  • number
    • Default operator is =, contains is not available
    • >, >=, <, <= will filter numerically
    • Decimal places will be supported
  • date
    • Default operator is "contains" and will filter all dates at a lower resolution. For example:
      • Searching 2015 will filter all values with the year 2015, regardless of the month, day, or time
      • Searching 2015-10-05 will filter all values on the day 2015-10-05, regardless of the time
      • In this mode, it will not be possible to search just "January" across all years
      • >, >=, <, <= will filter by date, starting at the lowest date (e.g. > 2015 is equivalent to > 2015-01-01 00:00:00)
      • = won't be supported. The user can search a specific date with the default and implicit "contains" operator

@eliasdabbas
Copy link

I've used RStudio's implementation of DataTable, and I think they have a very nice solution for filtering visually, especially for a non-tech audience

Here's quickly how it works.

I think it's a great idea to provide filters for different data types, especially dates.

If this solution makes sense, then it might probably boil down to utilizing dash-core-components elements within the filtering cells somehow, and that would make a beautiful easy-to-use solution.

By automatically detecting the data type of the column, the appropriate filter would be provided.

Or maybe even give the developer the option to determine which filter they want to add per column.

dcc.RangeSlider: for numerics
dcc.Checklist OR dcc.Dropdown with multi=True: for categorical values when there is a small number of unique options
dcc.DatePickerRange: for dates

@jonmmease
Copy link
Contributor

It doesn't sound like the exact NumExpr/df.query DSLs are going to cover exactly what you're talking about here, especially when considering the idea of a default operator. So it probably does make sense to still define a dash table DSL, but it would be nice to do it in such a way that the combination of all filtering operations can be transformed into a single pandas query string.

Here are a few example expressions that I was playing around with:

For the numerical column a

df.query('a == 1')
df.query('a > 0')
df.query('0 < a and a < 2')

For the string or categorical column c

df.query('c.str.contains("Tw")')
df.query('c.str.startswith("T")')
df.query('c.str.match("\w*o")')
df.query('c.str.count("([a-g])") > 1')

For the datetime column d

df.query('d.dt.year == 2018')
df.query('d >= "2018-04"')
df.query('d.dt.day_name() == "Monday"')
df.query('d.dt.is_month_start')

So my thought is that if a table has filter expressions on columns a, b, and c then the combination of expressions gets transformed into a query string like

df.query('a > 0 and c.str.match("\w*o") and d >= "2018-04"')

@chriddyp
Copy link
Member Author

chriddyp commented Nov 7, 2018

Another thing to consider is how this relates to conditional formatting. For conditional formatting, we use the same filtering DSL (see https://dash.plot.ly/datatable/style). It's not clear to me if pandas uses the NumExpr/df.query DSL for their conditional formatting, their docs mostly show an imperative workflow (https://pandas.pydata.org/pandas-docs/stable/style.html)

@Marc-Andre-Rivet Marc-Andre-Rivet added the dash-type-epic Placeholder for multiple related issues attaining a specific goal label Nov 8, 2018
@gaoy400
Copy link

gaoy400 commented Nov 20, 2018

@eliasdabbas I am also looking forward to have a filter with RangeSlider and/or dropdown. The datatable video in your comment is of great interest to me. One thing that I am looking for beyond the video is a backend filtering option even when using dropdown/RangeSlider filter.

Really looking forward to have this feature in the dash table in the future!

@eliasdabbas
Copy link

@gaoy400 That would also be very nice.
This might be related somehow to #213.

@rbolado

This comment has been minimized.

@marfel
Copy link

marfel commented Nov 30, 2018

I appreciate the discussion here, but implementing all this really looks like an epic endeavor, and I wonder if its really necessary for the average user? When I first stumbled on upon the dash_table package, ran the example and it took me a while to figure out how to specify filter expressions correctly. I found it not very intuitive.

Wouldn't it cover about 95% of all cases, if any column (even numerical ones) could be filtered by simply typing a few letters or numbers into their filter field, which are then matched anywhere and without case. In this datatable implementation
https://datatables.net/manual/styling/jqueryui
there is even only one global search field! If you get too many results, you just add another word. Aside from that, the "<" and ">" operators for numerical columns can simply be achieved by sorting by the respective columns.

IMHO, simplicity, intuitiveness and responsiveness are more valuable than being able to "program" a filter with lots of conditions and operators. Perhaps such a simple, alternative filtering interface could be implemented quickly, as an option? The RStudio example is certainly fancy, but as an interface for the typical non-programmer user it's probably a little bit over the top.

Just my two cents.

@eliasdabbas
Copy link

Thanks @marfel
I fully agree that having the user "program" filters is difficult, even for programmers, because they would still have to learn the syntax.
Incidentally, the RStudio example is build with datatables that you link to.
The default is one search field and the others are optional if the developer thinks they are necessary.
I don't see why they are difficult to use. I think they are very easy and intuitive. They are visual controls, and once you click on them you immediately see the max and min numbers and a slider, (or text filters) so they seem, to me at least, very easy to use.

@gaoy400
Copy link

gaoy400 commented Dec 3, 2018

I totally agree with @marfel that the current filtering feature in dash_table is not quite user-friendly. But the one in the video shared by @eliasdabbas seems really intuitive to me. It does not require users to type any programming-like strings or so.

Still, implementing a similar feature as the one in the video would be an epic endeavor for sure...

@JeroneGagliano
Copy link

I agree with @marfel
I love what you all are doing with Dash but this feature is really needed. I had been using dash-table-experiments and chose it because of the easy filtering. Why not implement the "Filter Rows" that was in dash-table-experiments? It is so easy to use and intuitive!!! It was dynamic text filtering and numbers only required an intuitive <, >, <= etc operator. The current filtering in the currently Dash DataTable is not useful. One has to type the full text string to filter which is not useful when strings are long or have changing case.

Is the code used in dash-table-experiments not compatible with the new DashTable?

@chriddyp
Copy link
Member Author

chriddyp commented Jan 16, 2019

Why not implement the "Filter Rows" that was in dash-table-experiments? It is so easy to use and intuitive!

That's what I've proposed in this issue (#169 (comment)) and what we're planning on doing. we're working on it!

@JeroneGagliano
Copy link

Thanks, great to hear that! Sorry I did not see that in your previous post.
I see that this issue is a Priority#2. Does this mean in the next few months?

@chriddyp
Copy link
Member Author

Does this mean in the next few months?

That's the goal. Also, as always, if your organization has budget for software, then certain features (like this one) can be prioritized and sponsored (https://plot.ly/products/consulting-and-oem/). It's really just a question of bandwidth :)

@JeroneGagliano
Copy link

I hear you. I am an independent consultant but would like to contribute financially to plotly dash development once my cashflow is up as a way of thanking you all.

@Harsha-Nori
Copy link

Same story here as @JeroneGagliano , I just wanted to chime in and say that the Dash-Table-Experiments was a real joy to use for filtering / sorting. It'd be awesome if there's any current way of getting DataTable to operate like that!

@kswann-imb
Copy link

I'm currently using the data table, and overall I think it's great, but the lack of filtering options is killing me. I understand this is being worked on, but I'd like to suggest that for the backend filtering filtering='be' that it doesn't pre-filter the strings for eq, > or <. I think for back end, a good solution while you are making a super awesome well thought out filtering system would be to just let people define their own logic.

For me I'd like a contains, that accepts a list of options like contains (John, Mary). It would be pretty easy to do this if you didn't pre-filter.

@chriddyp
Copy link
Member Author

chriddyp commented Jun 4, 2019

Done in #397 and #410. Documented in https://dash.plot.ly/datatable/filtering

@chriddyp chriddyp closed this as completed Jun 4, 2019
@xmatthias
Copy link

It's very strange that this is a closed issue, while the documentation at https://dash.plot.ly/datatable/filtering and https://dash.plot.ly/datatable/callbacks still highlights to follow this Issue for improved syntax, so either this is not completely closed, or the documentation was not updated properly.

callback page

image

filtering documentation page

image

@newuser357
Copy link

All,

I am still trying to use the Filtering = True without a callback and the filtering is not working.

Has there been a solution that has been implemented?

@r-chris
Copy link

r-chris commented Nov 6, 2019

@newuser357 filtering=true is not the correct way to enable front-end filtering. Please read: https://dash.plot.ly/datatable/filtering

Taken from that documentation:

dash_table.DataTable(
        id='datatable-filtering-fe',
        columns=[
            {"name": i, "id": i, "deletable": True} for i in df.columns
        ],
        data=df.to_dict('records'),
        filter_action="native",
    )

@sumelinadar
Copy link

If i want to filter on the word negative, datatable is considering it to be ne gative, but its a contains filter and not ne. Is there anyway to overcome this problem??

@alexcjohnson
Copy link
Collaborator

@sumelinadar do you have the latest release? negative=>ne gative should have been fixed by #612 which was released with dash-table v4.4 (dash v1.4)

@tashrifbillah
Copy link

In case I missed, is there a way to filter whole table in Dash just like there is in R?
I am new to Dash and based on my reading of the documentation, filtering is column-wise only.

Hi @eliasdabbas , I would appreciate your comment in this regard.

@eliasdabbas
Copy link

I believe it's column based, but should be easy to make the search go through all the columns if that is what you want?

The filtering documentation shows how you can also filter with Python. Hope that helps.

@szkjn
Copy link

szkjn commented Feb 18, 2022

a placeholder issue to discuss the next version of the filtering_settings property. In particular:

  • Its name
  • Its structure: is it a string that the user has to parse or a more complex object?
  • What does the grammar look like? country eq("Canada") or country is Canada or country == Canada or country = Canada, etc

Related issues #252 #301

Hi.

Is filter_query the new property name replacing filtering_settings ? If not, what is the latter suppose to return ?
I'm assuming this as explained in #370 but would love to have confirmation.

Thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
dash-type-epic Placeholder for multiple related issues attaining a specific goal
Projects
None yet
Development

No branches or pull requests