feat: use ibis.desc()/.desc() with selectors #9369

Closed
1 task done
lostmygithubaccount opened this issue Jun 13, 2024 · 3 comments · Fixed by #9376
Labels
feature Features or general enhancements

Comments

@lostmygithubaccount
Member

Is your feature request related to a problem?

I want to be able to order by the column(s) whose names contain count, in descending order, after calling value_counts().

What is the motivation behind your request?

[ins] In [1]: import ibis

[ins] In [2]: import ibis.selectors as s

[ins] In [3]: ibis.options.interactive = True

[ins] In [4]: t = ibis.examples.penguins.fetch()

[ins] In [5]: t.select("species", "island").value_counts().order_by(s.contains("count"))
Out[5]:
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ species   ┃ island    ┃ species_island_count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ string    │ string    │ int64                │
├───────────┼───────────┼──────────────────────┤
│ Adelie    │ Biscoe    │                   44 │
│ Adelie    │ Torgersen │                   52 │
│ Adelie    │ Dream     │                   56 │
│ Chinstrap │ Dream     │                   68 │
│ Gentoo    │ Biscoe    │                  124 │
└───────────┴───────────┴──────────────────────┘

[nav] In [6]: t.select("species", "island").value_counts().order_by(ibis.desc(s.contains("count")))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[6], line 1
----> 1 t.select("species", "island").value_counts().order_by(ibis.desc(s.contains("count")))

File ~/repos/ibis/ibis/expr/api.py:612, in desc(expr)
    576 def desc(expr: ir.Column | str) -> ir.Value:
    577     """Create a descending sort key from `expr` or column name.
    578
    579     Parameters
   (...)
    610
    611     """
--> 612     return _deferred_method_call(expr, "desc")

File ~/repos/ibis/ibis/expr/api.py:573, in _deferred_method_call(expr, method_name)
    571 else:
    572     value = expr
--> 573 return method(value)

AttributeError: 'Predicate' object has no attribute 'desc'

[nav] In [7]: t.select("species", "island").value_counts().order_by(s.contains("count").desc())
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[7], line 1
----> 1 t.select("species", "island").value_counts().order_by(s.contains("count").desc())

AttributeError: 'Predicate' object has no attribute 'desc'
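
For anyone reading the tracebacks: both failures come down to looking up a method named desc on an object that does not define one. The sketch below is a simplified, stdlib-only stand-in and not the ibis source. FakeColumn, FakePredicate, call_method_by_name, and try_desc are hypothetical names used only for illustration.

```python
from operator import methodcaller


class FakeColumn:
    """Hypothetical stand-in for a column expression that supports .desc()."""

    def __init__(self, name):
        self.name = name

    def desc(self):
        # Return a simple (column name, direction) pair as the "sort key".
        return (self.name, "DESC")


class FakePredicate:
    """Hypothetical stand-in for a selector; it defines no .desc() method."""

    def __init__(self, substring):
        self.substring = substring


def call_method_by_name(obj, method_name):
    # Look up method_name on obj and call it with no arguments.
    # Raises AttributeError when obj has no such attribute.
    return methodcaller(method_name)(obj)


def try_desc(obj):
    # Convenience wrapper that reports the error as a string instead of raising.
    try:
        return call_method_by_name(obj, "desc")
    except AttributeError as exc:
        return f"AttributeError: {exc}"
```

Calling try_desc(FakeColumn("species_island_count")) returns a sort key, while try_desc(FakePredicate("count")) returns an AttributeError message, which mirrors the shape of the errors above.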

Describe the solution you'd like

Both calls above work: passing a selector to ibis.desc() and calling .desc() on a selector.

What version of ibis are you running?

main

What backend(s) are you using, if any?

duckdb

Code of Conduct

  • I agree to follow this project's Code of Conduct
@lostmygithubaccount lostmygithubaccount added the feature Features or general enhancements label Jun 13, 2024
@cpcloud
Member

cpcloud commented Jun 13, 2024

@lostmygithubaccount Is there an equivalent dplyr syntax here? Would be good to mirror that as much as possible if it exists.

@cpcloud
Member

cpcloud commented Jun 13, 2024

Looks like in dplyr you can use across:

image

And for desc as well:

image

If the equivalent doesn't already work in ibis, we should implement it that way.

Reference: https://dplyr.tidyverse.org/reference/arrange.html

@cpcloud
Member

cpcloud commented Jun 13, 2024

Looks like that method already works, so let's document it and close this out!

In [8]: from ibis.interactive import *

In [9]: t = ex.penguins.fetch()

In [10]: expr = t.select("species", "island").value_counts().order_by(s.across(s.contains("count"), _.desc()))

In [11]: ibis.to_sql(expr)
Out[11]:
SELECT
  *
FROM (
  SELECT
    "t1"."species",
    "t1"."island",
    COUNT(*) AS "species_island_count"
  FROM (
    SELECT
      "t0"."species",
      "t0"."island"
    FROM "penguins" AS "t0"
  ) AS "t1"
  GROUP BY
    1,
    2
) AS "t2"
ORDER BY
  "t2"."species_island_count" DESC

In [12]: expr
Out[12]:
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ species   ┃ island    ┃ species_island_count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ string    │ string    │ int64                │
├───────────┼───────────┼──────────────────────┤
│ Gentoo    │ Biscoe    │                  124 │
│ Chinstrap │ Dream     │                   68 │
│ Adelie    │ Dream     │                   56 │
│ Adelie    │ Torgersen │                   52 │
│ Adelie    │ Biscoe    │                   44 │
└───────────┴───────────┴──────────────────────┘
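
To make the idea behind the s.across(...) call concrete, here is a plain-Python sketch. It assumes a selector is just a rule that picks column names and that across applies a function to every picked column. It does not use ibis: contains, across, desc, and order_rows are hypothetical helpers written for this illustration, and the rows are copied from the output above.

```python
def contains(substring):
    # Hypothetical selector: picks the column names that contain `substring`.
    return lambda columns: [c for c in columns if substring in c]


def across(selector, make_key):
    # Hypothetical helper: resolve the selector against the available columns,
    # then build one sort key per selected column.
    return lambda columns: [make_key(c) for c in selector(columns)]


def desc(column):
    # A sort key here is just (column name, descending flag).
    return (column, True)


def order_rows(rows, keys_factory):
    columns = list(rows[0]) if rows else []
    ordered = list(rows)
    # list.sort is stable, so applying keys from last to first gives the
    # first key the highest priority.
    for column, descending in reversed(keys_factory(columns)):
        ordered.sort(key=lambda r: r[column], reverse=descending)
    return ordered


rows = [
    {"species": "Adelie", "island": "Biscoe", "species_island_count": 44},
    {"species": "Adelie", "island": "Torgersen", "species_island_count": 52},
    {"species": "Adelie", "island": "Dream", "species_island_count": 56},
    {"species": "Chinstrap", "island": "Dream", "species_island_count": 68},
    {"species": "Gentoo", "island": "Biscoe", "species_island_count": 124},
]

result = order_rows(rows, across(contains("count"), desc))
counts = [r["species_island_count"] for r in result]
```

The point of the design is that the selector is resolved against the table first, and only then is desc applied to each concrete column, so the selector itself never needs a desc method.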

Projects
Status: done
2 participants