docs: use to_pandas instead of execute
lostmygithubaccount authored and jcrist committed Jun 15, 2023
1 parent fe1fafe commit 882949e
Showing 12 changed files with 41 additions and 41 deletions.
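
A change like this is mostly a mechanical rename, so it can be sketched as a small script. The following is only an illustration, not how this commit was produced: the helper name `use_to_pandas` and the sample string are hypothetical, and the pattern deliberately covers only argument-free `.execute()` calls.

```python
import re

# Matches a method call `.execute()` with empty parentheses only; calls that
# pass arguments are left alone because they may need manual review.
_EXECUTE_CALL = re.compile(r"\.execute\(\)")


def use_to_pandas(text: str) -> str:
    """Replace every argument-free `.execute()` call in `text` with `.to_pandas()`."""
    return _EXECUTE_CALL.sub(".to_pandas()", text)


# Hypothetical sample line in the style of the docs touched by this commit.
sample = ">>> penguins.head().execute()"
print(use_to_pandas(sample))
```

Calls that take arguments (such as `con.execute(q)` in docs/index.md) and prose mentions of `execute` are not matched by this pattern and would still need manual edits.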
24 changes: 12 additions & 12 deletions docs/backends/Impala.md
@@ -192,15 +192,15 @@ table.drop()

## Expression execution

-Ibis expressions have an `execute` method with compiles and runs the
+Ibis expressions have execution methods like `to_pandas` that compile and run the
expressions on Impala or whichever backend is being referenced.

For example:

```python
>>> fa = db.functional_alltypes
>>> expr = fa.double_col.sum()
->>> expr.execute()
+>>> expr.to_pandas()
331785.00000000006
```

@@ -235,7 +235,7 @@ If you pass an Ibis expression to `create_table`, Ibis issues a
>>> db.create_table('string_freqs', expr, format='parquet')

>>> freqs = db.table('string_freqs')
->>> freqs.execute()
+>>> freqs.to_pandas()
string_col count
0 9 730
1 3 730
@@ -387,7 +387,7 @@ an Ibis table expression:
>>> target.insert(t[:3])
>>> target.insert(t[:3])

->>> target.execute()
+>>> target.to_pandas()
id bool_col tinyint_col ... timestamp_col year month
0 5770 True 0 ... 2010-08-01 00:00:00.000 2010 8
1 5771 False 1 ... 2010-08-01 00:01:00.000 2010 8
@@ -824,7 +824,7 @@ a major part of the Ibis roadmap).
Ibis's Impala tools currently interoperate with pandas in these ways:

- Ibis expressions return pandas objects (i.e. DataFrame or Series)
-for non-scalar expressions when calling their `execute` method
+for non-scalar expressions when calling their `to_pandas` method
- The `create_table` and `insert` methods can accept pandas objects.
This includes inserting into partitioned tables. It currently uses
CSV as the ingest route.
@@ -838,7 +838,7 @@ For example:

>>> db.create_table('pandas_table', data)
>>> t = db.pandas_table
->>> t.execute()
+>>> t.to_pandas()
bar foo
0 a 1
1 b 2
@@ -851,7 +851,7 @@ For example:

>>> to_insert = db.empty_for_insert
>>> to_insert.insert(data)
->>> to_insert.execute()
+>>> to_insert.to_pandas()
bar foo
0 a 1
1 b 2
@@ -868,7 +868,7 @@ For example:

>>> db.create_table('pandas_table', data)
>>> t = db.pandas_table
->>> t.execute()
+>>> t.to_pandas()
foo bar
0 1 a
1 2 b
@@ -879,7 +879,7 @@ For example:
>>> db.create_table('empty_for_insert', schema=t.schema())
>>> to_insert = db.empty_for_insert
>>> to_insert.insert(data)
->>> to_insert.execute()
+>>> to_insert.to_pandas()
foo bar
0 1 a
1 2 b
@@ -1215,7 +1215,7 @@ may significantly speed up queries on smaller datasets:
```

```bash
-$ time python -c "(t.double_col + rand()).sum().execute()"
+$ time python -c "(t.double_col + rand()).sum().to_pandas()"
27.7 ms ± 996 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

@@ -1225,7 +1225,7 @@ con.disable_codegen(False)
```

```bash
-$ time python -c "(t.double_col + rand()).sum().execute()"
+$ time python -c "(t.double_col + rand()).sum().to_pandas()"
27 ms ± 1.62 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

@@ -1303,7 +1303,7 @@ The object `fuzzy_equals` is callable and works with Ibis expressions:

>>> expr = fuzzy_equals(t.float_col, t.double_col / 10)

->>> expr.execute()[:10]
+>>> expr.to_pandas()[:10]
0 True
1 False
2 False
10 changes: 5 additions & 5 deletions docs/backends/app/backend_info_app.py
@@ -35,7 +35,7 @@ def support_matrix_df():
short_operation=_.full_operation.split(".")[-1],
operation_category=_.full_operation.split(".")[-2],
)
-.execute()
+.to_pandas()
)


@@ -75,7 +75,7 @@ def get_all_backend_categories():
backend_info_table.select(category=_.categories.unnest())
.distinct()
.order_by('category')['category']
-.execute()
+.to_pandas()
.tolist()
)

@@ -85,7 +85,7 @@ def get_all_operation_categories():
return (
support_matrix_table.select(_.operation_category)
.distinct()['operation_category']
-.execute()
+.to_pandas()
.tolist()
)

@@ -96,7 +96,7 @@ def get_backend_names(categories: Optional[List[str]] = None):
if categories:
backend_expr = backend_expr.filter(_.category.isin(categories))
return (
-backend_expr.select(_.backend_name).distinct().backend_name.execute().tolist()
+backend_expr.select(_.backend_name).distinct().backend_name.to_pandas().tolist()
)


@@ -170,7 +170,7 @@ def get_selected_operation_categories():
table_expr = table_expr[current_backend_names + ["index"]]

# Execute query
-df = table_expr.execute()
+df = table_expr.to_pandas()
df = df.set_index('index')

# Display result
2 changes: 1 addition & 1 deletion docs/example_streamlit_app/example_streamlit_app.py
@@ -28,7 +28,7 @@ def query():
.mutate(ner=_.ner.map(lambda n: n.lower()).unnest())
.ner.topk(max(options))
.relabel(dict(ner="ingredient"))
-.execute()
+.to_pandas()
.assign(
emoji=lambda df: df.ingredient.map(
lambda emoji: f"{emojis.get(emoji, '-')}"
14 changes: 7 additions & 7 deletions docs/getting_started.md
@@ -66,11 +66,11 @@ AlchemyTable: penguins
```

Ibis is lazily evaluated, so instead of seeing the data, we see the schema of
-the table, instead. To peek at the data, we can call `head` and then `execute`
+the table, instead. To peek at the data, we can call `head` and then `to_pandas`
to get the first few rows of the table as a pandas DataFrame.

```python
->>> penguins.head().execute()
+>>> penguins.head().to_pandas()
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
0 Adelie Torgersen 39.1 18.7 181.0 3750.0 male 2007
1 Adelie Torgersen 39.5 17.4 186.0 3800.0 female 2007
@@ -79,9 +79,9 @@ to get the first few rows of the table as a pandas DataFrame.
4 Adelie Torgersen 36.7 19.3 193.0 3450.0 female 2007
```

-`execute` takes the existing lazy table expression and evaluates it. If we
+`to_pandas` takes the existing lazy table expression and evaluates it. If we
leave it off, you'll see the Ibis representation of the table expression that
-`execute` will evaluate (when you're ready!).
+`to_pandas` will evaluate (when you're ready!).

```python
>>> penguins.head()
@@ -100,17 +100,17 @@ Limit[r0, n=5]

!!! note "Results in pandas DataFrame"

-Ibis returns results as a pandas DataFrame by default. It isn't using pandas to
+Ibis returns results as a pandas DataFrame using `to_pandas`, but isn't using pandas to
perform any of the computation. The query is executed by the backend (DuckDB in
-this case). Only when the query is executed does Ibis then pull back the results
+this case). Only when `to_pandas` is called does Ibis then pull back the results
and convert them into a DataFrame.

## Interactive Mode

For the rest of this intro, we'll turn on interactive mode, which partially
executes queries to give users a preview of the results. There is a small
difference in the way the output is formatted, but otherwise this is the same
-as calling `execute()` on the table expression with a limit of 10 result rows
+as calling `to_pandas` on the table expression with a limit of 10 result rows
returned.

```python
4 changes: 2 additions & 2 deletions docs/how_to/memtable-join.md
@@ -89,7 +89,7 @@ and joining is the same as joining any two TableExpressions:
...: measures.join(
...: mem_events,
...: measures['event_id'] == mem_events['event_id']
-...: ).execute()
+...: ).to_pandas()
Out[11]:
event_id measured_on measurement event_name
0 0 2021-06-01 NaN e0
@@ -106,4 +106,4 @@ and joining is the same as joining any two TableExpressions:
11 3 2021-07-12 NaN e3
```

-Note that the return result of the `join` is a TableExpression and that `execute` returns a pandas DataFrame.
+Note that the return result of the `join` is a TableExpression and that `to_pandas` returns a pandas DataFrame.
2 changes: 1 addition & 1 deletion docs/how_to/sessionize.md
@@ -76,4 +76,4 @@ sessionized = (

Calling `ibis.show_sql(sessionized)` displays the SQL query and can be used to confirm that this Ibis table expression does not rely on any join operations.

-Calling `sessionized.execute()` should complete in less than a minute, depending on the speed of the internet connection to download the data and the number of CPU cores available to parallelize the processing of this nested query.
+Calling `sessionized.to_pandas()` should complete in less than a minute, depending on the speed of the internet connection to download the data and the number of CPU cores available to parallelize the processing of this nested query.
8 changes: 4 additions & 4 deletions docs/ibis-for-dplyr-users.ipynb

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 4 additions & 4 deletions docs/ibis-for-pandas-users.ipynb

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/index.md
@@ -35,7 +35,7 @@ ORDER BY t1.year DESC
```

```py title="Execute on multiple backends"
->>> con.execute(q)
+>>> con.to_pandas(q)

year mean(avg_rating)
0 2021 2.586362
2 changes: 1 addition & 1 deletion docs/user_guide/extending/elementwise.ipynb

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions docs/user_guide/extending/reduction.ipynb

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/user_guide/self_joins.md
@@ -104,7 +104,7 @@ distinct object within Ibis. To do this, use the `view` function:
>>> results = (current.join(prior, ((current.region == prior.region) &
... (current.year == (prior.year - 1))))
... [current.region, current.year, yoy_change])
->>> df = results.execute()
+>>> df = results.to_pandas()
```

```python
