docs(language): de-simple-fy prose in docs
Removing some usage of the word `simple` or `simply` where it doesn't
add anything to the prose.

My favorite (now removed) example:

```
... simply pass `auth_mechanism='GSSAPI'` or
`auth_mechanism='LDAP'` (and set `kerberos_service_name` if necessary along
with `user` and `password` if necessary) to the
`ibis.impala_connect(...)` method when instantiating an
`ImpalaConnection`.
```
gforsyth authored and cpcloud committed Sep 11, 2023
1 parent f4fdfd3 commit 0617271
Showing 11 changed files with 76 additions and 49 deletions.

2 changes: 1 addition & 1 deletion docs/backends/impala.qmd
@@ -1300,7 +1300,7 @@ connection semantics are similar to the other access methods for working with
secure clusters.

Specifically, after authenticating yourself against Kerberos (e.g., by issuing
the appropriate `kinit` command), simply pass `auth_mechanism='GSSAPI'` or
the appropriate `kinit` command), pass `auth_mechanism='GSSAPI'` or
`auth_mechanism='LDAP'` (and set `kerberos_service_name` if necessary along
with `user` and `password` if necessary) to the
`ibis.impala_connect(...)` method when instantiating an `ImpalaConnection`.
28 changes: 20 additions & 8 deletions docs/how-to/extending/elementwise.qmd
@@ -2,13 +2,17 @@

This notebook will show you how to add a new elementwise operation to an existing backend.

We are going to add `julianday`, a function supported by the SQLite database, to the SQLite Ibis backend.
We are going to add `julianday`, a function supported by the SQLite database, to
the SQLite Ibis backend.

The Julian day of a date is the number of days since January 1st, 4713 BC. For more information, check the [Julian day](https://en.wikipedia.org/wiki/Julian_day) Wikipedia page.
The Julian day of a date is the number of days since January 1st, 4713 BC. For
more information, check the [Julian
day](https://en.wikipedia.org/wiki/Julian_day) Wikipedia page.
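
As a quick sanity check of the definition above, the following sketch calls SQLite's built-in `julianday` function through Python's standard `sqlite3` module. It is independent of Ibis; the in-memory connection and the sample date are assumptions chosen for illustration.

```python
import sqlite3

# In-memory database; no tables are needed to call a scalar SQL function.
con = sqlite3.connect(":memory:")

# julianday() is built into SQLite and returns a float.
(jday,) = con.execute("SELECT julianday('2000-01-01')").fetchone()
print(jday)  # 2451544.5 (Julian days begin at noon, so midnight is a half day)

con.close()
```

The fractional part shows why the operation returns a float rather than an integer.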

## Step 1: Define the Operation

Let's define the `julianday` operation as a function that takes one string input argument and returns a float.
Let's define the `julianday` operation as a function that takes one string input
argument and returns a float.

```python
def julianday(date: str) -> float:
@@ -31,15 +35,21 @@ class JulianDay(Value):
    shape = rlz.shape_like('arg')
```

We just defined a `JulianDay` class that takes one argument of type string or binary, and returns a float.
We just defined a `JulianDay` class that takes one argument of type string or
binary, and returns a float.

## Step 2: Define the API

Because we know the output type of the operation, to make an expression out of ``JulianDay`` we simply need to construct it and call its `ibis.expr.types.Node.to_expr` method.
Because we know the output type of the operation, to make an expression out of
``JulianDay`` we can construct it and call its `ibis.expr.types.Node.to_expr`
method.

We still need to add a method to `StringValue` (this needs to work on both scalars and columns).
We still need to add a method to `StringValue` (this needs to work on both
scalars and columns).

When you add a method to any of the expression classes whose name matches `*Value` both the scalar and column child classes will pick it up, making it easy to define operations for both scalars and columns in one place.
When you add a method to any of the expression classes whose name matches
`*Value` both the scalar and column child classes will pick it up, making it
easy to define operations for both scalars and columns in one place.

We can do this by defining a function and assigning it to the appropriate class
of expressions.
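
The mechanism is ordinary Python attribute lookup: a function assigned to a base class becomes a method on every subclass. The sketch below uses hypothetical stand-in classes (`FakeStringValue`, `FakeStringScalar`, `FakeStringColumn`), not the real Ibis expression types, and returns a plain string instead of building an operation.

```python
# Hypothetical stand-ins for Ibis's StringValue / StringScalar / StringColumn.
class FakeStringValue:
    def __init__(self, name: str) -> None:
        self.name = name


class FakeStringScalar(FakeStringValue):
    pass


class FakeStringColumn(FakeStringValue):
    pass


def julianday(string_arg: FakeStringValue) -> str:
    # A real implementation would construct the operation and call
    # .to_expr(); this sketch only describes the call as text.
    return f"julianday({string_arg.name})"


# Assigning the function to the parent class makes it available on both
# child classes.
FakeStringValue.julianday = julianday

print(FakeStringScalar("'2010-03-14'").julianday())
print(FakeStringColumn("t.date_col").julianday())
```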
@@ -120,7 +130,9 @@ jday_expr
ibis.to_sql(jday_expr)
```

Because we've defined our operation on `StringValue`, and not just on `StringColumn` we get operations on both string scalars *and* string columns for free
Because we've defined our operation on `StringValue`, and not just on
`StringColumn` we get operations on both string scalars *and* string columns for
free.


```{python}
27 changes: 19 additions & 8 deletions docs/how-to/extending/reduction.qmd
@@ -1,12 +1,15 @@
# Add a reduction operation

This notebook will show you how to add a new *reduction* operation `last_date` to the existing backend SQLite.
This notebook will show you how to add a new *reduction* operation `last_date`
to the existing backend SQLite.

A reduction operation is a function that maps $N$ rows to 1 row, for example the `sum` function.
A reduction operation is a function that maps $N$ rows to 1 row, for example the
`sum` function.

## Description

We're going to add a **`last_date`** function to ibis. `last_date` simply returns the latest date of a list of dates.
We're going to add a **`last_date`** function to ibis. `last_date` returns the
latest date of a list of dates.

## Step 1: Define the Operation

@@ -41,15 +44,22 @@ class LastDate(Reduction):
    shape = ds.scalar
```

We just defined a `LastDate` class that takes one date column as input, and returns a scalar output of the same type as the input. This matches both the requirements of a reduction and the specifics of the function that we want to implement.
We just defined a `LastDate` class that takes one date column as input, and
returns a scalar output of the same type as the input. This matches both the
requirements of a reduction and the specifics of the function that we want to
implement.

**Note**: It is very important that you write the correct argument rules and output type here. The expression *will not work* otherwise.
**Note**: It is very important that you write the correct argument rules and
output type here. The expression *will not work* otherwise.

## Step 2: Define the API

Because every reduction in ibis has the ability to filter out values during aggregation, to make an expression out of `LastDate` we need to pass an additional argument `where` to our `LastDate` constructor.
Because every reduction in Ibis has the ability to filter out values during
aggregation, to make an expression out of `LastDate` we need to pass an
additional argument `where` to our `LastDate` constructor.

Additionally, reductions should be defined on `Column` classes because reductions are not always well-defined for a scalar value.
Additionally, reductions should be defined on `Column` classes because
reductions are not always well-defined for a scalar value.
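
To make the role of `where` concrete, the sketch below shows the SQL-level behavior of a filtered reduction, using Python's standard `sqlite3` module rather than Ibis. The `independence` table, its columns, and its three rows are made up for illustration. The filter is written as `MAX(CASE WHEN ... END)`, which is one common way to express a filtered aggregate and is not necessarily the SQL that Ibis generates.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE independence (country TEXT, empire TEXT, event_date TEXT)"
)
con.executemany(
    "INSERT INTO independence VALUES (?, ?, ?)",
    [
        ("A", "X", "1810-05-25"),  # made-up rows, not historical data
        ("B", "X", "1898-12-10"),
        ("C", "Y", "1960-10-01"),
    ],
)

# Unfiltered reduction: N rows in, one row out.
(last_all,) = con.execute(
    "SELECT MAX(event_date) FROM independence"
).fetchone()

# Filtered reduction: rows failing the predicate become NULL, and MAX
# ignores NULLs.
(last_x,) = con.execute(
    "SELECT MAX(CASE WHEN empire = 'X' THEN event_date END) FROM independence"
).fetchone()

print(last_all)  # 1960-10-01
print(last_x)    # 1898-12-10
con.close()
```

ISO 8601 date strings sort chronologically, which is why `MAX` on a text column gives the latest date here.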


```{python}
@@ -143,7 +153,8 @@ expr
ibis.to_sql(expr)
```

Show the last country to gain independence from the Spanish Empire, using the `where` parameter:
Show the last country to gain independence from the Spanish Empire, using the
`where` parameter:


```{python}
3 changes: 2 additions & 1 deletion docs/how-to/visualization/altair.qmd
@@ -16,7 +16,8 @@ t.head(3)

## Using Altair with Ibis

Refer to the [Altair documentation](https://altair-viz.github.io/). Simply pass in Ibis tables or expressions:
Refer to the [Altair documentation](https://altair-viz.github.io/). You can pass
in Ibis tables or expressions:

```{python}
import altair as alt
3 changes: 2 additions & 1 deletion docs/how-to/visualization/plotly.qmd
@@ -16,7 +16,8 @@ t.head(3)

## Using Plotly with Ibis

Refer to the [Plotly documentation](https://plotly.com/python/). Simply pass in Ibis tables or expressions:
Refer to the [Plotly documentation](https://plotly.com/python/). You can pass in
Ibis tables or expressions:

```{python}
import plotly.express as px
3 changes: 2 additions & 1 deletion docs/how-to/visualization/plotnine.qmd
@@ -16,7 +16,8 @@ t.head(3)

## Using plotnine with Ibis

Refer to the [plotnine documentation](https://plotnine.readthedocs.io/). Simply pass in Ibis tables or expressions:
Refer to the [plotnine documentation](https://plotnine.readthedocs.io/). You can
pass in Ibis tables or expressions:

```{python}
from plotnine import ggplot, aes, geom_bar, theme
4 changes: 2 additions & 2 deletions docs/posts/campaign-finance/index.qmd
@@ -262,8 +262,8 @@ able to determine the election type.
cleaned.election_type.topk(10)
```

About 1/20 of transactions are negative. These could represent refunds, or they could be data
entry errors. Let's simply drop them to keep it simple.
About 1/20 of transactions are negative. These could represent refunds, or they
could be data entry errors. Let's drop them to keep it simple.


```{python}
5 changes: 3 additions & 2 deletions docs/posts/ffill-and-bfill-using-ibis/index.qmd
@@ -152,8 +152,9 @@ Under this design, we now have another partition.
Our first partition is by `event_id`.
Within each set in that partition, we have a partition by `grouper`, where each set has up to one non-null value.

Since there is less than or equal to one non-null value in each group of `['event_id', 'grouper']`,
we can simply fill values by overwriting _all_ values within the group by the max value in the group.
Since there is less than or equal to one non-null value in each group of
`['event_id', 'grouper']`, we can fill values by overwriting _all_ values within
the group by the max value in the group.

So:

33 changes: 18 additions & 15 deletions docs/tutorials/ibis-for-pandas-users.qmd
@@ -225,8 +225,8 @@ subset.columns

### Modifying columns

Replacing existing columns is done using the `mutate` method just like adding columns. You simply
add a column of the same name to replace it.
Replacing existing columns is done using the `mutate` method just like adding
columns. You add a column of the same name to replace it.


```{python}
@@ -241,8 +241,8 @@ mutated

### Renaming columns

In addition to replacing columns, you can simply rename them as well. This is done with the `relabel` method
which takes a dictionary containing the name mappings.
In addition to replacing columns, you can rename them as well. This is done with
the `relabel` method which takes a dictionary containing the name mappings.


```{python}
@@ -257,9 +257,10 @@ relabeled

## Selecting rows

There are several methods that can be used to select rows of data in various ways. These are described
in the sections below. We'll use the Palmer Penguins$^1$ dataset to investigate!
Ibis has several built-in example datasets that you can access using the `ibis.examples` module.
There are several methods that can be used to select rows of data in various
ways. These are described in the sections below. We'll use the Palmer
Penguins$^1$ dataset to investigate! Ibis has several built-in example datasets
that you can access using the `ibis.examples` module.

$^1$: Horst AM, Hill AP, Gorman KB (2020). palmerpenguins: Palmer Archipelago (Antarctica) penguin data. R package version 0.1.0. https://allisonhorst.github.io/palmerpenguins/. doi: 10.5281/zenodo.3960218.

@@ -299,10 +300,11 @@ penguins.limit(5)

### Filtering rows

In addition to simply limiting the number of rows that are returned, it is possible to filter the
rows using expressions. Expressions are constructed very similarly to the way they are in pandas.
Ibis expressions are constructed from operations on columns in a table which return a boolean result.
This result is then used to filter the table.
In addition to limiting the number of rows that are returned, it is possible to
filter the rows using expressions. Expressions are constructed very similarly to
the way they are in pandas. Ibis expressions are constructed from operations on
columns in a table which return a boolean result. This result is then used to
filter the table.


```{python}
@@ -402,10 +404,11 @@ df.sort_values(
).head(5)
```

The same operation in Ibis would look like the following. Note that the index values of the
resulting `DataFrame` start from zero and count up, whereas in the example above, they retain
their original index value. This is simply due to the fact that rows in tables don't necessarily
have a stable index in database backends, so the index is just generated on the result.
The same operation in Ibis would look like the following. Note that the index
values of the resulting `DataFrame` start from zero and count up, whereas in the
example above, they retain their original index value. This is because rows in
tables don't necessarily have a stable index in database backends, so the index
is generated on the result.


```{python}