This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

Commit
Apply dprint for Markdown formatting (#384)
stinodego committed Aug 28, 2023
1 parent e3ede23 commit 1a83df4
Showing 58 changed files with 374 additions and 416 deletions.
5 changes: 4 additions & 1 deletion CONTRIBUTING.md
@@ -27,10 +27,13 @@ To update your own repo with code pushed on the upstream repo:
1. `git push origin <BRANCH>`

### Building locally

To build the documentation locally you will need to install the python libraries defined in the `requirements.txt` file.

<!-- markdown-link-check-disable -->

When these steps are done run `mkdocs serve` to run the server. You can then view the docs at http://localhost:8000/

<!-- markdown-link-check-enable -->

### Want to discuss something?
@@ -72,7 +75,7 @@ Find the correct placement for the functionality. Is it an expression add it to

The `Markdown` file should roughly match the following structure:

1. A clear short title (for example: "*Interact with an AWS bucket*").
1. A clear short title (for example: "_Interact with an AWS bucket_").
1. A one-ish-liner to introduce the code snippet.
1. The code example itself under the corresponding folder (e.g. `docs/src/user-guide/expressions/...py), using the [Snippets](https://facelessuser.github.io/pymdown-extensions/extensions/snippets/) syntax.
1. The output of the example, using [markdown-exec](https://pawamoy.github.io/markdown-exec/)
4 changes: 4 additions & 0 deletions docs/_build/scripts/people.py
@@ -11,6 +11,10 @@ def get_people_md():
contributors = repo.get_contributors()
with open("./docs/people.md", "w") as f:
for c in itertools.islice(contributors, 50):
# We love dependabot, but he doesn't need a spot on our website
if c.login == "dependabot[bot]":
continue

f.write(
ICON_TEMPLATE.format(
login=c.login,
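The hunk above adds a guard to `people.py` so that dependabot is skipped when the contributor list is generated. A minimal, self-contained sketch of the same filtering logic — the `Contributor` class and the template value are hypothetical stand-ins for the PyGithub objects and `ICON_TEMPLATE` the real script uses:

```python
import itertools
from dataclasses import dataclass


@dataclass
class Contributor:
    """Hypothetical stand-in for a PyGithub contributor object."""

    login: str


# Simplified placeholder for the script's real ICON_TEMPLATE.
ICON_TEMPLATE = "![avatar](https://github.com/{login}.png)"

contributors = [
    Contributor("alice"),
    Contributor("dependabot[bot]"),
    Contributor("bob"),
]

lines = []
for c in itertools.islice(contributors, 50):
    # Skip dependabot, mirroring the check added in this commit.
    if c.login == "dependabot[bot]":
        continue
    lines.append(ICON_TEMPLATE.format(login=c.login))
```

The `continue` placement before the `f.write` call means the bot never reaches the output at all, rather than being filtered afterwards.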
2 changes: 1 addition & 1 deletion docs/_build/snippets/under_construction.md
@@ -1,4 +1,4 @@
!!! warning ":construction: Under Construction :construction: "

This section is still under development. Want to help out? Consider contributing and making a [pull request](https://github.com/pola-rs/polars-book) to our repository.
Please read our [Contribution Guidelines](https://github.com/pola-rs/polars-book/blob/master/CONTRIBUTING.md) on how to proceed.
Please read our [Contribution Guidelines](https://github.com/pola-rs/polars-book/blob/master/CONTRIBUTING.md) on how to proceed.
5 changes: 2 additions & 3 deletions docs/getting-started/expressions.md
@@ -52,7 +52,6 @@ print(
)
```


### Filter

The `filter` option allows us to create a subset of the `DataFrame`. We use the same `DataFrame` as earlier and we filter between two specified dates.
@@ -81,7 +80,6 @@ print(

{{code_block('getting-started/expressions','with_columns',['with_columns'])}}


```python exec="on" result="text" session="getting-started/expressions"
print(
--8<-- "python/getting-started/expressions.py:with_columns"
@@ -112,7 +110,7 @@ print(
```python exec="on" result="text" session="getting-started/expressions"
print(
--8<-- "python/getting-started/expressions.py:groupby2"
)
)
```

### Combining operations
@@ -124,6 +122,7 @@ Below are some examples on how to combine operations to create the `DataFrame` y
```python exec="on" result="text" session="getting-started/expressions"
--8<-- "python/getting-started/expressions.py:combine"
```

{{code_block('getting-started/expressions','combine2',['select','with_columns'])}}

```python exec="on" result="text" session="getting-started/expressions"
5 changes: 3 additions & 2 deletions docs/getting-started/installation.md
@@ -13,6 +13,7 @@ Polars is a library and installation is as simple as invoking the package manage
``` shell
cargo add polars
```

=== ":fontawesome-brands-node-js: NodeJS"

``` shell
@@ -23,7 +24,6 @@ Polars is a library and installation is as simple as invoking the package manage

To use the library import it into your project


=== ":fontawesome-brands-python: Python"

``` python
@@ -35,6 +35,7 @@ To use the library import it into your project
``` rust
use polars::prelude::*;
```

=== ":fontawesome-brands-node-js: NodeJS"

``` javaScript
@@ -43,4 +44,4 @@ To use the library import it into your project

// require
const pl = require('nodejs-polars');
```
```
2 changes: 1 addition & 1 deletion docs/getting-started/joins.md
@@ -23,4 +23,4 @@ We can also `concatenate` two `DataFrames`. Vertical concatenation will make the

```python exec="on" result="text" session="getting-started/joins"
--8<-- "python/getting-started/joins.py:hstack"
```
```
5 changes: 2 additions & 3 deletions docs/getting-started/reading-writing.md
@@ -2,7 +2,6 @@

Polars supports reading & writing to all common files (e.g. csv, json, parquet), cloud storage (S3, Azure Blob, BigQuery) and databases (e.g. postgres, mysql). In the following examples we will show how to operate on most common file formats. For the following dataframe


{{code_block('getting-started/reading-writing','dataframe',['DataFrame'])}}

```python exec="on" result="text" session="getting-started/reading"
@@ -11,7 +10,7 @@ Polars supports reading & writing to all common files (e.g. csv, json, parquet),

#### CSV

Polars has its own fast implementation for csv reading with many flexible configuration options.
Polars has its own fast implementation for csv reading with many flexible configuration options.

{{code_block('getting-started/reading-writing','csv',['read_csv','write_csv'])}}

@@ -43,4 +42,4 @@ As we can see above, Polars made the datetimes a `string`. We can tell Polars to
--8<-- "python/getting-started/reading-writing.py:parquet"
```

To see more examples and other data formats go to the [User Guide](../user-guide/io/csv.md), section IO.
To see more examples and other data formats go to the [User Guide](../user-guide/io/csv.md), section IO.
9 changes: 4 additions & 5 deletions docs/getting-started/series-dataframes.md
@@ -1,11 +1,11 @@
# Series & DataFrames

The core base data structures provided by Polars are `Series` and `DataFrames`.
The core base data structures provided by Polars are `Series` and `DataFrames`.

## Series

Series are a 1-dimensional data structure. Within a series all elements have the same data type (e.g. int, string).
The snippet below shows how to create a simple named `Series` object. In a later section of this getting started guide we will learn how to read data from external sources (e.g. files, database), for now lets keep it simple.
Series are a 1-dimensional data structure. Within a series all elements have the same data type (e.g. int, string).
The snippet below shows how to create a simple named `Series` object. In a later section of this getting started guide we will learn how to read data from external sources (e.g. files, database), for now lets keep it simple.

{{code_block('getting-started/series-dataframes','series',['Series'])}}

@@ -17,7 +17,6 @@ The snippet below shows how to create a simple named `Series` object. In a later

Although it is more common to work directly on a `DataFrame` object, `Series` implement a number of base methods which make it easy to perform transformations. Below are some examples of common operations you might want to perform. Note that these are for illustration purposes and only show a small subset of what is available.


##### Aggregations

`Series` out of the box supports all basic aggregations (e.g. min, max, mean, mode, ...).
@@ -84,7 +83,7 @@ The `tail` function shows the last 5 rows of a `DataFrame`. You can also specify

#### Sample

If you want to get an impression of the data of your `DataFrame`, you can also use `sample`. With `sample` you get an *n* number of random rows from the `DataFrame`.
If you want to get an impression of the data of your `DataFrame`, you can also use `sample`. With `sample` you get an _n_ number of random rows from the `DataFrame`.

{{code_block('getting-started/series-dataframes','sample',['sample'])}}

17 changes: 7 additions & 10 deletions docs/index.md
@@ -2,6 +2,7 @@
hide:
- navigation
---

# Polars

![logo](https://raw.githubusercontent.com/pola-rs/polars-static/master/logos/polars_github_logo_rect_dark_name.svg)
@@ -30,20 +31,19 @@ hide:

Polars is a highly performant DataFrame library for manipulating structured data. The core is written in Rust, but the library is available in Python, Rust & NodeJS. Its key features are:


- **Fast**: Polars is written from the ground up, designed close to the machine and without external dependencies.
- **I/O**: First class support for all common data storage layers: local, cloud storage & databases.
- **Fast**: Polars is written from the ground up, designed close to the machine and without external dependencies.
- **I/O**: First class support for all common data storage layers: local, cloud storage & databases.
- **Easy to use**: Write your queries the way they were intended. Polars, internally, will determine the most efficient way to execute using its query optimizer.
- **Out of Core**: Polars supports out of core data transformation with its streaming API. Allowing you to process your results without requiring all your data to be in memory at the same time
- **Parallel**: Polars fully utilises the power of your machine by dividing the workload among the available CPU cores without any additional configuration.
- **Parallel**: Polars fully utilises the power of your machine by dividing the workload among the available CPU cores without any additional configuration.
- **Vectorized Query Engine**: Polars uses [Apache Arrow](https://arrow.apache.org/), a columnar data format, to process your queries in a vectorized manner. It uses [SIMD](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data) to optimize CPU usage.

## About this guide

The `Polars` user guide is intended to live alongside the API documentation. Its purpose is to explain (new) users how to use `Polars` and to provide meaningful examples. The guide is split into two parts:

- [Getting Started](getting-started/intro.md): A 10 minute helicopter view of the library and its primary function.
- [User Guide](user-guide/index.md): A detailed explanation of how the library is setup and how to use it most effectively.
- [User Guide](user-guide/index.md): A detailed explanation of how the library is setup and how to use it most effectively.

If you are looking for details on a specific level / object, it is probably best to go the API documentation: [Python](https://pola-rs.github.io/polars/py-polars/html/reference/index.html) | [NodeJS](https://pola-rs.github.io/nodejs-polars/index.html) | [Rust](https://docs.rs/polars/latest/polars/).

@@ -54,7 +54,6 @@ See the results in h2oai's [db-benchmark](https://duckdblabs.github.io/db-benchm

`Polars` [TPCH Benchmark results](https://www.pola.rs/benchmarks.html) are now available on the official website.


## Example

{{code_block('home/example','example',['scan_csv','filter','groupby','collect'])}}
@@ -65,16 +64,14 @@ See the results in h2oai's [db-benchmark](https://duckdblabs.github.io/db-benchm

## Community

`Polars` has a very active community with frequent releases (approximately weekly). Below are some of the top contributors to the project:
`Polars` has a very active community with frequent releases (approximately weekly). Below are some of the top contributors to the project:

--8<-- "docs/people.md"


## Contribute
## Contribute

Thanks for taking the time to contribute! We appreciate all contributions, from reporting bugs to implementing new features. If you're unclear on how to proceed read our [contribution guide](https://github.com/pola-rs/polars/blob/main/CONTRIBUTING.md) or contact us on [discord](https://discord.com/invite/4UfP5cfBE7).


## License

This project is licensed under the terms of the MIT license.
16 changes: 8 additions & 8 deletions docs/user-guide/concepts/contexts.md
@@ -1,8 +1,8 @@
# Contexts

Polars has developed its own Domain Specific Language (DSL) for transforming data. The language is very easy to use and allows for complex queries that remain human readable. The two core components of the language are Contexts and Expressions, the latter we will cover in the next section.
Polars has developed its own Domain Specific Language (DSL) for transforming data. The language is very easy to use and allows for complex queries that remain human readable. The two core components of the language are Contexts and Expressions, the latter we will cover in the next section.

A context, as implied by the name, refers to the context in which an expression needs to be evaluated. There are three main contexts [^1]:
A context, as implied by the name, refers to the context in which an expression needs to be evaluated. There are three main contexts [^1]:

1. Selection: `df.select([..])`, `df.with_columns([..])`
1. Filtering: `df.filter()`
@@ -17,7 +17,7 @@ The examples below are performed on the following `DataFrame`:
--8<-- "python/user-guide/concepts/contexts.py:dataframe"
```

## Select
## Select

In the `select` context the selection applies expressions over columns. The expressions in this context must produce `Series` that are all the same length or have a length of 1.

@@ -29,7 +29,7 @@ A `Series` of a length of 1 will be broadcasted to match the height of the `Data
--8<-- "python/user-guide/concepts/contexts.py:select"
```

As you can see from the query the `select` context is very powerful and allows you to perform arbitrary expressions independent (and in parallel) of each other.
As you can see from the query the `select` context is very powerful and allows you to perform arbitrary expressions independent (and in parallel) of each other.

Similarly to the `select` statement there is the `with_columns` statement which also is an entrance to the selection context. The main difference is that `with_columns` retains the original columns and adds new ones while `select` drops the original columns.

@@ -39,17 +39,17 @@ Similarly to the `select` statement there is the `with_columns` statement which
--8<-- "python/user-guide/concepts/contexts.py:with_columns"
```

## Filter
## Filter

In the `filter` context you filter the existing dataframe based on arbritary expression which evaluates to the `Boolean` data type.
In the `filter` context you filter the existing dataframe based on arbritary expression which evaluates to the `Boolean` data type.

{{code_block('user-guide/concepts/contexts','filter',['filter'])}}

```python exec="on" result="text" session="user-guide/contexts"
--8<-- "python/user-guide/concepts/contexts.py:filter"
```

## Groupby / Aggregation
## Groupby / Aggregation

In the `groupby` context expressions work on groups and thus may yield results of any length (a group may have many members).

@@ -61,4 +61,4 @@ In the `groupby` context expressions work on groups and thus may yield results o

As you can see from the result all expressions are applied to the group defined by the `groupby` context. Besides the standard `groupby`, `groupby_dynamic`, and `groupby_rolling` are also entrances to the groupby context.

[^1]: There are additional List and SQL contexts which are covered later in this guide. But for simplicity, we leave them out of scope for now.
[^1]: There are additional List and SQL contexts which are covered later in this guide. But for simplicity, we leave them out of scope for now.
10 changes: 4 additions & 6 deletions docs/user-guide/concepts/data-structures.md
@@ -1,11 +1,11 @@
# Data Structures

The core base data structures provided by Polars are `Series` and `DataFrames`.
The core base data structures provided by Polars are `Series` and `DataFrames`.

## Series

Series are a 1-dimensional data structure. Within a series all elements have the same [Data Type](data-types.md) .
The snippet below shows how to create a simple named `Series` object.
Series are a 1-dimensional data structure. Within a series all elements have the same [Data Type](data-types.md) .
The snippet below shows how to create a simple named `Series` object.

{{code_block('getting-started/series-dataframes','series',['Series'])}}

@@ -33,7 +33,6 @@ The `head` function shows by default the first 5 rows of a `DataFrame`. You can

{{code_block('getting-started/series-dataframes','head',['head'])}}


```python exec="on" result="text" session="getting-started/series"
--8<-- "python/getting-started/series-dataframes.py:head"
```
@@ -50,7 +49,7 @@ The `tail` function shows the last 5 rows of a `DataFrame`. You can also specify

#### Sample

If you want to get an impression of the data of your `DataFrame`, you can also use `sample`. With `sample` you get an *n* number of random rows from the `DataFrame`.
If you want to get an impression of the data of your `DataFrame`, you can also use `sample`. With `sample` you get an _n_ number of random rows from the `DataFrame`.

{{code_block('getting-started/series-dataframes','sample',['sample'])}}

@@ -67,4 +66,3 @@ If you want to get an impression of the data of your `DataFrame`, you can also u
```python exec="on" result="text" session="getting-started/series"
--8<-- "python/getting-started/series-dataframes.py:describe"
```
