Commit

Update readme (#1940)
* Update README.md and remove obsolete information.

* Update CONTRIBUTING.md with `make` commands.
ghuls committed Dec 1, 2021
1 parent 1c655bf commit 151f34a
Showing 2 changed files with 195 additions and 81 deletions.
114 changes: 92 additions & 22 deletions CONTRIBUTING.md
@@ -7,69 +7,139 @@ We love your input! We want to make contributing to this project as easy and tra
- Submitting a fix
- Adding/Proposing new features


## We Develop with GitHub

We use GitHub to host code, to track issues and feature requests, as well as accept pull requests.

1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. Issue that pull request!
1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes:
```bash
# For Rust code in ./polars subdir.
cd ./polars
make test

# For Python code and Rust code in ./py-polars subdir.
cd ./py-polars
make test
```
5. Make sure your code lints:
```bash
# For Rust code in ./polars subdir.
# - cargo clippy: Lint Rust code (./polars/).
# - cargo fmt: Format Rust code (./polars/).
# - dprint fmt: Format TOML files.
cd ./polars
make clippy
make fmt

# For Python code and Rust code in ./py-polars subdir:
# - isort: Sort Python imports.
# - black: Format Python code.
# - blackdoc: Format Python doctests.
# - mypy: Type checking of Python code.
# - flake8: Enforce Python style guide.
# - dprint fmt: Format TOML files.
# - cargo fmt: Format Rust code (./py-polars/src).
cd ./py-polars
make pre-commit
```
6. Issue that pull request!
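
Steps 4 and 5 can also be chained from one script. The following is a minimal sketch, not part of the repository: the helper names `build_commands` and `run_checks` are hypothetical, and it assumes only the `make` targets listed above (`test`, `clippy`, `fmt`, `pre-commit`) and that it is started from the root of the repo.

```python
import subprocess
from pathlib import Path

# (subdirectory, make target) pairs taken from steps 4 and 5 above.
CHECKS = [
    ("polars", "test"),
    ("polars", "clippy"),
    ("polars", "fmt"),
    ("py-polars", "test"),
    ("py-polars", "pre-commit"),
]


def build_commands(repo_root, checks=CHECKS):
    """Return (working_directory, argv) pairs without running anything."""
    root = Path(repo_root)
    return [(root / subdir, ["make", target]) for subdir, target in checks]


def run_checks(repo_root):
    """Run each make target in order and stop at the first failure."""
    for cwd, argv in build_commands(repo_root):
        print(f"[{cwd}] {' '.join(argv)}")
        # check=True raises CalledProcessError if the target fails.
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_checks(".")` from the root of the repo runs the targets one after another, so a failing test or lint step is noticed before the pull request is opened.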


## Want to discuss something?

I can imagine that some questions don't fit an issue.
Therefore there is also a [chat on Gitter](https://gitter.im/polars-rs/community).
Therefore there is also a [Polars Discord server](https://discord.gg/4UfP5cfBE7).


## Any contributions you make will be under the MIT Software License

In short, when you submit code changes, your submissions are understood to be under the same
[MIT License](https://choosealicense.com/licenses/mit/) that covers the project.
Feel free to contact the maintainers if that's a concern.


## Report bugs using GitHub's [issues](https://github.com/pola-rs/polars/issues)

We use GitHub issues to track public bugs. Report a bug by [opening a new issue](https://github.com/pola-rs/polars/issues/new/choose).

**Great Bug Reports** tend to have:
- A quick summary and/or background
- Steps to reproduce
- What you expected would happen
- What actually happens
- Notes (possibly including why you think this might be happening, or stuff you tried that didn't work)

- A quick summary and/or background
- Steps to reproduce
- What you expected would happen
- What actually happens
- Notes (possibly including why you think this might be happening, or stuff you tried that didn't work)

## Code formatting

We test the code formatting in the CI pipelines. If you don't want these to fail, you need to format:

- **Rust** code with `$ cargo fmt`
- **Python** code with [black](https://github.com/psf/black) (version 21.6b0) and [isort](https://github.com/PyCQA/isort) (version 5.9.2). Run both from the `py-polars` directory with `$ black . && isort .`
- **Rust** code with:
- [cargo fmt](https://rust-lang.github.io/)
* In `./polars` subdir: `$ cargo fmt --all`
* In `./py-polars` subdir for Rust Python-bindings: `$ cargo fmt --all`

- **Python** code with:
- [isort](https://github.com/PyCQA/isort) (version 5.9.2):
* Sort Python imports.
* `isort .`
- [black](https://github.com/psf/black) (version 21.6b0):
* Format Python code.
* `black .`

- **Python** code in doctests:
- [blackdoc](https://blackdoc.readthedocs.io/en/latest/) (version 0.3.4):
* `blackdoc .`

- **TOML** files with:
- [dprint](https://github.com/dprint/dprint) (version 0.18.2):
* `$ dprint fmt`

See `5. Make sure your code lints` for running it easily.


## Linting

We use linters to enforce code quality. This will be checked in CI.

- **Rust** We use [clippy](https://github.com/rust-lang/rust-clippy) as linter.
- **Python** We use [flake8](https://flake8.pycqa.org/en/latest/) as linter.
- **Rust**:
- [clippy](https://github.com/rust-lang/rust-clippy) as Rust linter.

- **Python**:
- [flake8](https://flake8.pycqa.org/en/latest/) as Python linter.

See `5. Make sure your code lints` for running it easily.


## Type checking

For Python, type hints are enforced using [mypy](https://github.com/python/mypy). This will be checked in CI.

See `5. Make sure your code lints.` for running it easily.


## Testing

See `4. Ensure the test suite passes` for running it easily.


## Python setup

If you want to contribute to the Python code, you also have to set up a Rust installation to be able to test your changes.
You have to follow these steps:

- install rust nightly via [rustup](https://www.rust-lang.org/tools/install)
- run `$ rustup override set nightly` from the root of the repo.
- from [./py-polars](./py-polars) run `$ pip3 install -r build.requirements.txt`
- **tests:** from [./py-polars](./py-polars) run `$ make test`
- **formatting + linting:** from [./py-polars](./py-polars) run `$ make pre-commit`
- Install Rust nightly via [rustup](https://www.rust-lang.org/tools/install)
- run `$ rustup override set nightly` from the root of the repo.
- from [./py-polars](./py-polars) run `$ pip3 install -r build.requirements.txt`
- **tests:** from [./py-polars](./py-polars) run `$ make test`
- **formatting + linting:** from [./py-polars](./py-polars) run `$ make pre-commit` before committing.

`make test` installs a (slow) development build in your current environment and runs `pytest`.

The last step installs a (slow) development build in your current environment and runs pytest.

## License

162 changes: 103 additions & 59 deletions README.md
@@ -1,4 +1,5 @@
# Polars

[![rust docs](https://docs.rs/polars/badge.svg)](https://docs.rs/polars/latest/polars/)
[![Build and test](https://github.com/pola-rs/polars/workflows/Build%20and%20test/badge.svg)](https://github.com/pola-rs/polars/actions)
[![](https://img.shields.io/crates/v/polars.svg)](https://crates.io/crates/polars)
@@ -9,49 +10,57 @@
|
<a href="https://pola-rs.github.io/polars/polars/index.html">Rust Documentation</a>
|
<a href="https://pola-rs.github.io/polars-book/">User Guide</a>
|
<a href="https://discord.gg/4UfP5cfBE7">Discord</a>
|
|
<a href="https://stackoverflow.com/questions/tagged/python-polars">StackOverflow</a>
</p>


## Blazingly fast DataFrames in Rust & Python

Polars is a blazingly fast DataFrames library implemented in Rust using Apache Arrow(2) as memory model.
Polars is a blazingly fast DataFrames library implemented in Rust using
[Apache Arrow Columnar Format](https://arrow.apache.org/docs/format/Columnar.html) as memory model.

* Lazy | eager execution
* Multi-threaded
* SIMD
* Query optimization
* Powerful expression API
* Rust | Python | ...
* Lazy | eager execution
* Multi-threaded
* SIMD
* Query optimization
* Powerful expression API
* Rust | Python | ...

To learn more, read the [User Guide](https://pola-rs.github.io/polars-book/).

```python
>>> import polars as pl
>>> df = pl.DataFrame(
{
"A": [1, 2, 3, 4, 5],
"fruits": ["banana", "banana", "apple", "apple", "banana"],
"B": [5, 4, 3, 2, 1],
"cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
}
)
... {
... "A": [1, 2, 3, 4, 5],
... "fruits": ["banana", "banana", "apple", "apple", "banana"],
... "B": [5, 4, 3, 2, 1],
... "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
... }
... )

# embarrassingly parallel execution
# very expressive query language
>>> (df
.sort("fruits")
.select([
"fruits",
"cars",
pl.lit("fruits").alias("literal_string_fruits"),
pl.col("B").filter(pl.col("cars") == "beetle").sum(),
pl.col("A").filter(pl.col("B") > 2).sum().over("cars").alias("sum_A_by_cars"), # groups by "cars"
pl.col("A").sum().over("fruits").alias("sum_A_by_fruits"), # groups by "fruits"
pl.col("A").reverse().over("fruits").flatten().alias("rev_A_by_fruits"), # groups by "fruits
pl.col("A").sort_by("B").over("fruits").flatten().alias("sort_A_by_B_by_fruits") # groups by "fruits"
]))
>>> (
... df
... .sort("fruits")
... .select(
... [
... "fruits",
... "cars",
... pl.lit("fruits").alias("literal_string_fruits"),
... pl.col("B").filter(pl.col("cars") == "beetle").sum(),
... pl.col("A").filter(pl.col("B") > 2).sum().over("cars").alias("sum_A_by_cars"), # groups by "cars"
... pl.col("A").sum().over("fruits").alias("sum_A_by_fruits"), # groups by "fruits"
... pl.col("A").reverse().over("fruits").flatten().alias("rev_A_by_fruits"), # groups by "fruits"
... pl.col("A").sort_by("B").over("fruits").flatten().alias("sort_A_by_B_by_fruits"), # groups by "fruits"
... ]
... )
... )
shape: (5, 8)
┌──────────┬──────────┬──────────────┬─────┬─────────────┬─────────────┬─────────────┬─────────────┐
│ fruits ┆ cars ┆ literal_stri ┆ B ┆ sum_A_by_ca ┆ sum_A_by_fr ┆ rev_A_by_fr ┆ sort_A_by_B │
@@ -73,68 +82,103 @@ shape: (5, 8)
```
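
To make the `over` expressions above easier to read, here is a plain-Python sketch of what a window sum such as `pl.col("A").sum().over("fruits")` is understood to compute: the values are summed per group and each row receives the total of its own group. The helper `sum_over` is hypothetical and is not part of the Polars API. Unlike the query above, it keeps the rows in their input order instead of sorting by `fruits` first.

```python
from collections import defaultdict


def sum_over(values, groups):
    """Sum `values` within each group and broadcast the total back to every row."""
    totals = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
    # One output value per input row, in the original row order.
    return [totals[group] for group in groups]


A = [1, 2, 3, 4, 5]
fruits = ["banana", "banana", "apple", "apple", "banana"]

sum_A_by_fruits = sum_over(A, fruits)
print(sum_A_by_fruits)  # [8, 8, 7, 7, 8]
```

The output has the same length as the input, which is the difference between a window expression and an ordinary group-by aggregation that returns one row per group.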



## Performance 🚀🚀
Polars is very fast, and in fact is one of the best performing solutions available.

Polars is very fast, and in fact is one of the best performing solutions available.
See the results in [h2oai's db-benchmark](https://h2oai.github.io/db-benchmark/).


## Python setup

Install the latest polars version with:

```
$ pip3 install polars
```

Update an existing polars installation to the latest version with:

```
$ pip3 install -U polars
```

Releases happen quite often (weekly / every few days) at the moment, so updating polars regularly to get the latest bugfixes / features might not be a bad idea.


## Rust setup
You can take latest release from `crates.io`, or if you want to use the latest features/ performance improvements

You can take the latest release from `crates.io`, or if you want to use the latest features / performance improvements
point to the `master` branch of this repo.

```toml
polars = { git = "https://github.com/pola-rs/polars", rev = "<optional git tag>" }
```
## Rust version
Required Rust version `>=1.52`

## Python users read this!
Polars is currently transitioning from `py-polars` to `polars`. Some docs may still refer the old name.

Install the latest polars version with:
`$ pip3 install polars`
#### Rust version

Required Rust version `>=1.52`


## Documentation
Want to know about all the features Polars support? Read the docs!

#### Rust
* [Documentation (master branch)](https://pola-rs.github.io/polars/polars/index.html).
* [User Guide](https://pola-rs.github.io/polars-book/)

Want to know about all the features Polars supports? Read the docs!

#### Python
* installation guide: `$ pip3 install polars`
* [User Guide](https://pola-rs.github.io/polars-book/)
* [Reference guide](https://pola-rs.github.io/polars/py-polars/html/reference/index.html)

* Installation guide: `$ pip3 install polars`
* [Python documentation](https://pola-rs.github.io/polars/py-polars/html/reference/index.html)
* [User guide](https://pola-rs.github.io/polars-book/)

#### Rust

* [Rust documentation (master branch)](https://pola-rs.github.io/polars/polars/index.html)
* [User guide](https://pola-rs.github.io/polars-book/)


## Contribution

Want to contribute? Read our [contribution guideline](https://github.com/pola-rs/polars/blob/master/CONTRIBUTING.md).

## \[Python\] compile py-polars from source
If you want a bleeding edge release or maximal performance you should compile **py-polars** from source.

This can be done by going through the following steps in sequence:
## \[Python\]: compile polars from source

1. install the latest [Rust compiler](https://www.rust-lang.org/tools/install)
2. `$ pip3 install maturin`
4. Choose any of:
* Very long compile times, fastest binary: `$ cd py-polars && maturin develop --rustc-extra-args="-C target-cpu=native" --release`
* Shorter compile times, fast binary: `$ cd py-polars && maturin develop --rustc-extra-args="-C codegen-units=16 -C lto=thin -C target-cpu=native" --release
`
If you want a bleeding edge release or maximal performance you should compile **polars** from source.

Note that the Rust crate implementing the Python bindings is called `py-polars` to distinguish from the wrapped
This can be done by going through the following steps in sequence:

1. Install the latest [Rust compiler](https://www.rust-lang.org/tools/install)
2. Install [maturin](https://maturin.rs/): `$ pip3 install maturin`
3. Choose any of:
* Fastest binary, very long compile times:
```bash
$ cd py-polars && maturin develop --rustc-extra-args="-C target-cpu=native" --release
```
* Fast binary, shorter compile times:
```bash
$ cd py-polars && maturin develop --rustc-extra-args="-C codegen-units=16 -C lto=thin -C target-cpu=native" --release
```

Note that the Rust crate implementing the Python bindings is called `py-polars` to distinguish from the wrapped
Rust crate `polars` itself. However, both the Python package and the Python module are named `polars`, so you
can `pip install polars` and `import polars` (previously, these were called `py-polars` and `pypolars`).
can `pip install polars` and `import polars`.


## Arrow2
Polars has transitioned to [arrow2](https://crates.io/crates/arrow2). Arrow2 is a faster and safer implementation of the arrow spec.

Polars has transitioned to [arrow2](https://crates.io/crates/arrow2).
Arrow2 is a faster and safer implementation of the [Apache Arrow Columnar Format](https://arrow.apache.org/docs/format/Columnar.html).
Arrow2 also has a more granular code base, helping to reduce the compiler bloat.
There is still a maintained `arrow-rs` branch for users who want to use another backend.


## Acknowledgements

Development of Polars is proudly powered by

[![Xomnia](https://raw.githubusercontent.com/pola-rs/polars-static/master/sponsors/xomnia.png)](https://www.xomnia.com/)


## Sponsors
* [Xomnia](https://www.xomnia.com/)
* [JetBrains](https://www.jetbrains.com/company/brand/img/jetbrains_logo.png)

* [Xomnia](https://www.xomnia.com/)
* [JetBrains](https://www.jetbrains.com/company/brand/img/jetbrains_logo.png)
