<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Polars - Select both rows and columns
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Polars/Polars_Select_Rows_and_Columns.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Polars+-+Select+both+rows+and+columns:+Error+short+description">Bug report</a>

**Tags:** #polars #dataframe #read #python #library #data #csv

**Author:** [Antonio Georgiev](https://www.linkedin.com/in/antonio-georgiev-b672a325b)

**Last update:** 2023-07-06 (Created: 2023-07-06)

**Description:** This notebook demonstrates how to select rows and columns together in a DataFrame using the `polars` library, by filtering rows on a condition and then keeping one or more named columns.

About Polars:
- `polars` is a DataFrame library written in Rust, with Python bindings, that uses the Apache Arrow columnar memory format.
- It offers fast, multi-threaded data processing for large datasets through an expression-based DataFrame API, and supports nested data types such as lists and structs.
- `polars` is useful for data-intensive work such as machine learning, data analysis, and data visualization, and its lazy API with streaming execution can process some datasets that are too large to fit into memory.

**References:**
- [Polars](https://pypi.org/project/polars/)
- [Dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html)

## Input

### Import libraries

In [1]:
try:
    import polars as pl
except ModuleNotFoundError:
    !pip install polars
    import polars as pl

### Setup Variables
- `data`: data to be used to create DataFrame

In [2]:
# Inputs
data = {
    'column 1': ["A", "B", "C", "D", "E", "B", "G", "H", "I", "J", "K", "L"],
    'column 2': [3, 7, 8, 4, 1, 3, 2, 5, 7, 6, 3, 11],
    'column 3': ["V", "C", "M", "A", "S", "V", "R", "L", "Q", "N", "P", "O"]
}

## Model

### Create DataFrame

In [3]:
df = pl.DataFrame(data)

### Selecting rows and columns

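Selecting specific rows and columns together is a common operation. In `polars`, chain `filter()`, which keeps the rows matching a condition, with `select()`, which keeps the named columns. Here `filter()` runs first because the condition refers to `column 1`, which is not among the selected columns.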

In [4]:
select_rows_and_columns = df.filter(
    pl.col('column 1') == 'B'
).select('column 3')

To return more than one column from the filtered rows, pass a list of column names to `select()`:

In [5]:
select_rows_and_columns_2 = df.filter(
    pl.col('column 1') == 'B'
).select(['column 3', 'column 2'])

## Output

### Display the DataFrame and number of rows and columns

In [8]:
# Plain strings suffice here: there are no placeholders to format
print("\nSelect rows and columns, first selection:")
print(select_rows_and_columns)
print("\nSelect rows and columns, second selection:")
print(select_rows_and_columns_2)