<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Polars - Select both rows and columns
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Polars/Polars_Read_CSV.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=&template=template-request.md&title=Tool+-+Action+of+the+notebook+">Template request</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Polars+-+Read+CSV:+Error+short+description">Bug report</a> | <a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Naas/Naas_Start_data_product.ipynb" target="_parent">Generate Data Product</a>

**Tags:** #polars #dataframe #read #python #library #data #csv

**Author:** [Antonio Georgiev](https://www.linkedin.com/in/antonio-georgiev-b672a325b)

**Description:** This notebook demonstrates how to create a DataFrame with the Polars library and how to select columns, rows, or both columns and rows at once.

About Polars:
- `polars` is a DataFrame library for data manipulation, written in Rust with Python bindings, that uses the `Apache Arrow` columnar memory format.
- It offers fast and efficient data processing and manipulation capabilities for large datasets, with a Pandas-like API and support for advanced data types.
- `polars` is especially useful for data-intensive applications such as machine learning and data analysis, and its lazy, streaming execution mode can process some datasets that do not fit into memory.

**References:**
- [Polars](https://pypi.org/project/polars/)
- [Dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html)

## Input

### Import libraries

Import the library. If the import fails, install the package first and then import it.

In [2]:
try:
    import polars as pl
except ModuleNotFoundError:
    !pip install polars
    import polars as pl

### Input

In [3]:
# Inputs
data = {
        'column 1': ["A", "B", "C", "D",
                    "E", "B", "G", "H",
                    "I", "J", "K", "L",],
        'column 2': [3, 7, 8, 4,
                    1, 3, 2, 5,
                    7, 6, 3, 11],
        'column 3': ["V", "C", "M", "A",
                    "S", "V", "R", "L",
                    "Q", "N", "P", "O",]
}

## Model

### Create DataFrame

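Create the DataFrame from the input dictionary using the Polars library. Each dictionary key becomes a column name and each list becomes that column's values.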

In [4]:
df = pl.DataFrame(data)

### Selecting rows and columns

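Selecting rows and columns together is a common operation. Chain `filter()`, which keeps the rows that match a condition, with `select()`, which keeps only the columns you name. The cell below keeps the rows where `column 1` equals `'B'` and returns only `column 3`: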

In [10]:
select_rows_and_columns = df.filter(
    pl.col('column 1') == 'B'
).select('column 3')

To return more than one column, pass a list of column names to `select()`. The result keeps the columns in the order they are listed:

In [9]:
select_rows_and_columns_2 = df.filter(
    pl.col('column 1') == 'B'
).select(['column 3', 'column 2'])

## Output

### Display the DataFrame and number of rows and columns

In [11]:
display(df)
print(f"Number of rows: {df.height}")
print(f"Number of columns: {df.width}")

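| column 1 | column 2 | column 3 |
| --- | --- | --- |
| str | i64 | str |
| "A" | 3 | "V" |
| "B" | 7 | "C" |
| "C" | 8 | "M" |
| "D" | 4 | "A" |
| "E" | 1 | "S" |
| "B" | 3 | "V" |
| "G" | 2 | "R" |
| "H" | 5 | "L" |
| "I" | 7 | "Q" |
| "J" | 6 | "N" |
| "K" | 3 | "P" |
| "L" | 11 | "O" |

Number of rows: 12
Number of columns: 3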
