<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Polars - Select columns
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Polars/Polars_Read_CSV.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=&template=template-request.md&title=Tool+-+Action+of+the+notebook+">Template request</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Polars+-+Read+CSV:+Error+short+description">Bug report</a> | <a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Naas/Naas_Start_data_product.ipynb" target="_parent">Generate Data Product</a>

**Tags:** #polars #dataframe #select #columns #python #library #data

**Author:** [Antonio Georgiev](https://www.linkedin.com/in/antonio-georgiev-b672a325b)

**Description:** This notebook demonstrates how to select columns from a Polars DataFrame with the `select()` method: a single column by name, several columns by a list of names, and columns by data type.

About Polars:
- `polars` is a DataFrame library for Python that is written in Rust and uses the `Apache Arrow` columnar format as its in-memory model.
- It offers fast, multi-threaded data processing for large datasets, with a DataFrame API that will look familiar to Pandas users and support for nested data types such as lists and structs.
- `polars` is especially useful for data-intensive applications such as machine learning, data analysis, and data visualization, and its lazy API can process some datasets that are too large to fit into memory.

**References:**
- [Polars](https://pypi.org/project/polars/)
- [Dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html)

## Input

### Import libraries

Import the required library. If the import fails, install the library and import it again.

In [1]:
try:
    import polars as pl
except ModuleNotFoundError:
    !pip install polars
    import polars as pl

### Input

In [2]:
# Inputs: a dictionary mapping each column name to its list of values
data = {
    'column 1': ["A", "B", "C", "D",
                 "E", "B", "G", "H",
                 "I", "J", "K", "L"],
    'column 2': [3, 7, 8, 4,
                 1, 3, 2, 5,
                 7, 6, 3, 11],
    'column 3': ["V", "C", "M", "A",
                 "S", "V", "R", "L",
                 "Q", "N", "P", "O"],
}

## Model

### Create DataFrame

Create the DataFrame from the input dictionary with `pl.DataFrame()`:

In [3]:
df = pl.DataFrame(data)

### Select columns

To select a single column from the DataFrame, pass its name to the `select()` method:

In [4]:
select_column = df.select("column 2")

To select multiple columns from the DataFrame, pass a list of column names to the `select()` method:

In [5]:
select_multiple_columns = df.select(
    ["column 1","column 2"]
)

To select columns from the DataFrame by data type, pass a `pl.col()` expression containing the data type to the `select()` method:

In [6]:
select_column_by_data_type = df.select(
    pl.col(pl.Int64)
)

## Output

### Display the DataFrame and the selected columns

In [7]:
display(df)
print("\nSelect column function:")
print(select_column)
print("\nSelect multiple columns function:")
print(select_multiple_columns)
print("\nSelect column by data type function:")
print(select_column_by_data_type)

| column 1 | column 2 | column 3 |
| --- | --- | --- |
| str | i64 | str |
| "A" | 3 | "V" |
| "B" | 7 | "C" |
| "C" | 8 | "M" |
| "D" | 4 | "A" |
| "E" | 1 | "S" |
| "B" | 3 | "V" |
| "G" | 2 | "R" |
| "H" | 5 | "L" |
| "I" | 7 | "Q" |
| "J" | 6 | "N" |



Select column function:
shape: (12, 1)
┌──────────┐
│ column 2 │
│ ---      │
│ i64      │
╞══════════╡
│ 3        │
│ 7        │
│ 8        │
│ 4        │
│ …        │
│ 7        │
│ 6        │
│ 3        │
│ 11       │
└──────────┘

Select multiple columns function:
shape: (12, 2)
┌──────────┬──────────┐
│ column 1 ┆ column 2 │
│ ---      ┆ ---      │
│ str      ┆ i64      │
╞══════════╪══════════╡
│ A        ┆ 3        │
│ B        ┆ 7        │
│ C        ┆ 8        │
│ D        ┆ 4        │
│ …        ┆ …        │
│ I        ┆ 7        │
│ J        ┆ 6        │
│ K        ┆ 3        │
│ L        ┆ 11       │
└──────────┴──────────┘

Select column by data type function:
shape: (12, 1)
┌──────────┐
│ column 2 │
│ ---      │
│ i64      │
╞══════════╡
│ 3        │
│ 7        │
│ 8        │
│ 4        │
│ …        │
│ 7        │
│ 6        │
│ 3        │
│ 11       │
└──────────┘
