# Polars Tutorial — Beginner to Advanced
Comprehensive notes, definitions, examples, and workflows.

## 1. Introduction to Polars
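- Polars is a fast DataFrame library written in Rust, with bindings for Python.
- It stores data in a columnar layout based on the Apache Arrow memory format.
- It supports both eager and lazy execution.
- It is built for processing large datasets and runs queries in parallel across CPU cores.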

## 2. Installing Polars

In [None]:
!pip install polars

## 3. Importing Polars

In [None]:
import polars as pl

## 4. Creating DataFrames

In [None]:
df = pl.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'salary': [50000, 60000, 55000]
})
df

## 5. Reading Data

In [None]:
pl.read_csv('file.csv')
pl.read_parquet('file.parquet')

## 6. Selecting & Filtering

In [None]:
df.select(['name', 'age'])
df.filter(pl.col('age') > 28)

## 7. Expressions in Polars
Polars uses **expressions** for transformations.

In [None]:
df.select([pl.col('salary') * 1.1])

## 8. GroupBy & Aggregation

In [None]:
# `groupby` was renamed to `group_by` in Polars 0.19.
df.group_by('age').agg(pl.col('salary').mean())

## 9. Joining DataFrames

In [None]:
df.join(df, on='name', how='inner')

## 10. Lazy Mode — Optimized Query Engine

In [None]:
lf = df.lazy()
lf.filter(pl.col('age') > 28).select(['name']).collect()

## 11. Window Functions

In [None]:
df.select(pl.col('salary').mean().over('age'))

## 12. Data Types

In [None]:
df.dtypes

## 13. Handling Nulls

In [None]:
df.fill_null(0)
df.drop_nulls()

## 14. Sorting & Unique

In [None]:
df.sort('age', descending=True)
df.unique()

## 15. Writing Files

In [None]:
df.write_csv('output.csv')
df.write_parquet('output.parquet')

## 16. Working with Datetime

In [None]:
from datetime import date
df2 = pl.DataFrame({'date': pl.date_range(date(2023, 1, 1), date(2023, 1, 5), interval='1d', eager=True)})

## 17. String Operations

In [None]:
df.select(pl.col('name').str.to_uppercase())

## 18. Advanced: Custom Functions (map_elements)

In [None]:
df.with_columns([pl.col('age').map_elements(lambda x: x + 1, return_dtype=pl.Int64)])

## 19. Streaming Mode (for big data)

In [None]:
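# Recent Polars releases select the streaming engine via `engine`; older ones used collect(streaming=True).
pl.scan_csv('large.csv').select([pl.col('age').mean()]).collect(engine='streaming')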

## 20. Polars vs Pandas Summary
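- Often faster than pandas, especially on large datasets.
- Multithreaded by default, whereas most pandas operations run on a single thread.
- Offers lazy execution with query optimization; pandas evaluates eagerly.
- Better suited to large datasets, including data that does not fit in memory when streaming is used.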