# What is Narwhals

Narwhals is a Python library that lets you write one piece of code that works with different DataFrame libraries, such as:

- Pandas
- Polars
- PyArrow

Instead of writing separate code for each library, you write it once with Narwhals and run it on any of them.

## Core idea: Expressions

Narwhals is built around a simple idea:

**An expression is a small function that takes a DataFrame and gives you one or more columns (Series).**

Take the following code:

In [None]:
from __future__ import annotations

import narwhals as nw

nw.col("a") + 1

This builds a Narwhals expression object. Nothing has run yet. The expression is just a recipe that says: "When you have a DataFrame, take the column 'a', add 1 to it, and give me the result."

Internally, the expression is stored as a tree of operations that represents this computation.
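To make the "tree of operations" idea concrete, here is a minimal sketch in plain Python. The `Col` and `Add` classes and the `evaluate` function are hypothetical names made up for this illustration. They are not Narwhals classes, and the real internal representation may look quite different.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    """Leaf node (hypothetical): refers to a column by name."""

    name: str

    def __add__(self, other: object) -> Add:
        return Add(self, other)


@dataclass(frozen=True)
class Add:
    """Inner node (hypothetical): adds a scalar to the result of `left`."""

    left: Col | Add
    right: object

    def __add__(self, other: object) -> Add:
        return Add(self, other)


def evaluate(node: object, frame: dict[str, list[int]]) -> list[int]:
    """Walk the tree and compute one column from a dict-of-lists frame."""
    if isinstance(node, Col):
        return frame[node.name]
    if isinstance(node, Add):
        left = evaluate(node.left, frame)
        return [value + node.right for value in left]
    raise TypeError(f"unsupported node: {node!r}")


# Building the tree does not compute anything.
tree = Col("a") + 1
print(tree)  # Add(left=Col(name='a'), right=1)

# The computation only happens once a frame is supplied.
print(evaluate(tree, {"a": [1, 2, 3]}))  # [2, 3, 4]
```

The tree is only a description of the work. The numbers appear when `evaluate` is handed some data, which mirrors the "recipe" idea above.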

The next step is to apply this expression to a DataFrame.

Narwhals checks what kind of DataFrame it is (Pandas, Polars, or PyArrow) and translates the expression into the appropriate operations for that library.

Behind the scenes, Narwhals performs this translation by calling:

In [None]:
(nw.col("a") + 1)._to_compliant_expr(namespace)

This expression can then be applied to different DataFrame libraries like Pandas, Polars, or PyArrow.

Depending on the backend, this becomes:

In [None]:
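# Pandas (df is a pandas DataFrame)
df["a"] + 1

# Polars (pl is the polars module)
pl.col("a") + 1

# Arrow (pc is pyarrow.compute, table is a pyarrow Table)
pc.add(table["a"], 1)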

By itself, **an expression doesn't produce a value**. It only produces a value once you give it to a DataFrame context. What happens to the value(s) it produces depends on which context you hand it to:

- `DataFrame.select`: produce a DataFrame with only the result of the given expression

- `DataFrame.with_columns`: produce a DataFrame like the current one, but also with the result of the given expression

- `DataFrame.filter`: evaluate the given expression and, if it returns a single Series, keep only the rows where the result is `True`.
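The three contexts above can be sketched with a toy model. In this illustration a frame is a plain dict of lists and an expression is an ordinary function from a frame to named columns. The helper names (`select`, `with_columns`, `filter_rows`, `a_plus_one`, `a_greater_than_one`) are made up for the sketch and only imitate the behaviour described above. They are not the Narwhals implementation.

```python
from __future__ import annotations


def select(frame, expr):
    """Return a frame holding only the columns the expression produces."""
    return dict(expr(frame))


def with_columns(frame, expr):
    """Return the original columns plus the columns the expression produces."""
    return {**frame, **expr(frame)}


def filter_rows(frame, expr):
    """Keep the rows where the expression's single boolean column is True."""
    result = expr(frame)
    if len(result) != 1:
        raise ValueError("filter expects an expression that yields one column")
    (mask,) = result.values()
    return {
        name: [value for value, keep in zip(values, mask) if keep]
        for name, values in frame.items()
    }


def a_plus_one(frame):
    """Toy expression: column 'a' plus 1, under a new name."""
    return {"a_plus_one": [x + 1 for x in frame["a"]]}


def a_greater_than_one(frame):
    """Toy expression: a boolean column that is True where 'a' > 1."""
    return {"mask": [x > 1 for x in frame["a"]]}


frame = {"a": [1, 2, 3], "b": [10, 20, 30]}

print(select(frame, a_plus_one))
# {'a_plus_one': [2, 3, 4]}
print(with_columns(frame, a_plus_one))
# {'a': [1, 2, 3], 'b': [10, 20, 30], 'a_plus_one': [2, 3, 4]}
print(filter_rows(frame, a_greater_than_one))
# {'a': [2, 3], 'b': [20, 30]}
```

The same expression (`a_plus_one`) gives different results in `select` and `with_columns`, because the context decides what happens to the columns it produces.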

## Namespace

A **namespace** in Narwhals is a special object that knows how to work with a specific backend.

Each backend has its own namespace:

**Pandas** (& Modin, cuDF) →  `PandasLikeNamespace`

**Polars** →  `PolarsNamespace`

**Arrow** →  `ArrowNamespace`


Narwhals implements core functions like `col`, `lit`, `sum`, etc. **once per backend**, inside backend-specific namespaces.
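As a rough illustration of "once per backend", the sketch below defines two made-up backends: one stores a frame as a dict of lists, the other as a list of row dicts. Each gets its own namespace with a `col` method. `ColumnarNamespace`, `RowNamespace`, and `get_namespace` are hypothetical names for this toy model, not Narwhals classes.

```python
from __future__ import annotations


class ColumnarNamespace:
    """Hypothetical backend whose frames are dicts of lists."""

    def col(self, name):
        # The column is already stored as a list.
        return lambda frame: frame[name]


class RowNamespace:
    """Hypothetical backend whose frames are lists of row dicts."""

    def col(self, name):
        # The column has to be collected row by row.
        return lambda frame: [row[name] for row in frame]


def get_namespace(frame):
    """Pick the namespace that matches the frame's type."""
    return ColumnarNamespace() if isinstance(frame, dict) else RowNamespace()


columnar = {"a": [1, 2, 3]}
rows = [{"a": 1}, {"a": 2}, {"a": 3}]

for frame in (columnar, rows):
    namespace = get_namespace(frame)
    print(type(namespace).__name__, namespace.col("a")(frame))
# ColumnarNamespace [1, 2, 3]
# RowNamespace [1, 2, 3]
```

The calling code never changes: it always asks the namespace for `col("a")`. Only the namespace knows how that is done for its backend.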

# Implementation

A new Narwhals function such as `nw.struct(...)` has to be implemented in all three backends (Pandas, Polars, and Arrow).

| Backend | Namespace class | Expr class |
|---------|-----------------|------------|
| Pandas | `PandasLikeNamespace` | `PandasLikeExpr` |
| Polars | `PolarsNamespace` | `PolarsExpr` |
| Arrow | `ArrowNamespace` | `ArrowExpr` |


(Narwhals uses “constructor” to describe any function that builds an `Expr` representing a computation without executing it.)

## Narwhals architecture

1. **Top-Level Function/Constructor** (what the user writes):

In [None]:
# User calls something like this:
nw.struct(nw.col("a"), nw.col("b"))

- This is a call to a top-level Narwhals function, written by the user.
- The function (e.g. `struct(...)`) lives in `narwhals/functions.py`.
- It does not compute the result yet. It only constructs a symbolic expression.
- It returns an `Expr` object, which wraps one or more `ExprNode` objects.

In [10]:
# Simplified example implementation of the struct function
# (parse_expr, ExprNode and Expr are assumed to be in scope):
def struct(*exprs):
    # Turn strings or native types into expressions
    parsed = [parse_expr(e) for e in exprs]
    # Describe the operation: "do a struct operation"
    node = ExprNode(kind="struct", exprs=parsed)
    # Wrap the node in an Expr object and return it
    return Expr([node])

- `Expr([...])` is a core building block in Narwhals, defined in `narwhals/expr.py`

- `ExprNode` objects are created inside the constructor and stored inside the Expr object via `Expr([node])`
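The constructor pattern described above can be tried out with small stand-in classes. In the sketch below, `ExprNode`, `Expr`, `col`, and `parse_expr` are simplified toy versions written for this example. Their fields and behaviour are assumptions and do not match the real Narwhals classes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ExprNode:
    """Toy node: an operation name, its input expressions, and a column name."""

    kind: str
    exprs: tuple = ()
    name: str | None = None


@dataclass(frozen=True)
class Expr:
    """Toy expression: just a container for nodes."""

    nodes: tuple


def col(name):
    """Toy constructor: refer to a column by name."""
    return Expr((ExprNode(kind="col", name=name),))


def parse_expr(value):
    """Treat strings as column names and pass Expr objects through."""
    return col(value) if isinstance(value, str) else value


def struct(*exprs):
    """Toy constructor: describe a struct operation without running it."""
    parsed = tuple(parse_expr(e) for e in exprs)
    return Expr((ExprNode(kind="struct", exprs=parsed),))


expr = struct("a", col("b"))
node = expr.nodes[0]
print(node.kind)  # struct
print([child.nodes[0].name for child in node.exprs])  # ['a', 'b']
```

Calling `struct(...)` touches no data at all. It only records which operation should happen and on which inputs.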

2. **Expr Object** (what is returned by the constructor function)

In [None]:
# Expression object:
Expr(
    # ... some metadata
    ExprNode(kind=ExprKind.ELEMENTWISE, exprs=[...])  # ⬅️ this is the key operation
)

- This is the Expr object: a symbolic representation of the operation.
- It contains one or more `ExprNode` instances — each describing one computation step.
- No execution has happened yet!
- `Expr` is defined in `narwhals/expr.py`
- `ExprNode` is defined in `narwhals/_expression_parsing.py`

3. **Namespace → Expr Translation**

This is the process of taking the symbolic information from the `ExprNode` objects inside the `Expr` object and calling the real backend function through the namespace for Pandas, Polars, or Arrow.

That’s where `ExprNode._to_compliant_expr(namespace)` comes in. This method is already implemented in Narwhals (in `narwhals/_expression_parsing.py`) and is called automatically when needed.

Narwhals internally does:

In [None]:
# 1. Extract the namespace from the DataFrame type
namespace = PolarsNamespace()  # or PandasLikeNamespace(), etc

# 2. Take the Expr object, and walk through its ExprNodes
for node in expr._nodes:
    compliant_expr = node._to_compliant_expr(namespace)

`namespace.struct` has to be defined in:
- `narwhals/_pandas_like/namespace.py`
- `narwhals/_polars/namespace.py`
- `narwhals/_arrow/namespace.py`

Each backend can implement it differently, depending on what “struct” means in that library.
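One way to picture why every namespace needs its own `struct` is a node that looks up a method by name on whatever namespace it is given. The sketch below is a toy model under that assumption: `ToyNode`, `to_compliant`, `DictNamespace`, and `ColOnlyNamespace` are hypothetical names, and the real `_to_compliant_expr` may dispatch differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNode:
    """Toy node: the namespace method to call and its arguments."""

    kind: str
    args: tuple = ()

    def to_compliant(self, namespace):
        """Look up the method named `kind` on the namespace and call it."""
        method = getattr(namespace, self.kind, None)
        if method is None:
            raise NotImplementedError(
                f"{type(namespace).__name__} does not implement {self.kind!r}"
            )
        return method(*self.args)


class DictNamespace:
    """Hypothetical backend (dict of lists) that supports col and struct."""

    def col(self, name):
        return lambda frame: frame[name]

    def struct(self, *names):
        # Build one dict per row, holding the named fields.
        return lambda frame: [
            dict(zip(names, values)) for values in zip(*(frame[n] for n in names))
        ]


class ColOnlyNamespace:
    """Hypothetical backend where struct has not been implemented yet."""

    def col(self, name):
        return lambda frame: frame[name]


frame = {"a": [1, 2], "b": ["x", "y"]}
node = ToyNode(kind="struct", args=("a", "b"))

print(node.to_compliant(DictNamespace())(frame))
# [{'a': 1, 'b': 'x'}, {'a': 2, 'b': 'y'}]

try:
    node.to_compliant(ColOnlyNamespace())
except NotImplementedError as exc:
    print(exc)  # ColOnlyNamespace does not implement 'struct'
```

The node itself is backend-agnostic. If one namespace is missing the method, the same expression works on one backend and fails on another, which is why a new function has to be added to each namespace.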

4. **Real computation (Backend)**

## Pandas implementation

Pandas is **not Narwhals-compliant** because:
- Pandas was not built with symbolic expressions in mind.
- It expects real data and actual column names, not symbolic instructions.
- Its API (application programming interface) is designed differently from that of Polars.

So Narwhals creates its own version of the Pandas tools:
- It makes a special `PandasLikeNamespace`
- It implements its own `PandasLikeExpr` and `PandasLikeDataFrame`

This lets Narwhals translate expressions into things that Pandas can understand and execute, even though Pandas itself doesn’t work that way.

`PandasLikeNamespace` implements the **top-level Polars functions** that are part of the Narwhals API:

In [11]:
import narwhals as nw
from narwhals._pandas_like.namespace import PandasLikeNamespace
from narwhals._pandas_like.utils import Implementation
from narwhals._utils import Version

pn = PandasLikeNamespace(implementation=Implementation.PANDAS, version=Version.MAIN)
print(nw.col("a")._to_compliant_expr(pn))

<narwhals._pandas_like.expr.PandasLikeExpr object at 0x12a1e1f10>
