# ANSI Migration Guide - Pandas API on Spark
ANSI mode is now on by default for Pandas API on Spark. This guide helps you understand the key behavior differences you’ll see.
In short, with ANSI mode on, Pandas API on Spark matches native pandas in cases where it diverged from pandas with ANSI mode off.

## Behavior Change
### String Number Comparison
**ANSI off:** Spark implicitly casts numbers and strings, so `1` and `'1'` are considered equal.
**ANSI on:** behaves like pandas: `1 == '1'` is `False`.

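In [1]:
import pandas as pd
import pyspark.pandas as ps
from pyspark.sql import SparkSession

# Reuse the active session or start one; `spark` is needed for spark.conf.set below
spark = SparkSession.builder.getOrCreate()

pdf = pd.DataFrame({"int": [1, 2], "str": ["1", "2"]})
psdf = ps.from_pandas(pdf)

# ANSI on: set it explicitly so the cell gives the same result when re-run
spark.conf.set("spark.sql.ansi.enabled", True)
print(psdf["int"] == psdf["str"])
print(pdf["int"] == pdf["str"])

# ANSI off: Spark casts implicitly before comparing
spark.conf.set("spark.sql.ansi.enabled", False)
print(psdf["int"] == psdf["str"])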


### Strict Casting
**ANSI off:** invalid casts (e.g., `'a' → int`) silently return NULL.
**ANSI on:** the same casts raise errors.

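In [None]:
pdf = pd.DataFrame({"str": ["a"]})
psdf = ps.from_pandas(pdf)

# ANSI on: set it explicitly in case an earlier cell turned it off
spark.conf.set("spark.sql.ansi.enabled", True)
try:
    print(psdf["str"].astype(int))
except Exception as e:
    print(e)  # the invalid cast raises

# pandas raises for the same cast
try:
    print(pdf["str"].astype(int))
except Exception as e:
    print(e)

# ANSI off: the invalid cast yields NULL instead of raising
spark.conf.set("spark.sql.ansi.enabled", False)
print(psdf["str"].astype(int))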
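
The two casting policies can be mimicked in plain Python, without a Spark session. This is only an analogy: `lenient_cast` and `strict_cast` are hypothetical helpers, not Spark or pandas APIs, and Python's `int()` does not parse strings exactly the way Spark's cast does.

```python
from typing import Optional


def lenient_cast(value: str) -> Optional[int]:
    """ANSI-off style: an invalid value becomes None (NULL) instead of failing."""
    try:
        return int(value)
    except ValueError:
        return None


def strict_cast(value: str) -> int:
    """ANSI-on style: an invalid value raises an error."""
    return int(value)


print(lenient_cast("a"))  # None: the failure is hidden in the data
try:
    strict_cast("a")
except ValueError as e:
    print(e)  # the failure surfaces immediately
```

The lenient policy keeps a pipeline running but pushes the problem downstream as missing values; the strict policy reports it at the point of the cast.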

### MultiIndex.to_series Return
**ANSI off:** returns each row as a list (`[1, red]`).
**ANSI on:** returns each row as a tuple (`(1, red)`) when the runtime SQL configuration `spark.sql.execution.pandas.structHandlingMode` is set to `'row'`.

In [None]:
arrays = [[1, 2], ["red", "blue"]]
pidx = pd.MultiIndex.from_arrays(arrays, names=("number", "color"))
psidx = ps.from_pandas(pidx)

# ANSI on: set it explicitly in case an earlier cell turned it off
spark.conf.set("spark.sql.ansi.enabled", True)
spark.conf.set("spark.sql.execution.pandas.structHandlingMode", "row")
print(psidx.to_series())
print(pidx.to_series())

# ANSI off
spark.conf.set("spark.sql.ansi.enabled", False)
print(psidx.to_series())

## Related Configurations
1. **`compute.fail_on_ansi_mode` (Pandas API on Spark option)**
   - Controls whether Pandas API on Spark fails immediately when ANSI mode is enabled.
   - Now overridden by `compute.ansi_mode_support`.

2. **`compute.ansi_mode_support` (Pandas API on Spark option)**
   - Indicates whether ANSI mode is fully supported.

3. **`spark.sql.ansi.enabled` (Spark config)**
   - Native Spark setting that controls ANSI mode.
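
Because `spark.sql.ansi.enabled` is a session-level setting, turning it off for a comparison leaves it off for everything that runs afterwards. One way to avoid that is a small helper that restores the previous value. The sketch below is a minimal version under stated assumptions: `temporary_conf` is a hypothetical name, not part of PySpark, and it relies only on the session exposing `conf.get` and `conf.set`, as `SparkSession` does.

```python
from contextlib import contextmanager


@contextmanager
def temporary_conf(spark, key, value):
    """Set a Spark runtime config inside a `with` block, then restore it.

    Hypothetical helper; `spark` is any object with `conf.get(key)` and
    `conf.set(key, value)`, such as a SparkSession.
    """
    previous = spark.conf.get(key)  # remember the current value
    spark.conf.set(key, value)
    try:
        yield
    finally:
        # Runs even if the body raises, so the session is never left changed
        spark.conf.set(key, previous)
```

With a live session, `with temporary_conf(spark, "spark.sql.ansi.enabled", False): print(psdf["int"] == psdf["str"])` would run the comparison with ANSI off and return to the earlier setting afterwards.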