# Correlation "Auto"

The "Auto" correlation is an easily interpretable pairwise column metric. It chooses the association measure from the types of the two columns being compared, according to the following mapping:

| Variable_type-Variable_type | Method | Range |
| --- | --- | --- |
| Categorical-Categorical | Cramer's V | **[0,1]** |
| Numerical-Categorical | Cramer's V (using a discretized numerical column) | **[0,1]** |
| Numerical-Numerical | Spearman's Rho | **[-1,1]** |
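
The mapping above can be illustrated with a minimal sketch in plain pandas and NumPy. This is not the library's implementation: the helper names `cramers_v` and `auto_association` are hypothetical, the Cramer's V shown is the plain uncorrected form, the equal-width binning via `pd.cut` is an assumption, and the toy data is made up for illustration.

```python
import numpy as np
import pandas as pd


def cramers_v(a: pd.Series, b: pd.Series) -> float:
    """Plain (uncorrected) Cramer's V between two discrete columns."""
    observed = pd.crosstab(a, b).to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence: outer product of the margins / n
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    min_dim = min(observed.shape) - 1
    if min_dim == 0:
        return float("nan")  # undefined when a column has a single level
    return float(np.sqrt(chi2 / (n * min_dim)))


def auto_association(a: pd.Series, b: pd.Series, n_bins: int = 10) -> float:
    """Hypothetical dispatcher mirroring the type-to-method mapping."""
    a_num = pd.api.types.is_numeric_dtype(a)
    b_num = pd.api.types.is_numeric_dtype(b)
    if a_num and b_num:
        # Spearman's rho is Pearson's r computed on the ranks
        return float(a.rank().corr(b.rank()))
    # Assumption: numerical columns are discretized into equal-width bins
    if a_num:
        a = pd.cut(a, bins=n_bins, labels=False)
    if b_num:
        b = pd.cut(b, bins=n_bins, labels=False)
    return cramers_v(a, b)


# Made-up toy data, loosely named after columns of the bank dataset
toy = pd.DataFrame(
    {
        "age": [25, 32, 47, 51, 38, 29, 60, 44],
        "balance": [100, 250, 900, 1200, 400, 150, 2000, 700],
        "job": ["admin", "admin", "manager", "manager"] * 2,
        "housing": ["yes", "yes", "no", "no"] * 2,
    }
)

print(auto_association(toy["age"], toy["balance"]))  # Numerical-Numerical
print(auto_association(toy["job"], toy["housing"]))  # Categorical-Categorical
print(auto_association(toy["age"], toy["job"], n_bins=2))  # Numerical-Categorical
```

Because the numerical column is binned before Cramer's V is computed, the Numerical-Categorical value depends on the number of bins, whereas the other two cases do not.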

This example is based on the one found at `examples/bank_marketing_data/banking_data.py`.

In [10]:
%%capture
# Run the example as before
from pathlib import Path

import pandas as pd

from pandas_profiling import ProfileReport
from pandas_profiling.utils.cache import cache_zipped_file

# Download (and cache) the UCI Bank Marketing Dataset
file_name = cache_zipped_file(
    "bank-full.csv",
    "https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip",
)

# The file is semicolon-delimited
df = pd.read_csv(file_name, sep=";")

profile = ProfileReport(
    df, title="Profile Report of the UCI Bank Marketing Dataset", explorative=True
)

The simplest way to change the number of bins is to set it directly in your script or notebook. The bins are used to discretize numerical columns, so this setting changes the granularity of the association measure for Numerical-Categorical column pairs.

In [11]:
# changing the number of bins from 10 (the default value) to 8
profile.config.correlations["auto"].n_bins = 8
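
To get a feel for what the bin count does, the sketch below discretizes a made-up numerical column into 10 and then 8 bins. It assumes equal-width binning via `pd.cut`, which may differ from the library's internal discretization; the `discretize` helper and the `balance` values are hypothetical.

```python
import numpy as np
import pandas as pd


def discretize(values: pd.Series, n_bins: int) -> pd.Series:
    """Assign each value to one of n_bins equal-width bins (integer labels)."""
    return pd.cut(values, bins=n_bins, labels=False)


# Made-up numerical column: 41 evenly spaced values between 0 and 100
balance = pd.Series(np.linspace(0, 100, 41), name="balance")

for n_bins in (10, 8):
    binned = discretize(balance, n_bins)
    # Fewer bins means wider intervals and more values per bin
    print(n_bins, "bins ->", binned.value_counts().sort_index().tolist())
```

Fewer bins give a coarser contingency table for Cramer's V, while more bins give a finer one with fewer observations per cell.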

The 'auto' correlation matrix is displayed with the other correlation matrices in the report.

In [12]:
%%capture cap --no-stdout
profile.to_file(Path("uci_bank_marketing_report.html"))