A frame-oriented interface for computing xicor in the database. References: Professor Sourav Chatterjee's xicor coefficient of correlation (<a href="https://win-vector.com/2021/12/29/exploring-the-xi-correlation-coefficient/">Nina Zumel's tutorial</a>, <a href="https://doi.org/10.1080/01621459.2020.1758115">JASA</a>; original sources: <a href="https://CRAN.R-project.org/package=XICOR">R package</a>, <a href="https://arxiv.org/abs/1909.10140">arXiv</a>, <a href="https://news.ycombinator.com/item?id=29687613">Hacker News</a>, and <a href="https://github.com/czbiohub/xicor">a Python package</a> by a different author).

For more notes, please see the [xicor example notebook](https://github.com/WinVector/data_algebra/blob/main/Examples/xicor/xicor.ipynb).

In [1]:

import pandas as pd
from data_algebra.data_ops import descr
import data_algebra.solutions
import data_algebra.BigQuery


In [2]:
df = pd.DataFrame({'x1': [1, 2, 3], 'x2': [1, 1, 2], 'y': [1, 2, 3]})

ops, rep_frame_name, rep_frame = data_algebra.solutions.xicor_score_variables_plan(
    descr(df=df),
    x_vars=['x1', 'x2'],
    y_name='y',
)

ops.eval({'df': df, rep_frame_name: rep_frame})

| | variable_name | xicor_mean | xicor_std |
|---|---|---|---|
| 0 | x1 | 0.25 | 0.0 |
| 1 | x2 | 0.13 | 0.178536 |
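
As a rough cross-check of the scores above, here is a minimal pure-Python sketch of Chatterjee's xi coefficient. It is not the `data_algebra` implementation; the function name `xicor` and its `rng` parameter are hypothetical, and it assumes ties in `x` are broken uniformly at random, as in the original definition. Under that assumption `x1` (no ties) always scores 0.25, while `x2` (one tie) takes one of two values depending on the tie order, which would explain a nonzero `xicor_std` and estimates that move between runs.

```python
import random


def xicor(x, y, rng=random):
    """Sketch of Chatterjee's xi; ties in x are broken at random (assumption)."""
    n = len(x)
    # Order rows by x; the random secondary key decides the order of tied x values.
    order = sorted(range(n), key=lambda i: (x[i], rng.random()))
    ys = [y[i] for i in order]
    # r[i] counts values <= ys[i]; l[i] counts values >= ys[i].
    r = [sum(1 for v in ys if v <= yi) for yi in ys]
    l = [sum(1 for v in ys if v >= yi) for yi in ys]
    num = n * sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    den = 2 * sum(li * (n - li) for li in l)  # zero if y is constant
    return 1 - num / den


x1, x2, y = [1, 2, 3], [1, 1, 2], [1, 2, 3]
print(xicor(x1, y))  # 0.25

rng = random.Random(0)  # seeded only to make the sketch repeatable
scores = [xicor(x2, y, rng) for _ in range(1000)]
print(sorted(set(scores)))  # [-0.125, 0.25]
```

If the two tie orders are equally likely, the expected score for `x2` is (0.25 - 0.125) / 2 = 0.0625, so a mean over a modest number of repetitions can plausibly land at values such as the one shown in the table.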


In [3]:
# try it in the database
db_handle = data_algebra.BigQuery.example_handle()
db_handle.insert_table(df, table_name='df', allow_overwrite=True)
db_handle.insert_table(rep_frame, table_name=rep_frame_name, allow_overwrite=True)

(TableDescription(table_name="rep_frame", column_names=["rep"]))

In [4]:
db_handle.drop_table("xicor")

In [5]:
db_handle.execute(f"CREATE TABLE {db_handle.db_model.table_prefix}.xicor AS {db_handle.to_sql(ops)}")
db_res = db_handle.read_query(f"SELECT * FROM {db_handle.db_model.table_prefix}.xicor ORDER BY variable_name")

db_res

| | xicor_mean | xicor_std | variable_name |
|---|---|---|---|
| 0 | 0.25 | 0.0 | x1 |
| 1 | 0.07 | 0.191213 | x2 |


In [6]:
# clean up
db_handle.drop_table("df")
db_handle.drop_table(rep_frame_name)
db_handle.drop_table("xicor")
db_handle.close()
# show we made it to here, and did not assert earlier
print('done')

done
