# Tutorial 3: Results of Matching

In this tutorial you will learn how to interpret the results of matching.

In [18]:
import warnings 
from hypex import Matcher
from hypex.dataset import DataGenerator

warnings.simplefilter(action='ignore', category=FutureWarning)

In [19]:
sample_data = DataGenerator()

In [20]:
sample_data.df

Unnamed: 0,info_col_1,info_col_2,feature_col_1,feature_col_2,feature_col_3,feature_col_4,feature_col_5,feature_col_6,treatment_1,outcome_1
0,8077,O,male,Credit,0.305621,1.376506,-1.885554,1.0,0.0,1.760076
1,14527,O,male,Deposit,-0.933765,-1.143372,1.940034,1.0,0.0,1.255280
2,9124,K,female,Investment,1.097520,1.422774,1.171370,0.0,0.0,4.096455
3,5191,K,male,Credit,-1.390627,1.313302,-0.664406,0.0,1.0,0.705794
4,7636,K,male,Credit,1.270699,1.970219,0.329824,3.0,0.0,7.636827
...,...,...,...,...,...,...,...,...,...,...
4995,2851,O,male,Investment,-1.183287,-0.585822,1.797529,3.0,1.0,7.516132
4996,8644,O,female,Investment,0.478477,-0.129496,3.083552,2.0,1.0,11.526241
4997,2200,K,female,Credit,0.341917,0.157617,-0.101076,3.0,0.0,3.667932
4998,1393,K,male,Deposit,-0.183003,1.505163,-0.929622,2.0,0.0,3.018969


## 1. Matching process

To understand how `Matcher` works, see the [simple matching tutorial](./Tutorial_1_Simple_Matching.ipynb).

In [21]:
info_col = [sample_data.info_col_names[0]]

outcome = sample_data.outcome_name[0]
treatment = sample_data.treatment_name[0]

In [22]:
model = Matcher(input_data=sample_data.df, outcome=outcome, treatment=treatment, 
                info_col=info_col)
results, quality_results, df_matched = model.estimate()

Get treated index: 100%|██████████| 5000/5000 [00:00<00:00, 32720.45it/s]


## 2. Results interpretation
### 2.0 ATE, ATC, ATT 

To interpret the results correctly, formulate the null and alternative hypotheses before running the experiment.

**ATC (Average Treatment Effect on the Control)** is the average treatment effect for units that were assigned to the control group.

**ATT (Average Treatment Effect on the Treated)** is the average treatment effect for units that were assigned to the treatment group.

**ATE (Average Treatment Effect)** is the average treatment effect over all units, which equals the weighted average of ATC and ATT.


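If **ATE > 0**, the treatment increases the outcome on average compared to the control group. Whether this is the desired result depends on the metric, and the effect should be trusted only when it is statistically significant (a small p-value and a confidence interval that excludes zero).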
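The relationship between ATE, ATC and ATT can be illustrated with a small sketch. The function name `weighted_ate`, the effect values and the group sizes below are hypothetical and chosen only for illustration; weighting by group size is the usual convention, and the library's internal computation may differ.

```python
def weighted_ate(att: float, atc: float, n_treated: int, n_control: int) -> float:
    """Combine ATT and ATC into ATE, weighting each by its group size."""
    total = n_treated + n_control
    if total == 0:
        raise ValueError("at least one unit is required")
    return (att * n_treated + atc * n_control) / total


# Hypothetical effects and group sizes (not taken from the tutorial data).
ate = weighted_ate(att=4.0, atc=2.0, n_treated=1000, n_control=3000)
print(ate)  # 2.5
```

Because the control group is three times larger here, the ATE lies closer to ATC than to ATT.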


In [23]:
results

Unnamed: 0,effect_size,std_err,p-val,ci_lower,ci_upper,outcome
ATE,3.431695,0.081789,0.0,3.271388,3.592001,outcome_1
ATC,3.430632,0.090696,0.0,3.252867,3.608397,outcome_1
ATT,3.432778,0.091197,0.0,3.254031,3.611524,outcome_1
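One way to read the table above is to check, for each estimate, that the p-value is small and that the confidence interval excludes zero. The sketch below copies the rows from the output into a plain dictionary; the helper `is_significant` and the 0.05 threshold are assumptions made for illustration, not part of the library.

```python
# Rows copied from the `results` output above.
results_rows = {
    "ATE": {"effect_size": 3.431695, "p_val": 0.0, "ci_lower": 3.271388, "ci_upper": 3.592001},
    "ATC": {"effect_size": 3.430632, "p_val": 0.0, "ci_lower": 3.252867, "ci_upper": 3.608397},
    "ATT": {"effect_size": 3.432778, "p_val": 0.0, "ci_lower": 3.254031, "ci_upper": 3.611524},
}


def is_significant(row: dict, alpha: float = 0.05) -> bool:
    """True when the p-value is below alpha and the interval excludes zero."""
    excludes_zero = row["ci_lower"] > 0 or row["ci_upper"] < 0
    return row["p_val"] < alpha and excludes_zero


summary = {name: is_significant(row) for name, row in results_rows.items()}
print(summary)  # {'ATE': True, 'ATC': True, 'ATT': True}
```

All three intervals lie entirely above zero, so under these assumptions the positive effect is significant for ATE, ATC and ATT alike.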


### 2.1 PSI 
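The Population Stability Index (PSI) measures how much the distribution of a feature shifts between two samples. The thresholds are:

- PSI < 0.1: no change
- 0.1 <= PSI < 0.2: minor change
- PSI >= 0.2: significant change

In the output below, all anomaly scores are well under 0.1 and every check result is `OK`.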
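To make the thresholds above concrete, here is a minimal sketch of the textbook PSI formula applied to pre-binned proportions. The function names `psi` and `classify_psi`, the labels, and the example proportions are hypothetical; the library may bin the data and compute the score differently.

```python
import math


def psi(expected: list, actual: list, eps: float = 1e-6) -> float:
    """PSI between two lists of bin proportions (each list should sum to 1)."""
    if len(expected) != len(actual):
        raise ValueError("both samples must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Clip empty bins so the logarithm stays defined.
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def classify_psi(value: float) -> str:
    """Map a PSI value to the thresholds used in this tutorial."""
    if value < 0.1:
        return "stable"
    if value < 0.2:
        return "minor shift"
    return "significant shift"


baseline = [0.25, 0.25, 0.25, 0.25]  # hypothetical bin shares
shifted = [0.40, 0.30, 0.20, 0.10]   # hypothetical bin shares

print(classify_psi(psi(baseline, baseline)))  # stable
print(classify_psi(psi(baseline, shifted)))   # significant shift
```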


In [13]:
quality_results['psi']

Unnamed: 0,column_treated,anomaly_score_treated,check_result_treated,column_untreated,anomaly_score_untreated,check_result_untreated
0,feature_col_1_male_treated,0.0,OK,feature_col_1_male_untreated,0.0,OK
1,feature_col_2_Deposit_treated,0.0,OK,feature_col_2_Deposit_untreated,0.0,OK
2,feature_col_2_Investment_treated,0.0,OK,feature_col_2_Investment_untreated,0.0,OK
3,feature_col_3_treated,0.01,OK,feature_col_3_untreated,0.02,OK
4,feature_col_4_treated,0.02,OK,feature_col_4_untreated,0.02,OK
5,feature_col_5_treated,0.01,OK,feature_col_5_untreated,0.01,OK
6,feature_col_6_treated,0.0,OK,feature_col_6_untreated,0.0,OK


### 2.2 KS_test 

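The Kolmogorov-Smirnov (KS) test compares the distribution of each numeric feature between the matched groups. The values below appear to be p-values: a small value (for example, below 0.05) indicates that the distributions differ, while a larger value gives no evidence of a difference.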

In [15]:
quality_results['ks_test']

Unnamed: 0,match_control_to_treat,match_treat_to_control
feature_col_3,0.10539,0.125298
feature_col_4,0.213033,0.195898
feature_col_5,0.470989,0.162552
feature_col_6,1.0,1.0
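As background for the table above, the two-sample KS statistic is the largest gap between the empirical cumulative distribution functions of the two samples. The sketch below uses only the standard library; the function name `ks_statistic` and the sample values are hypothetical. It computes the statistic only and does not convert it to a p-value, which `scipy.stats.ks_2samp` can do.

```python
from bisect import bisect_right


def ks_statistic(sample_a: list, sample_b: list) -> float:
    """Largest absolute gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # The largest gap is always reached at one of the observed points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


print(ks_statistic([1, 2, 3], [1, 2, 3]))        # 0.0
print(ks_statistic([1, 2, 3], [4, 5, 6]))        # 1.0
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

A statistic near 0 means the two samples have almost identical distributions; a statistic near 1 means they barely overlap.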


### 2.3 SMD

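The standardized mean difference (SMD) is the difference between the group means of a feature divided by a pooled standard deviation. Values close to 0 indicate good balance, and a common rule of thumb treats an SMD below 0.1 as acceptable. All values below are under 0.01.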

In [16]:
quality_results['smd']

Unnamed: 0,match_control_to_treat,match_treat_to_control
feature_col_3,0.002974,0.003194
feature_col_4,0.000296,0.000736
feature_col_5,0.00473,0.004899
feature_col_6,0.006338,0.005453
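For reference, a common definition of the standardized mean difference divides the difference in group means by a pooled standard deviation. The sketch below assumes that definition and reports the absolute value; the function name `smd` and the sample values are hypothetical, and the library may use a slightly different pooling formula.

```python
import math
from statistics import mean, variance


def smd(treated: list, control: list) -> float:
    """Absolute standardized mean difference with a pooled standard deviation."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    diff = abs(mean(treated) - mean(control))
    if pooled_sd == 0:
        # Both groups are constant: balanced only if the constants coincide.
        return 0.0 if diff == 0 else math.inf
    return diff / pooled_sd


# Hypothetical feature values for the two groups.
treated_values = [1, 2, 3, 4, 5]
control_values = [2, 3, 4, 5, 6]
print(round(smd(treated_values, control_values), 4))  # 0.6325
```

In this example the groups differ by about 0.63 standard deviations, far above the 0.1 rule of thumb, whereas the matched features in the table above are all below 0.01.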


### 2.4 Repeats

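The repeats metric describes how often the same observation is reused as a match, reported separately for each matching direction.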

In [17]:
quality_results['repeats']

{'match_control_to_treat': 0.6, 'match_treat_to_control': 0.6}