# Getting Started Tutorial

To install Evidently using the pip package manager, run:

```$ pip install evidently```


To display reports inside a Jupyter notebook, you also need to install the Jupyter nbextension. After installing Evidently, run the following two commands in a terminal from the Evidently directory.

To install the Jupyter nbextension, run:

```$ jupyter nbextension install --sys-prefix --symlink --overwrite --py evidently```

To enable it, run:

```$ jupyter nbextension enable evidently --py --sys-prefix```

That's it!
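
To confirm the installation before opening a notebook, a quick check like the one below can help. It is a minimal sketch that relies only on the Python standard library; the helper name `installed_version` is hypothetical and not part of Evidently.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("evidently")
if version is None:
    print("evidently is not installed; run `pip install evidently` first")
else:
    print(f"evidently {version} is installed")
```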

In [10]:
import pandas as pd
import numpy as np

from sklearn.datasets import fetch_california_housing

from evidently.pipeline.column_mapping import ColumnMapping

from evidently.dashboard import Dashboard
from evidently.dashboard.tabs import DataDriftTab, NumTargetDriftTab

from evidently.test_suite import TestSuite
from evidently.test_preset import DataQuality, DataStability, DataDrift
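# Import the individual tests explicitly instead of using a wildcard import
from evidently.tests import (
    TestNumberOfColumnsWithNANs,
    TestNumberOfRowsWithNANs,
    TestNumberOfConstantColumns,
    TestNumberOfDuplicatedRows,
    TestNumberOfDuplicatedColumns,
    TestColumnsType,
    TestNumberOfDriftedFeatures,
)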

## Load Data

In [18]:
data = fetch_california_housing(as_frame=True)
housing_data = data.frame

In [22]:
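# Rename the target column, then simulate model predictions as target plus Gaussian noise (mean 0, std 5)
housing_data.rename(columns={'MedHouseVal': 'target'}, inplace=True)
housing_data['prediction'] = housing_data['target'].values + np.random.normal(0, 5, housing_data.shape[0])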

In [27]:
housing_data.head()

| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude | target | prediction |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.984127 | 1.02381 | 322.0 | 2.555556 | 37.88 | -122.23 | 4.526 | 3.754242 |
| 1 | 8.3014 | 21.0 | 6.238137 | 0.97188 | 2401.0 | 2.109842 | 37.86 | -122.22 | 3.585 | 1.838854 |
| 2 | 7.2574 | 52.0 | 8.288136 | 1.073446 | 496.0 | 2.80226 | 37.85 | -122.24 | 3.521 | 5.850296 |
| 3 | 5.6431 | 52.0 | 5.817352 | 1.073059 | 558.0 | 2.547945 | 37.85 | -122.25 | 3.413 | -1.331103 |
| 4 | 3.8462 | 52.0 | 6.281853 | 1.081081 | 565.0 | 2.181467 | 37.85 | -122.25 | 3.422 | -2.356477 |


In [26]:
# Draw two random samples of 5,000 rows each to act as the reference and current datasets
reference = housing_data.sample(n=5000, replace=False)
current = housing_data.sample(n=5000, replace=False)
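
Because the two samples above are drawn independently from the same frame, some rows can appear in both `reference` and `current`. If non-overlapping sets are preferred, one option is to draw the row positions once and split them in half. The sketch below uses only the standard library; `disjoint_positions` and its parameters are hypothetical names introduced for illustration, and the row count is an assumed value.

```python
import random
from typing import List, Tuple


def disjoint_positions(n_rows: int, n_each: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Draw two non-overlapping sets of row positions, each of size n_each."""
    if 2 * n_each > n_rows:
        raise ValueError("not enough rows for two disjoint samples")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    # Sampling without replacement makes every drawn position unique,
    # so the two halves cannot share a row.
    drawn = rng.sample(range(n_rows), 2 * n_each)
    return drawn[:n_each], drawn[n_each:]


# 20640 is an assumed row count for illustration; use len(housing_data) in practice.
ref_pos, cur_pos = disjoint_positions(n_rows=20640, n_each=5000)
print(len(ref_pos), len(cur_pos), len(set(ref_pos) & set(cur_pos)))
```

With the frame from this tutorial, the positions could then be applied as `reference = housing_data.iloc[ref_pos]` and `current = housing_data.iloc[cur_pos]`.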

## Test Suite

### HTML Suite

In [31]:
tests = TestSuite(tests=[
    TestNumberOfColumnsWithNANs(),
    TestNumberOfRowsWithNANs(),
    TestNumberOfConstantColumns(),
    TestNumberOfDuplicatedRows(),
    TestNumberOfDuplicatedColumns(),
    TestColumnsType(),
    TestNumberOfDriftedFeatures(), 
])

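# Run the checks on the current data against the reference data
tests.run(reference_data=reference, current_data=current)
# Evaluating the suite as the last expression renders the HTML report in the notebook
tests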

### JSON Suite

In [32]:
tests.json()

'{"version": "0.1.53.dev0", "datetime": "2022-08-01T10:09:33.759507", "tests": [{"name": "Number of Columns with NA values", "description": "The number of columns with NA values is 0. The test threshold is eq=0.", "status": "SUCCESS", "group": "data_integrity", "parameters": {}}, {"name": "Number of Rows with NA Values", "description": "The number of rows with NA values is 0. The test threshold is eq=0 \\u00b1 1e-12.", "status": "SUCCESS", "group": "data_integrity", "parameters": {}}, {"name": "Number of Constant Columns", "description": "The number of constant columns is 0. The test threshold is lte=0.", "status": "SUCCESS", "group": "data_integrity", "parameters": {"condition": {}, "number_of_constant_columns": 0}}, {"name": "Number of Duplicate Rows", "description": "The number of duplicate rows is 0. The test threshold is eq=0 \\u00b1 1e-12.", "status": "SUCCESS", "group": "data_integrity", "parameters": {"condition": {"eq": {"value": 0.0, "relative": 0.1, "absolute": 1e-12}}, "num
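
The string returned by `tests.json()` can be parsed with the standard `json` module to act on the results programmatically, for example to stop a pipeline when any test does not succeed. The sketch below works on a shortened sample that mirrors the fields visible in the output above (`tests`, `name`, `status`, `group`); the helper `summarize` is a hypothetical name, and the real payload may carry additional fields.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple

# Shortened sample mirroring the structure of the output above;
# in the notebook this string would come from tests.json().
sample = json.dumps({
    "tests": [
        {"name": "Number of Columns with NA values", "status": "SUCCESS", "group": "data_integrity"},
        {"name": "Number of Rows with NA Values", "status": "SUCCESS", "group": "data_integrity"},
        {"name": "Number of Constant Columns", "status": "SUCCESS", "group": "data_integrity"},
    ]
})


def summarize(payload: str) -> Tuple[Dict[str, int], List[str]]:
    """Count tests per status and list the names of tests that did not succeed."""
    results = json.loads(payload)["tests"]
    counts = Counter(test["status"] for test in results)
    not_passed = [test["name"] for test in results if test["status"] != "SUCCESS"]
    return dict(counts), not_passed


counts, not_passed = summarize(sample)
print(counts)
print(not_passed)
```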

### Preset

In [34]:
data_stability = TestSuite(tests=[
    DataStability(),
])

data_stability.run(reference_data=reference, current_data=current)
data_stability

## Dashboard

In [33]:
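# Build a dashboard with data drift and numerical target drift tabs
drift_dashboard = Dashboard(tabs=[DataDriftTab(), NumTargetDriftTab()])
# Compare the current data against the reference data, then render the result inline
drift_dashboard.calculate(reference, current)
drift_dashboard.show()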