<img src="https://storage.googleapis.com/arize-assets/arize-logo-white.jpg" width="200"/>

# **Arize and MongoDB Walkthrough**

Let's get started on using Arize with MongoDB! ✨

**MongoDB** is a NoSQL database that lets you store and retrieve data in a flexible, scalable, and efficient manner. **Arize** is an observability and monitoring tool that helps you validate model experiments and versions before launch, then benchmark, monitor, and visualize model performance, data drift, data quality, and explainability once the model is in production.

This notebook will walk you through how to transfer model data from MongoDB to Arize.


## ✔️ Steps for this Walkthrough
1. Retrieve data from MongoDB
2. Define a data schema
3. Log data to Arize

In [None]:
!pip install -q arize "pymongo[srv]"

### Enter your Arize and MongoDB credentials

Your Arize Space ID and API Key can be found in the Arize UI under Space Settings.

Your MongoDB credentials can be found in the MongoDB UI under Database Access.

Your MongoDB database name and collection name can be found in the MongoDB UI by going to Overview > Browse Collections.

In [None]:
ARIZE_API_KEY = ''
ARIZE_SPACE_ID = ''

MONGO_USERNAME = ''
MONGO_PASSWORD = ''

MONGO_DB_NAME = ''
MONGO_COLLECTION_NAME = ''
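
If you would rather not paste secrets into the notebook, the same six values can be read from environment variables. The sketch below is one possible approach, not part of the original walkthrough; the helper name `load_credentials` is hypothetical, and it assumes the environment variables share the names of the notebook variables above.

```python
import os


def load_credentials(env=os.environ):
    """Read the six notebook settings from a mapping of environment variables.

    Assumption: the environment variable names match the notebook's variable
    names. Rename them here if your environment uses different ones.
    """
    names = [
        "ARIZE_API_KEY",
        "ARIZE_SPACE_ID",
        "MONGO_USERNAME",
        "MONGO_PASSWORD",
        "MONGO_DB_NAME",
        "MONGO_COLLECTION_NAME",
    ]
    # Fail early with a clear message instead of connecting with empty values.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in names}
```

To use it, call `creds = load_credentials()` and assign, for example, `ARIZE_API_KEY = creds["ARIZE_API_KEY"]`.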

### Connect to Arize


In [None]:
from arize.pandas.logger import Client
from arize.utils.types import ModelTypes, Environments, Schema, Metrics

arize_client = Client(space_id=ARIZE_SPACE_ID, api_key=ARIZE_API_KEY)

### Connect to MongoDB

In [None]:
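from urllib.parse import quote_plus

from pymongo.mongo_client import MongoClient
from pymongo.server_api import ServerApi

# Percent-encode the credentials so characters such as "@" or ":" do not break the URI.
# Replace the host below with the one from your own cluster's connection string.
uri = f"mongodb+srv://{quote_plus(MONGO_USERNAME)}:{quote_plus(MONGO_PASSWORD)}@cluster0.lq406.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0"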

# Create a new client and connect to the server
client = MongoClient(uri, server_api=ServerApi('1'))

# Send a ping to confirm a successful connection
try:
    client.admin.command('ping')
    # print(client.list_database_names())
    database = client[MONGO_DB_NAME]
    
    # print(database.list_collection_names())
    collection = database[MONGO_COLLECTION_NAME]
    
    print("Pinged your deployment. You successfully connected to MongoDB!")
except Exception as e:
    print(e)

### Retrieve data from MongoDB
This example simply pulls the full collection from MongoDB and converts it into a pandas dataframe. See MongoDB's [documentation](https://www.mongodb.com/docs/languages/python/pymongo-driver/current/read/retrieve/#std-label-pymongo-retrieve-find-multiple) for more information on how to query specific data.

In [None]:
results = collection.find({})

In [None]:
import pandas as pd
df = pd.DataFrame(results)
df.head()
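
Pulling the full collection is fine for a small dataset, but you may want only a subset of documents or fields. The sketch below wraps `collection.find` with a filter, a projection that drops MongoDB's `_id` field, and an optional limit. The helper name `fetch_documents` and the example filter are illustrative assumptions; only drop `_id` if you do not need it downstream.

```python
def fetch_documents(collection, query=None, fields=None, limit=0):
    """Return matching documents as a list of dicts, without the _id field.

    `collection` is expected to behave like a PyMongo collection: `find`
    takes a filter and a projection and returns a cursor with `limit`.
    """
    # Excluding _id may be combined with a list of included fields.
    projection = {"_id": 0}
    if fields:
        projection.update({name: 1 for name in fields})

    cursor = collection.find(query or {}, projection)
    if limit:
        cursor = cursor.limit(limit)
    return list(cursor)
```

For example, `pd.DataFrame(fetch_documents(collection, {"mean radius": {"$gt": 15}}, limit=1000))` would load at most 1000 documents whose `mean radius` exceeds 15, assuming your collection has that field.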

### Define a data schema
Set up your schema with the appropriate feature and column names for your dataset. For more information on uploading data into Arize, see [our documentation](https://docs.arize.com/arize/machine-learning/how-to-ml/upload-data-to-arize).

In [None]:
schema = Schema(
    actual_label_column_name="actual_label",
    prediction_label_column_name="prediction_label",
    feature_column_names=[
       'mean radius', 'mean texture', 'mean perimeter', 'mean area',
       'mean smoothness', 'mean compactness', 'mean concavity',
       'mean concave points', 'mean symmetry', 'mean fractal dimension',
       'radius error', 'texture error', 'perimeter error', 'area error',
       'smoothness error', 'compactness error', 'concavity error',
       'concave points error', 'symmetry error',
       'fractal dimension error', 'worst radius', 'worst texture',
       'worst perimeter', 'worst area', 'worst smoothness',
       'worst compactness', 'worst concavity', 'worst concave points',
       'worst symmetry', 'worst fractal dimension'
       ]
)
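
Typing out every feature name by hand is error-prone for wide datasets. As a rough alternative, the feature list can be derived from the dataframe's columns by excluding the non-feature ones. This is a sketch under the assumption that every column other than `_id`, `actual_label`, and `prediction_label` is a feature; the helper names are hypothetical.

```python
def select_feature_columns(columns, excluded=("_id", "actual_label", "prediction_label")):
    """Return every column name not listed in `excluded`, preserving order."""
    excluded = set(excluded)
    return [name for name in columns if name not in excluded]


def find_missing_columns(columns, required):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]
```

You could then pass `feature_column_names=select_feature_columns(df.columns)` to `Schema`, and call `find_missing_columns(df.columns, ["actual_label", "prediction_label"])` beforehand to confirm that the label columns exist in your data.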

### Upload data to Arize

In [None]:
response = arize_client.log(
    dataframe=df,
    schema=schema,
    model_id='breast_cancer_dataset',
    model_version='v1',
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    metrics_validation=[Metrics.CLASSIFICATION],
    environment=Environments.PRODUCTION
)
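
It is worth confirming that the upload succeeded before moving on. The sketch below assumes that `response` behaves like an HTTP response object with `status_code` and `text` attributes; check this against the version of the Arize SDK you have installed. The helper name `summarize_log_response` is hypothetical.

```python
def summarize_log_response(response):
    """Return (ok, message) for an HTTP-style response object.

    Assumption: the object exposes `status_code` and `text`.
    """
    if response.status_code == 200:
        return True, "Successfully logged data to Arize"
    return False, f"Logging failed with status {response.status_code}: {response.text}"
```

Calling `ok, message = summarize_log_response(response)` and then `print(message)` gives a quick pass/fail readout after the upload cell.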