## An Introduction to Experiment Tracking with Weights & Biases

### Setup Dependencies

In [19]:
pip install wandb

Note: you may need to restart the kernel to use updated packages.


### Import Libraries

In [20]:
import wandb
import pickle 

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris 
from sklearn.metrics import accuracy_score, mean_squared_error

### Initialize a Weights & Biases Run

At the beginning of a script or notebook, calling wandb.init() starts a background process that syncs and logs data as a W&B Run.

In [21]:
wandb.init(project='mlops-zoomcamp-wandb', name='experiment-1')

[34m[1mwandb[0m: Currently logged in as: [33mtpitsuev[0m ([33mpizzu[0m). Use [1m`wandb login --relogin`[0m to force relogin


### Load the Iris Dataset

This data set consists of the sepal and petal measurements (length and width) of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray: 150 samples, 4 features each.

In [22]:
X, y = load_iris(return_X_y=True)
label_names = ['Setosa', 'Versicolour', 'Virginica']

### Training Model and Experiment Tracking 

Define model config or other hyperparameters using wandb.config.

In [23]:
# Log your model config to Weights & Biases
params = {"C": 0.22, "random_state": 42}
wandb.config.update(params)  # record the hyperparameters in the run's config

Define and train a Logistic Regression model 

In [24]:
model = LogisticRegression(**params).fit(X, y)
y_pred = model.predict(X)
y_probas = model.predict_proba(X)

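Log your metrics to Weights & Biases using wandb.log. Because the model is scored on the same data it was trained on, these are training metrics, not estimates of how well it generalizes.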

In [25]:
wandb.log({
    "accuracy": accuracy_score(y, y_pred),
    "mean_squared_error": mean_squared_error(y, y_pred)
})
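To make the two logged numbers concrete, here is a minimal sketch that computes them by hand. The function names and the toy labels are hypothetical and are not the Iris results. It only illustrates the definitions that scikit-learn's `accuracy_score` and `mean_squared_error` implement.

```python
def accuracy_by_hand(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_by_hand(y_true, y_pred):
    """Average squared difference between the integer class labels."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels for three classes encoded as 0, 1, 2.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 0]

acc = accuracy_by_hand(y_true, y_pred)  # 4 of 6 predictions are correct
mse = mse_by_hand(y_true, y_pred)       # (1**2 + 2**2) / 6
print(f"accuracy={acc:.4f} mse={mse:.4f}")
```

Note that mean squared error on class labels depends on the integer encoding: in the toy data, mistaking class 2 for class 0 costs four times as much as mistaking class 1 for class 2. When every mistake is off by exactly one class index, the MSE equals the error rate, which is consistent with the accuracy and MSE in this notebook's run summary adding up to 1.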

### Visualize and Compare Plots using Weights & Biases

The ROC curve plots the true positive rate (y-axis) against the false positive rate (x-axis) as the decision threshold varies. The ideal operating point is TPR = 1 and FPR = 0, the top-left corner of the plot. The curve is usually summarized by the area under it (AUC-ROC): the greater the AUC-ROC, the better.

In [26]:
wandb.sklearn.plot_roc(y, y_probas, labels=label_names)
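As a rough illustration of what an ROC curve contains, the sketch below builds the (FPR, TPR) points for a single binary problem and integrates them with the trapezoidal rule. It assumes binary labels with at least one positive and one negative, and the helper names and toy scores are hypothetical. For a three-class problem such as Iris, the same idea would apply to each class in turn, treating it as positive and the other two as negative.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, strictest first."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical toy data: 1 = positive class, scores = predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
roc_auc = area_under_curve(points)
print(points)
print(f"AUC-ROC={roc_auc:.4f}")
```

The curve always starts at (0, 0) and ends at (1, 1). A classifier that ranks every positive above every negative passes through the top-left corner and has an area of 1.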

The precision-recall curve shows the tradeoff between precision and recall across decision thresholds. A large area under the curve means both high precision and high recall: high precision corresponds to few false positives among the predicted positives, and high recall corresponds to few false negatives. High scores for both show that the classifier returns accurate results (high precision) and finds most of the actual positives (high recall). The PR curve is especially useful when the classes are very imbalanced.

In [27]:
wandb.sklearn.plot_precision_recall(y, y_probas, labels=label_names)
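The tradeoff can be seen by evaluating precision and recall at two thresholds. This is a minimal sketch for a single binary problem, with a hypothetical helper name and toy scores. It assumes at least one actual positive, and it reports a precision of 1.0 when nothing is predicted positive, which is one common convention rather than the only one.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    fn = sum(1 for t, s in zip(y_true, scores) if s < threshold and t == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # convention for "no predictions"
    recall = tp / (tp + fn)
    return precision, recall


# Hypothetical toy data: 1 = positive class, scores = predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

strict = precision_recall_at(y_true, scores, threshold=0.75)
lenient = precision_recall_at(y_true, scores, threshold=0.5)
print("strict :", strict)
print("lenient:", lenient)
```

Raising the threshold removes the false positive but misses one actual positive, while lowering it recovers every positive at the cost of one false alarm. The PR curve traces this exchange over all thresholds.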

The confusion matrix counts, for each actual class, how many samples were assigned to each predicted class. It's useful for assessing the quality of model predictions and for finding patterns in the predictions the model gets wrong. The diagonal holds the predictions the model got right, i.e. where the actual label equals the predicted label.

In [28]:
wandb.sklearn.plot_confusion_matrix(y, y_pred, labels=label_names)
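A confusion matrix is a table of counts, as this minimal sketch shows. The helper name and toy labels are hypothetical. The sketch puts actual classes in rows and predicted classes in columns, which matches scikit-learn's convention, and it assumes labels are integers starting at 0.

```python
def build_confusion_matrix(y_true, y_pred, n_classes):
    """Count samples per (actual class, predicted class) pair."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Hypothetical toy labels for three classes encoded as 0, 1, 2.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 0]

matrix = build_confusion_matrix(y_true, y_pred, n_classes=3)
correct = sum(matrix[i][i] for i in range(3))  # diagonal = correct predictions

for row in matrix:
    print(row)
print(f"correct={correct} of {len(y_true)}")
```

The off-diagonal entries show which classes get confused with which: here one sample of class 1 was predicted as class 2, and one sample of class 2 was predicted as class 0.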

To learn more about what the Scikit-Learn integration with Weights & Biases offers, see the official docs.

### Logging Model to Weights & Biases

Use Weights & Biases Artifacts to track datasets, models, dependencies, and results through each step of your ML pipeline. Artifacts make it easy to get a complete and auditable history of changes to your files.

In [29]:
# Save your model 
with open("logistic_regression.pkl", "wb") as f:
    pickle.dump(model, f)

# Log your model as a versioned file to Weights & Biases Artifacts
artifact = wandb.Artifact("iris-logistic-regression-model", type="model")
artifact.add_file("logistic_regression.pkl")
wandb.log_artifact(artifact)

<wandb.sdk.wandb_artifacts.Artifact at 0x23b25650340>
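Before relying on a saved file, it can help to confirm that a pickle round trip returns an equal object. The sketch below shows the pattern with a plain dictionary as a hypothetical stand-in for the fitted model, so that it runs without scikit-learn. It writes to a temporary directory and does not touch the file saved above.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the fitted model: any picklable object works here.
stand_in_model = {"C": 0.22, "random_state": 42, "classes": [0, 1, 2]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "logistic_regression.pkl")

    # Save, mirroring the cell above.
    with open(path, "wb") as f:
        pickle.dump(stand_in_model, f)

    # Load it back, as a consumer of the file would.
    with open(path, "rb") as f:
        restored = pickle.load(f)

round_trip_ok = restored == stand_in_model
print("round trip ok:", round_trip_ok)
```

Unpickling executes code embedded in the file, so only load pickles that come from a source you trust.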

### Finish the Experiment

In [30]:
wandb.finish()

Run history:
accuracy: ▁
mean_squared_error: ▁

Run summary:
accuracy: 0.96667
mean_squared_error: 0.03333
