# References

- Vectice Documentation: https://docs.vectice.com/
- Vectice API Documentation: https://api-docs.vectice.com/

In [1]:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.pipeline import make_pipeline
from category_encoders import OneHotEncoder

## Get started by connecting to Vectice

In [3]:
import vectice as vect

vec = vect.connect(config="tut.json")

VECTICE_API_ENDPOINT is deprecated and will be removed in 23.3.1.0, please use VECTICE_HOST instead.
Welcome, Aidan. You're now successfully connected to Vectice.

To access your personal workspace, use connection.my_workspace
To access a specific workspace, use connection.workspace(Workspace ID)
To get a list of workspaces you can access and their IDs, use connection.list_workspaces()

If you are using a notebook you can call the help by using a Vectice returned object with the builtin notebook "?":
>> connection?

If you are using an IDE you can call the help() method on any object returned by Vectice:
>> help(connection)

For quick access to the list of workspaces in the Vectice web app, visit:
https://dev.vectice.com/workspaces


## Specify which project phase you want to document
In the Vectice UI, navigate to your personal workspace. Inside your default Tutorial project, go to the modeling phase, then copy and paste your Phase ID below.

In [5]:
phase = vec.phase("PHA-1177")

Phase 'Model Retraining' successfully retrieved."

For quick access to the Phase in the Vectice web app, visit:
https://dev.vectice.com/browse/phase/PHA-1177


## Next we are going to create an iteration
An iteration allows you to organize your work in repeatable sequences of steps. You can have multiple iterations within a phase.

In [6]:
retrain_iteration = phase.create_iteration()

New Iteration number '3' created.

For quick access to the Iteration in the Vectice web app, visit:
https://dev.vectice.com/browse/iteration/ITR-353


In [7]:
df_cleaned = pd.read_csv("https://raw.githubusercontent.com/vectice/GettingStarted/aidann/tutorial_update/23.2/tutorial/ProductSales%20Cleaned.csv")



In [8]:
df_cleaned.head()



| Unnamed: 0.1 | Unnamed: 0 | Ship Mode | Segment | Country | City | State | Postal Code | Region | Category | Sub-Category | Sales | Quantity | Discount | Profit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Second Class | Consumer | United States | others | others | 42420 | South | Furniture | Bookcases | 261.96 | 2 | 0.0 | 41.9136 |
| 1 | 2 | Second Class | Corporate | United States | Los Angeles | California | 90036 | West | Office Supplies | Labels | 14.62 | 2 | 0.0 | 6.8714 |
| 2 | 4 | Standard Class | Consumer | United States | others | Florida | 33311 | South | Office Supplies | Storage | 22.368 | 2 | 0.2 | 2.5164 |
| 3 | 5 | Standard Class | Consumer | United States | Los Angeles | California | 90032 | West | Furniture | Furnishings | 48.86 | 7 | 0.0 | 14.1694 |
| 4 | 6 | Standard Class | Consumer | United States | Los Angeles | California | 90032 | West | Office Supplies | Art | 7.28 | 4 | 0.0 | 1.9656 |


In [9]:
# Drop an extra column ("Postal Code") to imitate a change in the data
X = df_cleaned.drop(["Unnamed: 0", "Sales", "Postal Code"], axis=1)
y = df_cleaned["Sales"]
print(X.shape)
print(y.shape)

(7994, 11)
(7994,)




In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
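
As a quick check on what `test_size=0.2` means for the 7,994 rows above: as far as I know, scikit-learn rounds the test share up and gives the remaining rows to training. The sketch below only reproduces that arithmetic. The row count is copied from the shape printed earlier, and the variable names are illustrative.

```python
import math

# Row count taken from the shape printed above.
n_samples = 7994
test_size = 0.2

# The test share is rounded up; the remaining rows go to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```

Fixing `random_state=42` keeps the same rows in each part across reruns, which matters here because the split is later registered as a dataset version.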



## Retrieve A Previously Created Dataset
You can retrieve a variety of Vectice objects with the `browse('VECTICE-ID')` method, including Phases, Iterations, Datasets, and Models.

In [11]:
cleaned_ds = vec.browse("DTV-1629")

Dataset version 'Version 1' successfully retrieved."

For quick access to the Dataset version in the Vectice web app, visit:
https://dev.vectice.com/browse/datasetversion/DTV-1629
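
The IDs used in this notebook carry a prefix that matches the kind of object shown in the corresponding link (`PHA` for a phase, `ITR` for an iteration, `DTV` for a dataset version). The helper below is a hypothetical convenience for sanity-checking an ID before passing it to `browse`. It is not part of the Vectice API, and the mapping covers only the prefixes that appear in this notebook's output.

```python
# Prefix-to-object mapping inferred from the links in this notebook's output.
ID_PREFIXES = {
    "PHA": "phase",
    "ITR": "iteration",
    "DTV": "dataset version",
}


def describe_vectice_id(vectice_id):
    """Return the object kind for an ID such as 'PHA-1177' (hypothetical helper)."""
    prefix, _, number = vectice_id.partition("-")
    if prefix not in ID_PREFIXES or not number.isdigit():
        raise ValueError(f"Unrecognized ID: {vectice_id!r}")
    return ID_PREFIXES[prefix]


print(describe_vectice_id("DTV-1629"))
```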


## Register dataset metadata and statistics
Register dataset metadata and statistics with Vectice by passing the file resource path and a `pandas.DataFrame`.

In [None]:
train_df = X_train.copy()
test_df = X_test.copy()

train_df["Sales"] = y_train
test_df["Sales"] = y_test

train_df.to_csv("retrain dataset.csv")
test_df.to_csv("retest dataset.csv")

train_ds = vect.FileResource(paths="retrain dataset.csv", dataframes=train_df)
test_ds = vect.FileResource(paths="retest dataset.csv", dataframes=test_df)

# Create the modeling dataset with resources and the dropped column property
modeling_dataset = vect.Dataset.modeling(
    name="ProductSales Modeling",
    training_resource=train_ds,
    testing_resource=test_ds,
    derived_from=cleaned_ds,
    properties={"Dropped column": "Postal Code"},
)

## Push the dataset

Push the dataset to Vectice by assigning the `vectice.Dataset` object to the iteration's `step_model_input_data` step.

In [13]:
retrain_iteration.step_model_input_data = modeling_dataset

New Version: 'Version 2' of Dataset: 'ProductSales Modeling' added to Step: Model Input Data
Attachments: None
Link to Step: https://dev.vectice.com/browse/iteration/ITR-353



In [14]:
model = make_pipeline(OneHotEncoder(use_cat_names=True),
                     Ridge())
model.fit(X_train, y_train)
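
The pipeline one-hot encodes the categorical columns before fitting Ridge. With `use_cat_names=True`, the encoded columns are, to my understanding, named after the original column and the category value (for example `Segment_Consumer`). The pure-Python sketch below illustrates that idea on toy rows. It is not the `category_encoders` implementation, and `one_hot_sketch` is a hypothetical helper.

```python
def one_hot_sketch(rows, column):
    """Replace `column` in each row dict with 0/1 indicator columns."""
    categories = []
    for row in rows:
        if row[column] not in categories:
            categories.append(row[column])  # keep first-seen order
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Toy rows using category values that appear in the data preview above.
rows = [
    {"Segment": "Consumer", "Quantity": 2},
    {"Segment": "Corporate", "Quantity": 2},
    {"Segment": "Consumer", "Quantity": 7},
]
encoded_rows = one_hot_sketch(rows, "Segment")
print(encoded_rows[0])
```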



In [15]:
# Predict on the training data
y_train_pred = model.predict(X_train)



In [16]:
# Evaluate the model on the training data
mae_train = mean_absolute_error(y_train, y_train_pred)
print(round(mae_train, 2))

59.34
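
For readers unfamiliar with the metric, mean absolute error is the average of the absolute differences between actual and predicted values. The sketch below spells that out in plain Python on toy numbers. `mean_absolute_error_sketch` is an illustrative name, and the values are not taken from the ProductSales data.

```python
def mean_absolute_error_sketch(y_true, y_pred):
    """Average of |actual - predicted| over all rows."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one value is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values, not taken from the ProductSales data.
actual = [100.0, 50.0, 20.0]
predicted = [90.0, 65.0, 20.0]
mae_toy = mean_absolute_error_sketch(actual, predicted)
print(round(mae_toy, 2))
```

Because the errors are not squared, the result is in the same units as `Sales`.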




In [17]:
y_test_pred = model.predict(X_test)



In [18]:
mae_test = mean_absolute_error(y_test, y_test_pred)
print(round(mae_test,2))

63.37
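
Comparing the two printed scores gives a rough sense of how well the model generalizes. The sketch below only does that arithmetic on the values copied from the outputs above. The variable names are illustrative, and no acceptance threshold is implied.

```python
# Values copied from the outputs printed above.
mae_train_printed = 59.34
mae_test_printed = 63.37

# Absolute and relative gap between test and training error.
gap = mae_test_printed - mae_train_printed
relative_gap = gap / mae_train_printed

print(round(gap, 2), f"{relative_gap:.1%}")
```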




## Push a model
Push a model to Vectice by creating a `vectice.Model` object and assigning it to the iteration's `step_build_model` step.

In [19]:
vect_model = vect.Model(library="scikit-learn", technique="Ridge Regression Stage", metrics={"mae_test": round(mae_test,2)}, 
                        properties={"quarter": "Q2"}, predictor=model, derived_from=modeling_dataset)



In [22]:
retrain_iteration.step_build_model = vect_model

Model Pipeline successfully attached to Model(name='scikit-learn Ridge Regression Stage model', version='Version 2').
New Version: 'Version 2' of Model: 'scikit-learn Ridge Regression Stage model' added to Step: Build Model
Attachments: None
Link to Step: https://dev.vectice.com/browse/iteration/ITR-353



## Add a Comment
Assigning a string to a step creates a comment on that step.

In [21]:
retrain_iteration.step_model_validation = """Evaluation:\nMAE vs Threshold: 63.37 vs 0.5 - 0.65\nModel passed acceptance criteria."""
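
Since the comment is a plain string, it can be built from the computed metric instead of being typed by hand, which keeps the number in the comment in step with the model. The helper below is a hypothetical sketch. `build_validation_comment` is not a Vectice function, the threshold is passed through as text exactly as written in the comment above, and the pass/fail decision is left to the caller.

```python
def build_validation_comment(mae, threshold_text, passed):
    """Format a validation comment from a metric value (hypothetical helper)."""
    verdict = "passed" if passed else "failed"
    return (
        "Evaluation:\n"
        f"MAE vs Threshold: {mae:.2f} vs {threshold_text}\n"
        f"Model {verdict} acceptance criteria."
    )


# Metric value copied from the test-set output above.
comment = build_validation_comment(63.37, "0.5 - 0.65", passed=True)
print(comment)
```

The resulting string could then be assigned to `retrain_iteration.step_model_validation` in the same way as the literal above.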

Added Comment to Step: Model Validation

Link to Step: https://dev.vectice.com/browse/iteration/ITR-353



✴ You can view your registered assets and comments in the UI by clicking the links in the output messages.