# Before you start with this QuickStart Notebook

In this notebook, we reuse the classic Iris modeling example to show how a few lines of code can automatically document your assets in Vectice: datasets, models, graphs, and comments.

### Pre-requisites:
Before using this notebook you will need:
* An account in Vectice
* An API token to connect to Vectice through the APIs
* The Phase Id of the project where you want to log your work

Refer to the Vectice Getting Started Guide for more detailed instructions: https://docs.vectice.com/getting-started/


### Other Resources
*   Vectice Documentation: https://docs.vectice.com/
*   Vectice API documentation: https://api-docs.vectice.com/


<div class="alert alert-primary" role="alert">
<b>Automated code lineage:</b> Code lineage is not covered in this QuickStart because it requires setting up a Git repository first.
</div>

## Install the latest Vectice Python client library

In [None]:
%pip install seaborn
%pip install scikit-learn
%pip install vectice

## Get started by connecting to Vectice

**First, we need to authenticate to the Vectice server. Before proceeding further:**

- Follow the instructions at https://docs.vectice.com/getting-started/create-an-api-token to create and copy an API token in the Vectice app.

- Paste the API token in the code below.

In [None]:
import vectice

vct = vectice.connect(api_token="my-api-token") #Paste your API token
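
Pasting the token directly into a notebook makes it easy to share by accident. One possible alternative is to read it from an environment variable. The sketch below is a minimal illustration, not part of the Vectice client: the helper `load_api_token` and the variable name `VECTICE_API_TOKEN` are hypothetical choices made for this example.

```python
import os


def load_api_token(env_var: str = "VECTICE_API_TOKEN") -> str:
    """Return the API token stored in an environment variable.

    The default variable name is a hypothetical choice for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your API token."
        )
    return token


# Usage in the notebook (requires the vectice package and a valid token):
# vct = vectice.connect(api_token=load_api_token())
```

The helper fails early with a clear message when the variable is missing or empty, rather than passing an empty token to the connection call.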

## Retrieve your To Do Phase Id inside your QuickStart project to specify where to document your work

In the Vectice UI, navigate to your personal workspace (its name is prefixed with a `.`) and open your QuickStart project. Then go to the **To Do Phase** and copy its **Phase Id** into the cell below.

In [None]:
phase = vct.phase('PHA-xxxx') #Paste your Phase Id
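
A common slip is to run the cell with the placeholder still in place. The following sketch guards against that. It assumes that Phase Ids start with `PHA-`, as the placeholder suggests; the helper `check_phase_id` is hypothetical and not part of the Vectice client.

```python
def check_phase_id(phase_id: str) -> str:
    """Return the trimmed Phase Id, or raise if it looks like the placeholder."""
    cleaned = phase_id.strip()
    # Assumption: Phase Ids start with "PHA-", as in the placeholder shown above.
    if not cleaned.startswith("PHA-"):
        raise ValueError(f"Expected an Id starting with 'PHA-', got {cleaned!r}")
    if cleaned == "PHA-xxxx":
        raise ValueError("Replace the 'PHA-xxxx' placeholder with your own Phase Id")
    return cleaned


# Usage in the notebook (requires an authenticated connection `vct`):
# phase = vct.phase(check_phase_id("PHA-1234"))  # "PHA-1234" is a made-up Id
```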

Next, we are going to create an iteration. An iteration allows you to organize your work in a repeatable sequence of steps. You can have multiple iterations within a phase.

In [None]:
iteration = phase.create_iteration()

## Auto-Document your iteration in Vectice
In this section we will prepare a modeling dataset based on the well-known Iris dataset. We will then train a k-nearest neighbors classifier using scikit-learn.
As we do this work and create these assets, we will log them and their corresponding artifacts in Vectice with a few lines of code.
This lets you document your work as you go, so you keep a record of the data that was used, the models, the code, and other artifacts.

### Log a comment

To log a comment, add a string to one of the steps of the iteration you previously created, using the `+=` operator.

For the purpose of the QuickStart, we already created a **quickstart step** where you can log your work.

In [None]:
# Log a comment inside the step quickstart of the iteration you created above
iteration.step_quickstart += "My first log into Vectice"

### Log a dataset with a graph as attachment

Use the following code block to create a local dataset and generate a graph:

In [None]:
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()

df_iris = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df_iris['species'] = iris.target_names[iris.target]
#Save your dataframe to local file
df_iris.to_csv('cleaned_dataset.csv', index=False)
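
Before logging the file, it can be worth confirming that the CSV on disk matches the dataframe in memory. The sketch below rebuilds the dataframe as in the cell above so that it runs on its own, then reads the file back and compares shape, columns, class counts, and missing values. The variable names `reloaded`, `same_shape`, `same_columns`, `rows_per_species`, and `missing_values` are introduced here for illustration only.

```python
import pandas as pd
from sklearn import datasets

# Rebuild the dataframe as in the cell above so this block runs on its own.
iris = datasets.load_iris()
df_iris = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df_iris["species"] = iris.target_names[iris.target]
df_iris.to_csv("cleaned_dataset.csv", index=False)

# Read the file back and compare it with the in-memory dataframe.
reloaded = pd.read_csv("cleaned_dataset.csv")
same_shape = reloaded.shape == df_iris.shape
same_columns = list(reloaded.columns) == list(df_iris.columns)
rows_per_species = reloaded["species"].value_counts().to_dict()
missing_values = int(reloaded.isna().sum().sum())

print("Same shape:", same_shape)
print("Same columns:", same_columns)
print("Rows per species:", rows_per_species)
print("Missing values:", missing_values)
```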

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

sns.scatterplot(data=df_iris, x='sepal length (cm)',
                y='petal width (cm)', hue='species')
# Save your graph to a local file, then display it
plt.savefig('Scatter_plot_iris.png')
plt.show()

Let's log the dataset we created, including the attachment above. <br>
The `FileResource` automatically extracts metadata from the local dataset file and collects statistics from the pandas dataframe. This information is documented in the iteration as part of a Dataset version.

In [None]:
from vectice import Dataset, FileResource

clean_dataset = Dataset.clean(
    name="Cleaned Dataset",
    resource=FileResource(paths="cleaned_dataset.csv", dataframes=df_iris),
    attachments='Scatter_plot_iris.png',
)

iteration.step_quickstart += clean_dataset

After running the cell above, you will notice an output displaying a link pointing to the iteration in the Vectice UI. Click on the link to check what you documented.

### Log a model with its associated hyper-parameters

In [None]:
from sklearn.neighbors import KNeighborsClassifier

# Instantiate the model (with the default parameters)
knn = KNeighborsClassifier()

# fit the model with data (occurs in-place)
knn.fit(df_iris[iris.feature_names],df_iris["species"])
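
The cell above fits the classifier on the full dataset, so it gives no estimate of how well the model generalizes. As an optional aside, the sketch below holds out part of the data and reports accuracy on it. The 70/30 split, the stratification, and the random seed are arbitrary choices for this illustration and are not part of the QuickStart; `knn_holdout` is a separate model so that the `knn` object above is left untouched.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()

# Hold out 30% of the rows, keeping the class proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data,
    iris.target,
    test_size=0.3,
    stratify=iris.target,
    random_state=42,
)

# Fit a separate classifier with default parameters on the training part only.
knn_holdout = KNeighborsClassifier()
knn_holdout.fit(X_train, y_train)

# Mean accuracy on the rows the model has not seen.
accuracy = knn_holdout.score(X_test, y_test)
print(f"Hold-out accuracy: {accuracy:.3f}")
```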

In [None]:
from vectice import Model

iteration.step_quickstart += Model(
    library="scikit-learn",
    technique="KNN",
    name="My first model",
    predictor=knn,
    properties=knn.get_params(),
    derived_from=[clean_dataset.latest_version_id],
)

As with the dataset, check what you documented by clicking the link in the output above.
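
The `properties` argument above receives the output of `knn.get_params()`, which is a plain dictionary of the estimator's hyper-parameters. To see what that dictionary contains for a default `KNeighborsClassifier`, you can print it with scikit-learn alone, as in this short sketch:

```python
from sklearn.neighbors import KNeighborsClassifier

# A classifier with default settings, as in the modeling cell above.
knn = KNeighborsClassifier()

# get_params() returns a dict mapping hyper-parameter names to their values.
params = knn.get_params()
for name, value in sorted(params.items()):
    print(f"{name}: {value}")
```

The exact set of keys depends on your scikit-learn version, but it includes `n_neighbors` (5 by default) and `weights` (`"uniform"` by default).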

## 🥇 Congrats! You learned how to successfully use Vectice to auto-document your assets.<br>
### Next, we encourage you to follow [part 2](https://docs.vectice.com/getting-started/quickstart-auto-document-your-work#part-2) of the QuickStart guide to continue learning about Vectice.