# PyCaret End-to-End Workflow

PyCaret's main strength is a consistent, low-code API across machine learning tasks. The same function names (`setup`, `compare_models` or `create_model`, `plot_model`, `save_model`) recur in every module, so a data scientist can switch between supervised, unsupervised, and time series problems while keeping the same workflow.

## 1. Supervised Learning: Classification & Regression

This workflow is used for predictive modeling and includes critical steps like automated model comparison and hyperparameter optimization.

| Step | PyCaret Function | Purpose / Data Scientist Value |
| :--- | :--- | :--- |
| **Import Libraries** | `from pycaret.[module] import *` | Imports the specific module (`.classification` or `.regression`). |
| **Load Dataset** | `get_data('data_name')` | Loads a built-in sample dataset as a pandas DataFrame. A custom DataFrame can be passed to `setup` directly. |
| **Setup Experiment** | `setup(data, target='col', ...)` | **Automated Preprocessing** (imputation, encoding, train/test split; scaling is optional via `normalize=True`). |
| **Compare Models** | `compare_models()` | **Benchmarking:** Trains the available models with cross-validation, ranks them by a chosen metric, and returns the best one. |
| **Tune the Model** | `tune_model(best_model)` | **Optimization:** Searches a hyperparameter grid (random search by default) and returns the tuned model. |
| **Plot Diagnostics** | `plot_model(tuned_model, plot='auc')` or `plot_model(tuned_model, plot='feature')` | **Explainability:** Visualizes performance (ROC/AUC curve) or feature importance. |
| **Evaluate Model** | `evaluate_model(tuned_model)` | Opens an **Interactive Widget** (in a notebook) for browsing the available diagnostic plots. |
| **Finalize Model** | `finalize_model(tuned_model)` | Retrains the model on the full dataset, including the hold-out set, for final use. |
| **Save Model** | `save_model(final_model, 'pipeline_name')` | **MLOps:** Saves the complete pipeline (preprocessing + model) for deployment. |
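
The steps in the table above can be sketched as a single function. This is a minimal sketch, not a definitive recipe: it assumes PyCaret 3.x is installed, uses the built-in `juice` sample dataset with target column `Purchase`, and the function names `run_classification_workflow` and `score_new_data`, the seed, and the pipeline name are hypothetical choices made for illustration. Imports are kept inside the functions because every PyCaret module exports the same names (`setup`, `save_model`, ...), which would otherwise clash when several modules are used in one script.

```python
def run_classification_workflow(dataset_name="juice", target="Purchase",
                                seed=123, pipeline_name="juice_pipeline"):
    """Run setup -> compare -> tune -> finalize -> save and return the final pipeline."""
    # Local imports: each PyCaret module exposes the same function names.
    from pycaret.datasets import get_data
    from pycaret.classification import (
        setup, compare_models, tune_model, finalize_model, save_model,
    )

    data = get_data(dataset_name)                  # built-in sample dataset
    setup(data, target=target, session_id=seed)    # preprocessing + train/test split
    best_model = compare_models()                  # cross-validated benchmark
    tuned_model = tune_model(best_model)           # hyperparameter search
    final_model = finalize_model(tuned_model)      # refit on the full dataset
    save_model(final_model, pipeline_name)         # writes '<pipeline_name>.pkl'
    return final_model


def score_new_data(pipeline_name, new_data):
    """Load a saved pipeline and score a DataFrame of unseen rows."""
    from pycaret.classification import load_model, predict_model

    pipeline = load_model(pipeline_name)
    # The saved pipeline applies the same preprocessing before predicting.
    return predict_model(pipeline, data=new_data)
```

For regression, the same sequence should work after swapping the import to `pycaret.regression` and choosing a dataset with a numeric target. `plot_model` and `evaluate_model` are left out of the sketch because they produce visual output and are best called interactively on `tuned_model`.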


## 2. Time Series (Forecasting)

Used for forecasting future values of a time-indexed series. The forecast horizon (`fh`) sets how many periods ahead to predict.

| Step | PyCaret Function | Purpose / Data Scientist Value |
| :--- | :--- | :--- |
| **Import Libraries** | `from pycaret.time_series import *` | Imports the specialized forecasting module. |
| **Load Dataset** | `get_data('airline')` | Loads a sample time series (monthly airline passengers) with a date index. |
| **Setup Experiment** | `setup(data, target='col', fh=12, ...)` | Defines the **forecast horizon (`fh`)** and sets up time-series specific cross-validation. |
| **Compare Models** | `compare_models()` | Evaluates various forecasting algorithms. |
| **Plot Forecast** | `plot_model(model, plot='forecast')` | Visualizes the predicted future time window. |
| **Finalize & Save** | `finalize_model(model)` / `save_model(...)` | Finalizes the model and saves the pipeline. |
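
A minimal sketch of the forecasting steps above, assuming PyCaret 3.x with its time series module installed. The function name `run_forecasting_workflow` and the seed are hypothetical. Because the `airline` sample is a single series, no `target` is passed to `setup` here.

```python
def run_forecasting_workflow(horizon=12, seed=123):
    """Benchmark forecasters on the airline series and predict `horizon` periods ahead."""
    # Local imports avoid clashes with the same names in other PyCaret modules.
    from pycaret.datasets import get_data
    from pycaret.time_series import (
        setup, compare_models, finalize_model, predict_model,
    )

    series = get_data("airline")                  # sample series with a date index
    setup(series, fh=horizon, session_id=seed)    # fh = forecast horizon in periods
    best_model = compare_models()                 # time-series cross-validation
    final_model = finalize_model(best_model)      # refit on the full series
    return predict_model(final_model, fh=horizon) # forecast as a DataFrame
```

Calling `plot_model(best_model, plot='forecast')` after `compare_models()` would show the same forecast graphically; it is omitted here so the function only returns data.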

## 3. Unsupervised Learning: Clustering & Anomaly Detection

These modules are used for discovery and inspection rather than prediction (no `target` variable is used in the setup).

### Clustering Workflow

| Step | PyCaret Function | Purpose / Data Scientist Value |
| :--- | :--- | :--- |
| **Import Libraries** | `from pycaret.clustering import *` | Imports the clustering module. |
| **Load Dataset** | `get_data('jewellery')` | Data for customer segmentation or grouping. |
| **Setup Experiment** | `setup(data, ...)` | Initializes environment (no `target` specified). |
| **Create Model** | `create_model('kmeans', num_clusters=4)` | Trains a specific model and defines the number of segments. |
| **Assign Model** | `assign_model(model)` | Returns the dataset with a new `Cluster` column holding the **Cluster Labels** (e.g., 'Cluster 0', 'Cluster 1'). |
| **Plot Model** | `plot_model(model, plot='cluster')` | Visualizes the resulting clusters. |
| **Save Model** | `save_model(model, 'clustering_pipeline')` | Saves the pipeline so new data points can be assigned to the same clusters later. |
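
The clustering steps above might look like this in code. This is a sketch assuming PyCaret 3.x. The function name `run_clustering_workflow`, the seed, and the default pipeline name are illustrative.

```python
def run_clustering_workflow(num_clusters=4, seed=123,
                            pipeline_name="clustering_pipeline"):
    """Fit k-means on the jewellery sample and return the rows with cluster labels."""
    # Local imports avoid clashes with the same names in other PyCaret modules.
    from pycaret.datasets import get_data
    from pycaret.clustering import setup, create_model, assign_model, save_model

    data = get_data("jewellery")
    setup(data, session_id=seed)                   # no target: unsupervised
    kmeans = create_model("kmeans", num_clusters=num_clusters)
    labeled = assign_model(kmeans)                 # adds a 'Cluster' column
    save_model(kmeans, pipeline_name)              # writes '<pipeline_name>.pkl'
    return labeled
```

A quick way to inspect segment sizes on the returned DataFrame is `labeled['Cluster'].value_counts()`.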

### Anomaly Detection Workflow

| Step | PyCaret Function | Purpose / Data Scientist Value |
| :--- | :--- | :--- |
| **Import Libraries** | `from pycaret.anomaly import *` | Imports the anomaly detection module. |
| **Load Dataset** | `get_data('anomaly')` | Data for outlier identification (e.g., fraud). |
| **Setup Experiment** | `setup(data, ...)` | Initializes environment (no `target` specified). |
| **Create Model** | `create_model('iforest')` | Trains a specific anomaly detection algorithm (e.g., Isolation Forest). |
| **Assign Model** | `results = assign_model(model)` | Returns the dataset with an **Anomaly Label** column `Anomaly` (`1` = outlier, `0` = inlier) and an **Anomaly Score** column `Anomaly_Score`. |
| **Filter Outliers** | `results[results['Anomaly'] == 1]` | **Analysis:** Filters the output of `assign_model` to isolate and analyze the identified outliers. |
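
As a sketch of the anomaly detection steps, assuming PyCaret 3.x and its built-in `anomaly` sample dataset. The function name `find_outliers` and the seed are hypothetical. Note that the filter is applied to the DataFrame returned by `assign_model`, since the original data has no `Anomaly` column.

```python
def find_outliers(dataset_name="anomaly", model_id="iforest", seed=123):
    """Fit an anomaly detector and return only the rows flagged as outliers."""
    # Local imports avoid clashes with the same names in other PyCaret modules.
    from pycaret.datasets import get_data
    from pycaret.anomaly import setup, create_model, assign_model

    data = get_data(dataset_name)
    setup(data, session_id=seed)           # no target: unsupervised
    detector = create_model(model_id)      # 'iforest' = Isolation Forest
    results = assign_model(detector)       # adds 'Anomaly' and 'Anomaly_Score'
    return results[results["Anomaly"] == 1]
```

Sorting the returned rows by `Anomaly_Score` is one way to prioritize which outliers to review first.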