# **Module: ML Services Overview**

---

## **Introduction**

Hello and welcome to this overview of **Machine Learning (ML) Services**.  
As we have already seen, **Oracle Cloud Infrastructure (OCI)** is a portfolio of cloud services that help organizations **leverage data** for next-generation business scenarios.  
The **foundation of AI and ML services** is **data**, which fuels all intelligent applications.

At the **top layer** of the OCI architecture, we have **applications**, referring to all the ways AI is consumed — such as **business processes**, **analytics systems**, or **enterprise applications**.  
Between the **application** and **data layers**, two primary service groups exist:
- **AI Services** (prebuilt models)
- **Machine Learning Services** (custom modeling and data science tools)

In this lesson, we’ll focus primarily on **OCI Data Science**, which is the core of Oracle’s Machine Learning Services.

---

## **What is OCI Data Science?**

**OCI Data Science** is a **cloud service** designed to support **data scientists** across the **entire machine learning lifecycle**, providing full support for **Python** and **open-source tools**.

It enables users to:
- **Build**, **train**, **deploy**, and **manage** machine learning models.  
- Work collaboratively within teams.  
- Accelerate development with built-in libraries and managed infrastructure.

### **Key Features of OCI Data Science**
- Model Catalog  
- Projects  
- JupyterLab Notebook  
- Model Training & Deployment  
- Model Management  
- Model Explanation  
- Open-Source Libraries  
- AutoML Integration  

---

## **Core Principles of OCI Data Science**

OCI Data Science is built upon three main principles:

### **1. Accelerated**
- Designed to **accelerate** the productivity of individual data scientists.  
- Provides access to **open-source libraries** and **scalable compute resources** without the need to manage infrastructure.  
- Includes **Oracle’s Accelerated Data Science (ADS) Library** to automate and streamline tasks like data exploration, visualization, and model training.

### **2. Collaborative**
- Enhances **team collaboration** by enabling sharing of assets and models.  
- Promotes **reproducibility** and **auditability** for effective teamwork and compliance.  
- Reduces duplication of work by maintaining shared, versioned assets.

### **3. Enterprise-Grade**
- Fully **integrated with OCI security and identity protocols**.  
- Runs on **managed infrastructure** — users don’t have to handle provisioning, patching, or upgrades.  
- Offers a **secure, compliant**, and **high-performance environment** for enterprise data science.

---

## **Detailed Overview of OCI Data Science**

### **Purpose**
OCI Data Science allows you to **rapidly build, train, deploy, and manage ML models** using open-source frameworks and OCI infrastructure.

### **Primary Users**
- **Data Scientists** and **Data Science Teams**  
- Developers integrating models into business systems

### **Working Environment**
- Users operate in a **JupyterLab Notebook interface**, writing Python code and using OCI resources for compute and storage.  
- Models are **saved to the Model Catalog** and **deployed** to managed environments for production use.

---

## **Key Components of OCI Data Science**

Let’s explore some of the important terms and components of OCI Data Science.

---

### **1. Projects**

- A **Project** acts as a **container** for organizing data science work.  
- Represents a **collaborative workspace** for managing assets such as:
  - Notebook sessions  
  - Models  
  - Datasets  
- A tenancy can have **unlimited projects**.

---

### **2. Notebook Sessions**

- A **Notebook Session** provides an **interactive JupyterLab environment**.  
- Comes with **preinstalled open-source libraries** and the ability to add custom ones.  
- Runs on **managed OCI infrastructure**.  
- Users can select:
  - **CPU or GPU shapes**
  - **Compute capacity**
  - **Storage allocation**

This eliminates the need for manual provisioning.

---

### **3. Conda Environments**

- **Conda** is an open-source **environment and package management system** originally created for Python programs; it can also manage packages for other languages.  
- Used in OCI Data Science to:
  - Install, run, and update packages and dependencies easily.  
  - Create, save, load, and switch between environments within notebook sessions.  
- Ensures **isolated and reproducible development environments**.
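
As a rough illustration of what "isolated and reproducible" means in practice, the sketch below records which environment, interpreter, and package versions are active in the current Python process. It uses only the standard library. `CONDA_DEFAULT_ENV` is the variable that `conda activate` normally sets; the helper name `describe_environment` is hypothetical and not part of OCI Data Science.

```python
import os
import sys
from importlib import metadata


def describe_environment() -> dict:
    """Snapshot the active environment so a run can be reproduced later."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return {
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "python": sys.version.split()[0],
        "prefix": sys.prefix,
        "packages": dict(sorted(packages.items())),
    }


snapshot = describe_environment()
print("conda env:", snapshot["conda_env"])
print("python:", snapshot["python"], "at", snapshot["prefix"])
print("installed packages:", len(snapshot["packages"]))
```

Running this in two different environments within the same notebook session should show different prefixes and package sets, which is the isolation the bullets above describe.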

---

### **4. Accelerated Data Science (ADS) SDK**

- The **Accelerated Data Science SDK (ADS)** is Oracle’s **Python library** built into OCI Data Science.  
- Provides high-level automation for:
  - Connecting to data sources  
  - Exploring and visualizing data  
  - Training models (including **AutoML**)  
  - Evaluating and explaining models  

- ADS also provides an easy interface to access:
  - **Model Catalog**
  - **Object Storage**
  - Other **OCI Services**

This SDK simplifies and speeds up the data science workflow considerably.
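
The following is a minimal sketch of how ADS might be used to package a trained estimator and save it to the Model Catalog from inside a notebook session. It assumes the `oracle-ads` package is installed and that resource principal authentication is available; the helper name `save_model_to_catalog` and its parameters are hypothetical, and exact ADS signatures can differ between versions, so treat it as a starting point and check it against the installed release.

```python
def save_model_to_catalog(estimator, artifact_dir, conda_slug, display_name):
    """Hypothetical helper: package an estimator and save it to the Model Catalog.

    Assumes it runs where ADS is installed and OCI authentication is set up,
    for example inside an OCI Data Science notebook session.
    """
    # Imported lazily so this file can still be loaded where ADS is missing.
    import ads
    from ads.model.generic_model import GenericModel

    # Resource principal auth is a common choice inside a notebook session.
    ads.set_auth(auth="resource_principal")

    model = GenericModel(estimator=estimator, artifact_dir=artifact_dir)
    # Generates the artifact files (such as score.py and runtime.yaml).
    model.prepare(inference_conda_env=conda_slug, force_overwrite=True)
    # Uploads the artifact and returns the identifier of the new catalog entry.
    return model.save(display_name=display_name)
```

The function is only defined here, not called, because it needs a live OCI environment. The `conda_slug` argument would name whichever conda environment the model should run in.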

---

### **5. Models**

- A **Model** represents a **mathematical abstraction** of data and business logic.  
- Typically created in **notebook sessions** and organized under a **project**.  
- Models are central to the machine learning process: they are **trained**, **evaluated**, and then used for **prediction**.
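
To make the idea of a model as a mathematical abstraction concrete, here is a minimal sketch that fits a straight line to a few points, evaluates it on held-out data, and uses it for prediction. The data values and the helper names (`fit_line`, `predict`, `mean_absolute_error`) are made up for illustration; real work in OCI Data Science would typically rely on an open-source library instead of hand-written least squares.

```python
from statistics import mean


def fit_line(xs, ys):
    """Train: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Predict: apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Evaluate: average absolute gap between predictions and true values."""
    return mean(abs(predict(model, x) - y) for x, y in zip(xs, ys))


# Illustrative data: the first four points train the model, the last two are held out.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

model = fit_line(xs[:4], ys[:4])
error = mean_absolute_error(model, xs[4:], ys[4:])
print("slope and intercept:", model)
print("held-out mean absolute error:", error)
print("prediction for x = 7:", predict(model, 7.0))
```

Here the "model" is just the pair of numbers returned by `fit_line`, which is the kind of artifact that would later be saved and served.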

---

### **6. Model Catalog**

- The **Model Catalog** is a **centralized repository** to **store, track, share, and manage models**.  
- It stores important **metadata**, including:
  - Version information  
  - Git repository details  
  - Notebooks or scripts used for training  
- Models can be:
  - **Shared** among team members  
  - **Reused** or **reloaded** into new notebook sessions  
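
The kind of bookkeeping a model catalog performs can be sketched with a small in-memory stand-in. This is not the OCI API: the `ModelRecord` and `ToyCatalog` classes, their fields, and the repository URL are all hypothetical, chosen only to mirror the metadata listed above (version, Git details, training script).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelRecord:
    """Provenance metadata stored alongside one saved model version."""
    name: str
    version: int
    git_repo: str
    git_commit: str
    training_script: str


class ToyCatalog:
    """In-memory stand-in for a shared model repository (not the OCI API)."""

    def __init__(self) -> None:
        self._records: Dict[str, List[ModelRecord]] = {}

    def save(self, name: str, git_repo: str, git_commit: str,
             training_script: str) -> ModelRecord:
        # Each save under the same name gets the next version number.
        versions = self._records.setdefault(name, [])
        record = ModelRecord(name, len(versions) + 1, git_repo,
                             git_commit, training_script)
        versions.append(record)
        return record

    def latest(self, name: str) -> ModelRecord:
        return self._records[name][-1]

    def history(self, name: str) -> List[ModelRecord]:
        return list(self._records[name])


catalog = ToyCatalog()
repo = "https://git.example.invalid/team/churn.git"  # placeholder URL
catalog.save("churn", repo, "a1b2c3d", "train.py")
catalog.save("churn", repo, "e4f5a6b", "train.py")
print(catalog.latest("churn"))
```

Because every record keeps the commit and script that produced it, a teammate can trace any version back to its training code, which is what makes shared models reusable.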

---

### **7. Model Deployments**

- **Model Deployments** allow models from the Model Catalog to be served as **HTTP API endpoints**.  
- Enables web applications and other clients to request **real-time predictions** over HTTP.  
- Provides **fully managed hosting** infrastructure for production workloads.

**Use Case Examples:**
- Fraud detection in financial systems  
- Real-time recommendation systems  
- Predictive maintenance dashboards  
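
From the client side, calling a deployed model amounts to sending an HTTP POST with the input data. The sketch below only builds such a request with the standard library and does not send it. The endpoint URL and the `{"data": ...}` payload shape are placeholders, not values from this lesson, and an actual call would also need to be authenticated through OCI identity.

```python
import json
import urllib.request

# Placeholder endpoint; each real deployment exposes its own URL.
ENDPOINT = "https://modeldeployment.example.invalid/example-deployment/predict"


def build_predict_request(endpoint: str, rows: list) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the input rows."""
    body = json.dumps({"data": rows}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(ENDPOINT, [[5.1, 3.5, 1.4, 0.2]])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The payload a deployed model accepts is defined by its own scoring code, so the JSON shape here would need to match whatever the model expects.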

---

### **8. Jobs**

- **Data Science Jobs** are **repeatable machine learning tasks** that can be scheduled and executed on **managed infrastructure**.  
- Ideal for:
  - Running batch predictions  
  - Automating training workflows  
  - Performing periodic data updates  
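
A job is usually just a script that can be rerun with different settings. The sketch below shows one possible shape for a batch prediction script, assuming the run's settings arrive as environment variables. `SCORE_THRESHOLD`, the toy `score` function, and the hard-coded rows are all hypothetical stand-ins for a real model and real input data.

```python
import os


def score(row):
    """Toy stand-in for a trained model: the mean of the feature values."""
    return sum(row) / len(row)


def run_batch(rows, threshold):
    """Score every row and flag the ones above the threshold."""
    results = []
    for row in rows:
        value = score(row)
        results.append({"row": row, "score": value, "flagged": value > threshold})
    return results


def main(env=None):
    env = os.environ if env is None else env
    # SCORE_THRESHOLD is a hypothetical setting supplied with each run.
    threshold = float(env.get("SCORE_THRESHOLD", "0.5"))
    rows = [[0.2, 0.4], [0.9, 0.7], [0.5, 0.5]]  # placeholder for loaded input data
    return run_batch(rows, threshold)


if __name__ == "__main__":
    for result in main():
        print(result)
```

Keeping configuration outside the code is what makes the task repeatable: the same script can be run again with a different threshold without being edited.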

---

## **Summary of Key Components**

| **Component** | **Description** | **Purpose** |
|----------------|------------------|--------------|
| **Project** | Collaborative workspace | Organize notebooks, models, and data |
| **Notebook Session** | JupyterLab environment | Build and train ML models |
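| **Conda Environment** | Isolated package environment managed by Conda | Manage dependencies |
| **ADS SDK** | Oracle’s ML automation toolkit | Simplify the ML workflow |
| **Model** | Mathematical abstraction of data and business logic | Train, evaluate, and predict |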
| **Model Catalog** | Central repository for models | Store and share models |
| **Model Deployment** | Hosted endpoint for ML models | Serve predictions |
| **Jobs** | Scheduled ML tasks | Automate recurring tasks |

---

## **Conclusion**

OCI Data Science empowers **data scientists and enterprises** to **collaboratively build, train, and deploy ML models** efficiently.  
It combines the **flexibility of open-source tools** with the **security and scalability** of Oracle Cloud Infrastructure.  

By automating infrastructure management, providing built-in collaboration, and offering enterprise-grade governance, OCI Data Science ensures teams can focus on what truly matters — **turning data into business insights**.

