# YouTube Video Popularity Prediction Project

# Notebook 4: Model Deployment and API Integration

In this notebook, we deploy the trained model on an AWS EC2 virtual machine and serve it with a FastAPI server, exposing an API endpoint that returns real-time predictions in response to HTTP requests.

### Create EC2 Instance


1. Go to the EC2 Console:

- In the AWS Management Console, go to the Amazon EC2 Console.

2. Launch a New Instance:


- Click Launch Instance.
- Choose an Amazon Machine Image (AMI), such as Amazon Linux 2 AMI (HVM), SSD Volume Type. (The default is free-tier eligible.)
- Choose an Instance Type. t2.micro is a good choice for low-cost, low-traffic applications.
- Click Next: Configure Instance Details.
- Configure the instance details as needed. For basic use, the default settings are sufficient.
- Click Next: Add Storage.
- Add storage if needed; otherwise, the default 8 GB is usually enough.
- Click Next: Add Tags.
- Add tags to help identify your instance, such as Key: Name, Value: MyModelAPI.
- Click Next: Configure Security Group.
- Add security group rules to allow HTTP, SSH, and API access:
  - Type: HTTP, Port Range: 80, Source: Anywhere
  - Type: SSH, Port Range: 22, Source: My IP (or 0.0.0.0/0 for access from anywhere, which is less secure)
  - Type: Custom TCP, Port Range: 5000, Source: My IP (or 0.0.0.0/0 for access from anywhere, which is less secure)
- Create a new key pair to access the instance, and save the .pem file for SSH access.
- Click Review and Launch.
- Click Launch.
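
The security group rules above can also be written down as data. The following is a minimal sketch, assuming the boto3 EC2 client and its `authorize_security_group_ingress` call; `build_ingress_rules` and `apply_ingress_rules` are hypothetical helper names, and the CIDR `203.0.113.7/32` is a placeholder, not a value from this project.

```python
def build_ingress_rules(my_ip_cidr, api_port=5000):
    """Return IpPermissions entries mirroring the three rules listed above."""
    def tcp_rule(port, cidr, description):
        return {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": description}],
        }

    return [
        tcp_rule(80, "0.0.0.0/0", "HTTP from anywhere"),
        tcp_rule(22, my_ip_cidr, "SSH from my IP"),
        tcp_rule(api_port, my_ip_cidr, "FastAPI server from my IP"),
    ]


def apply_ingress_rules(group_id, rules, region_name=None):
    """Attach the rules to an existing security group (needs AWS credentials)."""
    import boto3  # imported lazily so the rule builder works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=rules)


# Placeholder CIDR: replace with your own public IP followed by /32
rules = build_ingress_rules("203.0.113.7/32")
```

Calling `apply_ingress_rules` with the ID of your security group would attach these rules; it is left uncalled here because it needs AWS credentials.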

### Connect to Your EC2 Instance
1. Obtain the Public DNS: 
- Once the instance is running, select it from the instances list.
- Copy the Public DNS (IPv4) address for the instance.
- Example: 'ec2-5-20-1-204.ap-southeast-2.compute.amazonaws.com'

2. Connect to Instance:
- Open a terminal (or use an SSH client like PuTTY on Windows).
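- Restrict the permissions on the key file (ssh rejects private keys that other users can read), then connect using the key pair and the public DNS:
```
chmod 400 /path/to/your-key-pair.pem
ssh -i /path/to/your-key-pair.pem ec2-user@your-ec2-public-dns
```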

### Set up the Environment on EC2 Instance
1. Run the following commands to update the instance and install necessary packages:
```
sudo yum update -y
sudo yum install python3 git -y
mkdir youtubeMeta
cd youtubeMeta
```

2. Install Required Python Packages in a virtual environment:
```
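python3 -m venv myenv
source myenv/bin/activate
# Upgrade pip inside the virtual environment
python3 -m pip install --upgrade pip
# boto3 is required by main.py to download the model from S3
pip install joblib scikit-learn numpy fastapi uvicorn boto3
# Only needed if the server is started with sudo, which bypasses the virtual environment
sudo python3 -m pip install joblib scikit-learn numpy fastapi uvicorn boto3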
```

3. Create `main.py` in the youtubeMeta directory:
```
nano main.py
```
Paste the following code into main.py and fill in your AWS S3 credentials:

```
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np
import os
import boto3

app = FastAPI()

# Download the model file from S3 to the local disk
def download_model_from_s3(bucket_name, object_key, local_file_name):
    s3 = boto3.client(
        's3',
        aws_access_key_id="",      # add your AWS access key ID
        aws_secret_access_key="",  # add your AWS secret access key
    )
    try:
        s3.download_file(bucket_name, object_key, local_file_name)
    except Exception as e:
        # This runs at startup, outside any request, so raise a plain error
        raise RuntimeError(f"Error downloading model: {e}") from e

# Define the S3 bucket and object key
S3_BUCKET_NAME = 'electric-scooters'
S3_OBJECT_KEY = 'model/model.pkl'
LOCAL_MODEL_PATH = 'model.pkl'

# Download the model if it does not exist locally
if not os.path.exists(LOCAL_MODEL_PATH):
    download_model_from_s3(S3_BUCKET_NAME, S3_OBJECT_KEY, LOCAL_MODEL_PATH)

# Load the model once at startup so every request reuses it
model = joblib.load(LOCAL_MODEL_PATH)

# Request body: a flat list of feature values for a single video
class InputData(BaseModel):
    input: list

# Health check used to confirm that the server is up
@app.get('/')
def live():
    return {"status":"running"}

@app.post("/predict")
def predict(data: InputData):
    try:
        # Reshape the flat list into one row: (1, n_features)
        input_data = np.array(data.input).reshape(1, -1)
        prediction = model.predict(input_data)
        return {"prediction": int(prediction[0])}
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))

# Allows starting the server with "python3 main.py"
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=5000)
```
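
Keys pasted directly into main.py are easy to leak. As one possible alternative (an assumption, not part of the original setup), the credentials could be read from the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. `s3_client_kwargs` is a hypothetical helper name, and the key values in the example are dummies.

```python
import os


def s3_client_kwargs(environ=None):
    """Build the credential keyword arguments for boto3.client from the environment."""
    environ = os.environ if environ is None else environ
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        # Fail early with a clear message instead of sending empty credentials
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "aws_access_key_id": environ["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": environ["AWS_SECRET_ACCESS_KEY"],
    }


# Example with an explicit mapping and dummy values
example = s3_client_kwargs(
    {"AWS_ACCESS_KEY_ID": "dummy-key-id", "AWS_SECRET_ACCESS_KEY": "dummy-secret"}
)
```

In main.py this could be used as `boto3.client('s3', **s3_client_kwargs())`. boto3 also picks up these two environment variables on its own when no keys are passed to `boto3.client`.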


4. Start the FastAPI server:
- Make sure the virtual environment is activated:
```
source myenv/bin/activate
```
- Run the command (port 5000 does not require root, so `sudo` is not needed; using it would typically bypass the virtual environment):
```
uvicorn main:app --host 0.0.0.0 --port 5000
```
5. From your local browser, open the URL "http://your-ec2-public-dns:5000/" to confirm that the server is running. You should see {"status":"running"} in the browser.
6. You can use the code in notebook 3 to test out predictions on the testing dataset
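
The browser check in step 5 can also be scripted. This is a minimal sketch using only the Python standard library; `is_server_running` is a hypothetical helper name, and the `opener` parameter exists only so the function can be exercised without a live server.

```python
import json
import urllib.request


def is_server_running(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET / on the server answers with {"status": "running"}."""
    try:
        with opener(base_url.rstrip("/") + "/", timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts, ValueError covers invalid JSON
        return False
    return body == {"status": "running"}
```

For example, `is_server_running("http://your-ec2-public-dns:5000")` returns `False` when the port is blocked by the security group or the server is not running.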

### Additional Steps to Keep the Application Running
1. Install tmux
```
sudo yum install tmux -y
```
2. Start a tmux session
```
tmux
```
3. Run the FastAPI server in the tmux session
```
source myenv/bin/activate
uvicorn main:app --host 0.0.0.0 --port 5000
```
4. Detach from the tmux session:
- Press `Ctrl + B`, and then `D` to detach. The server keeps running, so you can end the SSH session without stopping it.
5. To reattach to the tmux session later (the first session is named `0` by default):
```
tmux attach-session -t 0
```

## Live API Calling

TODO:

Use notebook 3 to call the API. You can start with the prepared test split, and then create a function that processes new YouTube videos.
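
A call to the `/predict` endpoint could look like the following. This is a minimal sketch using only the Python standard library; `build_predict_request` and `predict` are hypothetical helper names, and the feature values are placeholders. The number and order of features must match what the model was trained on in the earlier notebooks.

```python
import json
import urllib.request


def build_predict_request(base_url, features):
    """Build the POST request that the /predict endpoint in main.py expects."""
    payload = json.dumps({"input": [float(x) for x in features]}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, features, timeout=10.0):
    """Send one feature vector to the server and return the predicted value."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["prediction"]


# Placeholder feature values: replace with one row of the prepared test split
request = build_predict_request("http://your-ec2-public-dns:5000", [1, 2.5, 3])
```

Looping `predict` over the rows of the test split gives a quick end-to-end check before writing the function for new videos.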