# Model Deployment

In [1]:
from huggingface_hub import login

login()


In [2]:
from huggingface_hub import HfApi

# Initialize the API
api = HfApi()

# Replace with your Hugging Face username
username = "kangelamw" 

# Repo name
repo_name = "negative-reviews-into-actionable-insights"
repo_id = f"{username}/{repo_name}"

# Path to model directory
model_path = '../models/phi-2_full_2' 

# Create repo and upload
api.create_repo(repo_id, exist_ok=True)
api.upload_folder(
    folder_path=model_path,
    repo_id=repo_id,
    repo_type="model"
)

CommitInfo(commit_url='https://huggingface.co/kangelamw/negative-reviews-into-actionable-insights/commit/c1139493c5e3fc7c2da10c1e688a75fcff04bf65', commit_message='Upload folder using huggingface_hub', commit_description='', oid='c1139493c5e3fc7c2da10c1e688a75fcff04bf65', pr_url=None, repo_url=RepoUrl('https://huggingface.co/kangelamw/negative-reviews-into-actionable-insights', endpoint='https://huggingface.co', repo_type='model', repo_id='kangelamw/negative-reviews-into-actionable-insights'), pr_revision=None, pr_num=None)

In [6]:
# The repo also needs a model card with metadata
from huggingface_hub import ModelCard, ModelCardData

# Add model tags to the repository -- for searchability
card_data = ModelCardData(
    language="en",
    license="mit",
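    # Note: the uploaded folder holds full weights (no adapter_config.json),
    # so "transformers" may describe this repo more accurately than "peft".
    library_name="peft",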
    base_model="microsoft/phi-2",
    tags=["text-generation", "peft", "lora", "review-analysis", "business-intelligence"]
)

# Create and push model card.
# Note: from_template builds a new card from the default template, so
# push_to_hub replaces any README.md already in the repo.
try:
    card = ModelCard.from_template(
        card_data,
        model_id=repo_id,
    )
    card.push_to_hub(repo_id)
    print("Model card with required tags updated successfully")
    print("Your model should now be deployable on the Inference API")
except Exception as e:
    print(f"Error updating model card: {e}")

Model card with required tags updated successfully
Your model should now be deployable on the Inference API


In [8]:
# Quick check if model was pushed successfully
try:
    # List the files in the repository
    files = api.list_repo_files(repo_id)
    print(f"Success! Repository contains {len(files)} files.")
    print(f"Files: {files}...")
    print(f"View model at: https://huggingface.co/{repo_id}")
except Exception as e:
    print(f"Error: {e}")

Success! Repository contains 13 files.
Files: ['.gitattributes', 'README.md', 'added_tokens.json', 'config.json', 'generation_config.json', 'merges.txt', 'model-00001-of-00002.safetensors', 'model-00002-of-00002.safetensors', 'model.safetensors.index.json', 'special_tokens_map.json', 'tokenizer.json', 'tokenizer_config.json', 'vocab.json']...
View model at: https://huggingface.co/kangelamw/negative-reviews-into-actionable-insights
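
Listing the files confirms that the upload ran, but not that everything needed to load the model is present. Below is a minimal sketch of a stricter check. It assumes the four file names in `REQUIRED_FILES` (taken from the listing above) are the ones that matter for this checkpoint; `REQUIRED_FILES` and `missing_files` are illustrative names, not part of `huggingface_hub`.

```python
# Files assumed necessary to load the model and tokenizer. This set is an
# assumption based on the listing above, not an official requirement list.
REQUIRED_FILES = {
    "config.json",
    "model.safetensors.index.json",
    "tokenizer.json",
    "tokenizer_config.json",
}


def missing_files(repo_files, required=REQUIRED_FILES):
    """Return the required file names absent from repo_files, sorted."""
    return sorted(set(required) - set(repo_files))
```

Calling `missing_files(api.list_repo_files(repo_id))` should return an empty list when all of the assumed files made it to the Hub.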


## Deployment

**Note: The model is not ready for prod.**

I want to serve the model as an API for a web app, preferably on Azure. The options I am considering:

- **Azure Machine Learning (Azure ML):** Best for managed inference and scalability.
- **Azure Functions / Azure App Service:** For deploying a FastAPI-based API.

#### **Model Hosting Choices**
There are two ways to host the model: load it myself with `transformers` and serve it from my own infrastructure, or use a managed Hugging Face service such as **[Inference Endpoints](https://endpoints.huggingface.co/)**:

- **Self-Managed on Azure**  
  - Deploy a **FastAPI** or **Flask** server hosting the model.
  - Use a **GPU-powered VM** for efficient inference.

- **Hugging Face Inference API**  
  - A fully managed solution for serving models.  
  - I found this guide: [Hugging Face Inference Providers](https://huggingface.co/blog/inference-providers).
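
For the self-managed option, a FastAPI server could look roughly like the sketch below. This is not a tested deployment: the `/insights` route, the `ReviewIn` schema, the `clean_review` helper, the character cap, and the generation settings are all assumptions made for illustration, and the review text is passed to the model without any prompt template. Heavy imports happen inside `create_app` so the module can be imported without loading the model.

```python
MODEL_ID = "kangelamw/negative-reviews-into-actionable-insights"
MAX_REVIEW_CHARS = 2000  # assumed cap on input length, not a tuned value


def clean_review(text: str) -> str:
    """Strip whitespace and truncate; reject empty input."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("review must not be empty")
    return cleaned[:MAX_REVIEW_CHARS]


def create_app():
    """Build the FastAPI app; the model is loaded once, when this runs."""
    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)

    class ReviewIn(BaseModel):  # hypothetical request schema
        review: str

    app = FastAPI()

    @app.post("/insights")  # hypothetical route name
    def insights(body: ReviewIn):
        try:
            text = clean_review(body.review)
        except ValueError as err:
            raise HTTPException(status_code=422, detail=str(err))
        # Generation settings are placeholders.
        result = generator(text, max_new_tokens=128, return_full_text=False)
        return {"insight": result[0]["generated_text"]}

    return app
```

If this were saved as `app.py`, it could be started with `uvicorn app:create_app --factory`.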

#### **Deployment Process on Azure**
To deploy the model efficiently:

1. **Containerize the Model API using Docker**  
   - Write a `Dockerfile` to package the model and API.

2. **Push to Azure Container Registry (ACR)**  
   - Store the container image in **Azure ACR** for deployment.

3. **Deploy the API on Azure**  
   - Use **Azure App Service** (simpler for REST API hosting).  
   - Or deploy on **Azure Kubernetes Service (AKS)** for scalable inference.
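
The container steps above (build the image, then push it to ACR) can be sketched as a small helper that assembles the CLI commands. The registry name `myregistry`, the image name `review-insights-api`, and the tag are hypothetical, and the sketch assumes the default `<registry>.azurecr.io` login server and a `Dockerfile` in the build context.

```python
import shlex


def acr_image_ref(registry: str, image: str, tag: str = "latest") -> str:
    """Image reference on the registry's default login server."""
    return f"{registry}.azurecr.io/{image}:{tag}"


def build_and_push_commands(registry, image, tag="latest", context="."):
    """Return the commands to build, authenticate, and push, in order."""
    ref = acr_image_ref(registry, image, tag)
    return [
        ["docker", "build", "-t", ref, context],     # build and tag the image
        ["az", "acr", "login", "--name", registry],  # log Docker in to ACR
        ["docker", "push", ref],                     # upload the image
    ]


if __name__ == "__main__":
    # Print the commands instead of running them (hypothetical names).
    for cmd in build_and_push_commands("myregistry", "review-insights-api", "v1"):
        print(shlex.join(cmd))
```

To execute the commands rather than print them, each list can be passed to `subprocess.run(cmd, check=True)`.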

This setup would let my **fine-tuned Phi-2 model** serve as an API for a web app. App Service keeps hosting simple, while AKS leaves room to scale if traffic grows; actual cost and latency still need to be measured.
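
From the web app's side, calling such an API could look like the sketch below. It assumes the deployed service exposes a POST route named `/insights` that accepts JSON with a `review` field and returns JSON with an `insight` field; those names and the base URL are hypothetical.

```python
import json
import urllib.request


def build_insight_request(base_url: str, review: str) -> urllib.request.Request:
    """Build a JSON POST request for the assumed /insights route."""
    payload = json.dumps({"review": review}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/insights",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_insight(base_url: str, review: str, timeout: float = 30.0) -> str:
    """Send the request and return the assumed 'insight' field."""
    request = build_insight_request(base_url, review)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["insight"]
```

Splitting request construction from sending keeps the payload format easy to check without a live endpoint.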