# Movie Search and Summary API with FastAPI

This code implements a semantic search and summarization API for movie data using **FastAPI**, leveraging vector embeddings, Pinecone, and BigQuery. Here's an overview of the main components and how they work:

## Overview

### 1. Dependencies and Initialization
- **FastAPI**: The API framework used to manage requests.
- **Pinecone**: Used as a vector database for semantic search.
- **OpenAI**: Provides text embeddings to represent movie data semantically.
- **BigQuery**: Stores movie metadata and descriptions.
- **Prometheus**: Used for metrics collection and monitoring, providing insights into API performance.

### 2. API Features

#### 2.1 Search and Summarize Endpoint (`/search`)
- **Input**: Takes a search query (`query`), number of results (`top_k`), and a minimum similarity score (`min_score`).
- **Workflow**:
  - Generates an embedding of the search query using OpenAI.
  - Searches the **Pinecone** index for similar vectors.
  - Retrieves movie descriptions from **BigQuery** based on the search results.
  - Generates a summary of relevant movies using OpenAI, based on the movie descriptions.
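
The similarity search step can be illustrated without any external service. The sketch below is an in-memory stand-in for the vector search, assuming cosine similarity as the metric (the real metric depends on how the Pinecone index was created). The names `cosine_similarity`, `search_in_memory`, and the toy vectors are hypothetical and exist only for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def search_in_memory(
    query_vector: List[float],
    index: Dict[str, List[float]],
    top_k: int,
    min_score: float,
) -> List[Tuple[str, float]]:
    """Return up to top_k (id, score) pairs with score >= min_score, best first."""
    scored = [(item_id, cosine_similarity(query_vector, vec)) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Mirror the API: take the top_k nearest first, then drop weak matches.
    return [(item_id, score) for item_id, score in scored[:top_k] if score >= min_score]


# Toy 2-dimensional "embeddings" (real embeddings have many more dimensions).
toy_index = {
    "movie-a": [1.0, 0.0],
    "movie-b": [0.0, 1.0],
    "movie-c": [1.0, 1.0],
}
matches = search_in_memory([1.0, 0.0], toy_index, top_k=2, min_score=0.7)
print(matches)
```

Because the score filter is applied after the `top_k` cut, a request can return fewer than `top_k` movies.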

#### 2.2 Health Check Endpoint (`/health`)
- A health endpoint that checks connectivity to **Pinecone**, **BigQuery**, and **OpenAI** to determine if the API services are running properly.
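
Kubernetes HTTP probes treat any status code from 200 to 399 as success, so a health endpoint is most useful when a failed dependency changes the status code and not only the response body. The following is a minimal sketch of one way to aggregate per-service checks into a status code; `run_health_checks` and the stand-in checks are hypothetical names, not part of the API code.

```python
from typing import Callable, Dict, Tuple


def run_health_checks(checks: Dict[str, Callable[[], object]]) -> Tuple[int, dict]:
    """Run each check and return (http_status, body).

    A check passes if its callable returns without raising.
    """
    services = {}
    for name, check in checks.items():
        try:
            check()
            services[name] = "connected"
        except Exception as exc:  # report any failure, whatever its type
            services[name] = f"error: {exc}"

    healthy = all(state == "connected" for state in services.values())
    status_code = 200 if healthy else 503
    return status_code, {"status": "healthy" if healthy else "unhealthy", "services": services}


def failing_check() -> None:
    raise ConnectionError("timeout")


# Stand-in checks; in the API these would call the real clients.
status_code, body = run_health_checks({
    "pinecone": lambda: None,
    "bigquery": lambda: None,
    "openai": failing_check,
})
print(status_code, body)
```

Reporting each service separately also makes it clear which dependency is down.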

### 3. Key Components

#### 3.1 Lifespan Management
- The `lifespan` function, decorated with `@asynccontextmanager`, handles startup and shutdown:
  - Initializes Pinecone, OpenAI, and BigQuery clients at startup.
  - Performs cleanup during shutdown.
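
The startup/shutdown ordering can be seen in isolation with the standard library alone. This is a sketch of the pattern, not the application's code: `lifespan_sketch`, the `state` dict, and the string stand-in for a client are assumptions made for the example.

```python
import asyncio
from contextlib import asynccontextmanager

events = []  # records the order in which things happen


@asynccontextmanager
async def lifespan_sketch(state: dict):
    # Startup: create clients once and stash them on shared state.
    state["client"] = "connected"
    events.append("startup")
    try:
        yield  # the application serves requests while suspended here
    finally:
        # Shutdown: runs even if the application exits with an error.
        state.pop("client", None)
        events.append("shutdown")


async def main() -> None:
    state: dict = {}
    async with lifespan_sketch(state):
        events.append(f"serving with client={state['client']}")


asyncio.run(main())
print(events)
```

Everything before `yield` runs once before the first request, and everything after it runs once when the server stops.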

#### 3.2 Middleware
- **Logging and Metrics Middleware**:
  - Logs request details and response times.
  - Collects metrics for request count, latency, and endpoint-specific latencies for Prometheus monitoring.
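
The core of such a middleware is small: assign an ID, time the call, and record the outcome. The sketch below uses plain dictionaries in place of Prometheus collectors so that it runs without dependencies; `handle_with_metrics`, `request_counts`, and `latencies` are hypothetical names. It also assumes a UUID for the request ID and `time.perf_counter()` for timing, which are common choices but differ from the timestamp-based approach in the code below.

```python
import time
import uuid
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

request_counts: Counter = Counter()                    # (method, path, status) -> count
latencies: Dict[str, List[float]] = defaultdict(list)  # path -> durations in seconds


def handle_with_metrics(method: str, path: str, handler: Callable[[], int]) -> Tuple[str, int]:
    """Call handler(), which returns a status code, and record metrics around it."""
    request_id = uuid.uuid4().hex  # unique per request, unlike a timestamp
    start = time.perf_counter()    # monotonic clock, suited to measuring durations
    status = 500                   # assume failure unless the handler returns
    try:
        status = handler()
        return request_id, status
    finally:
        # Record metrics whether the handler returned or raised.
        request_counts[(method, path, str(status))] += 1
        latencies[path].append(time.perf_counter() - start)


request_id, status = handle_with_metrics("POST", "/search", lambda: 200)
print(request_id, status, dict(request_counts))
```

Recording in a `finally` block means failed requests are counted too, which matters for error-rate alerts.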

#### 3.3 Core Functions
- **`get_embedding()`**: Generates an embedding of the given text using OpenAI's API. The resulting vector is used to find similar items.
- **`search_pinecone()`**: Queries the Pinecone index using the embedding, retrieving relevant matches.
- **`get_full_text_from_bigquery()`**: Uses BigQuery to retrieve movie descriptions by their IDs.
- **`generate_summary()`**: Uses OpenAI's chat model to generate a summary based on the movie descriptions retrieved.

### 4. Prometheus Metrics
- **Total HTTP Requests** (`http_requests_total`): Counts total requests by method, endpoint, and status.
- **HTTP Request Latency** (`http_request_duration_seconds`): Measures latency of each request.
- **Embedding, Search, and Summary Latency Metrics**: Measures specific latencies for key API operations (`embedding_generation_seconds`, `vector_search_seconds`, `summary_generation_seconds`).
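
A Prometheus histogram does not store individual observations. It keeps cumulative counts per bucket upper bound, along with a running sum and count. The sketch below shows that bookkeeping in plain Python; `observe_all` and the bucket bounds are illustrative assumptions, not the library's defaults.

```python
import math
from typing import Dict, List


def observe_all(durations: List[float], upper_bounds: List[float]) -> Dict[float, int]:
    """Return cumulative bucket counts: bound -> number of observations <= bound."""
    bounds = sorted(upper_bounds) + [math.inf]  # the +Inf bucket catches everything
    return {bound: sum(1 for d in durations if d <= bound) for bound in bounds}


# Illustrative bucket bounds in seconds (not necessarily the library defaults).
bounds = [0.1, 0.5, 1.0, 2.5]
durations = [0.05, 0.3, 0.4, 0.9, 3.2]

buckets = observe_all(durations, bounds)
print(buckets)
```

Because the counts are cumulative, the `+Inf` bucket always equals the total number of observations.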

### 5. API Workflow

- **Input Handling**: Takes a search query to look for related movies.
- **Query Processing**:
  - Generates an embedding for the query.
  - Searches for similar movie vectors using **Pinecone**.
  - Fetches detailed movie descriptions from **BigQuery**.
  - Generates a summary using OpenAI.
- **Output**: Returns a summary along with related movie data.
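
The step that joins vector-search scores with the fetched descriptions is easy to get subtly wrong, so here is a small sketch of it. The function name `attach_scores` and the sample IDs and texts are hypothetical.

```python
from typing import Dict, List, Tuple


def attach_scores(rows: List[dict], matches: List[Tuple[str, float]]) -> List[dict]:
    """Return copies of rows with a similarity_score field, highest score first."""
    scores: Dict[str, float] = dict(matches)
    enriched = [{**row, "similarity_score": scores.get(row["id"], 0.0)} for row in rows]
    enriched.sort(key=lambda row: row["similarity_score"], reverse=True)
    return enriched


# Hypothetical data: vector matches as (id, score) and rows as the warehouse might return them.
matches = [("tt001", 0.91), ("tt002", 0.78)]
rows = [
    {"id": "tt002", "full_text": "A wizard's apprentice ..."},
    {"id": "tt001", "full_text": "A quest across enchanted lands ..."},
]

movies = attach_scores(rows, matches)
print([(m["id"], m["similarity_score"]) for m in movies])
```

The lookup only works when IDs have the same type on both sides. If the warehouse column were an integer while the vector IDs are strings, every score would silently fall back to `0.0`.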

### 6. Deployment
- **CORS Middleware**: Configured to allow requests from all origins (`allow_origins=["*"]`) with credentials enabled. Restrict the origin list to known front ends before deploying to production.
- **Metrics Endpoint (`/metrics`)**: Provides Prometheus metrics for monitoring the API.

### Example Usage
1. **Search Request (`POST /search`)**:
   - Accepts a `query` string to find semantically similar movie data.
   - Returns a summary with a list of matched movies.
  
2. **Health Check (`GET /health`)**:
   - Ensures all dependent services are up and running.

### Running the Application
To run this API:
```bash
uvicorn main:app --host 0.0.0.0 --port 8000
```

### Notes
- **Logging**: Structured logging using `structlog` for better traceability.
- **Error Handling**: Each function has error handling that logs the issue and raises an appropriate HTTP exception.

--- 


```python
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import List, Optional
import pinecone
from openai import OpenAI
from google.cloud import bigquery
import os
from dotenv import load_dotenv
from contextlib import asynccontextmanager
import time
import logging
from prometheus_client import Counter, Histogram, make_asgi_app
import structlog

# Load environment variables
load_dotenv()

# Configure structured logging
logging.basicConfig(level=logging.INFO)
logger = structlog.get_logger()

# Prometheus metrics
REQUESTS = Counter('http_requests_total', 'Total HTTP requests', ['method', 'endpoint', 'status'])
LATENCY = Histogram('http_request_duration_seconds', 'HTTP request latency', ['endpoint'])
EMBEDDING_LATENCY = Histogram('embedding_generation_seconds', 'Embedding generation latency')
SEARCH_LATENCY = Histogram('vector_search_seconds', 'Vector search latency')
SUMMARY_LATENCY = Histogram('summary_generation_seconds', 'Summary generation latency')

class SearchRequest(BaseModel):
    query: str
    top_k: int = 5
    min_score: float = 0.7

class MovieSummary(BaseModel):
    query: str
    movies: List[dict]
    summary: str

@asynccontextmanager
async def lifespan(app: FastAPI):
    """Startup and shutdown events handler"""
    logger.info("application_startup", message="Initializing services")
    try:
        # Initialize Pinecone
        # (pinecone.init is the pinecone-client 2.x API; 3.x and later use the Pinecone class instead)
        pinecone.init(
            api_key=os.getenv('PINECONE_API_KEY'),
            environment=os.getenv('PINECONE_ENV')
        )
        app.state.pinecone_index = pinecone.Index(os.getenv('PINECONE_INDEX_NAME'))
        
        # Initialize OpenAI client
        app.state.openai_client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
        
        # Initialize BigQuery client
        app.state.bq_client = bigquery.Client(project=os.getenv('GCP_PROJECT_ID'))
        
        logger.info("application_startup_success", message="Services initialized successfully")
        yield
    except Exception as e:
        logger.error("application_startup_failed", error=str(e))
        raise
    finally:
        # Cleanup
        logger.info("application_shutdown", message="Shutting down services")
        if hasattr(app.state, 'bq_client'):
            app.state.bq_client.close()

# Initialize FastAPI app
app = FastAPI(
    title="Movie Search and Summary API",
    description="API for semantic search and summarization of movie data",
    version="1.0.0",
    lifespan=lifespan
)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Configure appropriately for production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Add prometheus metrics endpoint
metrics_app = make_asgi_app()
app.mount("/metrics", metrics_app)

@app.middleware("http")
async def add_logging_and_metrics(request: Request, call_next):
    """Middleware for logging and metrics collection"""
    start_time = time.time()
    request_id = str(time.time())
    
    logger.info(
        "request_started",
        request_id=request_id,
        method=request.method,
        url=str(request.url),
    )
    
    try:
        response = await call_next(request)
        duration = time.time() - start_time
        
        REQUESTS.labels(
            method=request.method,
            endpoint=request.url.path,
            status=response.status_code
        ).inc()
        LATENCY.labels(endpoint=request.url.path).observe(duration)
        
        logger.info(
            "request_completed",
            request_id=request_id,
            duration=duration,
            status_code=response.status_code
        )
        return response
    except Exception as e:
        logger.error(
            "request_failed",
            request_id=request_id,
            error=str(e),
            error_type=type(e).__name__
        )
        raise

def get_embedding(text: str, request_id: Optional[str] = None) -> List[float]:
    """Generate embedding for the input text"""
    start_time = time.time()
    logger.info("generating_embedding", request_id=request_id, text_length=len(text))
    
    try:
        response = app.state.openai_client.embeddings.create(
            input=text,
            model="text-embedding-ada-002"
        )
        duration = time.time() - start_time
        EMBEDDING_LATENCY.observe(duration)
        
        logger.info(
            "embedding_generated",
            request_id=request_id,
            duration=duration
        )
        return response.data[0].embedding
    except Exception as e:
        logger.error(
            "embedding_generation_failed",
            request_id=request_id,
            error=str(e),
            error_type=type(e).__name__
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error generating embedding: {str(e)}"
        )

def search_pinecone(vector: List[float], top_k: int, min_score: float, request_id: Optional[str] = None) -> List[tuple]:
    """Search Pinecone index for similar vectors"""
    start_time = time.time()
    logger.info(
        "searching_pinecone",
        request_id=request_id,
        top_k=top_k,
        min_score=min_score
    )
    
    try:
        results = app.state.pinecone_index.query(
            vector=vector,
            top_k=top_k,
            include_metadata=True
        )
        duration = time.time() - start_time
        SEARCH_LATENCY.observe(duration)
        
        filtered_results = [
            (match.id, match.score)
            for match in results.matches
            if match.score >= min_score
        ]
        
        logger.info(
            "pinecone_search_completed",
            request_id=request_id,
            duration=duration,
            results_count=len(filtered_results)
        )
        return filtered_results
    except Exception as e:
        logger.error(
            "pinecone_search_failed",
            request_id=request_id,
            error=str(e),
            error_type=type(e).__name__
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error searching Pinecone: {str(e)}"
        )

def get_full_text_from_bigquery(ids: List[str], request_id: Optional[str] = None) -> List[dict]:
    """Retrieve full text from BigQuery for given IDs"""
    logger.info(
        "querying_bigquery",
        request_id=request_id,
        ids_count=len(ids)
    )
    
    query = f"""
    SELECT id, full_text
    FROM `{os.getenv('GCP_PROJECT_ID')}.{os.getenv('BQ_DATASET')}.full_text`
    WHERE id IN UNNEST(@ids)
    """
    
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ArrayParameter("ids", "STRING", ids)
        ]
    )
    
    try:
        start_time = time.time()
        results = app.state.bq_client.query(query, job_config=job_config).result()
        duration = time.time() - start_time
        
        results_list = [dict(row) for row in results]
        
        logger.info(
            "bigquery_query_completed",
            request_id=request_id,
            duration=duration,
            results_count=len(results_list)
        )
        return results_list
    except Exception as e:
        logger.error(
            "bigquery_query_failed",
            request_id=request_id,
            error=str(e),
            error_type=type(e).__name__
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error querying BigQuery: {str(e)}"
        )

def generate_summary(query: str, texts: List[dict], request_id: Optional[str] = None) -> str:
    """Generate summary using OpenAI"""
    start_time = time.time()
    logger.info(
        "generating_summary",
        request_id=request_id,
        query=query,
        texts_count=len(texts)
    )
    
    prompt = f"""Based on the following movie descriptions, provide a brief summary that addresses this search query: "{query}"
    
Movie descriptions:
{chr(10).join([f'- {text["full_text"]}' for text in texts])}

Please provide a concise summary that highlights the most relevant aspects related to the search query."""

    try:
        response = app.state.openai_client.chat.completions.create(
            model="gpt-4-turbo-preview",
            messages=[
                {"role": "system", "content": "You are a helpful assistant that provides concise and relevant summaries of movie information."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.7,
            max_tokens=500
        )
        duration = time.time() - start_time
        SUMMARY_LATENCY.observe(duration)
        
        summary = response.choices[0].message.content
        
        logger.info(
            "summary_generated",
            request_id=request_id,
            duration=duration,
            summary_length=len(summary)
        )
        return summary
    except Exception as e:
        logger.error(
            "summary_generation_failed",
            request_id=request_id,
            error=str(e),
            error_type=type(e).__name__
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error generating summary: {str(e)}"
        )

@app.post("/search", response_model=MovieSummary)
async def search_and_summarize(request: SearchRequest, req: Request):
    """Endpoint to search for similar movies and generate a summary"""
    request_id = str(time.time())
    logger.info(
        "search_request_received",
        request_id=request_id,
        query=request.query,
        top_k=request.top_k
    )
    
    try:
        # Generate embedding for query
        query_embedding = get_embedding(request.query, request_id)
        
        # Search Pinecone
        similar_vectors = search_pinecone(
            vector=query_embedding,
            top_k=request.top_k,
            min_score=request.min_score,
            request_id=request_id
        )
        
        if not similar_vectors:
            logger.info(
                "no_results_found",
                request_id=request_id,
                query=request.query
            )
            return MovieSummary(
                query=request.query,
                movies=[],
                summary="No relevant movies found for your query."
            )
        
        # Get IDs and scores
        ids = [id for id, _ in similar_vectors]
        scores = {id: score for id, score in similar_vectors}
        
        # Get full text from BigQuery
        movie_texts = get_full_text_from_bigquery(ids, request_id)
        
        # Add similarity scores to movie data
        for movie in movie_texts:
            movie['similarity_score'] = scores.get(movie['id'], 0)
        
        # Sort by similarity score
        movie_texts.sort(key=lambda x: x['similarity_score'], reverse=True)
        
        # Generate summary
        summary = generate_summary(request.query, movie_texts, request_id)
        
        logger.info(
            "search_request_completed",
            request_id=request_id,
            movies_count=len(movie_texts)
        )
        
        return MovieSummary(
            query=request.query,
            movies=movie_texts,
            summary=summary
        )
        
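    except HTTPException:
        # Helper functions already logged and wrapped their own failures; pass them through unchanged
        raise
    except Exception as e:
        logger.error(
            "search_request_failed",
            request_id=request_id,
            error=str(e),
            error_type=type(e).__name__,
            query=request.query
        )
        raise HTTPException(status_code=500, detail=str(e))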

@app.get("/health")
async def health_check():
    """Health check endpoint"""
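    try:
        # Verify all services are connected
        _ = app.state.pinecone_index.describe_index_stats()
        # list_datasets() returns a lazy iterator; consume it so a request is actually sent
        _ = list(app.state.bq_client.list_datasets(max_results=1))
        _ = app.state.openai_client.models.list()

        return {
            "status": "healthy",
            "services": {
                "pinecone": "connected",
                "bigquery": "connected",
                "openai": "connected"
            }
        }
    except Exception as e:
        logger.error("health_check_failed", error=str(e))
        # Respond with 503 so HTTP probes and load balancers register the failure
        raise HTTPException(
            status_code=503,
            detail={"status": "unhealthy", "error": str(e)}
        )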

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

# Dockerfile for FastAPI Application

This Dockerfile creates a container to run a **FastAPI** application using **Python 3.9 slim**.

1. **Base Image**: Uses `python:3.9-slim` for a lightweight Python environment.
2. **Set Working Directory**: Sets `/app` as the working directory.
3. **Install Dependencies**:
   - Uses `apt-get` to install system-level tools (`build-essential`) needed for Python packages that require compilation.
4. **Copy and Install Python Requirements**:
   - Copies `requirements.txt` and installs dependencies using `pip`.
5. **Copy Application Code**: Copies all app files to the container.
6. **Expose Port**: Opens port `8000` for the app.
7. **Run the App**: Uses `uvicorn` to start the FastAPI app on host `0.0.0.0`, port `8000`.

Copying `requirements.txt` before the application code lets Docker reuse the cached dependency layer when only the code changes, and the slim base image keeps the final image small.

--- 

```dockerfile
# Use Python 3.9 slim image
FROM python:3.9-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first to leverage Docker cache
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port
EXPOSE 8000

# Command to run the application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

# FastAPI Movie Search API Request Summary

To search for movies using the **Movie Search and Summary API**, use the Python `requests` library to send a **POST** request to the `/search` endpoint.

### Example Request
```python
import requests

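response = requests.post(
    "http://api_url:8000/search",  # replace api_url with the host name of the API
    json={
        "query": "adventure movies with magical elements",
        "top_k": 5,
        "min_score": 0.7
    },
    timeout=30,  # embedding, search, and summarization can take several seconds
)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of printing an error body
print(response.json())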
```

### Key Parameters
- **`query`**: The search phrase (e.g., `"adventure movies with magical elements"`).
- **`top_k`**: Number of top results to return.
- **`min_score`**: Minimum similarity score (between `0` and `1`).
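
The request model shown earlier declares types and defaults but no value ranges. The sketch below spells out the checks a client or server might apply, using a standard-library dataclass so that it runs on its own. `SearchParams` and the upper limit of 100 for `top_k` are assumptions for illustration. In the API itself, equivalent constraints could be declared on the Pydantic model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchParams:
    query: str
    top_k: int = 5
    min_score: float = 0.7

    def __post_init__(self) -> None:
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if not 1 <= self.top_k <= 100:  # upper bound is an arbitrary illustrative cap
            raise ValueError("top_k must be between 1 and 100")
        if not 0.0 <= self.min_score <= 1.0:
            raise ValueError("min_score must be between 0 and 1")


params = SearchParams(query="adventure movies with magical elements")
print(params)

try:
    SearchParams(query="space opera", top_k=0)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```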

### Expected Output
- **Movies** matching the query, with metadata and similarity scores.
- A **summary** of relevant movies.

This example provides a straightforward way to send a search request and get results including a summary of related movies.

# Deploying FastAPI Application to Kubernetes

This configuration deploys the **Movie Search and Summary API** to a Kubernetes cluster using various Kubernetes resources:

### 1. **ConfigMap (`configmap.yaml`)**:
- Stores environment configuration, such as `GCP_PROJECT_ID`, `BQ_DATASET`, and structured logging settings.
  
### 2. **Secret (`secret.yaml`)**:
- Contains sensitive data like `OPENAI_API_KEY` and `PINECONE_API_KEY` encoded in base64.
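
Values under a Secret's `data` field must be base64-encoded. Base64 is an encoding, not encryption, so the manifest should still be kept out of version control. A minimal sketch for producing and checking such values follows; the helper names and the placeholder key are hypothetical.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Base64-encode a value for the `data` field of a Kubernetes Secret."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse of encode_secret_value, useful for checking a manifest."""
    return base64.b64decode(encoded).decode("utf-8")


# "not-a-real-key" is a placeholder, not a real credential.
encoded = encode_secret_value("not-a-real-key")
print(encoded)
print(decode_secret_value(encoded))
```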

### 3. **Deployment (`deployment.yaml`)**:
- **Deployment of API**:
  - Runs `movie-search-api` with 3 replicas.
  - Uses environment variables from the ConfigMap and Secret.
  - Includes health and readiness probes for monitoring the health of the service.
  - **Logging**:
    - Uses **Fluent Bit** as a sidecar for log processing and sending logs to **Google Cloud Logging** (formerly Stackdriver).
  - **Resource Management**: Requests and limits for CPU and memory are defined.

### 4. **Fluent Bit Config (`fluent-bit-config.yaml`)**:
- Configures Fluent Bit to collect container logs, enrich them with Kubernetes metadata, and send them to **Google Cloud Logging** through the `stackdriver` output plugin.

### 5. **Service (`service.yaml`)**:
- Defines a **LoadBalancer** service named `movie-search-service` to expose the application externally on port `80` (mapped to port `8000` in the container).

### 6. **Horizontal Pod Autoscaler (`hpa.yaml`)**:
- Scales the number of pods between **3** and **10** based on **CPU utilization** (targeting 70%).
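
The autoscaler's core calculation is proportional: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up and clamped to the configured bounds. The sketch below is a simplified version of that rule using the values from this configuration. It ignores details such as the tolerance band and stabilization windows, and `desired_replicas` is a hypothetical name.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float = 70.0,
    min_replicas: int = 3,
    max_replicas: int = 10,
) -> int:
    """Simplified HPA calculation: scale proportionally, then clamp to the bounds."""
    ratio = current_utilization / target_utilization
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(3, 105.0))  # 3 * 105/70 = 4.5, rounded up to 5
print(desired_replicas(4, 35.0))   # 4 * 35/70 = 2, raised to the minimum of 3
print(desired_replicas(8, 140.0))  # 8 * 140/70 = 16, capped at the maximum of 10
```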

---


### Deploying to Kubernetes
```yaml
# config/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: movie-search-config
data:
  GCP_PROJECT_ID: "your-project-id"
  BQ_DATASET: "your-dataset"
  PINECONE_ENV: "your-pinecone-env"
  PINECONE_INDEX_NAME: "your-index-name"
  LOG_LEVEL: "INFO"
  # Structured logging configuration
  LOGGING_CONFIG: |
    {
      "version": 1,
      "disable_existing_loggers": false,
      "formatters": {
        "json": {
          "format": "%(levelname)s %(asctime)s %(name)s %(message)s",
          "datefmt": "%Y-%m-%d %H:%M:%S",
          "class": "pythonjsonlogger.jsonlogger.JsonFormatter"
        }
      },
      "handlers": {
        "console": {
          "class": "logging.StreamHandler",
          "formatter": "json",
          "stream": "ext://sys.stdout"
        }
      },
      "root": {
        "level": "INFO",
        "handlers": ["console"]
      }
    }
---
# config/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: movie-search-secrets
type: Opaque
data:
  OPENAI_API_KEY: "base64-encoded-key"
  PINECONE_API_KEY: "base64-encoded-key"
---
# config/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: movie-search-api
  labels:
    app: movie-search-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: movie-search-api
  template:
    metadata:
      labels:
        app: movie-search-api
      annotations:
        # Enable GCP Cloud Logging
        logging.cloud.google.com/agent: '{"plugins":["opentelemetry","prometheus","application"]}'
    spec:
      containers:
      - name: movie-search-api
        image: gcr.io/your-project-id/movie-search-api:latest
        ports:
        - containerPort: 8000
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        env:
        # Add trace context to logs
        - name: OTEL_SERVICE_NAME
          value: "movie-search-api"
        - name: OTEL_PROPAGATORS
          value: "tracecontext,baggage"
        # Add pod metadata to logs
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        envFrom:
        - configMapRef:
            name: movie-search-config
        - secretRef:
            name: movie-search-secrets
        readinessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 15
          periodSeconds: 20
        # Shared log volumes, also mounted by the Fluent Bit sidecar below
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      # Sidecar container for log collection
      - name: fluent-bit
        image: fluent/fluent-bit:latest
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        emptyDir: {}
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
---
# config/fluent-bit-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf

    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker
        Tag              kube.*
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL           https://kubernetes.default.svc:443
        Merge_Log          On
        K8S-Logging.Parser On

    [OUTPUT]
        Name            stackdriver
        Match           *
        resource        k8s_container
---
# config/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: movie-search-service
spec:
  selector:
    app: movie-search-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer
---
# config/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: movie-search-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: movie-search-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

# GKE Cluster Setup Script 

1. **Set Environment**: Define project ID, cluster name, and region.
2. **Create GKE Cluster**: Create a regional cluster named `movie-search` spread across three zones, with node autoscaling (3-10 nodes per zone), logging, and monitoring.
3. **Get Credentials**: Allow `kubectl` to manage the cluster.
4. **Namespace and Service Account**: Create namespace and IAM service account (`movie-search-sa`) for workloads.
5. **Grant Permissions**: Allow the service account to access BigQuery (`dataViewer`) and log services (`logWriter`).
6. **Verify**: Ensure nodes are ready with `kubectl get nodes`.

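This script sets up a scalable, monitored GKE cluster and an IAM service account with read access to BigQuery and write access to logs. Binding that account to a Kubernetes service account for Workload Identity is not part of the script.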

--- 

```bash
# Set environment variables
export PROJECT_ID=your-project-id
export CLUSTER_NAME=movie-search
export REGION=us-central1

# Set project
gcloud config set project $PROJECT_ID

# Create cluster with essential features
gcloud container clusters create $CLUSTER_NAME \
    --region $REGION \
    --num-nodes 3 \
    --machine-type e2-standard-2 \
    --enable-autoscaling \
    --min-nodes 3 \
    --max-nodes 10 \
    --node-locations $REGION-a,$REGION-b,$REGION-c \
    --logging=SYSTEM,WORKLOAD \
    --monitoring=SYSTEM \
    --enable-ip-alias \
    --workload-pool=${PROJECT_ID}.svc.id.goog \
    --labels=app=movie-search

# Get credentials
gcloud container clusters get-credentials $CLUSTER_NAME --region $REGION

# Create namespace
kubectl create namespace movie-search

# Create service account for workload identity
gcloud iam service-accounts create movie-search-sa \
    --display-name="Movie Search Service Account"

# Grant permissions
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:movie-search-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
    --role="roles/bigquery.dataViewer"

gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:movie-search-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
    --role="roles/logging.logWriter"

# Verify setup
kubectl get nodes
```

```dockerfile
# Dockerfile
FROM python:3.9-slim

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

# Create non-root user
RUN useradd -m -s /bin/bash app

WORKDIR /app

# Install dependencies first
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Set ownership and permissions
RUN chown -R app:app /app

# Switch to non-root user
USER app

# Expose port
EXPOSE 8000

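# Health check (python:3.9-slim ships without curl, so use the standard library)
HEALTHCHECK --interval=30s --timeout=3s \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=2)" || exit 1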

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
```

```bash
#!/bin/bash

# Stop at the first failed command so a broken build is never pushed or deployed
set -euo pipefail

# Set environment variables
export PROJECT_ID=$(gcloud config get-value project)
export IMAGE_NAME="movie-search-api"
export IMAGE_TAG=$(git rev-parse --short HEAD 2>/dev/null || echo "latest")

echo "Building and pushing image to GCR..."

# Build the image
docker build -t gcr.io/$PROJECT_ID/$IMAGE_NAME:$IMAGE_TAG \
            -t gcr.io/$PROJECT_ID/$IMAGE_NAME:latest .

# Push the images
docker push gcr.io/$PROJECT_ID/$IMAGE_NAME:$IMAGE_TAG
docker push gcr.io/$PROJECT_ID/$IMAGE_NAME:latest

echo "Successfully built and pushed: gcr.io/$PROJECT_ID/$IMAGE_NAME:$IMAGE_TAG"

# Update Kubernetes deployment if needed
read -p "Do you want to update the Kubernetes deployment? (y/n) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]
then
    kubectl set image deployment/movie-search-api \
            movie-search-api=gcr.io/$PROJECT_ID/$IMAGE_NAME:$IMAGE_TAG \
            -n movie-search
    echo "Deployment updated successfully!"
fi
```

# Kubernetes Monitoring Configuration

This configuration deploys **Prometheus** and **Grafana** for monitoring the **Movie Search API** in a Kubernetes environment.

### 1. **Prometheus and Grafana Values (`prometheus-values.yaml`)**:

- **Grafana**:
  - **Admin Password**: Sets the Grafana admin password. The value in the file is a placeholder and must be replaced.
  - **Persistence**: Enables persistent storage of **10Gi** for dashboards and data.
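  - **Dashboard Configuration**: Preloads an "API Monitoring" dashboard, visualizing metrics such as the **Request Rate** (`rate(search_requests_total[5m])`). The API code above exports its request counter as `http_requests_total`, so the panel only shows data if the metric name in the query matches what the application exposes.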

- **Prometheus**:
  - **Retention**: Data retention of **15 days**.
  - **Resources**: Defines resource requests (`512Mi` memory, `500m` CPU) and limits (`2Gi` memory, `1000m` CPU).
  - **Storage**: Configures **50Gi** persistent storage.

- **Alertmanager**:
  - **Enabled**: Configured with a **Slack** integration for alert notifications.
  - **Alert Settings**: Alerts are grouped by `job` with a repeat interval of **12 hours** for critical alerts.

### 2. **ServiceMonitor (`service-monitor.yaml`)**:

- **ServiceMonitor**:
  - Monitors the **Movie Search API** for metrics.
  - Collects metrics every **15 seconds** from the `/metrics` endpoint.

### 3. **Prometheus Alerts (`prometheus-rules.yaml`)**:

- **Alerts Configuration**:
  - **HighErrorRate**: Triggers if the HTTP error rate (`5xx`) exceeds **5%** for **5 minutes**.
  - **HighLatency**: Alerts if **95th percentile latency** is above **2 seconds** for **5 minutes**.
  - **HighCPUUsage**: Warns when **CPU usage** exceeds **80%** of the container's CPU quota for **15 minutes**.
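
The arithmetic behind the first two alerts can be sketched in plain Python. The rule expressions themselves are not reproduced here, so this is an illustration of what such thresholds typically compute, with made-up numbers: an error ratio from two counters, and a percentile estimated from cumulative histogram buckets by linear interpolation. Both function names are hypothetical, and the quantile function is a simplified version of the usual approach.

```python
import math
from typing import List, Tuple


def error_ratio(requests_5xx: float, requests_total: float) -> float:
    """Fraction of requests that returned a 5xx status."""
    return requests_5xx / requests_total if requests_total else 0.0


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Locates the bucket containing the target rank and interpolates linearly inside it.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # cannot interpolate into the +Inf bucket
            in_bucket = count - lower_count
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Made-up counts over a 5-minute window.
ratio = error_ratio(requests_5xx=12, requests_total=200)
p95 = histogram_quantile(0.95, [(0.5, 80), (1.0, 90), (2.5, 98), (math.inf, 100)])

print(f"error ratio {ratio:.2%}, alert condition met: {ratio > 0.05}")
print(f"p95 latency about {p95:.2f}s, alert condition met: {p95 > 2.0}")
```

An alert fires only if its condition holds for the whole `for` duration, so a single bad window like the one above would not trigger it on its own.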

---


```yaml
# monitoring/prometheus-values.yaml
grafana:
  adminPassword: "your-secure-password"
  persistence:
    enabled: true
    size: 10Gi
  dashboardProviders:
    dashboardproviders.yaml:
      apiVersion: 1
      providers:
      - name: 'default'
        orgId: 1
        folder: ''
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards
  dashboards:
    default:
      api-monitoring:
        json: |
          {
            "annotations": {
              "list": []
            },
            "editable": true,
            "fiscalYearStartMonth": 0,
            "graphTooltip": 0,
            "links": [],
            "liveNow": false,
            "panels": [
              {
                "datasource": {
                  "type": "prometheus",
                  "uid": "prometheus"
                },
                "fieldConfig": {
                  "defaults": {
                    "color": {
                      "mode": "palette-classic"
                    },
                    "custom": {
                      "axisCenteredZero": false,
                      "axisColorMode": "text",
                      "axisLabel": "",
                      "axisPlacement": "auto",
                      "barAlignment": 0,
                      "drawStyle": "line",
                      "fillOpacity": 10,
                      "gradientMode": "none",
                      "hideFrom": {
                        "legend": false,
                        "tooltip": false,
                        "viz": false
                      },
                      "lineInterpolation": "linear",
                      "lineWidth": 1,
                      "pointSize": 5,
                      "scaleDistribution": {
                        "type": "linear"
                      },
                      "showPoints": "never",
                      "spanNulls": false,
                      "stacking": {
                        "group": "A",
                        "mode": "none"
                      },
                      "thresholdsStyle": {
                        "mode": "off"
                      }
                    },
                    "mappings": [],
                    "thresholds": {
                      "mode": "absolute",
                      "steps": [
                        {
                          "color": "green",
                          "value": null
                        }
                      ]
                    },
                    "unit": "short"
                  },
                  "overrides": []
                },
                "gridPos": {
                  "h": 8,
                  "w": 12,
                  "x": 0,
                  "y": 0
                },
                "id": 1,
                "options": {
                  "legend": {
                    "calcs": [],
                    "displayMode": "list",
                    "placement": "bottom",
                    "showLegend": true
                  },
                  "tooltip": {
                    "mode": "single",
                    "sort": "none"
                  }
                },
                "targets": [
                  {
                    "datasource": {
                      "type": "prometheus",
                      "uid": "prometheus"
                    },
                    "expr": "rate(search_requests_total[5m])",
                    "refId": "A"
                  }
                ],
                "title": "Request Rate",
                "type": "timeseries"
              }
            ],
            "refresh": "5s",
            "schemaVersion": 38,
            "style": "dark",
            "tags": [],
            "templating": {
              "list": []
            },
            "time": {
              "from": "now-1h",
              "to": "now"
            },
            "timepicker": {},
            "timezone": "",
            "title": "API Monitoring",
            "version": 0,
            "weekStart": ""
          }

prometheusOperator:
  enabled: true
  serviceMonitor:
    enabled: true

prometheus:
  prometheusSpec:
    retention: 15d
    resources:
      requests:
        memory: 512Mi
        cpu: 500m
      limits:
        memory: 2Gi
        cpu: 1000m
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi

alertmanager:
  enabled: true
  config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'slack'
      routes:
      - match:
          severity: critical
        receiver: 'slack'
    receivers:
    - name: 'slack'
      slack_configs:
      - api_url: 'https://hooks.slack.com/services/your-webhook-url'
        channel: '#alerts'
        send_resolved: true
---
# monitoring/service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: movie-search-monitor
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: movie-search-api
  endpoints:
  - port: http
    path: /metrics
    interval: 15s
---
# monitoring/prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: movie-search-alerts
  labels:
    release: prometheus
spec:
  groups:
  - name: movie-search
    rules:
    - alert: HighErrorRate
      expr: |
        sum(rate(http_requests_total{status=~"5.."}[5m])) 
        / 
        sum(rate(http_requests_total[5m])) 
        > 0.05
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: High error rate detected
        description: Error rate is above 5% for more than 5 minutes
    
    - alert: HighLatency
      expr: |
        histogram_quantile(0.95, sum(rate(search_latency_seconds_bucket[5m])) by (le)) 
        > 2
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: High latency detected
        description: 95th percentile latency is above 2 seconds
    
    - alert: HighCPUUsage
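      # Compare CPU cores in use (rate of the cumulative counter) with
      # 80% of the CPU limit expressed in cores (quota / period).
      expr: |
        sum by (pod) (rate(container_cpu_usage_seconds_total{container="movie-search-api"}[5m]))
        >
        sum by (pod) (container_spec_cpu_quota{container="movie-search-api"} / container_spec_cpu_period{container="movie-search-api"}) * 0.8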
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: High CPU usage detected
        description: Container is using more than 80% of its CPU quota

```