59 changes: 49 additions & 10 deletions CodeGen/README.md
@@ -106,19 +106,58 @@ flowchart LR

This CodeGen example can be deployed manually on various hardware platforms using Docker Compose or Kubernetes. Select the appropriate guide based on your target environment:

| Hardware | Deployment Mode | Guide Link |
| :-------------- | :----------------------------------- | :--------------------------------------------------------------------------------------- |
| Intel Xeon CPU | Single Node (Docker) | [Xeon Docker Compose Guide](./docker_compose/intel/cpu/xeon/README.md) |
| Intel Xeon CPU | Single Node (Docker) with Monitoring | [Xeon Docker Compose with Monitoring Guide](./docker_compose/intel/cpu/xeon/README.md) |
| Intel Gaudi HPU | Single Node (Docker) | [Gaudi Docker Compose Guide](./docker_compose/intel/hpu/gaudi/README.md) |
| Intel Gaudi HPU | Single Node (Docker) with Monitoring | [Gaudi Docker Compose with Monitoring Guide](./docker_compose/intel/hpu/gaudi/README.md) |
| AMD EPYC CPU | Single Node (Docker) | [EPYC Docker Compose Guide](./docker_compose/amd/cpu/epyc/README.md) |
| AMD ROCm GPU | Single Node (Docker) | [ROCm Docker Compose Guide](./docker_compose/amd/gpu/rocm/README.md) |
| Intel Xeon CPU | Kubernetes (Helm) | [Kubernetes Helm Guide](./kubernetes/helm/README.md) |
| Intel Gaudi HPU | Kubernetes (Helm) | [Kubernetes Helm Guide](./kubernetes/helm/README.md) |
| Intel Xeon CPU | Kubernetes (GMC) | [Kubernetes GMC Guide](./kubernetes/gmc/README.md) |
| Intel Gaudi HPU | Kubernetes (GMC) | [Kubernetes GMC Guide](./kubernetes/gmc/README.md) |

_Note: Building custom microservice images can be done using the resources in [GenAIComps](https://github.com/opea-project/GenAIComps)._

## Monitoring

The CodeGen example supports monitoring capabilities for Intel Xeon and Intel Gaudi platforms. Monitoring includes:

- **Prometheus**: For metrics collection and querying
- **Grafana**: For visualization and dashboards
- **Node Exporter**: For system metrics collection

### Monitoring Features

- Real-time metrics collection from all CodeGen microservices
- Pre-configured dashboards for:
  - vLLM/TGI performance metrics
  - CodeGen MegaService metrics
  - System resource utilization
  - Node-level metrics

### Enabling Monitoring

Monitoring can be enabled by using the `compose.monitoring.yaml` file along with the main compose file:

```bash
# For Intel Xeon (run from docker_compose/intel/cpu/xeon)
docker compose -f compose.yaml -f compose.monitoring.yaml up -d

# For Intel Gaudi (run from docker_compose/intel/hpu/gaudi)
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
```

### Accessing Monitoring Services

Once deployed with monitoring, you can access:

- **Prometheus**: `http://${HOST_IP}:9090`
- **Grafana**: `http://${HOST_IP}:3000` (username: `admin`, password: `admin`)
- **Node Exporter**: `http://${HOST_IP}:9100`
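A quick way to sanity-check these endpoints once the stack is up is a few `curl` probes. This is a minimal sketch: `HOST_IP` falls back to `localhost` if unset, and the health paths shown are the standard ones for each tool.

```shell
# Endpoints from the list above; HOST_IP falls back to localhost if unset.
HOST_IP="${HOST_IP:-localhost}"
PROMETHEUS_URL="http://${HOST_IP}:9090"
GRAFANA_URL="http://${HOST_IP}:3000"
NODE_EXPORTER_URL="http://${HOST_IP}:9100"

# Uncomment once the monitoring stack is running:
# curl -sf "${PROMETHEUS_URL}/-/ready"        # exits 0 when Prometheus is ready
# curl -sf "${GRAFANA_URL}/api/health"        # JSON health report from Grafana
# curl -sf "${NODE_EXPORTER_URL}/metrics" | head -n 5

echo "Prometheus: ${PROMETHEUS_URL}  Grafana: ${GRAFANA_URL}  Node Exporter: ${NODE_EXPORTER_URL}"
```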

## Benchmarking

Guides for evaluating the performance and accuracy of this CodeGen deployment are available:
63 changes: 60 additions & 3 deletions CodeGen/docker_compose/intel/cpu/xeon/README.md
@@ -49,7 +49,8 @@ This uses the default vLLM-based deployment using `compose.yaml`.
# export https_proxy="your_https_proxy"
# export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
source intel/set_env.sh
cd intel/cpu/xeon
bash grafana/dashboards/download_opea_dashboard.sh
```

_Note: The compose file might read additional variables from set_env.sh. Ensure all required variables like ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.) are set if not using defaults from the compose file._
@@ -146,7 +147,7 @@ Key parameters are configured via environment variables set before running `dock
Most of these parameters are in `set_env.sh`, you can either modify this file or overwrite the env variables by setting them.

```shell
source CodeGen/docker_compose/intel/set_env.sh
```

#### Compose Files
@@ -252,7 +253,63 @@ Users can interact with the backend service using the `Neural Copilot` VS Code e
- **"Container name is in use"**: Stop existing containers (`docker compose down`) or change `container_name` in the compose file.
- **Resource Issues:** CodeGen models can be memory-intensive. Monitor host RAM usage. Increase Docker resources if needed.

## Monitoring Deployment

To enable monitoring for the CodeGen application, you can use the monitoring Docker Compose file along with the main deployment.

### Option #1: Default Deployment (without monitoring)

To deploy the CodeGen services without monitoring, execute:

```bash
docker compose up -d
```

### Option #2: Deployment with Monitoring

> NOTE: To enable monitoring, the `compose.monitoring.yaml` file needs to be applied together with the default `compose.yaml` file.

To deploy with monitoring:

```bash
bash grafana/dashboards/download_opea_dashboard.sh
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
```
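Once `up -d` returns, the monitoring containers can be verified by name. This is a sketch: container names are taken from `compose.monitoring.yaml`, and the command degrades gracefully where Docker is unavailable.

```shell
# Check that the monitoring containers are up (names from compose.monitoring.yaml).
STATUS=$(command -v docker > /dev/null 2>&1 \
  && docker ps --filter "name=opea_prometheus" --filter "name=grafana" --filter "name=node-exporter" \
       --format '{{.Names}}: {{.Status}}' \
  || echo "docker not available")
# Fall back to a message if docker ran but found no matching containers.
STATUS="${STATUS:-no monitoring containers found}"
echo "$STATUS"
```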

### Accessing Monitoring Services

Once deployed with monitoring, you can access:

- **Prometheus**: `http://${HOST_IP}:9090`
- **Grafana**: `http://${HOST_IP}:3000` (username: `admin`, password: `admin`)
- **Node Exporter**: `http://${HOST_IP}:9100`

### Monitoring Components

The monitoring stack includes:

- **Prometheus**: For metrics collection and querying
- **Grafana**: For visualization and dashboards
- **Node Exporter**: For system metrics collection

### Monitoring Dashboards

The following dashboards are automatically downloaded and configured:

- vLLM Dashboard
- TGI Dashboard
- CodeGen MegaService Dashboard
- Node Exporter Dashboard

### Stopping the Application

If monitoring is enabled, execute the following command:

```bash
docker compose -f compose.yaml -f compose.monitoring.yaml down
```

If monitoring is not enabled, execute:

```bash
docker compose down # for vLLM (compose.yaml)
58 changes: 58 additions & 0 deletions CodeGen/docker_compose/intel/cpu/xeon/compose.monitoring.yaml
@@ -0,0 +1,58 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  prometheus:
    image: prom/prometheus:v2.52.0
    container_name: opea_prometheus
    user: root
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yaml
      - ./prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yaml'
    ports:
      - '9090:9090'
    ipc: host
    restart: unless-stopped

  grafana:
    image: grafana/grafana:11.0.0
    container_name: grafana
    volumes:
      - ./grafana_data:/var/lib/grafana
      - ./grafana/dashboards:/var/lib/grafana/dashboards
      - ./grafana/provisioning:/etc/grafana/provisioning
    user: root
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
      GF_RENDERING_CALLBACK_URL: http://grafana:3000/
      GF_LOG_FILTERS: rendering:debug
      no_proxy: ${no_proxy}
      host_ip: ${host_ip}
    depends_on:
      - prometheus
    ports:
      - '3000:3000'
    ipc: host
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter
    container_name: node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.ignored-mount-points'
      - "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)"
    environment:
      no_proxy: ${no_proxy}
    ports:
      - 9100:9100
    restart: always
    deploy:
      mode: global
@@ -0,0 +1,13 @@
#!/bin/bash
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
if ls *.json 1> /dev/null 2>&1; then
  rm *.json
fi

wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/codegen_megaservice_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json
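Assuming the script above has been run, the four dashboard JSON files it fetches should sit next to it. A small check sketch (the filenames mirror the `wget` calls):

```shell
# Dashboard files expected after running download_opea_dashboard.sh
DASHBOARDS="vllm_grafana.json tgi_grafana.json codegen_megaservice_grafana.json node_grafana.json"
for f in $DASHBOARDS; do
  if [ -f "$f" ]; then echo "$f: present"; else echo "$f: missing"; fi
done
```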
@@ -0,0 +1,14 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10 # how often Grafana will scan for changed dashboards
    options:
      path: /var/lib/grafana/dashboards
@@ -0,0 +1,54 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# config file version
apiVersion: 1

# list of datasources that should be deleted from the database
deleteDatasources:
  - name: Prometheus
    orgId: 1

# list of datasources to insert/update depending on
# what's available in the database
datasources:
  # <string, required> name of the datasource. Required
  - name: Prometheus
    # <string, required> datasource type. Required
    type: prometheus
    # <string, required> access mode. direct or proxy. Required
    access: proxy
    # <int> org id. will default to orgId 1 if not specified
    orgId: 1
    # <string> url
    url: http://$host_ip:9090
    # <string> database password, if used
    password:
    # <string> database user, if used
    user:
    # <string> database name, if used
    database:
    # <bool> enable/disable basic auth
    basicAuth: false
    # <string> basic auth username, if used
    basicAuthUser:
    # <string> basic auth password, if used
    basicAuthPassword:
    # <bool> enable/disable with credentials headers
    withCredentials:
    # <bool> mark as default datasource. Max one per org
    isDefault: true
    # <map> fields that will be converted to json and stored in json_data
    jsonData:
      httpMethod: GET
      graphiteVersion: "1.1"
      tlsAuth: false
      tlsAuthWithCACert: false
    # <string> json object of data that will be encrypted.
    secureJsonData:
      tlsCACert: "..."
      tlsClientCert: "..."
      tlsClientKey: "..."
    version: 1
    # <bool> allow users to edit datasources from the UI.
    editable: true
27 changes: 27 additions & 0 deletions CodeGen/docker_compose/intel/cpu/xeon/prometheus.yaml
@@ -0,0 +1,27 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# [IP_ADDR]:{PORT_OUTSIDE_CONTAINER} -> {PORT_INSIDE_CONTAINER} / {PROTOCOL}
global:
  scrape_interval: 5s
  external_labels:
    monitor: "my-monitor"
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["opea_prometheus:9090"]
  - job_name: "vllm"
    metrics_path: /metrics
    static_configs:
      - targets: ["vllm-server:80"]
  - job_name: "tgi"
    metrics_path: /metrics
    static_configs:
      - targets: ["tgi-service:80"]
  - job_name: "codegen-backend-server"
    metrics_path: /metrics
    static_configs:
      - targets: ["codegen-xeon-backend-server:7778"]
  - job_name: "prometheus-node-exporter"
    metrics_path: /metrics
    static_configs:
      - targets: ["node-exporter:9100"]
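Each `job_name` in this file should appear as an active target once Prometheus is scraping. A hedged sketch of checking this via the Prometheus HTTP API (requires the stack to be running; `jq` is optional, for readable output):

```shell
# Prometheus targets endpoint; each job in prometheus.yaml should report health "up".
TARGETS_URL="http://${HOST_IP:-localhost}:9090/api/v1/targets"

# Uncomment with the stack running:
# curl -s "$TARGETS_URL" | jq -r '.data.activeTargets[] | "\(.labels.job): \(.health)"'

echo "$TARGETS_URL"
```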
62 changes: 60 additions & 2 deletions CodeGen/docker_compose/intel/hpu/gaudi/README.md
@@ -49,7 +49,10 @@ This uses the default vLLM-based deployment using `compose.yaml`.
# export https_proxy="your_https_proxy"
# export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
source intel/set_env.sh
cd intel/hpu/gaudi
cd grafana/dashboards
bash download_opea_dashboard.sh
cd ../..
```

_Note: The compose file might read additional variables from set_env.sh. Ensure all required variables like ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.) are set if not using defaults from the compose file._
@@ -228,7 +231,62 @@ Use the `Neural Copilot` extension configured with the CodeGen backend URL: `htt
- **Model Download Issues:** Check `HF_TOKEN`, internet access, proxy settings. Check LLM service logs.
- **Connection Errors:** Verify `HOST_IP`, ports, and proxy settings. Use `docker ps` and check service logs.

## Monitoring Deployment

To enable monitoring for the CodeGen application on Gaudi, you can use the monitoring Docker Compose file along with the main deployment.

### Option #1: Default Deployment (without monitoring)

To deploy the CodeGen services without monitoring, execute:

```bash
docker compose up -d
```

### Option #2: Deployment with Monitoring

> NOTE: To enable monitoring, the `compose.monitoring.yaml` file needs to be applied together with the default `compose.yaml` file.

To deploy with monitoring:

```bash
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
```

### Accessing Monitoring Services

Once deployed with monitoring, you can access:

- **Prometheus**: `http://${HOST_IP}:9090`
- **Grafana**: `http://${HOST_IP}:3000` (username: `admin`, password: `admin`)
- **Node Exporter**: `http://${HOST_IP}:9100`

### Monitoring Components

The monitoring stack includes:

- **Prometheus**: For metrics collection and querying
- **Grafana**: For visualization and dashboards
- **Node Exporter**: For system metrics collection

### Monitoring Dashboards

The following dashboards are automatically downloaded and configured:

- vLLM Dashboard
- TGI Dashboard
- CodeGen MegaService Dashboard
- Node Exporter Dashboard

### Stopping the Application

If monitoring is enabled, execute the following command:

```bash
docker compose -f compose.yaml -f compose.monitoring.yaml down
```

If monitoring is not enabled, execute:

```bash
docker compose down # for vLLM (compose.yaml)