Rs/logging example #796

Open
wants to merge 21 commits into
base: main
230 changes: 230 additions & 0 deletions examples/logging-prometheus/README.md
@@ -0,0 +1,230 @@
# DeepSparse + Prometheus/Grafana

This is a simple example that shows you how to connect DeepSparse Logging to the Prometheus/Grafana stack.

Suggested change
This is a simple example that shows you how to connect DeepSparse Logging to the Prometheus/Grafana stack.
This is a simple example that shows you how to connect DeepSparse logging to the Prometheus/Grafana stack.


#### There are four steps:

Suggested change
#### There are four steps:
There are four steps:

- Configure DeepSparse Logging to log metrics in Prometheus format to a REST endpoint
robertgshaw2-neuralmagic marked this conversation as resolved.

Suggested change
- Configure DeepSparse Logging to log metrics in Prometheus format to a REST endpoint
1. Configure DeepSparse logging to log metrics in Prometheus format to a REST endpoint.

- Point Prometheus to the appropriate endpoint to scrape the data at a specified interval

Suggested change
- Point Prometheus to the appropriate endpoint to scrape the data at a specified interval
2. Point Prometheus to the appropriate endpoint to scrape the data at a specified interval.

- Run the client script simulating a data quality/drift issue

Suggested change
- Run the client script simulating a data quality/drift issue
3. Run the client script simulating a data quality/drift issue.

- Visualize data in Prometheus with dashboarding tool like Grafana

Suggested change
- Visualize data in Prometheus with dashboarding tool like Grafana
4. Visualize data in Prometheus with a dashboarding tool such as Grafana.


## 0. Setting Up

Suggested change
## 0. Setting Up
## Setting Up

#### Installation

To run this tutorial, you need Docker, Docker Compose, and DeepSparse Server:
- [Docker Installation](https://docs.docker.com/engine/install/)
- [Docker Compose Installation](https://docs.docker.com/compose/install/)
- DeepSparse Server is installed via PyPI (`pip install deepsparse[server]`)

#### Code
The repository contains all the code you need:

```bash
.
├── client
│   ├── client.py           # simple client application for interacting with Server
│   ├── goldfish.jpeg       # photo of a goldfish
│   └── all_black.jpeg      # photo with just black pixels
├── server-config.yaml      # specifies the configuration of the DeepSparse server
├── custom-fn.py            # custom function used for the logging
├── docker                  # specifies the configuration of the containerized Prometheus/Grafana stack
│   ├── docker-compose.yaml
│   └── prometheus.yaml
└── grafana                 # specifies the design of the Grafana dashboard
    └── dashboard.json
```
## 1. Spin up the DeepSparse Server

Suggested change
## 1. Spin up the DeepSparse Server
## Step 1: Spin Up the DeepSparse Server


`server-config.yaml` specifies the config of the DeepSparse Server, including for logging:

Suggested change
`server-config.yaml` specifies the config of the DeepSparse Server, including for logging:
`server-config.yaml` specifies the configuration of the DeepSparse Server, including the following for logging:


```yaml
# server-config.yaml

loggers:
  prometheus:     # logs to prometheus on port 6100
    port: 6100

endpoints:
  - task: image_classification
    route: /image_classification/predict
    model: zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none
    data_logging:
      pipeline_inputs.images[0]: # applies to the first image (of the form target.property[idx])
        - func: fraction_zeros # built-in function
          frequency: 1
          target_loggers:
            - prometheus
        - func: custom-fn.py:mean_pixel_red # custom function
          frequency: 1
          target_loggers:
            - prometheus
```

The config file instructs the server to create an image classification pipeline. Prometheus logs are declared to be exposed on port `6100`, system logging is turned on, and we will log the mean pixel of the red channel (a custom function) as well as the percentage of pixels that are 0 (a built-in function) for each image sent to the server.

Suggested change
The config file instructs the server to create an image classification pipeline. Prometheus logs are declared to be exposed on port `6100`, system logging is turned on, and we will log the mean pixel of the red channel (a custom function) as well as the percentage of pixels that are 0 (a built-in function) for each image sent to the server.
The configuration file instructs the Server to create an image classification pipeline. Prometheus logs are declared to be exposed on port `6100`, system logging is turned on, and we will log the mean pixel of the red channel (a custom function) as well as the percentage of pixels that are 0 (a built-in function) for each image sent to the Server.
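
To make the two logged metrics concrete, here is a minimal sketch of what they compute on a small array. `mean_pixel_red` mirrors the function in `custom-fn.py`. `fraction_zeros_sketch` is a hypothetical stand-in for the built-in `fraction_zeros` (the real implementation may differ), and the channel-last `(height, width, channel)` layout is an assumption implied by the `img[:, :, 0]` indexing.

```python
import numpy as np


def mean_pixel_red(img: np.ndarray) -> float:
    # Same computation as custom-fn.py: average of channel 0
    # (the red channel, assuming RGB channel order).
    return float(np.mean(img[:, :, 0]))


def fraction_zeros_sketch(img: np.ndarray) -> float:
    # Hypothetical stand-in for the built-in `fraction_zeros`:
    # the share of array entries that are exactly 0.
    return float(np.count_nonzero(img == 0)) / img.size


# An all-black 2x2 RGB image, like all_black.jpeg in miniature.
black = np.zeros((2, 2, 3), dtype=np.uint8)

# A 2x2 image whose red channel is non-zero and whose other channels are 0.
reddish = np.zeros((2, 2, 3), dtype=np.uint8)
reddish[:, :, 0] = [[10, 20], [30, 40]]

print(fraction_zeros_sketch(black), mean_pixel_red(black))
print(fraction_zeros_sketch(reddish), mean_pixel_red(reddish))
```

An all-black input drives the first metric to 1.0 and the second to 0.0, which is the kind of shift the monitoring stack is meant to surface.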


Thus, once launched, the Server exposes two endpoints:
- port `6100`: exposes the `metrics` endpoint through the [Prometheus Python client](https://github.com/prometheus/client_python).
- port `5543`: exposes the endpoint for inference.

To spin up the Server execute:

Suggested change
To spin up the Server execute:
To spin up the Server, execute:

```
deepsparse.server --config_file server-config.yaml
```

To validate that metrics are being properly exposed, visit `localhost:6100`. It should contain logs in the specific format meant to be used by the PromQL query language.

Suggested change
To validate that metrics are being properly exposed, visit `localhost:6100`. It should contain logs in the specific format meant to be used by the PromQL query language.
To validate that metrics are being properly exposed, visit `localhost:6100`. It should contain logs in the specific format meant to be used by the Prometheus Query Language (PromQL).
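
If you would rather check the output from a script than a browser, the text served on that port can be parsed with a few lines of Python. This is a minimal sketch, not a full parser: it assumes each sample line is `name value` with no trailing timestamp, and the `demo_metric` names and values below are made up for illustration rather than copied from a real Server.

```python
def parse_metrics(text):
    """Parse simple `name value` samples from Prometheus text-format output."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and `# HELP` / `# TYPE` comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label sets containing spaces stay in the name.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Illustrative text only (hypothetical metric names and values).
SAMPLE = """\
# TYPE demo_metric summary
demo_metric_count 4.0
demo_metric_sum 1.5
"""

print(parse_metrics(SAMPLE))
```

To run it against the live Server, the text could be fetched with the standard library, for example `urllib.request.urlopen("http://localhost:6100/metrics").read().decode()`, and passed to `parse_metrics`.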


## 2. Setup Prometheus/Grafana Stack

Suggested change
## 2. Setup Prometheus/Grafana Stack
## Step 2: Set Up Prometheus/Grafana Stack


For simplicity, we have provided `docker-compose.yaml` that spins up the containerized Prometheus/Grafana stack. In that file, we instruct `prometheus.yaml` (a [Prometheus config file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/)) to be passed to the Prometheus container. Inside `prometheus.yaml`, the `scrape_config` has the information about the `metrics` endpoint exposed by the server on port `6100`.

Suggested change
For simplicity, we have provided `docker-compose.yaml` that spins up the containerized Prometheus/Grafana stack. In that file, we instruct `prometheus.yaml` (a [Prometheus config file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/)) to be passed to the Prometheus container. Inside `prometheus.yaml`, the `scrape_config` has the information about the `metrics` endpoint exposed by the server on port `6100`.
For simplicity, we have provided `docker-compose.yaml`, which spins up the containerized Prometheus/Grafana stack. In that file, we instruct `prometheus.yaml` (a [Prometheus configuration file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/)) to be passed to the Prometheus container. Inside `prometheus.yaml`, the `scrape_config` has the information about the `metrics` endpoint exposed by the Server on port `6100`.


Docker Compose File:

Suggested change
Docker Compose File:
#### Docker Compose File


```yaml
# docker-compose.yaml

version: "3"

services:
  prometheus:
    image: prom/prometheus
    extra_hosts:
      - "host.docker.internal:host-gateway" # allow a direct connection from container to the local machine
    ports:
      - "9090:9090" # the default port used by Prometheus
    volumes:
      - ${PWD}/prometheus.yaml:/etc/prometheus/prometheus.yml # mount Prometheus config file

  grafana:
    image: grafana/grafana:latest
    depends_on:
      - prometheus
    ports:
      - "3000:3000" # the default port used by Grafana

```

Prometheus Config file:

Suggested change
Prometheus Config file:
#### Prometheus Configuration File


```yaml
# prometheus.yaml

global:
  scrape_interval: 15s       # how often to scrape from endpoint
  evaluation_interval: 30s   # time between each evaluation of Prometheus' alerting rules

scrape_configs:
  - job_name: deepsparse_img_classification   # your project name
    static_configs:
      - targets:
          - 'host.docker.internal:6100'       # should match the port exposed by the PrometheusLogger in the DeepSparse Server config file
```
</details>

To start up a Prometheus stack to monitor the DeepSparse Server, run:

```bash
cd docker
docker-compose up
```

## 3. Launch the Python Client and Run Inference
robertgshaw2-neuralmagic marked this conversation as resolved.

Suggested change
## 3. Launch the Python Client and Run Inference
## Step 3: Launch the Python Client and Run Inference


`client.py` is a simple client that simulates the behavior of some application. In the example, we have two images:

Suggested change
`client.py` is a simple client that simulates the behavior of some application. In the example, we have two images:
`client.py` is a simple client that simulates the behavior of some applications. In the example, we have two images:

- `goldfish.jpeg`: sample photo of a Goldfish

Suggested change
- `goldfish.jpeg`: sample photo of a Goldfish
- `goldfish.jpeg`: sample photo of a goldfish

- `all_black.jpeg`: a photo that is all black (every pixel is a 0)

The client sends requests to the Server, initially with "just" the Goldfish. Over time, we start to randomly

Suggested change
The client sends requests to the Server, initially with "just" the Goldfish. Over time, we start to randomly
The client sends requests to the Server, initially with "just" the goldfish. Over time, we start to randomly

send the All Black image to the server with increasing probability. This simulates a data issue in the

Suggested change
send the All Black image to the server with increasing probability. This simulates a data issue in the
send the all-black image to the server with increasing probability. This simulates a data issue in the

pipeline that we can detect with the monitoring system.

Run the following to start inference:

```bash
python client/client.py
```

It prints out which image was sent to the server.

Suggested change
It prints out which image was sent to the server.
It prints out the image that was sent to the server.
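
To see how the probability of sending the all-black image evolves, the loop in `client.py` can be replayed without sending any requests or sleeping. The sketch below copies that loop's update rule into a generator; the name `probability_schedule` is ours, and the small arguments (`num_iters=2`, `prob_incr=0.5`) are chosen only to keep the output short, whereas the script's defaults are `25` and `0.1`.

```python
def probability_schedule(num_iters, prob_incr):
    """Yield the all-black probability used for each request, mirroring client.py's loop."""
    prob_img2 = 0.0
    iters = 0
    increasing = True

    while increasing or prob_img2 > 0.0:
        yield prob_img2  # client.py would send one image here

        # Every `num_iters` requests, step the probability up (or down once past the peak).
        if iters % num_iters == 0:
            prob_img2 += prob_incr if increasing else -prob_incr
        iters += 1

        # Once the probability reaches 1.0, back off and start decreasing.
        if prob_img2 >= 1.0:
            increasing = False
            prob_img2 -= prob_incr


schedule = list(probability_schedule(num_iters=2, prob_incr=0.5))
print(schedule)  # [0.0, 0.5, 0.5, 0.5, 0.5]
```

The probability ramps up, is pulled back as soon as it reaches 1.0, and then ramps down until it hits 0, at which point the client exits.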


## 4. Inspecting the Prometheus/Grafana Stack

Suggested change
## 4. Inspecting the Prometheus/Grafana Stack
## Step 4: Inspect the Prometheus/Grafana Stack


### Prometheus

#### Confirm It Is Working

Suggested change
#### Confirm It Is Working
#### Confirming Prometheus is Working


Visiting `http://localhost:9090/targets` should show that an endpoint `http://host.docker.internal:6100/metrics` is in state `UP`.

Suggested change
Visiting `http://localhost:9090/targets` should show that an endpoint `http://host.docker.internal:6100/metrics` is in state `UP`.
Visiting `http://localhost:9090/targets` should show that an endpoint `http://host.docker.internal:6100/metrics` is in the `UP` state.
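
The same check can be scripted against the Prometheus HTTP API, which reports target health at `GET /api/v1/targets`. The sketch below only inspects an already-decoded JSON payload; the helper name `targets_not_up` is ours, and the sample payload is hand-written and trimmed to the two fields the helper reads, not captured from a running stack.

```python
def targets_not_up(payload):
    """Return the scrape URLs of active targets whose health is not "up"."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [t.get("scrapeUrl") for t in active if t.get("health") != "up"]


# Hand-written example in the shape of an /api/v1/targets response (illustrative only).
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://host.docker.internal:6100/metrics", "health": "up"},
        ]
    },
}

print(targets_not_up(sample))
```

An empty list means every active target is being scraped successfully. The payload itself could be obtained with `json.load(urllib.request.urlopen("http://localhost:9090/api/v1/targets"))`.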


#### Query Prometheus with PromQL

Suggested change
#### Query Prometheus with PromQL
#### Querying Prometheus with PromQL


If you do not want to use Grafana, you can start off by using Prometheus's native graphing functionality.

Suggested change
If you do not want to use Grafana, you can start off by using Prometheus's native graphing functionality.
If you do not want to use Grafana, you can use Prometheus's native graphing functionality.



COMBINE THESE TWO PARAGRAPHS

Navigate to `http://localhost:9090/graph` and add the following `Expression`:

```
rate(image_classification__0__pipeline_inputs__images__fraction_zeros_sum[30s])
/
rate(image_classification__0__pipeline_inputs__images__fraction_zeros_count[30s])
```

You should see the following:

Suggested change
You should see the following:
You should see:


![prometheus-dashboard.png](image/prometheus-dashboard.png)

This graph shows the percentage of 0 pixels in the images sent to the server.

Suggested change
This graph shows the percentage of 0 pixels in the images sent to the server.
This graph shows the percentage of 0 pixels in the images sent to the Server.

As the "corrupted" all black images were sent to the server in increasing probability,

Suggested change
As the "corrupted" all black images were sent to the server in increasing probability,
As the "corrupted" all-black images were sent to the Server in increasing probability,

we can clearly see a spike in the graph, alerting us
that something strange is happening with the provided input.
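
The `rate(..._sum) / rate(..._count)` pattern in the expression is a windowed average: the growth of the running total divided by the growth of the observation count. The sketch below shows the arithmetic on two hypothetical scrapes. It is a simplification of what Prometheus does, since the real `rate` also handles counter resets and extrapolates to the window edges, and it returns `None` where Prometheus would produce `NaN`.

```python
def windowed_average(sum_start, sum_end, count_start, count_end, window_seconds):
    """Average value per observation over a window, from two scrapes of a _sum/_count pair."""
    rate_sum = (sum_end - sum_start) / window_seconds        # increase of the total per second
    rate_count = (count_end - count_start) / window_seconds  # observations per second
    if rate_count == 0:
        return None  # no observations arrived in the window
    return rate_sum / rate_count


# Hypothetical scrapes 30 seconds apart: 30 new images, fraction_zeros total grew by 15.
print(windowed_average(sum_start=10.0, sum_end=25.0,
                       count_start=20, count_end=50,
                       window_seconds=30))  # 0.5
```

In this made-up window, the average fraction of zero pixels per image is 0.5, i.e. roughly half of the traffic looked like the all-black image.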

DeepSparse Server also automatically logs prediction latencies for each pipeline stage as well as the
end-to-end server-side inference time. Add the following query to inspect average latency:

```
rate(image_classification__0__prediction_latency__total_inference_sum[30s])
/
rate(image_classification__0__prediction_latency__total_inference_count[30s])
```

![prometheus-dashboard-latency.png](image/prometheus-dashboard-latency.png)
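
The same expression can also be evaluated outside the UI through the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The sketch below only builds the request URL and does not send it; `instant_query_url` is a helper name of ours, and the base URL assumes the port published in `docker-compose.yaml`.

```python
from urllib.parse import urlencode

PROMETHEUS_URL = "http://localhost:9090"  # assumes the port mapping from docker-compose.yaml

LATENCY_EXPR = (
    "rate(image_classification__0__prediction_latency__total_inference_sum[30s])"
    " / "
    "rate(image_classification__0__prediction_latency__total_inference_count[30s])"
)


def instant_query_url(base_url, expression):
    """Build the URL for a Prometheus instant query, URL-encoding the PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expression})}"


url = instant_query_url(PROMETHEUS_URL, LATENCY_EXPR)
print(url)
```

Opening the printed URL in a browser, or fetching it with `urllib.request.urlopen`, should return the query result as JSON while the stack is running.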

For more details on working with the Prometheus Query Language PromQL,

Suggested change
For more details on working with the Prometheus Query Language PromQL,

see [the official docs](https://prometheus.io/docs/prometheus/latest/querying/basics/).

Suggested change
see [the official docs](https://prometheus.io/docs/prometheus/latest/querying/basics/).
See [Querying Prometheus](https://prometheus.io/docs/prometheus/latest/querying/basics/) for more details on working with PromQL.


### Grafana

#### Login

Suggested change
#### Login
#### Logging In


Visit `localhost:3000` to launch Grafana. Log in with the default username (`admin`) and password (`admin`).

#### Add Prometheus Data Source

Suggested change
#### Add Prometheus Data Source
#### Adding Prometheus Data Source

Setup the Prometheus data source (`Add your first data source` -> `Prometheus`). On this page, we just

Suggested change
Setup the Prometheus data source (`Add your first data source` -> `Prometheus`). On this page, we just
Set up the Prometheus data source (`Add your first data source` -> `Prometheus`). On this page, we just

need to update the `url` section. Since Grafana and Prometheus are running in separate Docker containers,
put we need to put the IP address of the Prometheus container.

Suggested change
put we need to put the IP address of the Prometheus container.
we need to include the IP address of the Prometheus container.


Run the following to lookup the `name` of your Prometheus container:

Suggested change
Run the following to lookup the `name` of your Prometheus container:
Run the following to look up the `name` of your Prometheus container:

```
docker container ls
>>> CONTAINER ID   IMAGE                    COMMAND                  CREATED             STATUS             PORTS                                       NAMES
>>> 997521854d84   grafana/grafana:latest   "/run.sh"                About an hour ago   Up About an hour   0.0.0.0:3000->3000/tcp, :::3000->3000/tcp   docker_grafana_1
>>> c611c80ae05e   prom/prometheus          "/bin/prometheus --c…"   About an hour ago   Up About an hour   0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   docker_prometheus_1
```

Run the following to lookup the IP address (replace `docker_prometheus_1` with your container's name):

Suggested change
Run the following to lookup the IP address (replace `docker_prometheus_1` with your container's name):
Run the following to look up the IP address (replace `docker_prometheus_1` with your container's name):

```
docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' docker_prometheus_1
>>> 172.18.0.2
```

So, in our case, the `url` section should be: `http://172.18.0.2:9090`.

Suggested change
So, in our case, the `url` section should be: `http://172.18.0.2:9090`.
So, in our case, the `url` section should be `http://172.18.0.2:9090`.


Click `Save & Test`. We should get a green check saying "Data Source Is Working".

Suggested change
Click `Save & Test`. We should get a green check saying "Data Source Is Working".
Click `Save & Test`. We should get a green check indicating, "Data Source Is Working."


#### Import A Dashboard

Suggested change
#### Import A Dashboard
#### Importing a Dashboard


Now you should be ready to create/import your dashboard.

Grafana's interface for adding metrics is very intuitive (and you can use PromQL),

Suggested change
Grafana's interface for adding metrics is very intuitive (and you can use PromQL),
Although Grafana's interface for adding metrics is very intuitive (and you can use PromQL),

but we have provided a simple pre-made dashboard for this use case.

Suggested change
but we have provided a simple pre-made dashboard for this use case.
we have provided a simple pre-made dashboard for this use case.


Click `Dashboard` -> `Import` on the left-hand side bar. You should see an option to upload a file.

Suggested change
Click `Dashboard` -> `Import` on the left-hand side bar. You should see an option to upload a file.
Click `Dashboard` -> `Import` on the left-hand sidebar. You should see an option to upload a file.

Upload `grafana/dashboard.json` and save. Then, you should see the following dashboard:

![img.png](image/grafana-dashboard.png)
Binary file added examples/logging-prometheus/client/all_black.jpeg
43 changes: 43 additions & 0 deletions examples/logging-prometheus/client/client.py
@@ -0,0 +1,43 @@
import random, time, requests, argparse

parser = argparse.ArgumentParser()
parser.add_argument("--url", type=str, default="http://0.0.0.0:5543/image_classification/predict/from_files")
parser.add_argument("--img1_path", type=str, default="client/goldfish.jpeg")
parser.add_argument("--img2_path", type=str, default="client/all_black.jpeg")
parser.add_argument("--num_iters", type=int, default=25)
parser.add_argument("--prob_incr", type=float, default=0.1)

def send_random_img(url, img1_path, img2_path, prob_img2):
    img_path = ""
    if random.uniform(0, 1) < prob_img2:
        img_path = img2_path
    else:
        img_path = img1_path

    files = [('request', open(img_path, 'rb'))]
    resp = requests.post(url=url, files=files)
    print(f"Sent File: {img_path}")

def main(url, img1_path, img2_path, num_iters, prob_incr):
    prob_img2 = 0.0
    iters = 0
    increasing = True

    while (increasing or prob_img2 > 0.0):
        send_random_img(url, img1_path, img2_path, prob_img2)

        if iters % num_iters == 0 and increasing:
            prob_img2 += prob_incr
        elif iters % num_iters == 0:
            prob_img2 -= prob_incr
        iters += 1

        if prob_img2 >= 1.0:
            increasing = False
            prob_img2 -= prob_incr

        time.sleep(0.25)

if __name__ == "__main__":
    args = vars(parser.parse_args())
    main(args["url"], args["img1_path"], args["img2_path"], args["num_iters"], args["prob_incr"])
Binary file added examples/logging-prometheus/client/goldfish.jpeg
5 changes: 5 additions & 0 deletions examples/logging-prometheus/custom-fn.py
@@ -0,0 +1,5 @@
import numpy as np
from typing import List

def mean_pixel_red(img: np.ndarray):
robertgshaw2-neuralmagic marked this conversation as resolved.
    return np.mean(img[:,:,0])
20 changes: 20 additions & 0 deletions examples/logging-prometheus/docker/docker-compose.yaml
@@ -0,0 +1,20 @@
# docker-compose.yaml

version: "3"

services:
  prometheus:
    image: prom/prometheus
    extra_hosts:
      - "host.docker.internal:host-gateway" # allow a direct connection from container to the local machine
    ports:
      - "9090:9090" # the default port used by Prometheus
    volumes:
      - ${PWD}/prometheus.yaml:/etc/prometheus/prometheus.yml # mount Prometheus config file

  grafana:
    image: grafana/grafana:latest
    depends_on:
      - prometheus
    ports:
      - "3000:3000" # the default port used by Grafana
11 changes: 11 additions & 0 deletions examples/logging-prometheus/docker/prometheus.yaml
@@ -0,0 +1,11 @@
# prometheus.yaml

global:
  scrape_interval: 15s       # how often to scrape from endpoint
  evaluation_interval: 30s   # time between each evaluation of Prometheus' alerting rules

scrape_configs:
  - job_name: deepsparse_img_classification   # your project name
    static_configs:
      - targets:
          - 'host.docker.internal:6100'       # should match the port exposed by the PrometheusLogger in the DeepSparse Server config file