Unexpectedly high CPU utilization of Docker container; even without models. #1730

@ericmclachlan

Description

System Information

  • Host OS: Microsoft Windows 10 Pro (Version 10.0.19041 Build 19041)
  • TensorFlow Serving installed from Docker: binary copied from tensorflow/serving:latest-devel into an ubuntu:18.04 image
  • Docker Desktop: 2.3.0.4 (46911) (using Linux containers)

The Problem

TensorFlow Serving's Docker container uses significantly more CPU than expected or desirable.

Evaluation

docker stats shows the following:

CONTAINER ID        NAME                                            CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
0e5fca607055        linguistic-intelligence_tensorflow-servings_1   14.75%              24.26MiB / 12.43GiB   0.19%               1.19kB / 0B         0B / 0B             29

Note that the container is reported as using 14.75% CPU. This is after running for 15 minutes.

And here's the issue: this is with zero models loaded. In this state, sustained CPU utilization is around 6-17%. (With models loaded, sustained CPU utilization is around 25%.)
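To put numbers on "sustained" utilization rather than relying on a single `docker stats` snapshot, something like the following could be used. This is only a minimal sketch: the helper names (`parse_cpu_percent`, `summarize`, `sample_cpu`) are made up for illustration, and the sample count and interval are arbitrary assumptions.

```python
import re
import statistics
import subprocess
import time


def parse_cpu_percent(text):
    """Return the first 'NN.NN%' value found in text as a float, or None."""
    match = re.search(r"(\d+(?:\.\d+)?)%", text)
    return float(match.group(1)) if match else None


def summarize(samples):
    """Summarize a list of CPU percentages as min / max / mean."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
    }


def sample_cpu(container, count=30, interval_s=2.0):
    """Poll `docker stats` `count` times and return the CPU percentages.

    Hypothetical helper: requires the docker CLI on PATH and a running
    container with the given name or ID.
    """
    samples = []
    for _ in range(count):
        result = subprocess.run(
            ["docker", "stats", "--no-stream", "--format", "{{.CPUPerc}}", container],
            capture_output=True,
            text=True,
            check=True,
        )
        value = parse_cpu_percent(result.stdout)
        if value is not None:
            samples.append(value)
        time.sleep(interval_s)
    return samples
```

Calling `summarize(sample_cpu("linguistic-intelligence_tensorflow-servings_1"))` would then give a range comparable to the 6-17% quoted above. Note that `sample_cpu` is not invoked by the block itself, so nothing touches Docker until it is called explicitly.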

Expected Behaviour:

Since TensorFlow Serving has been given no work to do, CPU utilization should be minimal.

Exact Steps to Reproduce

Running the Docker Container:

docker-compose up tensorflow-servings

docker-compose.yaml:

  tensorflow-servings:
    build:
      context: .
      dockerfile: tensorflow-servings.dockerfile
    image: tensorflow-servings
    ports:
      - 8500:8500
      - 8501:8501
    volumes:
      - ./ml-models/:/models/:ro
      - ./ml-servings-config/:/config/:ro
    deploy:
      resources:
        limits:
          memory: 4Gb
    environment:
      - MODEL_CONFIG_FILE_POLL_WAIT_SECONDS=0

tensorflow-servings.dockerfile:

ARG TF_SERVING_VERSION=latest
ARG TF_SERVING_BUILD_IMAGE=tensorflow/serving:${TF_SERVING_VERSION}-devel

FROM ${TF_SERVING_BUILD_IMAGE} as build_image
FROM ubuntu:18.04

ARG TF_SERVING_VERSION_GIT_BRANCH=master
ARG TF_SERVING_VERSION_GIT_COMMIT=head

LABEL tensorflow_serving_github_branchtag=${TF_SERVING_VERSION_GIT_BRANCH}
LABEL tensorflow_serving_github_commit=${TF_SERVING_VERSION_GIT_COMMIT}

RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
        && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install TF Serving pkg
COPY --from=build_image /usr/local/bin/tensorflow_model_server /usr/bin/tensorflow_model_server

# PORTS:
# gRPC
EXPOSE 8500
# REST
EXPOSE 8501

# Set where models should be stored in the container
ENV MODEL_BASE_PATH=/models/

# Embed the models
COPY ./ml-models /models
COPY ./ml-servings-config /config

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n\
tensorflow_model_server \
--allow_version_labels_for_unavailable_models=true \
--batching_parameters_file=/config/batching_parameters.txt \
--enable_batching=true \
--model_base_path=${MODEL_BASE_PATH} \
--model_config_file=/config/all_models.config \
--model_config_file_poll_wait_seconds=60 \
--monitoring_config_file=/config/monitoring_config.txt \
--port=8500 \
--rest_api_port=8501 \
--rest_api_timeout_in_ms=30000 \
"$@"' > /usr/bin/tf_serving_entrypoint.sh \
&& chmod +x /usr/bin/tf_serving_entrypoint.sh

ENTRYPOINT ["/usr/bin/tf_serving_entrypoint.sh"]

all_models.config:

model_config_list {

}

Logs: When starting the container:

$ docker-compose up tensorflow-servings
WARNING: Some services (natural-language-server, tensorflow-servings) use the 'deploy' key, which will be ignored. Compose does not support 'deploy' configuration - use `docker stack deploy` to deploy to a swarm.
Creating linguistic-intelligence_tensorflow-servings_1 ... done
Attaching to linguistic-intelligence_tensorflow-servings_1
tensorflow-servings_1      | 2020-09-07 09:27:05.197858: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:27:05.204117: I tensorflow_serving/model_servers/server.cc:367] Running gRPC ModelServer at 0.0.0.0:8500 ...
tensorflow-servings_1      | 2020-09-07 09:27:05.207537: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | [warn] getaddrinfo: address family for nodename not supported
tensorflow-servings_1      | [evhttp_server.cc : 238] NET_LOG: Entering the event loop ...
tensorflow-servings_1      | 2020-09-07 09:27:05.215164: I tensorflow_serving/model_servers/server.cc:387] Exporting HTTP/REST API at:localhost:8501 ...
tensorflow-servings_1      | 2020-09-07 09:28:05.205505: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:29:05.206871: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:30:05.206192: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:31:05.206607: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:32:05.207374: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:33:05.207560: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:34:05.207728: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:35:05.206796: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:36:05.213456: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:37:05.207654: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:38:05.208601: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:39:05.208775: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:40:05.209190: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:41:05.209411: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:42:05.210525: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:43:05.210102: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:44:05.210034: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
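One thing the logs can be used to check is which config-poll interval is actually in effect: docker-compose.yaml sets MODEL_CONFIG_FILE_POLL_WAIT_SECONDS=0, but the entrypoint script hardcodes --model_config_file_poll_wait_seconds=60 and never references that variable. The sketch below (the `poll_intervals` helper and `LOG_LINE` pattern are illustrative names, not part of TensorFlow Serving) extracts the spacing between "Adding/updating models." lines.

```python
import re
from datetime import datetime

# Matches the timestamp of each "Adding/updating models." line emitted by
# server_core.cc, regardless of the docker-compose service prefix.
LOG_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}): I "
    r".*server_core\.cc:\d+\] Adding/updating models\."
)


def poll_intervals(log_text):
    """Return the gaps, in seconds, between consecutive config-poll log lines."""
    stamps = [
        datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        for m in LOG_LINE.finditer(log_text)
    ]
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


# Three consecutive lines copied from the log output above.
SAMPLE = """\
tensorflow-servings_1      | 2020-09-07 09:28:05.205505: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:29:05.206871: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1      | 2020-09-07 09:30:05.206192: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
"""

intervals = poll_intervals(SAMPLE)
```

On the lines above the gaps come out at roughly 60 seconds, which matches the hardcoded flag rather than the environment variable. That suggests config polling runs only once a minute, so on its own it seems unlikely to account for continuous CPU use between polls, though this is an inference from the logs and not a confirmed diagnosis.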

batching_parameters.txt:

max_batch_size { value: 64 }
batch_timeout_micros { value: 0 }
num_batch_threads { value: 4 }
pad_variable_length_inputs: true

Thanks for your attention.

Labels: stale, stat:awaiting response, type:performance
