Memory leak when reloading model config

## Bug Report
Memory leak when reloading model config

### System information
OS Platform and Distribution: ubuntu:18.04
TensorFlow Serving installed from: binary
TensorFlow Serving version: 2.1.0
Bug produced using TFS docker image: tensorflow/serving:2.1.0-gpu

### Describe the Problem
Using the gRPC model management endpoint to load and unload models (the `HandleReloadConfigRequest` RPC, which takes a `ReloadConfigRequest`), we loaded 22 copies of the same model, each 208MiB in size, and then unloaded them.

With all models loaded, `docker stats` showed ~10GiB of memory in use. We expected usage to return close to the baseline once all models were unloaded.

After unloading them, however, usage was still 8.153GiB. We have made no changes to the TFS code.

### Exact Steps to Reproduce

1. Pull the Docker image: `sudo docker pull tensorflow/serving:2.1.0-gpu`
2. Run the Docker image: `sudo docker run -it --rm -v "/local/models:/models" -e MODEL_NAME=model_name tensorflow/serving:2.1.0-gpu`
3. In a separate window, set up an environment with tensorflow_serving_api==2.1.0 (binary)
4. Add the Python client-side gRPC code (shown below) to tensorflow_serving
5. Load 22 different copies of the same model using the Python client script
6. Record memory usage
7. Unload all models
8. Record memory usage
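
The two "Record memory usage" steps could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used for the numbers in this report: it assumes the `docker` CLI is on the PATH, and the container name passed in is hypothetical. It reads the `MemUsage` field of `docker stats` and converts the "used" part to bytes.

```python
import re
import subprocess

# Binary unit suffixes as printed in the MemUsage column of `docker stats`.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_mem_usage(field):
    """Return used bytes from a MemUsage field such as '8.153GiB / 31.3GiB'."""
    used = field.split("/")[0].strip()
    match = re.fullmatch(r"([0-9.]+)\s*(B|KiB|MiB|GiB|TiB)", used)
    if match is None:
        raise ValueError("unrecognized memory value: %r" % used)
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])


def container_mem_bytes(container):
    """Take one docker stats sample for `container` (name or ID; hypothetical here)."""
    out = subprocess.check_output(
        ["docker", "stats", "--no-stream", "--format", "{{.MemUsage}}", container],
        text=True,
    )
    return parse_mem_usage(out.strip())
```

Calling `container_mem_bytes(...)` once before loading, once after loading, and once after unloading gives three readings that can be compared directly.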

### Source code / logs
Server side logs
![server_side_logs](https://user-images.githubusercontent.com/9426164/84068030-3bf2de80-a97d-11ea-878a-6b6c360d1d0e.png)

gRPC client-side code
```
import grpc
from tensorflow_serving.apis import model_management_pb2
from tensorflow_serving.apis import model_service_pb2_grpc
from tensorflow_serving.config import model_server_config_pb2

server_address = "0.0.0.0:1234"  # Replace with the address of your server


def handle_reload_config_request(stub):
    # As written, config_list is empty, so the request names no models.
    # To load models, add ModelConfig entries to config_list.config first.
    model_server_config = model_server_config_pb2.ModelServerConfig()
    request = model_management_pb2.ReloadConfigRequest()
    config_list = model_server_config_pb2.ModelConfigList()

    model_server_config.model_config_list.CopyFrom(config_list)
    request.config.CopyFrom(model_server_config)

    response = stub.HandleReloadConfigRequest(request)

    print("Response: %s" % response)


def run():
    # The channel is closed automatically when the with-block exits.
    with grpc.insecure_channel(server_address) as channel:
        stub = model_service_pb2_grpc.ModelServiceStub(channel)
        print("-------------Handle Reload Config Request--------------")
        handle_reload_config_request(stub)


if __name__ == '__main__':
    run()
```
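
The script above sends an empty model list. For the load step, the config needs one entry per model copy. The following is a minimal sketch of generating that config in protobuf text format (the same format used by TF Serving model config files). The `model_copy_N` names and the directory layout under the base path are assumptions for illustration, not the names used in our setup.

```python
def build_model_config_text(base_path, count, prefix="model_copy_"):
    """Build a ModelServerConfig in protobuf text format with `count` entries.

    Assumes (hypothetically) that copy i lives at <base_path>/<prefix><i>.
    """
    entries = []
    for i in range(count):
        entries.append(
            "  config {\n"
            '    name: "%s%d"\n'
            '    base_path: "%s/%s%d"\n'
            '    model_platform: "tensorflow"\n'
            "  }\n" % (prefix, i, base_path, prefix, i)
        )
    return "model_config_list {\n%s}\n" % "".join(entries)


# 22 copies, matching the reproduction steps.
config_text = build_model_config_text("/models", 22)
```

The resulting text can be parsed into the request with `google.protobuf.text_format.Parse(config_text, model_server_config_pb2.ModelServerConfig())` and copied into `request.config` in place of the empty config.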


Memory leak when reloading model config #1664
