
Add shared memory support to python client #57

Merged: 1 commit into isarsoft:master on Jan 11, 2022

Conversation

@t-wata commented on Nov 19, 2021

Thanks to this repo, I was able to deploy YOLOv4 on Triton Inference Server very easily.

I've implemented shared memory support for the python client; please merge it if you'd like.

Of course, you can only use it if the client runs on the same host as the Triton Inference Server. The --ipc="host" option is also required when starting the Triton Inference Server with docker run, but it is already included in the command listed in README.md.

Despite these limitations, a cursory measurement in my environment showed that inference time improved by about 20 ms.
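
For reference, the system shared-memory flow in tritonclient looks roughly like this (a sketch following Triton's shared-memory client example rather than the exact patch; input_array, width, and height are placeholder names, while the region names match the status output below):

import tritonclient.grpc as grpcclient
import tritonclient.utils.shared_memory as shm

triton_client = grpcclient.InferenceServerClient(url="localhost:8001")
triton_client.unregister_system_shared_memory()  # drop any stale registrations

# Allocate a POSIX shared-memory region and copy the preprocessed image into it.
input_byte_size = input_array.nbytes  # input_array: preprocessed FP32 tensor
shm_ip_handle = shm.create_shared_memory_region("input_data", "/input_simple", input_byte_size)
shm.set_shared_memory_region(shm_ip_handle, [input_array])
triton_client.register_system_shared_memory("input_data", "/input_simple", input_byte_size)

# Point the input at the region instead of serializing the tensor over gRPC.
infer_input = grpcclient.InferInput("input", [1, 3, width, height], "FP32")
infer_input.set_shared_memory("input_data", input_byte_size)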

### For debug ###

$ git diff
diff --git a/clients/python/client.py b/clients/python/client.py
index 3c6cced..8b9714c 100644
--- a/clients/python/client.py
+++ b/clients/python/client.py
@@ -235,6 +235,8 @@ if __name__ == '__main__':
             print("FAILED: no input image")
             sys.exit(1)

+        debug_start_time = time.perf_counter_ns()
+
         inputs = []
         outputs = []
         inputs.append(grpcclient.InferInput('input', [1, 3, FLAGS.width, FLAGS.height], "FP32"))
@@ -291,6 +293,10 @@ if __name__ == '__main__':
         detected_objects = postprocess(result, input_image.shape[1], input_image.shape[0], [FLAGS.width, FLAGS.height], FLAGS.confidence, FLAGS.nms)
         print(f"Detected objects: {len(detected_objects)}")

+        # Display Inference processing time
+        debug_infer_time = time.perf_counter_ns() - debug_start_time
+        print('Inference processing time: {} ms'.format(debug_infer_time / 1000000))
+
         for box in detected_objects:
             print(f"{COCOLabels(box.classID).name}: {box.confidence}")
             input_image = render_box(input_image, box.box(), color=tuple(RAND_COLORS[box.classID % 64].tolist()))
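
(Note: time.perf_counter_ns requires Python 3.7 or later, and this patch assumes time is already imported in client.py.)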

### Test without shared memory ###

$ python client.py -o data/dog_inferred.jpg image data/dog.jpg
Running in 'image' mode
Creating buffer from image file...
Invoking inference...
Done
Received result buffer of size (1, 159201, 1, 1)
Naive buffer sum: 565762.875
Detected objects: 3
Inference processing time: 53.121086 ms  # <- BEFORE
DOG: 0.9786704182624817
BICYCLE: 0.9221425652503967
TRUCK: 0.9161325097084045
Saved result to data/dog_inferred.jpg

### Test with shared memory ###

$ python client.py --shm -o data/dog_inferred.jpg image data/dog.jpg
Running in 'image' mode
shared_memory_status: regions {
  key: "input_data"
  value {
    name: "input_data"
    key: "/input_simple"
    byte_size: 4435968
  }
}
regions {
  key: "output_data"
  value {
    name: "output_data"
    key: "/output_simple"
    byte_size: 4435968
  }
}

Creating buffer from image file...
Invoking inference...
Done
Detected objects: 3
Inference processing time: 33.28672 ms   # <- AFTER
DOG: 0.9786704182624817
BICYCLE: 0.9221425652503967
TRUCK: 0.9161325097084045
Saved result to data/dog_inferred.jpg
Cleanup shared memory...
shared_memory_status:
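
The output side and the cleanup step above map to roughly these calls (again a sketch; the model name "yolov4" and output name "detections" are assumptions based on this repo, and shm_op_handle/output_byte_size are illustrative):

from tritonclient.utils import triton_to_np_dtype

# Ask the server to write the output tensor into the registered region.
infer_output = grpcclient.InferRequestedOutput("detections")  # assumed output name
infer_output.set_shared_memory("output_data", output_byte_size)

results = triton_client.infer(model_name="yolov4", inputs=[infer_input], outputs=[infer_output])

# Read the result directly out of shared memory instead of the gRPC response body.
out = results.get_output("detections")
result = shm.get_contents_as_numpy(shm_op_handle, triton_to_np_dtype(out.datatype), out.shape)

# Cleanup: unregister the regions on the server, then free them locally.
triton_client.unregister_system_shared_memory()
shm.destroy_shared_memory_region(shm_ip_handle)
shm.destroy_shared_memory_region(shm_op_handle)
print(triton_client.get_system_shared_memory_status())  # empty once cleanup succeeds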

There is a related issue #31, but it was closed without implementation.

@philipp-schmidt (Contributor)

Looks good, thanks for the contribution!

@philipp-schmidt merged commit 86eeb5f into isarsoft:master on Jan 11, 2022
@philipp-schmidt (Contributor)

Now in the list of contributions
