
Add shared memory support to python client #57

Merged: 1 commit into isarsoft:master on Jan 11, 2022

Conversation

@t-wata commented on Nov 19, 2021

Thanks to this repo, I was able to deploy YOLOv4 on Triton Inference Server very easily.

I've implemented shared memory support for the python client; please merge it if you'd like.

Of course, you can only use it if the client runs on the same host as the Triton Inference Server. The --ipc="host" option is also required when starting the Triton Inference Server with docker run, but it is already included in the command listed in README.md.

Despite these limitations, a cursory measurement in my environment showed that inference time improved by about 20 ms.
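
For reference, the system shared-memory flow in tritonclient looks roughly like this (a sketch following Triton's shared-memory client example rather than the exact patch; input_array, width, and height are placeholder names, while the region names match the status output below):

import tritonclient.grpc as grpcclient
import tritonclient.utils.shared_memory as shm

triton_client = grpcclient.InferenceServerClient(url="localhost:8001")
triton_client.unregister_system_shared_memory()  # drop any stale registrations

# Allocate a POSIX shared-memory region and copy the preprocessed image into it.
input_byte_size = input_array.nbytes  # input_array: preprocessed FP32 tensor
shm_ip_handle = shm.create_shared_memory_region("input_data", "/input_simple", input_byte_size)
shm.set_shared_memory_region(shm_ip_handle, [input_array])
triton_client.register_system_shared_memory("input_data", "/input_simple", input_byte_size)

# Point the input at the region instead of serializing the tensor over gRPC.
infer_input = grpcclient.InferInput("input", [1, 3, width, height], "FP32")
infer_input.set_shared_memory("input_data", input_byte_size)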

### For debug ###

$ git diff
diff --git a/clients/python/client.py b/clients/python/client.py
index 3c6cced..8b9714c 100644
--- a/clients/python/client.py
+++ b/clients/python/client.py
@@ -235,6 +235,8 @@ if __name__ == '__main__':
             print("FAILED: no input image")
             sys.exit(1)

+        debug_start_time = time.perf_counter_ns()
+
         inputs = []
         outputs = []
         inputs.append(grpcclient.InferInput('input', [1, 3, FLAGS.width, FLAGS.height], "FP32"))
@@ -291,6 +293,10 @@ if __name__ == '__main__':
         detected_objects = postprocess(result, input_image.shape[1], input_image.shape[0], [FLAGS.width, FLAGS.height], FLAGS.confidence, FLAGS.nms)
         print(f"Detected objects: {len(detected_objects)}")

+        # Display Inference processing time
+        debug_infer_time = time.perf_counter_ns() - debug_start_time
+        print('Inference processing time: {} ms'.format(debug_infer_time / 1000000))
+
         for box in detected_objects:
             print(f"{COCOLabels(box.classID).name}: {box.confidence}")
             input_image = render_box(input_image, box.box(), color=tuple(RAND_COLORS[box.classID % 64].tolist()))
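
(Note: time.perf_counter_ns requires Python 3.7 or later, and this patch assumes time is already imported in client.py.)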

### Test without shared memory ###

$ python client.py -o data/dog_inferred.jpg image data/dog.jpg
Running in 'image' mode
Creating buffer from image file...
Invoking inference...
Done
Received result buffer of size (1, 159201, 1, 1)
Naive buffer sum: 565762.875
Detected objects: 3
Inference processing time: 53.121086 ms  # <- BEFORE
DOG: 0.9786704182624817
BICYCLE: 0.9221425652503967
TRUCK: 0.9161325097084045
Saved result to data/dog_inferred.jpg

### Test with shared memory ###

$ python client.py --shm -o data/dog_inferred.jpg image data/dog.jpg
Running in 'image' mode
shared_memory_status: regions {
  key: "input_data"
  value {
    name: "input_data"
    key: "/input_simple"
    byte_size: 4435968
  }
}
regions {
  key: "output_data"
  value {
    name: "output_data"
    key: "/output_simple"
    byte_size: 4435968
  }
}

Creating buffer from image file...
Invoking inference...
Done
Detected objects: 3
Inference processing time: 33.28672 ms   # <- AFTER
DOG: 0.9786704182624817
BICYCLE: 0.9221425652503967
TRUCK: 0.9161325097084045
Saved result to data/dog_inferred.jpg
Cleanup shared memory...
shared_memory_status:
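
The output side and the cleanup step above map to roughly these calls (again a sketch; the model name "yolov4" and output name "detections" are assumptions based on this repo, and shm_op_handle/output_byte_size are illustrative):

from tritonclient.utils import triton_to_np_dtype

# Ask the server to write the output tensor into the registered region.
infer_output = grpcclient.InferRequestedOutput("detections")  # assumed output name
infer_output.set_shared_memory("output_data", output_byte_size)

results = triton_client.infer(model_name="yolov4", inputs=[infer_input], outputs=[infer_output])

# Read the result directly out of shared memory instead of the gRPC response body.
out = results.get_output("detections")
result = shm.get_contents_as_numpy(shm_op_handle, triton_to_np_dtype(out.datatype), out.shape)

# Cleanup: unregister the regions on the server, then free them locally.
triton_client.unregister_system_shared_memory()
shm.destroy_shared_memory_region(shm_ip_handle)
shm.destroy_shared_memory_region(shm_op_handle)
print(triton_client.get_system_shared_memory_status())  # empty once cleanup succeeds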

There is a related issue #31, but it was closed without implementation.

@philipp-schmidt (Contributor)

Looks good, thanks for the contribution!

@philipp-schmidt merged commit 86eeb5f into isarsoft:master on Jan 11, 2022
@philipp-schmidt (Contributor)

Now in the list of contributions
