
ultralytics 8.1.46 add TensorRT 10 support #9516

Merged · 41 commits · Apr 10, 2024
Changes from 9 commits
0995ecd
add support for TensorRT 10
Burhan-Q Apr 2, 2024
4a4c68f
add TensorRT10 compatibility
Burhan-Q Apr 2, 2024
079aa43
Auto-format by https://ultralytics.com/actions
UltralyticsAssistant Apr 2, 2024
6850472
Merge branch 'main' into trt10
glenn-jocher Apr 3, 2024
70c2f6b
Merge branch 'main' into trt10
glenn-jocher Apr 6, 2024
bc21f77
Merge branch 'main' into trt10
glenn-jocher Apr 6, 2024
02c4b99
Merge branch 'main' into trt10
glenn-jocher Apr 6, 2024
94ea376
Merge branch 'main' into trt10
glenn-jocher Apr 7, 2024
8418063
Update exporter.py
glenn-jocher Apr 7, 2024
32b2445
Updates
glenn-jocher Apr 7, 2024
1f553c2
Add support to exporting or inference with TensorRT 10.0.0b6, and fix…
ZouJiu1 Apr 7, 2024
d0a088d
Update __init__.py
glenn-jocher Apr 7, 2024
a090b75
Merge branch 'main' into trt10
glenn-jocher Apr 7, 2024
4c40158
Merge branch 'main' into trt10
glenn-jocher Apr 8, 2024
574a3a5
Removed changes from 1f553c2 due to errors, now working and fixes iss…
Burhan-Q Apr 8, 2024
0964835
Auto-format by https://ultralytics.com/actions
UltralyticsAssistant Apr 8, 2024
905fd05
ensure max shape dimension scales no smaller than 1
Burhan-Q Apr 8, 2024
139c2da
Merge branch 'trt10' of https://github.com/ultralytics/ultralytics in…
Burhan-Q Apr 8, 2024
c1c0046
Merge branch 'main' into trt10
glenn-jocher Apr 8, 2024
0a9a92f
Merge branch 'main' into trt10
glenn-jocher Apr 9, 2024
4e81132
Updates
glenn-jocher Apr 9, 2024
a38ba95
Merge remote-tracking branch 'origin/trt10' into trt10
glenn-jocher Apr 9, 2024
0cb3053
Merge branch 'main' into trt10
glenn-jocher Apr 9, 2024
2c03f83
Refactor constants out of for loop for speed
glenn-jocher Apr 9, 2024
d371949
Update exporter.py
glenn-jocher Apr 9, 2024
5a5e61f
refactored to use unified methods
Burhan-Q Apr 9, 2024
df6a3ff
Merge branch 'trt10' of https://github.com/ultralytics/ultralytics in…
Burhan-Q Apr 9, 2024
050b278
remove redundant line
Burhan-Q Apr 9, 2024
cd18b7b
remove self. as assigned automatically
glenn-jocher Apr 9, 2024
3017bad
Add TensorRT export test
glenn-jocher Apr 9, 2024
0e694cb
Merge remote-tracking branch 'origin/trt10' into trt10
glenn-jocher Apr 9, 2024
d263946
Update test_cuda.py
glenn-jocher Apr 9, 2024
e221ff5
refactored to include older interface for `tensorrt<8.6`
Burhan-Q Apr 9, 2024
3bab6b4
Merge branch 'trt10' of https://github.com/ultralytics/ultralytics in…
Burhan-Q Apr 9, 2024
529293f
refactor to include path for installs with `tensorrt<8.6`
Burhan-Q Apr 9, 2024
117a011
fix dynamic where self. is needed, align else, rearrange flow
Burhan-Q Apr 9, 2024
1f49b5d
correct for error with TensorRT 8.4.3.1 without method `get_tensor_sh…
Burhan-Q Apr 9, 2024
1ef7818
delete redundant trt10 definition
glenn-jocher Apr 10, 2024
9b96f5b
Align variable names in Exporter and Autobackend
glenn-jocher Apr 10, 2024
ba20133
Merge branch 'main' into trt10
glenn-jocher Apr 10, 2024
1fba2f3
move test to slow
glenn-jocher Apr 10, 2024
24 changes: 14 additions & 10 deletions ultralytics/engine/exporter.py
@@ -654,6 +654,7 @@
def export_engine(self, prefix=colorstr("TensorRT:")):
"""YOLOv8 TensorRT export https://developer.nvidia.com/tensorrt."""
assert self.im.device.type != "cpu", "export running on CPU but must be on GPU, i.e. use 'device=0'"
self.args.simplify = True

f_onnx, _ = self.export_onnx() # run before trt import https://github.com/ultralytics/ultralytics/issues/7016

try:
@@ -662,12 +663,10 @@
if LINUX:
check_requirements("nvidia-tensorrt", cmds="-U --index-url https://pypi.ngc.nvidia.com")
import tensorrt as trt # noqa

check_version(trt.__version__, "7.0.0", hard=True) # require tensorrt>=7.0.0

self.args.simplify = True

LOGGER.info(f"\n{prefix} starting export with TensorRT {trt.__version__}...")
is_trt_10 = int(trt.__version__.split(".")[0]) >= 10 # is TensorRT >= 10

assert Path(f_onnx).exists(), f"failed to export ONNX file: {f_onnx}"
f = self.file.with_suffix(".engine") # TensorRT engine file
logger = trt.Logger(trt.Logger.INFO)
@@ -676,7 +675,11 @@

builder = trt.Builder(logger)
config = builder.create_builder_config()
config.max_workspace_size = int(self.args.workspace * (1 << 30))
workspace = int(self.args.workspace * (1 << 30))
if is_trt_10:
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace)

else: # TensorRT versions 7, 8
config.max_workspace_size = workspace

flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(flag)
parser = trt.OnnxParser(network, logger)
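The exporter's core switch is a major-version check on `trt.__version__`, plus converting the `workspace` argument from GiB to bytes. A minimal pure-Python sketch of those two steps (written here as standalone helper functions for illustration; the patch computes them inline):

```python
def is_trt10(trt_version: str) -> bool:
    """True when the TensorRT major version is 10 or newer, mirroring the
    patch's `int(trt.__version__.split(".")[0]) >= 10` check."""
    return int(trt_version.split(".")[0]) >= 10


def workspace_bytes(workspace_gib: float) -> int:
    """Convert the exporter's `workspace` argument (GiB) to bytes via 1 << 30."""
    return int(workspace_gib * (1 << 30))


print(is_trt10("8.6.1"))     # False
print(is_trt10("10.0.0b6"))  # True
print(workspace_bytes(4))    # 4294967296
```

On TensorRT >= 10 the byte count feeds `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, ...)`; on the 7/8 path it is assigned to the removed-in-10 `config.max_workspace_size` attribute.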
@@ -699,23 +702,24 @@
profile.set_shape(inp.name, (1, *shape[1:]), (max(1, shape[0] // 2), *shape[1:]), shape)
config.add_optimization_profile(profile)
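The `max(1, shape[0] // 2)` clamp in the profile above (from commit 905fd05, "ensure max shape dimension scales no smaller than 1") keeps the opt-shape batch from collapsing to 0 when exporting with batch 1. A small sketch of the (min, opt, max) tuple construction, using hypothetical input shapes:

```python
def profile_shapes(shape: tuple) -> tuple:
    """Build (min, opt, max) shapes for a dynamic-batch TensorRT optimization
    profile; the opt batch is half the max batch but clamped to at least 1."""
    min_shape = (1, *shape[1:])
    opt_shape = (max(1, shape[0] // 2), *shape[1:])
    return min_shape, opt_shape, shape


print(profile_shapes((1, 3, 640, 640)))  # opt batch stays 1, not 0
print(profile_shapes((8, 3, 640, 640)))  # opt batch halves to 4
```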

LOGGER.info(
f"{prefix} building FP{16 if builder.platform_has_fast_fp16 and self.args.half else 32} engine as {f}"
)
if builder.platform_has_fast_fp16 and self.args.half:
half = builder.platform_has_fast_fp16 and self.args.half
LOGGER.info(f"{prefix} building FP{16 if half else 32} engine as {f}")
if half:

config.set_flag(trt.BuilderFlag.FP16)

# Free CUDA memory
del self.model
torch.cuda.empty_cache()

# Write file
with builder.build_engine(network, config) as engine, open(f, "wb") as t:
build = builder.build_serialized_network if is_trt_10 else builder.build_engine
with build(network, config) as engine, open(f, "wb") as t:

# Metadata
meta = json.dumps(self.metadata)
t.write(len(meta).to_bytes(4, byteorder="little", signed=True))
t.write(meta.encode())
# Model
t.write(engine.serialize())
t.write(engine if is_trt_10 else engine.serialize())


return f, None

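The `.engine` file written above is a small container: a 4-byte little-endian metadata length, a UTF-8 JSON metadata blob, then the serialized engine; `AutoBackend` reads it back in the same order. A self-contained round-trip sketch of that container format (fake engine bytes and hypothetical metadata keys for illustration):

```python
import json


def write_engine_blob(metadata: dict, engine_bytes: bytes) -> bytes:
    """Pack metadata + engine as exporter.py does: 4-byte little-endian
    signed length, UTF-8 JSON metadata, then the serialized engine."""
    meta = json.dumps(metadata).encode()
    return len(meta).to_bytes(4, byteorder="little", signed=True) + meta + engine_bytes


def read_engine_blob(blob: bytes) -> tuple:
    """Unpack as autobackend.py does when loading a .engine file."""
    meta_len = int.from_bytes(blob[:4], byteorder="little")
    metadata = json.loads(blob[4 : 4 + meta_len].decode("utf-8"))
    return metadata, blob[4 + meta_len :]


blob = write_engine_blob({"stride": 32, "names": {0: "person"}}, b"\x00fake-engine")
meta, engine = read_engine_blob(blob)
print(meta["stride"])  # 32
```

Because the engine bytes follow the length-prefixed header, the loader never needs to parse them to find the metadata, which is why the same container works for both the TensorRT 8 and TensorRT 10 build outputs.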
33 changes: 25 additions & 8 deletions ultralytics/nn/autobackend.py
@@ -234,23 +234,40 @@
meta_len = int.from_bytes(f.read(4), byteorder="little") # read metadata length
metadata = json.loads(f.read(meta_len).decode("utf-8")) # read metadata
model = runtime.deserialize_cuda_engine(f.read()) # read engine
context = model.create_execution_context()
try:
context = model.create_execution_context()
except AttributeError as err: # model is None

# TensorRT <10 and >=10 incompatible
LOGGER.error(

f"\nExport to .engine with the same TensorRT version as installed; currently using {trt.__version__}\n"
)
raise err

bindings = OrderedDict()
output_names = []
fp16 = False # default updated below
dynamic = False
for i in range(model.num_bindings):
name = model.get_binding_name(i)
dtype = trt.nptype(model.get_binding_dtype(i))
if model.binding_is_input(i):
if -1 in tuple(model.get_binding_shape(i)): # dynamic
is_legacy = hasattr(model, "num_bindings") # TensorRT <10
num = range(model.num_bindings) if is_legacy else range(model.num_io_tensors)
for i in num:
name = model.get_binding_name(i) if is_legacy else model.get_tensor_name(i)
dtype = trt.nptype(model.get_binding_dtype(i) if is_legacy else model.get_tensor_dtype(name))
is_input = (

model.binding_is_input(i) if is_legacy else model.get_tensor_mode(name) == trt.TensorIOMode.INPUT
)
if is_input:
if -1 in tuple(

model.get_binding_shape(i) if is_legacy else model.get_tensor_shape(name)
): # dynamic
dynamic = True
context.set_binding_shape(i, tuple(model.get_profile_shape(0, i)[2]))
profile_shape = (

model.get_profile_shape(0, i) if is_legacy else model.get_tensor_profile_shape(name, i)
)
context.set_binding_shape(i, tuple(profile_shape[2]))

if dtype == np.float16:
fp16 = True
else: # output
output_names.append(name)
shape = tuple(context.get_binding_shape(i))
shape = tuple(context.get_binding_shape(i) if is_legacy else context.get_tensor_shape(name))

im = torch.from_numpy(np.empty(shape, dtype=dtype)).to(device)
bindings[name] = Binding(name, dtype, shape, im, int(im.data_ptr()))
binding_addrs = OrderedDict((n, d.ptr) for n, d in bindings.items())
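The dual code path above hinges on feature detection rather than version parsing: `hasattr(model, "num_bindings")` distinguishes the legacy binding-index API from the TensorRT 10 named-tensor API. The sketch below reproduces only that dispatch, with toy stand-in engine classes (not TensorRT objects):

```python
class LegacyEngine:
    """Toy stand-in for a tensorrt<10 ICudaEngine (binding-index API)."""

    num_bindings = 2
    _names = ["images", "output0"]

    def get_binding_name(self, i):
        return self._names[i]


class Trt10Engine:
    """Toy stand-in for a tensorrt>=10 ICudaEngine (named-tensor API)."""

    num_io_tensors = 2
    _names = ["images", "output0"]

    def get_tensor_name(self, i):
        return self._names[i]


def io_tensor_names(model) -> list:
    """Enumerate I/O tensor names through whichever API the engine exposes,
    mirroring the is_legacy dispatch in autobackend.py."""
    is_legacy = hasattr(model, "num_bindings")  # TensorRT < 10
    n = model.num_bindings if is_legacy else model.num_io_tensors
    getter = model.get_binding_name if is_legacy else model.get_tensor_name
    return [getter(i) for i in range(n)]


print(io_tensor_names(LegacyEngine()))  # ['images', 'output0']
print(io_tensor_names(Trt10Engine()))   # ['images', 'output0']
```

Feature detection keeps one AutoBackend loader working against both installed TensorRT generations, which is the point of the patch: dtype, input/output mode, shapes, and profile shapes are each fetched through the matching API branch.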