-
For faster inference with YOLOv8 models, a single RTX 4090 will generally outperform multiple RTX 4060s, given its higher memory bandwidth and compute capability. A single-GPU setup also avoids the bottlenecks and complexity that come with multi-GPU configurations. Here's a simple example to set up your model for inference on a single GPU:

```python
from ultralytics import YOLO

# Load your model
model = YOLO('path_to_your_model.pt')

# Run inference on the first GPU; the device is passed to predict(),
# not to the YOLO() constructor
results = model.predict('path_to_image_or_video', device=0)
```

Ensure your system has adequate cooling and a power supply that can handle the demands of the RTX 4090. If you later decide you need more throughput, you can consider scaling up with additional GPUs.
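Since the question is specifically about an FP16 YOLOv8m model, two common ways to get more speed on a single GPU are half-precision inference and a TensorRT export. A minimal sketch, assuming a local `yolov8m.pt` weights file (paths and sources below are placeholders):

```python
from ultralytics import YOLO

# Load the PyTorch weights (placeholder path)
model = YOLO('yolov8m.pt')

# FP16 inference directly in PyTorch
results = model.predict('path_to_image_or_video', device=0, half=True)

# Optionally export to a TensorRT engine with FP16 weights for lower latency,
# then reload the engine and run inference with it
model.export(format='engine', half=True, device=0)
trt_model = YOLO('yolov8m.engine')
results = trt_model.predict('path_to_image_or_video')
```

If you do end up with more than one GPU, the usual pattern for inference (unlike training) is to split the workload rather than share one model across cards: one model instance per GPU, each handling its own slice of the input. A sketch of that idea using threads, with hypothetical file paths:

```python
from threading import Thread
from ultralytics import YOLO

def run_on_gpu(weights: str, source: str, device: int) -> None:
    # Each thread loads its own model instance for thread safety
    model = YOLO(weights)
    model.predict(source, device=device, half=True)

# Hypothetical split of the workload across two GPUs
threads = [
    Thread(target=run_on_gpu, args=('yolov8m.pt', 'videos/part1.mp4', 0)),
    Thread(target=run_on_gpu, args=('yolov8m.pt', 'videos/part2.mp4', 1)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```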
-
Instead of one RTX 4090, could I use multiple RTX 4060s (less memory each) to run a YOLOv8m FP16 model (P2 head) faster?
Which setup will be faster?
All I care about is faster inference.
I am asking because I haven't made the purchase yet, so I will buy based on this.
Ignore other bottlenecks like the CPU; let's assume I can get the best CPU possible for the task.
Please suggest any other solutions as well; I'm open to suggestions.
Thank you