GPU Benchmark

This repository contains benchmark data and documentation for evaluating the inference speeds of various large language models (LLMs) on different GPUs.

About Massed Compute

Massed Compute offers scalable, efficient distributed computing. We provide flexible computing power for AI research, visual effects production, data science, and more. Our goal is to give organizations the tools they need to make full use of their computational capacity.

For more information, visit Massed Compute.

Benchmarking Overview

This repository covers benchmarking LLM inference speeds on different GPUs, including:

- Llama 3
  - Llama 3 70B
  - Llama 3.1 70B
- Qwen
- Mixtral
- Magnum
- Other popular models

Each benchmark includes:

- Model Description: Overview of the model being tested.
- Hardware Specifications: Details about the GPUs used.
- Benchmark Results: Inference speed and performance metrics.
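
As a rough illustration of how an inference-speed figure such as tokens per second could be computed, here is a minimal Python sketch. It is not the method used for the results in this repository. The names `tokens_per_second`, `benchmark`, and `fake_generate` are hypothetical, and `fake_generate` is only a stand-in for a real model call.

```python
import time
from typing import Callable, List


def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Return throughput as generated tokens divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def benchmark(generate: Callable[[str], List[int]], prompt: str, runs: int = 3) -> float:
    """Average tokens/s over several runs of a hypothetical generate() callable."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt)  # assumed to return the generated token ids
        elapsed = time.perf_counter() - start
        rates.append(tokens_per_second(len(tokens), elapsed))
    return sum(rates) / len(rates)


def fake_generate(prompt: str) -> List[int]:
    # Stand-in for a real model call: sleeps briefly, returns 50 dummy tokens.
    time.sleep(0.01)
    return list(range(50))


if __name__ == "__main__":
    rate = benchmark(fake_generate, "Hello", runs=3)
    print(f"{rate:.1f} tokens/s")
```

Averaging over several runs smooths out one-off timing noise. A real harness would typically also discard a warm-up run and record the GPU, batch size, and prompt length alongside each figure.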

