Clarify exactly which models of GPUs were used in benchmarks #87

Open
jchodera opened this issue Jan 7, 2022 · 3 comments

Comments

@jchodera
Member

jchodera commented Jan 7, 2022

There seems to be significant variation in the performance of different models/variants of the same GPU (e.g. the multiple variants of A100 available), so we should provide more details in our benchmarks about exactly which model(s) were used.

@peastman
Member

peastman commented Jan 7, 2022

The A100s are on Perlmutter. They're 40 GB, 1410 MHz versions.

@jchodera
Member Author

jchodera commented Jan 7, 2022

Maybe we should capture the output of nvidia-smi -q?

The datasheet lists several variants of the A100:
[image: screenshot from the A100 datasheet]
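
One way to capture that output alongside the benchmark results is sketched below. This is only a minimal sketch, not something the benchmark scripts currently do: the function name `capture_gpu_report` and the output file name are hypothetical, and the only nvidia-smi flag it relies on is the `-q` query mentioned above.

```python
import shutil
import subprocess
from pathlib import Path


def capture_gpu_report(out_path, cmd=("nvidia-smi", "-q")):
    """Run cmd and save its stdout to out_path.

    Returns the captured text, or None if the executable is not on PATH
    (for example on a node without an NVIDIA driver installed).
    The function name and signature are hypothetical, for illustration only.
    """
    if shutil.which(cmd[0]) is None:
        return None
    # check=True raises CalledProcessError if the tool exits with an error
    result = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    Path(out_path).write_text(result.stdout)
    return result.stdout
```

Calling `capture_gpu_report("gpu_report.txt")` at the start of a benchmark run would leave a full hardware report next to the timings, so the exact GPU variant can be checked later.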

@peastman
Member

peastman commented Jan 7, 2022

The only differences between them are the amount of memory (40 or 80 GB) and the form factor (PCIe or SXM). Neither of those should make any difference in speed.

Here's what nvidia-smi reports on the login node with the GPU idle.

    Clocks
        Graphics                          : 210 MHz
        SM                                : 210 MHz
        Memory                            : 1215 MHz
        Video                             : 585 MHz
    Applications Clocks
        Graphics                          : 765 MHz
        Memory                            : 1215 MHz
    Default Applications Clocks
        Graphics                          : 765 MHz
        Memory                            : 1215 MHz
    Max Clocks
        Graphics                          : 1410 MHz
        SM                                : 1410 MHz
        Memory                            : 1215 MHz
        Video                             : 1290 MHz
    Max Customer Boost Clocks
        Graphics                          : 1410 MHz

Comparing to what you posted in #86 (comment), the max clock rates for graphics, SM, and video are the same, but the memory clock is slightly lower. Other factors that can affect performance are the type of bus (PCIe or NVLink, and the particular version of either one), the cooling system (which influences whether the GPU can actually sustain the maximum clock rate), bus topology (mainly for multi-GPU benchmarks), and CPU type (not a huge effect for GPU benchmarks, but it does make a difference).
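
To make this kind of comparison less manual, a clocks excerpt like the one above could be parsed into a dictionary and compared between machines. The following is a rough sketch under stated assumptions: `parse_clocks` and `SAMPLE` are hypothetical names, the parser only understands the `Section` / `key : N MHz` layout shown in this thread, and it ignores every other line of a full `nvidia-smi -q` report.

```python
import re

# Excerpt copied from the report posted above (Perlmutter login node, GPU idle).
SAMPLE = """\
    Clocks
        Graphics                          : 210 MHz
        SM                                : 210 MHz
        Memory                            : 1215 MHz
        Video                             : 585 MHz
    Max Clocks
        Graphics                          : 1410 MHz
        SM                                : 1410 MHz
        Memory                            : 1215 MHz
        Video                             : 1290 MHz
"""

# Matches lines such as "Graphics : 1410 MHz"
_ENTRY = re.compile(r"^(.+?)\s*:\s*(\d+)\s*MHz$")


def parse_clocks(text):
    """Return {section: {clock_name: mhz}} for 'key : N MHz' lines."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = _ENTRY.match(stripped)
        if match and current is not None:
            sections[current][match.group(1).strip()] = int(match.group(2))
        elif ":" not in stripped:
            # A line without a colon is treated as a section header
            current = stripped
            sections.setdefault(current, {})
    return sections


clocks = parse_clocks(SAMPLE)
print(clocks["Max Clocks"])
```

Comparing `parse_clocks(a)["Max Clocks"]` against `parse_clocks(b)["Max Clocks"]` for two machines would flag differences such as the memory clock noted above, though it says nothing about bus type, cooling, or CPU.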
