In [489]:
from utils import *

In [490]:
NUM_SAMPLES = 1000

## Production cost of a DGX H100 server

Raymond James (a global financial services company) estimated that it costs Nvidia \$3,320 to make an H100, which is then sold to customers for \$25,000 to \$30,000. This was [reported](https://twitter.com/firstadopter/status/1691877797487165443) second-hand by Tae Kim (a senior writer for the financial and investment news site Barron's), who added "[High bandwidth memory] was included in their BOM estimates".

However, this estimate doesn't seem to account for off-board components in a DGX server, such as CPUs, transceivers and switches. For those, we use the [SemiAnalysis breakdown](https://www.semianalysis.com/p/ai-server-cost-analysis-memory-is) of DGX H100 server cost, which includes the server's networking hardware. Note that the HGX model is more applicable to large-scale clusters, but we don't have cost information for HGX, so we assume its costs are similar to those of DGX.
  - Sale price of DGX H100 estimated at ~\$270,000
  - “Nvidia's gross profit per DGX H100 is almost \$190,000. Of course, R&D and other operating expenses bring this much lower.”
  - So SemiAnalysis implies that a DGX H100 costs roughly \$80,000 to make (\$270,000 less \$190,000).

Consistency check:

If the Raymond James estimate of \$3,320 covers only the H100 chip and its peripherals, it is within a factor of two of the \$5,875 per GPU implied by SemiAnalysis. The SemiAnalysis figure also includes the cost of a 4-NVSwitch baseboard, which may not be in the Raymond James estimate. See the calculation below. Note that the SemiAnalysis estimate was published in May 2023, while Tae Kim's post was in August 2023.

A colleague who independently estimated the cost of the H100 GPU also informed us that they got roughly \$3,000 (based on TSMC wafer prices, among other things).

In [491]:
# Using SemiAnalysis BOM https://www.semianalysis.com/p/ai-server-cost-analysis-memory-is
non_gpu_dgx_cost = 5200 + 7860 + 3456 + 10908 + 563 + 875 + 463 + 1200 + 1485
non_gpu_dgx_cost

32010

In [492]:
dgx_h100_cost = 269010 - 190000
gpu_and_baseboard_cost = dgx_h100_cost - non_gpu_dgx_cost
gpu_per_dgx = 8
gpu_and_baseboard_cost_per_gpu = gpu_and_baseboard_cost / gpu_per_dgx
gpu_and_baseboard_cost_per_gpu

5875.0

Putting it all together to get the average overall expense per H100 sold:

In [493]:
# We have two point estimates of the cost per H100 GPU: $3,320 for the GPU and $5,875 for the GPU plus NVSwitch baseboard.
# Based on that we'll centre on roughly $4,500 and use a wide 90% CI of $2,500 to $8,000 to be conservative.
gpu_and_baseboard_cost_per_gpu = lognorm_from_90_ci(2500, 8000, NUM_SAMPLES)
print_median_and_ci(gpu_and_baseboard_cost_per_gpu)

Median: 4.3e+03 [90% CI: 2.5e+03, 7.9e+03]
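
The helper `lognorm_from_90_ci` comes from `utils` and its implementation is not shown here. The following is a minimal sketch of what such a helper might do, assuming the two bounds are treated as the 5th and 95th percentiles of a lognormal distribution. The function name `lognorm_from_90_ci_sketch` and the `seed` parameter are hypothetical and are not part of `utils`.

```python
import math
import random
from statistics import NormalDist, median


def lognorm_from_90_ci_sketch(p5, p95, num_samples, seed=0):
    """Hypothetical stand-in for utils.lognorm_from_90_ci.

    Assumes p5 and p95 are the 5th and 95th percentiles of a lognormal.
    """
    z95 = NormalDist().inv_cdf(0.95)  # about 1.645
    # In log space the distribution is normal, so the CI is symmetric around mu.
    mu = (math.log(p5) + math.log(p95)) / 2
    sigma = (math.log(p95) - math.log(p5)) / (2 * z95)
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(num_samples)]


samples = lognorm_from_90_ci_sketch(2500, 8000, 1000)

# Under this assumption the median is the geometric mean of the two bounds.
implied_median = math.sqrt(2500 * 8000)
print(f"Implied median: {implied_median:.0f}")
print(f"Sample median:  {median(samples):.0f}")
```

Under this assumption, the median is the geometric mean of the bounds, about \$4,472. That is consistent with centring on roughly \$4,500, and with a sampled median of 4.3e+03 from 1,000 draws.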


In [494]:
# Also using conservative bounds on additional server components, centred on the SemiAnalysis estimate
non_gpu_dgx_cost_dist = lognorm_from_90_ci(non_gpu_dgx_cost / 1.5, non_gpu_dgx_cost * 1.5, NUM_SAMPLES)
print_median_and_ci(non_gpu_dgx_cost_dist)

Median: 3.2e+04 [90% CI: 2.1e+04, 4.8e+04]


In [495]:
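# Note: this adds the point estimate non_gpu_dgx_cost, not the sampled non_gpu_dgx_cost_dist from the previous cell.
dgx_h100_cost = gpu_and_baseboard_cost_per_gpu * gpu_per_dgx + non_gpu_dgx_cost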
print_median_and_ci(dgx_h100_cost)

Median: 6.7e+04 [90% CI: 5.2e+04, 9.6e+04]
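
The cell above adds the fixed SemiAnalysis figure for the non-GPU components to the sampled GPU costs. A variant that also propagates the uncertainty in the non-GPU components could look like the sketch below. It assumes the two costs are independent and that the 90% CI bounds are the 5th and 95th percentiles of a lognormal. `sample_lognormal` is a hypothetical helper written for this illustration, not the `utils` implementation.

```python
import math
import random
from statistics import NormalDist, median


def sample_lognormal(p5, p95, n, rng):
    """Hypothetical helper: lognormal samples with 5th/95th percentiles p5/p95."""
    z95 = NormalDist().inv_cdf(0.95)
    mu = (math.log(p5) + math.log(p95)) / 2
    sigma = (math.log(p95) - math.log(p5)) / (2 * z95)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


rng = random.Random(0)
n = 1000
gpus_per_server = 8
non_gpu_point = 32010  # sum of the SemiAnalysis non-GPU line items

gpu_cost = sample_lognormal(2500, 8000, n, rng)
non_gpu_cost = sample_lognormal(non_gpu_point / 1.5, non_gpu_point * 1.5, n, rng)

# Combine sample by sample; this treats the two costs as independent (an assumption).
server_cost = [g * gpus_per_server + s for g, s in zip(gpu_cost, non_gpu_cost)]
print(f"Median server cost: {median(server_cost):.3g}")
```

Because the non-GPU distribution is centred on the same point estimate, this mainly widens the interval and should leave the median close to its current value.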


We're making an analogy between large clusters of H100 servers and large clusters of TPUv4 servers, so we base the average production cost on the DGX server cost.

In [496]:
dgx_h100_cost_per_gpu = dgx_h100_cost / gpu_per_dgx
print_median_and_ci(dgx_h100_cost_per_gpu)
dgx_h100_cost_per_gpu.mean()

Median: 8.3e+03 [90% CI: 6.5e+03, 1.2e+04]


8665.028437749053
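
The mean is above the median because the lognormal is right-skewed. As a rough analytic cross-check, assume the bounds \$2,500 and \$8,000 are the 5th and 95th percentiles of the per-GPU cost $X$, and that the non-GPU cost is the fixed \$32,010 per server:

$$\sigma = \frac{\ln(8000/2500)}{2 \times 1.645} \approx 0.354, \qquad \operatorname{median}(X) = \sqrt{2500 \times 8000} \approx 4472$$

$$\mathbb{E}[X] = \operatorname{median}(X)\, e^{\sigma^2/2} \approx 4760$$

$$\mathbb{E}\!\left[X + \tfrac{32010}{8}\right] \approx 4760 + 4001 \approx 8760, \qquad \operatorname{median}\!\left(X + \tfrac{32010}{8}\right) \approx 4472 + 4001 \approx 8470$$

The sampled values above (median 8.3e+03, mean about 8,665) sit somewhat below these, in line with the sampled GPU-cost median of 4.3e+03 coming out under 4,472.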