Hardware Performance

chrisjmccormick edited this page Apr 5, 2018 · 2 revisions

VSX Board

Storage

Each VSX card has 42 GB of storage for holding reference vectors.

Single Query Latency

Each VSX card searches its memory at a rate of 20 ms / GB.

This leads to a very simple equation for calculating the expected latency for a single query vector:

Latency = (Dataset size in GB / Number of VSX cards) x 20 ms/GB

Note that datasets are always distributed evenly across all available VSX cards.

Remember when applying this calculation that:

  • After integer mapping, your components are 8-bit integers, not 32-bit floats, so each component occupies 1 byte.
  • 1 GB = 2^30 bytes (not 1E9 bytes!). Similarly, 1 GB = 2^10 MB.
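
The calculation described above can be sketched in Python. This is only an illustration of the arithmetic, not part of any VSX software: the function name single_query_latency_ms and its parameters are hypothetical, and the default of 1 byte per component assumes the vectors have already been integer mapped.

```python
BYTES_PER_GB = 2 ** 30      # binary gigabyte, as specified on this page
SEARCH_MS_PER_GB = 20.0     # per-card search rate stated on this page


def single_query_latency_ms(num_vectors, num_components, num_cards,
                            bytes_per_component=1):
    """Expected latency in ms for one query (hypothetical helper)."""
    if num_cards < 1:
        raise ValueError("num_cards must be at least 1")
    # Total dataset size in binary gigabytes.
    dataset_gb = num_vectors * num_components * bytes_per_component / BYTES_PER_GB
    # The dataset is spread evenly, so each card searches an equal share.
    gb_per_card = dataset_gb / num_cards
    return gb_per_card * SEARCH_MS_PER_GB


# 2^20 vectors of 1,024 one-byte components is exactly 1 GB.
print(single_query_latency_ms(2 ** 20, 1024, num_cards=1))  # 20.0
print(single_query_latency_ms(2 ** 20, 1024, num_cards=4))  # 5.0
```

Passing bytes_per_component=4 shows what the estimate would be if the 32-bit float size were used by mistake: four times larger.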

Batch Query Latency

Each VSX card has 16 query "slots", which allow it to process up to 16 queries simultaneously. This means that when submitting queries as a batch, the average latency per query is the single-query latency divided by 16.
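
As a rough sketch of the batch rule above, the total time for a batch can be estimated from the single-query latency. The helper name batch_latency_ms is hypothetical, and the plain division by 16 simply follows this page; how the hardware schedules a batch that is not a multiple of 16 is not covered here.

```python
SLOTS_PER_CARD = 16  # query slots per VSX card, as stated on this page


def batch_latency_ms(num_queries, single_query_ms, slots=SLOTS_PER_CARD):
    """Expected total latency in ms for a batch (hypothetical helper)."""
    if num_queries < 0:
        raise ValueError("num_queries must not be negative")
    # Up to `slots` queries run at once, so the total time is the
    # sequential time divided by the number of slots.
    return num_queries * single_query_ms / slots


# 64 queries at 2 ms each: 4 rounds of 16 queries, 2 ms per round.
print(batch_latency_ms(64, 2.0))  # 8.0
```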

Examples

The MNIST training dataset consists of 55,000 reference vectors with 1,024 components each. Therefore, using two VSX cards, the expected latency for a single query is given by:

Latency = (55,000 vectors x 1,024 bytes/vector / 2^30 bytes/GB) / 2 VSX cards x 20 ms/GB ≈ 0.52 ms

If we instead present a batch of 1,024 queries, the latency becomes:

Batch Latency = 1,024 queries x 0.52 ms/query / 16 slots ≈ 33 ms
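
The two MNIST figures above can be checked with a few lines of Python. This is just the same arithmetic written out, with variable names chosen for illustration.

```python
# MNIST example from this page: 55,000 vectors of 1,024 one-byte components.
dataset_gb = 55_000 * 1_024 / 2 ** 30   # dataset size in binary GB
single_ms = dataset_gb / 2 * 20.0       # 2 VSX cards, 20 ms/GB per card
batch_ms = 1_024 * single_ms / 16       # 1,024 queries, 16 slots

print(f"{single_ms:.4f} ms")  # 0.5245 ms
print(f"{batch_ms:.2f} ms")   # 33.57 ms
```

Carrying the unrounded single-query latency through gives about 33.6 ms for the batch. The 33 ms figure above comes from using the rounded 0.52 ms value.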
