Aaron Boodman edited this page Jul 21, 2015 · 3 revisions

Random perf numbers (until we have a perfbot)

"mlb test"

This test imports one day of MLB pitch data into noms using xml_importer.

Each pair of numbers below is the time (as measured by the time command) to write to an empty directory vs. to one that already contains the tiered directory structure that FileStore uses (but not the files themselves). A "Dumb copy" opens every file in a file store, reads every byte, and writes every byte to a new file in another location. It is meant to mimic a best-case noms importer that does no processing at all but still has to create the same number of files and populate them with the same data.

  • Dumb copy: 3.5s/2.4s
  • xml_importer: 8.5s/6.2s

We also measured the time to import the same day's worth of data into AWSStore. Eep:

  • xml_importer: 35m32s
  • pitchmap/index: 10m48s