Optimization

okay edited this page Oct 23, 2017 · 4 revisions

Baselines

To optimize sybil, we should first set up a baseline of where we are now. Memory allocation timing on my computer seems to vary from run to run, so the allocation timing should be used as a reference to gauge how long the other operations took. I also have encrypted SSD drives, so the disk access timings are slightly degraded.

We use uptime data to gauge sybil's performance. The samples look like host|status|ping|weight|category|id|timestamp.
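To make the sample layout concrete, here is a minimal Python sketch that parses one such line into a typed record. The field types (ping as a float, weight and timestamp as integers), the `UptimeSample` and `parse_sample` names, and the example values are all assumptions for illustration; the real uptime data and sybil's importer may differ.

```python
from dataclasses import dataclass

# Field order taken from the sample layout above.
FIELDS = ("host", "status", "ping", "weight", "category", "id", "timestamp")


@dataclass
class UptimeSample:
    host: str
    status: str
    ping: float      # assumed numeric (e.g. milliseconds)
    weight: int      # assumed integer
    category: str
    id: str
    timestamp: int   # assumed Unix epoch seconds


def parse_sample(line: str) -> UptimeSample:
    """Split a pipe-delimited line and convert the assumed numeric fields."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    host, status, ping, weight, category, sample_id, timestamp = parts
    return UptimeSample(
        host, status, float(ping), int(weight), category, sample_id, int(timestamp)
    )


# Hypothetical example line, not taken from the real data set.
sample = parse_sample("web01.example.com|up|12.5|3|frontend|a1b2|1508716800")
```

Knowing which fields are numeric matters for the queries below, since histograms only make sense over integer or float columns.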

  • Import 1 million uptime records: 37-40s
  • Baseline: allocate 15 million records: 0.574s
  • load and query unzipped time as a histogram: 1.2s
  • load and query unzipped time as a nested histogram: 1.3s
  • load and query zipped time as a histogram: 1.7s
  • load and query zipped time as a nested histogram: 1.9s
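Since the allocation timing is meant to be the yardstick, one way to read the numbers above is as multiples of the allocation baseline. The sketch below does that arithmetic with the figures from the list; the label strings and the `relative_to_baseline` helper are made up for illustration, and the ratios are only as comparable as the underlying runs.

```python
# Timings copied from the list above, in seconds.
ALLOC_BASELINE_S = 0.574

QUERY_TIMINGS_S = {
    "unzipped histogram": 1.2,
    "unzipped nested histogram": 1.3,
    "zipped histogram": 1.7,
    "zipped nested histogram": 1.9,
}


def relative_to_baseline(timings, baseline):
    """Express each timing as a multiple of the allocation baseline."""
    return {name: round(seconds / baseline, 2) for name, seconds in timings.items()}


relative = relative_to_baseline(QUERY_TIMINGS_S, ALLOC_BASELINE_S)
for name, ratio in relative.items():
    print(f"{name}: {ratio:.2f}x allocation baseline")
```

Rerunning the allocation benchmark alongside each future measurement and comparing ratios rather than raw seconds should make results from different days or machines easier to compare.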

Future optimization opportunities

  • late materialization
  • flat / flex buffers for serialization
  • early filtering of columns
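As a rough illustration of the first and last ideas, the Python sketch below filters on one column first and only then reads values from a second column, and only for the rows that survived the filter. It is a toy in-memory model, not sybil's storage format or query engine; the `columns` table, `matching_rows`, and `histogram` are hypothetical names and data.

```python
from collections import Counter

# Toy column-oriented table: one list per column, rows aligned by index.
columns = {
    "host": ["a", "b", "a", "c"],
    "status": ["up", "down", "up", "up"],
    "ping": [10, 250, 30, 45],
    "category": ["web", "web", "db", "db"],
}


def matching_rows(column, predicate):
    """Early filtering: scan a single column and keep only matching row ids."""
    return [i for i, value in enumerate(column) if predicate(value)]


def histogram(table, filter_col, predicate, value_col, bucket_size):
    """Bucket value_col for the rows where filter_col satisfies predicate."""
    row_ids = matching_rows(table[filter_col], predicate)
    # Late materialization: the value column is read only now, and only at
    # the surviving row ids; the other columns are never touched.
    values = table[value_col]
    return Counter((values[i] // bucket_size) * bucket_size for i in row_ids)


hist = histogram(columns, "status", lambda s: s == "up", "ping", 20)
```

In a disk-backed store the same ordering would mean columns that are not part of the filter or the aggregation never need to be loaded or decompressed, which is where the savings would come from.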