Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write performance with secondary indices under Node with LevelDB store #47

Closed
aral opened this issue May 22, 2018 · 4 comments

Comments

Projects
None yet
2 participants
@aral
Copy link

commented May 22, 2018

I’m seeing writes on Node with the LevelDB store take 10x-30x longer to complete when a secondary index is set in the code below (actual performance varies depending on which column the index is set on. Also, it doesn’t seem to work when set on the “done” column). Given that write speeds otherwise appear to increase linearly, this is surprising. I would expect a ~2x increase in time for adding an index.

Thoughts?

const { nSQL } = require('nano-sql')

let timer = null
function resetTimer () {
  timer = new Date()
}
function displayDurationFor (taskName) {
  let duration = (new Date()) - timer
  console.log(`${taskName} took ${duration}ms.`)
}

async function start () {
  const connectionResult = await nSQL('hellodb')
    .model([
      {key: 'id', type: 'int', props: ['pk', 'ai']},
      {key: 'date', type: 'string'/*, props: ['idx']*/}, // uncomment the secondary index to compare timings
      {key: 'description', type: 'string'},
      {key: 'done', type: 'int'}
    ])
    .config({
      mode: 'PERM'
    })
    .connect()

  const numInserts = 10000
  const inserts = [] 
  for (let i = 0; i < numInserts; i++) {
    inserts.push(nSQL('hellodb').query('upsert', { date: new Date(), description: `Item: ${i}`, done: Math.round(Math.random()) }).exec())
  }

  // Time inserts
  resetTimer()
  const addResult = await Promise.all(inserts)
  displayDurationFor(`${numInserts} inserts`)
}

start()

ClickSimply added a commit that referenced this issue May 24, 2018

@ClickSimply

This comment has been minimized.

Copy link
Owner

commented May 24, 2018

Hey, thanks for looking into this. You're right, that's quite the performance penalty!

The newest github commit for 1.6.2 includes some performance adjustments regarding the secondary indexes. Using your test I observed the secondary indexes only decreasing performance 3-5x instead of 10-30x.

I'm not sure we're gonna be able to get better than a 3-5x cost for the secondary indexes due to these limitations:

  1. Upserts are actually put in a queue and executed sequentially to prevent secondary indexes from becoming inconsistent.
  2. For each secondary index we have to perform these actions in addition to writing the record to the database:
    a - Read the secondary index row from the database.
    b - Add the primary key of the new row and write the new secondary index row back.

Both of these are pretty quick since we can go straight to the primary key, but physics are likely to provide a hardstop of performance at no less than 3x cost compared to normal write performance.

I'll update the documentation soon to reflect this.

@ClickSimply

This comment has been minimized.

Copy link
Owner

commented May 26, 2018

Just kidding, 1.6.3 is now live on Github with significantly increased performance. Testing on my machine shows the normal test is around ~160ms and the secondary index test is around ~210ms, the secondary index performance penalty doesn't seem to get above 50%, being closer to 30% slower normally.

There's now a secondary index cache held in memory, this prevents the need to do the round trips to the database. The changes to the cache are flushed to the database no more than every 100ms.

The new cache can be disabled by passing cache: false into the config object, getting you the old performance numbers of 3-5x.

@aral

This comment has been minimized.

Copy link
Author

commented May 27, 2018

Sounds awesome, @ClickSimply, thank you :)

@ClickSimply

This comment has been minimized.

Copy link
Owner

commented May 30, 2018

Marking this resolved, feel free to reopen it if you have more questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.