
Very bad performance iterating through a large number of entries #33

Open
chrbala opened this issue Mar 26, 2018 · 3 comments

Comments

@chrbala

chrbala commented Mar 26, 2018

Hello – thank you for working on SQLDown!

I'm trying to run the iterator over a large number of entries (~100,000). It is quite slow. I believe it is loading all the entries before doing anything, because after several (~6) minutes of nothing, it iterates through them quickly. When testing on a smaller dataset (~3,000 entries), I can read them in about 10 seconds total, with the majority of the time spent before any data is returned.

I believe that iter-stream is the bottleneck. It should either iterate one at a time or batch the reads into reasonably sized batches.
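To illustrate the batching idea: instead of materializing the whole result set up front, the reader can fetch a bounded batch ordered by key and resume from the last key seen (keyset pagination). A minimal sketch using Python's stdlib sqlite3, with an invented `kv (key, value)` table rather than sqldown's actual schema:

```python
import sqlite3

def iterate_in_batches(conn, batch_size=1000):
    """Yield (key, value) pairs in key order, reading one batch at a time."""
    last_key = None
    while True:
        if last_key is None:
            rows = conn.execute(
                "SELECT key, value FROM kv ORDER BY key LIMIT ?",
                (batch_size,),
            ).fetchall()
        else:
            # Resume strictly after the last key we handed out.
            rows = conn.execute(
                "SELECT key, value FROM kv WHERE key > ? ORDER BY key LIMIT ?",
                (last_key, batch_size),
            ).fetchall()
        if not rows:
            break
        yield from rows
        last_key = rows[-1][0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO kv VALUES (?, ?)",
                 [(f"k{i:06d}", f"v{i}") for i in range(5000)])
entries = list(iterate_in_batches(conn, batch_size=500))
print(len(entries))  # 5000
```

Each batch is a cheap indexed range scan, so the first rows come back almost immediately regardless of table size.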

@calvinmetcalf
Owner

Could I get some more details, like:

  • what platform?
  • what database?

More generally, we use the database's built-in cursor functionality, but it has to run a query in order to start the stream. If the query is slow, it will take a while to get the first item, because the database needs to complete the query before it can return anything. Some possible issues:

MySQL doesn't support indexes on arbitrarily long fields, so if you are using MySQL, there is no index on the keys unless you set the keySize option, which could be an issue. I also wrote this a while ago and have learned a lot about database performance since then, so there could be something in the query that is fixable.
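For what it's worth, the difference a key index makes is easy to see with `EXPLAIN QUERY PLAN`. This is purely illustrative (the table and column names are made up, not sqldown's schema), shown here with SQLite since its planner output is easy to inspect:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT, value TEXT)")  # no index on key

# Without an index, a key-range query must scan the whole table
# (and sort the results separately for ORDER BY).
plan_no_index = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM kv WHERE key > ? ORDER BY key", ("a",)
).fetchall()
print(plan_no_index[0][-1])  # a full SCAN of kv

conn.execute("CREATE INDEX kv_key ON kv (key)")

# With the index, the planner can seek to the start key and
# stream rows out in key order immediately.
plan_with_index = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM kv WHERE key > ? ORDER BY key", ("a",)
).fetchall()
print(plan_with_index[0][-1])  # a SEARCH USING INDEX kv_key
```

The same principle applies to MySQL: without an index on the key column, every iterator start pays for a full scan plus a sort.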

@chrbala
Author

chrbala commented Apr 9, 2018

Sure!

platform: OSX Sierra 10.12.6
database: SQLite

It looks like the code branches don't create indexes for SQLite. Also, the readme specifies keyLength and valueLength, but the actual option keys are keySize and valueSize.

Is there a reason why all the databases have three columns if (as the readme says) the column limitation only applies to Postgres?

@calvinmetcalf
Owner

Yeah, the 3 columns are also useful for dealing with iterators while updates are made. We could use cursors or whatnot, but that would be complicated to do for all the different db types. That being said, it does make indexes for SQLite: it makes indexes for every db that isn't MySQL.
