sqlite-utils bulk --batch-size option #392

Closed
simonw opened this issue Jan 26, 2022 · 4 comments
Labels: cli-tool, enhancement

Comments

simonw commented Jan 26, 2022

Could add support for --batch-size, as seen in insert/upsert, so that bulk breaks the list up into batches and commits after each one.

Originally posted by @simonw in #391 (comment)

simonw added the cli-tool and enhancement labels Jan 26, 2022

simonw commented Jan 26, 2022

Relevant code:

# For bulk_sql= we use cursor.executemany() instead
if bulk_sql:
    with db.conn:
        db.conn.cursor().executemany(bulk_sql, docs)
    return


simonw commented Jan 26, 2022

Help for insert says:

  --batch-size INTEGER      Commit every X records


simonw commented Jan 26, 2022

Can use this utility function:

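import itertools

def chunks(sequence, size):
    # Yields lazy chunks of up to `size` items; consume each chunk before requesting the next
    iterator = iter(sequence)
    for item in iterator:
        yield itertools.chain([item], itertools.islice(iterator, size - 1))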
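
To illustrate how that helper could combine with executemany(), here is a minimal sketch using the standard library sqlite3 module directly. The bulk_in_batches function and the sample data are hypothetical, made up for this example; this is not necessarily how the change was implemented in sqlite-utils.

```python
import itertools
import sqlite3


def chunks(sequence, size):
    # Same helper as in the comment above
    iterator = iter(sequence)
    for item in iterator:
        yield itertools.chain([item], itertools.islice(iterator, size - 1))


def bulk_in_batches(conn, sql, docs, batch_size):
    """Hypothetical helper: run sql for every doc, committing once per batch."""
    batches = 0
    for chunk in chunks(docs, batch_size):
        # "with conn" commits on success and rolls back if an exception is raised
        with conn:
            # executemany() fully consumes the lazy chunk before the next one is requested
            conn.cursor().executemany(sql, chunk)
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("create table lines (line text)")

# A generator stands in for records streamed from a file or stdin
docs = ({"line": "line {}".format(i)} for i in range(12))
batch_count = bulk_in_batches(
    conn, "insert into lines (line) values (:line)", docs, 5
)
row_count = conn.execute("select count(*) from lines").fetchall()[0][0]
print(batch_count, row_count)
```

Each chunk shares the underlying iterator, so this only works because executemany() exhausts one chunk before the loop asks for the next.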


simonw commented Jan 26, 2022

Manually tested it like this:

# Create database with an empty "lines" table
sqlite-utils create-table bulk-test.db lines line text
# Stream records every 0.5s, commit every 5 records
stream-delay docs/python-api.rst -d 500 | \
  sqlite-utils bulk bulk-test.db 'insert into lines (line) values (:line)' - \
  --lines --batch-size 5

Running datasette bulk-test.db showed records appearing five at a time, about every 2.5s.
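
The same behaviour can be approximated without the delay, as a rough sketch: a second sqlite3 connection stands in for Datasette and counts the visible rows after each batch commits. The temporary file path, the reader connection, and the 12 sample records are assumptions for illustration only.

```python
import itertools
import os
import sqlite3
import tempfile


def chunks(sequence, size):
    iterator = iter(sequence)
    for item in iterator:
        yield itertools.chain([item], itertools.islice(iterator, size - 1))


# Use a file on disk so that two connections see the same database
path = os.path.join(tempfile.mkdtemp(), "bulk-test.db")
writer = sqlite3.connect(path)
writer.execute("create table lines (line text)")
writer.commit()

# Second connection: stands in for Datasette reading the file
reader = sqlite3.connect(path)

visible_counts = []
docs = ({"line": "line {}".format(i)} for i in range(12))
for chunk in chunks(docs, 5):
    with writer:  # one commit per batch of up to 5 records
        writer.cursor().executemany(
            "insert into lines (line) values (:line)", chunk
        )
    # fetchall() exhausts the query so the reader releases its read lock
    count = reader.execute("select count(*) from lines").fetchall()[0][0]
    visible_counts.append(count)

writer.close()
reader.close()
print(visible_counts)
```

The reader only ever sees whole batches, which matches the five-at-a-time pattern seen in the manual test.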

simonw closed this as completed in d1d2a8e Jan 26, 2022
simonw added a commit that referenced this issue Jan 26, 2022
simonw added a commit that referenced this issue Feb 4, 2022