Bulk insertion: avoid committing each time a triple is added #9
Comments
Have you tried the
I have submitted a pull request for the Graph.addN method, which does not currently write the content in the database.
For the record, I think @fconil touched a broader issue here, which I tried to summarize as issue #357 on rdflib.
I am bumping this quite old issue because it is relevant to my use cases as well. Currently, most of the performance drop when using the SQLAlchemy store in my project comes from execute / commit time. I am trying to see whether a simple fix could be proposed for this issue, based on the earlier code by @fconil.
I'm closing this issue since the stated problem of bulk insertion is well supported by I do, however, see the need to support more flexible transaction management for other use cases involving multiple reads and writes to the triple store, and for external transaction management. Currently, there is a mix of manual transaction management logic and the absence thereof, which needs to be removed to support SQLAlchemy's existing transaction management. This does require additional application logic to manage transactions, but I see that as just the cost of working with a transactional store.
Hello,
The current behaviour of the plugin makes SQLAlchemy work in autocommit mode.
Each time a triple is added, it is committed, which makes bulk insertion very slow: 1 min 30 s for 500 triples with SQLite and 25 seconds with MySQL (using the triples files of rdflib-benchmark).
The old rdflib-mysql plugin was not issuing a commit on each triple insertion but only when the commit method of the store was used.
I made a quick and dirty change to the plugin to test the impact on performance: begin a transaction when the store is opened and commit only when the store's commit method is called.
In this context, 500 triples are added in 0.3 seconds with SQLite and 1.15 seconds with MySQL.
https://github.com/ktbs/rdflib-sqlalchemy/blob/avoid_autocommit/rdflib_sqlalchemy/SQLAlchemy.py
Maybe autocommitting could be a store parameter?
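To make the idea concrete, here is a minimal sketch of what such a parameter could look like. It uses only Python's built-in `sqlite3` module, and the `TripleStore` class and its `autocommit` argument are hypothetical names made up for illustration; this is not the rdflib-sqlalchemy API or the code in the branch linked above.

```python
import sqlite3


class TripleStore:
    """Hypothetical minimal store, for illustration only."""

    def __init__(self, path=":memory:", autocommit=True):
        # autocommit=True mimics the current behaviour (one commit per triple);
        # autocommit=False defers the commit until commit() is called.
        self.autocommit = autocommit
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS triples (s TEXT, p TEXT, o TEXT)"
        )
        self.conn.commit()

    def add(self, triple):
        # sqlite3 implicitly opens a transaction before the INSERT.
        self.conn.execute("INSERT INTO triples VALUES (?, ?, ?)", triple)
        if self.autocommit:
            self.conn.commit()

    def commit(self):
        self.conn.commit()

    def rollback(self):
        self.conn.rollback()

    def __len__(self):
        return self.conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]


# Bulk insertion with a single commit at the end.
store = TripleStore(autocommit=False)
for i in range(500):
    store.add((f"http://example.org/s{i}", "http://example.org/p", "o"))
store.commit()
```

With `autocommit=False` the caller becomes responsible for calling `commit()`, and anything added since the last commit is lost on `rollback()`. Keeping `autocommit=True` as the default would preserve the existing behaviour for current users.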
Regards