Optimize Query insertMangaImage & Typo Fix #678

Merged
merged 4 commits into from Apr 27, 2020

jwshields
Contributor

Hello!

This PR changes the insertMangaImage function, which inserts images into the Manga table.
It also fixes a typo in the dropDatabase function.

  1. I have renamed the insertMangaImage function to insertMangaImages so that the name more closely matches what the function does after these changes.
  2. I have changed a few lines in the process_image function to improve speed and resource utilization while inserting manga into the DB, and to better shape the data for the insertMangaImages function.
  3. The dropDatabase function had a typo in its query; I changed it from DROP IF EXISTS TABLE to DROP TABLE IF EXISTS (Docs: https://www.sqlite.org/lang_droptable.html)
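
To illustrate the typo fix in item 3, here is a minimal sketch that runs both statement forms against an in-memory SQLite database. The table name `demo_table` and the helper `try_drop` are made up for this example and are not taken from the project.

```python
import sqlite3


def try_drop(statement):
    """Run a DROP statement on a fresh in-memory DB.

    Returns None on success, or the SQLite error text on failure.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, created only so there is something to drop.
        conn.execute("CREATE TABLE demo_table (id TEXT)")
        conn.execute(statement)
        conn.commit()
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()


old_result = try_drop("DROP IF EXISTS TABLE demo_table")  # misordered form
new_result = try_drop("DROP TABLE IF EXISTS demo_table")  # corrected form

print("old form:", old_result)
print("new form:", new_result)
```

The misordered form is rejected by SQLite as a syntax error, while the corrected form succeeds whether or not the table exists.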

Explanation & Reasoning:

The previous method invoked the insertMangaImage function once for each page in a manga. This can create a lot of overhead: variables are passed around repeatedly, extra function calls are made, and a connection to the SQLite DB is opened, committed, and closed for every page.

The changes I have made reduce that to invoking the insertMangaImages function once, opening and closing a single connection to the DB, and adding all needed rows to the DB in one batch operation.
This should result in a large increase in the speed of inserting manga, though the gain depends largely on the number of pages in an image/artwork.
In my manual testing in the Python console with dummy data, I've seen about a 3-5x increase in speed, but my testing may not be a direct parallel.
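
The batching idea described above can be sketched roughly as follows. This is not the project's actual code: the function `insert_manga_images`, the table `manga_image_demo`, its columns, and its primary key are all assumptions chosen for illustration, and the sketch takes an already-open connection rather than opening one itself.

```python
import sqlite3


def insert_manga_images(conn, rows):
    """Hypothetical batch insert: one cursor, one executemany, one commit."""
    cur = conn.cursor()
    cur.executemany(
        "INSERT OR IGNORE INTO manga_image_demo VALUES (?, ?, ?)", rows
    )
    conn.commit()
    cur.close()


conn = sqlite3.connect(":memory:")
# Assumed schema; the primary key lets INSERT OR IGNORE skip repeated pages.
conn.execute(
    "CREATE TABLE manga_image_demo ("
    "image_id TEXT, page TEXT, save_name TEXT, "
    "PRIMARY KEY (image_id, page))"
)

# Collect every page of one artwork first, then insert them in a single call.
pages = [("12345", str(p), f"12345_p{p}.jpg") for p in range(20)]
insert_manga_images(conn, pages)
insert_manga_images(conn, pages)  # second call is ignored row by row

row_count = conn.execute("SELECT COUNT(*) FROM manga_image_demo").fetchone()[0]
print(row_count)
conn.close()
```

The caller builds the full list of page rows up front, so the database sees one statement batch and one commit per artwork instead of one per page.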

Please let me know if you think there are other changes needed here, or if you have feedback on these changes.


Info:

import sqlite3
import timeit

# In-memory database shared by all benchmark functions below.
tempdb = sqlite3.connect(":memory:")
cur = tempdb.cursor()

def redoTable():
    # Recreate the table so every timed run starts from an empty table.
    cur.execute('''DROP TABLE IF EXISTS temp_manga_table''')
    tempdb.commit()
    cur.execute('''CREATE TABLE temp_manga_table (image_id TEXT, save_name TEXT, page TEXT);''')
    tempdb.commit()

def genlist():
    # Build 150 dummy rows, standing in for the pages of one manga.
    templist = list()
    tempnum = 0
    while 150 > tempnum:
        tempitem = (f"aaa{tempnum}", f"aaa{tempnum}", f"aaa{tempnum}")
        templist.append(tempitem)
        tempnum += 1
    return templist

def insertone():
    # Old approach: a new cursor, one INSERT, and one commit per row.
    templist = genlist()
    for item in templist:
        tempcur = tempdb.cursor()
        tempcur.execute('''INSERT OR IGNORE INTO temp_manga_table VALUES(?,?,?)''', item)
        tempdb.commit()

def insertmany():
    # New approach: one cursor, one executemany, and one commit for all rows.
    templist = genlist()
    tempcur = tempdb.cursor()
    tempcur.executemany('''INSERT OR IGNORE INTO temp_manga_table VALUES(?,?,?)''', templist)
    tempdb.commit()

def insertandcreateone():
    redoTable()
    insertone()

def insertandcreatemany():
    redoTable()
    insertmany()

# Each call is followed by the value it returned (total elapsed seconds).
timeit.timeit(insertandcreateone, number=50)
# 0.05511809999995876
timeit.timeit(insertandcreatemany, number=50)
# 0.019706499999983862

timeit.timeit(insertandcreateone, number=500)
# 0.5120501000000104
timeit.timeit(insertandcreateone, number=500)
# 0.5187395000000379
timeit.timeit(insertandcreatemany, number=500)
# 0.17566390000001775
timeit.timeit(insertandcreatemany, number=500)
# 0.18835190000004332

timeit.timeit(insertandcreateone, number=5000)
# 5.130790800000341
timeit.timeit(insertandcreatemany, number=5000)
# 1.776915800000097

timeit.timeit(insertandcreateone, number=50000)
# 52.0960113000001
timeit.timeit(insertandcreatemany, number=50000)
# 17.997701900000266
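
For reference, the speedup implied by the timings above can be computed directly. The sketch below copies the posted numbers (averaging the two runs at number=500) and divides the per-row time by the batch time. The names `timings` and `speedups` are introduced only for this example. On these posted numbers the ratio works out to a little under 3x at every run count. The benchmark reuses one in-memory connection, so it may not reflect the per-page open/close cost described earlier.

```python
# (per-row seconds, batch seconds) for each timeit "number" value, copied
# from the console output above; the number=500 entries average two runs.
timings = {
    50: (0.05511809999995876, 0.019706499999983862),
    500: (
        (0.5120501000000104 + 0.5187395000000379) / 2,
        (0.17566390000001775 + 0.18835190000004332) / 2,
    ),
    5000: (5.130790800000341, 1.776915800000097),
    50000: (52.0960113000001, 17.997701900000266),
}

# Speedup = time of the per-row approach divided by time of the batch approach.
speedups = {n: one / many for n, (one, many) in timings.items()}

for n, ratio in speedups.items():
    print(f"number={n}: {ratio:.2f}x faster with executemany")
```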

jwshields and others added 4 commits April 13, 2020 23:28
Move insertNewMember statement higher in the processMember function to prevent a "race" condition where the member is not always inserted into the table.
@Nandaka Nandaka merged commit 9828d39 into Nandaka:master Apr 27, 2020
byjtje pushed a commit to byjtje/PixivUtil2 that referenced this pull request Oct 30, 2020
* Update autoAddMember

Move insertNewMember statement higher in the processMember function to prevent a "race" condition where the member is not always inserted into the table.

* Optimize DB Query InsertManga

* Fix typo in dropDatabase function