Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multithreading via config #86

Closed
mufeedali opened this issue Jan 19, 2021 · 11 comments
Closed

Allow multithreading via config #86

mufeedali opened this issue Jan 19, 2021 · 11 comments
Labels
bug Something isn't working enhancement New feature or request
Milestone

Comments

@mufeedali
Copy link

mufeedali commented Jan 19, 2021

Ever since wn 0.4.0 , my application has been unable to use wn with the following error:

sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread.

Full traceback:

Traceback (most recent call last):
  File "~/Projects/Wordbook/_build/testdir/share/wordbook/wordbook/window.py", line 246, in _on_search_clicked
    out = self.__search(text)
  File "~/Projects/Wordbook/_build/testdir/share/wordbook/wordbook/window.py", line 356, in __search
    return base.reactor(
  File "~/Projects/Wordbook/_build/testdir/share/wordbook/wordbook/base.py", line 381, in reactor
    return generate_definition(
  File "~/Projects/Wordbook/_build/testdir/share/wordbook/wordbook/base.py", line 92, in generate_definition
    return get_data(text, wordcol, sencol, wn_instance, accent)
  File "~/Projects/Wordbook/_build/testdir/share/wordbook/wordbook/base.py", line 154, in get_data
    definition = get_definition(term, word_col, sen_col, wn_instance)
  File "~/Projects/Wordbook/_build/testdir/share/wordbook/wordbook/base.py", line 171, in get_definition
    synsets = wn_instance.synsets(term)  # Get relevant synsets.
  File "/usr/lib/python3.9/site-packages/wn/_core.py", line 909, in synsets
    return [Synset(*synset_data, self) for synset_data in iterable]
  File "/usr/lib/python3.9/site-packages/wn/_core.py", line 909, in <listcomp>
    return [Synset(*synset_data, self) for synset_data in iterable]
  File "/usr/lib/python3.9/site-packages/wn/_queries.py", line 269, in find_synsets
    rows: Iterator[_Synset] = conn.execute(query, params)
sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 140212385756736 and this is thread id 140212724569920.

Any pointers as to how to work around this would be much appreciated.

Edit: What I mean is, the same thread check should probably be disabled for read operations unless I'm missing something.

@goodmami
Copy link
Owner

In v0.4.0 I stored an open database connection on each wn.Wordnet object that was created (and note that one is implicitly created when using the functions wn.words(), wn.synsets(), etc.). Each entity object (wn.Word, wn.Synset, etc.) kept a pointer to the wn.Wordnet object that created it, and thus to that open connection. Keeping an open connection made queries much faster, but it also brought some headaches with the ability to copy or pickle these objects (see #84), and apparently with multithreading.

In v0.4.1 (released yesterday) I changed it so the open connection is stored by the wn._db module and not on these user-namespace objects. I would first try to use v0.4.1 and see if that helps. If not, we'll have to find some other way around it, such as a way to disable connection sharing between calls.

Let me know how it goes with v0.4.1

@goodmami goodmami added the bug Something isn't working label Jan 20, 2021
@mufeedali
Copy link
Author

Sad to report that the same issue persists in the new version too. It's due to me initializing Wordnet in another thread and then calling operations in it. But I don't know a better way to implement this as of yet.

@goodmami
Copy link
Owner

I see, too bad. I'm not very familiar with the threading API (although I have used multiprocessing and subprocess), so can you please provide a minimal example that illustrates the problem? That will help me with debugging.

@mufeedali
Copy link
Author

As a quick example, this code doesn't work with wn 0.4.1 for me:

#!/usr/bin/python3
import os
import threading

import wn
from wn import Wordnet


class wordnet_thread_demo:
    wn_instance = None

    def __init__(self):
        wn_thread = threading.Thread(target=self.set_wn_instance)
        wn_thread.start()
        wn_thread.join()
        print(self.wn_instance.synsets("word"))

    def set_wn_instance(self):
        if self.wn_instance is None:
            self.wn_instance = Wordnet(lexicon="ewn:2020")


wordnet_thread_demo()

@goodmami
Copy link
Owner

Great, thanks, I was able to reproduce the issue. Also we might try setting check_same_thread=False on sqlite3.connect(). If you're only reading and not writing the database this should be fine. It allowed me to run your script without problems. Maybe this could be a parameter on the wn.config object?

@mufeedali
Copy link
Author

I think that might work for me. Just checking, doesn't wn.download() always create a separate db connection from when we create an instance of wn.Wordnet ? Because that's the only time any form of writing would be done in my application.

@goodmami
Copy link
Owner

Just checking, doesn't wn.download() always create a separate db connection from when we create an instance of wn.Wordnet ?

That might have been the case in v0.4.0, but from v0.4.1 one database connection is made per database location. Basically, there is only one connection, unless you do something, then reassign wn.config.data_directory, then do something else.

Writing isn't exactly a problem with the multithreading, it's just that the onus is on the caller to prevent multiple threads from writing to the database at the same time. If you only have one thread that is writing, there shouldn't be any conflicts.

@mufeedali
Copy link
Author

Ah, I see. Thanks for clearing that up! Sounds good to me.

@goodmami goodmami added the enhancement New feature or request label Jan 25, 2021
@goodmami goodmami changed the title Multithreading failure Allow multithreading via config Jan 25, 2021
@goodmami
Copy link
Owner

Renaming the issue and adding the 'enhancement' label as this is now a feature request for some config option that enables the check_same_thread=False option on sqlite3.connect().

@goodmami goodmami added this to the v0.5.0 milestone Jan 25, 2021
@goodmami
Copy link
Owner

This is now resolved in v0.5.0. See the allow_multithreading config option.

@mufeedali
Copy link
Author

Thank you! I'll check it out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants