This is a well-known issue with pickle: recursive data structures (such as the nested dicts used in py/pyahocorasick.py) fail to pickle once they reach a certain depth.
First I comment out the slots line in pyahocorasick.py (so that I do not have to implement the __*state__ methods): #__slots__ = ['char', 'output', 'fail', 'children']
Then I run this:
$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyahocorasick as pa
>>> for x in range(10):
...     key = str(range(x, x+100))
...
>>> t=pa.Trie()
>>> for x in range(10):
...     key = str(range(x, x+100))
...     t.add_word(key, x)
...
>>> import pickle
>>> p=pickle.dumps(t)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  [...]
  File "/usr/lib/python2.7/pickle.py", line 271, in save
    pid = self.persistent_id(obj)
RuntimeError: maximum recursion depth exceeded
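The failure does not depend on pyahocorasick itself. Here is a minimal sketch that reproduces it with a stand-in node class; `Node`, `build_chain` and `can_pickle` are hypothetical names made up for this illustration, and the depth of 100000 is an arbitrary value chosen to be far beyond the default recursion limit:

```python
import pickle


class Node(object):
    """Hypothetical stand-in for a trie node: children live in a nested dict."""

    def __init__(self, char):
        self.char = char
        self.children = {}


def build_chain(depth):
    """Build a single branch `depth` nodes long, iteratively (no recursion)."""
    root = Node("")
    node = root
    for _ in range(depth):
        child = Node("a")
        node.children["a"] = child
        node = child
    return root


def can_pickle(obj):
    """Return True if pickle.dumps succeeds, False if it hits the recursion limit."""
    try:
        pickle.dumps(obj)
    except RuntimeError:  # on Python 3, RecursionError is a RuntimeError subclass
        return False
    return True


shallow_ok = can_pickle(build_chain(10))      # a short branch pickles fine
deep_ok = can_pickle(build_chain(100000))     # a very long branch does not
print(shallow_ok, deep_ok)
```

Building the chain is not the problem, since that is done in a loop. It is pickle that descends one level per nested object while serializing.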
The obvious solution is to increase the recursion limit (sys.setrecursionlimit), but this fails quickly and can exhaust system resources. Add a few more, longer keys and it dies too:
>>> for x in range(100):
...     key = str(range(x, x+1000))
...     t.add_word(key, x)
...
>>> p=pickle.dumps(t)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  [...]
  File "/usr/lib/python2.7/pickle.py", line 271, in save
    pid = self.persistent_id(obj)
RuntimeError: maximum recursion depth exceeded
The culprit is that the Trie uses nested dictionaries (as opposed to the C Automaton, which uses an array-based trie structure). One way out would be to have a similar flat data structure in Python so that pickling works.
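One possible shape for that, as a rough sketch only: keep the nested dicts in memory, but flatten the nodes into a list of records when pickling. `FlatPickleTrie` and its `get` method are hypothetical names for this illustration, not the actual pyahocorasick classes. `Node` only mirrors the `__slots__` line quoted above. The sketch assumes the `fail` links can be recomputed after loading, so it does not store them.

```python
import pickle


class Node(object):
    """Hypothetical trie node; attribute names follow the quoted __slots__ line."""
    __slots__ = ['char', 'output', 'fail', 'children']

    def __init__(self, char):
        self.char = char
        self.output = None
        self.fail = None      # not persisted in this sketch
        self.children = {}


class FlatPickleTrie(object):
    """Hypothetical trie whose pickled state is a flat list of node records."""

    def __init__(self):
        self.root = Node('')

    def add_word(self, word, value):
        node = self.root
        for char in word:
            child = node.children.get(char)
            if child is None:
                child = Node(char)
                node.children[char] = child
            node = child
        node.output = value

    def get(self, word, default=None):
        node = self.root
        for char in word:
            node = node.children.get(char)
            if node is None:
                return default
        return default if node.output is None else node.output

    def __getstate__(self):
        # Walk the trie with an explicit stack and record each node as
        # (parent_index, char, output). A parent always precedes its children.
        records = []
        stack = [(self.root, -1)]
        while stack:
            node, parent_index = stack.pop()
            index = len(records)
            records.append((parent_index, node.char, node.output))
            for child in node.children.values():
                stack.append((child, index))
        return records

    def __setstate__(self, records):
        # Rebuild the nested structure in one pass over the flat records.
        nodes = []
        for parent_index, char, output in records:
            node = Node(char)
            node.output = output
            if parent_index >= 0:
                nodes[parent_index].children[char] = node
            nodes.append(node)
        self.root = nodes[0]


trie = FlatPickleTrie()
deep_key = 'a' * 100000          # far deeper than the default recursion limit
trie.add_word(deep_key, 1)
trie.add_word('abc', 2)
restored = pickle.loads(pickle.dumps(trie))
print(restored.get(deep_key), restored.get('abc'), restored.get('ab'))
```

Because pickle only ever sees a list of small tuples, the depth of the trie no longer matters for serialization. The cost is an extra pass to flatten and rebuild the nodes.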
Now why would I want to use this rather than the C automaton? As a go-between: to run a slower automaton on Windows for now, since the C version does not compile there yet.
Since this is really no longer an issue now that we have a Windows build with #11, and since the pure Python implementation is not meant to be a foolproof, full-featured implementation, I am closing this.