fused types #16

Open
orbisvicis opened this issue Aug 18, 2017 · 2 comments

Comments

@orbisvicis

Would it be possible to fuse the _unicode and _byte functions? Apart from fetching the next element (and, technically, the hard-coded sizeofs), both Cython code paths are identical, I think. The pure-Python code is 100% identical. I'm not familiar with Cython, and the code seems more complicated than the standard ahocorasick implementation, so maybe not. I'm thinking of some lookup table mapping type -> get_next_element_function. Perhaps then the container type can be extended to any (Python) sequence type, and the contained type can be any (Python) comparable type. Cython automatically selects the fastest type, I think, so str->UCS4, bytes->char, int->int?, bool->bint, and so on.
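To make the lookup-table idea concrete, here is a minimal pure-Python sketch. Everything in it is hypothetical: `ELEMENT_ITERATORS`, `elements` and `find_all` are illustrative names, not part of acora, and the naive scan merely stands in for the real search loop to show that one code path can serve several container types once element fetching is factored out.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List


def iter_str(data: str) -> Iterator[int]:
    # Text is consumed as code points (roughly what UCS4 access would give).
    return map(ord, data)


def iter_bytes(data: bytes) -> Iterator[int]:
    # bytes and bytearray already iterate as integers in 0..255.
    return iter(data)


# Hypothetical lookup table: container type -> "get next element" strategy.
ELEMENT_ITERATORS: Dict[type, Callable[[Any], Iterator[Any]]] = {
    str: iter_str,
    bytes: iter_bytes,
    bytearray: iter_bytes,
}


def elements(data: Iterable[Any]) -> Iterator[Any]:
    # Fall back to plain iteration for any other sequence type.
    get_iter = ELEMENT_ITERATORS.get(type(data), iter)
    return get_iter(data)


def find_all(haystack: Iterable[Any], needle: Iterable[Any]) -> List[int]:
    # Single shared search path; a naive scan, NOT Aho-Corasick.
    hay = list(elements(haystack))
    pat = list(elements(needle))
    if not pat:
        return []
    width = len(pat)
    return [i for i in range(len(hay) - width + 1) if hay[i:i + width] == pat]
```

The same `find_all` then accepts `str`, `bytes`, or a list of any comparable items, for example `find_all("abcabc", "bc")` and `find_all([1, 2, 1, 2], [1, 2])`. In Cython the dispatch would presumably happen at compile time rather than through a dictionary, so this only illustrates the shape of the idea, not its performance.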

I'd also like to support the C bitarray extension module, which is a char*-backed List[Bool], without the intermediate boxing of each bit into a Python boolean. Any ideas?

@scoder
Owner

scoder commented Aug 20, 2017

Good question. I remember thinking about merging the two at some point, but given how critical the performance is here, ended up optimising them separately. I even recall making them more similar again at some point, but that was already a while ago...

I would expect bitarray to export a buffer, which could then be unpacked and used in acora. Since the bytes type (and Python's bytearray, array.array, memoryview and others) supports the buffer interface as well, it should be enough to switch the current bytes implementation to take char[:] buf (or unsigned char[:]?) as input and pass &buf[0] and its length.
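As a rough illustration of the buffer-interface point, the following Python sketch uses `memoryview` as a stand-in for a Cython `unsigned char[:]` argument. The helper names `as_byte_view` and `count_byte` are made up for this example and are not acora functions; the sketch only shows that one function can accept every object that exports a contiguous buffer.

```python
from array import array


def as_byte_view(obj) -> memoryview:
    # memoryview() raises TypeError for objects without the buffer interface.
    view = memoryview(obj)
    if not view.c_contiguous:
        raise ValueError("expected a C-contiguous buffer")
    # Reinterpret the underlying memory as unsigned bytes.
    return view.cast("B")


def count_byte(obj, value: int) -> int:
    # One code path for bytes, bytearray, array.array, memoryview, ...
    view = as_byte_view(obj)
    return sum(1 for b in view if b == value)
```

For example, `count_byte(b"abca", ord("a"))`, `count_byte(bytearray(b"abca"), ord("a"))` and `count_byte(array("B", [97, 98, 97]), 97)` all go through the same function, while a `str` argument is rejected with `TypeError` because text does not export a buffer. Any other type that exports a contiguous buffer should work the same way.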

@scoder
Owner

scoder commented Aug 20, 2017

Want to give it a try?
