Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: more flexible Function interface for non-String args #57

Closed
dhleong opened this issue Jan 13, 2018 · 7 comments
Closed

Proposal: more flexible Function interface for non-String args #57

dhleong opened this issue Jan 13, 2018 · 7 comments
Assignees

Comments

@dhleong
Copy link
Contributor

dhleong commented Jan 13, 2018

In particular, accessing matchinfo() in a custom rank() function is a bit tricky. I see #39 suggests just writing it in C and I may eventually want to do that, but for now I have a rather hacky workaround:

            val bytes = packed.toByteArray(charset = utf16)
            val buffer = ByteBuffer.wrap(bytes)
            buffer.order(ByteOrder.nativeOrder())

            val longBuf = buffer.asLongBuffer()
            val ns = IntArray(bytes.size / 8) {
                longBuf.get().toInt()
            }

This works, but for each call to rank() it allocates the String, the Byte Array it contains, the String Array to hold the String, the Byte Array copy from the String, and the IntArray that ultimately holds the results. This is because the Int32s from sqlite's matchinfo() get packed into a UTF-16 string. xerial's JDBC client calls into some native functions to access the passed arguments directly, to avoid all those "middle man" allocations.

Now, lacking this sort of implementation on Android I can't benchmark to check if the allocation overhead would be worse than the JNI overhead, but that is my suspicion. A Function2 interface would also be more intuitive for these use cases, and avoids clients having to deal with NDK, which I think fits with the original goals for this project.

My hacky workaround seems to be performing okay for my dataset right now, but I wanted to put up this proposal to test the waters and see how difficult you guys think it'd be, and whether it's something you'd even be interested in. I'd be interesting in working on this myself, but I'm pretty busy with other parts of my project and it's not yet critical for me, so I didn't want to start without a nod of interest, and maybe a nudge in the right direction ;)

@npurushe npurushe self-assigned this Jan 17, 2018
@npurushe
Copy link
Contributor

Thanks for the detailed issue! yes I think having a way to read arguments in a CustomFunction interface makes sense, instead of just passing them all as strings.

@dhleong
Copy link
Contributor Author

dhleong commented Feb 12, 2018

Closed by #61 :)

@dhleong dhleong closed this as completed Feb 12, 2018
@mortenholmgaard
Copy link

mortenholmgaard commented Feb 24, 2018

Please make a release with this @npurushe

And how did @dhleong solve the matchinfo parsing with this feature? Which type did you use?
I have some problems getting the matchinfo parsed correctly (signed/unsinged types! and more)
I the code you added in the inital case, kotlin? I am having trouble with this line in java: longBuf.get().toInt()

@dhleong
Copy link
Contributor Author

dhleong commented Feb 24, 2018

Yep, it's kotlin. That code was the workaround for using the old Strings interface, and doesn't actually work super well in all cases, which is why I submitted the PR.

If you use the code from #61, it's much simpler. The matchinfo() function returns a Blob that you can wrap in a ByteBuffer:

          // for a call like rank(matchinfo(index, 'nlpx')):
            val buffer = ByteBuffer.wrap(args.getBlob(0))

            val totalDocsCount = buffer.int  // 'n'
            val documentSize = buffer.int  // 'l'
            val phrases = buffer.int  // 'p'

            val hitCount = buffer.int  // 'x[0]'
            buffer.int // skip x[1]
            val docsWithHits = buffer.int  // 'x[2]'

@dhleong
Copy link
Contributor Author

dhleong commented Feb 24, 2018

Also, since I may have misunderstood your question and maybe you were actually not sure what the kotlin was doing, the .toInt() there was just casting the result of longBuf.get() to an int; Kotlin doesn't let you just cast between primitive types; you have to explicitly convert them.

The .int calls in the above example code actually compile to .getInt() calls on the ByteBuffer. It's part of Kotlin's automatic conversion of getters and setters to "properties."

@mortenholmgaard
Copy link

Thanks a lot!!
It works great with getBlob. I have spend a lot of hours to try to make the SQLiteCustomFunction work with matchinfo.
Do you know if we should use ByteOrder.nativeOrder() make sure to support all devices or just ByteOrder.LITTLE_ENDIAN for the sqlite blob?

@dhleong
Copy link
Contributor Author

dhleong commented Feb 25, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants