Avoid unmarshalling QueryRes.Hits into json default types #744

Open
wants to merge 2 commits into
base: master

@georgantasp georgantasp commented Nov 2, 2023

| Q | A |
| --- | --- |
| Bug fix? | no |
| New feature? | yes? |
| BC breaks? | no |
| Related Issue | Fix #743 |
| Need Doc update | yes |

Describe your change

Unmarshal QueryRes.Hits into a dedicated Hit struct that holds json.RawMessage (a defined type whose underlying type is []byte) and the bare minimum of object attributes. Implement custom UnmarshalJSON and MarshalJSON to handle this json optimization.

It should be considered a breaking change because QueryRes.Hits was previously exported as []map[string]interface{}. However, it is not breaking for developers who use QueryRes.UnmarshalHits.

What problem is this fixing?

During the unmarshalling of the QueryRes object, the existing implementation completely unmarshals each hit object into json default types. As a result, the method QueryRes.UnmarshalHits re-marshals the default types just to unmarshal them again into the user's desired type. For large result sets and/or large objects, this can be wasteful.
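The round trip described above can be sketched as follows. This is an illustration with hypothetical names (Product, viaDefaultTypes, direct), not the client's actual code: the first function makes three passes over the data, the second makes one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Product is a hypothetical user-defined hit type.
type Product struct {
	ObjectID string `json:"objectID"`
	Name     string `json:"name"`
}

// viaDefaultTypes mimics the behaviour described in the PR: decode into
// default types, re-encode, then decode again into the target type.
func viaDefaultTypes(body []byte) ([]Product, error) {
	var res struct {
		Hits []map[string]interface{} `json:"hits"`
	}
	if err := json.Unmarshal(body, &res); err != nil { // pass 1: JSON -> maps
		return nil, err
	}
	tmp, err := json.Marshal(res.Hits) // pass 2: maps -> JSON
	if err != nil {
		return nil, err
	}
	var out []Product
	if err := json.Unmarshal(tmp, &out); err != nil { // pass 3: JSON -> structs
		return nil, err
	}
	return out, nil
}

// direct decodes the hits into the target type in a single pass.
func direct(body []byte) ([]Product, error) {
	var res struct {
		Hits []Product `json:"hits"`
	}
	if err := json.Unmarshal(body, &res); err != nil {
		return nil, err
	}
	return res.Hits, nil
}

func main() {
	body := []byte(`{"hits":[{"objectID":"1","name":"a"},{"objectID":"2","name":"b"}]}`)
	slow, err := viaDefaultTypes(body)
	if err != nil {
		panic(err)
	}
	fast, err := direct(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(slow), len(fast), slow[0] == fast[0])
}
```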

Updated 2024-05-08:

[Image: trace with http.request and UnmarshalHits spans]
The gap between the green http.request span and the UnmarshalHits span is where the Algolia client unmarshals the response into the QueryRes struct and its Hits ([]map[string]interface{}). The UnmarshalHits span takes even longer than the gap because the client re-marshals the Hits just to unmarshal them into the user's desired struct.

georgantasp changed the title from "Full Unmarshaling and Remarshalling of Hits should be avoided" to "Avoid unmarshalling QueryRes.Hits into json default types" on Nov 2, 2023
Contributor

vredens commented Mar 18, 2024

+1

Contributor

Fluf22 commented Jun 21, 2024

Hey @georgantasp 👋🏻

I'm really sorry we took so long to get back to you.

We can't merge your PR as-is because it would introduce breaking changes, which we won't do until we release our new client, at least.

However, if you're up to slightly updating your PR (and I promise I'll answer faster this time), you can introduce a new RawHits variable that would be a []byte directly.
That way, it will give you (and everyone else) raw data you can unmarshal at your convenience in your project with your own structs!

Feel free to reach out if you disagree or want more explanations ☺️

Thanks!

@georgantasp
Author

Hey @Fluf22 👋

Understood about the breaking change. I expected that might force the decision.

Your alternative idea is interesting...
My first thought is that it would be doubling the memory requirements. Probably not a show stopper for us, but could affect others?
I played with the code quickly... The encoding/json package doesn't support decoding the same JSON key into two struct fields, never mind two struct fields of differing types. With a custom UnmarshalJSON/MarshalJSON on QueryRes I can get it to work, but it feels pretty hacky. Let me know what you think.

I had also heard that there was a brand new client in the works. Has that gotten any closer to release?

Thanks for your review. Cheers

Contributor

Fluf22 commented Jun 21, 2024

I'll get back to you on Monday about this alternative. I hadn't considered that it would be painful to unmarshal the same prop into two different types.

About the new clients, we are targeting GA this quarter.

Sadly, despite what was initially planned, I don't think the first release of this new version will ship with generics, as we try to use struct methods, and type parameters on methods are not allowed in Go.

Contributor

Fluf22 commented Jun 25, 2024

@georgantasp you're right that it may not be the most efficient way to solve this issue, in the end. We'd better not double the memory consumption (for holding hits), if possible.

We just implemented something that could be useful for you in the beta version, if you're willing to try. Each method has a derivative xxxWithHTTPInfo that returns the raw response, allowing you to unmarshal it into your desired struct.
If you're not tied to the current version, and ready to go through some code refactoring in order to migrate to the beta version, this could be a potential solution.

Let me know if migrating to the beta version is too much work just to get a raw []byte; we can still plan the RawHits alternative, even though it's not really performant.

Contributor

Fluf22 commented Jun 25, 2024

Forgot to add an example, so it's more clear: here is the search method, returning the raw data from the API call, in the beta version of the client

Contributor

Fluf22 commented Jul 3, 2024

Hey @georgantasp,
Just coming back to you to see if you had time to check if the beta migration could be suitable or not?
