String De-Duplication Support #3378

Closed
thalapura-alapurom opened this issue Nov 21, 2023 · 6 comments

@thalapura-alapurom

thalapura-alapurom commented Nov 21, 2023

Hi

During ExecutableGraphQL generation, it would be better to deduplicate strings, since duplicated String objects occupy a considerable amount of memory.

You can see that String objects alone take up 4 MB.
[Screenshot 2023-11-21 at 2 43 21 PM]

Within that, due to duplication, a single string accounts for around 674 KB. I took only 2 objects for reference.
[Screenshot 2023-11-21 at 2 53 00 PM]

So we would recommend optimising this object with string deduplication. It would also be great if the GraphQL object were serialisable (for compression purposes).

@bbakerman
Member

Do you have any more insight into how we are duplicating strings? I don't understand what you are showing in that second screenshot.

In general I would assume we mostly just point to string objects created initially from the SDL parse.

eg

        builder.name(typeDefinition.getName());

==

    public B name(String name) {
        this.name = name;
        return (B) this;
    }

The above is mostly what we do: we store a reference to the original string rather than a copy.

So I would love to get insight into where we might be doing something like String sCopy = new String(sOriginal.toCharArray()), say.
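To make the distinction concrete, here is a minimal sketch of the two cases. It is not graphql-java code: `NamedThing` is a hypothetical stand-in for a builder that simply stores the reference it is given, and the class and method names are assumptions made for illustration.

```java
public class StringReferenceDemo {

    /** Hypothetical stand-in for a builder that stores the passed-in reference. */
    public static final class NamedThing {
        private String name;

        public NamedThing name(String name) {
            this.name = name; // no copy: the field points at the caller's String
            return this;
        }

        public String getName() {
            return name;
        }
    }

    /** True when the builder ends up holding the very same String instance. */
    public static boolean builderKeepsSameInstance(String original) {
        return new NamedThing().name(original).getName() == original;
    }

    /** True when an explicit copy has equal content but is a distinct object. */
    public static boolean copyIsDistinctButEqual(String original) {
        String copy = new String(original.toCharArray());
        return copy != original && copy.equals(original);
    }

    public static void main(String[] args) {
        String original = "field1";
        System.out.println(builderKeepsSameInstance(original));
        System.out.println(copyIsDistinctButEqual(original));
    }
}
```

Plain assignment never adds a second String object; duplication only appears when equal content is materialised more than once, for example by an explicit copy or by reading the same name from input twice.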

> Also it will be great if the GraphQL object is serialisable (For compression purpose).

A GraphQL object contains data and code. So how would custom code, like a registry of DataFetchers, be serialized? Also, how does serialisation relate to compression in an in-memory runtime sense?

@thalapura-alapurom
Author

Let's say I have 2 types that share a common field, say field1. When the SDL is parsed, 2 separate field1 String objects are created, and that is where the duplication comes from.

So if we use string interning instead of assigning the value directly, only one instance will exist for each distinct string.

    public B name(String name) { this.name = name; return (B) this; }

would be replaced with

    public B name(String name) { this.name = name.intern(); return (B) this; }
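As a rough illustration of the effect being proposed, the sketch below shows two equal strings created separately at runtime, and the same two strings after `String.intern()`. The class and helper names are hypothetical, and nothing here is taken from graphql-java or from any PR mentioned in this thread.

```java
public class InternDemo {

    /** Creates a new String object at runtime, similar to what a parser produces per token. */
    public static String freshCopy(String s) {
        return new String(s.toCharArray());
    }

    /** Two separately created strings with equal content are distinct objects. */
    public static boolean distinctBeforeIntern(String value) {
        String a = freshCopy(value);
        String b = freshCopy(value);
        return a != b && a.equals(b);
    }

    /** After intern(), equal strings resolve to one canonical instance. */
    public static boolean sharedAfterIntern(String value) {
        String a = freshCopy(value).intern();
        String b = freshCopy(value).intern();
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(distinctBeforeIntern("field1"));
        System.out.println(sharedAfterIntern("field1"));
    }
}
```

One trade-off to keep in mind: `intern()` places strings in the JVM-wide string table, so a library doing this affects the whole process. A narrower alternative would be a deduplication map scoped to a single schema build, though that is only a suggestion and not something this thread confirms.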

@thalapura-alapurom
Author

Regarding serialisation: if an object is serialisable, I can convert it into a sequence of characters and store it in a file system. Later we can retrieve it and deserialise it back into the original object.

@stefanstrat27

> Regarding serialisation, if an object is serialisable, I can convert the object into a sequence of characters and store it in a file system. Later we can retrieve the same and deserialise into original object

I'm not 100% sure, but you might be able to store it as a binary file (the Java object itself) rather than as a text file (serialized into JSON or something).
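For what it's worth, here is a minimal sketch of what binary storage with compression could look like using standard Java serialization and GZIP. `Holder` is a hypothetical class that mixes data with a code-like field; it is not a graphql-java type, and the `fetcher` field is only an assumed analogue of something like a DataFetcher. The field is marked `transient`, so it is skipped on write and comes back as null, which is the practical problem with serializing objects that carry code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class SerializationDemo {

    /** Hypothetical holder: plain data plus a code-like field. */
    public static final class Holder implements Serializable {
        private static final long serialVersionUID = 1L;

        public final String typeName;
        // transient: not written by Java serialization, so it is null after a round trip
        public final transient Function<String, String> fetcher;

        public Holder(String typeName, Function<String, String> fetcher) {
            this.typeName = typeName;
            this.fetcher = fetcher;
        }
    }

    /** Serializes the object to a GZIP-compressed binary form. */
    public static byte[] toCompressedBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    /** Reads an object back from the compressed binary form. */
    public static Object fromCompressedBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new GZIPInputStream(new ByteArrayInputStream(data)))) {
            return in.readObject();
        }
    }

    /** Convenience round trip that wraps checked exceptions. */
    public static Holder roundTrip(Holder holder) {
        try {
            return (Holder) fromCompressedBytes(toCompressedBytes(holder));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Holder restored = roundTrip(new Holder("Query", s -> s.toUpperCase()));
        System.out.println(restored.typeName);
        System.out.println(restored.fetcher == null);
    }
}
```

The data survives the round trip, but the behaviour has to be re-attached after loading. This also only reduces the size of the stored bytes, not the heap footprint of the live object.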

@bbakerman
Member

For the record, this PR (originally from our friends at Netflix) will help in this regard:

#3504

@bbakerman
Member

Closing this issue as addressed (somewhat) by the linked PR, which ships in v22.
