Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak in live loader #5473

Merged
merged 2 commits into from
May 26, 2020
Merged

Fix memory leak in live loader #5473

merged 2 commits into from
May 26, 2020

Conversation

ashish-goswami
Copy link
Contributor

@ashish-goswami ashish-goswami commented May 19, 2020

This PR aims to fix a memory leak in live loader. Currently while getting uid for a NQuad
subject/uid object, we call xidMap.AssignUid(). This subject/uid object is of string type.
We suspect that these uids(strings) are still holding references to original string, which
we get after reading single lines in checker.Parse(). Hence GC is unable to clean these strings.
In this PR, while getting uid for any NQuad subject/uid object, we pass a copy of it to
xidMap.AssignUid().

Original profile:

File: dgraph
Build ID: b80d1c50901c66e79b21afb1ed72a92bed950dc9
Type: inuse_space
Time: May 19, 2020 at 3:52pm (IST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 62468.74MB, 100% of 62486.76MB total
Dropped 28 nodes (cum <= 312.43MB)
      flat  flat%   sum%        cum   cum%
42551.37MB 68.10% 68.10% 42551.37MB 68.10%  github.com/dgraph-io/dgraph/xidmap.(*XidMap).AssignUid
19902.91MB 31.85% 99.95% 19902.91MB 31.85%  bytes.(*Buffer).ReadString
   13.50MB 0.022%   100% 19923.42MB 31.88%  github.com/dgraph-io/dgraph/chunker.(*rdfChunker).Parse
    0.95MB 0.0015%   100% 42554.34MB 68.10%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile.func1
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processFile
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile
         0     0%   100% 42553.38MB 68.10%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).uid
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.run.func2

This PR:

File: dgraph
Build ID: d49e85006a57b932fa73218470d0e0b97fbb81d8
Type: inuse_space
Time: May 19, 2020 at 7:33pm (IST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 28967.65MB, 99.43% of 29134.97MB total
Dropped 71 nodes (cum <= 145.67MB)
      flat  flat%   sum%        cum   cum%
16028.87MB 55.02% 55.02% 16034.87MB 55.04%  github.com/dgraph-io/dgraph/xidmap.(*XidMap).AssignUid
12928.89MB 44.38% 99.39% 12928.89MB 44.38%  strings.(*Builder).WriteString
    8.88MB  0.03% 99.42% 28990.15MB 99.50%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile.func1
       1MB 0.0034% 99.43% 28975.27MB 99.45%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).uid


This change is Reviewable

Docs Preview: Dgraph Preview

@github-actions github-actions bot added the area/live-loader Issues related to live loading. label May 19, 2020
@ashish-goswami ashish-goswami changed the title [WIP] Fix memory leak in live loader Fix memory leak in live loader May 20, 2020
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @vvbalaji-dgraph)

@ashish-goswami ashish-goswami merged commit 0ce7ff9 into master May 26, 2020
@ashish-goswami ashish-goswami deleted the ashish/liveloader branch May 26, 2020 11:28
ashish-goswami added a commit that referenced this pull request May 26, 2020
This PR aims to fix a memory leak in live loader. Currently while getting uid for a NQuad
subject/uid object, we call xidMap.AssignUid(). This subject/uid object is of string type.
We suspect that these uids(strings) are still holding references to original string, which
we get after reading single lines in checker.Parse(). Hence GC is unable to clean these strings.
In this PR, while getting uid for any NQuad subject/uid object, we pass a copy of it to
xidMap.AssignUid().

Original profile:

File: dgraph
Build ID: b80d1c50901c66e79b21afb1ed72a92bed950dc9
Type: inuse_space
Time: May 19, 2020 at 3:52pm (IST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 62468.74MB, 100% of 62486.76MB total
Dropped 28 nodes (cum <= 312.43MB)
      flat  flat%   sum%        cum   cum%
42551.37MB 68.10% 68.10% 42551.37MB 68.10%  github.com/dgraph-io/dgraph/xidmap.(*XidMap).AssignUid
19902.91MB 31.85% 99.95% 19902.91MB 31.85%  bytes.(*Buffer).ReadString
   13.50MB 0.022%   100% 19923.42MB 31.88%  github.com/dgraph-io/dgraph/chunker.(*rdfChunker).Parse
    0.95MB 0.0015%   100% 42554.34MB 68.10%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile.func1
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processFile
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile
         0     0%   100% 42553.38MB 68.10%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).uid
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.run.func2
This PR:

File: dgraph
Build ID: d49e85006a57b932fa73218470d0e0b97fbb81d8
Type: inuse_space
Time: May 19, 2020 at 7:33pm (IST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 28967.65MB, 99.43% of 29134.97MB total
Dropped 71 nodes (cum <= 145.67MB)
      flat  flat%   sum%        cum   cum%
16028.87MB 55.02% 55.02% 16034.87MB 55.04%  github.com/dgraph-io/dgraph/xidmap.(*XidMap).AssignUid
12928.89MB 44.38% 99.39% 12928.89MB 44.38%  strings.(*Builder).WriteString
    8.88MB  0.03% 99.42% 28990.15MB 99.50%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile.func1
       1MB 0.0034% 99.43% 28975.27MB 99.45%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).uid

(cherry picked from commit 0ce7ff9)
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 18, 2020
This PR aims to fix a memory leak in live loader. Currently while getting uid for a NQuad
subject/uid object, we call xidMap.AssignUid(). This subject/uid object is of string type.
We suspect that these uids(strings) are still holding references to original string, which
we get after reading single lines in checker.Parse(). Hence GC is unable to clean these strings.
In this PR, while getting uid for any NQuad subject/uid object, we pass a copy of it to
xidMap.AssignUid().

Original profile:

File: dgraph
Build ID: b80d1c50901c66e79b21afb1ed72a92bed950dc9
Type: inuse_space
Time: May 19, 2020 at 3:52pm (IST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 62468.74MB, 100% of 62486.76MB total
Dropped 28 nodes (cum <= 312.43MB)
      flat  flat%   sum%        cum   cum%
42551.37MB 68.10% 68.10% 42551.37MB 68.10%  github.com/dgraph-io/dgraph/xidmap.(*XidMap).AssignUid
19902.91MB 31.85% 99.95% 19902.91MB 31.85%  bytes.(*Buffer).ReadString
   13.50MB 0.022%   100% 19923.42MB 31.88%  github.com/dgraph-io/dgraph/chunker.(*rdfChunker).Parse
    0.95MB 0.0015%   100% 42554.34MB 68.10%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile.func1
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processFile
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile
         0     0%   100% 42553.38MB 68.10%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).uid
         0     0%   100% 19931.42MB 31.90%  github.com/dgraph-io/dgraph/dgraph/cmd/live.run.func2
This PR:

File: dgraph
Build ID: d49e85006a57b932fa73218470d0e0b97fbb81d8
Type: inuse_space
Time: May 19, 2020 at 7:33pm (IST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 28967.65MB, 99.43% of 29134.97MB total
Dropped 71 nodes (cum <= 145.67MB)
      flat  flat%   sum%        cum   cum%
16028.87MB 55.02% 55.02% 16034.87MB 55.04%  github.com/dgraph-io/dgraph/xidmap.(*XidMap).AssignUid
12928.89MB 44.38% 99.39% 12928.89MB 44.38%  strings.(*Builder).WriteString
    8.88MB  0.03% 99.42% 28990.15MB 99.50%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processLoadFile.func1
       1MB 0.0034% 99.43% 28975.27MB 99.45%  github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).uid
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/live-loader Issues related to live loading.
Development

Successfully merging this pull request may close these issues.

None yet

2 participants