
perf issue: ipfs add file ~1.5 GB #234

Closed
btc opened this issue Oct 29, 2014 · 22 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@btc
Contributor

btc commented Oct 29, 2014

Executing ipfs add on a 1.5 GB file at 3a0bbe2:

  • Wall clock time: 2-4 minutes
  • Peak memory usage: ~6.35 GB

The same command executed on a 500 MB file consumed 2.13 GB at peak. Extrapolating wildly, peak memory is roughly 4x the file size in both cases (6.35 GB / 1.5 GB ≈ 4.2; 2.13 GB / 0.5 GB ≈ 4.3).

We may want to start benchmarking operations on data characteristic of common usage patterns. Next week, I can throw up a Jenkins box to run these jobs regularly.

LevelDB periodically rearranges data on disk to merge newly inserted items. Compaction rewrites existing data, slowing down other operations. The compaction algorithm blocks writers while background compaction is active.

😠 You're killin' me Smalls.

[image: ipfs-add-1.5gb-at-3a0bbe2aa8f]

btc added the kind/bug label Oct 29, 2014
@jbenet
Member

jbenet commented Oct 29, 2014

Woah, that's crazy. Is there a similar graph showing the used memory?

Also, I think part of the issue is that we read the whole file into memory. We should be able to add a 1.5 GB file to IPFS without using more than ~1 MB of memory (above the standard Go process memory usage) when adding sequentially (so xN for parallelism, but I think right now ipfs add is entirely sequential).

@jbenet
Member

jbenet commented Oct 29, 2014

cc @whyrusleeping

@btc
Contributor Author

btc commented Oct 29, 2014

Woah, that's crazy. Is there a similar graph showing the used memory?

This is a different file, this time ~1.7 GB. (Performance is different for files that have already been added, and I don't want to go through the trouble of wiping the db.)

Peak Mem: 7.2 GB (hit swap. thrashed 15 GB in/out)
Wall Clock: long enough

On the bright side, our performance metrics have so much room to grow! #optimism #growth-potential #cs149

Total: 3331.9 MB
  1659.2  49.8%  49.8%   1659.2  49.8% github.com/jbenet/go-ipfs/merkledag/internal/pb.(*PBNode).Marshal
  1658.5  49.8%  99.6%   1658.5  49.8% github.com/jbenet/go-ipfs/Godeps/_workspace/src/code.google.com/p/goprotobuf/proto.(*Buffer).EncodeRawBytes
     8.0   0.2%  99.8%      8.0   0.2% github.com/jbenet/go-ipfs/Godeps/_workspace/src/github.com/syndtr/goleveldb/leveldb/memdb.New
     1.5   0.0%  99.9%      1.5   0.0% github.com/jbenet/go-ipfs/Godeps/_workspace/src/github.com/syndtr/goleveldb/leveldb/util.(*BufferPool).Get
     0.6   0.0%  99.9%      0.6   0.0% github.com/jbenet/go-ipfs/Godeps/_workspace/src/code.google.com/p/goprotobuf/proto.(*Buffer).enc_slice_int64
     0.6   0.0%  99.9%      0.6   0.0% github.com/jbenet/go-ipfs/Godeps/_workspace/src/github.com/syndtr/goleveldb/leveldb/table.(*Writer).writeBlock
     0.5   0.0%  99.9%      0.5   0.0% github.com/jbenet/go-ipfs/Godeps/_workspace/src/github.com/syndtr/goleveldb/leveldb.(*Batch).grow
     0.5   0.0%  99.9%      0.5   0.0% github.com/jbenet/go-ipfs/Godeps/_workspace/src/github.com/syndtr/goleveldb/leveldb/util.makeSlice
     0.5   0.0%  99.9%      0.5   0.0% allocg
     0.5   0.0% 100.0%      0.5   0.0% github.com/jbenet/go-ipfs/Godeps/_workspace/src/github.com/jbenet/go-multihash.Encode

[image: ipfs-add-1.7gb-3a0bbe2aa8f0778030ba9d926c852f660f67ba96]

@jbenet
Member

jbenet commented Oct 29, 2014

Haha. Yeah, we're doing it wrong. io.Reader + chunking should do it fine.
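For illustration, a minimal sketch of that streaming approach (not the go-ipfs code; the 256 KiB chunk size, the big.file path, and the handle callback are placeholders): read through an io.Reader in fixed-size chunks, so peak memory stays around one chunk instead of the whole file.

package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

// chunkStream reads r in fixed-size chunks and hands each one to handle,
// so peak memory stays around one chunk instead of the whole file.
func chunkStream(r io.Reader, chunkSize int, handle func([]byte) error) error {
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			if herr := handle(buf[:n]); herr != nil {
				return herr
			}
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil // final (possibly short) chunk already handled
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	f, err := os.Open("big.file") // placeholder path
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Hypothetical usage: each chunk would become its own block/dag node.
	err = chunkStream(f, 256*1024, func(chunk []byte) error {
		fmt.Println("chunk of", len(chunk), "bytes")
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}

Each chunk then gets hashed and stored as its own block rather than buffering the whole file, which is roughly what the dagwriter path gives us.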

@whyrusleeping
Member

Huh... why aren't we using the dagwriter? It solves this problem.

@jbenet
Member

jbenet commented Oct 30, 2014

Isn't it? add was written before the dagwriter; that's probably why.

@btc
Contributor Author

btc commented Oct 30, 2014

@jbenet @whyrusleeping

What do you guys make of the LevelDB perf issue? The memory pressure and LevelDB compaction appear to be independent.

@whyrusleeping
Member

Yeah, I'll switch that over and report back results

@btc
Contributor Author

btc commented Oct 30, 2014

Thank you @whyrusleeping

After #236, adding a 540 MB file exhibits sane memory R/W patterns and trivial peak allocation.

[image: after]

@btc
Contributor Author

btc commented Oct 30, 2014

@jbenet 1.7 GB

NB: This test was performed on OS X. Profiles are known to be inaccurate on this OS. See http://golang.org/issue/6047

[image]

@dborzov
Contributor

dborzov commented Oct 30, 2014

Whoah
@whyrusleeping, what is a dagwriter?
@maybebtc, how do you generate these fancy flow charts?

@jbenet
Member

jbenet commented Oct 30, 2014

@btc
Contributor Author

btc commented Oct 30, 2014

@dborzov

Simply compile your binary with CPU (or Mem) profiling enabled.

pprof.StartCPUProfile(ofi)    // at the beginning; ofi is the *os.File to write the CPU profile to
defer pprof.StopCPUProfile()  // stops and flushes the CPU profile when the function returns
doWork()
pprof.WriteHeapProfile(mprof) // at the end of execution; mprof is the *os.File for the heap profile

Then run your program. On exit, files with the profiling information are written: one for CPU and one for memory.
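For completeness, here's a self-contained sketch of that setup (the file names and the doWork placeholder are illustrative, not taken from the ipfs code):

package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func doWork() {
	// ... the code you want to profile ...
}

func main() {
	cpu, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	if err := pprof.StartCPUProfile(cpu); err != nil { // at the beginning
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile() // flushed when main returns

	doWork()

	mem, err := os.Create("mem.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer mem.Close()
	if err := pprof.WriteHeapProfile(mem); err != nil { // at the end of execution
		log.Fatal(err)
	}
}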

Run

go tool pprof ipfs cpu.prof

In this example, ipfs is the name of the binary and cpu.prof is the name of the file I chose.

This opens a REPL.

(pprof) web

Run web to create the SVG file. It'll open in a browser, and the path to the SVG is printed in the terminal.

@jbenet
Member

jbenet commented Oct 30, 2014

Ok, so I tried a dumb parallelizing dagwriter writing to leveldb, because I thought that perhaps there was a slowdown in the sequential leveldb compress-(CPU)-then-write-(disk) cycle. But it doesn't help at all (here's the code for that, just in case it's wanted later: b16b5b9d03da).
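For the record, the shape of that experiment is roughly the following; Block, Datastore, and nopStore are hypothetical stand-ins here, not the actual go-ipfs interfaces:

package main

import (
	"log"
	"sync"
)

// Block and Datastore are hypothetical stand-ins for the real go-ipfs types;
// this only sketches the fan-out shape of the experiment.
type Block struct {
	Key  string
	Data []byte
}

type Datastore interface {
	Put(key string, value []byte) error
}

// writeParallel fans datastore writes out across several workers so the
// compress-(CPU) and write-(disk) phases can overlap.
func writeParallel(ds Datastore, blocks <-chan Block, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for b := range blocks {
				if err := ds.Put(b.Key, b.Data); err != nil {
					log.Println("put failed:", err)
				}
			}
		}()
	}
	wg.Wait()
}

type nopStore struct{}

func (nopStore) Put(key string, value []byte) error { return nil }

func main() {
	blocks := make(chan Block)
	go func() {
		defer close(blocks)
		blocks <- Block{Key: "example", Data: []byte("data")}
	}()
	writeParallel(nopStore{}, blocks, 4)
}

Since the sequential write path turned out not to be the bottleneck, the fan-out doesn't buy anything, which matches the result above.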

@btc
Contributor Author

btc commented Oct 30, 2014

To document things we've considered. Feel free to edit.

@btc
Contributor Author

btc commented Oct 30, 2014

I'm going to measure perf with snappy compression disabled.
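For reference, here's roughly what disabling snappy looks like when opening goleveldb directly (a sketch; the path and the standalone wiring are placeholders, not the actual ipfs datastore code):

package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func main() {
	// Disable snappy so goleveldb does not recompress block data
	// during writes and compactions. "/tmp/ds" is a placeholder path.
	db, err := leveldb.OpenFile("/tmp/ds", &opt.Options{
		Compression: opt.NoCompression,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := db.Put([]byte("key"), []byte("value"), nil); err != nil {
		log.Fatal(err)
	}
}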

@jbenet
Member

jbenet commented Oct 30, 2014

If we pre-compress raw blocks, the extra snappy step may not help much.

@btc
Contributor Author

btc commented Oct 30, 2014

@btc
Contributor Author

btc commented Oct 31, 2014

As of 3270ab0, I'm seeing substantial improvements on yesterday's data sets.

Disk latency now appears to be the rate-limiting factor, which is a good sign. Memory use stays below 80 MB, and CPU time is spent in a manner that appears reasonable.

For the curious, the graphs below show the hot spots where CPU time is spent and RAM is allocated.

  • Disabling snappy compression addressed the compaction starvation. ff490a6
  • Using a streaming reader/writer addressed the memory bloat. 18ada93
  • Avoiding the write-twice behavior may have played a part as well. Buyer beware: no tests were performed to isolate this variable. 461e5a3

Unless there are objections, my recommendation is that we close this issue and open new issues for specific performance concerns as they arise.

btc closed this as completed Oct 31, 2014
btc reopened this Oct 31, 2014
@btc
Contributor Author

btc commented Oct 31, 2014

CPU profiling on a 1.7 GB file at 3270ab0 (Linux)

[image: cpu]

@btc
Contributor Author

btc commented Oct 31, 2014

Memory profiling on a 1.7 GB file at 3270ab0 (Linux)

[image: mem]

btc closed this as completed Oct 31, 2014
@jbenet
Member

jbenet commented Oct 31, 2014

👍

aschmahmann mentioned this issue Dec 1, 2021