
Report: GC times and improvements from recent optimizations. #3462

Open
kevina opened this issue Dec 3, 2016 · 9 comments
Labels
need/analysis Needs further analysis before proceeding

Comments

@kevina
Contributor

kevina commented Dec 3, 2016

Repos were created by doing the following:

1000x: random 4000 > afile && ipfs add afile
100x: random-files -q -dirs 4 -files 35 -depth 5 -random-fanout -random-size d && ipfs add -r d
100x: random-files -q -dirs 4 -files 35 -depth 5 -random-fanout -random-size d && ipfs add --pin=false -r d
10x: random-files -q -dirs 4 -files 100 -depth 7 -random-fanout -random-size d && ipfs add -r d
10x: random-files -q -dirs 4 -files 100 -depth 7 -random-fanout -random-size d && ipfs add --pin=false -r d
2x: random-files -q -dirs 4 -files 100 -depth 10 -random-fanout -random-size d && ipfs add -r d
2x: random-files -q -dirs 4 -files 100 -depth 10 -random-fanout -random-size d && ipfs add --pin=false -r d

Other notes:

  • The -seed option was used in random and random-files to get deterministic results.
  • Repos were created both with and without the --raw-leaves option.
  • Tests were run both before and after my optimizations landed.

Implementation notes:

  • Due to XXX, three IPFS repos were created: one with normal leaves, one with the incorrect Cid for raw leaves, and one with the correct Cid.
  • All three repos were on a 32 GiB xfs partition on Linux so that I could easily back up the partition to undo the effects of a GC run.

This creates a repo with around 1.6 million small blocks (under 4k), perfect for stress testing the pinner and filesystem. About 50% of the content is pinned.

The GC is done in basically two passes. The first pass collects the colored set, that is, the set of blocks to keep; the second pass scans the blockstore and removes any blocks not in that set. I measured each pass separately. For the first pass, the only thing that made a measurable difference was whether raw leaves were used, which led to a speedup of over an order of magnitude.

| What | Normal Leaves | Raw Leaves | Speedup |
| --- | --- | --- | --- |
| Get Colored Set (First Time) | 282s | 20.7s | 13.6x |
| Get Colored Set (In Cache) | 26.3s | 2.1s | 12.5x |
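
For reference, here is a minimal sketch of the two-pass structure described above. The `Blockstore` and `Pinner` interfaces are hypothetical stand-ins for the real go-ipfs types; only the shape of the mark (color) and sweep passes is meant to be accurate:

```go
package gcsketch

// Blockstore and Pinner are hypothetical stand-ins for the real go-ipfs
// interfaces; only the shape of the two passes is meant to be accurate.
type Blockstore interface {
	AllKeys() []string // enumerate every block key in the store
	Delete(key string)
}

type Pinner interface {
	RecursiveKeys() []string   // roots of recursively pinned DAGs
	Links(key string) []string // child blocks referenced by a block
}

// gc colors (marks) everything reachable from the pin roots, then sweeps
// the blockstore, deleting anything that was not colored.
func gc(bs Blockstore, p Pinner) {
	// Pass 1: build the colored set -- the blocks to KEEP -- by walking
	// the DAG down from every recursive pin root.
	colored := make(map[string]bool)
	stack := p.RecursiveKeys()
	for len(stack) > 0 {
		k := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if colored[k] {
			continue
		}
		colored[k] = true
		stack = append(stack, p.Links(k)...)
	}

	// Pass 2: scan the blockstore and delete every block that is not
	// in the colored set.
	for _, k := range bs.AllKeys() {
		if !colored[k] {
			bs.Delete(k)
		}
	}
}
```

The raw-leaves speedup in the first pass is consistent with this structure: a raw-leaf Cid tells the walker up front that the block has no links, so (as noted below) only a small fraction of blocks ever need to be read from the store to complete the traversal.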

For the second pass, when there is a significant number of blocks to delete, the time spent deleting the files outweighs all other factors. In addition, the time spent is highly dependent on the filesystem used. For the xfs partition I got different numbers depending on which repo was used; the time was anywhere from 20 to 75 minutes. I do not know what can be done about deletion time as long as we stick to one file per block. Increasing the prefix length in flatfs might help (there were on average 3,100 files per directory), but I have not investigated that.
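
To make the prefix-length idea concrete, here is an illustrative sketch of a prefix-style shard function (hypothetical, not the actual go-ds-flatfs code): flatfs names each shard directory after a slice of the block's key, so the prefix length directly controls how many directories the blocks fan out over, and hence the average number of files per directory.

```go
package main

import "fmt"

// shardPrefix is an illustrative prefix-style shard function: with
// prefixLen 5, a key like "CIQA2GKSM..." is stored under directory
// "CIQA2". A longer prefix means more shard directories and therefore
// fewer files in each one.
func shardPrefix(key string, prefixLen int) string {
	if len(key) < prefixLen {
		return key
	}
	return key[:prefixLen]
}

func main() {
	fmt.Println(shardPrefix("CIQA2GKSM5JXRAVNIQZ", 5)) // CIQA2
	fmt.Println(shardPrefix("CIQA2GKSM5JXRAVNIQZ", 6)) // CIQA2G
}
```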

After the initial GC was run and there was nothing left to delete, my optimizations had an effect on the second pass:

| What | Pre Opt | Post Opt | Speedup |
| --- | --- | --- | --- |
| Normal Leaves, First Time | 5.7s | 2.6s | 2.2x |
| Normal Leaves, Second Time | 5.8s | 2.5s | 2.3x |
| Raw Leaves, First Time | 217s | 4.4s | 49x |
| Raw Leaves, Second Time | 5.8s | 2.2s | 2.6x |

For the case with normal leaves, since everything is pinned, the first pass to get the colored set already loaded everything into the OS cache, so the second-time numbers are about the same as the first.

When raw leaves are used, the story is very different. Only a small fraction of the blocks needed to be read from the store for the first pass, so during the second pass there was still a lot of disk I/O.

I double-checked the first-time results with raw leaves; there really is that large a speedup after my optimizations. This is most likely because, after my optimizations, the inodes for the files did not need to be read from disk just to list the blocks in the flatfs datastore.

In all cases, when everything is in the OS cache the speedup from my optimizations is somewhere between 2x and 3x.

@kevina
Contributor Author

kevina commented Dec 3, 2016

Actually, I did discover a problem when using raw leaves. Here are some preliminary numbers:

Total delete time with normal leaves: 23 minutes
Total delete time with raw leaves: 65 minutes

The problem is that with raw leaves most of the blocks end up in the AFZBE directory in flatfs. Here is a count of some of the dirs with raw leaves:

AFZBE 1568285
CIQA2 100
CIQA3 78
CIQA4 68

and without raw leaves

CIQA2 3160
CIQA3 3173
CIQA4 3162
CIQA5 3160
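
Counts like these can be gathered by walking the flatfs blocks directory. A minimal sketch, assuming the default repo location (the real path depends on $IPFS_PATH):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	// Assumed default repo location; override via $IPFS_PATH as needed.
	blocks := filepath.Join(os.Getenv("HOME"), ".ipfs", "blocks")
	dirs, err := os.ReadDir(blocks)
	if err != nil {
		panic(err)
	}
	for _, d := range dirs {
		if !d.IsDir() {
			continue
		}
		entries, err := os.ReadDir(filepath.Join(blocks, d.Name()))
		if err != nil {
			continue
		}
		// Print each shard directory name and the number of files in it.
		fmt.Println(d.Name(), len(entries))
	}
}
```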

I will report more later.

@whyrusleeping
Member

Ah, interesting. We're going to have to increase the shard width for cidV1... good catch.

@kevina
Contributor Author

kevina commented Dec 3, 2016

> Ah, interesting. We're going to have to increase the shard width for cidV1... good catch.

@whyrusleeping see #3463 and let's take the discussion on how to handle it over there.

@kevina
Contributor Author

kevina commented Dec 3, 2016

After looking more closely at the deletion times and double-checking the results, I determined that in one case having most of the files in a single directory actually helped runtime:

Total delete time with normal leaves: 23 minutes
Total delete time with raw leaves (incorrect Cid): 65 minutes
Total delete time with raw leaves (correct Cid): 8 minutes

Weird. I guess it is highly dependent on how xfs chose to store the files in the large directory.

(Having so many files in a single directory also explains why ext4 was performing badly. I even managed to trigger the dir_index hash bug.)

@whyrusleeping
Member

@kevina It would be really nice to have a script that creates these files, runs the tests you were running, and outputs timings. This would be a wonderful thing to put into an automated performance regression test (cc @victorbjelkholm), where we could run a set of tests on a specific machine to get performance numbers for each commit (or maybe a nightly checkpoint of master).

That said, these numbers look great! I'm really happy to have all of this in :)

@kevina
Contributor Author

kevina commented Dec 5, 2016

@whyrusleeping here is the script to create the repo:
https://gist.github.com/kevina/63efcab752d1a15e2fde6651b76d2a80

For the numbers I just used a hacked-up version of IPFS.

@rht
Contributor

rht commented Jan 30, 2017

@kevina

Would be really great if you could wrap the script in a repo. Someone needs to do this, as I'm kind of tired of waiting for arbitrary, non-transparent approval from the members/those-with-commit-access, or of this having to be implemented by those people for it not to be "wrong" (apologies beforehand if this statement is not in accordance with code-of-conduct.md, but I think it is fair to say what I believe to be the case while remaining respectful, and to be objectively corrected if it is not the case). I have several local benchmark scripts that I could put in. There is also the network bitswap duplicate benchmark by @ion1 buried somewhere. Benchmarks are a means of structured progress (see e.g. the NIST hash function competition for hash functions, the Stanford Question Answering Dataset benchmark for AI reading comprehension, or http://www.spacex.com/hyperloop for ... hyperloop): anyone can objectively verify the results, and they reduce reliance on centralized reviews, since 1. the maintainers might sometimes be wrong, and 2. automation reduces their workload. Finally, regardless of how I stated all this, I hope there will be a comprehensive benchmark suite, for this reason (I did not think of this leveling use case when I stated it was redundant with the natural-datasets test case in #3621 (comment)).

@whyrusleeping
Member

@rht If you want to help out with automated performance benchmarks you should take part in our ongoing CI/CD sprint. It would be great to collect these various benchmarks together and have them be automatically run for each new PR.

@rht
Contributor

rht commented Jan 30, 2017

(I am in fact in the middle of curating all these scattered benchmarks into https://github.com/rht/sfpi-benchmark)
