
read in arguments for benchmark tool #55

Closed · spacejam opened this issue Aug 9, 2017 · 9 comments
spacejam (Owner) commented Aug 9, 2017

the benchmark tool should accept arguments for these parameters:

  1. number of threads
  2. number of total operations
  3. proportion of reads, writes, cas, del, and scan operations among the operations
  4. freshness bias (prefer recent/likely to be in cache, prefer old/not likely to be in cache, no preference)
  5. non-present-key chance (the chance that a request for cas/get/set/del may be sent for a key that does not exist)
  6. key size min, max, median
  7. value size min, max, median
  8. scan iterations min, max, median
pmuens (Collaborator) commented Aug 10, 2017

Thanks for the writeup @spacejam 👍

I jumped into this today. Here are some quick questions and notes from my side that came up during the implementation.

I decided to use the clap crate, which is quite popular for such use cases, to implement the CLI application. Any objections to that choice? Is it maybe too bloated? There are some other crates we could consider.

So far, though, I'm pretty happy with clap since it's really easy to use and quite powerful as well.

Could you go into more detail regarding the options we want to support here?

  • Which options are required?
  • Which options from "option groups" can be used together, and which ones are mutually exclusive (e.g. for "key size min, max, median", should only one of the three arguments be accepted, or can the user pass all of them)?

Here's what I came up with so far (a rough sketch of how a couple of these could be declared with clap follows the list):

  • number of threads
    • --num-threads - u64 - (defaults to X (TBD))
  • number of total operations
    • --num-operations - u64 - (defaults to X (TBD))
  • proportion of reads, writes, cas, del, and scan operations among the operations
    • --prop-reads - u64
    • --prop-writes - u64
    • --prop-cas - u64
    • --prop-del - u64
    • --prop-scan - u64
  • freshness bias (prefer recent/likely to be in cache, prefer old/not likely to be in cache, no preference)
    • --freshness-bias - String - (defaults to "no preference")
  • non-present-key chance (the chance that a request for cas/get/set/del may be sent for a key that does not exist)
    • --no-present-key-chance - bool
  • key size min, max, median
    • --key-size-min - u64
    • --key-size-max - u64
    • --key-size-median - u64
  • value size min, max, median
    • --value-size-min - u64
    • --value-size-max - u64
    • --value-size-median - u64
  • scan iterations min, max, median
    • --scan-iter-min - u64
    • --scan-iter-max - u64
    • --scan-iter-median - u64
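
For illustration, a minimal sketch of two of these flags with clap's builder API, assuming clap 2.x; the fallback of 1 thread is a placeholder, since the real default is settled below:

extern crate clap;

use clap::{App, Arg};

fn main() {
    let matches = App::new("bench")
        // every flag is optional and takes a single value
        .arg(Arg::with_name("num-threads")
            .long("num-threads")
            .takes_value(true)
            .help("number of benchmark threads"))
        .arg(Arg::with_name("num-operations")
            .long("num-operations")
            .takes_value(true)
            .help("total number of operations"))
        .get_matches();

    // parse the string value into a u64, falling back to a placeholder
    let threads: u64 = matches
        .value_of("num-threads")
        .map(|s| s.parse().expect("--num-threads must be an integer"))
        .unwrap_or(1);

    println!("running with {} threads", threads);
}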

Thanks in advance!

spacejam (Owner) commented

  • clap: it looks awesome! I haven't used it, but if you like it, let's stick with it!
  • required options: none
  • option groups: none, but warn when !(min <= median && median <= max) (the log + env_logger crates are a nice first pick for outputting stuff, and we can decorate the logs later with things like machine resource utilization stats via a custom logger etc.; see the sketch after this list)
  • default threads: number of cpu cores
  • operations: 1 million (doctor evil.jpg)
  • reads -> get, writes -> set (cas is a write, scan is a read, so we should be specific to the operation on the tree)
  • freshness-bias: valid values: old, new, random
  • --no-present-key-chance maybe should be --non-present-key-chance
  • defaults: 80% get + 15% set + 4% scan + 1% cas. 64 byte keys, 512 byte values, with min/max/median all being the same. Number of CPU cores on the machine for thread count. 50 scan iterations.
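
A minimal sketch of that min/median/max warning, assuming the log + env_logger crates mentioned above (the function and the values fed to it are illustrative, not part of the issue):

#[macro_use]
extern crate log;
extern crate env_logger;

// warn when !(min <= median && median <= max)
fn warn_if_unordered(name: &str, min: u64, median: u64, max: u64) {
    if !(min <= median && median <= max) {
        warn!(
            "{}: expected min <= median <= max, got min={} median={} max={}",
            name, min, median, max
        );
    }
}

fn main() {
    env_logger::init().unwrap();
    warn_if_unordered("key-size", 8, 64, 32); // hypothetical parsed values
}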

spacejam (Owner) commented

For the number of CPU cores, I already included the num_cpus crate, so we can use that to get the number.
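
That is, the default thread count could come straight from num_cpus (a one-line sketch; the flag name is the one proposed above):

extern crate num_cpus;

fn main() {
    // default for --num-threads: number of logical CPU cores
    println!("defaulting --num-threads to {}", num_cpus::get());
}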

pmuens (Collaborator) commented Aug 12, 2017

@spacejam could you please go into more detail about the following comments?

defaults: 80% get + 15% set + 4% scan + 1% cas.

Could you maybe provide an example CLI input with the get, set, scan, delete, and cas arguments and how they're translated to percentages? I understand it in a way that you'd say, e.g., --get 10 --set 20 --scan 50 .... But how would we be able to translate that to percentages like the ones described above, since the user provides arbitrary numbers (or am I missing something obvious here)?

64 byte keys, 512 byte values, with min/max/median all being the same.

Does this mean that the key we use should be exactly 64 bytes long and the value always 512 bytes?

Thanks in advance!

spacejam (Owner) commented

@pmuens no need for percentages; we can just sum all of the proportions together and use that as the upper bound for a random number generator (in your example, 80 is the max). Say it spits back 22. We see get is 10, which is less than 22, so we chop off 10 and go to the next. set is 20, and now we're at 12, so we decide that this operation will be a set.

yeah, exactly 64 / 512
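
With min, max, and median all equal, the size distribution collapses to a constant; a key under those defaults might be generated like this (a sketch, not from the thread):

extern crate rand;

use rand::Rng;

fn random_key(len: usize) -> Vec<u8> {
    // with min == max == median == 64, every key is exactly 64 bytes
    let mut rng = rand::thread_rng();
    (0..len).map(|_| rng.gen::<u8>()).collect()
}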

pmuens (Collaborator) commented Aug 13, 2017

@spacejam thanks for the comment 👍

That makes sense! I'll link the outcome of the conversation here in #56 where this will be implemented!

pmuens (Collaborator) commented Aug 15, 2017

@spacejam just one quick question regarding this:

@pmuens no need for percentages; we can just sum all of the proportions together and use that as the upper bound for a random number generator (in your example, 80 is the max). Say it spits back 22. We see get is 10, which is less than 22, so we chop off 10 and go to the next. set is 20, and now we're at 12, so we decide that this operation will be a set.

Unfortunately I'm stuck on how the random number generation is used here.

Could you provide a quick example how the proportions for a set of given Tree operations would be calculated using this? Thanks in advance! 👍

spacejam (Owner) commented

So, if any of the tree op types are provided, the defaults for all of the others should become 0.

bench --set=5 --del=2

for each iteration of each thread that is running commands:

// assumes `use rand::Rng;`, an Op enum, and a surrounding fn returning Op
let sum = 5 + 2;
let ops = vec![(Op::Set, 5), (Op::Del, 2)];

// pick a point in [0, sum) and walk the weight buckets
let mut choice = rand::thread_rng().gen_range(0, sum);

for (op, weight) in ops {
    // Op::Set owns choices 0..5, Op::Del owns 5..7
    if choice < weight {
        return op;
    }
    choice -= weight;
}
unreachable!()
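
With --set=5 --del=2, choice is uniform over 0..7, so Set comes back with probability 5/7 and Del with 2/7, matching the given weights.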

pmuens (Collaborator) commented Aug 16, 2017

Thanks for the explanation and the code-snippet @spacejam 👍
