Insert big docs #250
💯
Overall looks good, thanks for adding. A couple of comments for you to consider, but there is nothing wrong with this as it is if it measures what you're looking for.
BatchSize: 1
Document:
  x: {^RandomInt: {min: 0, max: 2147483647}}
  string0: {^RandomString: {length: 15000000}}
- I would suggest using string0: {^FastRandomString: {length: 15000000}}. The tradeoff is slightly less entropy but much faster generation of the strings. I made this change on my laptop and genny was able to produce writes of around 80 MB/sec versus only 25 MB/sec with the regular RandomString.
- Consider dividing up the load between a number of threads, i.e. change Threads: 1 to something like Threads: 10 and then divide DocumentCount: by the new thread count to maintain the same number of documents. This will of course change what the workload is doing (multiple writes in parallel), so it may not be what you're aiming for. But since this is single-threaded as it is, you're going to be waiting between each insert_one invocation for genny to produce the next document.
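The arithmetic behind both suggestions can be sketched roughly as below. This is only a back-of-the-envelope illustration, not part of the PR: it assumes "MB" means 10^6 bytes, ignores the small x field, and uses a made-up total of 100 documents because the workload's actual DocumentCount is not shown in this conversation. The helper names are hypothetical.

```python
# Back-of-the-envelope numbers for the two review suggestions.
# Assumptions (not from the PR): "MB" means 10**6 bytes, and the
# total of 100 documents below is a made-up DocumentCount.

DOC_BYTES = 15_000_000  # length of string0 in the workload


def seconds_per_document(throughput_mb_per_sec: float) -> float:
    """Approximate time to produce and write one document at a given rate."""
    return DOC_BYTES / (throughput_mb_per_sec * 1_000_000)


def per_thread_document_count(total_documents: int, threads: int) -> int:
    """DocumentCount per thread that keeps the overall total unchanged."""
    if total_documents % threads != 0:
        raise ValueError("total_documents must divide evenly across threads")
    return total_documents // threads


slow = seconds_per_document(25)  # rate reported with RandomString
fast = seconds_per_document(80)  # rate reported with FastRandomString
print(f"RandomString:     {slow:.2f} s per document")
print(f"FastRandomString: {fast:.2f} s per document")
print("DocumentCount per thread:", per_thread_document_count(100, 10))
```

With a single thread, that per-document time is roughly the gap between consecutive insert_one calls, which is why the generator choice matters more here than it would in a multi-threaded phase.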
- Yeah, good call. FastRandomString is perfect for this use case.
- I'm pretty sure we want just 1 thread for this particular workload. The 10-thread case may be interesting but can be done as future work (in the form of another phase).
@@ -36,7 +36,7 @@
 'console_scripts': [
 'genny-metrics-report = genny.cedar_report:main__cedar_report',
 'genny-metrics-legacy-report = genny.legacy_report:main__legacy_report',
-'lint-yaml = genny.workload_linter:main'
+'lint-yaml = genny.yaml_linter:main'
Thanks!
Stealing this PR 😛 since I promised @ldennis to get some DSI results today.
This adds a new workload that inserts big documents.