benchmark: populating your instance and stress test with locust #31

ppanero · 2020-03-18T08:36:37Z

Adds:

Populating the instance with the rdm-records demo. Points to the bulk indexing used in helm-invenio, which is much more performant (3M records in a few hours). I think rdm-records should stay as is: 10 records is more than enough, and it is better not to add complexity. WDYT?
How to use locust, pointing to the locust.py file.

closes #34

fenekku

Couple of clarifications. I like that we see more and more documentation for this.

fenekku · 2020-03-18T12:59:05Z

docs/deployment/benchmark.md

@@ -0,0 +1,65 @@
+# Benchmarking your Invenio instance
+
+## Populate your instance


Normal/discussion: What is the bigger context for why we are showing how to add fake random records to the real deployed system in the benchmarking page? Maybe this is about "Checking your deployed Invenio instance" or "Verifying your Invenio instance", i.e. 1) check that you can add records to your deployed instance, 2) check what load your deployment can take?

I guess it is an optional step to engage people. They have the full experience of a deployed instance with some data there so they can see them in the UI. Personally I perceive the deployment as the containerized version but without the warning "You should not put this in production" :)

how to add fake random records to the real deployed system in the benchmarking page?

It is true I did not mention it, sorry: I wouldn't run this test against a production/real instance, mainly because it is highly likely to take it down, several times. For example, we managed to make the DB unresponsive for some time (~20 mins) until the DBAs acted on it... It is quite a hardcore test.

On this, my idea was:

If you have a production instance, you might not want to make the effort to "duplicate" your data, nor use the prod DB/ES for a stress test. However, you do want to create a sensible number of records that represents the number of records in prod. E.g. ES does not search as fast with 3M records as with 1. The fix might lie in the shards rather than in the application architecture, but this test would still point out slow parts.

If you do not have a production instance, you need to create "dummy" records to simulate what you would expect to have.
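For the second case, a minimal sketch of generating dummy records could look like the following. The field names, the `demo-records` index name and the helper names are all assumptions made for illustration; this is neither the rdm-records demo command nor the helm-invenio bulk indexing script. It only shows the idea of streaming fake records and serializing them as the newline-delimited action/document pairs that the Elasticsearch bulk API accepts.

```python
import json
import random


def make_dummy_record(rng, index):
    """Build one fake record. Field names are illustrative, not the real schema."""
    words = ["data", "analysis", "survey", "model", "report", "study"]
    return {
        "id": f"demo-{index:07d}",
        "title": " ".join(rng.choice(words) for _ in range(4)).capitalize(),
        "publication_year": rng.randint(1990, 2020),
    }


def iter_dummy_records(count, seed=0):
    """Yield `count` fake records lazily, so large volumes need not fit in memory."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same records
    for index in range(count):
        yield make_dummy_record(rng, index)


def to_bulk_lines(records, index_name="demo-records"):
    """Yield one action line followed by one document line per record."""
    for record in records:
        yield json.dumps({"index": {"_index": index_name, "_id": record["id"]}})
        yield json.dumps(record)
```

Joining the lines with `"\n"` and appending a final newline would give a payload suitable for a `_bulk` request, once the real record schema is swapped in for the made-up fields.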

On the naming of the section, I am open to changing it. But I would still mention this here :) WDYT?

What about "Creating demo records" and explain a bit what you said about the purpose behind it? I think the 2 examples you just mentioned are quite useful for people :)

Adding some of the context/explanation/motivation you just provided would be helpful, and changing the filename + main title from "Benchmarking your InvenioRDM instance" to "Stress testing your InvenioRDM instance" (or "Load testing" or "verifying" or something like that :) ) would clarify the context.

docs/deployment/benchmark.md

fenekku

Nice. Last thing for me is just the filename/title change. You can merge once you are happy with the name.

docs/deployment/benchmark.md

ppanero · 2020-03-20T12:23:21Z

@fenekku thanks once again! I will apply the changes and merge. However, about naming: I actually prefer benchmarking, because even if we have used it for a stress test, it doesn't have to be a stress test... You can just run your instance under a specific load and "benchmark" it.

We did stress test it by getting as high as we could (on requests haha) but it doesn't have to be that way. Plus, as a title it's shorter. WDYT?
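To illustrate what running at a specific load and "benchmarking" it means, here is a rough, stdlib-only sketch. It is not locust and not the locust.py file from this PR: `benchmark` and `timed_call` are hypothetical helpers, and `func` stands in for whatever request you would send to the instance. Locust does the same kind of measurement with far richer reporting.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(func):
    """Run func once and return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        func()
        ok = True
    except Exception:
        ok = False  # count the failure but keep the benchmark running
    return ok, time.perf_counter() - start


def benchmark(func, total_requests=100, concurrency=10):
    """Call func total_requests times from `concurrency` threads and summarize latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_call(func), range(total_requests)))
    latencies = sorted(elapsed for _, elapsed in results)
    failures = sum(1 for ok, _ in results if not ok)
    return {
        "requests": total_requests,
        "failures": failures,
        "median_s": statistics.median(latencies),
        # approximate 95th percentile taken from the sorted latencies
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
        "max_s": latencies[-1],
    }
```

Keeping `total_requests` and `concurrency` fixed between runs gives comparable numbers (a benchmark); raising them until failures appear turns the same run into a stress test.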

fenekku · 2020-03-20T12:55:15Z

I am not too attached to the name. That being said, what I find awkward is the record creation part under benchmarking... The locust part does benchmarking, but the record creation part is more for making sure the application is operating correctly, yes? Maybe I missed it, but we are not measuring anything when it comes to creating records; it's more a binary of did it work or not.

Maybe record creation could be placed in another file...
or
Maybe "Verifying your InvenioRDM instance" verifying.md

?

ppanero · 2020-03-23T08:06:31Z

I personally don't like "verifying" cuz it sounds like a "config check" or so, not like testing performance. In terms of record creation, it can be placed somewhere else; however, I did not want to create yet another top-level section and I did not find any fit. Where would you put it? (I'll move it and finish this PR :D)

fenekku

This is really small, but it actually connects the "Populate your instance" section with the "Benchmark" making its presence in this file all good for me. You can merge with this.

docs/deployment/benchmark.md

ppanero · 2020-03-23T14:57:01Z

@fenekku sorry about the misunderstanding on:

I will apply the changes and merge. However, about naming [...]

I hadn't committed the text with the explanations, cuz I wanted to do it just once (that's why I wanted to clarify the naming first). Trying to avoid ping-pong, it seems I managed the contrary 👼 I'm sorry.

I have added the following explanation (see in file also):

The operations that we are going to run in this section can be highly demanding. Therefore, you might not want to run them against a production instance since it might cause downtime.

Nonetheless, we want to have an environment that resembles as close as possible to our InvenioRDM instance. In consequence, we might want to add some data to our instance in order to have some records to stress-test against.

This justifies the data population, and is basically what I explained above in the discussion thread with Zach.

fenekku · 2020-03-24T13:23:39Z

Aaah I get it now. All good.

ppanero requested review from fenekku and zzacharo March 18, 2020 08:36

fenekku reviewed Mar 18, 2020

ppanero force-pushed the benchmark branch from 90aec60 to 3365f2d Compare March 19, 2020 08:29

fenekku reviewed Mar 19, 2020

fenekku approved these changes Mar 19, 2020

ppanero mentioned this pull request Mar 23, 2020

release tasks: march inveniosoftware/cookiecutter-invenio-rdm#57

Closed

33 tasks

fenekku approved these changes Mar 23, 2020

ppanero force-pushed the benchmark branch from 3365f2d to 1524996 Compare March 23, 2020 14:53

zzacharo approved these changes Mar 23, 2020

ppanero added 2 commits March 24, 2020 18:01

benchmark: populating your instance and stress test with locust

b5b763e

deployment: fix grammar

70ef0e1

ppanero force-pushed the benchmark branch from 1524996 to 70ef0e1 Compare March 24, 2020 17:03

ppanero merged commit 70ef0e1 into inveniosoftware:master Mar 24, 2020

ppanero deleted the benchmark branch March 30, 2020 09:14
