
Validations support #44

Closed · wants to merge 16 commits into from

Conversation

@chatman (Collaborator) commented Feb 6, 2023

Added support for validating the results of query benchmark tasks.

Workflow for a user:

  1. Run the suite with the -g (generate validation) flag. This generates a file in the suites/ dir containing a tuple <query, numFound, facets> per query.
  2. Once satisfied with the generated file, use it for subsequent validations as follows:
    a. Add a "validation": "" parameter to the query-benchmark definition (a config sketch follows this list),
    b. Run stress.sh with the -v (validate) flag. It will use the validation file for the query benchmark task and report the number of successful and failed queries.
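For illustration, a sketch of what step 2a's query-benchmark definition might look like; all field names besides "validation" are assumptions, and the file name is hypothetical (it follows the validations-<queryFile>-docs-<N>-queries-<M>.tsv pattern used by the generator in this PR):

```json
{
  "query-benchmarks": [
    {
      "name": "example-queries",
      "query-file": "queries.json",
      "validation": "validations-queries.json-docs-100000-queries-5000.tsv"
    }
  ]
}
```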

@@ -104,33 +106,125 @@ public static void runQueryBenchmarks(List<QueryBenchmark> queryBenchmarks, Stri

for (int threads = benchmark.minThreads; threads <= benchmark.maxThreads; threads++) {
QueryGenerator queryGenerator = new QueryGenerator(benchmark);
List<SolrBenchQuery> queryResults = new Vector<>();
Contributor:
Experimented a bit in this change to see whether we can get the generated queries from controllerExecutor#run instead. 😊

The proposed changes aim to:

  1. Keep the query supplier focused on generating/supplying queries. It is not entirely obvious why the query supplier takes a list of queryResults as an input param (and a mutable one at that, which raises other concerns such as concurrent access, logic flow, etc.).
  2. On the other hand, if the controllerExecutor#run method returns a list of "results", that is more straightforward to understand: these are the results obtained from executing all the supplied runnables/callables. (A sketch of this shape follows the list.)
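A minimal sketch of the proposed shape, with illustrative names (this is not the actual ControlledExecutor API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: the executor owns result collection and returns the results,
// so the query supplier never has to mutate a shared list.
class ControlledExecutorSketch<R> {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    List<R> run(List<Callable<R>> tasks) throws Exception {
        List<Future<R>> futures = new ArrayList<>();
        for (Callable<R> task : tasks) {
            futures.add(pool.submit(task)); // each callable yields one result
        }
        List<R> results = new ArrayList<>(futures.size());
        for (Future<R> future : futures) {
            results.add(future.get()); // gather in submission order
        }
        pool.shutdown();
        return results;
    }
}
```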

Collaborator Author:

Sounds good :-)

Collaborator Author:

#46

I think once ^ this one is merged, I'll rebase this PR. :-)

Commit: Proposals to use callable instead of runnable and collect executed qu… (…-callable)
if (StressMain.generateValidations) {
int totalDocsIndexed = getTotalDocsIndexed(baseUrl, collection);

FileWriter outFile = new FileWriter("suites/validations-"+benchmark.queryFile+"-docs-"+totalDocsIndexed+"-queries-"+benchmark.totalCount+".tsv");
@patsonluk (Contributor) commented Feb 6, 2023

Shall we use JSON for this file, since:

  1. The facet part is already formatted as JSON.
  2. It would give us better flexibility to do different types of comparison than TSV. For example, we could add the total docs indexed as part of the result, and in the future verify other params per query result (more than just numFound and facets).
  3. In general it's easier to follow JSON with keys (especially since the steps require the dev to go through the generated files) than TSV, which just has the values 😅 (One possible shape is sketched after this list.)
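For illustration only, one shape such a JSON file could take; all keys and values here are hypothetical:

```json
{
  "totalDocsIndexed": 100000,
  "results": [
    {
      "query": "q=*:*&facet=true&facet.field=category",
      "numFound": 100000,
      "facets": {
        "category": { "books": 42000, "music": 58000 }
      }
    }
  ]
}
```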

Collaborator Author:

+1 to json.

@hiteshk25 (Contributor):
Thanks @chatman, good to see this feature!

int success = 0, failure = 0;

if (StressMain.generateValidations) {
int totalDocsIndexed = getTotalDocsIndexed(baseUrl, collection);
@patsonluk (Contributor) commented Feb 6, 2023

Since this is structured under runQueryBenchmarks, perhaps it shouldn't be coupled with docs indexed? This number could mean different things, too: the total docs in the collection, or the docs indexed during the benchmarking?

As for "external" mode, it is valid to run ONLY query benchmarking (i.e. in the test plan the only test task is query) and start with a pre-populated collection. Such a number could therefore change, and probably shouldn't be captured here.

Perhaps we can safely remove this, as the validation step does not seem to validate on the total # of docs either.

Collaborator Author:

My intention was to use that only for naming the validation file, not for any validation. I agree that validating with this wouldn't make sense in external mode.

"\t" + numFound + "\t" + new ObjectMapper().writeValueAsString(facets).replace('\n', ' ') + "\n");
}
outFile.close();
} else if (StressMain.validate && benchmark.validationFile != null && new File("suites/" + benchmark.validationFile).exists()) {
@patsonluk (Contributor) commented Feb 6, 2023

🧠 storming a little bit here. Wondering if we can make this more abstract and provide at least two implementations:

  1. A validator based on a previously generated input file, as implemented here.
  2. A validator based on more static criteria, which might be as simple as verifying that the doc count is > 0.

I do think the current validation proposal (with a file) would be the preferred option in most cases. However, when I think about the FS test workflow (solr-perf driven by K8s), we might not really want to further complicate the procedure of input doc/query updates.

If we apply this file validation to our K8s test flow, then on each query/input-docs update we would need to run it once as a different task (to generate results), capture the validation file, then update the corresponding image with the new validation file, etc.

It's not super complicated, but it's not trivial either. I think perhaps a middle ground is: as long as an executed query returns something, we can assume the results are legit.

The main goal of solr-bench is to confirm performance changes due to code changes; validation, IMHO, is useful for flagging any obvious anomaly that renders the benchmarking results useless. While it's nice if the validation works more accurately (as the validation-file approach does), the added burden of maintaining such a file might not be that appealing for our use case.

Any thoughts? @justinrsweeney @hiteshk25 @chatman 😊

Collaborator Author:

> A validator based on a previously generated input file, as implemented here.
>
> A validator based on more static criteria, which might be as simple as verifying that the doc count is > 0.

That makes sense. We can have validations as a complex JSON object in the configuration, and specify whether we blindly compare against a file (your point 1) or check some special conditions (e.g. results > 0), etc. (A hypothetical shape is sketched below.)
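A hypothetical shape for that configuration object; the keys are illustrative, not an agreed-upon schema:

```json
{
  "validation": {
    "type": "file",
    "file": "validations-queries.json-docs-100000-queries-5000.json"
  }
}
```

The static-criteria variant could then be expressed as something like `{ "type": "static", "criteria": { "min-num-found": 1 } }`.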

taskResults.put("mean", controlledExecutor.stats.getMean());
taskResults.put("total-queries", controlledExecutor.stats.getN());
taskResults.put("total-time", time);
if (StressMain.validate) taskResults.put("validations-succeeded", success);
Contributor:

Minor nit: combine these two if cases into one? :) (sketched below)
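The nit applied to the surrounding diff lines, as a fragment; the validations-failed key is an assumption based on the success/failure counters declared earlier in the diff:

```java
if (StressMain.validate) {
    taskResults.put("validations-succeeded", success);
    taskResults.put("validations-failed", failure); // assumed companion of the line above
}
```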

}
}
}
}

private static Map<String, Pair<Integer, String>> loadValidations(QueryBenchmark benchmark) throws IOException {
@patsonluk (Contributor) commented Feb 6, 2023

As per the previous thought about making the validation logic abstract, we might be able to move all of loadValidations (and other logic such as preprocessing, string replacement, and the validation-file export/import) into the concrete implementations of that abstraction. For example, having a Validator#recordQueryResults(generatedQueries) and a Validator#validateResults(generatedQueries). (A sketch follows.)

This could offload some of the logic from BenchmarksMain 😊
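A sketch of the proposed abstraction; the interface and report type are hypothetical (only SolrBenchQuery is a type from this PR):

```java
import java.io.IOException;
import java.util.List;

// Hypothetical abstraction per the comment above.
interface Validator {
    // Generation mode: persist the observed results, e.g. export a
    // validation file under suites/.
    void recordQueryResults(List<SolrBenchQuery> generatedQueries) throws IOException;

    // Validation mode: compare observed results against the recorded
    // expectations and report the outcome.
    ValidationReport validateResults(List<SolrBenchQuery> generatedQueries) throws IOException;
}

// Minimal result holder for the sketch.
record ValidationReport(int succeeded, int failed) {}
```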

Collaborator Author:

Makes sense to refactor it into an abstraction. Also, I'll raise a separate refactoring PR to rationalize the class names a bit and move every operation (restart, indexing, querying, pause, etc.) into its own class (instead of today's huge if-else ladder).

Random random;
AtomicLong counter = new AtomicLong();
final SimpleDateFormat DATE_FORMAT = new SimpleDateFormat("yyyy-MM-dd");
List<String> queryStrings = new ArrayList<>();
Contributor:

Minor nit: we can probably remove this field, as it's only used locally in the ctor?

int totalDocsIndexed = getTotalDocsIndexed(baseUrl, collection);

FileWriter outFile = new FileWriter("suites/validations-"+benchmark.queryFile+"-docs-"+totalDocsIndexed+"-queries-"+benchmark.totalCount+".tsv");
for (SolrBenchQuery sbq: generatedQueries) {
@patsonluk (Contributor) commented Feb 7, 2023

This might not work properly. I assume we want to capture every single executed query during the test and verify them all?

Instead of returning a new instance of QueryRequest on each call, as in https://github.com/fullstorydev/solr-bench/pull/44/files#diff-e8aa24ed8e0a14385ea7fdfa11268a85f6dcce52021610761f29a76c334e558bL57, it now returns a saved instance of the field stored in QueryGenerator (https://github.com/fullstorydev/solr-bench/pull/44/files#diff-e8aa24ed8e0a14385ea7fdfa11268a85f6dcce52021610761f29a76c334e558bR73). Therefore calls that use the same query will overwrite previous results.

However, I do have some concerns about correcting this by generating a new SolrBenchQuery instance per call: the design right now stores both the underlying QueryRequest and the string response, and those are kept until the very end to generate the validation file / validate the results.

Both the QueryRequest and the response string used to have a much shorter lifespan (just the local call, after which they could be GCed), but now they cannot be GCed until the final validation logic runs. So if we have a huge number of query runs, we could run into memory problems. (One possible mitigation is sketched below.)
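One possible mitigation, sketched under assumptions: validate each response as soon as it completes (here against numFound only), so neither the QueryRequest nor the raw response string has to stay reachable until the end of the run. Names are illustrative:

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical streaming check: nothing from the response outlives this call.
class StreamingValidator {
    private final Map<String, Long> expectedNumFoundByQuery; // loaded once from the validation file
    private final AtomicInteger success = new AtomicInteger();
    private final AtomicInteger failure = new AtomicInteger();

    StreamingValidator(Map<String, Long> expectedNumFoundByQuery) {
        this.expectedNumFoundByQuery = expectedNumFoundByQuery;
    }

    // Called from the benchmark thread right after a query completes;
    // the request and response can be GCed as soon as this returns.
    void onQueryCompleted(String queryString, long numFound) {
        Long expected = expectedNumFoundByQuery.get(queryString);
        if (expected != null && expected == numFound) {
            success.incrementAndGet();
        } else {
            failure.incrementAndGet();
        }
    }

    int succeeded() { return success.get(); }
    int failed() { return failure.get(); }
}
```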

@chatman (Collaborator Author) commented Mar 27, 2023

FYI, this cannot be merged at the moment. The master branch has some errors; looking into them (#62).

@chatman mentioned this pull request Apr 3, 2023
@chatman (Collaborator Author) commented Apr 3, 2023

Struggling to fix the merge conflicts, so opened a new one: #66

@chatman closed this Apr 3, 2023