Conversation

MikhailShchatko
Collaborator

After these changes, version generation for the mongodb-mongo-master project slowed from ~337 seconds to ~372 seconds on my local machine:

{"timestamp":"2022-08-01T10:30:16.525079Z","level":"INFO","fields":{"message":"generation completed: 337 seconds"},"target":"mongo_task_generator"}
{"timestamp":"2022-08-01T10:39:08.927945Z","level":"INFO","fields":{"message":"generation completed: 372 seconds"},"target":"mongo_task_generator"}

Currently, in Evergreen, version generation takes ~143 seconds:
https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_generate_tasks_for_version_version_gen_99286ff7b1f837df8449ef990b881c3ed1e3a64b_22_08_01_04_38_52/0?type=T#L915

{"timestamp":"2022-08-01T04:45:40.350646Z","level":"INFO","fields":{"message":"generation completed: 143 seconds"},"target":"mongo_task_generator"}

@MikhailShchatko MikhailShchatko requested review from a team and zituo-jin and removed request for a team August 1, 2022 10:41
@MikhailShchatko MikhailShchatko changed the title Generate tasks separately for Windows, MacOS, Linux distro groups DAG-1940 Generate tasks separately for Windows, MacOS, Linux distro groups Aug 1, 2022
@zituo-jin

Are we going to open another PR to fix uneven workloads? How does task splitting generate workloads with this change?

@MikhailShchatko
Collaborator Author

One of the reasons tasks were split unevenly is that, before these changes, each task was split based on historic runtime information from only one of the build variants that run it.

It turned out that on some build variants certain tests run significantly longer than on others, and the slowdown ratio differs from test to test. For example, runtimes can look like this:
test1.js - linux 10 secs, windows 12 secs, macos 14 secs
test2.js - linux 10 secs, windows 15 secs, macos 777 secs

So if we split a task based on macOS historic data, the split will be uneven for Linux and Windows. Ideally we would split each build variant using its own historic data, but that would dramatically increase the version_gen task runtime. That's why we decided to split separately for each major platform instead.
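To illustrate the point above, here is a minimal sketch that computes how much of a suite's total runtime one test accounts for on each platform, using only the two example tests from this comment. The `TestRuntime` struct and the `example_tests` and `runtime_share` functions are hypothetical names made up for this illustration; they are not part of mongo-task-generator.

```rust
/// Historic average runtimes (in seconds) of one test on each major platform.
/// Hypothetical type, used only for this illustration.
struct TestRuntime {
    name: &'static str,
    linux: f64,
    windows: f64,
    macos: f64,
}

/// The two example tests quoted in the comment above.
fn example_tests() -> Vec<TestRuntime> {
    vec![
        TestRuntime { name: "test1.js", linux: 10.0, windows: 12.0, macos: 14.0 },
        TestRuntime { name: "test2.js", linux: 10.0, windows: 15.0, macos: 777.0 },
    ]
}

/// Fraction of the suite's total runtime spent in `test_name`,
/// using the platform column selected by `pick`.
fn runtime_share(
    tests: &[TestRuntime],
    test_name: &str,
    pick: fn(&TestRuntime) -> f64,
) -> f64 {
    let total: f64 = tests.iter().map(pick).sum();
    let own: f64 = tests
        .iter()
        .filter(|t| t.name == test_name)
        .map(pick)
        .sum();
    own / total
}
```

With these numbers, test2.js accounts for half of the suite runtime on Linux (10 of 20 seconds) but about 98% on macOS (777 of 791 seconds), so split boundaries derived from macOS data do not balance the Linux or Windows workloads.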

I ran a patch build with these changes. The results are not perfect, but better than before:
https://evergreen.mongodb.com/task/mongodb_mongo_master_nightly_windows_display_aggregation_secondary_reads_patch_27c4d326ad61ffe9140127e368b348fbd32d9bca_62e8f1e532f4175239530161_22_08_02_09_44_27
https://evergreen.mongodb.com/task/mongodb_mongo_master_nightly_enterprise_windows_display_aggregation_secondary_reads_patch_27c4d326ad61ffe9140127e368b348fbd32d9bca_62e8f1e532f4175239530161_22_08_02_09_44_27
https://evergreen.mongodb.com/task/mongodb_mongo_master_nightly_enterprise_amazon2_arm64_display_aggregation_secondary_reads_patch_27c4d326ad61ffe9140127e368b348fbd32d9bca_62e8f1e532f4175239530161_22_08_02_09_44_27
https://evergreen.mongodb.com/task/mongodb_mongo_master_nightly_macos_arm64_display_aggregation_secondary_reads_patch_27c4d326ad61ffe9140127e368b348fbd32d9bca_62e8f1e532f4175239530161_22_08_02_09_44_27

To make it better we also need to update the splitting algorithm itself:

let max_tasks = min(self.config.n_suites, test_list.len());
let runtime_per_subtask = total_runtime / max_tasks as f64;
event!(
    Level::INFO,
    "Splitting task: {}, runtime: {}, tests: {}",
    &params.suite_name,
    runtime_per_subtask,
    test_list.len()
);
let mut sub_suites = vec![];
let mut running_tests = vec![];
let mut running_runtime = 0.0;
let mut i = 0;
for test in test_list {
    let test_name = get_test_name(&test);
    if let Some(test_stats) = task_stats.test_map.get(&test_name) {
        if (running_runtime + test_stats.average_runtime > runtime_per_subtask)
            && !running_tests.is_empty()
            && sub_suites.len() < max_tasks - 1
        {
            sub_suites.push(SubSuite {
                index: Some(i),
                name: multiversion_name.unwrap_or(&params.task_name).to_string(),
                test_list: running_tests.clone(),
                origin_suite: origin_suite.to_string(),
                exclude_test_list: None,
                mv_exclude_tags: multiversion_tags.clone(),
                is_enterprise: params.is_enterprise,
            });
            running_tests = vec![];
            running_runtime = 0.0;
            i += 1;
        }
        running_runtime += test_stats.average_runtime;
    }
    running_tests.push(test.clone());
}

I didn't want to do this as part of this ticket. I will file another DAG ticket to update the splitting algorithm.
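For experimenting with the greedy loop quoted above outside the generator, it can be reduced to a self-contained sketch. This is a simplification under stated assumptions, not the project's implementation: `split_greedy` is a hypothetical name, tests are represented as `(name, Option<runtime>)` pairs instead of `SubSuite` structs and `task_stats`, and the final push of the remaining tests is assumed, because the quoted excerpt ends before that step.

```rust
/// Greedy split mirroring the quoted loop: a new sub-suite is started when
/// adding the next test would exceed the per-sub-suite runtime budget.
/// Each entry pairs a test name with its historic average runtime in seconds
/// (`None` when no history exists). Hypothetical helper for illustration only.
fn split_greedy(tests: &[(&str, Option<f64>)], n_suites: usize) -> Vec<Vec<String>> {
    let max_tasks = n_suites.min(tests.len());
    if max_tasks == 0 {
        return vec![];
    }
    let total_runtime: f64 = tests.iter().filter_map(|(_, r)| *r).sum();
    let runtime_per_subtask = total_runtime / max_tasks as f64;

    let mut sub_suites: Vec<Vec<String>> = vec![];
    let mut running_tests: Vec<String> = vec![];
    let mut running_runtime: f64 = 0.0;
    for (name, runtime) in tests {
        if let Some(avg) = *runtime {
            if running_runtime + avg > runtime_per_subtask
                && !running_tests.is_empty()
                && sub_suites.len() < max_tasks - 1
            {
                // Close the current sub-suite and start a new one.
                sub_suites.push(std::mem::take(&mut running_tests));
                running_runtime = 0.0;
            }
            running_runtime += avg;
        }
        // Tests without history are appended without affecting the budget.
        running_tests.push(name.to_string());
    }
    // Assumption: whatever is left becomes the last sub-suite
    // (the quoted excerpt stops before this step).
    if !running_tests.is_empty() {
        sub_suites.push(running_tests);
    }
    sub_suites
}
```

As a made-up example, take four tests split into two sub-suites. With macOS-like runtimes of 14, 777, 14 and 14 seconds, the sketch yields one sub-suite with the first test and one with the other three. With 10 seconds per test, as on Linux, it yields two sub-suites of two tests each. Reusing the macOS split on Linux would therefore produce sub-suites of roughly 10 and 30 seconds instead of 20 and 20.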

@MikhailShchatko
Collaborator Author

evergreen merge

@ghost ghost merged commit 0b406d1 into mongodb:master Aug 3, 2022
@MikhailShchatko MikhailShchatko deleted the DAG-1940-task-split-by-platform branch August 3, 2022 06:44
This pull request was closed.