-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallel Diff-ing and other Tweaks #66
Conversation
Parallel diff-ing!! 🤩 |
runner.test_output_dir, | ||
runner.thresh_dict_file, | ||
ci_mode=True | ||
) # returns an updated entry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
process_diffs_for_one_case
is now a @staticmethod
to make it easier to use with the process pool. This required some extra parameters to be passed into it.
self.id_like_to_stop_now = False | ||
self.completed_structure = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completed structure instance is now a member variable on the suite runner class so that the process pool can access it when processes complete.
@@ -372,7 +375,8 @@ def run_build(self, build_tree): | |||
# `apply_async` approach I am using. Blech. Once again, on Windows, this means it will partially not be | |||
# multithreaded. | |||
if self.number_of_threads == 1 or frozen and system() in ['Windows', 'Darwin']: # pragma: no cover | |||
self.my_print("Ignoring num_threads on frozen Windows/Mac instance, just running with one thread.") | |||
if self.number_of_threads > 1: | |||
self.my_print("Ignoring num_threads on frozen Windows/Mac instance, just running with one thread.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only print this warning message if you are actually trying to run multithreaded. Previously it would show up even if you intentionally ran a single thread.
@@ -697,8 +704,7 @@ def process_diffs_for_one_case(self, this_entry, ci_mode=False): | |||
EndErrSummary.STATUS_SUCCESS, | |||
runtime_case2 | |||
)) | |||
self.my_print("TestMathAndKill Fatal-ed as expected, continuing with no diff checking on it") | |||
return this_entry | |||
return this_entry, "TestMathAndKill Fatal-ed as expected, continuing with no diff checking on it" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.my_print
is no longer available from this static method, so this function simply returns a string message that can be printed once back in the process pool callback function.
@@ -786,14 +786,14 @@ def process_diffs_for_one_case(self, this_entry, ci_mode=False): | |||
path_to_math_diff_log)), MathDifferences.SSZ) | |||
|
|||
# Do sorta-math-diff JSON diff | |||
if self.both_files_exist(case_result_dir_1, case_result_dir_2, 'eplusout_hourly.json'): | |||
this_entry.add_math_differences(MathDifferences(self.diff_json_time_series( | |||
if SuiteRunner.both_files_exist(case_result_dir_1, case_result_dir_2, 'eplusout_hourly.json'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several functions were already static methods but still referenced by self
, so these were cleaned up.
self.build_tree_a['source_dir'], self.build_tree_a['build_dir'], | ||
self.build_tree_b['source_dir'], self.build_tree_b['build_dir'], | ||
os.path.join(self.build_tree_a['build_dir'], self.test_output_dir), | ||
os.path.join(self.build_tree_b['build_dir'], self.test_output_dir) | ||
) | ||
diff_runs = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New functionality to allow for diffs to run multithreaded.
if diff_results.entries_by_file[0].basename == 'my_file': | ||
results_for_file = diff_results.entries_by_file[0] | ||
elif diff_results.entries_by_file[1].basename == 'my_file': | ||
results_for_file = diff_results.entries_by_file[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an interesting one, previously the two files would always come out in the serial order, but now sometimes they would come out backwards. I had to simply check which entry was the one I'm inspecting before calling assertions on it.
@@ -638,6 +643,24 @@ def idf_select_all(self): | |||
self.active_idf_listbox.insert(END, idf) | |||
self.idf_refresh_count_status() | |||
|
|||
def idf_select_all_except_long_runs(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add in a new way to select all but the longest running files. This can take tons of time off the run just by skipping these few files.
Take a look at this branch at your leisure and let me know if you run into any issues. |
I just ran this on the rmPlantHack branch and it worked flawlessly, showing all the diffs. The best part...I pulled the branch and it took just over 7 minutes to do a full rebuild including unit tests...then the full regression suite took just over 4 minutes including both sets of runs. Doing 723 files. Things are looking up! |
Is this run on DecentCI? |
Looks great to me! We're definitely parallel-processing the diffs now. Nice job, @Myoldmopar 🥇 |
It is run on Decent CI, but not in a way that will make a difference with this multi-processing. The call out to the regression script is made as part of a CTest execution. So if you run |
I just pushed a commit based on the rmPlantHack branch that points to this branch in the regression repo. If it looks all clean I'll go ahead and merge this down. |
OK, I think I'm going to merge this...if we notice anything later I'll revert or address it. |
This has some changes in place to get parallel diff calculations working. I haven't tested it exhaustively, but it does look like it is working, and all unit tests continue to pass. Need to verify CI will be OK with it though, by pointing a decent CI config file to this branch and making sure some diffs appear in that branch.
This will continue to run serially on Windows or in frozen apps or whatever the limitation was previously. Now that E+ is thread safe, we need to make sure that table-diff and math-diff are also thread safe, and then we can get all this running using
threading
rather thanmultiprocessing
, and it will be parallel on all platforms.