Accelerating rDT mesh and optm iter. #16

jreniel · 2022-12-06T13:37:34Z

Hello,
I was wondering if there is any way to speed up the rDT mesh and Mesh optm parts of the jigsaw-python subtree source.
Currently, I am compiling with gcc-11, using the standard CMake approach (maybe there's an optimization switch I haven't enabled?).
I tried compiling it with intel 2018 and 2019 but I wasn't successful with the compilation.
Any ideas on achieving some speed-ups will be greatly appreciated.
Thanks,
-J.

dengwirda · 2023-01-12T12:17:22Z

@jreniel I don't expect there'd be a huge difference in performance across compilers, at least I've not seen this with gcc, clang or msvc. Adding support for icc would be interesting to do but is not a compiler I test on currently.

I expect the biggest changes re: efficiency to come from the underlying algorithms, as well as possibly user config. options. Would you mind giving the new 1.0.0.x-beta a try: #17, this includes a faster converging optimisation scheme as well as initial support for thread-parallelism.

jreniel · 2023-01-16T15:38:59Z

Thank you for your answer. As I've learned more about compilers recently, I have also come to think that there would probably be no significant difference between using gcc and icc. In the past, however, with numerical models written in Fortran, I am under the impression that I have seen cases of a 10x speed increase going from gcc to icc. That doesn't mean the same would necessarily apply to JIGSAW.

As of now, it doesn't seem to compile on icc (2018/2019). However, it is precisely the tweaks that icc requires that probably enable icc to do further optimizations (I am speculating slightly here). The question is, would those optimizations lead to significantly faster code? We wouldn't know unless we make it compile with icc and compare. But overall, I agree that it seems probable that there won't be a significant difference.

I do, however, consider JIGSAW significantly fast, but as I experiment I can't help but wonder if we could run it even faster. Right now, JIGSAW seems to be working exactly as planned, so I am definitely enjoying experimenting with the software. I have more "experiments" in mind that I might try with JIGSAW in the future, but so far I find this software extremely useful and I am grateful for the work you have done.

Finally, I have one more question. Is there anything in particular I need to do to enable the thread-parallelism? (e.g. set value in opts?) This feature would be extremely useful for my use case.

Thanks,
-Jaime

dengwirda · 2023-01-16T22:47:29Z

@jreniel, okay, easy questions first: no, you shouldn't need to enable any flags to run thread-parallel, though the opts.numthread flag can be set if needed. Otherwise the number of cores per machine is autodetected and used, with the autodetected value printed in the user-config. echo included at the top of the log file.
Of course, you will need to compile on a machine with support for openmp; the cmake output in the setup.py build_external step can be used to track whether jigsaw is compiled with or without parallel support.

In terms of compiling using Intel (icpc) I experimented with this today and updated the cmake to provide support. To cut a long story short --- it seems that some of the performance "advantage" of icpc may actually be due to its (default) use of "unsafe" floating-point optimisations, which sacrifice numerical precision for speed. While I don't think I'm ever really a fan of this approach in any situation, in the case of jigsaw (and computational geometry libraries in general) the imprecision leads to incorrectness and program crashes (geometry algorithms rely on adaptive precision arithmetic to compute things like intersections in an exact sense, and the unsafe optimisations are a non-starter in these cases).

Setting icpc to use more reasonable optimisations (fp-model=precise), it actually ends up being around 20% slower than g++ for me. I'm not an expert with icpc so it's possible there are improved compiler flags out there, but let's just say that icpc support at this point is experimental, and it's probably best to just stick with g++ or clang++ anyway...
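
(Illustration, not taken from the JIGSAW source.) The point about "unsafe" floating-point optimisations can be sketched in a few lines of Python. Such optimisations typically allow the compiler to reassociate arithmetic, and in floating point that can change the sign of a computed quantity, which is exactly what a geometric predicate depends on. The sketch below assumes `fractions.Fraction` as a simple stand-in for exact arithmetic; the adaptive precision arithmetic mentioned above is a different and much faster technique, and the helper names here are hypothetical.

```python
from fractions import Fraction


def sum_left(a, b, c):
    """Sum evaluated as (a + b) + c."""
    return (a + b) + c


def sum_right(a, b, c):
    """Same sum, reassociated as a + (b + c)."""
    return a + (b + c)


def sign(x):
    """Return -1, 0 or +1, like the result of a geometric predicate."""
    return (x > 0) - (x < 0)


# Hypothetical inputs chosen only to expose the rounding behaviour.
a, b, c, target = 0.1, 0.2, 0.3, 0.6

# In floating point the two evaluation orders disagree on the sign.
float_left = sign(sum_left(a, b, c) - target)
float_right = sign(sum_right(a, b, c) - target)

# With exact rational arithmetic the evaluation order cannot matter.
fa, fb, fc, ft = (Fraction(v) for v in (a, b, c, target))
exact_left = sign(sum_left(fa, fb, fc) - ft)
exact_right = sign(sum_right(fa, fb, fc) - ft)

print("float signs:", float_left, float_right)
print("exact signs:", exact_left, exact_right)
```

If a compiler is free to pick either evaluation order, a predicate such as "is this point to the left of that line?" can return inconsistent answers for the same input, which is one plausible route to the incorrectness and crashes described above.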

jreniel · 2023-01-17T19:37:06Z

Thanks for looking into icpc support and for adding multithreading support. I will be experimenting with this in the coming days.

jreniel · 2023-02-01T19:13:07Z

@dengwirda
One note: I have been getting semi-random segfaults since the introduction of the multi-threaded interface. This is particularly true when I run multiple JIGSAW instances at the same time in a distributed manner (on an HPC cluster). The problem persists even if I explicitly set opt.numthread = 1, or even if I make absolutely certain that I launch my jigsaw instance on the scheduler with matching thread counts (e.g. srun --cpus-per-task 1 jigsaw, with numthread = 1 set explicitly in my config). It normally happens when I have multiple instances running, but it seems to be semi-random. If I run each process individually, I don't see the problem. I was able to run hundreds of jigsaw instances concurrently before without problems, so I think this might be related to numthread (or maybe netcdf, but I doubt it: I compiled with netcdf support but am not using it in JIGSAW).

I will try reverting to the previous version without multithreading to confirm whether the problem goes away.

I'm not sure if I should reopen this as an actual issue or just leave this here as a note.
At the very least I wanted to make you aware of this potential problem.

Thanks,
-Jaime

[EDIT]: Opened issue #18

jreniel closed this as completed Jan 17, 2023
