
Accelerating rDT mesh and optm iter. #16

Closed
jreniel opened this issue Dec 6, 2022 · 5 comments

@jreniel

jreniel commented Dec 6, 2022

Hello,
I was wondering if there is any way to speed up the rDT mesh and Mesh optm parts of the jigsaw-python subtree source.
Currently, I am compiling with gcc-11, using the standard CMake approach (maybe there's an optimization switch I haven't enabled?).
I tried compiling it with the Intel 2018 and 2019 compilers, but the compilation failed.
Any ideas on achieving some speed-ups will be greatly appreciated.
Thanks,
-J.

@dengwirda
Owner

@jreniel I don't expect there'd be a huge difference in performance across compilers, at least I've not seen this with gcc, clang or msvc. Adding support for icc would be interesting to do but is not a compiler I test on currently.

I expect the biggest efficiency gains to come from the underlying algorithms, and possibly from user config. options. Would you mind giving the new 1.0.0.x-beta a try (#17)? It includes a faster-converging optimisation scheme as well as initial support for thread-parallelism.

@jreniel
Author

jreniel commented Jan 16, 2023

Thank you for your answer. As I've learned more about compilers recently, I have also started to think that there would probably be no significant difference between gcc and icc. In the past, though, with numerical models written in Fortran, I'm under the impression that I've seen cases of a 10x speed difference between gcc and icc. That doesn't mean the same would necessarily apply to JIGSAW.

As of now, it doesn't seem to compile with icc (2018/2019). However, it is precisely the tweaks icc requires that probably enable icc to do further optimizations (I am speculating slightly here). The question is, would those optimizations lead to significantly faster code? We wouldn't know unless we make it compile with icc and compare. But overall, I agree that it seems probable that there won't be a significant difference.

I do, however, consider JIGSAW to be quite fast, but as I experiment I can't help but wonder if we could run it even faster. Right now, JIGSAW seems to be working exactly as planned, so I am definitely enjoying experimenting with the software. I have more "experiments" in mind that I might try with JIGSAW in the future, but so far I find this software extremely useful and I am grateful for the work you have done.

Finally, I have one more question. Is there anything in particular I need to do to enable the thread-parallelism? (e.g. set value in opts?) This feature would be extremely useful for my use case.

Thanks,
-Jaime

@dengwirda
Owner

@jreniel, okay, easy questions first: no, you shouldn't need to enable any flags to run thread-parallel, though the opts.numthread flag can be set if needed. Otherwise the number of cores per machine is autodetected and used, with the autodetected value printed in the user-config. echo included at the top of the log file.
Of course, you will need to compile on a machine with support for openmp; the cmake output in the setup.py build_external step can be used to track whether jigsaw is compiled with or without parallel support.
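
For readers who prefer to set the thread count explicitly rather than rely on autodetection, here is a minimal sketch of one way to choose a value on a shared machine. It is not JIGSAW's own detection logic, which is not described in this thread. The helper name `pick_thread_count` is hypothetical, and the final assignment to `opts.numthread` assumes a jigsawpy options object named `opts`.

```python
import os


def pick_thread_count(env=None):
    """Return a thread count that respects the scheduler's allocation."""
    env = os.environ if env is None else env

    # Slurm exports SLURM_CPUS_PER_TASK when --cpus-per-task is given.
    slurm = env.get("SLURM_CPUS_PER_TASK", "")
    if slurm.isdigit() and int(slurm) > 0:
        return int(slurm)

    # On Linux, the affinity mask lists the cores this process may use,
    # which can be fewer than the cores physically present on the node.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))

    # Portable fallback: total logical cores (cpu_count may return None).
    return os.cpu_count() or 1


if __name__ == "__main__":
    n = pick_thread_count()
    print(f"numthread = {n}")
    # Hypothetical usage, assuming a jigsawpy options object named `opts`:
    #     opts.numthread = n
```

The point of the sketch is that the core count of the machine and the cores granted to a job can differ, so an explicit value may be the safer choice under a scheduler.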

In terms of compiling using Intel (icpc) I experimented with this today and updated the cmake to provide support. To cut a long story short --- it seems that some of the performance "advantage" of icpc may actually be due to its (default) use of "unsafe" floating-point optimisations, which sacrifice numerical precision for speed. While I don't think I'm ever really a fan of this approach in any situation, in the case of jigsaw (and computational geometry libraries in general) the imprecision leads to incorrectness and program crashes (geometry algorithms rely on adaptive precision arithmetic to compute things like intersections in an exact sense, and the unsafe optimisations are a non-starter in these cases).
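
To make the precision point concrete, here is a small illustration (not taken from JIGSAW's predicate code) of why reordering floating-point operations is risky for geometry. Reassociating a sum is one kind of rewrite that "unsafe" floating-point modes permit. The three terms below are chosen by hand so that the two evaluation orders disagree, and `fractions.Fraction` supplies the exact answer for comparison.

```python
from fractions import Fraction


def sign(x):
    """Return -1, 0 or +1, as an orientation test would."""
    return (x > 0) - (x < 0)


# Hand-picked terms; think of them as the pieces of a signed-area sum.
terms = [1e16, -1e16, 1.0]

# Evaluation order as written in the source code.
as_written = (terms[0] + terms[1]) + terms[2]

# Same sum after reassociation: the 1.0 is absorbed by rounding.
reassociated = terms[0] + (terms[1] + terms[2])

# Exact rational arithmetic gives the true value.
exact = sum(Fraction(t) for t in terms)

if __name__ == "__main__":
    print("as written:  ", as_written, "sign", sign(as_written))
    print("reassociated:", reassociated, "sign", sign(reassociated))
    print("exact:       ", exact, "sign", sign(exact))
```

The reassociated sum reports a sign of 0 (a "degenerate" configuration) where the exact sign is positive. A mesh generator that branches on such signs can end up in an inconsistent state, which matches the crashes described above.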

Setting icpc to use more reasonable optimisations (fp-model=precise), it actually ends up being around 20% slower than g++ for me. I'm not an expert with icpc so it's possible there are improved compiler flags out there, but let's just say that icpc support at this point is experimental, and it's probably best to just stick with g++ or clang++ anyway...

@jreniel
Author

jreniel commented Jan 17, 2023

Thanks for looking into icpc support and for adding multithreading support. I will be experimenting with this in the coming days.

@jreniel jreniel closed this as completed Jan 17, 2023
@jreniel
Author

jreniel commented Feb 1, 2023

@dengwirda
One note: I am getting semi-random segfaults since the introduction of the multi-threaded interface. This is particularly true when I am running multiple JIGSAW instances at the same time in a distributed manner (on an HPC cluster). The problem persists even if I explicitly set opt.numthread = 1, or even if I make absolutely certain that I launch my jigsaw instance on the scheduler with matching thread numbers (e.g. srun --cpus-per-task 1 jigsaw, with numthread = 1 set explicitly in my config). It normally happens when I have multiple instances, but it seems to be semi-random. If I run each process individually, I don't see the problem. I was able to run hundreds of jigsaw instances concurrently before without problems, so I think this might be related to numthread (or maybe netcdf, but I doubt it: I compiled with netcdf support but am not using it in JIGSAW).
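
For anyone trying to reproduce this kind of intermittent failure, a minimal sketch of a harness that launches several processes concurrently and reports which ones were killed by a signal may help. It uses only the Python standard library and assumes a POSIX system, where `subprocess` reports death by signal as a negative return code. The function names are hypothetical, and the demo uses Python stand-in commands; a real run would substitute one's own jigsaw command lines.

```python
import signal
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd):
    """Run one command quietly and return (cmd, returncode)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return cmd, proc.returncode


def run_concurrently(cmds, max_workers=4):
    """Run commands side by side; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, cmds))


def describe(returncode):
    """Turn a return code into a short human-readable status."""
    if returncode < 0:
        # POSIX: a negative code means the child died from that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return "ok" if returncode == 0 else f"exit code {returncode}"


if __name__ == "__main__":
    # Stand-ins for real jobs; replace with your own command lists,
    # e.g. a hypothetical ["jigsaw", "case_001.jig"].
    ok = [sys.executable, "-c", "pass"]
    crash = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    for cmd, rc in run_concurrently([ok, crash, ok]):
        print(describe(rc), ":", " ".join(cmd))
```

Repeating such a batch many times, once with a single worker and once with several, is one way to check whether the crashes really depend on concurrency.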

I will try reverting to the previous version without multithreading to see if the problem goes away definitely.

I'm not sure if I should reopen this as an actual issue or just leave this here as a note.
At the very least I wanted to make you aware of this potential problem.

Thanks,
-Jaime

[EDIT]: Opened issue #18
