Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Threading problem of c library. #81

Closed
fockez opened this issue Aug 20, 2020 · 6 comments · Fixed by #109
Closed

Threading problem of c library. #81

fockez opened this issue Aug 20, 2020 · 6 comments · Fixed by #109

Comments

@fockez
Copy link

fockez commented Aug 20, 2020

Definitely sep is a good idea, especially the c library makes it easier to build a customized "sextractor", but now I met some performance problem. Because of the widely use of multi core CPU, obviously the multi threading is a efficient way.
But after some test, multi threading failed, maybe the reason of some global variables.
So, is that possible to make a library suitable for multi-threading, or is that possible to support openmp in sep?

@kbarbary
Copy link
Owner

kbarbary commented Sep 5, 2020

Thanks!

I believe SExtractor (and SEP by extension) do indeed have some global modified state in the extract functionality. Unfortunately this is pretty "baked in" to the design and it would likely be a major rewrite to make it thread safe or use openmp. It also might be tricky to divide work on a single image between threads. An alternative to take advantage of multiple cores would of course be to use multi-processing. You must be working on very large images if you're running into performance problems!

I would expect that the photometry functions (and possibly background estimation) are thread safe.

@fockez
Copy link
Author

fockez commented Sep 6, 2020

Thank you for the reply. Base on our simple test, the consumption of deblending is very high, because we are dealing very large field images and star field is too crowded. I believe this part can be parallelized.
We have too many images to reduce in one C program, it's not easy to use multi-processing to replace the multi-threading while keep the efficient.

@knro
Copy link

knro commented Oct 17, 2020

I was about to open a new issue on this very topic. I was trying to use multiple threads for SEP and that failed miserably due to many static global variables around. This was designed for the era of single threaded programs and certainly multi-threading was not a concern.

One could conceivably partition a large image into discrete regions and run Sep on each to merge the results of each thread later.

@RReverser
Copy link
Collaborator

Has anyone tried to literally replace all those static with thread_local? In C11 compilation mode, there is a chance that might just work.

@RReverser
Copy link
Collaborator

(well, also need to check that it's not using any non-thread-safe system functions)

RReverser added a commit to RReverser/sep that referenced this issue Mar 11, 2021
As mentioned in kbarbary#81, currently the library is not thread-safe, that is, it can't even be used on multiple separate images in parallel due to the shared global mutable variables.

There are a couple of ways to fix this. I've mostly done refactoring in a separate branch to pass everything around via params, but that kind of a change is rather huge and harder to review for correctness.

So I took a different approach I mentioned in kbarbary#81 (comment), and instead went through all the statics and changed them depending on usage:

1. Into local function variables where they are initialized every time anyway - those clearly don't have to be `static`.
2. Into `const static` if it was a static that's been always initialised to the same constant value - there was just `initinfo` that fell into this category.
3. Into `_Atomic` if it's a global config - in particular `sub_object_limit` and `extract_pixstack` params. I believe there is no point in configuring those per-thread, so just left them as globals, but `_Atomic` ensures safe complete initialization in case of a race.
4. Finally, made all the other mutable statics `_Thread_local`, which ensures that none of the threads will override each others' data while storing temporary information about the image being processed.

Then, I used `clang-tidy` to:

1. Find and auto-mark all places where pointers can be `const` - this allows consumer to more easily reason about which data can be shared across images and/or threads, and which will be mutated internally by SEP and so needs to have separate copies per run.
2. Detect all non-thread-safe functions (clang-tidy can do this starting from LLVM 13). Such function was only `rand`, so I added a `_Thread_local` seed variable `randseed`, and switched those callsites to `rand_r(&randseed)`, making those thread-safe too.

Note that due to trimming of unnecessary spacing by clang-tidy and editors the diffs can look a bit more bloated, so I suggest to review the diff using "ignore whitespace changes" mode here: https://github.com/kbarbary/sep/compare/master...RReverser:simple-threadsafe?expand=1&w=1

Overall I think this is the least invasive change that should be relatively easy to review for correctness (and, of course, I checked that all the tests are passing), while at the same time ensures thread-safety throughout the codebase.

Fixes kbarbary#81.
@RReverser
Copy link
Collaborator

In case someone on this thread also wants to test it, I've made the library thread-safe in #109.

RReverser added a commit to RReverser/sep that referenced this issue Mar 29, 2021
As mentioned in kbarbary#81, currently the library is not thread-safe, that is, it can't even be used on multiple separate images in parallel due to the shared global mutable variables.

There are a couple of ways to fix this. I've mostly done refactoring in a separate branch to pass everything around via params, but that kind of a change is rather huge and harder to review for correctness.

So I took a different approach I mentioned in kbarbary#81 (comment), and instead went through all the statics and changed them depending on usage:

1. Into local function variables where they are initialized every time anyway - those clearly don't have to be `static`.
2. Into `const static` if it was a static that's been always initialised to the same constant value - there was just `initinfo` that fell into this category.
3. Into `_Atomic` if it's a global config - in particular `sub_object_limit` and `extract_pixstack` params. I believe there is no point in configuring those per-thread, so just left them as globals, but `_Atomic` ensures safe complete initialization in case of a race.
4. Finally, made all the other mutable statics `_Thread_local`, which ensures that none of the threads will override each others' data while storing temporary information about the image being processed.

Then, I used `clang-tidy` to:

1. Find and auto-mark all places where pointers can be `const` - this allows consumer to more easily reason about which data can be shared across images and/or threads, and which will be mutated internally by SEP and so needs to have separate copies per run.
2. Detect all non-thread-safe functions (clang-tidy can do this starting from LLVM 13). Such function was only `rand`, so I added a `_Thread_local` seed variable `randseed`, and switched those callsites to `rand_r(&randseed)`, making those thread-safe too.

Note that due to trimming of unnecessary spacing by clang-tidy and editors the diffs can look a bit more bloated, so I suggest to review the diff using "ignore whitespace changes" mode here: https://github.com/kbarbary/sep/compare/master...RReverser:simple-threadsafe?expand=1&w=1

Overall I think this is the least invasive change that should be relatively easy to review for correctness (and, of course, I checked that all the tests are passing), while at the same time ensures thread-safety throughout the codebase.

Fixes kbarbary#81.
RReverser added a commit to RReverser/sep that referenced this issue Mar 29, 2021
As mentioned in kbarbary#81, currently the library is not thread-safe, that is, it can't even be used on multiple separate images in parallel due to the shared global mutable variables.

There are a couple of ways to fix this. I've mostly done refactoring in a separate branch to pass everything around via params, but that kind of a change is rather huge and harder to review for correctness.

So I took a different approach I mentioned in kbarbary#81 (comment), and instead went through all the statics and changed them depending on usage:

1. Into local function variables where they are initialized every time anyway - those clearly don't have to be `static`.
2. Into `const static` if it was a static that's been always initialised to the same constant value - there was just `initinfo` that fell into this category.
3. Into `_Atomic` if it's a global config - in particular `sub_object_limit` and `extract_pixstack` params. I believe there is no point in configuring those per-thread, so just left them as globals, but `_Atomic` ensures safe complete initialization in case of a race.
4. Finally, made all the other mutable statics `_Thread_local`, which ensures that none of the threads will override each others' data while storing temporary information about the image being processed.

Then, I used `clang-tidy` to:

1. Find and auto-mark all places where pointers can be `const` - this allows consumer to more easily reason about which data can be shared across images and/or threads, and which will be mutated internally by SEP and so needs to have separate copies per run.
2. Detect all non-thread-safe functions (clang-tidy can do this starting from LLVM 13). Such function was only `rand`, so I added a `_Thread_local` seed variable `randseed`, and switched those callsites to `rand_r(&randseed)`, making those thread-safe too.

Note that due to trimming of unnecessary spacing by clang-tidy and editors the diffs can look a bit more bloated, so I suggest to review the diff using "ignore whitespace changes" mode here: https://github.com/kbarbary/sep/compare/master...RReverser:simple-threadsafe?expand=1&w=1

Overall I think this is the least invasive change that should be relatively easy to review for correctness (and, of course, I checked that all the tests are passing), while at the same time ensures thread-safety throughout the codebase.

Fixes kbarbary#81.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants