
Thread Safety (encapsulate global variables into opaque structure) #70

Closed
bemcdonnell opened this issue Aug 7, 2017 · 16 comments

@bemcdonnell

  1. This will enable us to run multiple models within a single instance of SWMM.
  2. It will also let us safely set variables through the API, assuming a user may use multiple threads to talk to it. Mutex? (A sketch of the pattern follows below.)
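
A minimal sketch of the opaque-structure pattern (hypothetical names throughout; the real struct would absorb every global in the codebase):

#include <stdlib.h>

/* Public header exposes only an opaque handle. */
typedef struct SWMM_Project SWMM_Project;

/* Private definition: former globals and static locals become members. */
struct SWMM_Project {
    double ElapsedTime;   /* illustrative: was a file-scope global */
    int    Nobjects;      /* illustrative: was a file-scope global */
    /* ... every other global / static local variable ... */
};

SWMM_Project* swmm_project_create(void)
{
    return (SWMM_Project *) calloc(1, sizeof(SWMM_Project));
}

void swmm_project_delete(SWMM_Project *p)
{
    free(p);
}

Every API call then takes a SWMM_Project pointer, so two models can run side by side and each caller touches only its own state.
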
@bemcdonnell added this to the v5.2.0 milestone on Aug 7, 2017
@bemcdonnell changed the title from "Thread Safety (encapsulate of global variables into opaque structure)" to "Thread Safety (encapsulate global variables into opaque structure)" on Aug 7, 2017
@michaeltryby

So I am nearly done with this task. I created a struct to encapsulate the static and global variables. It was a big job. While I was working on it, I had a nagging idea that there was another, maybe better, way to accomplish the same thing. I wanted to get the idea written down for us to think about.

What if we created a memory pool and gave each thread entering the library a unique handle to a separate pool? Then we could modify the code so that variables are allocated from the pool instead of directly off the system heap. Would this be enough protection to provide reentrancy? I'm not sure.

This could, however, be a less invasive way to achieve our design objective. What do you think?
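
For concreteness, the pool idea might look roughly like this (nothing here is implemented; all names are made up):

#include <stdlib.h>

typedef struct MemPool {
    char  *block;   /* one slab per pool, for simplicity */
    size_t used;
    size_t size;
} MemPool;

/* Each thread entering the library would receive its own handle. */
MemPool* pool_create(size_t size)
{
    MemPool *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->block = malloc(size);
    p->used  = 0;
    p->size  = size;
    return p;
}

/* Variables are carved out of the thread's pool, not the system heap. */
void* pool_alloc(MemPool *p, size_t n)
{
    if (p->used + n > p->size) return NULL;
    void *mem = p->block + p->used;
    p->used += n;
    return mem;
}

void pool_delete(MemPool *p)
{
    free(p->block);
    free(p);
}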

@samhatchett

Maybe less invasive, but it would be less clear from the API / code perspective what is going on. The structifying of SWMM is a solid move toward object orientation, and that's a 👍

@michaeltryby

I agree, the approach taken is a step towards object orientation.

The side effect of adding 10K pointer dereferences, however, is that the code runs 28% slower for the largest model in the test suite.

@michaeltryby

I wrote a reentrancy test using pthreads. I successfully ran SWMM on ten threads. I will do more testing tomorrow.

@michaeltryby

michaeltryby commented May 15, 2018

So after more testing, it is clear that the SWMM dll is in fact failing the reentrancy test. The failure is related to the root of the memory pool, which is still a static local variable. The first thread to finish deallocates the pool, causing the other threads using it to crash. When I tried wrapping the root in the global struct, it caused a memory segmentation fault. At this point I figure there are several options:

  1. Have the thread hold the pool root directly.
  2. Synchronize thread access to root using a lock.
  3. Upgrade to C11 and use the thread_local storage duration specifier.

The root of the memory pool has not been wrapped in EPANET either. So we need a solution for both SWMM and EPANET. Thoughts?
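
For reference, option 2 might look roughly like this (a sketch only; alloc_root_t and Alloc are stand-ins for the mempool.c internals):

#include <pthread.h>
#include <stddef.h>

/* Stand-in for the real pool root in mempool.c. */
typedef struct alloc_root_t { char *next; char *end; } alloc_root_t;

static alloc_root_t   *root;      /* still one root shared by all threads */
static pthread_mutex_t root_lock = PTHREAD_MUTEX_INITIALIZER;

/* Every touch of the shared root is serialized behind the lock. */
void* Alloc(size_t size)
{
    void *mem = NULL;
    pthread_mutex_lock(&root_lock);
    if (root != NULL && root->next + size <= root->end) {
        mem = root->next;
        root->next += size;
    }
    pthread_mutex_unlock(&root_lock);
    return mem;
}

The downside is that the lock serializes every allocation, so independent model runs still contend with each other.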

Here is a link to a Wikipedia article describing Thread-local storage.

@samhatchett

Do you have a profile for the degradation you are trying to address? Maybe I'm a little lost on this, but moving in this direction before really discussing the performance degradation with the "structification" mods is premature. I'm probably lost though ;)

@michaeltryby

michaeltryby commented May 15, 2018

@samhatchett I am pushing forward to achieve the objective - reentrancy. Once the code is working, I will go back and optimize it for performance. Make sense?

BTW, I just got my reentrancy test passing this morning using a thread_local storage duration specifier. The code compiles on Visual Studio 2010 and MinGW gcc. I tested using MinGW gcc and pthreads and I was able to perform two model runs simultaneously.

I will continue testing.
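
The shape of the test is roughly this (a sketch; swmm_run_project is a hypothetical stand-in for whatever entry point the wrapped library exposes):

#include <pthread.h>
#include <stdio.h>

/* Hypothetical entry point into the wrapped, reentrant library. */
extern int swmm_run_project(const char *inp, const char *rpt, const char *out);

static void* worker(void *arg)
{
    const char **f = (const char **) arg;
    int err = swmm_run_project(f[0], f[1], f[2]);
    printf("%s finished with error code %d\n", f[0], err);
    return NULL;
}

int main(void)
{
    const char *a[] = { "model_a.inp", "model_a.rpt", "model_a.out" };
    const char *b[] = { "model_b.inp", "model_b.rpt", "model_b.out" };
    pthread_t t1, t2;

    /* If the library is reentrant, both simultaneous runs complete and
       match the results of running each model by itself. */
    pthread_create(&t1, NULL, worker, (void *) a);
    pthread_create(&t2, NULL, worker, (void *) b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}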

On the performance front, for the User_5 regression test the delta between SWMM with and without the wrapper is currently 4.2 seconds on a 25-second simulation, roughly 17%.

@samhatchett

Does this mean that you can't have reentrancy with just the wrapper struct? Meaning, you can't encapsulate the memory pool in the Project? Looking back over the thread, I'm not clear on exactly what you are doing:

What if we created a memory pool and gave each thread entering the library a unique handle to a separate pool? Then we could modify the code so that variables are allocated from the pool instead of directly off the system heap. Would this be enough protection to provide reentrancy? I'm not sure.

... If you think you are on a good track, then great - I don't have to understand it. But in trying to provide some feedback, I realize that I'm at a bit of a loss as to what you are changing.

@michaeltryby

Ok I see the source of the confusion ...

What you have in quotes there, in the post immediately preceding this one, is just an idea to think about. I haven't implemented it and have no intention of doing so. What I have done is wrap all global and static local variables in a structure, and I provide the thread a handle to that struct (the pattern you applied in EPANET).

While testing that implementation to make sure it was reentrant, a problem with the mempool root came up. When I tried encapsulating it, it caused a segmentation fault. So I fixed the problem by making the memory pool root thread-local.

Everything is building and passing tests right now, but there seems to be moderate degradation in performance for large models. I hope this is clear now. Sorry about the confusion.
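
In other words, every routine that used to touch a global now takes the handle explicitly. Roughly (field and function names illustrative only):

/* Before: a file-scope global, touched directly from anywhere.
       double ElapsedTime;
       void updateClock(double dt) { ElapsedTime += dt; }          */

/* After: the former global lives in the wrapper struct, and the
   handle is threaded through every call that needs it. */
typedef struct SWMM_Project {
    double ElapsedTime;
    /* ... all other former globals and static locals ... */
} SWMM_Project;

void updateClock(SWMM_Project *sp, double dt)
{
    sp->ElapsedTime += dt;   /* the extra dereference behind the slowdown */
}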

@samhatchett

Interesting - so to be clear, when you put the alloc_root_t *root in the SWMM_Project struct, it somehow created a segfault?

@michaeltryby

@samhatchett yeah that's correct.

@samhatchett

@michaeltryby can you say any more about the nature of the segfault? Or why it occurred? Or point to a commit where I can experience it first-hand? This is presumably an issue that may affect EPANET as well, since the mempool implementations are very close.

@michaeltryby

michaeltryby commented May 15, 2018

@samhatchett I didn't commit the offending code to the repo since it didn't work. I mean, you aren't supposed to knowingly commit broken code. Right?

On my local machine I applied the wrapping pattern - moved the root struct from mempool.c to the wrapper struct and added function args to pass it from the wrapper struct through the mempool API and into the functions where it is dereferenced and used. It compiled, and then when I ran the reg tests it segfaulted. I didn't investigate further at that point, figuring there had to be a better way to handle it.

Next, I added the following to the top of mempool.c:

#ifdef _MSC_VER
#define THREAD_LOCAL __declspec(thread)   /* MSVC spelling of thread-local storage */
#else
#define THREAD_LOCAL __thread             /* gcc / clang spelling */
#endif

...

/* Each thread now gets its own pool root instead of sharing one static. */
THREAD_LOCAL alloc_root_t *root;

This builds on MSVC and gcc, and should build on clang; it passes my reentrancy test. Based on my experience, I recommend you make this change to EPANET.

@michaeltryby

SWMM appears to be running slightly faster for small models and slightly slower for large ones. I think data locality and alignment issues are affecting performance.

@samhatchett

you aren't supposed to knowingly commit broken code. Right?

I get what you're saying, but I'm not sure I would agree in general. There are valid reasons you might want to commit broken code, such as anchoring design-related discussions.

I'm also wondering how the thread-local paradigm would work for a single-threaded multi-simulation application - like running two models step-by-step alongside each other, but from a single thread. Would this break, since both Projects would be sharing a single pool?

@michaeltryby

@samhatchett No, I don't think it would. Thread-local storage only comes into play if there is more than one thread. With the code the way it is now (static alloc_root_t *root;) there is only one memory pool no matter how many threads are using the dll. Making it thread-local creates one memory pool per thread: one thread, one memory pool; two threads, two memory pools. So in both cases, if there is only one thread, there is still only one memory pool.

I admit the logic is somewhat convoluted, but I think your use case should be compatible with thread-local storage.
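
A quick way to see it (illustrative; root stands in for the now thread-local pool root in mempool.c):

#include <stdio.h>

#ifdef _MSC_VER
#define THREAD_LOCAL __declspec(thread)
#else
#define THREAD_LOCAL __thread
#endif

typedef struct alloc_root_t alloc_root_t;   /* opaque stand-in */

THREAD_LOCAL alloc_root_t *root;   /* one instance of root per thread */

int main(void)
{
    /* Two projects stepped from this single thread both resolve root to
       the same object, exactly as with a plain static variable. The
       thread-local qualifier changes nothing until a second thread exists. */
    printf("this thread's root lives at %p\n", (void *) &root);
    return 0;
}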
