This repository was archived by the owner on Mar 20, 2023. It is now read-only.

Conversation

@olupton
Contributor

@olupton olupton commented Oct 29, 2021

Description
The MUT* macros were only used in one place, and I believe the 0 in MUTCONSTRUCT(0) meant that the mutex was never actually used.

This is partly motivated by having been led on a wild goose chase by these macros when debugging another issue...

Use certain branches for the SimulationStack CI

CI_BRANCHES:NEURON_BRANCH=master,

@olupton olupton requested a review from alkino October 29, 2021 13:21
@pramodk
Collaborator

pramodk commented Oct 29, 2021

The MUT* macros were only used in one place, and I believe the 0 in MUTCONSTRUCT(0) meant that the mutex was never actually used.

If this is true in the current code then need to look at recent refactoring PRs. I am (pretty) sure that queues are not thread-safe and hence need mutex locks. And CI may not be revealing the issue because most/all tests are today in pure MPI mode (without use of heavy thread usage, long pending #292)

@olupton
Contributor Author

olupton commented Oct 29, 2021

OK... there is a non-macro mutex that lives alongside the queue instance here:

    TQueue<QTYPE>* tqe_;
    std::vector<InterThreadEvent> inter_thread_events_;
    OMP_Mutex mut;

and it is used in a couple of places:

    void NetCvodeThreadData::interthread_send(double td, DiscreteEvent* db, NrnThread* /* nt */) {
        std::lock_guard<OMP_Mutex> lock(mut);
        inter_thread_events_.emplace_back(InterThreadEvent{db, td});
    }

    void NetCvodeThreadData::enqueue(NetCvode* nc, NrnThread* nt) {
        std::lock_guard<OMP_Mutex> lock(mut);
        for (const auto& ite: inter_thread_events_) {
            nc->bin_event(ite.t_, ite.de_, nt);
        }
        inter_thread_events_.clear();
    }

It may well be the case that there are supposed to be more locks, but I don't think this PR actually removes any.
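The two methods quoted above follow a buffer-under-mutex pattern: other threads append events to a locked buffer, and the owning thread later drains the buffer into its own (unlocked) queue. A minimal self-contained sketch of that pattern, with illustrative names (Event, ThreadData, plain std::mutex standing in for OMP_Mutex, a std::vector standing in for the TQueue), not CoreNEURON's actual types:

```cpp
#include <cassert>
#include <mutex>
#include <vector>

struct Event { double t; int id; };

struct ThreadData {
    std::mutex mut;                 // stands in for OMP_Mutex
    std::vector<Event> buffer;      // stands in for inter_thread_events_
    std::vector<Event> queue;       // stands in for the per-thread event queue

    // Called by *other* threads, hence the lock.
    void interthread_send(Event e) {
        std::lock_guard<std::mutex> lock(mut);
        buffer.push_back(e);
    }

    // Called only by the owning thread; it still takes the lock while
    // draining because producers may append concurrently.
    void enqueue() {
        std::lock_guard<std::mutex> lock(mut);
        for (const auto& e : buffer) queue.push_back(e);
        buffer.clear();
    }
};
```

The key point is that only the small staging buffer is protected by the mutex; the queue itself is touched by exactly one thread and needs no lock.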

@alexsavulescu
Contributor

It's not straightforward to see which mutexes are enabled or not. In NEURON I debugged this with some prints: neuronsimulator/nrntest#13 (comment)

@pramodk
Collaborator

pramodk commented Oct 29, 2021

The origin of / changes to tqueue are a bit older (than when I started), but I looked into the history a bit: TQueue in coreneuron/network/tqueue.ipp seems to be derived from src/nrncvode/spt2queue.cpp in NEURON. That file doesn't have any code related to mutex locks, but consider the MUTCONSTRUCT(0) code below in CoreNEURON:

I'm not sure why CoreNEURON has just MUTCONSTRUCT(0) but still kept all the mutex-related locks/unlocks (via macros) in place.

I don't think I have understood the details with certainty yet; further clarity is needed in order to be confident about this change. We can resume on this next week.

@ohm314
Contributor

ohm314 commented Nov 1, 2021

Should we not redo those queues completely anyway, using stdlib queues and a proper C++ implementation? Also, probably a stupid question, but would running some tests with ThreadSanitizer help here?

@pramodk pramodk closed this Nov 8, 2021
@pramodk pramodk reopened this Nov 8, 2021
The MUT* macros were only used in one place, and the 0 in
MUTCONSTRUCT(0) meant that the mutex was never actually used.
@olupton olupton force-pushed the olupton/drop-mut-macros branch from 61f3714 to b8e4766 Compare November 9, 2021 10:39
@alkino alkino self-requested a review November 9, 2021 11:04
@alkino alkino merged commit ff93a93 into master Nov 9, 2021
@alkino alkino deleted the olupton/drop-mut-macros branch November 9, 2021 12:33
@pramodk
Collaborator

pramodk commented Nov 9, 2021

TQueue in coreneuron/network/tqueue.ipp seems to be derived from src/nrncvode/spt2queue.cpp in neuron. It doesn't have code related to mutex locks

I was wrong with the above comment - the TQueue implementation is derived from nrncvode/sptbinq.cpp which has locks.

The issue / inconsistency in CoreNEURON was that the mutex was never constructed (0 as an argument to MUTCONSTRUCT):

template <container C>
TQueue<C>::TQueue() {
    MUTCONSTRUCT(0)
    nshift_ = 0;

whereas NEURON's constructor has a default argument mkmut = 0:

class TQueue {
public:
    TQueue(TQItemPool*, int mkmut = 0);
....

TQueue::TQueue(TQItemPool* tp, int mkmut) {
	MUTCONSTRUCT(mkmut)
	tpool_ = tp;

In the neuron source tree I don't see any TQueue instantiation with mkmut=1:

kumbhar@bbpv1:~/.../nrn/src$ grep -r "new TQueue" *
nrncvode/netcvode.cpp:	tqe_ = new TQueue(tpool_, 0);
nrncvode/netcvode.cpp:			d.tq_ = new TQueue(d.tpool_);
nrncvode/netcvode.cpp:		p.lcv_[i].neosim_self_events_ = new TQueue();
nrncvode/netcvode.cpp:		d.tqe_ = new TQueue(p[i].tpool_);
nrncvode/tqueue.cpp:	TQueue* q = new TQueue(0);

Does this mean we can remove all MUT* macros in TQueue? I would say YES.
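The comparison above makes sense if the MUT* macros are conditional on the constructor argument: when mkmut is 0 no mutex is ever created, and every MUTLOCK/MUTUNLOCK becomes a no-op. A hedged sketch of how such a macro family could behave (this is an illustration of the idea, not the actual nrnmutdec.h definitions):

```cpp
#include <cassert>
#include <mutex>

// Illustrative MUT*-style macros: the mutex is only allocated when the
// constructor argument is nonzero, so lock/unlock guard on the pointer
// and silently do nothing after MUTCONSTRUCT(0).
#define MUTCONSTRUCT(mkmut) if (mkmut) { mut_ = new std::mutex; }
#define MUTLOCK             if (mut_) { mut_->lock(); }
#define MUTUNLOCK           if (mut_) { mut_->unlock(); }

struct Queue {
    std::mutex* mut_ = nullptr;
    int n_ = 0;
    explicit Queue(int mkmut) { MUTCONSTRUCT(mkmut) }
    ~Queue() { delete mut_; }
    void insert() { MUTLOCK n_++; MUTUNLOCK }
    bool has_mutex() const { return mut_ != nullptr; }
};
```

Under this reading, keeping the macros while always passing 0 leaves dead lock/unlock sites in the code, which is exactly what made the behavior hard to see when debugging.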

There was one more place where I saw mkmut=1, and that is TQItemPool:

$ grep -r TQItemPool *
....
nrncvode/netcvode.h:	TQItemPool* tpool_;
nrncvode/netcvode.cpp:	tpool_ = new TQItemPool(1000, 1);


nrncvode/tqueue.h:declarePool(TQItemPool, TQItem)

# and in nrncvode/pool.h

#define declarePool(Pool,T) \
class Pool { \
public: \
    Pool(long count, int mkmut = 0); \

but this doesn't interact with TQueue.

With the above summary, I had a quick call with @nrnhines to verify the origin of the mutex macros and whether their removal is safe. He pointed out that the newer iteration of the original per-queue mutex is the inter-thread-event buffers: instead of copying events into a TQueue under a mutex, there is a buffer populated by the thread that owns the PreSyn (see interthread_send()), and this is where the mutex is required/used. The thread that owns the queue then copies all those buffered events into its own queue (and hence no mutex is required there). In PreSyn::send() we can see how a thread decides whether to put an event into its own queue or into the inter-thread-event buffer.

This also explains why coreneuron always had MUTCONSTRUCT(0).
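The dispatch decision described above can be sketched minimally: if the sending thread owns the target queue it enqueues directly with no lock, otherwise it buffers the event on the owner's inter-thread list under that owner's mutex. Names here (Target, send, owner_tid) are illustrative, not CoreNEURON's actual API:

```cpp
#include <cassert>
#include <mutex>
#include <vector>

struct Target {
    int owner_tid;
    std::mutex mut;
    std::vector<double> queue;         // touched only by the owner: no lock
    std::vector<double> inter_thread;  // cross-thread staging buffer: locked

    void send(int sender_tid, double t) {
        if (sender_tid == owner_tid) {
            queue.push_back(t);        // same thread: enqueue directly
        } else {
            std::lock_guard<std::mutex> lock(mut);
            inter_thread.push_back(t); // other thread: buffer for the owner
        }
    }
};
```

Because the per-queue mutex was superseded by this scheme, always constructing TQueue with MUTCONSTRUCT(0) is consistent rather than a bug.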

In summary, the merge of this PR is safe.

Should we not anyway redo those queues completely using stdlib queues and a proper C++ implementation?

I believe so, and this is what Tim already attempted using std::priority_queue. That implementation does not support moving events, which the net_move() NMODL construct requires. (I vaguely remember some performance/efficiency-related discussion about this compared with the default SplayTree implementation.)
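The limitation is that std::priority_queue exposes no way to re-time an element that is already in the heap. One common workaround (purely illustrative; not what CoreNEURON or Tim's attempt did) is lazy deletion: reinsert the event with a bumped version counter and skip stale entries on pop:

```cpp
#include <cassert>
#include <queue>
#include <vector>

struct Ev { double t; int id; int version; };
struct Later {
    // Invert the comparison so the earliest time is on top (min-heap).
    bool operator()(const Ev& a, const Ev& b) const { return a.t > b.t; }
};

struct MovableQueue {
    std::priority_queue<Ev, std::vector<Ev>, Later> pq;
    std::vector<int> current_version;  // latest version per event id

    void insert(int id, double t) {
        if (id >= (int)current_version.size()) current_version.resize(id + 1, 0);
        pq.push({t, id, ++current_version[id]});
    }
    // "Move" = bump the version and reinsert; the old entry becomes stale.
    void move(int id, double t) { insert(id, t); }

    bool pop(Ev& out) {
        while (!pq.empty()) {
            Ev e = pq.top();
            pq.pop();
            if (e.version == current_version[e.id]) { out = e; return true; }
            // stale entry left behind by a move: skip it
        }
        return false;
    }
};
```

The cost is that moved events leave dead entries in the heap until popped, which may be part of why the performance comparison against the splay tree was contentious.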

Also, probably a stupid question, but would running some tests with ThreadSanitizer help here?

It would, but such a test is missing. I am creating a ticket in NEURON to adapt the tqperf model to use threads so that it can be used for such validations.

pramodk pushed a commit to neuronsimulator/nrn that referenced this pull request Nov 2, 2022
The MUT* macros were only used in one place, and the 0 in
MUTCONSTRUCT(0) meant that the mutex was never actually used.

CoreNEURON Repo SHA: BlueBrain/CoreNeuron@ff93a93
