multithreading problems via __pmLogCtl sharing #386

fche · 2017-12-07T11:12:39Z

There are known or suspected problems in the way libpcp manages archive __pmLogCtl data structures for multithreaded applications. Opening this issue for formal tracking.

https://groups.io/g/pcp/topic/6095761#16883

goodwinos · 2017-12-14T21:59:27Z

@fche, as you know, there is an __pmLogCtl rework in progress, mostly for multithreaded archive access. Is there an existing pmwebapi QA test that exercises this, or should we create a new one? I couldn't find one among the tests in the pmwebapi group.

fche · 2017-12-14T22:30:45Z

pmwebd is multithreaded for answering wildcarded graphite timeseries queries, but it is structured in such a way that the particular __pmLogCtl mis-sharing behaviour isn't triggered by it. If pmwebd were made more multithreaded (able to serve multiple queries concurrently), it would be much more likely to hit the problems. Exploiting this will need some pmwebd-side work.

Depending on exactly how the auto-decompression bits are implemented, the same timeseries stuff may already be useful for stress-testing it. I'd start by generating a large archive directory of .0.xz files, then asking pmwebd for some narrow and some broad wildcard queries. There should be no crashes, and it should avoid catastrophic rlimit exhaustion or OOM.

kmcdonell · 2017-12-15T00:16:01Z

Outside of the pmwebapi ecosystem, qa/595 does a pretty good job of hammering this for a variety of concurrent access types to a variety of archives. It uses qa/src/multithread12.c:
Thread A: loops over pmNewContext, pmTraversePMNS { pmLookupName, pmLookupDesc }, pmDestroyContext
Thread B: loops over pmNewContext, pmDupContext, pmFetch forwards, pmDestroyContext (x2)
Thread C: loops over pmNewContext, pmFetch backwards, pmDestroyContext
Thread D: loops over pmNewContext, pmTraversePMNS { pmLookupName, pmLookupDesc, pmGetInDomArchive }, pmDestroyContext

In each test case, these threads all run against the same archive or multi-archive.
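The overall shape of such a test (several threads, each looping over open, work, close against the same archive) can be sketched as below. This is a minimal illustration and not the code in qa/src/multithread12.c: `stub_new_context` and `stub_destroy_context` are hypothetical stand-ins so the sketch builds without libpcp, and `run_stress` and `leaked_contexts` are names invented here. A real harness would call pmNewContext and pmDestroyContext in their place, with the per-thread libpcp work in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for libpcp calls, so the sketch needs no PCP
 * install. A real test would use pmNewContext / pmDestroyContext here. */
static atomic_int next_handle;
static atomic_int open_contexts;

static int stub_new_context(const char *archive)
{
    (void)archive;
    atomic_fetch_add(&open_contexts, 1);
    return atomic_fetch_add(&next_handle, 1);   /* handle >= 0 on success */
}

static int stub_destroy_context(int handle)
{
    (void)handle;
    atomic_fetch_sub(&open_contexts, 1);
    return 0;
}

typedef struct {
    const char *archive;   /* every thread gets the same archive name */
    int iterations;
    int completed;         /* written only by the owning thread */
} worker_arg;

/* One thread: loop over open, (work), close, as in the thread list above. */
static void *worker(void *p)
{
    worker_arg *w = p;
    for (int i = 0; i < w->iterations; i++) {
        int ctx = stub_new_context(w->archive);
        if (ctx < 0)
            break;
        /* per-thread work (name lookups, fetches, ...) would go here */
        if (stub_destroy_context(ctx) < 0)
            break;
        w->completed++;
    }
    return NULL;
}

#define MAX_THREADS 16

/* Start nthreads workers on the same archive and return the total number
 * of completed iterations, or -1 if nthreads is out of range. */
int run_stress(const char *archive, int nthreads, int iterations)
{
    pthread_t tid[MAX_THREADS];
    worker_arg args[MAX_THREADS];
    int total = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    for (int i = 0; i < nthreads; i++) {
        args[i] = (worker_arg){ archive, iterations, 0 };
        int rc = pthread_create(&tid[i], NULL, worker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++) {
        pthread_join(tid[i], NULL);
        total += args[i].completed;
    }
    return total;
}

/* Contexts opened but never destroyed; should be 0 after a clean run. */
int leaked_contexts(void)
{
    return atomic_load(&open_contexts);
}
```

Calling `run_stress` with four threads mirrors the A to D layout above. Comparing its return value against threads times iterations, and checking that `leaked_contexts()` is zero, gives a simple pass/fail signal alongside the "no crashes" criterion.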

goodwinos self-assigned this Dec 14, 2017

fche mentioned this issue Jan 23, 2018

activate pcp archive compression redhat-developer/osd-monitor-poc#15

Open
