pmchart crash on OSX with certain archive #35
Comments
|
Leaving a reference for myself, this was app2 in AUH |
|
Thanks for that Paul ... especially as it lands in my lap via the much maligned interp.c. I would expect this to have nothing to do with pmchart, other than guilt by association. But I will need some additional information to prosecute this:
Unfortunately there are many levers that influence the interpolation code, and it helps to have the fullest possible picture of all of them leading up to the kaboom. |
|
Oh, and |
|
pmlogger was running on 3.6.5-1, pmchart running on the latest build (downloaded yesterday), 3.10.6. I'll set up a step-by-step guide if I can, and email you the archive privately. |
|
Here's the gist with the command to run, plus the chart file I created. When I run the command with this chart against the archive (emailed to you), it crashes immediately (I don't see anything, just a crash). |
natoscott added the bug label Aug 27, 2015
|
I'll just say that this is currently affecting me a lot. I wouldn't normally nag, but I'm struggling to work around this on OSX, doesn't seem to affect others on Linux here for some reason. Any progress would be met with a happy dance. |
|
Paul, I'm in Europe at the moment ... no PCP bandwidth for another couple of weeks I'm afraid. |
|
@kmcdonell Paul has sent me the archive too now, hoping to diagnose & fix within a week. |
|
"Any progress would be met with a happy dance." I've reproduced the problem, debugged a bit (ohhhh man, interp.c) and have an initial workaround which I'm testing at the moment to see whether it might serve for next week's release. I'm not 100% sure of the intention of some parts of the affected code (in libpcp/src/interp.c cache_read) so the long-term fix will very likely need to wait for Ken's input. But I can definitely replay that archive now on Mac OS X (it's all in libpcp memory management code, and nothing to do with the archive format changing, or anything like that, as was wondered earlier). cheers. |
|
Paul, could you try the dmg file I've put at ftp://ftp.pcp.io/projects/pcp/download/pcp-3.10.7-rc1.dmg and see how that pmchart fares for you? Seems to be working again for me for your test case. I git-bisect'ed back to commit 7881fd5 ("libpcp pdubuf: rewrite using <search.h> binary trees"), which is a libpcp pdubuf performance optimisation only used by pmwebd currently. I believe it may have something to do with buffer unpinning of cached pmResult valuesets in the interp.c cache_read() code, but I'm not sure beyond that. Like you, I'm also only seeing it fail on Mac OS X. Perhaps the memory allocation strategies there are circumventing the logic at the start of __pmFreeResultValueSets(); again, not clear. I'll add a pointer to some trace logs here shortly so others can have a peek too. cheers. |
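For readers unfamiliar with the idea of pinned buffers tracked in a <search.h> binary tree, here is a minimal sketch of the general technique. It is not the libpcp pdubuf code: the names bufctl_t, buf_alloc, buf_pin, buf_unpin and buf_pins are all hypothetical, and only the POSIX calls tsearch(), tfind() and tdelete() are real. Each buffer is keyed by its address and carries a pin count; the buffer is released only when the last pin is dropped.

```c
#include <assert.h>
#include <search.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record (not from libpcp): one tracked buffer and its pin count. */
typedef struct {
    void *buf;   /* start address of the buffer, used as the tree key */
    int   pins;  /* number of outstanding pins */
} bufctl_t;

static void *buf_tree = NULL;   /* root handle managed by tsearch() */

/* Order records by buffer address. */
static int bufctl_cmp(const void *a, const void *b)
{
    uintptr_t pa = (uintptr_t)((const bufctl_t *)a)->buf;
    uintptr_t pb = (uintptr_t)((const bufctl_t *)b)->buf;
    return (pa > pb) - (pa < pb);
}

/* Look up the record for a buffer address, or NULL if it is not tracked. */
static bufctl_t *buf_find(void *buf)
{
    bufctl_t key = { buf, 0 };
    void *node = tfind(&key, &buf_tree, bufctl_cmp);
    return node ? *(bufctl_t **)node : NULL;
}

/* Allocate a buffer and register it in the tree with one pin. */
void *buf_alloc(size_t size)
{
    bufctl_t *ctl = malloc(sizeof(*ctl));
    if (ctl == NULL)
        return NULL;
    ctl->buf = malloc(size);
    if (ctl->buf == NULL) {
        free(ctl);
        return NULL;
    }
    ctl->pins = 1;
    if (tsearch(ctl, &buf_tree, bufctl_cmp) == NULL) {
        free(ctl->buf);
        free(ctl);
        return NULL;
    }
    return ctl->buf;
}

/* Add a pin; returns 0 on success, -1 if the buffer is not tracked. */
int buf_pin(void *buf)
{
    bufctl_t *ctl = buf_find(buf);
    if (ctl == NULL)
        return -1;
    ctl->pins++;
    return 0;
}

/* Drop a pin; the buffer is freed only when the count reaches zero. */
int buf_unpin(void *buf)
{
    bufctl_t *ctl = buf_find(buf);
    if (ctl == NULL)
        return -1;
    assert(ctl->pins > 0);
    if (--ctl->pins == 0) {
        tdelete(ctl, &buf_tree, bufctl_cmp);  /* unlink before freeing */
        free(ctl->buf);
        free(ctl);
    }
    return 0;
}

/* Report the current pin count, or -1 if the buffer is not tracked. */
int buf_pins(void *buf)
{
    bufctl_t *ctl = buf_find(buf);
    return ctl ? ctl->pins : -1;
}
```

In a scheme like this, unpinning one time too many, or unpinning an address that was never registered, is the kind of mistake that can stay hidden under one allocator and crash under another. Whether that is what happens here is still only a guess.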
|
To clarify a little, the dmg above is a build with the libpcp commit above reverted - there is no known fix beyond that at this stage. |
|
I'm on leave until next Monday up in sunny Queensland. I'll try when I get back. Thanks heaps for looking into it.
|
natoscott added a commit that referenced this issue Sep 15, 2015
natoscott added a commit that referenced this issue Sep 15, 2015
|
I've found a simpler reproducer using pmdumptext, and a smaller version of Paul's production data. I've anonymised that archive and committed it as qa/archives/small, and added qa/816 which reproduces the problem using pmdumptext. For reference, I'll shortly add a couple of gists of the full before/after debug logs, showing failure vs success, to help triangulate where things go astray. I've temporarily backed out the libpcp optimization behind the regression for pcp-3.10.7 (which is about to release in a day or so). |
|
Failing case - https://gist.github.com/natoscott/f8288c218c1e06ede622 See qa/816 for additional details (and/or the commands at head of each gist) |
|
Thanks Nato. just downloaded -7 today and seems to be working a treat! Sanity restored, many thanks. |
|
No problem, thanks Paul. |
tallpsmith commented Aug 10, 2015
I've tried this with the latest pmchart (3.10.6), but I get this crash pretty reliably as I scroll around time ranges with a specific archive.
I'll attach the actual OSX Debug window data as a separate file. The archive I think I'll have to provide externally rather than here.