Some specific Specman e file causes cloc to hang #206
Comments
First off, I appreciate the debug efforts you've put in so far, saves me a lot of work. The line where things hang looks innocuous enough. My gut feel is the C++ regex from Regexp::Common freaked out on the input (which is
From past experience, problematic C and C++ input happens when people decorate their code with things like /*///////////////////////////////*/ /***************////////******// and so on, in other words, inadvertently mix
If you could bisect the code and still reproduce the hang (anything longer than 5 seconds for one file and you might as well kill it as it is hung) that would be most helpful. Ideally you'll be able to trim it down to just a few lines of code and if the cause of the problem isn't obvious then, then at least the obfuscation work will be easier and you could send it.
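The bisection suggested above could be automated roughly as follows. This is only a sketch: `shrink_reproducer`, `runs_longer_than`, and the sample predicate are hypothetical names invented for illustration, not part of cloc, and the 5-second limit is taken from the comment above.

```python
import subprocess
from typing import Callable, List


def runs_longer_than(cmd: List[str], seconds: float = 5.0) -> bool:
    """Return True if the command is still running after `seconds` (it is then killed)."""
    try:
        subprocess.run(cmd, timeout=seconds,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return True
    return False


def shrink_reproducer(lines: List[str],
                      still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Keep halving the input for as long as one half still reproduces the failure.

    `still_fails` is supplied by the caller; for this issue it might write the
    lines to a temporary .e file and call runs_longer_than() on a cloc command.
    """
    current = list(lines)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if still_fails(first):
            current = first
        elif still_fails(second):
            current = second
        else:
            # The failure needs lines from both halves; stop shrinking here.
            break
    return current


if __name__ == "__main__":
    # Stand-in predicate: pretend any line containing "/*" triggers the hang.
    sample = ["a", "b", 'x = "/* oops";', "c", "d"]
    print(shrink_reproducer(sample, lambda ls: any("/*" in l for l in ls)))
```

Plain halving stops as soon as the problem depends on lines from both halves, so the result is a smaller file rather than a guaranteed minimal one. That is usually still enough to cut down the obfuscation work.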
I've figured out that it's not a "hang" per se, but rather a steep, apparently exponential slowdown as the number of lines in the file grows. I have found the exact character that I can delete to make the problem go away. You're right; it's a case where there is a multi-line comment token "/*" accidentally embedded inside a quoted string, with no matching closing token. But if I shorten the testcase, keeping this problem in the file, cloc will eventually finish the analysis. It reports a rate of about 30 lines per second when the problem is present and the file is about 1000 lines long. Normally the file is about 5000 lines long, and that apparently translates to "not finishing in 18 hours." Is this an intelligence that you think you can grant to cloc? I don't own the analyzed code in question, so getting this particular landmine removed will be problematic for me. But I realize that cloc isn't trying to be a universal parser, either. Thanks!
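For anyone hunting a similar landmine in their own files, a small scan like the one below can point at lines where "/*" sits inside a quoted string. It is a sketch under simplifying assumptions: double-quoted strings with backslash escapes that never span lines, which is not a faithful model of Specman e syntax. The function name is made up for illustration.

```python
from typing import List, Tuple


def comment_openers_in_strings(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) for each line with "/*" inside a double-quoted string."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        in_string = False
        i = 0
        while i < len(line):
            ch = line[i]
            if in_string and ch == "\\":
                i += 2  # skip the backslash and the character it escapes
                continue
            if ch == '"':
                in_string = not in_string
            elif in_string and line.startswith("/*", i):
                hits.append((lineno, line))
                break  # one report per line is enough
            i += 1
    return hits


if __name__ == "__main__":
    source = 'a = 1;\nout("path/*.log");\n/* real comment */\n'
    for lineno, line in comment_openers_in_strings(source):
        print(f"{lineno}: {line}")
```

The scan only reports suspects; it does not decide whether a matching "*/" exists elsewhere in the file.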
Being able to understand what makes a string is difficult. The real solution is to correctly parse each language according to its syntax rules but I don't have an easy avenue to that.
Please give 2d19bf8 a try with your original file set. It won't catch the unmatched
I tried this version and it still hangs on the file. I do see examples where the timeout is exceeded, but not for this particular file. :(
What if you just analyzed the string that the regexp is about to use, and determined that the string was pathological, likely to cause a hang, and just skipped it instead? Probably with a warning.
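To make the suggestion concrete, here is a rough sketch of such a pre-check. It is not how cloc works. `looks_pathological` and `strip_comments_guarded` are hypothetical names, and the heuristic (more "/*" openers than "*/" closers in the whole text) is an assumption chosen for simplicity.

```python
import sys
from typing import Callable, Optional


def looks_pathological(text: str) -> bool:
    """Heuristic: more "/*" openers than "*/" closers hints at an unmatched opener."""
    return text.count("/*") > text.count("*/")


def strip_comments_guarded(text: str,
                           strip_comments: Callable[[str], str],
                           name: str = "<input>") -> Optional[str]:
    """Run strip_comments(text) unless the input looks risky; return None if skipped."""
    if looks_pathological(text):
        print(f"warning: skipping {name}: unbalanced /* ... */ markers",
              file=sys.stderr)
        return None
    return strip_comments(text)


if __name__ == "__main__":
    risky = 'out("/* oops");\n'
    print(strip_comments_guarded(risky, lambda t: t, name="example.e"))
```

A plain count can be fooled in both directions, for example by a stray "*/" elsewhere that balances the opener hidden in a string, so at best it would serve as a cheap first filter.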
Nuts. It is going to be really tough to make progress on this without being able to duplicate the problem on my own. As far as analyzing the string, the string in question is the contents of the entire file. Figuring out what makes up a string which encompasses an unmatched
Ok. Will look at ways to mitigate the time sink of obfuscating the code.
Please find attached the obfuscated code. In order to paste it I had to change the extension from .e to .txt so you'll wanna change that back.
cloc runs without issue for me on this file:
> mv obfuscate_this.txt issue206_specman.e
> md5sum issue206_specman.e
645f890dc1d3fb441352831b69c8bb9f  issue206_specman.e
> cloc issue206_specman.e
       1 text file.
       1 unique file.
       0 files ignored.
github.com/AlDanial/cloc v 1.73  T=0.09 s (10.9 files/s, 43749.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Specman e                        1            542            908           2575
-------------------------------------------------------------------------------
Argh. What version of Perl?
Perl v5.22.1
> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:        16.04
Codename:       xenial
Ubuntu 16.04
Linux vex 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Ok. I have replicated the problem on several versions of Perl up to 5.8.7. Perl 5.12.1 and later do not appear to show the problem, although the problematic line apparently results in an incorrect code line count for that file. Thanks for your help.
OK, will chalk this up as a Perl version issue then.
Hey Al. It might be touchy to solve this one. I'll need to heavily obfuscate the code before I can send a snippet. Perhaps you can help me get some debug help here and maybe it won't be necessary.
I have one particular Specman e file that causes cloc to hang. That is, 18 hours later it hasn't made forward progress on that file as far as I can tell.
I have done two things to get some debug information. First, I ran with verbosity set to some high number. Second, I ran cloc under the Perl debugger to see what exactly was hanging.
The high-verbosity printout follows:
The ^C at the bottom is me coming back to the terminal and finally killing the process, 18 hours later.
I ran cloc under perl debugger and I found that this is the specific line of cloc that is hanging.
Stepping into that marked line above causes the hang. If I re-execute the debugger and do a "print $1" before stepping on that line, the debugger prints out an empty line. I don't know if there are whitespace tidbits on the blank line, or what. Maybe it's literally a blank string. I'm surprised that perl will hang without warning if it executes a substitution wherein it is looking for an empty string. I don't know how to make perl tell me what's going on so that I can figure out why that substitution is hanging. I will continue to try to narrow down my testcase to see if I can isolate the specific line of data that causes the hang.
Do you have any steps you want me to take?
Thanks for your time.