ERROR spark.http.matching.GeneralError #88

Closed
quasarea opened this issue Mar 21, 2017 · 11 comments

@quasarea

quasarea commented Mar 21, 2017

In one of my repos I'm getting an error on the Java console (the search itself just shows an endless 'loading...') for the query http://localhost:8080/?q=q&lan=Unknown&repo=Requirements

I haven't seen it anywhere else, and there is no more detailed information in the logs.

It does not happen for http://localhost:8080/?q=q&repo=Requirements
nor for http://localhost:8080/?q=q&lan=Unknown

The repo holds a crap load of random documents, and I wanted to list the unknowns to maybe categorise them in some way.
[qtp395660352-21767] ERROR spark.http.matching.GeneralError - java.lang.StringIndexOutOfBoundsException: String index out of range: 45
    at java.lang.String.substring(Unknown Source)
    at com.searchcode.app.service.CodeMatcher.highlightLine(CodeMatcher.java:289)
    at com.searchcode.app.service.CodeMatcher.highlightLine(CodeMatcher.java:291)
    at com.searchcode.app.service.CodeMatcher.findMatchingLines(CodeMatcher.java:165)
    at com.searchcode.app.service.CodeMatcher.matchResults(CodeMatcher.java:65)
    at com.searchcode.app.service.CodeMatcher.formatResults(CodeMatcher.java:50)
    at com.searchcode.app.service.route.SearchRouteService.codeSearch(SearchRouteService.java:99)
    at com.searchcode.app.App.lambda$null$14(App.java:128)
    at spark.ResponseTransformerRouteImpl$1.handle(ResponseTransformerRouteImpl.java:47)
    at spark.http.matching.Routes.execute(Routes.java:61)
    at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:130)
    at spark.embeddedserver.jetty.JettyHandler.doHandle(JettyHandler.java:50)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:189)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
    at org.eclipse.jetty.server.Server.handle(Server.java:517)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:242)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:261)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:75)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:213)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
    at java.lang.Thread.run(Unknown Source)

@boyter
Owner

boyter commented Mar 21, 2017

This is a bug in the highlight logic, CodeMatcher.highlightLine, which is responsible for highlighting the matching strings in each line.

Odd because this is the most heavily tested method in the application.

Line in question

StringEscapeUtils.escapeHtml4(token.substring(loc, loc + longestTerm.length())) +

If you could supply the actual file that causes this I would be able to fix it. For the moment I am going to attempt to trigger it again based on the above.
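
For reference, String.substring(begin, end) throws StringIndexOutOfBoundsException whenever end is larger than the string length, so the line above fails if loc + longestTerm.length() runs past the end of token. Below is a minimal sketch of a bounds-clamped slice, assuming that is the failure mode here. SafeSlice and slice are hypothetical names for illustration and are not part of searchcode-server.

```java
final class SafeSlice {
    private SafeSlice() {}

    /**
     * Returns token[loc, loc + len), clamped to the bounds of token.
     * Hypothetical helper for illustration only; not the project's code.
     */
    static String slice(String token, int loc, int len) {
        // Clamp the start into [0, token.length()].
        int begin = Math.max(0, Math.min(loc, token.length()));
        // Compute the end in long arithmetic so begin + len cannot overflow,
        // then clamp it into [begin, token.length()].
        long wanted = (long) begin + Math.max(len, 0);
        int end = (int) Math.min(wanted, token.length());
        return token.substring(begin, end);
    }

    public static void main(String[] args) {
        System.out.println(slice("abcq", 3, 1)); // in range
        System.out.println(slice("abcq", 3, 5)); // end clamped to the string length
        System.out.println(slice("abcq", 9, 1).isEmpty()); // start past the end
    }
}
```

Clamping only stops the exception. It would hide whatever makes loc and token disagree, so it is a stopgap and not a diagnosis.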

@boyter
Owner

boyter commented Mar 21, 2017

I tried the following code, which creates the same match term and attempts to trigger the issue with:

  • A string of 1000 characters of just "q"
  • A string of 1000 random-length (1-20 character) alphabetic words
  • A string of 1000 random-length (1-20 character) words using any character

None were able to reproduce the issue.

public void testHighlightLineIssue88() {
        Random rand = new Random();
        CodeMatcher cm = new CodeMatcher();

        // Same single match term as the failing query (?q=q).
        List<String> matchTerms = new ArrayList<String>() {{
            add("q");
        }};

        // Case 1: a growing run of "q", highlighted at every length from 2 to 1001.
        String line = "q";
        for (int i = 0; i < 1000; i++) {
            line += "q";
            cm.highlightLine(line, matchTerms);
        }

        // Case 2: 1001 random alphabetic words of 1-20 characters, separated by spaces.
        StringBuilder bf = new StringBuilder();
        for(int j=0; j < 1000 + 1; j++) {
            bf.append(RandomStringUtils.randomAlphabetic(rand.nextInt(20) + 1)).append(" ");
        }

        cm.highlightLine(bf.toString(), matchTerms);

        // Case 3: 1001 random words of 1-20 arbitrary characters, each followed by " q".
        bf = new StringBuilder();
        for(int j=0; j < 1000 + 1; j++) {
            bf.append(RandomStringUtils.random(rand.nextInt(20) + 1)).append(" ").append("q");
        }

        cm.highlightLine(bf.toString(), matchTerms);
}

Would it be possible to get the files in question which trigger this? I would love to resolve it. For the moment, though, I am going to add some exception handling so that a line which triggers this is simply not added to the results.
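
The stopgap described above could look roughly like the sketch below: catch the exception per line and leave that line out of the results. This is an assumption about the general shape only, not the committed change. SkipOnFailure, highlightAll and the stand-in highlighter are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

final class SkipOnFailure {
    private SkipOnFailure() {}

    /**
     * Applies highlighter to every line and drops any line for which it
     * throws StringIndexOutOfBoundsException. Hypothetical sketch only.
     */
    static List<String> highlightAll(List<String> lines, UnaryOperator<String> highlighter) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            try {
                out.add(highlighter.apply(line));
            } catch (StringIndexOutOfBoundsException ex) {
                // Record the failure and skip the line instead of failing the whole search.
                System.err.println("Unable to highlight line, skipping: " + ex.getMessage());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in highlighter (an assumption): bolds the first character,
        // and throws on an empty string because substring(0, 1) is out of range.
        UnaryOperator<String> boldFirst =
                s -> "<strong>" + s.substring(0, 1) + "</strong>" + s.substring(1);
        System.out.println(highlightAll(Arrays.asList("q", "", "qq"), boldFirst));
    }
}
```

The trade-off is that a matching line can silently vanish from the results, so the log entry is the only trace of the underlying bug.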

@boyter boyter added the bug label Mar 21, 2017
@boyter boyter self-assigned this Mar 21, 2017
boyter added a commit that referenced this issue Mar 21, 2017
@boyter
Owner

boyter commented Mar 21, 2017

I have added a workaround (bab5b75) which should resolve this issue, but I would prefer to fix the real underlying issue if possible.

@quasarea
Author

Sure, I will try to locate this file, but as I wrote, there are plenty of awkward files there and I have no clue which one it could be.

@boyter
Owner

boyter commented Mar 22, 2017

Even a dump would be fine. If you are worried about the IP of the files, I can promise that I will never leak them; I just want them to fix the issue. I will then purge any copies I have.

@quasarea
Author

Hi, I debugged against the 1.3.8 tag; I hope it helps. Unfortunately the document is marked as confidential, so I'm not sure I can get it released easily.

It is a .doc file (I removed .doc from the binary exclusion just to test how much data I could get from such files).

It contains both text and images, so possibly the source is actually one of the embedded images.

The exception comes from:
the substring method in the String.java class (line 1958)

    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex); /* throws here */
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

The exception is thrown at the line marked /* throws here, with these variables:
beginIndex = 208
endIndex = 209
this.value = see attached file

and the call comes from CodeMatcher.java (line 289):

    public String highlightLine(String line, List<String> matchTerms) {
        ...
        StringEscapeUtils.escapeHtml4(token.substring(loc, loc + longestTerm.length())) +
                            "</strong>" +

value.txt
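
The values above imply that the string being sliced held fewer than 209 characters when substring(208, 209) was called. Here is a small sketch that reproduces that boundary on synthetic strings. The real value is in the attachment and is not reproduced here. BoundaryRepro, throwsOn and repeat are made-up names.

```java
final class BoundaryRepro {
    private BoundaryRepro() {}

    /** Returns true if value.substring(beginIndex, endIndex) throws. */
    static boolean throwsOn(String value, int beginIndex, int endIndex) {
        try {
            value.substring(beginIndex, endIndex);
            return false;
        } catch (StringIndexOutOfBoundsException ex) {
            return true;
        }
    }

    /** Builds a string of n copies of c (synthetic stand-in for the real value). */
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 208 characters: index 208 is one past the end, so endIndex 209 is out of range.
        System.out.println(throwsOn(repeat('x', 208), 208, 209));
        // 209 characters: the same call is valid and returns the last character.
        System.out.println(throwsOn(repeat('x', 209), 208, 209));
    }
}
```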

@boyter
Owner

boyter commented Mar 28, 2017

No worries. The value is enough for me to work with. I will use this to debug the issue now.

@boyter
Owner

boyter commented Mar 29, 2017

Sadly, I was unable to replicate it using the file supplied. I will try throwing a bunch of images at it and see what that does, i.e. I will remove the blacklist as you did and see what happens.

@quasarea
Author

quasarea commented Mar 29, 2017

I created a separate 1.3.7 instance with just this file, and oddly I can't replicate it there.

With the current version built from sources the error is handled (so it is not switching to endless search mode anymore).

Console output:

Mar 29, 2017 10:01:35 AM java.util.logging.LogManager$RootLogger log
SEVERE: Unable to highlightLine ???     o?`♂▲?z♠r☺??????GO??♦l????%=2↔♫??♥?☻←?B?rv??y?pW??@??~?]z???s?RK/??♫?♂♦ Yb?♀?"f?m?o???o???9?►ZB????*?3?K`?▲?2???⌂??▬[l???f?}??~???5.??&]?u[????,??Y??♦)??/??h??'?b?=?3?;??▲{♀?♂$????1???ErV>j?
???$Ñ??9?W?D!q q q??8? d?i??H5♣?♦- using terms q java.lang.StringIndexOutOfBoundsException: String index out of range: 209

Log file:
searchcode-server-0.txt

If you cannot replicate it from this data, maybe you could add some more output to the log file that would make that possible?

@boyter
Owner

boyter commented Mar 29, 2017

Actually, that line in the output is there to help replicate the issue. The problem in this case is that it is outputting the ASCII/UTF-8 representation of a binary file, so information is lost. The only way I could really reproduce it is with the actual file. I suppose I could add logic to try to pull the line out of the file, but even then it might not resolve it.

String line = "???     o?`♂▲?z♠r☺??????GO??♦l????%=2↔♫??♥?☻←?B?rv??y?pW??@??~?]z???s?RK/??♫?♂♦ Yb?♀?\"f?m?o???o???9?►ZB????*?3?K`?▲?2???⌂??▬[l???f?}??~???5.??&]?u[????,??Y??♦)??/??h??'?b?=?3?;??▲{♀?♂$????1???ErV>j?\n" +
                "???$Ñ??9?W?D!q q q??8? d?i??H5♣?♦-";
        cm.highlightLine(line, matchTerms);

You can see it here https://github.com/boyter/searchcode-server/blob/master/src/test/java/com/searchcode/app/service/CodeMatcherTest.java#L306

That's what I added as a test case, using the above output as an example, and it never hits the issue. I am glad that the workaround put in place prevents the issue from happening though.
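
To illustrate why text copied from a console cannot stand in for the original binary content: decoding arbitrary bytes as UTF-8 replaces malformed sequences with U+FFFD, so the original bytes cannot be recovered from the decoded text. The sketch below uses made-up bytes, not the real file. LossyDecode and survivesRoundTrip are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LossyDecode {
    private LossyDecode() {}

    /** Returns true if decoding raw as UTF-8 and re-encoding yields the same bytes. */
    static boolean survivesRoundTrip(byte[] raw) {
        // Malformed input is replaced with U+FFFD during decoding.
        String decoded = new String(raw, StandardCharsets.UTF_8);
        byte[] again = decoded.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, again);
    }

    public static void main(String[] args) {
        byte[] text = "q q q".getBytes(StandardCharsets.UTF_8);
        // Made-up binary sample: 0xFF, 0xFE and a lone 0x80 are not valid UTF-8.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 'q', (byte) 0x80};
        System.out.println(survivesRoundTrip(text));   // plain text is preserved
        System.out.println(survivesRoundTrip(binary)); // binary content is not
    }
}
```

A console adds a second lossy step on top of this when it maps characters to its own code page, which is consistent with the runs of ? in the output above.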

@boyter
Owner

boyter commented Apr 11, 2017

OK, going to close this one down. There is a fix in place which resolves the issue, although it is not ideal; I would love to get hold of the actual file causing the issue and fix that as well. At least it won't crash out this time.

@boyter boyter closed this as completed Apr 11, 2017