Extracting keywords with TextRankKeyword on JDK 7 throws "Comparison method violates its general contract!" #11

Closed
a198720 opened this issue May 8, 2015 · 1 comment

a198720 commented May 8, 2015

Test code:
String src = "data/test.txt";
Scanner scanner = new Scanner(Paths.get(src), "gbk");
StringBuilder sb = new StringBuilder();
// Read the whole file into one string, trimming each line.
while (scanner.hasNextLine()) {
    sb.append(scanner.nextLine().trim());
}
// System.out.println(sb.toString());
scanner.close();
System.out.println(TextRankKeyword.getKeywordList(sb.toString(), 20));

Error output:
java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeLo(Unknown Source)
at java.util.TimSort.mergeAt(Unknown Source)
at java.util.TimSort.mergeCollapse(Unknown Source)
at java.util.TimSort.sort(Unknown Source)
at java.util.TimSort.sort(Unknown Source)
at java.util.Arrays.sort(Unknown Source)
at java.util.Collections.sort(Unknown Source)
at com.hankcs.hanlp.summary.TextRankKeyword.getKeyword(TextRankKeyword.java:115)
at com.hankcs.hanlp.summary.TextRankKeyword.getKeywordList(TextRankKeyword.java:47)

I searched online and found these:
http://www.tuicool.com/articles/MZreyuv
http://blog.csdn.net/ghsau/article/details/42012365

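It turns out that the sorting algorithm behind Collections.sort changed in JDK 7, and a comparator now has to handle the case where the two compared objects are equal.
The values compared in TextRankKeyword are Float objects, so I looked up Float's compare method (Float implements Comparable). The code: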
public static int compare(float f1, float f2) {
    if (f1 < f2)
        return -1; // Neither val is NaN, thisVal is smaller
    if (f1 > f2)
        return 1; // Neither val is NaN, thisVal is larger

    // Cannot use floatToRawIntBits because of possibility of NaNs.
    int thisBits    = Float.floatToIntBits(f1);
    int anotherBits = Float.floatToIntBits(f2);

    return (thisBits == anotherBits ?  0 : // Values are equal
            (thisBits < anotherBits ? -1 : // (-0.0, 0.0) or (!NaN, NaN)
             1));                          // (0.0, -0.0) or (NaN, !NaN)
}
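
The comments in that method mention signed zeros and NaN. As a small illustration, here is a sketch of how Float.compare differs from the primitive operators on those values. The class name FloatCompareDemo and the helper sign are made up for this example; only Float.compare and Integer.signum are JDK calls.

```java
final class FloatCompareDemo {
    // Hypothetical helper: reduce Float.compare to its sign (-1, 0 or 1).
    static int sign(float a, float b) {
        return Integer.signum(Float.compare(a, b));
    }

    public static void main(String[] args) {
        // The primitive operator treats the two zeros as equal...
        System.out.println(0.0f == -0.0f);                             // true
        // ...while Float.compare orders 0.0 after -0.0.
        System.out.println(sign(0.0f, -0.0f));                         // 1
        // NaN is never == to anything, not even itself...
        System.out.println(Float.NaN == Float.NaN);                    // false
        // ...but Float.compare treats NaN as equal to NaN
        System.out.println(sign(Float.NaN, Float.NaN));                // 0
        // and as greater than every other value, including +Infinity.
        System.out.println(sign(Float.NaN, Float.POSITIVE_INFINITY));  // 1
    }
}
```

Because every pair of floats gets a consistent answer, including a 0 for equal values, Float.compare is a safe building block for a Comparator.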

So I changed the author's code from:
Collections.sort(entryList, new Comparator<Map.Entry<String, Float>>()
{
    @Override
    public int compare(Map.Entry<String, Float> o1, Map.Entry<String, Float> o2)
    {
        return (o1.getValue() - o2.getValue() > 0 ? -1 : 1);
    }
});

to:
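Collections.sort(entryList, new Comparator<Map.Entry<String, Float>>()
{
    @Override
    public int compare(Map.Entry<String, Float> o1, Map.Entry<String, Float> o2)
    {
        // Compare o2 against o1 to keep the original descending order;
        // comparing o1 with itself would always return 0 and sort nothing.
        return Float.compare(o2.getValue(), o1.getValue());
    }
});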

With this change the exception no longer occurs.

Please take a look. I would suggest using this approach for every comparison of Float values in the code.
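
To make the equal-values case concrete, here is a minimal sketch, not the project's actual code. It checks the sign-symmetry rule from the Comparator contract (the sign of compare(a, b) must be the opposite of the sign of compare(b, a)) against a comparator that never returns 0 and against one built on Float.compare. The names ComparatorContractDemo, OLD, FIXED, isSymmetric and sortedKeys are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class ComparatorContractDemo {
    // Same logic as the comparator quoted above: it never returns 0.
    static final Comparator<Map.Entry<String, Float>> OLD =
        new Comparator<Map.Entry<String, Float>>() {
            @Override
            public int compare(Map.Entry<String, Float> o1, Map.Entry<String, Float> o2) {
                return (o1.getValue() - o2.getValue() > 0 ? -1 : 1);
            }
        };

    // Descending by value, returning 0 for equal values.
    static final Comparator<Map.Entry<String, Float>> FIXED =
        new Comparator<Map.Entry<String, Float>>() {
            @Override
            public int compare(Map.Entry<String, Float> o1, Map.Entry<String, Float> o2) {
                return Float.compare(o2.getValue(), o1.getValue());
            }
        };

    // Hypothetical check: sgn(compare(a, b)) must equal -sgn(compare(b, a)).
    static boolean isSymmetric(Comparator<Map.Entry<String, Float>> c,
                               Map.Entry<String, Float> a, Map.Entry<String, Float> b) {
        return Integer.signum(c.compare(a, b)) == -Integer.signum(c.compare(b, a));
    }

    // Hypothetical helper: sort a copy with FIXED and return the keys in order.
    static List<String> sortedKeys(List<Map.Entry<String, Float>> entries) {
        List<Map.Entry<String, Float>> copy = new ArrayList<Map.Entry<String, Float>>(entries);
        Collections.sort(copy, FIXED);
        List<String> keys = new ArrayList<String>();
        for (Map.Entry<String, Float> e : copy) {
            keys.add(e.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Float>> entries = new ArrayList<Map.Entry<String, Float>>();
        entries.add(new AbstractMap.SimpleEntry<String, Float>("a", 0.5f));
        entries.add(new AbstractMap.SimpleEntry<String, Float>("b", 0.9f));
        entries.add(new AbstractMap.SimpleEntry<String, Float>("c", 0.5f));

        // "a" and "c" have equal values: OLD claims each one sorts after the other.
        System.out.println(isSymmetric(OLD, entries.get(0), entries.get(2)));   // false
        System.out.println(isSymmetric(FIXED, entries.get(0), entries.get(2))); // true
        System.out.println(sortedKeys(entries));                                // [b, a, c]
    }
}
```

A comparator that breaks this rule does not fail on every input; TimSort only throws when it happens to detect the inconsistency during a merge, which is why the exception can depend on the data being sorted.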

hankcs added a commit that referenced this issue May 8, 2015
hankcs (Owner) commented May 8, 2015

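Thanks for pointing this out; it is fixed now.
I actually noticed this problem back when I first wrote TextRankKeyword, and the workaround I used at the time was: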

    public TextRankKeyword()
    {
        // jdk bug : Exception in thread "main" java.lang.IllegalArgumentException: Comparison method violates its general contract!
        System.setProperty("java.util.Arrays.useLegacyMergeSort", "true");
    }

However, it does not seem to have taken effect on your JDK 7.
Anyway, we had the same idea. I have now changed the sort in:
621178e
That resolves the problem.
