Disambiguate .t files instead of directly assuming Perl #20

Closed
wants to merge 1 commit into from

3 participants

@robinst

This is the reason Mercurial is wrongly classified as "mostly Perl" on
Ohloh[1]. It uses ".t" files for test cases, which are wrapped shell
scripts and not Perl.

It would be even better if Ohcount would detect them as shell scripts,
but that seems hard given that the contents (see e.g. [2]) don't have
many defining characteristics.

[1] https://www.ohloh.net/p/mercurial/analyses/latest/languages_summary
[2] http://selenic.com/hg/file/ccd28eca37f6/tests/test-add.t
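
The idea behind the change can be shown in isolation. What follows is only a minimal sketch in plain C, not the patch itself: `has_perl_shebang` is a hypothetical name, and it uses a hand-written scan instead of PCRE. It treats a buffer as Perl only when the first line is a shebang that mentions "perl" within the first 100 bytes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Ohcount): returns 1 if the first line of
 * `contents` is a shebang mentioning "perl" (case-insensitive), else 0. */
int has_perl_shebang(const char *contents) {
  const char *needle = "perl";
  size_t nlen = strlen(needle);

  if (!contents || contents[0] != '#' || contents[1] != '!')
    return 0;

  /* Scan the first line only, and only within the first 100 bytes. */
  for (size_t i = 2; i + nlen <= 100 && contents[i] && contents[i] != '\n'; i++) {
    size_t j = 0;
    while (j < nlen && contents[i + j] &&
           tolower((unsigned char)contents[i + j]) == needle[j])
      j++;
    if (j == nlen)
      return 1;
  }
  return 0;
}
```

Under this sketch, a Mercurial-style test that starts with an indented `$ hg init a` line would not be reported as Perl, while a file starting with `#!/usr/bin/perl -w` would.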

@robinst robinst Disambiguate .t files instead of directly assuming Perl
6be21cf
@amujumdar

@robinst Thanks! Would it be possible for you to add test cases?

@robinst

I'll have to figure out how. I'm getting the following error with ./build, even without my commit:

Running detector tests
run_tests: test/unit/detector_test.h:119: test_detector_detect_polyglot: Assertion `lang' failed.
@amujumdar

Tests are working for me on CentOS 5.5.

@psybers

I get a similar test error building on Ubuntu 12.04 on the latest git master:

Running detector tests
run_tests: test/unit/detector_test.h:119: test_detector_detect_polyglot: Assertion `lang' failed.
Aborted (core dumped)

Generating Ruby bindings for i686-linux_ubuntu
ruby/ohcount_wrap.c: In function ‘SWIG_Ruby_define_class’:
ruby/ohcount_wrap.c:1493:9: warning: variable ‘klass’ set but not used [-Wunused-but-set-variable]
Loaded suite ruby_test
Started
............................................................................EE.
Finished in 1.416128 seconds.

1) Error:
test_diff(SourceFileTest):
NoMethodError: undefined method `language' for nil:NilClass
./source_file_test.rb:12:in `test_diff'

2) Error:
test_empty_diff(SourceFileTest):
NoMethodError: undefined method `language' for nil:NilClass
./source_file_test.rb:23:in `test_empty_diff'

79 tests, 156 assertions, 0 failures, 2 errors

@amujumdar

I asked around about .t files, and my colleagues who use Perl disagree that a .t file without a Perl shebang should be treated as non-Perl by default. Since the files you mention are Mercurial test files, which are checked into the tests/ folder, I'd suggest we use Ohloh's ignore-directories feature to exclude that directory. Many projects ignore test files and assets the same way.

@amujumdar

After ignoring the tests/ directory, Mercurial now shows up as mostly written in Python: https://www.ohloh.net/p/mercurial/analyses/latest/languages_summary

@amujumdar

@psybers Which Ruby version are you running on Ubuntu 12.04? Ohcount supports 1.8.7.

@amujumdar amujumdar closed this
@robinst

OK, this solution is also fine by me. Thanks for taking the time to look into this!

Commits on Sep 22, 2012
  1. @robinst

    Disambiguate .t files instead of directly assuming Perl

    robinst authored
16 src/detector.c
@@ -901,6 +901,22 @@ const char *disambiguate_st(SourceFile *sourcefile) {
return NULL;
}
+const char *disambiguate_t(SourceFile *sourcefile) {
+  char *contents = ohcount_sourcefile_get_contents(sourcefile);
+  if (!contents)
+    return NULL;
+
+  // Check for a perl shebang on first line of file
+  const char *error;
+  int erroffset;
+  pcre *re = pcre_compile("#![^\\n]*perl", PCRE_CASELESS, &error, &erroffset, NULL);
+  if (!re) // Pattern failed to compile; treat as undetermined
+    return NULL;
+  int rc = pcre_exec(re, NULL, contents, mystrnlen(contents, 100), 0, PCRE_ANCHORED, NULL, 0);
+  pcre_free(re); // Release the compiled pattern to avoid leaking it on every call
+  if (rc > -1)
+    return LANG_PERL;
+
+  // May be something else, e.g. a test shell script
+  return NULL;
+}
+
+
int ohcount_is_binary_filename(const char *filename) {
char *p = (char *)filename + strlen(filename);
while (p > filename && *(p - 1) != '.') p--;
2  src/hash/disambiguatefuncs.gperf
@@ -18,6 +18,7 @@ const char *disambiguate_pp(SourceFile *sourcefile);
const char *disambiguate_pro(SourceFile *sourcefile);
const char *disambiguate_r(SourceFile *sourcefile);
const char *disambiguate_st(SourceFile *sourcefile);
+const char *disambiguate_t(SourceFile *sourcefile);
%}
struct DisambiguateFuncsMap { const char *key; const char* (*value)(SourceFile*); };
%%
@@ -37,3 +38,4 @@ pp, disambiguate_pp
pro, disambiguate_pro
r, disambiguate_r
st, disambiguate_st
+t, disambiguate_t
2  src/hash/extensions.gperf
@@ -194,7 +194,7 @@ svg, BINARY
svgz, BINARY
svn, BINARY
swf, BINARY
-t, LANG_PERL
+t, DISAMBIGUATE("t")
tar, BINARY
tcl, LANG_TCL
tex, LANG_TEX
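
To make the shape of the change easier to follow, here is a rough sketch of the dispatch it sets up: an extension maps either to a fixed language or to a content-based disambiguation function. This is an illustration under assumptions, not Ohcount's code. Ohcount uses gperf-generated hash tables and `LANG_*` constants, whereas this sketch uses a small linear table and plain strings, and every name in it (`ext_entry`, `detect_t`, `detect_language`) is hypothetical. The shebang check is also simplified to a case-sensitive search.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: inspects file contents and returns a language
 * name, or NULL when it cannot decide. */
typedef const char *(*detect_fn)(const char *contents);

struct ext_entry {
  const char *ext;        /* extension without the dot */
  const char *language;   /* fixed language, or NULL if ambiguous */
  detect_fn disambiguate; /* content check used when language is NULL */
};

/* Simplified stand-in for the .t check: Perl only if line 1 is a shebang
 * that contains "perl" (case-sensitive here, unlike the patch). */
const char *detect_t(const char *contents) {
  if (!contents || strncmp(contents, "#!", 2) != 0)
    return NULL;
  const char *eol = strchr(contents, '\n');
  const char *hit = strstr(contents, "perl");
  return (hit && (!eol || hit < eol)) ? "perl" : NULL;
}

static const struct ext_entry EXT_TABLE[] = {
  {"tcl", "tcl", NULL},
  {"tex", "tex", NULL},
  {"t", NULL, detect_t}, /* previously a fixed Perl mapping */
};

/* Linear lookup by extension; returns NULL for unknown or undecided files. */
const char *detect_language(const char *ext, const char *contents) {
  for (size_t i = 0; i < sizeof EXT_TABLE / sizeof EXT_TABLE[0]; i++) {
    if (strcmp(EXT_TABLE[i].ext, ext) == 0) {
      if (EXT_TABLE[i].language)
        return EXT_TABLE[i].language;
      return EXT_TABLE[i].disambiguate(contents);
    }
  }
  return NULL;
}
```

The trade-off raised in the discussion is visible here: once `t` goes through a content check, a Perl test file without a shebang comes back as undecided (NULL) rather than Perl.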