luoyesiqiu/StatisticWords

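Introduction

Counts how often each word appears across all text files in a directory. Identifiers written in upper or lower camel case are split into their component words; for example, void setName(name); or void SetName(Name); is each broken down into 4 words. Only English words are supported, and single-letter words are ignored. The number of top-ranked entries to report is configurable, as are the file types to scan.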
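The camel-case splitting described above can be sketched in Python as follows. This is an illustrative sketch only, not the repository's own code: the names `split_words` and `count_words`, the `min_length` parameter, and the regular expression are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical pattern (an assumption, not taken from the repository):
# either a run of capitals that is not followed by a lowercase letter
# (an acronym such as "HTTP"), or an optionally capitalized lowercase word.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def split_words(text, min_length=2):
    """Split text into English words, breaking camel-case identifiers apart.

    Words shorter than min_length (single letters by default) are dropped.
    """
    return [word for word in _WORD.findall(text) if len(word) >= min_length]


def count_words(text, case_sensitive=True):
    """Count word occurrences, optionally folding case first."""
    words = split_words(text)
    if not case_sensitive:
        words = [word.lower() for word in words]
    return Counter(words)


print(split_words("void setName(name);"))  # ['void', 'set', 'Name', 'name']
print(split_words("void SetName(Name);"))  # ['void', 'Set', 'Name', 'Name']
```

Both sample declarations yield 4 words each, matching the description. In case-sensitive mode Name and name are counted separately, which is why pairs such as the/The can both appear in a ranking.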

Test Results

Results of scanning a Java source directory in case-sensitive mode:

Rank Word Frequency
1 the 311620
2 if 160965
3 int 147354
4 to 124752
5 ud 122707
6 return 120929
7 is 103377
8 of 97253
9 public 82258
10 code 80901
11 get 80374
12 in 78338
13 this 72584
14 for 66639
15 void 66632
16 const 65662
17 String 61459
18 and 60536
19 static 58577
20 be 52238
21 new 52176
22 value 50750
23 set 48107
24 define 46341
25 or 44280
26 final 44272
27 The 43007
28 null 40870
29 param 39200
30 ua 35429
31 Exception 35177
32 not 33046
33 that 32959
34 with 31674
35 char 31605
36 private 30663
37 name 30092
38 by 28578
39 else 28552
40 on 27356
41 data 27117
42 link 26914
43 type 26831
44 length 26330
45 an 25982
46 License 25953
47 class 25941
48 android 24778
49 udc 24752
50 Code 24273
51 This 24032
52 ude 23355
53 key 22997
54 from 22842
55 are 22709
56 Object 22363
57 result 22353
58 Unicode 22143
59 import 22070
60 as 21689
61 it 21530
62 td 21388
63 size 21351
64 Array 20837
65 status 20546
66 file 20100
67 case 20058
68 udd 19954
69 Type 19706
70 index 19660
71 View 19528
72 use 19258
73 include 18985
74 Name 18622
75 object 18118
76 start 18109
77 boolean 17693
78 Value 17223
79 will 16990
80 out 16947
81 error 16763
82 Set 16560
83 may 16362
84 To 16344
85 string 16319
86 err 16081
87 true 15680
88 throws 15655
89 endif 15619
90 unsigned 15532
91 long 15313
92 udf 15279
93 at 15106
94 ctx 14984
95 State 14939
96 Info 14749
97 If 14680
98 block 14453
99 false 14434
100 used 14247
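A ranking like the one above could be produced along the following lines. This is a hedged sketch under stated assumptions rather than the project's actual implementation: the function name `rank_words`, its parameters (`extensions`, `top_n`, `case_sensitive`), and the word pattern are hypothetical and chosen only to mirror the configurable options mentioned in the introduction.

```python
import os
import re
from collections import Counter

# Assumed word pattern: acronym runs or optionally capitalized words.
WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def rank_words(root, extensions=(".java",), top_n=100, case_sensitive=True):
    """Walk root, count words in files with matching extensions,
    and return the top_n (word, frequency) pairs in descending order."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Skip file types that were not requested.
            if not filename.endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, filename)
            # Ignore undecodable bytes so one bad file does not stop the scan.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                for line in handle:
                    for word in WORD.findall(line):
                        if len(word) < 2:  # single-letter words are ignored
                            continue
                        counts[word if case_sensitive else word.lower()] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Print the result in the same rank / word / frequency layout.
    for rank, (word, freq) in enumerate(rank_words("."), start=1):
        print(rank, word, freq)
```

Reading line by line keeps memory use flat on large source trees, and `Counter.most_common` handles the "top N" cut-off directly.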
