Implement Direct Threaded VM described in #51. Improving ~49%. #52

Merged
merged 3 commits into from
Oct 2, 2016


KeenS
Contributor

@KeenS KeenS commented May 25, 2015

Here are the benchmark scores. The benchmark suite is derived from http://sljit.sourceforge.net/regex_perf.html.

Pattern                               Master    This PR   Improve Rate
Twain                                 47 ms     47 ms     0%
^Twain                                47 ms     47 ms     0%
Twain$                                47 ms     47 ms     0%
Huck[a-zA-Z]+|Finn[a-zA-Z]+           127 ms    127 ms    0%
a[^x]{20}b                            1172 ms   889 ms    31%
Tom|Sawyer|Huckleberry|Finn           151 ms    153 ms    -1%
.{0,3}(Tom|Sawyer|Huckleberry|Finn)   497 ms    449 ms    10%
[a-zA-Z]+ing                          4032 ms   2705 ms   49%
^[a-zA-Z]{0,4}ing[^a-zA-Z]            96 ms     98 ms     -2%
[a-zA-Z]+ing$                         4175 ms   2797 ms   49%
^[a-zA-Z ]{5,}$                       1770 ms   1623 ms   9%
^.{16,20}$                            1757 ms   1637 ms   7%
([a-f](.[d-m].){0,2}[h-n]){2}         1849 ms   1670 ms   11%
([A-Za-z]awyer|[A-Za-z]inn)[^a-zA-Z]  656 ms    607 ms    8%
"[^"]{0,30}[?!\.]"                    115 ms    93 ms     24%
Tom.{10,25}river|river.{10,25}Tom     260 ms    262 ms    -1%
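
The "Improve Rate" figures look consistent with a speedup ratio, that is, the master time divided by the PR time, minus one. That reading is an assumption inferred from the numbers, not something stated in the PR, and the last digit can differ by one because the timings shown are already rounded to whole milliseconds. A minimal sketch, using a hypothetical helper name `improvement_rate`:

```c
#include <assert.h>

/* Hypothetical helper (not part of the PR): percentage speedup of the
   PR build relative to master, assuming rate = (master / pr - 1) * 100.
   A negative result means the PR build was slower. */
double improvement_rate(double master_ms, double pr_ms)
{
    assert(pr_ms > 0.0);  /* guard against division by zero */
    return (master_ms / pr_ms - 1.0) * 100.0;
}
```

For example, `improvement_rate(4032, 2705)` comes out at roughly 49, which matches the `[a-zA-Z]+ing` row and the figure in the PR title.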

Env:

$ uname -a
Linux Dynabook 3.19.0-18-generic #18-Ubuntu SMP Tue May 19 18:31:35 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 37
model name  : Intel(R) Core(TM) i5 CPU       M 450  @ 2.40GHz
stepping    : 5
microcode   : 0x2
cpu MHz     : 1199.000
cache size  : 3072 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bugs        :
bogomips    : 4788.65
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 37
model name  : Intel(R) Core(TM) i5 CPU       M 450  @ 2.40GHz
stepping    : 5
microcode   : 0x2
cpu MHz     : 1199.000
cache size  : 3072 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bugs        :
bogomips    : 4788.65
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 2
vendor_id   : GenuineIntel
cpu family  : 6
model       : 37
model name  : Intel(R) Core(TM) i5 CPU       M 450  @ 2.40GHz
stepping    : 5
microcode   : 0x2
cpu MHz     : 2133.000
cache size  : 3072 KB
physical id : 0
siblings    : 4
core id     : 2
cpu cores   : 2
apicid      : 4
initial apicid  : 4
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bugs        :
bogomips    : 4788.65
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model       : 37
model name  : Intel(R) Core(TM) i5 CPU       M 450  @ 2.40GHz
stepping    : 5
microcode   : 0x2
cpu MHz     : 1199.000
cache size  : 3072 KB
physical id : 0
siblings    : 4
core id     : 2
cpu cores   : 2
apicid      : 5
initial apicid  : 5
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bugs        :
bogomips    : 4788.65
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

$ cat /proc/meminfo
MemTotal:        7965524 kB
MemFree:         6007232 kB
MemAvailable:    6635616 kB
Buffers:          147344 kB
Cached:           791320 kB
SwapCached:            0 kB
Active:          1170092 kB
Inactive:         614612 kB
Active(anon):     848752 kB
Inactive(anon):   164980 kB
Active(file):     321340 kB
Inactive(file):   449632 kB
Unevictable:          64 kB
Mlocked:              64 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               856 kB
Writeback:             0 kB
AnonPages:        846212 kB
Mapped:           257944 kB
Shmem:            167692 kB
Slab:              81000 kB
SReclaimable:      52676 kB
SUnreclaim:        28324 kB
KernelStack:        7824 kB
PageTables:        27924 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     3982760 kB
Committed_AS:    4154020 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      551880 kB
VmallocChunk:   34359178716 kB
HardwareCorrupted:     0 kB
AnonHugePages:    327680 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       99392 kB
DirectMap2M:     8079360 kB

$ gcc --version
gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@coveralls

Coverage Status

Coverage decreased (-0.01%) to 79.77% when pulling d18ec9a on KeenS:master into a0e7f59 on k-takata:master.

@KeenS
Contributor Author

KeenS commented Jul 28, 2015

Note that direct threading is enabled only when USE_DIRECT_THREAD_VM is defined. You need to choose how to enable the option: one way is to make it a compile-time option, and another is to auto-detect the environment, as picrin does.
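
The two ways of enabling the option can be combined in a few preprocessor lines. The fragment below is only a sketch under assumptions: it treats the presence of `__GNUC__` (defined by GCC and Clang, both of which support the labels-as-values extension) as the auto-detection test, and `threaded_vm_enabled` is a hypothetical helper added here so the result can be observed. It is not the project's actual header, and it does not claim to reproduce what picrin does.

```c
/* Hypothetical configuration fragment; not the project's actual header. */

/* Option 1: the build passes -DUSE_DIRECT_THREAD_VM explicitly.
   Option 2: if it did not, enable the option automatically on compilers
   that define __GNUC__ (GCC, Clang), which support labels as values. */
#if !defined(USE_DIRECT_THREAD_VM) && defined(__GNUC__)
# define USE_DIRECT_THREAD_VM 1
#endif

/* Hypothetical helper reporting which dispatch loop was compiled in. */
int threaded_vm_enabled(void)
{
#ifdef USE_DIRECT_THREAD_VM
    return 1;
#else
    return 0;
#endif
}
```

With this layout an explicit compile option always wins, and auto-detection only fills in the default when nothing was specified.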

@k-takata
Owner

Thank you for the information.
I think this is very useful, but I haven't tried it yet.

nurse added a commit to ruby/ruby that referenced this pull request Nov 26, 2015
  Merge Onigmo 58fa099ed1a34367de67fb3d06dd48d076839692
  + k-takata/Onigmo#52

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
mrkn pushed a commit to mrkn/ruby that referenced this pull request Apr 17, 2016
  Merge Onigmo 58fa099ed1a34367de67fb3d06dd48d076839692
  + k-takata/Onigmo#52

git-svn-id: svn+ssh://svn.ruby-lang.org/ruby/trunk@52756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
@k-takata k-takata merged commit 919810e into k-takata:master Oct 2, 2016
@k-takata
Owner

k-takata commented Oct 2, 2016

Thank you and sorry for the very late response. Merged.

@KeenS
Contributor Author

KeenS commented Oct 3, 2016

🎉

@k-takata k-takata mentioned this pull request Oct 11, 2016
@k-takata
Owner

k-takata commented Dec 6, 2016

I read the article about YARV and Direct Threaded Code again.
http://magazine.rubyist.net/?0008-YarvManiacs#l8
Isn't your implementation Token Threaded Code instead of Direct Threaded Code?
I'm going to rename the definition USE_DIRECT_THREADED_VM to USE_TOKEN_THREADED_VM.
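
For readers unfamiliar with the distinction, here is a minimal sketch of both dispatch styles on a toy three-instruction VM. Everything in it (the opcodes, `sample_program`, `run_token_threaded`, `run_direct_threaded`, `MAX_CODE`) is hypothetical and written for illustration; it is not the code from this PR. It relies on the GNU C labels-as-values extension, so it needs GCC or Clang. In token threading the code stream holds opcode tokens and every dispatch goes through a label table; in direct threading the code stream holds the handler addresses themselves.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_INC, OP_DEC, OP_HALT, OP_COUNT };
enum { MAX_CODE = 64 };

/* Hypothetical sample program: three increments, one decrement, halt. */
const unsigned char sample_program[] = { OP_INC, OP_INC, OP_INC, OP_DEC, OP_HALT };

/* Token threaded: the program stays as opcode bytes, and each dispatch
   looks the opcode up in a table of label addresses. */
int run_token_threaded(const unsigned char *code, size_t n)
{
    static void *const labels[OP_COUNT] = { &&L_INC, &&L_DEC, &&L_HALT };
    const unsigned char *pc = code;
    size_t i;
    int acc = 0;

    assert(n > 0 && code[n - 1] == OP_HALT);        /* must terminate */
    for (i = 0; i < n; i++) assert(code[i] < OP_COUNT);

#define NEXT() goto *labels[*pc++]
    NEXT();
L_INC:  acc++; NEXT();
L_DEC:  acc--; NEXT();
L_HALT: return acc;
#undef NEXT
}

/* Direct threaded: the program is first translated so that each slot
   holds a handler address; dispatch then jumps through the slot with
   no table lookup. */
int run_direct_threaded(const unsigned char *code, size_t n)
{
    static void *const labels[OP_COUNT] = { &&L_INC, &&L_DEC, &&L_HALT };
    void *threaded[MAX_CODE];
    void **pc = threaded;
    size_t i;
    int acc = 0;

    assert(n > 0 && n <= MAX_CODE && code[n - 1] == OP_HALT);
    for (i = 0; i < n; i++) {
        assert(code[i] < OP_COUNT);
        threaded[i] = labels[code[i]];              /* opcode -> address */
    }

#define NEXT() goto **pc++
    NEXT();
L_INC:  acc++; NEXT();
L_DEC:  acc--; NEXT();
L_HALT: return acc;
#undef NEXT
}
```

Both functions return 2 for `sample_program`. The only difference is what the instruction stream contains, which is why a loop that dispatches with `goto *table[opcode]` over unmodified bytecode is token threaded rather than direct threaded.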

k-takata added a commit that referenced this pull request Dec 6, 2016
This is token threaded VM, not direct threaded VM.
See: #52
k-takata added a commit that referenced this pull request Dec 6, 2016
PR #52 was actually a token threaded VM.
@KeenS
Contributor Author

KeenS commented Dec 7, 2016

To tell the truth, I'm not familiar with threaded VMs, but I think you are right. It seems I wrote token threaded VM code.
