Early benchmarking of rete versus greedy pattern rewriter #2
Merged
Commits on Dec 11, 2021
Early benchmarking, rete is much slower
------------
Performance counter stats for '/home/bollu/work/1-hoopl/build/release/bin/hoopl --bench-rete /home/bollu/work/1-hoopl/test/rand-program-seed-0.mlir':

          2,684.85 msec task-clock       # 0.998 CPUs utilized
               379      context-switches # 141.163 /sec
                 1      cpu-migrations   # 0.372 /sec
             5,149      page-faults      # 1.918 K/sec
    8,30,20,19,196      cycles           # 3.092 GHz
    6,66,40,28,715      instructions     # 0.80 insn per cycle
    1,62,77,59,783      branches         # 606.277 M/sec
         45,28,371      branch-misses    # 0.28% of all branches

       2.691142846 seconds time elapsed
       2.353686000 seconds user
       0.319053000 seconds sys

------------
Performance counter stats for '/home/bollu/work/1-hoopl/build/release/bin/hoopl --bench-greedy /home/bollu/work/1-hoopl/test/rand-program-seed-0.mlir':

            268.81 msec task-clock       # 0.989 CPUs utilized
                59      context-switches # 219.487 /sec
                 0      cpu-migrations   # 0.000 /sec
             4,001      page-faults      # 14.884 K/sec
      83,33,77,093      cycles           # 3.100 GHz
      77,91,95,474      instructions     # 0.93 insn per cycle
      16,33,50,158      branches         # 607.683 M/sec
         25,91,387      branch-misses    # 1.59% of all branches

       0.271878822 seconds time elapsed
       0.122807000 seconds user
       0.146161000 seconds sys

----------------------------------
    36.98%  hoopl  hoopl             [.] alpha_memory_activation
    31.69%  hoopl  hoopl             [.] BetaTokensMemory::join_activation
    14.04%  hoopl  hoopl             [.] std::__cxx11::list<WME*, std::allocator<WME*> >::remove
     1.42%  hoopl  [kernel.vmlinux]  [k] syscall_exit_to_user_mode
     0.97%  hoopl  ld-2.33.so        [.] do_lookup_x
     0.74%  hoopl  [kernel.vmlinux]  [k] entry_SYSCALL_64
     0.73%  hoopl  [kernel.vmlinux]  [k] syscall_return_via_sysret
     0.46%  hoopl  [kernel.vmlinux]  [k] preempt_count_add
     0.43%  hoopl  [kernel.vmlinux]  [k] _raw_read_unlock_irqrestore
     0.41%  hoopl  ld-2.33.so        [.] strcmp
     0.40%  hoopl  [kernel.vmlinux]  [k] n_tty_write
     0.31%  hoopl  [kernel.vmlinux]  [k] ep_poll_callback
     0.30%  hoopl  [kernel.vmlinux]  [k] tty_write
     0.28%  hoopl  hoopl             [.] toRete
     0.28%  hoopl  [kernel.vmlinux]  [k] _raw_spin_lock_irqsave
     0.26%  hoopl  [kernel.vmlinux]  [k] preempt_count_sub
     0.25%  hoopl  [kernel.vmlinux]  [k] _raw_read_lock_irqsave
     0.24%  hoopl  [kernel.vmlinux]  [k] __wake_up_common
     0.23%  hoopl  [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
     0.18%  hoopl  libc-2.33.so      [.] _int_malloc
     0.17%  hoopl  [kernel.vmlinux]  [k] _raw_spin_unlock_irqrestore
     0.16%  hoopl  [kernel.vmlinux]  [k] queue_work_on
     0.14%  hoopl  [kernel.vmlinux]  [k] __fsnotify_parent
     0.14%  hoopl  [kernel.vmlinux]  [k] apparmor_file_permission
     0.14%  hoopl  [kernel.vmlinux]  [k] vfs_write
     0.14%  hoopl  [kernel.vmlinux]  [k] insert_work
     0.13%  hoopl  [kernel.vmlinux]  [k] __audit_syscall_exit
     0.13%  hoopl  [kernel.vmlinux]  [k] __check_object_size
     0.13%  hoopl  [kernel.vmlinux]  [k] update_rq_clock
     0.13%  hoopl  [kernel.vmlinux]  [k] try_to_wake_up
     0.12%  hoopl  [kernel.vmlinux]  [k] pty_write
     0.12%  hoopl  [kernel.vmlinux]  [k] resched_curr
     0.12%  hoopl  [kernel.vmlinux]  [k] tty_insert_flip_string_fixed_flag
     0.11%  hoopl  ld-2.33.so        [.] _dl_map_object
     0.11%  hoopl  [kernel.vmlinux]  [k] select_task_rq_fair
Commit bc96ee9
Commits on Dec 12, 2021
Commit 34d8537
Commit cf9fc09
Commit 1e5905f
Commit 7d2c79e
Commits on Dec 15, 2021
Commit 2428998
[WIP] accelerate join node by keeping caches
For a test node (test-join WME[wme-ix] == Token[tok-ix][tok-field-ix]), keep data structures:

- val2WMEs: value -> set<WME> (invariant: ∀ wme ∈ val2WMEs[v], wme[wme-ix] == v)
- val2Toks: value -> set<Token> (invariant: ∀ tok ∈ val2Toks[v], tok[tok-ix][tok-field-ix] == v)

This lets us process new tokens / new WMEs in O(# of real joins).

This makes our perf report look like:

```
    43.01%  hoopl  hoopl             [.] std::__cxx11::list<WME*, std::allocator<WME*> >::remove
     5.89%  hoopl  [kernel.vmlinux]  [k] syscall_exit_to_user_mode
     3.54%  hoopl  ld-2.33.so        [.] do_lookup_x
     2.64%  hoopl  [kernel.vmlinux]  [k] syscall_return_via_sysret
     2.21%  hoopl  [kernel.vmlinux]  [k] entry_SYSCALL_64
     1.47%  hoopl  ld-2.33.so        [.] strcmp
     1.35%  hoopl  [kernel.vmlinux]  [k] n_tty_write
     1.23%  hoopl  [kernel.vmlinux]  [k] _raw_spin_lock_irqsave
     1.13%  hoopl  hoopl             [.] JoinNode::alpha_activation
    ...
```

Now replace the huge list of WMEs in `ReteContext` with something more efficient, like a set for quick removal.
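A minimal sketch of the two caches described above, not the code in this PR. The `WME`, `Token`, and `JoinIndex` types and their fields are hypothetical stand-ins, and field values are assumed to be plain `int`s. It shows how keying both sides on the joined value lets a new WME or token touch only its real join partners, and how a hash set gives expected constant-time removal instead of the linear scan done by `std::list::remove`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the project's WME and Token types.
struct WME {
  std::vector<int> fields;         // wme[i] is fields[i]
};
struct Token {
  std::vector<const WME*> wmes;    // tok[i][j] is wmes[i]->fields[j]
};

// Caches for one join test: WME[wme_ix] == Token[tok_ix][tok_field_ix].
struct JoinIndex {
  using Match = std::pair<const Token*, const WME*>;

  std::size_t wme_ix, tok_ix, tok_field_ix;
  // Invariant: every w in val2WMEs[v] has w->fields[wme_ix] == v.
  std::unordered_map<int, std::unordered_set<const WME*>> val2WMEs;
  // Invariant: every t in val2Toks[v] has t->wmes[tok_ix]->fields[tok_field_ix] == v.
  std::unordered_map<int, std::unordered_set<const Token*>> val2Toks;

  JoinIndex(std::size_t w, std::size_t t, std::size_t f)
      : wme_ix(w), tok_ix(t), tok_field_ix(f) {}

  // Record a new WME and return only the tokens it actually joins with.
  std::vector<Match> add_wme(const WME* w) {
    const int v = w->fields.at(wme_ix);
    val2WMEs[v].insert(w);
    std::vector<Match> out;
    auto it = val2Toks.find(v);
    if (it != val2Toks.end())
      for (const Token* t : it->second) out.emplace_back(t, w);
    return out;
  }

  // Record a new token and return only the WMEs it actually joins with.
  std::vector<Match> add_token(const Token* t) {
    const int v = t->wmes.at(tok_ix)->fields.at(tok_field_ix);
    val2Toks[v].insert(t);
    std::vector<Match> out;
    auto it = val2WMEs.find(v);
    if (it != val2WMEs.end())
      for (const WME* w : it->second) out.emplace_back(t, w);
    return out;
  }

  // Expected O(1) removal, unlike the O(n) scan of std::list::remove.
  void remove_wme(const WME* w) {
    auto it = val2WMEs.find(w->fields.at(wme_ix));
    if (it == val2WMEs.end()) return;
    it->second.erase(w);
    if (it->second.empty()) val2WMEs.erase(it);
  }
};
```

With a `JoinIndex idx(0, 0, 1)`, calling `idx.add_wme(&w)` and later `idx.add_token(&t)` returns a match only when `w.fields[0]` equals `t.wmes[0]->fields[1]`. The same set-based storage is one possible shape for the WME collection in `ReteContext`, assuming iteration order over WMEs does not matter there.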
Commit 9459eae
We keep up with greedy rewriter in asymptotics.
Greedy: 0.45 / rete: 0.76

I need to bench how much of the difference comes from `fromRete`, which spends a while rematerializing the internal rete state out into MLIR.

----
Performance counter stats for '/home/bollu/work/1-hoopl/build/release/bin/hoopl --bench-greedy /home/bollu/work/1-hoopl/test/rand-program-seed-0.mlir':

            492.89 msec task-clock       # 0.999 CPUs utilized
                 2      context-switches # 4.058 /sec
                 0      cpu-migrations   # 0.000 /sec
            20,683      page-faults      # 41.963 K/sec
     1,654,417,515      cycles           # 3.357 GHz
     2,501,363,745      instructions     # 1.51 insn per cycle
       533,254,192      branches         # 1.082 G/sec
         4,885,449      branch-misses    # 0.92% of all branches

       0.493355440 seconds time elapsed
       0.452724000 seconds user
       0.039877000 seconds sys

-----
Performance counter stats for '/home/bollu/work/1-hoopl/build/release/bin/hoopl --bench-rete /home/bollu/work/1-hoopl/test/rand-program-seed-0.mlir':

            761.01 msec task-clock       # 0.999 CPUs utilized
                 9      context-switches # 11.826 /sec
                 0      cpu-migrations   # 0.000 /sec
            41,724      page-faults      # 54.827 K/sec
     2,496,300,059      cycles           # 3.280 GHz
     3,568,834,203      instructions     # 1.43 insn per cycle
       772,742,616      branches         # 1.015 G/sec
         6,290,441      branch-misses    # 0.81% of all branches

       0.761639491 seconds time elapsed
       0.707385000 seconds user
       0.053313000 seconds sys
Commit 4ccd0f3
Commit e80cff7
Commit 12c9565
Commit ebcc93b