nekronos-gh/rule110

Rule 110 Fast Algorithm

A high-performance implementation of the Rule 110 cellular automaton using AVX2 SIMD instructions and bit-manipulation techniques.

Overview

This project implements an optimized Rule 110 simulator that uses Intel AVX2 intrinsics to outperform traditional lookup-table approaches. The implementation keeps the operation count low by simplifying the update rule with Boolean algebra and by packing cells into bits so that many cells are updated in parallel.
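To make the Boolean simplification concrete, here is a minimal sketch that compares a lookup-table form of Rule 110 with one possible three-operation form, `(center XOR right) OR (NOT left AND center)`. The function names are hypothetical and are not taken from this repository, and the exact expression used by the project may differ.

```cpp
#include <cstdint>

// Lookup-table form: bit k of the number 110 (0b01101110) is the next
// state for the neighborhood with value k = 4*left + 2*center + right.
constexpr bool rule110_lut(bool left, bool center, bool right) {
    constexpr unsigned kRule = 110;
    const unsigned k = (static_cast<unsigned>(left) << 2) |
                       (static_cast<unsigned>(center) << 1) |
                       static_cast<unsigned>(right);
    return ((kRule >> k) & 1u) != 0;
}

// Simplified form for a single cell (assumed, not the project's code).
constexpr bool rule110_bool(bool left, bool center, bool right) {
    return (center != right) || (!left && center);
}

// The same expression on 64 cells at once: each argument holds one
// neighbor bit per cell, already aligned to the cell's bit position.
constexpr std::uint64_t rule110_word(std::uint64_t left,
                                     std::uint64_t center,
                                     std::uint64_t right) {
    return (center ^ right) | (~left & center);
}

// Exhaustively checks that both single-cell forms agree.
constexpr bool forms_agree() {
    for (unsigned k = 0; k < 8; ++k) {
        const bool l = (k & 4u) != 0;
        const bool c = (k & 2u) != 0;
        const bool r = (k & 1u) != 0;
        if (rule110_lut(l, c, r) != rule110_bool(l, c, r)) return false;
    }
    return true;
}

static_assert(forms_agree(), "Boolean form must match the Rule 110 table");
```

Because the word-level version uses only bitwise operators, the same expression carries over to wider registers without any per-cell table lookup.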

Video Explanation

Performance Characteristics

  • Memory Access: Aligned 256-bit loads/stores for optimal cache performance
  • Bit Operations: Update rule reduced to 3 logical operations (XOR, ANDNOT, OR), each applied to all packed cells at once
  • Parallelism: Processes 256 cells simultaneously per AVX2 operation
  • Scalability: OpenMP parallelization across multiple groups
┌──────────┬──────────┬──────────┬──────────┐
│  Lane 0  │  Lane 1  │  Lane 2  │  Lane 3  │
│ (64 bits)│ (64 bits)│ (64 bits)│ (64 bits)│
└──────────┴──────────┴──────────┴──────────┘
↑----------------- 256 bits ----------------↑
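The lane layout above can be modeled in portable C++ as four 64-bit words. The sketch below is illustrative only: the `Block` type, the `step_block` function, the bit ordering, and the dead-cell boundary are all assumptions rather than the repository's actual code. It shows why neighbor bits have to be carried between lanes by hand, since AVX2 64-bit shifts such as `_mm256_slli_epi64` do not move bits across lane boundaries. In an AVX2 build, the three logical operations would correspond to `_mm256_xor_si256`, `_mm256_andnot_si256`, and `_mm256_or_si256`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one 256-bit block: four 64-bit lanes, lane 0 first.
// Assumed convention: cell i lives in bit (i % 64) of lane (i / 64), and
// the left neighbor of cell i is cell i + 1 (the more significant bit).
using Block = std::array<std::uint64_t, 4>;

// Advances one block by one generation. Cells outside the block are
// treated as dead (0); a full simulator would carry bits between blocks.
inline Block step_block(const Block& cur) {
    Block next{};
    for (std::size_t lane = 0; lane < cur.size(); ++lane) {
        const std::uint64_t c = cur[lane];
        // Bits that cross a lane boundary come from the adjacent lanes.
        const std::uint64_t from_above =
            (lane + 1 < cur.size()) ? (cur[lane + 1] << 63) : 0;
        const std::uint64_t from_below =
            (lane > 0) ? (cur[lane - 1] >> 63) : 0;
        const std::uint64_t left  = (c >> 1) | from_above;  // bit i = cell i + 1
        const std::uint64_t right = (c << 1) | from_below;  // bit i = cell i - 1
        // Three logical operations: XOR, ANDNOT, OR.
        next[lane] = (c ^ right) | (~left & c);
    }
    return next;
}
```

Starting from a single live cell in bit 0 of lane 0 and calling `step_block` repeatedly produces the familiar leftward-growing Rule 110 triangle under this bit ordering.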

Requirements

  • C++20 compiler (GCC 11+ or Clang 12+)
  • AVX2-capable CPU (Intel Haswell 2013+ or AMD Excavator 2015+)
  • OpenMP support
