# Fractal Compression with ISM: A Novel Field-Theoretic Algorithm

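Welcome! This notebook introduces an **experimental approach to symbolic data compression** developed in the context of Dawn Field Theory and infodynamics. Data is split into fixed-size chunks, repeated chunks are collapsed into a shared codebook, and each codebook entry is annotated with structural metadata.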

You will learn how to:
- Break data into symbolic chunks (fractal/codebook style)
- Attach *intrinsic structural metadata* (ISM) to each chunk
- Analyze entropy and structure across your data
- Run and experiment with a blueprint implementation

**This is the first implementation of ISM-style field-theoretic compression in this repository.**


**Step 1: Import Requirements**

Run this cell to import necessary Python libraries.
If you get an error, run `pip install numpy matplotlib` in your environment.


In [None]:
import numpy as np
import hashlib
import zlib
import struct
import matplotlib.pyplot as plt


**Step 2: Data Chunking**

We'll break your data into equal-sized symbolic chunks. If the data length is not a multiple of `chunk_size`, the final chunk is padded with zero bytes.
You can change `chunk_size` in the next cell to see its impact.


In [None]:
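def chunkify(data, chunk_size=256):
    """Split data into fixed-size chunks, zero-padding the final chunk."""
    chunks = [data[i:i+chunk_size] for i in range(0, len(data), chunk_size)]
    # Empty input yields no chunks, so only pad when a last chunk exists
    if chunks and len(chunks[-1]) < chunk_size:
        chunks[-1] += b'\x00' * (chunk_size - len(chunks[-1]))
    return chunks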

# Demo: try with simple pattern
test_data = b'ABCD1234' * 16
chunks = chunkify(test_data, chunk_size=8)
print(f'Chunks: {len(chunks)}')
for i, c in enumerate(chunks):
    print(f'Chunk {i}:', c)
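
To get a feel for how `chunk_size` interacts with the period of the data, the following sketch counts total and distinct chunks for a few sizes. `split_padded` and `chunk_stats` are hypothetical helper names introduced here; `split_padded` is only a stand-in that mirrors the zero-padding behaviour of `chunkify` so the snippet runs on its own.

```python
def split_padded(data: bytes, chunk_size: int) -> list:
    """Illustrative stand-in for chunkify: fixed-size chunks, zero-padded tail."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if chunks and len(chunks[-1]) < chunk_size:
        chunks[-1] = chunks[-1].ljust(chunk_size, b"\x00")
    return chunks


def chunk_stats(data: bytes, sizes) -> dict:
    """Map each chunk size to (total chunks, distinct chunks)."""
    stats = {}
    for size in sizes:
        parts = split_padded(data, size)
        stats[size] = (len(parts), len(set(parts)))
    return stats


pattern = b"ABCD1234" * 16  # 128 bytes with an 8-byte period
stats = chunk_stats(pattern, (4, 8, 12, 16))
for size, (total, distinct) in stats.items():
    print(f"chunk_size={size:2d}: {total:2d} chunks, {distinct} distinct")
```

Sizes that line up with the 8-byte period give very few distinct chunks. A size such as 12 is misaligned with the period and also leaves a padded tail, so more distinct chunks appear.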


**Step 3: Fractal Codebook Construction**

Now we build a codebook of unique chunks, and represent the data as a sequence of codebook indices.
This mirrors symbolic collapse and recursion in field theory.


In [None]:
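def hash_chunk(chunk):
    """Return a 64-bit fingerprint (truncated SHA-256) of a chunk."""
    return hashlib.sha256(chunk).digest()[:8]

def build_fractal_index(chunks):
    """Return (codebook, sequence): unique chunks and one codebook index per input chunk."""
    index = {}      # fingerprint -> position in codebook
    codebook = []
    sequence = []
    for chunk in chunks:
        h = hash_chunk(chunk)
        match_idx = index.get(h)
        # A fingerprint hit counts as a match only if the bytes agree,
        # so a collision of the truncated hash cannot corrupt the data
        if match_idx is None or codebook[match_idx] != chunk:
            match_idx = len(codebook)
            index[h] = match_idx
            codebook.append(chunk)
        sequence.append(match_idx)
    return codebook, sequence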

codebook, sequence = build_fractal_index(chunks)
print(f'Unique codebook entries: {len(codebook)}')
print(f'Sequence: {sequence}')
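
The codebook and the index sequence together are a lossless description of the chunk list. A minimal sketch of that round trip, assuming a simplified index keyed on the chunk bytes themselves rather than on a fingerprint (`dedupe` and `rebuild` are hypothetical names, not part of the notebook's functions):

```python
def dedupe(chunks):
    """Return (codebook, sequence) with one codebook entry per distinct chunk."""
    index = {}      # chunk bytes -> position in codebook
    codebook = []
    sequence = []
    for chunk in chunks:
        if chunk not in index:
            index[chunk] = len(codebook)
            codebook.append(chunk)
        sequence.append(index[chunk])
    return codebook, sequence


def rebuild(codebook, sequence):
    """Invert dedupe by looking each index up in the codebook."""
    return [codebook[i] for i in sequence]


demo_chunks = [b"ABCD", b"1234", b"ABCD", b"WXYZ", b"1234"]
demo_codebook, demo_sequence = dedupe(demo_chunks)
print(demo_codebook)   # distinct chunks in first-seen order
print(demo_sequence)   # one index per input chunk
print(rebuild(demo_codebook, demo_sequence) == demo_chunks)
```

Savings come entirely from repetition: each repeated chunk costs one index instead of `chunk_size` bytes, while data with no repeated chunks gains nothing from the codebook.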


**Step 4: ISM Metadata Extraction**

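For each codebook entry, compute intrinsic structural metadata:

- Entropy: Shannon entropy of the chunk's byte histogram, in bits
- Centroid: mean byte value
- Power: mean squared byte value
- Dominant Frequency: index of the strongest bin in the chunk's real FFT

Run the next cell to compute these properties and plot the entropy and centroid of each entry.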


In [None]:
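def ism_metadata(codebook):
    """Return one (entropy, centroid, power, dom_freq) tuple per codebook entry."""
    metadata = []
    for chunk in codebook:
        arr = np.frombuffer(chunk, dtype=np.uint8)
        # Shannon entropy in bits; empty histogram bins are skipped so log2 is defined
        p = np.bincount(arr, minlength=256) / len(arr)
        entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        vals = arr.astype(np.float64)  # float copy: uint8 squares would wrap modulo 256
        centroid = np.mean(vals)
        power = np.mean(vals ** 2)
        # Remove the mean first; otherwise the DC bin always wins for non-negative bytes
        spectrum = np.abs(np.fft.rfft(vals - centroid))
        dom_freq = int(np.argmax(spectrum))
        metadata.append((entropy, centroid, power, dom_freq))
    return metadata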

metadata = ism_metadata(codebook)
print('ISM Metadata (first 3):', metadata[:3])

# Plot entropy and centroid
ent = [m[0] for m in metadata]
cent = [m[1] for m in metadata]
plt.figure(figsize=(6,2))
plt.subplot(1,2,1)
plt.plot(ent, marker='o'); plt.title('Entropy'); plt.xlabel('Chunk'); plt.ylabel('H')
plt.subplot(1,2,2)
plt.plot(cent, marker='o'); plt.title('Centroid'); plt.xlabel('Chunk'); plt.ylabel('Mean')
plt.tight_layout()
plt.show()
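
As a sanity check on the four metrics, here is a small sketch that evaluates them for a single 8-byte chunk whose values can be verified by hand. It makes a few choices of its own that are worth stating: `chunk_metrics` is a hypothetical helper, only byte values that actually occur enter the entropy sum, the bytes are cast to float before squaring, and the mean is removed before the FFT so that the DC bin does not dominate.

```python
import numpy as np


def chunk_metrics(chunk: bytes):
    """Return (entropy_bits, centroid, power, dominant_bin) for one chunk."""
    arr = np.frombuffer(chunk, dtype=np.uint8)
    counts = np.bincount(arr, minlength=256)
    p = counts[counts > 0] / len(arr)           # probabilities of bytes that occur
    entropy = float(-np.sum(p * np.log2(p)))    # Shannon entropy in bits
    vals = arr.astype(np.float64)               # avoid uint8 wrap-around when squaring
    centroid = float(vals.mean())
    power = float(np.mean(vals ** 2))
    spectrum = np.abs(np.fft.rfft(vals - centroid))  # mean removed, so bin 0 is ~0
    dominant_bin = int(np.argmax(spectrum))
    return entropy, centroid, power, dominant_bin


entropy, centroid, power, dominant_bin = chunk_metrics(b"ABCD1234")
print(entropy, centroid, power, dominant_bin)
```

Eight distinct byte values occurring once each give an entropy of 3 bits, and the mean of the eight ASCII codes is 58.5. The power is far above 255, which is why the squares cannot be accumulated in `uint8`.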


**Step 5: Full Compression & Decompression**

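Now let's pack the codebook, the index sequence, and the ISM metadata into a single zlib-compressed container and restore the original bytes from it. The ISM metadata travels with the container but is not needed for reconstruction.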

- Run the next cell on different data (structured, random, etc.)
- Confirm that decompression is exact (`restored == data`)
- Inspect compression ratio and codebook statistics


In [None]:
def fractal_compress(data, chunk_size=256):
    """Pack header, codebook, index sequence, and ISM metadata, then zlib-compress."""
    chunks = chunkify(data, chunk_size)
    codebook, sequence = build_fractal_index(chunks)
    if len(codebook) > 0xFFFF:
        raise ValueError('codebook exceeds 65535 entries; indices are stored as uint16')
    metadata = ism_metadata(codebook)
    # Big-endian throughout, so the container does not depend on host byte order
    sequence_bytes = np.array(sequence, dtype='>u2').tobytes()
    codebook_bytes = b''.join(codebook)
    meta_bytes = b''.join(struct.pack('>ffff', *m) for m in metadata)
    # Header: original length (uint32), chunk size (uint16), codebook size (uint16)
    header = struct.pack('>IHH', len(data), chunk_size, len(codebook))
    packed = header + codebook_bytes + sequence_bytes + meta_bytes
    return zlib.compress(packed), codebook, sequence, metadata

def fractal_decompress(packed):
    """Invert fractal_compress and return (restored, codebook, sequence, metadata)."""
    raw = zlib.decompress(packed)
    original_len, chunk_size, count = struct.unpack('>IHH', raw[:8])
    offset = 8
    codebook = [raw[offset+i*chunk_size:offset+(i+1)*chunk_size] for i in range(count)]
    offset += count * chunk_size
    meta_size = count * 16  # four float32 values per codebook entry
    sequence_length = (len(raw) - offset - meta_size) // 2
    sequence = np.frombuffer(raw[offset:offset + sequence_length * 2], dtype='>u2')
    meta_offset = offset + sequence_length * 2
    metadata = [struct.unpack('>ffff', raw[meta_offset+i*16:meta_offset+(i+1)*16]) for i in range(count)]
    restored = b''.join(codebook[i] for i in sequence)
    # Trim the zero padding that chunkify added to the final chunk
    return restored[:original_len], codebook, sequence, metadata

# Try the pipeline
data = (b'ABCD1234' * 1024) + (b'XYZ9876' * 512)
compressed, codebook, sequence, metadata = fractal_compress(data, chunk_size=32)
restored, _, _, _ = fractal_decompress(compressed)
print('Compression ratio (compressed/original):', len(compressed)/len(data))
print('Restored == Original:', restored == data)
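
The container built above consists of an 8-byte header, the codebook, one 16-bit index per chunk, and 16 bytes of metadata per codebook entry. The sketch below works out its size before zlib is applied, for the same demo data. `container_size` and the `layout_*` names are hypothetical, and counting distinct chunks with `set` is a shortcut standing in for the codebook construction.

```python
import struct

HEADER = struct.Struct(">IHH")  # original length, chunk size, codebook entries


def container_size(n_chunks: int, n_entries: int, chunk_size: int) -> int:
    """Uncompressed container size in bytes for the layout described above."""
    codebook_bytes = n_entries * chunk_size
    sequence_bytes = 2 * n_chunks        # one uint16 index per chunk
    metadata_bytes = 16 * n_entries      # four float32 values per entry
    return HEADER.size + codebook_bytes + sequence_bytes + metadata_bytes


layout_data = (b"ABCD1234" * 1024) + (b"XYZ9876" * 512)
layout_chunk_size = 32
layout_chunks = [
    layout_data[i:i + layout_chunk_size]
    for i in range(0, len(layout_data), layout_chunk_size)
]
layout_entries = len(set(layout_chunks))   # distinct chunks = codebook entries
layout_size = container_size(len(layout_chunks), layout_entries, layout_chunk_size)
print(len(layout_data), len(layout_chunks), layout_entries, layout_size)
```

For this input the index sequence, not the codebook, is the largest part of the container, which suggests that much of the remaining gain comes from the final zlib pass.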


**Step 6: Explore & Extend**

Try extending the pipeline above:
- Run it on larger data files (images, audio, text)
- Change `chunk_size`, or feed it random data
- Visualize all ISM metrics
- Compare with classic compressors (zlib, bz2, lzma)
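
For the comparison with classic compressors, a baseline along these lines may be useful. It uses only the standard-library `zlib`, `bz2`, and `lzma` modules at their default settings. `baseline_ratios` is a hypothetical helper, and the ratio is defined as compressed size divided by original size, the same convention as the ratio printed in Step 5.

```python
import bz2
import lzma
import zlib


def baseline_ratios(data: bytes) -> dict:
    """Compressed size / original size for three stdlib compressors."""
    compressors = {
        "zlib": zlib.compress,
        "bz2": bz2.compress,
        "lzma": lzma.compress,
    }
    return {name: len(fn(data)) / len(data) for name, fn in compressors.items()}


structured = (b"ABCD1234" * 1024) + (b"XYZ9876" * 512)
ratios = baseline_ratios(structured)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

Running the same helper on the notebook's other test inputs gives a reference point for judging whether the codebook stage adds anything beyond what the final zlib pass already achieves.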

---

**Blueprint for Expansion:**

- Add recursive or hierarchical chunking
- Use ISM metrics for adaptive transforms
- Integrate with field-theoretic tools and validation
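
One possible reading of "recursive or hierarchical chunking" is to apply the codebook idea a second time, to short runs of first-level indices. The sketch below is only an illustration of that reading under stated assumptions: `dedupe_items`, `two_level`, and `expand` are hypothetical names, and the group size of 2 is arbitrary.

```python
def dedupe_items(items):
    """Return (codebook, sequence) for any list of hashable items."""
    index, codebook, sequence = {}, [], []
    for item in items:
        if item not in index:
            index[item] = len(codebook)
            codebook.append(item)
        sequence.append(index[item])
    return codebook, sequence


def two_level(chunks, group=2):
    """Level 1 dedupes chunks; level 2 dedupes runs of `group` level-1 indices."""
    book1, seq1 = dedupe_items(chunks)
    groups = [tuple(seq1[i:i + group]) for i in range(0, len(seq1), group)]
    book2, seq2 = dedupe_items(groups)
    return book1, book2, seq2


def expand(book1, book2, seq2):
    """Invert two_level back to the original chunk list."""
    return [book1[i] for g in seq2 for i in book2[g]]


hier_chunks = [b"AB", b"CD"] * 4 + [b"EF"]
book1, book2, seq2 = two_level(hier_chunks)
print(book1, book2, seq2)
print(expand(book1, book2, seq2) == hier_chunks)
```

Here nine chunks reduce to a five-element top-level sequence. Whether a second level pays for its extra codebook depends on how much repetition survives the first level.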

This notebook is an open blueprint: extend and adapt it to explore field-theoretic compression further.
