# Array Compression and Serialization Example

This notebook demonstrates how to use the `ArrayUtils` class in meshly to compress and serialize NumPy arrays. It covers:

1. Encoding and decoding arrays using meshoptimizer compression
2. Comparing compression ratios across different kinds of data
3. Saving and loading arrays to/from zip files, on disk and in memory
4. Working with different data types
5. Performance comparison with uncompressed arrays
6. Cross-platform compatibility with TypeScript

In [2]:
import os
import tempfile
from io import BytesIO
import numpy as np
from meshly import ArrayUtils

## 1. Basic Array Encoding and Decoding

The `ArrayUtils.encode_array()` function compresses NumPy arrays using meshoptimizer's vertex buffer encoding. The codec is designed for 3D graphics data, but it also accepts general numeric arrays; how much space it saves depends on the data, as Section 2 shows.
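The encoded object used in this notebook exposes `data`, `shape`, `dtype`, and `itemsize`. The sketch below illustrates why that metadata travels with the compressed bytes: without it, the decoder cannot rebuild the array. It is only an illustration under stated assumptions: `EncodedArraySketch`, `encode_sketch`, and `decode_sketch` are hypothetical names, and `zlib` stands in for meshoptimizer's codec, so sizes will not match meshly's.

```python
import zlib
from dataclasses import dataclass

import numpy as np


@dataclass
class EncodedArraySketch:
    """Hypothetical container: compressed bytes plus the metadata needed to decode."""
    data: bytes
    shape: tuple
    dtype: str
    itemsize: int


def encode_sketch(array: np.ndarray) -> EncodedArraySketch:
    # zlib is a stand-in; meshly itself uses meshoptimizer's vertex buffer codec.
    raw = np.ascontiguousarray(array).tobytes()
    return EncodedArraySketch(
        data=zlib.compress(raw),
        shape=array.shape,
        dtype=array.dtype.str,
        itemsize=array.itemsize,
    )


def decode_sketch(encoded: EncodedArraySketch) -> np.ndarray:
    # The stored dtype and shape turn the flat byte string back into an array.
    raw = zlib.decompress(encoded.data)
    flat = np.frombuffer(raw, dtype=np.dtype(encoded.dtype))
    return flat.reshape(encoded.shape).copy()


original = np.arange(12, dtype=np.float32).reshape(4, 3)
encoded = encode_sketch(original)
restored = decode_sketch(encoded)
print(encoded.shape, encoded.itemsize)
print(np.array_equal(original, restored))
```
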

In [3]:
# Create test arrays with different characteristics
linear_array = np.linspace(0, 100, 1000, dtype=np.float32)
random_array = np.random.random((100, 3)).astype(np.float32)
integer_array = np.random.randint(0, 255, (50, 50), dtype=np.int32)

print(f"Linear array shape: {linear_array.shape}, dtype: {linear_array.dtype}")
print(f"Random array shape: {random_array.shape}, dtype: {random_array.dtype}")
print(f"Integer array shape: {integer_array.shape}, dtype: {integer_array.dtype}")

Linear array shape: (1000,), dtype: float32
Random array shape: (100, 3), dtype: float32
Integer array shape: (50, 50), dtype: int32


In [4]:
# Encode the linear array
encoded_linear = ArrayUtils.encode_array(linear_array)

print(f"Original size: {linear_array.nbytes} bytes")
print(f"Encoded size: {len(encoded_linear.data)} bytes")
# Ratio is encoded size / original size, so lower means better compression
print(f"Compression ratio: {len(encoded_linear.data) / linear_array.nbytes:.3f}")
print(f"Encoded metadata: shape={encoded_linear.shape}, dtype={encoded_linear.dtype}, itemsize={encoded_linear.itemsize}")

Original size: 4000 bytes
Encoded size: 2294 bytes
Compression ratio: 0.574
Encoded metadata: shape=(1000,), dtype=float32, itemsize=4


In [5]:
# Decode the array and verify it matches the original
decoded_linear = ArrayUtils.decode_array(encoded_linear)

print(f"Decoded array shape: {decoded_linear.shape}")
print(f"Decoded array dtype: {decoded_linear.dtype}")
print(f"Arrays are equal: {np.allclose(linear_array, decoded_linear)}")
print(f"Maximum difference: {np.max(np.abs(linear_array - decoded_linear))}")

Decoded array shape: (1000,)
Decoded array dtype: float32
Arrays are equal: True
Maximum difference: 0.0


## 2. Compression Performance on Different Array Types

How well an array compresses depends on its contents. The cells below compare compression ratios (encoded size divided by original size, so lower is better) for several kinds of arrays.
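The same effect can be reproduced with any general-purpose compressor. The sketch below uses `zlib` from the standard library as a stand-in, so the numbers will differ from meshoptimizer's, but the ordering is the point: redundant data shrinks far more than random data. The `samples` dictionary and `zlib_ratio` helper are illustrative names, not part of meshly.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

samples = {
    "constant": np.zeros(1000, dtype=np.float32),
    "repeated": np.tile(np.array([1, 2, 3, 4], dtype=np.float32), 250),
    "random": rng.random(1000, dtype=np.float32),
}


def zlib_ratio(array: np.ndarray) -> float:
    """Compressed size divided by raw size (lower is better)."""
    raw = array.tobytes()
    return len(zlib.compress(raw)) / len(raw)


sketch_ratios = {name: zlib_ratio(array) for name, array in samples.items()}
for name, ratio in sketch_ratios.items():
    print(f"{name}: {ratio:.3f}")
```
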

In [6]:
def test_compression(name, array):
    """Test compression ratio for an array."""
    encoded = ArrayUtils.encode_array(array)
    decoded = ArrayUtils.decode_array(encoded)
    
    compression_ratio = len(encoded.data) / array.nbytes
    is_equal = np.allclose(array, decoded, rtol=1e-5)
    
    print(f"{name}:")
    print(f"  Original: {array.nbytes:,} bytes, Encoded: {len(encoded.data):,} bytes")
    print(f"  Compression ratio: {compression_ratio:.3f}")
    print(f"  Decoded correctly: {is_equal}")
    print()
    
    return compression_ratio

# Test different array types
ratios = {}
ratios['Linear'] = test_compression('Linear sequence', linear_array)
ratios['Random'] = test_compression('Random values', random_array)
ratios['Integer'] = test_compression('Integer values', integer_array)

Linear sequence:
  Original: 4,000 bytes, Encoded: 2,294 bytes
  Compression ratio: 0.574
  Decoded correctly: True

Random values:
  Original: 1,200 bytes, Encoded: 1,076 bytes
  Compression ratio: 0.897
  Decoded correctly: True

Integer values:
  Original: 10,000 bytes, Encoded: 3,662 bytes
  Compression ratio: 0.366
  Decoded correctly: True



In [7]:
# Test with more specific data patterns
# Vertex-like data (common in 3D graphics)
vertices = np.array([
    [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]
], dtype=np.float32)

# Repeated pattern
pattern = np.tile([1.0, 2.0, 3.0, 4.0], 250).astype(np.float32)

# Smooth gradient
gradient = np.linspace(0, 1, 1000).astype(np.float32)

ratios['Vertices'] = test_compression('3D Vertices', vertices)
ratios['Pattern'] = test_compression('Repeated pattern', pattern)
ratios['Gradient'] = test_compression('Smooth gradient', gradient)

3D Vertices:
  Original: 96 bytes, Encoded: 75 bytes
  Compression ratio: 0.781
  Decoded correctly: True

Repeated pattern:
  Original: 4,000 bytes, Encoded: 1,351 bytes
  Compression ratio: 0.338
  Decoded correctly: True

Smooth gradient:
  Original: 4,000 bytes, Encoded: 2,366 bytes
  Compression ratio: 0.592
  Decoded correctly: True



## 3. Saving and Loading Arrays to/from Zip Files

`ArrayUtils.save_to_zip()` writes a compressed array to a zip file, and `ArrayUtils.load_from_zip()` reads it back. This is useful for persistent storage and cross-platform data exchange.
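To make the idea of an array-in-a-zip concrete, here is a minimal sketch built on the standard `zipfile` module. The notebook does not show meshly's actual archive layout, so the entry names `metadata.json` and `array.bin`, and the helpers `save_sketch` and `load_sketch`, are assumptions chosen for illustration only.

```python
import json
import os
import tempfile
import zipfile

import numpy as np


def save_sketch(array: np.ndarray, path: str) -> None:
    """Write raw array bytes plus JSON metadata into a zip archive."""
    meta = {"shape": list(array.shape), "dtype": array.dtype.str}
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("metadata.json", json.dumps(meta))  # hypothetical entry name
        zf.writestr("array.bin", np.ascontiguousarray(array).tobytes())  # hypothetical entry name


def load_sketch(path: str) -> np.ndarray:
    """Read the metadata first, then rebuild the array from the raw bytes."""
    with zipfile.ZipFile(path, "r") as zf:
        meta = json.loads(zf.read("metadata.json"))
        raw = zf.read("array.bin")
    flat = np.frombuffer(raw, dtype=np.dtype(meta["dtype"]))
    return flat.reshape(meta["shape"]).copy()


points = np.arange(6, dtype=np.float32).reshape(2, 3)
with tempfile.TemporaryDirectory() as tmp_dir:
    zip_path = os.path.join(tmp_dir, "points.zip")
    save_sketch(points, zip_path)
    restored_points = load_sketch(zip_path)

print(np.array_equal(points, restored_points))
```
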

In [8]:
# Create a sample array for saving
sample_array = np.random.random((200, 3)).astype(np.float32)
print(f"Sample array shape: {sample_array.shape}, dtype: {sample_array.dtype}")
print(f"Sample array size: {sample_array.nbytes} bytes")

Sample array shape: (200, 3), dtype: float32
Sample array size: 2400 bytes


In [9]:
# Save to a temporary file
with tempfile.NamedTemporaryFile(suffix='.zip', delete=False) as temp_file:
    temp_path = temp_file.name

try:
    # Save the array to zip
    ArrayUtils.save_to_zip(sample_array, temp_path)
    
    zip_size = os.path.getsize(temp_path)
    print(f"Saved to {temp_path}")
    print(f"Zip file size: {zip_size} bytes")
    print(f"Compression ratio vs raw data: {zip_size / sample_array.nbytes:.3f}")
    
    # Load the array back
    loaded_array = ArrayUtils.load_from_zip(temp_path)
    
    print(f"\nLoaded array shape: {loaded_array.shape}")
    print(f"Loaded array dtype: {loaded_array.dtype}")
    print(f"Arrays are equal: {np.allclose(sample_array, loaded_array)}")
    
finally:
    # Clean up
    if os.path.exists(temp_path):
        os.unlink(temp_path)
        print(f"Cleaned up {temp_path}")

Saved to /tmp/tmpd3vlgz9u.zip
Zip file size: 2391 bytes
Compression ratio vs raw data: 0.996

Loaded array shape: (200, 3)
Loaded array dtype: float32
Arrays are equal: True
Cleaned up /tmp/tmpd3vlgz9u.zip


## 4. Working with BytesIO (In-Memory) Zip Operations

You can also work with zip data in memory using BytesIO, which is useful for network transmission or when you don't want to write to disk.

In [10]:
# Create a test array
test_array = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]], dtype=np.float32)

# Save to BytesIO
buffer = BytesIO()
ArrayUtils.save_to_zip(test_array, buffer)

print(f"Original array shape: {test_array.shape}")
print(f"Buffer size: {len(buffer.getvalue())} bytes")
print(f"Original array size: {test_array.nbytes} bytes")

# Reset buffer position for reading
buffer.seek(0)

# Load from BytesIO
loaded_from_buffer = ArrayUtils.load_from_zip(buffer)

print(f"\nLoaded array shape: {loaded_from_buffer.shape}")
print(f"Arrays are equal: {np.array_equal(test_array, loaded_from_buffer)}")
print(f"\nOriginal array:")
print(test_array)
print(f"\nLoaded array:")
print(loaded_from_buffer)

Original array shape: (3, 3)
Buffer size: 298 bytes
Original array size: 36 bytes

Loaded array shape: (3, 3)
Arrays are equal: True

Original array:
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]

Loaded array:
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
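
Note that the buffer above is 298 bytes for a 36-byte array. Part of that is the fixed cost of the zip container itself (local headers, central directory, end record), which does not shrink with the payload. The sketch below measures that overhead with the standard `zipfile` module; the entry name `payload.bin` and the helper `zip_size` are illustrative, and the exact byte counts for meshly's archives will differ.

```python
import zipfile
from io import BytesIO


def zip_size(payload: bytes) -> int:
    """Size of an in-memory zip holding one uncompressed entry."""
    buffer = BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("payload.bin", payload)
    return len(buffer.getvalue())


small_total = zip_size(bytes(36))
large_total = zip_size(bytes(36_000))

# Container overhead = archive size minus payload size.
small_overhead = small_total - 36
large_overhead = large_total - 36_000

print(f"36-byte payload: {small_total} bytes total, {small_overhead} overhead")
print(f"36,000-byte payload: {large_total} bytes total, {large_overhead} overhead")
```

The overhead is the same in both cases, so it dominates tiny arrays and becomes negligible for large ones.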


## 5. Testing Different Data Types

`ArrayUtils` handles a range of NumPy data types, converting them as needed for compression and restoring the original type on decompression. The test below checks both the dtype and the values after a roundtrip.

In [11]:
def test_dtype_roundtrip(dtype_name, array):
    """Test encoding/decoding preserves data type and values."""
    try:
        buffer = BytesIO()
        ArrayUtils.save_to_zip(array, buffer)
        buffer.seek(0)
        loaded = ArrayUtils.load_from_zip(buffer)
        
        dtype_match = array.dtype == loaded.dtype
        values_match = np.allclose(array, loaded, rtol=1e-5)
        
        print(f"{dtype_name}: dtype={dtype_match}, values={values_match}")
        if not values_match:
            print(f"  Max difference: {np.max(np.abs(array.astype(np.float64) - loaded.astype(np.float64)))}")
        
        return dtype_match and values_match
    except Exception as e:
        print(f"{dtype_name}: ERROR - {e}")
        return False

# Test various data types
test_data = {
    'float32': np.array([1.1, 2.2, 3.3, 4.4], dtype=np.float32),
    'float64': np.array([1.1, 2.2, 3.3, 4.4], dtype=np.float64),
    'int32': np.array([1, 2, 3, 4], dtype=np.int32),
    'uint32': np.array([1, 2, 3, 4], dtype=np.uint32),
    'int16': np.array([1, 2, 3, 4], dtype=np.int16),
    'uint16': np.array([1, 2, 3, 4], dtype=np.uint16),
    'int8': np.array([1, 2, 3, 4], dtype=np.int8),
    'uint8': np.array([1, 2, 3, 4], dtype=np.uint8),
}

print("Data type roundtrip tests:")
all_passed = True
for dtype_name, array in test_data.items():
    passed = test_dtype_roundtrip(dtype_name, array)
    all_passed = all_passed and passed

print(f"\nAll tests passed: {all_passed}")

Data type roundtrip tests:
float32: dtype=True, values=True
float64: dtype=True, values=True
int32: dtype=True, values=True
uint32: dtype=True, values=True
int16: dtype=True, values=True
uint16: dtype=True, values=True
int8: dtype=True, values=True
uint8: dtype=True, values=True

All tests passed: True
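
The roundtrip check above compares values with `np.allclose(..., rtol=1e-5)`. For a roundtrip that is expected to be lossless, a stricter comparison may be preferable, because a relative tolerance can hide small errors in large values. The sketch below shows the difference using plain NumPy; the arrays are made up for illustration.

```python
import numpy as np

expected = np.array([1_000_000, 2_000_000], dtype=np.int32)

# Simulate a decoder that is off by one in the first element.
corrupted = expected.copy()
corrupted[0] += 1

# allclose accepts |a - b| <= atol + rtol * |b|, which is about 10 here.
loose_match = bool(np.allclose(expected, corrupted, rtol=1e-5))

# array_equal requires every element to match exactly.
strict_match = bool(np.array_equal(expected, corrupted))

print(f"allclose: {loose_match}, array_equal: {strict_match}")
```

`np.array_equal` is the safer default for integer data and for any codec that claims bit-exact reconstruction.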


## 6. Performance Comparison

This section compares the output size and the speed of `ArrayUtils` compression with NumPy's uncompressed `np.save` and `np.load`.

In [12]:
import time

# Create a larger test array
large_array = np.random.random((1000, 100)).astype(np.float32)
print(f"Test array shape: {large_array.shape}")
print(f"Test array size: {large_array.nbytes:,} bytes ({large_array.nbytes/1024/1024:.2f} MB)")

# Test ArrayUtils compression
# perf_counter() is a high-resolution clock intended for measuring short intervals
start_time = time.perf_counter()
buffer_compressed = BytesIO()
ArrayUtils.save_to_zip(large_array, buffer_compressed)
compression_time = time.perf_counter() - start_time

start_time = time.perf_counter()
buffer_compressed.seek(0)
loaded_compressed = ArrayUtils.load_from_zip(buffer_compressed)
decompression_time = time.perf_counter() - start_time

compressed_size = len(buffer_compressed.getvalue())

print(f"\nArrayUtils (compressed):")
print(f"  Compressed size: {compressed_size:,} bytes ({compressed_size/1024/1024:.2f} MB)")
print(f"  Compression ratio: {compressed_size / large_array.nbytes:.3f}")
print(f"  Compression time: {compression_time:.3f} seconds")
print(f"  Decompression time: {decompression_time:.3f} seconds")
print(f"  Data integrity: {np.allclose(large_array, loaded_compressed)}")

Test array shape: (1000, 100)
Test array size: 400,000 bytes (0.38 MB)

ArrayUtils (compressed):
  Compressed size: 342,221 bytes (0.33 MB)
  Compression ratio: 0.856
  Compression time: 0.010 seconds
  Decompression time: 0.002 seconds
  Data integrity: True


In [13]:
# Compare with standard numpy save (uncompressed)
# perf_counter() avoids a zero elapsed time, which would break the ratios printed below
start_time = time.perf_counter()
buffer_numpy = BytesIO()
np.save(buffer_numpy, large_array)
numpy_save_time = time.perf_counter() - start_time

start_time = time.perf_counter()
buffer_numpy.seek(0)
loaded_numpy = np.load(buffer_numpy)
numpy_load_time = time.perf_counter() - start_time

numpy_size = len(buffer_numpy.getvalue())

print(f"\nNumpy save/load (uncompressed):")
print(f"  File size: {numpy_size:,} bytes ({numpy_size/1024/1024:.2f} MB)")
print(f"  Save time: {numpy_save_time:.3f} seconds")
print(f"  Load time: {numpy_load_time:.3f} seconds")
print(f"  Data integrity: {np.array_equal(large_array, loaded_numpy)}")

print(f"\nComparison:")
print(f"  Size reduction: {(1 - compressed_size/numpy_size)*100:.1f}% smaller")
print(f"  Compression time vs numpy save: {compression_time/numpy_save_time:.1f}x")
print(f"  Decompression time vs numpy load: {decompression_time/numpy_load_time:.1f}x")


Numpy save/load (uncompressed):
  File size: 400,128 bytes (0.38 MB)
  Save time: 0.000 seconds
  Load time: 0.000 seconds
  Data integrity: True

Comparison:
  Size reduction: 14.5% smaller
  Compression time vs numpy save: 95.6x
  Decompression time vs numpy load: 7.5x
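
The NumPy timings above round to 0.000 seconds, so the speed ratios rest on a single very short measurement. A more robust approach is to repeat each operation and keep the best time. The sketch below does this for `np.save` only, using the standard library; `best_of` and `save_once` are illustrative helpers, and the same pattern could wrap the `ArrayUtils` calls.

```python
import time
from io import BytesIO

import numpy as np


def best_of(func, repeats=5):
    """Run func several times and return the shortest elapsed time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


data = np.zeros((1000, 100), dtype=np.float32)


def save_once():
    # Serialize into a fresh in-memory buffer on every run.
    np.save(BytesIO(), data)


elapsed = best_of(save_once)
print(f"np.save, best of 5: {elapsed * 1e6:.1f} microseconds")
```

Taking the minimum filters out one-off delays such as cache misses or scheduler interruptions.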


## 7. Cross-Platform Compatibility

Zip files written by the Python `ArrayUtils` can be loaded by the TypeScript version of the library, so arrays can be exchanged between Python and JavaScript/TypeScript environments.
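One detail that matters whenever raw numeric bytes cross language boundaries is byte order: JavaScript typed arrays use the platform's native order, which is little-endian on almost all current hardware. The notebook does not say whether meshly normalizes byte order, so the sketch below is a general precaution rather than a description of the library; `to_little_endian` is a hypothetical helper.

```python
import sys

import numpy as np


def to_little_endian(array: np.ndarray) -> np.ndarray:
    """Return the array with little-endian storage, converting only if needed."""
    dtype = array.dtype
    is_big = dtype.byteorder == ">" or (
        dtype.byteorder == "=" and sys.byteorder == "big"
    )
    if is_big:
        # astype re-encodes the values; the numbers themselves do not change.
        return array.astype(dtype.newbyteorder("<"))
    return array


big_endian = np.array([1.0, 2.5], dtype=">f4")
little_endian = to_little_endian(big_endian)

print(np.array_equal(big_endian, little_endian))
print(little_endian.tobytes() == np.array([1.0, 2.5], dtype="<f4").tobytes())
```
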

In [14]:
# Create a demo array that could be used in a web application
# For example, 3D vertex data for a mesh
vertex_data = np.array([
    # Triangle 1
    [0.0, 0.0, 0.0],  # vertex 0
    [1.0, 0.0, 0.0],  # vertex 1
    [0.5, 1.0, 0.0],  # vertex 2
    # Triangle 2
    [1.0, 0.0, 0.0],  # vertex 3
    [2.0, 0.0, 0.0],  # vertex 4
    [1.5, 1.0, 0.0],  # vertex 5
], dtype=np.float32)

print(f"Vertex data shape: {vertex_data.shape}")
print(f"Vertex data:")
print(vertex_data)

# Save to a zip file that can be loaded by TypeScript
demo_path = "demo_vertices.zip"
ArrayUtils.save_to_zip(vertex_data, demo_path)

print(f"\nSaved vertex data to {demo_path}")
print(f"File size: {os.path.getsize(demo_path)} bytes")

# This file can now be loaded in TypeScript using:
# const vertices = await ArrayUtils.loadFromZip(zipBuffer);
print("\nThis zip file can be loaded in TypeScript using:")
print("const vertices = await ArrayUtils.loadFromZip(zipBuffer);")

Vertex data shape: (6, 3)
Vertex data:
[[0.  0.  0. ]
 [1.  0.  0. ]
 [0.5 1.  0. ]
 [1.  0.  0. ]
 [2.  0.  0. ]
 [1.5 1.  0. ]]

Saved vertex data to demo_vertices.zip
File size: 313 bytes

This zip file can be loaded in TypeScript using:
const vertices = await ArrayUtils.loadFromZip(zipBuffer);


In [15]:
# Verify the file can be loaded back in Python
loaded_vertices = ArrayUtils.load_from_zip(demo_path)

print(f"Loaded vertex data shape: {loaded_vertices.shape}")
print(f"Data matches: {np.array_equal(vertex_data, loaded_vertices)}")
print(f"Loaded vertex data:")
print(loaded_vertices)

# Clean up
os.unlink(demo_path)
print(f"\nCleaned up {demo_path}")

Loaded vertex data shape: (6, 3)
Data matches: True
Loaded vertex data:
[[0.  0.  0. ]
 [1.  0.  0. ]
 [0.5 1.  0. ]
 [1.  0.  0. ]
 [2.  0.  0. ]
 [1.5 1.  0. ]]

Cleaned up demo_vertices.zip


## 8. Best Practices and Use Cases

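Here are some recommendations for when and how to use ArrayUtils, based on the results in this notebook:

- **Expect data-dependent savings**: repeated patterns and small-range integers compressed to roughly 0.34 to 0.37 of their original size, smooth sequences to about 0.57 to 0.59, and random float32 values only to about 0.86 to 0.90.
- **Avoid zip files for very small arrays**: the 36-byte array in Section 4 produced a 298-byte buffer, and the 72-byte vertex array in Section 7 produced a 313-byte file.
- **Weigh size against speed**: in Section 6, the compressed result was 14.5% smaller than the `np.save` output, but saving and loading were both slower than their NumPy counterparts.
- **Use in-memory buffers for transmission**: `save_to_zip()` and `load_from_zip()` accept a `BytesIO` object as well as a file path.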

## Summary

The `ArrayUtils` class compresses and serializes NumPy arrays using meshoptimizer compression. Key benefits include:

- **Compression**: Most effective on structured or repetitive data; random values compress much less
- **Cross-platform compatibility**: Python-created files can be loaded in TypeScript
- **Type preservation**: Original numpy data types are preserved through the roundtrip
- **Flexible I/O**: Supports both file-based and in-memory (BytesIO) operations
- **Easy to use**: Simple API with `save_to_zip()` and `load_from_zip()` methods

This makes it well suited to applications that share numeric data between Python backends and JavaScript/TypeScript frontends, such as web-based 3D visualizations, scientific data viewers, or game development tools.