# NumPy File I/O and Data Persistence

**Author:** RSK World  
**Website:** https://rskworld.in  
**Email:** help@rskworld.in  
**Phone:** +91 93305 39277

This notebook shows how to save NumPy arrays to files and load them back, covering binary formats, text files, and compressed archives.


In [None]:
import numpy as np
import os


## 1. Saving and Loading .npy Files

The `.npy` format is NumPy's native binary format. Each file holds a single array together with its dtype and shape.


In [None]:
# Create a sample array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Original array:\n", arr)

# Save to .npy file
np.save('array_data.npy', arr)
print("\nArray saved to 'array_data.npy'")

# Load from .npy file
loaded_arr = np.load('array_data.npy')
print("\nLoaded array:\n", loaded_arr)
print("\nArrays are equal:", np.array_equal(arr, loaded_arr))

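To check what the round trip above actually preserves, here is a minimal sketch that uses a temporary directory. The file name `roundtrip_demo` and the variable names are illustrative and not part of the notebook. It also shows that `np.save` appends the `.npy` extension when the name lacks one.

```python
import os
import tempfile

import numpy as np

# A float32 array with a non-trivial shape, so both dtype and shape can be checked.
original = np.arange(12, dtype=np.float32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    # No extension given here: np.save appends ".npy" to the file name.
    base_path = os.path.join(tmp_dir, "roundtrip_demo")  # hypothetical name
    np.save(base_path, original)
    saved_path = base_path + ".npy"
    extension_added = os.path.exists(saved_path)

    restored = np.load(saved_path)

print("Extension added:", extension_added)
print("dtype preserved:", restored.dtype == original.dtype)
print("shape preserved:", restored.shape == original.shape)
```

Because dtype and shape travel with the file, `np.load` needs no extra arguments to rebuild the array.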

In [None]:
# Check file size
if os.path.exists('array_data.npy'):
    file_size = os.path.getsize('array_data.npy')
    print(f"File size: {file_size} bytes")
    print(f"Array size in memory: {arr.nbytes} bytes")
    print("The extra bytes are the .npy header, which stores dtype and shape")

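The gap between the file size and `arr.nbytes` is the `.npy` header. As a rough sketch, the helpers in `numpy.lib.format` can read that header directly. This assumes the file was written in format version 1.0, which is what `np.save` normally chooses for a small array like this one; the file name `header_demo.npy` is hypothetical.

```python
import os
import tempfile

import numpy as np
from numpy.lib import format as npy_format

arr_demo = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "header_demo.npy")  # hypothetical name
    np.save(path, arr_demo)
    file_size = os.path.getsize(path)

    with open(path, "rb") as f:
        # The magic string carries the format version as (major, minor).
        version = npy_format.read_magic(f)
        # Assumption: version 1.0 header, so the 1.0 reader applies.
        shape, fortran_order, dtype = npy_format.read_array_header_1_0(f)
        # The raw array data starts where the header ends.
        header_size = f.tell()

print("Format version:", version)
print("Stored shape:", shape, "| stored dtype:", dtype)
print("Header bytes:", header_size, "| data bytes:", arr_demo.nbytes)
print("Header + data == file size:", header_size + arr_demo.nbytes == file_size)
```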

## 2. Saving and Loading Multiple Arrays (.npz)

The `.npz` format stores multiple arrays in a single zip archive. `np.savez` writes the archive uncompressed; `np.savez_compressed` compresses it.


In [None]:
# Create multiple arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2], [3, 4]])
arr3 = np.random.rand(3, 3)

print("Array 1:", arr1)
print("\nArray 2:\n", arr2)
print("\nArray 3:\n", arr3)

# Save multiple arrays to .npz file
np.savez('multiple_arrays.npz', array1=arr1, array2=arr2, array3=arr3)
print("\nArrays saved to 'multiple_arrays.npz'")

# Load from .npz file
loaded = np.load('multiple_arrays.npz')
print("\nKeys in file:", list(loaded.keys()))
print("\nLoaded array1:", loaded['array1'])
print("\nLoaded array2:\n", loaded['array2'])

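For a `.npz` file, `np.load` returns an archive object that reads each array only when it is indexed and keeps the file open until it is closed. A minimal sketch of using it as a context manager follows; the file name and array names are made up for illustration. It also shows that arrays passed positionally are stored under automatic names such as `arr_0`.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "npz_demo.npz")  # hypothetical name
    # One positional array (stored as "arr_0") and two keyword arrays.
    np.savez(path, np.zeros(2), first=np.arange(5), second=np.eye(2))

    # The with-block closes the underlying zip file on exit.
    with np.load(path) as archive:
        names = sorted(archive.files)
        first = archive["first"]    # read from disk at this point
        second = archive["second"]

print("Stored names:", names)
print("first:", first)
print("second shape:", second.shape)
```

The arrays pulled out inside the block are ordinary in-memory arrays, so they remain usable after the archive is closed.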

In [None]:
# Save with compression (usually smaller; very small arrays may not shrink)
np.savez_compressed('compressed_arrays.npz', array1=arr1, array2=arr2, array3=arr3)
print("Compressed file saved")

# Compare file sizes
if os.path.exists('multiple_arrays.npz') and os.path.exists('compressed_arrays.npz'):
    size_normal = os.path.getsize('multiple_arrays.npz')
    size_compressed = os.path.getsize('compressed_arrays.npz')
    print(f"\nNormal .npz size: {size_normal} bytes")
    print(f"Compressed .npz size: {size_compressed} bytes")
    print(f"Compression ratio: {size_normal/size_compressed:.2f}x")

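How much `np.savez_compressed` helps depends on the data. With arrays as small as the ones above, the saving may be small or absent. The sketch below compares a highly repetitive array with random values; the helper `npz_sizes` and the file labels are hypothetical names introduced only for this comparison.

```python
import os
import tempfile

import numpy as np


def npz_sizes(array, directory, label):
    """Save `array` with and without compression; return both file sizes in bytes."""
    plain_path = os.path.join(directory, f"{label}_plain.npz")
    packed_path = os.path.join(directory, f"{label}_packed.npz")
    np.savez(plain_path, data=array)
    np.savez_compressed(packed_path, data=array)
    return os.path.getsize(plain_path), os.path.getsize(packed_path)


rng = np.random.default_rng(0)

with tempfile.TemporaryDirectory() as tmp_dir:
    zeros_plain, zeros_packed = npz_sizes(np.zeros((200, 200)), tmp_dir, "zeros")
    noise_plain, noise_packed = npz_sizes(rng.random((200, 200)), tmp_dir, "noise")

print(f"All zeros: {zeros_plain} -> {zeros_packed} bytes")
print(f"Random:    {noise_plain} -> {noise_packed} bytes")
```

Repetitive data shrinks dramatically, while random floating-point values leave little for the compressor to exploit.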

## 3. Text File I/O

`np.savetxt` and `np.loadtxt` write and read arrays as delimited text files (CSV, TXT).


In [None]:
# Save array to text file
data = np.array([[1.5, 2.3, 3.7], [4.1, 5.9, 6.2], [7.8, 8.4, 9.0]])
np.savetxt('array_data.txt', data, fmt='%.2f', delimiter=',')
print("Array saved to text file:")
print(data)

# Load from text file
loaded_data = np.loadtxt('array_data.txt', delimiter=',')
print("\nLoaded from text file:\n", loaded_data)

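A text round trip is only as precise as the `fmt` string. The sketch below, which writes to in-memory buffers instead of files, compares `fmt='%.2f'` with `'%.18e'` (the default format of `np.savetxt`). The sample values are invented for the demonstration.

```python
import io

import numpy as np

values = np.array([[1.23456, 2.0], [3.14159, 4.98765]])

# Two decimal places: digits beyond the second decimal are lost on disk.
buffer_short = io.StringIO()
np.savetxt(buffer_short, values, fmt="%.2f", delimiter=",")
buffer_short.seek(0)
rounded = np.loadtxt(buffer_short, delimiter=",")

# The default format keeps enough digits to restore float64 values exactly.
buffer_full = io.StringIO()
np.savetxt(buffer_full, values, fmt="%.18e", delimiter=",")
buffer_full.seek(0)
exact = np.loadtxt(buffer_full, delimiter=",")

print("'%.2f' round trip exact: ", np.array_equal(values, rounded))
print("'%.18e' round trip exact:", np.array_equal(values, exact))
```

When exact values matter, a binary format such as `.npy` avoids the question entirely.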

In [None]:
# Save with a header row and custom formatting.
# comments='' writes the header as a plain CSV row (the default prefix is '# ').
header = "Column1,Column2,Column3"
np.savetxt('array_with_header.csv', data,
           fmt='%.3f', delimiter=',',
           header=header, comments='')
print("Array saved with header to CSV file")

# Skip the header row when loading
loaded_csv = np.loadtxt('array_with_header.csv', delimiter=',', skiprows=1)
print("\nLoaded CSV data:\n", loaded_csv)

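Skipping the header discards the column names. If the names are useful, `np.genfromtxt` with `names=True` can read them from the first row and return a structured array. This is a small sketch on an in-memory buffer, assuming the header is written as a plain row with `comments=''`.

```python
import io

import numpy as np

table = np.array([[1.5, 2.3, 3.7], [4.1, 5.9, 6.2]])

# Write a CSV with a plain header row into an in-memory buffer.
buffer = io.StringIO()
np.savetxt(buffer, table, fmt="%.3f", delimiter=",",
           header="Column1,Column2,Column3", comments="")
buffer.seek(0)

# names=True takes the field names from the first row.
named = np.genfromtxt(buffer, delimiter=",", names=True)

print("Field names:", named.dtype.names)
print("Column2:", named["Column2"])
```

Each column is then available by name, at the cost of working with a structured array instead of a plain 2-D one.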

In [None]:
# Using genfromtxt for more flexibility (handles missing values)
# Create sample data with missing values
with open('data_with_missing.csv', 'w') as f:
    f.write("1.0,2.0,3.0\n")
    f.write("4.0,,6.0\n")  # Missing value
    f.write("7.0,8.0,9.0\n")

# Load with genfromtxt (handles missing values)
data_missing = np.genfromtxt('data_with_missing.csv', delimiter=',')
print("Data with missing values:\n", data_missing)
print("\nMissing values are represented as NaN")

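Once missing entries are loaded as NaN, they need to be handled before most computations. The sketch below shows two options on the same three-row data: NaN-aware statistics with `np.nanmean`, and substituting a value at load time with the `filling_values` argument. The fill value `-1.0` is an arbitrary choice for illustration.

```python
import io

import numpy as np

csv_text = "1.0,2.0,3.0\n4.0,,6.0\n7.0,8.0,9.0\n"

# Default behaviour: the empty field becomes NaN.
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
missing_mask = np.isnan(with_nan)
column_means = np.nanmean(with_nan, axis=0)  # ignores NaN entries

# Alternative: replace missing fields while loading.
filled = np.genfromtxt(io.StringIO(csv_text), delimiter=",", filling_values=-1.0)

print("Missing positions:\n", missing_mask)
print("Column means ignoring NaN:", column_means)
print("Loaded with fill value:\n", filled)
```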

## 4. Memory-Mapped Files

Memory-mapped files allow working with arrays too large to fit in memory: the data stays on disk and only the parts that are accessed are read or written.


In [None]:
# Create a memory-mapped array
# This creates a file-backed array that doesn't load entirely into memory
mmap_arr = np.memmap('memory_mapped.dat', dtype='float64', mode='w+', shape=(1000, 1000))

# Fill the array (the random data is built in memory first, which is fine at this size)
mmap_arr[:] = np.random.rand(1000, 1000)
print("Memory-mapped array created and filled")
print(f"Shape: {mmap_arr.shape}")
print(f"Size: {mmap_arr.nbytes / 1024 / 1024:.2f} MB")

# Flush to ensure data is written
mmap_arr.flush()
print("\nData flushed to disk")

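For an array that really does not fit in memory, assigning a full-size array in one step defeats the purpose. A hedged sketch of filling a memory-mapped file block by block follows; the shape, block size, and file name are small, made-up values chosen so the example runs quickly.

```python
import os
import tempfile

import numpy as np

rows, cols, block = 1000, 50, 256  # illustrative sizes
rng = np.random.default_rng(0)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "chunked.dat")  # hypothetical name
    out = np.memmap(path, dtype="float64", mode="w+", shape=(rows, cols))

    running_total = 0.0
    for start in range(0, rows, block):
        stop = min(start + block, rows)
        # Only one block of rows exists in memory at a time.
        chunk = rng.random((stop - start, cols))
        out[start:stop] = chunk
        running_total += chunk.sum()

    out.flush()
    del out  # release the mapping before reopening the file

    check = np.memmap(path, dtype="float64", mode="r", shape=(rows, cols))
    disk_total = float(check.sum())
    file_bytes = os.path.getsize(path)
    del check

print("File size in bytes:", file_bytes)
print("Totals match:", np.isclose(disk_total, running_total))
```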

In [None]:
# Load memory-mapped array in read-only mode
mmap_read = np.memmap('memory_mapped.dat', dtype='float64', mode='r', shape=(1000, 1000))
print("Memory-mapped array loaded in read mode")
print(f"First 3x3 slice:\n{mmap_read[:3, :3]}")
print("\nOnly the parts that are accessed are read from disk")

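A raw memory-mapped file has no header, so dtype and shape must be repeated on every open. One alternative, sketched here, is to save a `.npy` file and open it with `np.load(..., mmap_mode='r')`, which reads dtype and shape from the file's header. The file name is hypothetical.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mapped.npy")  # hypothetical name
    np.save(path, np.arange(20, dtype=np.int16).reshape(4, 5))

    # No dtype or shape arguments: both come from the .npy header.
    mapped = np.load(path, mmap_mode="r")
    is_memmap = isinstance(mapped, np.memmap)
    mapped_shape = mapped.shape
    mapped_dtype = mapped.dtype

    # Copy a small slice into ordinary memory before releasing the mapping.
    corner = np.array(mapped[:2, :2])
    del mapped

print("Is a memmap:", is_memmap)
print("Shape and dtype from header:", mapped_shape, mapped_dtype)
print("Top-left 2x2 block:\n", corner)
```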

## 5. Working with CSV Files

Saving the numeric fields of a structured array to a CSV file.


In [None]:
# Create a structured array and save to CSV
structured_data = np.array([(1, 'Alice', 25.5), (2, 'Bob', 30.2), (3, 'Charlie', 35.8)],
                           dtype=[('id', 'i4'), ('name', 'U10'), ('score', 'f4')])

# Save only the numeric fields; np.column_stack promotes them to one float array
numeric_data = np.column_stack([structured_data['id'], structured_data['score']])
# One format per column; with a single multi-field fmt string, delimiter is ignored
np.savetxt('structured_data.csv', numeric_data,
           fmt=['%d', '%.1f'], delimiter=',',
           header='ID,Score', comments='#')
print("Structured data saved to CSV")
print(structured_data)

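The string field does not have to be dropped. As a sketch, `np.savetxt` can write a 1-D structured array directly when `fmt` contains one specifier per field, and `np.genfromtxt` can read it back with the same dtype. The variable names and the in-memory buffer are illustrative choices.

```python
import io

import numpy as np

people = np.array(
    [(1, "Alice", 25.5), (2, "Bob", 30.2), (3, "Charlie", 35.8)],
    dtype=[("id", "i4"), ("name", "U10"), ("score", "f4")],
)

# One format specifier per field; the commas in fmt act as the delimiter.
buffer = io.StringIO()
np.savetxt(buffer, people, fmt="%d,%s,%.1f", header="id,name,score", comments="")
csv_text = buffer.getvalue()

# Read it back with the original dtype, skipping the header row.
restored = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    dtype=people.dtype,
    skip_header=1,
    encoding="utf-8",
)

print(csv_text)
print("Names read back:", list(restored["name"]))
```

This simple approach assumes the text fields contain no commas or quotes; data like that is better handled by a dedicated CSV reader.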

## 6. Binary File Formats

`tofile` and `np.fromfile` write and read raw bytes with no header, so neither dtype nor shape is stored in the file.


In [None]:
# Save array as raw binary
arr_binary = np.array([1, 2, 3, 4, 5], dtype=np.int32)
arr_binary.tofile('binary_data.bin')
print("Array saved as binary file")
print(f"Array: {arr_binary}")

# Load from binary file: dtype must be given; the result is 1-D, so reshape multi-dimensional data
loaded_binary = np.fromfile('binary_data.bin', dtype=np.int32)
print(f"\nLoaded from binary: {loaded_binary}")

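Because a raw binary file carries no metadata, the reader has to supply it. The sketch below, with a made-up file name, shows a 2-D array coming back flat and needing `reshape`, and what happens when the wrong dtype is passed.

```python
import os
import tempfile

import numpy as np

grid = np.arange(6, dtype=np.int32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "grid.bin")  # hypothetical name
    grid.tofile(path)

    # The shape is not stored: fromfile always returns a 1-D array.
    flat = np.fromfile(path, dtype=np.int32)
    restored = flat.reshape(2, 3)

    # A wrong dtype raises no error; the same bytes are just reinterpreted.
    wrong = np.fromfile(path, dtype=np.int64)

print("Flat shape:", flat.shape)
print("Restored after reshape:\n", restored)
print("Element count with wrong dtype:", wrong.size)
```

Byte order is not recorded either, so raw files are best kept for cases where the layout is documented elsewhere.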

## 7. Best Practices and Tips

Important considerations when working with file I/O.


In [None]:
print("File I/O Best Practices:")
print("=" * 50)
print("\n1. Use .npy for single arrays (fast, preserves dtype)")
print("2. Use .npz for multiple arrays (can compress)")
print("3. Use .txt/.csv for human-readable data")
print("4. Use memory-mapped files for large arrays")
print("5. Always specify dtype (and restore shape) when loading raw binary files")
print("6. Use np.savez_compressed when file size matters")
print("\nFormat Comparison:")
print("- .npy/.npz: Fast, binary, preserves dtype and shape")
print("- .txt/.csv: Human-readable, slower, text-based")
print("- Memory-mapped: For arrays larger than RAM")


## Summary

In this notebook, we learned:
- How to save and load arrays using .npy format
- Working with multiple arrays using .npz format
- Text file I/O (CSV, TXT)
- Memory-mapped files for large datasets
- Binary file formats
- Best practices for file I/O

**Key Takeaways:**
- `.npy` is best for single arrays (fast, preserves dtype and shape)
- `.npz` is best for multiple arrays (`np.savez_compressed` adds compression)
- Text files are human-readable but slower and limited to the precision of `fmt`
- Memory-mapped files enable working with arrays larger than RAM
- Always specify dtype when loading raw binary files

**Next:** Advanced Indexing and Searching

