# Saving & Loading Arrays (Binary & Text I/O) in NumPy

### What Are Saving & Loading in NumPy?

In real-world AI/ML projects, we often need to **save our work** — whether it's datasets, model weights, or transformed arrays — and **reload them later** for training, analysis, or deployment. NumPy provides easy and efficient ways to **save arrays to files** and **load them back**, both in **binary format** (`.npy`, `.npz`) and **text format** (`.txt`, `.csv`). This ensures that our data is persistent, shareable, and fast to access — especially for large projects.

Saving arrays is crucial for:

- Reusing processed datasets without recomputing every time
- Storing intermediate results during model training
- Sharing data with teammates or systems
- Building pipelines that load and process data in stages

NumPy gives us two main formats:

1. **Binary format (`.npy`, `.npz`)** – compact, fast, preserves data types and shapes
2. **Text format (`.txt`, `.csv`)** – human-readable, but slower and may lose precision or structure

Learning to save/load correctly improves **efficiency**, **modularity**, and **scalability** in our AI/ML workflows.

### Binary Saving & Loading (npy / npz)

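NumPy’s `.npy` format is the **preferred method** for saving a single array. It stores the data type and shape alongside the raw values, so the array reloads exactly as it was saved. When we need to store **multiple arrays together**, `np.savez` bundles them into a single `.npz` file, which is an **uncompressed zip archive** of `.npy` files. If file size matters, `np.savez_compressed` writes the same layout with compression.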

- **Save a single array (npy)**

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
np.save('my_array.npy', arr)
```

- **Load a single array**

```python
loaded_arr = np.load('my_array.npy')
print("Loaded array:\n", loaded_arr)
```
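To see what "preserves data type and shape" means in practice, here is a minimal round-trip sketch. The file name `round_trip.npy` and the temporary directory are assumptions made for illustration, so the example does not leave files behind.

```python
import os
import tempfile

import numpy as np

# A float32 array with a non-trivial shape, so both dtype and shape are worth checking.
original = np.arange(12, dtype=np.float32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "round_trip.npy")  # hypothetical file name
    np.save(path, original)
    restored = np.load(path)  # the whole array is read into memory here

print(restored.dtype, restored.shape)        # float32 (3, 4)
print(np.array_equal(original, restored))    # True
```

Nothing had to be passed to `np.load` to recover the dtype or the shape, because both are stored in the file header.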

- **Save multiple arrays (npz)**

```python
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.savez('arrays_archive.npz', first=a, second=b)
```

- **Load multiple arrays**

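```python
# np.load returns a lazy NpzFile; the with-block closes the underlying file when we are done
with np.load('arrays_archive.npz') as data:
    print("First array:", data['first'])
    print("Second array:", data['second'])
```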

These binary files are **lightweight, fast**, and ideal for machine learning pipelines and models that need to load large amounts of numerical data quickly.
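How "lightweight" an `.npz` file is depends on which function wrote it: `np.savez` stores the arrays as-is, while `np.savez_compressed` compresses them. The sketch below compares the two on a deliberately repetitive array. The file names are hypothetical, and the size gap on real data will usually be much smaller than on an array of zeros.

```python
import os
import tempfile

import numpy as np

# Highly repetitive data compresses very well; random data would shrink far less.
zeros = np.zeros((1000, 1000), dtype=np.float64)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "plain.npz")    # hypothetical file names
    packed_path = os.path.join(tmp_dir, "packed.npz")

    np.savez(plain_path, zeros=zeros)              # uncompressed archive
    np.savez_compressed(packed_path, zeros=zeros)  # compressed archive

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

    # Both files are read the same way, whichever function wrote them.
    with np.load(packed_path) as archive:
        restored = archive["zeros"]

print(f"savez: {plain_size} bytes, savez_compressed: {packed_size} bytes")
print(np.array_equal(zeros, restored))
```

Compression trades some CPU time on save and load for a smaller file, so it tends to pay off for archival or transfer rather than for files that are read many times in a tight loop.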

### Text Saving & Loading (txt / csv)

Text-based formats are useful when we want to:

- Inspect data manually
- Share with others who don’t use Python
- Import/export with tools like Excel or MATLAB

We use `np.savetxt()` and `np.loadtxt()` for plain-text files, and we can control the formatting using arguments.

- **Save to text**

```python
arr = np.array([[1.2, 2.3], [3.4, 4.5]])
# fmt='%.2f' writes every value with two decimal places; extra digits are rounded away
np.savetxt('array.txt', arr, delimiter=',', fmt='%.2f')
```

- **Load from text**

```python
loaded_txt = np.loadtxt('array.txt', delimiter=',')
print("Loaded from txt:\n", loaded_txt)
```
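The formatting arguments mentioned above go beyond `delimiter` and `fmt`. As a sketch, assuming we want a CSV that opens cleanly in a spreadsheet, we can write a header row with `header`, suppress the default `# ` prefix with `comments=''`, and give each column its own format. The column names and values here are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical training log: one row per epoch, columns are (epoch, accuracy).
scores = np.array([[1, 0.91], [2, 0.87], [3, 0.95]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "scores.csv")  # hypothetical file name

    np.savetxt(
        path,
        scores,
        delimiter=",",
        fmt=["%d", "%.4f"],       # one format per column
        header="epoch,accuracy",  # written as the first line of the file
        comments="",              # drop the default '# ' prefix on the header
    )

    with open(path) as f:
        first_line = f.readline().strip()

    # The header is now a plain text row, so we skip it explicitly when loading.
    loaded = np.loadtxt(path, delimiter=",", skiprows=1)

print(first_line)
print(loaded)
```

Note that `np.loadtxt` returns a float array here: the integer formatting of the first column affects only how the file looks, not the dtype we get back.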

The downside is that text files:

- Take more space
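- Can lose precision (values are rounded to the `fmt` we choose) and metadata (the dtype is not stored, and `np.savetxt` only accepts 1-D or 2-D arrays)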
- Are slower to parse

So, for **sharing with others or exporting data**, text is good. But for **performance and accuracy**, binary is better.
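The precision trade-off is easy to measure. The following sketch saves the same array three ways: as `.npy`, as text with `fmt='%.2f'`, and as text with `np.savetxt`'s default format. The file names are hypothetical.

```python
import os
import tempfile

import numpy as np

values = np.array([[np.pi, np.e], [1 / 3, 2 / 3]])

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "values.npy")        # hypothetical file names
    short_path = os.path.join(tmp_dir, "values_short.txt")
    full_path = os.path.join(tmp_dir, "values_full.txt")

    np.save(npy_path, values)
    np.savetxt(short_path, values, delimiter=",", fmt="%.2f")  # two decimals only
    np.savetxt(full_path, values, delimiter=",")               # default fmt='%.18e'

    from_binary = np.load(npy_path)
    from_short = np.loadtxt(short_path, delimiter=",")
    from_full = np.loadtxt(full_path, delimiter=",")

print("binary exact:      ", np.array_equal(values, from_binary))
print("text (%.2f) exact: ", np.array_equal(values, from_short))
print("text default exact:", np.array_equal(values, from_full))
print("max error with %.2f:", np.abs(values - from_short).max())
```

So text output is lossy only when we ask for a short format. The default format keeps enough digits to reproduce `float64` values, at the cost of a noticeably larger and less readable file.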

### AI/ML Use Cases

| Format | Use Case |
| --- | --- |
| `.npy` | Saving model weights or processed data |
| `.npz` | Storing datasets with labels, images, metadata |
| `.txt` | Exporting predictions or input data for external tools |
| `.csv` | Logging training metrics or tabular datasets |

Saving/loading arrays helps us **build better workflows**, avoid **recomputing expensive operations**, and keep our **training process organized**.
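As an illustration of the `.npz` row in the table, here is a small sketch that bundles features, labels, and class names into one archive. The array names, shapes, and random data are all assumptions for the example, not a required layout.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for a preprocessed dataset (all values are made up).
rng = np.random.default_rng(seed=0)
features = rng.normal(size=(100, 8)).astype(np.float32)
labels = rng.integers(0, 2, size=100)
class_names = np.array(["negative", "positive"])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.npz")  # hypothetical file name

    # Each keyword becomes the key under which that array is stored.
    np.savez_compressed(path, features=features, labels=labels, class_names=class_names)

    with np.load(path) as dataset:
        stored_keys = list(dataset.files)
        X = dataset["features"]
        y = dataset["labels"]
        names = dataset["class_names"]

print(stored_keys)
print(X.shape, X.dtype, y.shape, names)
```

Keeping related arrays in one file means the features and labels cannot drift apart between pipeline stages, which is the main reason to prefer `.npz` over several separate `.npy` files.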

### Summary

Saving and loading arrays is a fundamental skill in every AI/ML pipeline. Whether we’re working with training data, model weights, or intermediate calculations — being able to persist data and reuse it later can save us hours of work. NumPy makes this process incredibly easy with two main formats:

- **Binary format (`.npy`, `.npz`)** is fast, compact, and preserves everything — it’s best when working within Python-based workflows or saving large arrays efficiently.
- **Text format (`.txt`, `.csv`)** is readable and shareable across tools, but not ideal for large-scale or high-precision data.

We use `.npy` when we want to store a single array, and `.npz` when we want to store many. Both load back quickly with full fidelity. On the other hand, `.txt` is great for logging or exporting but may lose data type or precision.

In AI and ML, this helps us:

- Save datasets and reload them without repeating preprocessing
- Store trained model outputs or input tensors
- Export predictions for reports, visualizations, or other tools
- Share arrays with collaborators or pipelines

By mastering both formats, we gain **control over our data**, reduce **redundancy**, and build **reliable, reusable, and efficient code** for real-world machine learning systems. Always choose the format that best matches our goal — speed and structure (binary) vs visibility and portability (text).