## Saving Files with NumPy

### Text files:
- **.txt**: Simple text files, where data is usually space or comma-separated.
- **.csv**: Comma-separated values, which can be read using `numpy.genfromtxt()` or `numpy.loadtxt()`.
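As a rough illustration of how the two text readers differ, here is a minimal sketch. The CSV content is made up and held in memory with `io.StringIO` (an assumption made for the example, not part of the lending dataset); a real file path would work the same way.

```python
import io
import numpy as np

# Hypothetical in-memory CSV text used only for illustration.
clean_csv = "1,2,3\n4,5,6\n"
gappy_csv = "1,2,3\n4,,6\n"  # the second row has an empty field

# loadtxt expects every field to be present and convertible.
clean = np.loadtxt(io.StringIO(clean_csv), delimiter=",")

# genfromtxt fills missing numeric fields with nan by default.
gappy = np.genfromtxt(io.StringIO(gappy_csv), delimiter=",")

print(clean)
print(gappy)
```

In practice, `genfromtxt()` tends to be the safer choice for messy data, while `loadtxt()` is enough for clean, fully populated files.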

### Binary files:
- **.npy**: The native binary file format for NumPy arrays, which can be read using `numpy.load()`. This format stores arrays along with their dtype and shape information, making it efficient for storage and retrieval.
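- **.npz**: A zip archive that bundles multiple `.npy` files, so several arrays can be stored in one file and read back with `numpy.load()`. `numpy.savez()` writes the archive uncompressed; `numpy.savez_compressed()` writes a compressed one.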
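To make the point about dtype and shape concrete, here is a small sketch that saves an array to a `.npy` file and loads it back. The array, the file name `example.npy`, and the temporary directory are assumptions chosen for the example.

```python
import os
import tempfile
import numpy as np

# A small array with a non-default dtype and a 2D shape.
original = np.arange(6, dtype=np.int16).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.npy")  # hypothetical file name
    np.save(path, original)
    restored = np.load(path)

# The dtype and shape come back without being specified again.
print(restored.dtype, restored.shape)
print(np.array_equal(original, restored))
```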

### Other formats (via external libraries):
- **.xls and .xlsx**: Not directly supported by NumPy. They can be read with external libraries such as `pandas` (or `openpyxl` for `.xlsx` files), and the result can then be converted into a NumPy array.
- **.h5**: HDF5 files can be handled by NumPy indirectly using the `h5py` library to interact with HDF5 data.



In [26]:
import numpy as np

### np.save()

In [42]:
lending_co = np.genfromtxt("Lending-Company-Saving.csv", 
                           delimiter = ',', 
                           dtype = str)

print(type(lending_co))

## We're just importing a dataset, so we can save it later. 
## Usually, we will be working with an array already, so we could skip this. 

<class 'numpy.ndarray'>


In [28]:
np.save("Lending-Company-Saving", lending_co)

## Create a .npy file with the data from the lending_co array.
## np.save() appends the .npy extension automatically.

In [29]:
lending_data_save = np.load("Lending-Company-Saving.npy")

## Load the .npy file we just created.
## np.load() reads NumPy's binary format, whereas np.genfromtxt() parses text files.

In [30]:
print(lending_data_save)

[['LoanID' 'StringID' 'Product' ... 'Location' 'Region' 'TotalPrice']
 ['1' 'id_1' 'Product B' ... 'Location 2' 'Region 2' '16600.0']
 ['2' 'id_2' 'Product B' ... 'Location 3' '' '16600.0']
 ...
 ['1041' 'id_1041' 'Product B' ... 'Location 23' 'Region 4' '16600.0']
 ['1042' 'id_1042' 'Product C' ... 'Location 52' 'Region 6' '15600.0']
 ['1043' 'id_1043' 'Product B' ... 'Location 142' 'Region 6' '16600.0']]


In [None]:
np.array_equal(lending_data_save, lending_co)

# The original array is identical to the one we saved and then loaded back into Python. 

### np.savez()
The np.savez() function in NumPy saves multiple arrays into a single uncompressed file in the .npz format (np.savez_compressed() does the same with compression). The .npz format is a zip archive that contains one .npy file per array. This is useful when you want to store multiple arrays in a single file.
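The difference between the uncompressed and compressed variants can be sketched as follows. The arrays, the keyword names `zeros` and `counts`, and the file names are assumptions made for the example; the file sizes you see will depend on your data.

```python
import os
import tempfile
import numpy as np

# Illustrative arrays: highly repetitive data compresses well.
a = np.zeros((100, 100))
b = np.arange(5)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "plain.npz")    # hypothetical names
    packed_path = os.path.join(tmp_dir, "packed.npz")

    np.savez(plain_path, zeros=a, counts=b)              # uncompressed archive
    np.savez_compressed(packed_path, zeros=a, counts=b)  # compressed archive

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

    # NpzFile works as a context manager, so the archive is closed afterwards.
    with np.load(packed_path) as archive:
        names = sorted(archive.files)
        counts = archive["counts"]

print(names)
print(plain_size, packed_size)
```

Both variants are read back the same way with `np.load()`, so switching between them only changes the saving call.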

In [31]:
lending_co = np.genfromtxt("Lending-Company-Saving.csv", 
                           delimiter = ',',
                           dtype = str) 

lending_data_save = np.load('Lending-Company-Saving.npy') 

# Just getting two arrays we want to store (we import one, and load the other)

In [43]:
file1 = np.genfromtxt("file1.csv",
                      delimiter=',',
                      dtype=str)
file2 = np.genfromtxt("file2.csv",
                      delimiter=',',
                      dtype=str)
print(type(file1))
np.savez("file3", file1, file2)   # np.savez() appends the .npz extension
file4 = np.load('file3.npz')
print(type(file4))                # an NpzFile, not an ndarray
# print(file4["arr_0"])           # contents of file1
# print(file4["arr_1"])           # contents of file2

<class 'numpy.ndarray'>
<class 'numpy.lib.npyio.NpzFile'>


In [32]:
np.savez("Lending-Company-Saving", lending_co, lending_data_save)

# Creates the .npz file, which is an archive of .npy files. 

In [33]:
lending_data_savez = np.load('Lending-Company-Saving.npz')

# np.load() also reads .npz files. It returns an NpzFile object that we index by array name.

In [34]:
print(lending_data_savez["arr_1"])

# np.savez() assigns default names (arr_0, arr_1, ...) to the .npy files inside the archive,
# in the order the arrays were passed.

[['LoanID' 'StringID' 'Product' ... 'Location' 'Region' 'TotalPrice']
 ['1' 'id_1' 'Product B' ... 'Location 2' 'Region 2' '16600.0']
 ['2' 'id_2' 'Product B' ... 'Location 3' '' '16600.0']
 ...
 ['1041' 'id_1041' 'Product B' ... 'Location 23' 'Region 4' '16600.0']
 ['1042' 'id_1042' 'Product C' ... 'Location 52' 'Region 6' '15600.0']
 ['1043' 'id_1043' 'Product B' ... 'Location 142' 'Region 6' '16600.0']]


In [None]:
np.savez("Lending-Company-Saving", company = lending_co, data_save = lending_data_save) 

# Assign custom recognizable names to the individual .npy files in the .npz

In [None]:
lending_data_savez = np.load("Lending-Company-Saving.npz")

In [None]:
lending_data_savez.files

# Shows the names of all the .npy files stored in the .npz

In [None]:
print(lending_data_savez["data_save"])

In [None]:
np.array_equal(lending_data_savez["company"],lending_data_savez["data_save"])

# Even after saving and loading the datasets back into Python, they are still identical.

### np.savetxt()
The np.savetxt() function in NumPy saves a 1D or 2D array to a text file. The output is a delimited text file (e.g., CSV or space-separated values).
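Before working with the string-typed lending data, it may help to see np.savetxt() on a small numeric array. This is only a sketch: the values, the header text, and the file name `prices.csv` are assumptions made for the example.

```python
import os
import tempfile
import numpy as np

# Illustrative numeric data (not taken from the lending dataset file).
prices = np.array([[1, 16600.0],
                   [2, 15600.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "prices.csv")  # hypothetical file name

    # fmt controls how each value is written; comments="" keeps the
    # header line from being prefixed with "# ".
    np.savetxt(path, prices,
               fmt="%.1f",
               delimiter=",",
               header="LoanID,TotalPrice",
               comments="")

    with open(path) as handle:
        first_line = handle.readline().strip()

    # Skip the header row when reading the numbers back.
    restored = np.loadtxt(path, delimiter=",", skiprows=1)

print(first_line)
print(restored)
```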



In [None]:
lending_co = np.genfromtxt("Lending-Company-Saving.csv",
                           delimiter = ',',
                           dtype = str) 

In [None]:
np.savetxt("Lending-Company-Saving.txt", 
           lending_co, 
           fmt = '%s', 
           delimiter = ',')

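# We must specify the file extension (txt or csv); np.savetxt() does not add one.
# We must specify the format: the default ('%.18e') is for floats and fails on strings, so we use '%s'.
# We must set a delimiter: the default is a space, so we pass a comma to get CSV-style output.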

In [None]:
lending_data_savetxt = np.genfromtxt("Lending-Company-Saving.txt", 
                                     delimiter = ',', 
                                     dtype = str)  # np.str was removed in NumPy 1.24; use the built-in str

print(lending_data_savetxt)

# We're importing the .txt file we just created.

In [None]:
lending_data_save = np.load("Lending-Company-Saving.npy")

In [None]:
np.array_equal(lending_data_savetxt, lending_data_save)

# Compare the array read back from the .txt file with the one loaded from the .npy file.