### Arrays

The `array` module in Python's standard library provides arrays that store values of a single numeric type compactly, which makes them more memory-efficient than lists. Lists are flexible and can hold any type of object, but that flexibility costs memory and speed when dealing with large amounts of numerical data.

In [18]:
import array
import sys

ll = [ii for ii in range(10_000_000)]
aa = array.array("i", range(10_000_000))


print(f"type(ll): {type(ll)}")
print(f"type(aa): {type(aa)}")


type(ll): <class 'list'>
type(aa): <class 'array.array'>


In [19]:

# Get the size in bytes of each container object (for ll, this excludes the int objects it references).
ll_size = sys.getsizeof(ll)
aa_size = sys.getsizeof(aa)


print(f"Size of ll: {ll_size:,.0f} bytes")
print(f"Size of aa: {aa_size:,.0f} bytes")


Size of ll: 89,095,160 bytes
Size of aa: 40,970,224 bytes
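The list figure above understates the real cost: `sys.getsizeof` on a list counts only the list object and its references, not the integer objects those references point to. The following is a rough sketch of a fuller accounting, assuming CPython and using a much smaller collection (1,000 elements, chosen here only to keep the example fast):

```python
import array
import sys

n = 1_000  # Small size chosen for illustration only.
ll = list(range(n))
aa = array.array("i", range(n))

# getsizeof(ll) covers the list header plus one reference per element.
list_container = sys.getsizeof(ll)

# Each int is a separate object; add their sizes for a rough total.
# (CPython caches small ints, so treat this as an approximation.)
list_total = list_container + sum(sys.getsizeof(x) for x in ll)

# An array stores raw C values inline: itemsize bytes per element.
array_payload = aa.itemsize * len(aa)

print(f"list container only: {list_container:,} bytes")
print(f"list + int objects : {list_total:,} bytes")
print(f"array payload      : {array_payload:,} bytes")
print(f"array total        : {sys.getsizeof(aa):,} bytes")
```

The exact numbers depend on the platform and Python version, so no expected output is shown.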


<br>

**When to use array vs. list:**

- **When You Need Memory Efficiency**: If you need to store a large number of numerical values and memory usage is a concern, arrays are a good choice.

- **When You Need Consistent Type Elements**: If you want to ensure that all elements in your collection are of the same type, use arrays.

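- **When You Need Performance Optimization**: For workloads dominated by large volumes of numerical data, the compact storage of arrays can offer performance benefits, such as faster copying and bulk reads and writes. Note that the array module itself does not provide element-wise arithmetic.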
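As a minimal sketch of the type-consistency point, the example below shows an integer array rejecting values that a list would accept. The variable names `rejected` and `bad` are illustrative only:

```python
import array

aa = array.array("i", [1, 2, 3])  # "i" = signed C int
aa.append(4)  # Accepted: an int that fits in a C int.

rejected = []  # Names of the exceptions raised for each bad value.

# A float, a string, and an int far too large for a C int.
for bad in (2.5, "five", 2**70):
    try:
        aa.append(bad)
    except (TypeError, OverflowError) as exc:
        rejected.append(type(exc).__name__)
        print(f"rejected {bad!r}: {type(exc).__name__}")

# A list accepts any mix of types without complaint.
ll = [1, 2, 3]
ll.extend([2.5, "five", 2**70])

print(aa)
print(ll)
```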

<br>

In practice, the array module is rarely used. Instead, practitioners opt for [NumPy](https://numpy.org/doc/stable/index.html), a powerful library for numerical computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on those arrays efficiently. NumPy is widely used in scientific computing, data analysis, and machine learning due to its performance and ease of use.

Note that NumPy is not included as part of the standard library. It needs to be installed separately using pip:

```sh
$ pip install numpy
```


In [20]:

import numpy as np

# Create a NumPy array holding the integers 0 through 9.
arr = np.asarray(range(10))

print(f"type(arr): {type(arr)}")
print(f"arr: {arr}")

type(arr): <class 'numpy.ndarray'>
arr: [0 1 2 3 4 5 6 7 8 9]


NumPy makes use of vectorization, a technique that performs an operation on an entire array at once instead of iterating through the data element by element in a Python loop. The loop still happens, but it runs in compiled code over contiguous, uniformly typed memory, where it can take advantage of low-level optimizations such as the SIMD instructions of modern CPUs. This leads to significant performance improvements.

For example, to add 10 to each element of the arr array, we run:

In [21]:

arr + 10


array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

Let's compare the runtime required to add 10 to each element of a 10,000,000-element Python list vs. a NumPy array:

In [15]:

# 10M-element list.
ll = [i for i in range(10_000_000)]

# 10M-element NumPy array.
aa = np.asarray(range(10_000_000))


In [10]:

%timeit -n 5 [ii + 10 for ii in ll]


546 ms ± 74.2 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)


In [11]:

%timeit -n 5 aa + 10


19.1 ms ± 990 µs per loop (mean ± std. dev. of 7 runs, 5 loops each)


In [12]:

# Speedup: mean list-comprehension runtime / mean NumPy runtime (both in ms).
546 / 19.1

28.58638743455497
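`%timeit` is an IPython magic and is not available in a plain Python script. A rough equivalent using the standard-library `timeit` module might look like the sketch below. The array size is reduced to 1,000,000 only to keep the run short, and the variable names are illustrative:

```python
import timeit

import numpy as np

n = 1_000_000  # Smaller than the 10M used above, to keep the run short.
ll = list(range(n))
aa = np.arange(n)

# Run each statement 5 times and report the average time per loop.
loops = 5
list_time = timeit.timeit(lambda: [ii + 10 for ii in ll], number=loops) / loops
numpy_time = timeit.timeit(lambda: aa + 10, number=loops) / loops

print(f"list comprehension: {list_time * 1e3:.1f} ms per loop")
print(f"NumPy addition    : {numpy_time * 1e3:.1f} ms per loop")
print(f"speedup           : {list_time / numpy_time:.1f}x")
```

Timings vary by machine and NumPy version, so the speedup will not necessarily match the figure above.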

Use NumPy arrays when you need efficient numerical computations, are working with large datasets, require advanced mathematical functions or need to perform operations on multidimensional data. NumPy arrays provide significant performance and memory efficiency advantages over Python lists, making them the preferred choice for scientific computing and data analysis tasks.
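As a brief illustration of multidimensional data and built-in mathematical functions, the sketch below reshapes a range of 12 integers into a 3x4 array and aggregates along each axis. The name `mat` is illustrative only:

```python
import numpy as np

# A 3x4 two-dimensional array holding the values 0 through 11.
mat = np.arange(12).reshape(3, 4)

print(mat.shape)         # (3, 4)
print(mat.sum(axis=0))   # Column sums: [12 15 18 21]
print(mat.mean(axis=1))  # Row means: [1.5 5.5 9.5]
print(np.sqrt(mat[0]))   # Element-wise square root of the first row.
```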

- [Absolute Beginner's Guide to NumPy](https://numpy.org/doc/stable/user/absolute_beginners.html)
- [100 NumPy Exercises](https://github.com/amberjrivera/numpy-100/blob/master/100%20Numpy%20exercises.ipynb)

