# Why NumPy

Effective data-driven science and computation requires understanding how data is
stored and manipulated. Here we'll outline and contrast how arrays of data are
handled in the Python language itself, and how NumPy improves on this. Understanding
this difference is fundamental to understanding much of the material
throughout the rest of the course.

## Difference between a Python integer and a C integer

![image.png](attachment:image.png)

A single integer in Python 3.4 actually contains four pieces:
- ob_refcnt, a reference count that helps Python silently handle memory allocation
  and deallocation
- ob_type, which encodes the type of the variable
- ob_size, which specifies the size of the following data members
- ob_digit, which contains the actual integer value that we expect the Python variable
  to represent

Here PyObject_HEAD is the part of the structure containing the reference count, type
code, and other pieces mentioned above.

The standard Python implementation is written in C. This means that every Python
object is simply a cleverly disguised C structure, which contains not only its value, but
other information as well.
- This is why Python can be dynamically typed, unlike C and other statically typed languages.
- This flexibility, however, comes at a cost.
- The cost is more memory and more time to reach the integer's value during computation.
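
The memory overhead described above can be observed directly. The following is a minimal sketch, assuming CPython and an installed NumPy; the exact size reported for a Python integer depends on the interpreter version and platform, so no specific value is claimed here. The variable names are illustrative only.

```python
import sys
import numpy as np

# Size in bytes of a full Python integer object (header fields plus digits).
# The exact number depends on the CPython version and platform.
py_int_bytes = sys.getsizeof(1)

# Size in bytes of a raw 64-bit integer, as a C long long or a NumPy int64
# element would store it.
c_int_bytes = np.dtype(np.int64).itemsize

print("Python int object:", py_int_bytes, "bytes")
print("Raw 64-bit integer:", c_int_bytes, "bytes")
```

The difference between the two numbers is the bookkeeping (reference count, type pointer, size field) that every Python object carries.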

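## Python Lists

Now consider a Python data structure that holds many Python objects. The standard mutable multi-element container in Python is the list.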

In [1]:
L = list(range(10))
L

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [2]:
type(L[0])

int

In [3]:
L2 = [str(c) for c in L]
L2

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

In [4]:
type(L2[0])

str

Because of Python's dynamic typing, we can even create heterogeneous lists:

In [5]:
L3 = [True, "2", 3.0, 4]
[type(item) for item in L3]

[bool, str, float, int]

But this flexibility comes at a cost: to allow these flexible types, each item in the list
must contain its own type info, reference count, and other information; that is, each
item is a complete Python object. In the special case that all variables are of the same
type, much of this information is redundant: it can be much more efficient to store
data in a fixed-type array.

![image.png](attachment:image.png)

At the implementation level, the array essentially contains a single pointer to one contiguous
block of data. The Python list, on the other hand, contains a pointer to a
block of pointers, each of which in turn points to a full Python object like the Python
integer we saw earlier. Again, the advantage of the list is flexibility: because each list
element is a full structure containing both data and type information, the list can be
filled with data of any desired type. Fixed-type NumPy-style arrays lack this flexibility,
but are much more efficient for storing and manipulating data.
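
To make the trade-off concrete, here is a small sketch that compares the approximate memory footprint of a list and an array holding the same integers, and shows what "fixed-type" means in practice. It assumes CPython and NumPy; the list estimate simply adds the size of the pointer block to the size of each integer object, so treat it as a rough approximation rather than an exact measurement.

```python
import sys
import numpy as np

n = 1000
values = list(range(n))
arr = np.arange(n, dtype=np.int64)

# Rough list footprint: the block of pointers plus every integer object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# Array data footprint: one contiguous block of n 8-byte integers.
array_bytes = arr.nbytes

print("list  (approx.):", list_bytes, "bytes")
print("array (data):   ", array_bytes, "bytes")

# A fixed-type array holds a single element type: mixing ints and floats
# upcasts every element to float64 instead of storing mixed objects.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)
```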

In [6]:
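l1 = [1, 2, 3]
l2 = [2, 3, 4]
# l1 + l2 concatenates the lists instead of adding element-wise: [1, 2, 3, 2, 3, 4]
# l1 * l2 raises a TypeError: a list cannot be multiplied by another list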

In [7]:
l3 = []
for i, j in zip(l1,l2):
    l3.append(i+j)
l3    

[3, 5, 7]

In [8]:
l3 = [(i+j) for i, j in zip(l1,l2)]
l3

[3, 5, 7]

In [9]:
import numpy as np

In [10]:
a1 = np.array([1,2,3])
a2 = np.array([2,3,4])
a3 = a1+a2
a3

array([3, 5, 7])

In [11]:
a1*a2

array([ 2,  6, 12])

In [12]:
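size = 1000000
# Plain Python sequences: range objects that yield ordinary Python ints
l1 = range(size)
l2 = range(size)

# The same values stored as NumPy arrays
a1 = np.arange(size)
a2 = np.arange(size)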



In [13]:
%timeit [(i+j) for i, j in zip(l1,l2)]

389 ms ± 5.69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [14]:
%timeit a1+a2

6.81 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
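
In the run above, the NumPy addition is roughly 57 times faster than the list comprehension (389 ms versus 6.81 ms). The `%timeit` magic is only available in IPython and Jupyter; outside a notebook, a similar comparison can be sketched with the standard `timeit` module. The helper names `add_lists` and `add_arrays` are illustrative, and the size is reduced here only to keep the run short. Actual timings will differ from machine to machine.

```python
import timeit
import numpy as np

size = 100_000  # smaller than the notebook's 1,000,000 to keep the run short
l1 = list(range(size))
l2 = list(range(size))
a1 = np.arange(size)
a2 = np.arange(size)

def add_lists():
    # Element-wise addition with a Python-level loop
    return [i + j for i, j in zip(l1, l2)]

def add_arrays():
    # Element-wise addition performed inside NumPy's compiled code
    return a1 + a2

# Total time in seconds for 10 calls of each function
list_time = timeit.timeit(add_lists, number=10)
array_time = timeit.timeit(add_arrays, number=10)

print(f"lists:  {list_time:.4f} s for 10 runs")
print(f"arrays: {array_time:.4f} s for 10 runs")
```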


## Arrays in NumPy
NumPy’s main object is the homogeneous multidimensional array.

- It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
- In NumPy, dimensions are called axes. The number of axes is sometimes called the rank and is reported by the ndim attribute.
- NumPy’s array class is called ndarray. It is also known by the alias array.
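
As a brief illustration of indexing by a tuple and of axes, consider the sketch below. The array `grid` and the other variable names are made up for this example.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

# One index per axis: row 1, column 2
element = grid[1, 2]

# Writing the index explicitly as a tuple selects the same element
same_element = grid[(1, 2)]

# Reductions can be applied along a chosen axis
column_sums = grid.sum(axis=0)  # collapses axis 0 (rows)
row_sums = grid.sum(axis=1)     # collapses axis 1 (columns)

print(element, same_element)
print(column_sums)
print(row_sums)
```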

In [15]:
# Creating array object 
arr = np.array( [[ 1, 2, 3], 
                 [ 4, 2, 5]] ) 
  
# Printing type of arr object 
print("Array is of type: ", type(arr)) 
  
# Printing array dimensions (axes) 
print("No. of dimensions: ", arr.ndim) 
  
# Printing shape of array 
print("Shape of array: ", arr.shape) 
  
# Printing size (total number of elements) of array 
print("Size of array: ", arr.size) 
  
# Printing type of elements in array 
print("Array stores elements of type: ", arr.dtype) 

Array is of type:  <class 'numpy.ndarray'>
No. of dimensions:  2
Shape of array:  (2, 3)
Size of array:  6
Array stores elements of type:  int32


## Array creation

There are various ways to create arrays in NumPy.

- For example, you can create an array from a regular Python list or tuple using the array function. The type of the resulting array is deduced from the type of the elements in the sequences.
- Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content, for example np.zeros, np.ones, np.full, and np.empty. These minimize the necessity of growing arrays, an expensive operation.
- To create sequences of numbers, NumPy provides functions analogous to range that return arrays:
  - arange: returns evenly spaced values within a given interval; the step size is specified.
  - linspace: returns evenly spaced values within a given interval; the number of elements (num) is specified.
- Reshaping array: We can use the reshape method to reshape an array. Consider an array with shape (a1, a2, a3, …, aN). We can reshape and convert it into another array with shape (b1, b2, b3, …, bM). The only required condition is a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM (i.e., the total number of elements remains unchanged).
- Flatten array: We can use the flatten method to get a copy of the array collapsed into one dimension. It accepts an order argument. The default value is ‘C’ (for row-major order). Use ‘F’ for column-major order.
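
A few of the functions listed above do not appear in the cells that follow, so here is a short sketch of np.ones, np.empty, and np.linspace, together with an arange call for contrast. The variable names are illustrative. Note that np.empty leaves the memory uninitialized, so its contents are arbitrary and should not be relied upon.

```python
import numpy as np

# 2x3 array filled with ones (float64 by default)
ones = np.ones((2, 3))

# 2x2 array whose contents are uninitialized; only the shape is meaningful
scratch = np.empty((2, 2))

# arange: the step is given, and the stop value is excluded
stepped = np.arange(0, 1, 0.25)

# linspace: the number of points is given, and the endpoint is included by default
spaced = np.linspace(0, 1, 5)

print(ones)
print(scratch.shape)
print(stepped)
print(spaced)
```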

In [16]:
# Creating array from list with type float 
a = np.array([[1, 2, 4], [5, 8, 7]], dtype = 'float') 
print ("Array created using passed list:\n", a) 

Array created using passed list:
 [[1. 2. 4.]
 [5. 8. 7.]]


In [17]:
# Creating array from tuple 
b = np.array((1 , 3, 2)) 
print ("\nArray created using passed tuple:\n", b) 


Array created using passed tuple:
 [1 3 2]


In [18]:
# Creating a 3X4 array with all zeros 
c = np.zeros((3, 4)) 
print ("\nAn array initialized with all zeros:\n", c) 


An array initialized with all zeros:
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [19]:
# Create a constant value array of complex type 
d = np.full((3, 3), 6, dtype = 'complex') 
print ("\nAn array initialized with all 6s. "
       "Array type is complex:\n", d) 


An array initialized with all 6s. Array type is complex:
 [[6.+0.j 6.+0.j 6.+0.j]
 [6.+0.j 6.+0.j 6.+0.j]
 [6.+0.j 6.+0.j 6.+0.j]]


In [20]:
# Create an array with random values 
e = np.random.random((2, 2)) 
print ("\nA random array:\n", e) 


A random array:
 [[0.67399594 0.80347422]
 [0.25866756 0.56680733]]


In [21]:
# Create a sequence of integers  
# from 0 to 30 with steps of 5 
f = np.arange(0, 30, 5) 
print ("\nA sequential array with steps of 5:\n", f) 


A sequential array with steps of 5:
 [ 0  5 10 15 20 25]


In [22]:
# Reshaping 3X4 array to 2X2X3 array 
arr = np.array([[1, 2, 3, 4], 
                [5, 2, 4, 2], 
                [1, 2, 0, 1]]) 
newarr = arr.reshape(2, 2, 3) 
print ("\nOriginal array:\n", arr) 
print ("Reshaped array:\n", newarr) 


Original array:
 [[1 2 3 4]
 [5 2 4 2]
 [1 2 0 1]]
Reshaped array:
 [[[1 2 3]
  [4 5 2]]

 [[4 2 1]
  [2 0 1]]]
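
The size condition for reshape can be checked with a small sketch: a 3x4 array has 12 elements, so it can become 2x6 but not 5x3. The flag `reshape_failed` is only an illustrative name.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)  # 3 x 4 = 12 elements

# 2 x 6 = 12, so this reshape is valid
ok = arr.reshape(2, 6)
print(ok.shape)

# 5 x 3 = 15 does not match 12, so NumPy raises a ValueError
try:
    arr.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```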


In [23]:
# Flatten array 
arr = np.array([[1, 2, 3], [4, 5, 6]]) 
flarr = arr.flatten() 
  
print ("\nOriginal array:\n", arr) 
print ("Flattened array:\n", flarr) 


Original array:
 [[1 2 3]
 [4 5 6]]
Flattened array:
 [1 2 3 4 5 6]
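
The order argument and the fact that flatten returns a copy can be seen in the following sketch, which reuses the same 2x3 array. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Default order 'C': read row by row
row_major = arr.flatten()

# Order 'F': read column by column
col_major = arr.flatten(order='F')

print(row_major)
print(col_major)

# flatten returns a copy, so modifying the result leaves the original untouched
row_major[0] = 99
print(arr[0, 0])
```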
