# Today's Agenda
> ### What is an Array?
> ### What is NumPy?
> ### What is the use of NumPy?
> ### What is data analysis?
> ### What is the role of NumPy in Data analysis?

NumPy (**Num**erical **Py**thon) is an open-source Python library that’s used in almost every field of science and engineering.   
NumPy can be used to perform a wide variety of **mathematical operations** on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.   

It comes with a great number of built-in functions.     

An **array** is a collection of data items of the same type stored at contiguous memory locations.

NumPy is a Python library used for working with arrays. NumPy arrays are called **ndarray or N-dimensional arrays**, and they store elements of the **same type** and size. NumPy is known for its high performance and provides efficient storage and data operations as arrays grow in size.

We use NumPy arrays that contain only **homogeneous elements**, i.e. elements having the same data type. This makes storing and manipulating the array more efficient, and the difference from lists becomes apparent when the array has a large number of elements, say thousands or millions. Also, with NumPy arrays you can perform **element-wise operations** with ordinary operators, something Python lists do not support directly.  
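
As a small illustration of element-wise behaviour, the sketch below applies the same operator to a list and to an array. The variable names are made up for this example only.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

# On a list, * repeats the sequence instead of multiplying each element
repeated = values * 2

# On an ndarray, * is applied to every element
doubled = arr * 2

print(repeated)   # [1, 2, 3, 1, 2, 3]
print(doubled)    # [2 4 6]
print(arr.dtype)  # a single dtype shared by all elements (the exact integer type depends on the platform)
```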

An array with one dimension is called a Vector, while an array with two dimensions is called a Matrix.    

NumPy is used to work with arrays. The array object in NumPy is called **ndarray**.  

We have lists that serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.
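
The speed difference can be measured rather than taken on faith. The following is a rough sketch using the standard `timeit` module; the array size and repeat count are arbitrary choices, and the actual ratio depends on the machine and the operation, so no particular speed-up is assumed here.

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
# int64 is requested explicitly so the squares do not overflow on platforms
# where the default integer is 32 bits
np_arr = np.arange(n, dtype=np.int64)

# Square every element: list comprehension vs. vectorized array operation
list_time = timeit.timeit(lambda: [x * x for x in py_list], number=20)
array_time = timeit.timeit(lambda: np_arr * np_arr, number=20)

print(f"list:  {list_time:.4f} s")
print(f"array: {array_time:.4f} s")
```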

 NumPy’s array class is called ndarray. It is also known by the alias array. Note that numpy.array is not the same as the Standard Python Library class array.array, which only handles one-dimensional arrays and offers less functionality.
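
To make the distinction concrete, here is a minimal sketch that creates one object of each kind. The variable names are illustrative only.

```python
import array
import numpy as np

std_arr = array.array('i', [1, 2, 3])      # standard library: one-dimensional only
np_arr = np.array([[1, 2, 3], [4, 5, 6]])  # NumPy: any number of dimensions

print(type(std_arr))  # <class 'array.array'>
print(type(np_arr))   # <class 'numpy.ndarray'>
print(np_arr.ndim)    # 2
```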

**Data Manipulation with numpy** 

You can perform standard mathematical operations on either individual elements or complete arrays.   
The functions cover linear algebra, statistical operations, and other specialized mathematical operations.  
For our purpose, we need to know about ndarray and the mathematical functions that are relevant to our research.   
If you already know languages such as C or Fortran, you can integrate NumPy code with code written in these languages and pass NumPy arrays between them.   

### Possible applications of the NumPy package in research work are:

+ Algorithmic operations such as sorting, grouping and set operations
+ Performing repetitive operations on whole arrays of data without using loops
+ Data merging and alignment operations
+ Data indexing, filtering, and transformation on individual elements or whole arrays
+ Data summarization and descriptive statistics
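
A few of the operations listed above can be sketched in a handful of lines. The `marks` data is hypothetical and chosen only for illustration.

```python
import numpy as np

marks = np.array([67, 45, 89, 45, 72, 91])  # hypothetical data

sorted_marks = np.sort(marks)    # sorting
unique_marks = np.unique(marks)  # set operation: distinct values, sorted
passed = marks[marks >= 50]      # filtering with a boolean mask
scaled = marks / 100             # operation on the whole array, no loop
average = marks.mean()           # descriptive statistic

print(sorted_marks)  # [45 45 67 72 89 91]
print(unique_marks)  # [45 67 72 89 91]
print(passed)        # [67 89 72 91]
print(average)
```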

![image.png](attachment:image.png)

## Installing NumPy
To check whether NumPy is installed, go to the Package Manager and type NumPy. You will get a list of packages whose names closely match NumPy. For our purpose, we need the package named numpy 1.xx. If the package is not installed, click Install.

In [None]:
!pip install numpy

In [2]:
numpy.array([1,2])

NameError: name 'numpy' is not defined

In [3]:
import numpy

The above statement makes the whole NumPy package available under the name numpy. That is fine for getting started. If you only need a few names, you can bind them directly into your namespace. Note that this still loads the full package, so it saves typing rather than memory:

In [4]:
from numpy import array
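
For reference, a quick check of what the two import styles bind. This is only a sketch using the standard `sys.modules` registry.

```python
import sys
import numpy
from numpy import array

# Both names refer to the same function object
print(array is numpy.array)    # True

# The package itself is registered as loaded with either import style
print('numpy' in sys.modules)  # True
```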

In [1]:
import numpy as np


In [6]:
print(np.__version__)

1.20.3


In [7]:
np.array([1,2,3])

array([1, 2, 3])

In [None]:
# np.<Tab>  (type "np." and press Tab in Jupyter to list the available names)

In [9]:
np.abs([-1,-2])

array([1, 2])

In [10]:
np.absolute([-1,-2])

array([1, 2])

In [11]:
help(np.absolute)

Help on ufunc:

absolute = <ufunc 'absolute'>
    absolute(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
    
    Calculate the absolute value element-wise.
    
    ``np.abs`` is a shorthand for this function.
    
    Parameters
    ----------
    x : array_like
        Input array.
    out : ndarray, None, or tuple of ndarray and None, optional
        A location into which the result is stored. If provided, it must have
        a shape that the inputs broadcast to. If not provided or None,
        a freshly-allocated array is returned. A tuple (possible only as a
        keyword argument) must have length equal to the number of outputs.
    where : array_like, optional
        This condition is broadcast over the input. At locations where the
        condition is True, the `out` array will be set to the ufunc result.
        Elsewhere, the `out` array will retain its original value.
        Note that if an uninitialized 

In [13]:
np.absolute([-1,-2])

array([1, 2])

In [17]:
import numpy as np

numpy_array= np.arange(2000) 

# print(numpy_array)
print(numpy_array.size)

2000


In [18]:
print(numpy_array.itemsize)

4


In [19]:
print(numpy_array.size*numpy_array.itemsize)

8000


Python containers such as lists and dicts are versatile, allowing for the storage of heterogeneous elements. However, each value in these containers is its own object and reserves its own space in memory. The container just holds references to those locations in memory. This is not an efficient way to store data of the same type. A NumPy array reserves one contiguous block of memory and stores values of the same size (in bytes) side by side, which is more compact and faster when you need to read the values.
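
The storage difference can be estimated with `sys.getsizeof`. This is a rough sketch: the list figure is an approximation (CPython shares some small integer objects, and exact sizes vary by version), and `int64` is requested explicitly so the array figure is the same on every platform.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores references; each int is a separate Python object
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values side by side: size * itemsize
array_bytes = np_arr.nbytes

print(list_bytes)   # platform-dependent, several times larger
print(array_bytes)  # 8000
```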

NumPy arrays are:

+ more compact, especially when there’s more than one dimension
+ faster than lists when the operation can be vectorized
+ slower than lists when you append elements to the end
+ usually homogeneous: can only work fast with elements of one type


NumPy arrays cannot grow the way a Python list does: no space is reserved at the end of the array to facilitate quick appends. So it is common practice either to grow a Python list and convert it to a NumPy array when it is ready, or to preallocate the necessary space with np.zeros or np.empty.  
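
Both approaches can be sketched as follows. The squares being stored are just placeholder values for this example.

```python
import numpy as np

# Option 1: grow a list, then convert once at the end
collected = []
for i in range(5):
    collected.append(i * i)
from_list = np.array(collected)

# Option 2: preallocate the array, then fill it in place
prealloc = np.zeros(5, dtype=int)
for i in range(5):
    prealloc[i] = i * i

print(from_list)  # [ 0  1  4  9 16]
print(prealloc)   # [ 0  1  4  9 16]
```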

To create an ndarray, we can pass a list, tuple or any array-like object to the np.array() function, and it will be converted into an ndarray:   
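
A minimal sketch of the conversion from different array-like inputs (the variable names are illustrative):

```python
import numpy as np

from_list = np.array([1, 2, 3])
from_tuple = np.array((1, 2, 3))
from_nested = np.array([(1, 2), (3, 4)])  # a nested sequence becomes a 2-D array

print(type(from_tuple))   # <class 'numpy.ndarray'>
print(from_nested.shape)  # (2, 2)
```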

Nested arrays are arrays that have arrays as their elements.   

In NumPy, dimensions are called axes. 
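
The axis numbering matters as soon as an operation has to pick a direction. A short sketch with a 2-D array, using `sum` as the example operation:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.ndim)         # 2
print(m.sum(axis=0))  # sums down the rows, one result per column: [5 7 9]
print(m.sum(axis=1))  # sums across the columns, one result per row: [ 6 15]
```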

In [22]:
# 0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.  

arr=np.array(42)

print(arr)

42


In [23]:
arr.ndim

0

In [24]:
help(np.ndim)

Help on function ndim in module numpy:

ndim(a)
    Return the number of dimensions of an array.
    
    Parameters
    ----------
    a : array_like
        Input array.  If it is not already an ndarray, a conversion is
        attempted.
    
    Returns
    -------
    number_of_dimensions : int
        The number of dimensions in `a`.  Scalars are zero-dimensional.
    
    See Also
    --------
    ndarray.ndim : equivalent method
    shape : dimensions of array
    ndarray.shape : dimensions of array
    
    Examples
    --------
    >>> np.ndim([[1,2,3],[4,5,6]])
    2
    >>> np.ndim(np.array([[1,2,3],[4,5,6]]))
    2
    >>> np.ndim(1)
    0



In [25]:
print(arr.ndim)

0


In [26]:
print(type(arr))

<class 'numpy.ndarray'>


In [27]:
help(arr)

Help on ndarray object:

class ndarray(builtins.object)
 |  ndarray(shape, dtype=float, buffer=None, offset=0,
 |          strides=None, order=None)
 |  
 |  An array object represents a multidimensional, homogeneous array
 |  of fixed-size items.  An associated data-type object describes the
 |  format of each element in the array (its byte-order, how many bytes it
 |  occupies in memory, whether it is an integer, a floating point number,
 |  or something else, etc.)
 |  
 |  Arrays should be constructed using `array`, `zeros` or `empty` (refer
 |  to the See Also section below).  The parameters given here refer to
 |  a low-level method (`ndarray(...)`) for instantiating an array.
 |  
 |  For more information, refer to the `numpy` module and examine the
 |  methods and attributes of an array.
 |  
 |  Parameters
 |  ----------
 |  (for the __new__ method; see Notes below)
 |  
 |  shape : tuple of ints
 |      Shape of created array.
 |  dtype : data-type, optional
 |      Any objec

In [28]:
np.array([1,2,3])

array([1, 2, 3])

In [29]:
list1 = [1,2,3]
nump_arr = np.array(list1)

nump_arr

array([1, 2, 3])

In [30]:
# 1-D array : An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.   

arr = np.array([1, 2, 3, 4, 5])
print(arr.ndim)
print(arr.size)
print(arr.shape)

1
5
(5,)


![image.png](attachment:image.png)

In [31]:
arr = np.array([10, 20, 30, 40, 50])

print(arr.ndim)
print(arr.size)
print(arr.shape)

1
5
(5,)


In [32]:
arr[2]

30

### Accessing arrays

In [33]:
a= [1,2,3]
b = [4,5,6]

np.array([a,b])

array([[1, 2, 3],
       [4, 5, 6]])

In [34]:
#2-D array : An array that has 1-D arrays as its elements is called a 2-D array.
#These are often used to represent matrix or 2nd order tensors.    

arr = np.array([[3, 5, 9], [4,8, 2]])  
print(arr.ndim)
print(arr.size)
print(arr.shape)
print(arr)

2
6
(2, 3)
[[3 5 9]
 [4 8 2]]


In [35]:
arr

array([[3, 5, 9],
       [4, 8, 2]])

In [36]:
arr = np.array([[3, 5, 9], [4,8, 2]])
arr

array([[3, 5, 9],
       [4, 8, 2]])

In [37]:
arr[0][0]

3

In [38]:
# Access the 2nd element of the 1st row (row index 0, column index 1):
arr[0, 1]

5

In [39]:
# Access the 3rd element of the 2nd row (row index 1, column index 2):
arr[1, 2]

2

In [40]:
print(arr[0:2,1])

[5 8]


In [41]:
arr

array([[3, 5, 9],
       [4, 8, 2]])

####  Exercise
Get output as  
1. [8,2]  
2. [9,2]  

In [42]:
arr[ 1 , 1: ]

array([8, 2])

In [43]:
arr[ :, 2]

array([9, 2])

In [44]:
# 3-D array : An array that has 2-D arrays (matrices) as its elements is called 3-D array.
#These are often used to represent a 3rd order tensor. 

arr = np.array([[[1, 2, 3], [4, 5, 6]], 
                [[7, 8, 9], [10, 11, 12]]])
print(arr.ndim)
print(arr.size)
print(arr.shape)
print(arr)

3
12
(2, 2, 3)
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]


In [45]:
arr[0][0][0]

1

In [47]:
arr = np.array([[3, 5, 9], [4,8, 2]])
arr

array([[3, 5, 9],
       [4, 8, 2]])

### Exercise

1. [5,8]
2. [8]
3. [3,4]
4. [5,9]

In [48]:
arr[ 0:2  , 1 ]

array([5, 8])

In [49]:
arr[ 1 , 1]

8

In [50]:
arr[ 0:2 , 0]

array([3, 4])

In [51]:
arr[ 0 , 1:]

array([5, 9])

# Understanding Dimensions

### Scalar quantity (0 dimensions), e.g.
Revan's marks in Data Structures = 87.50   

### 1-Dimension Array, or a Vector, e.g.
Mohit's marks = [  98, 76, 65, 85, 90  ]   

### 2-Dimension Array
Student 1 :  28	100	31	82	35    
Student 2 : 96	12	25	10	70    
Student 3 : 89	67	10	73	50  


### 3-Dimension Array  
Batch 70:  
Student 1 :  28	100	31	82	35    
Student 2 : 96	12	25	10	70     
Student 3 : 89	67	10	73	50   

Batch 71:  
Student 1 :	83	63	80	56	67     
Student 2 :	73	74	77	88	69     
Student 3 :	97	85	94	63	87   
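
The marks above can be written directly as a 3-D array. In this sketch the axis labels in the comments (batches, students, marks per student) are an interpretation of the layout shown, not something the data itself states.

```python
import numpy as np

batch_70 = [[28, 100, 31, 82, 35],
            [96, 12, 25, 10, 70],
            [89, 67, 10, 73, 50]]
batch_71 = [[83, 63, 80, 56, 67],
            [73, 74, 77, 88, 69],
            [97, 85, 94, 63, 87]]

marks = np.array([batch_70, batch_71])

print(marks.shape)     # (2, 3, 5): batches, students, marks per student
print(marks[0].shape)  # (3, 5): one batch is a 2-D array
print(marks[1, 2])     # Batch 71, Student 3: [97 85 94 63 87]
print(marks[0, 0, 1])  # Batch 70, Student 1, 2nd mark: 100
```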

![image.png](attachment:image.png)

![image.png](attachment:image.png)

In [52]:
help(np.array)

Help on built-in function array in module numpy:

array(...)
    array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
          like=None)
    
    Create an array.
    
    Parameters
    ----------
    object : array_like
        An array, any object exposing the array interface, an object whose
        __array__ method returns an array, or any (nested) sequence.
    dtype : data-type, optional
        The desired data-type for the array.  If not given, then the type will
        be determined as the minimum type required to hold the objects in the
        sequence.
    copy : bool, optional
        If true (default), then the object is copied.  Otherwise, a copy will
        only be made if __array__ returns a copy, if obj is a nested sequence,
        or if a copy is needed to satisfy any of the other requirements
        (`dtype`, `order`, etc.).
    order : {'K', 'A', 'C', 'F'}, optional
        Specify the memory layout of the array. If object is not an array

In [53]:
arr = np.array([1, 2, 3, 4], ndmin=5)

In [54]:
arr

array([[[[[1, 2, 3, 4]]]]])

An array can have any number of dimensions.

When the array is created, you can define the number of dimensions by using the ndmin argument.   

In this array the innermost dimension (5th dim) has 4 elements; the 4th dim has 1 element, which is that vector; the 3rd dim has 1 element, which is the matrix containing the vector; the 2nd dim has 1 element, which is a 3-D array; and the 1st dim has 1 element, which is a 4-D array. 
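
The nesting described above can be verified from the array's shape. A minimal sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr.ndim)          # 5
print(arr.shape)         # (1, 1, 1, 1, 4)
print(arr[0, 0, 0, 0])   # the innermost vector: [1 2 3 4]
```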

In [55]:
# Creating Arrays With a Defined Data Type

arr1 = np.array([1, 2, 3, 4], dtype='float')  

arr2 = np.array([1.1, 2.1, 3.1],dtype="int32")

#arr3 = arr1.astype(int)

In [56]:
arr1

array([1., 2., 3., 4.])

In [57]:
arr2

array([1, 2, 3])

In [58]:
arr3 = arr1.astype(int)
arr3

array([1, 2, 3, 4])

In [59]:
arr2

array([1, 2, 3])

Since NumPy arrays can contain only homogeneous datatypes, values will be upcast if the types do not match:



In [61]:
np.array([1,2.0,3,4])

array([1., 2., 3., 4.])

In [62]:
np.array([1,2.0,3,"4"])

array(['1', '2.0', '3', '4'], dtype='<U32')

In [63]:
list_1 = [3, 6, 7, 5]
list_1

[3, 6, 7, 5]

In [64]:
# Square a list

list_squared = [i**2 for i in list_1]

print(list_squared)

[9, 36, 49, 25]


In [65]:
list_1

[3, 6, 7, 5]

In [66]:
np.array([3,5,7,9])

array([3, 5, 7, 9])

In [67]:
list_1

[3, 6, 7, 5]

In [68]:
list_1**2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [75]:
arr1

array([1., 2., 3., 4.])

In [76]:
arr1**2

array([ 1.,  4.,  9., 16.])

In [71]:
# Square a numpy array

array_1=np.array(list_1)

array_squared = array_1**2

print(array_squared)

[ 9 36 49 25]


In [72]:
list_1 = [3, 6, 7, 5]
list_2 = [4, 5, 1, 7]

# The list way to do it: map a function over the two lists to add them element by element

sum_list = list(map(lambda x, y: x+y, list_1, list_2))
print(sum_list)

[7, 11, 8, 12]


In [73]:
# The NumPy array way to do it: simply add the two arrays

array_1 = np.array(list_1)
array_2 = np.array(list_2)

array_3 = array_1+array_2

print(array_3)
#print(type(array_3))

[ 7 11  8 12]


In [9]:
a = np.array([1,3,5,7,9,10,13,15])
print(a)

[ 1  3  5  7  9 10 13 15]


In [10]:
print(a[:3])

[1 3 5]


In [17]:
a3 = np.arange(1,31)

In [18]:
a3.shape

(30,)

In [19]:
a4 = np.reshape(a3,(5,6))
print(a4)

[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]
 [19 20 21 22 23 24]
 [25 26 27 28 29 30]]


In [14]:
a4[1,1:3]

array([8, 9])

In [13]:
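# Row index 3, column index 5. a4[2,6] would raise an IndexError because axis 1 has only 6 columns (indices 0 to 5)
a4[3,5]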

24

In [43]:
# a5 = np.random.randint(1000,9999,10)
# print(a5)
np.reshape( np.random.randint(1000,9999,10),(2,5))

array([[1014, 3191, 9446, 6425, 5477],
       [3308, 8824, 8921, 8157, 4122]])

In [53]:
np.random.rand(5)

array([0.78059695, 0.60690386, 0.34156226, 0.94976734, 0.76260458])

In [54]:
np.random.randn(5)

array([ 1.59872556, -1.93361479, -1.05054019,  1.54629544, -1.69248288])

In [55]:
a4.flatten()

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30])

In [57]:
a4.flat[4:8]

array([5, 6, 7, 8])