# NumPy





*   Stands for Numerical Python
*   Is the fundamental package required for high performance computing and data analysis
*   Most computational packages providing scientific functionality use NumPy’s array objects as the lingua franca for data exchange.



# NumPy Basics

**Efficiency in Numerical Computations**
- NumPy is crucial for numerical computations in Python due to its efficiency, especially when working with large arrays of data.

**Multidimensional Arrays with `ndarray`**
- The cornerstone of NumPy is the `ndarray`, a powerful data structure that allows the creation of arrays with multiple dimensions, enabling efficient manipulation of numerical data.

**Contiguous Memory Storage**
- NumPy internally stores data in a contiguous block of memory. This design, independent of other built-in Python objects, results in significantly reduced memory usage compared to native Python sequences.

**Standard Math Functions for Array Operations**
- NumPy provides a set of standard math functions that operate efficiently on entire arrays of data. This eliminates the need for explicit loops, streamlining and accelerating numerical operations.

**Vectorization for Batch Operations**
- NumPy arrays let you express batch operations on data without writing explicit for loops. This practice, known as **vectorization**, improves both code readability and execution speed for array-based operations.
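
The idea of vectorization can be sketched with a small comparison. This is only an illustration: the array `values` and the squaring operation are arbitrary choices, not part of the original notebook.

```python
import numpy as np

values = np.arange(1, 6)  # the integers 1 through 5

# Loop version: square each element one at a time in Python.
squares_loop = np.empty_like(values)
for i in range(len(values)):
    squares_loop[i] = values[i] ** 2

# Vectorized version: one expression applied to the whole array.
squares_vec = values ** 2

print(squares_vec)                                # [ 1  4  9 16 25]
print(np.array_equal(squares_loop, squares_vec))  # True
```

Both versions produce the same result; the vectorized one is shorter and runs its loop in compiled code.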


### **Why NumPy is so efficient**

*   Stores data internally in a contiguous block of memory, which makes it fast to access.
*   Has an algorithm library written in C that can operate on this memory without type-checking or other overhead.
*   NumPy arrays use much less memory than built-in Python sequences.
*   NumPy operations perform complex computations on entire arrays without the need for Python loops.
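
The memory claim can be checked roughly as follows. This is a sketch under assumptions: the size `n = 1000` is arbitrary, and the exact byte count for the list depends on the Python build, so only the comparison is shown for it.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate Python int objects, so count both
# the list itself and the objects it refers to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores the raw values back to back: n elements * 8 bytes each.
array_bytes = np_arr.nbytes

print(array_bytes)               # 8000
print(list_bytes > array_bytes)  # True
```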


![image.png](attachment:image.png)

![image.png](attachment:image.png)

# Importing NumPy

In [1]:
import numpy as np

# np is alias for numpy in python

In [2]:
data= np.random.randn(2,3)
data

array([[ 1.10085817,  0.7760125 ,  0.2429144 ],
       [-0.25506499, -0.09650424,  1.16586651]])

In [3]:
np.random.seed(3)  # seed() returns None, so its result should not be assigned to data
data= np.random.randn(2,3)
data

array([[ 1.78862847,  0.43650985,  0.09649747],
       [-1.8634927 , -0.2773882 , -0.35475898]])

In [4]:
# import random
np.random.seed(1042)
data= np.random.randn(2)
data

array([-0.8526478 , -0.43149649])

In [5]:
np.random.seed(1042)
data= np.random.randn(2)
data

array([-0.8526478 , -0.43149649])

In [6]:
print(data)

[-0.8526478  -0.43149649]


In [7]:
data*10

array([-8.52647803, -4.31496487])

In [8]:
data+data

array([-1.70529561, -0.86299297])

An ndarray is a generic multidimensional container for homogeneous data; that is, all
of the elements must be the same type. Every array has a **shape**, a **tuple** indicating the
size of each dimension, and a **dtype**, an **object** describing the data type of the array:

In [9]:
data.shape

(2,)

In [10]:
data.dtype

dtype('float64')

In [11]:
data.ndim

1
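
For comparison, the same attributes on a two-dimensional array look like this. The array `matrix` is a made-up example, and `size` (the total number of elements) is an extra attribute not discussed above.

```python
import numpy as np

matrix = np.zeros((2, 3))  # 2 rows, 3 columns

print(matrix.shape)  # (2, 3)
print(matrix.ndim)   # 2
print(matrix.dtype)  # float64
print(matrix.size)   # 6
```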

### Different ways to create ndarray
1. `array` - Creates an array from a list. A list of lists produces a higher-dimensional array.
2. `empty` - Creates an array without initializing its values, so it may contain arbitrary garbage values.
3. `zeros` - Creates an array initialized with zeros.
4. `arange` - Creates an array initialized with a range of values.

For `empty` and `zeros`, pass the size of the array. To create multi-dimensional arrays, pass the shape as a tuple.

In [12]:
python_list=[3,2,0.5,7,9,10]
array=np.array(python_list)
array

array([ 3. ,  2. ,  0.5,  7. ,  9. , 10. ])

In [13]:
type(array)

numpy.ndarray

In [14]:
print(array.ndim)

1
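
A list of lists, by contrast, becomes a two-dimensional array. A minimal sketch, where the name `nested` and its values are illustrative only:

```python
import numpy as np

nested = [[1, 2, 3, 4], [5, 6, 7, 8]]  # two inner lists of equal length
arr2d = np.array(nested)

print(arr2d.ndim)   # 2
print(arr2d.shape)  # (2, 4)
```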


## Creating arrays from scratch

#### Creating arrays: np.zeros()

In [15]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [16]:
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

*   `zeros` and `ones` create arrays of 0s or 1s, respectively, with a given length or shape.

*   `empty` creates an array without initializing its values to any particular value.
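
`np.ones` is called the same way as `np.zeros`. A short sketch, with arbitrary sizes:

```python
import numpy as np

ones_1d = np.ones(4)       # length given as an int
ones_2d = np.ones((2, 3))  # shape given as a tuple

print(ones_1d)        # [1. 1. 1. 1.]
print(ones_2d.shape)  # (2, 3)
```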

## Empty

It’s not safe to assume that np.empty will return an array of all
zeros. In some cases, it may return uninitialized “garbage” values.

In [41]:
np.empty((2,3,1))

array([[[ 0.],
        [ 4.],
        [ 1.]],

       [[ 7.],
        [ 2.],
        [12.]]])
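
A typical pattern is to allocate with `np.empty` and then overwrite every element before reading any of them. The sketch below assumes a made-up fill rule (`i * 0.5`) purely for illustration:

```python
import numpy as np

result = np.empty(5)     # contents are arbitrary at this point
for i in range(5):
    result[i] = i * 0.5  # overwrite every slot before reading it

print(result)  # [0.  0.5 1.  1.5 2. ]
```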

**Random vs. Empty:** The `np.random` module generates random numbers and random arrays drawn from various probability distributions. It is useful when you need arrays of random values for simulations, statistical analyses, or other applications that require randomness.

In contrast, the `np.empty` function creates an array without initializing its values. It allocates memory for the array, but the contents are whatever happened to be in that memory. It is useful when you need to allocate an array quickly and plan to fill it with specific values later, and it can be more efficient than creating an initialized array when you don't need the initial values.

The two calls described next, `np.random.random((2, 4))` and `np.random.randn(2, 3)`, both generate random numbers, but they draw from different distributions.

**np.random.random((2, 4)):**

This line creates a NumPy array with shape (2, 4) filled with random floats sampled from a uniform distribution over the half-open interval [0.0, 1.0). The random function generates values from a uniform distribution.

**data = np.random.randn(2, 3):**

This line creates a NumPy array with shape (2, 3) filled with random floats sampled from a standard normal distribution (mean=0, standard deviation=1). The randn function generates values from a standard normal distribution.
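
The difference between the two functions can be checked with a short sketch. The seed value is an arbitrary choice made only so the run is repeatable:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, for repeatability only

uniform_vals = np.random.random((2, 4))  # uniform on [0.0, 1.0)
normal_vals = np.random.randn(2, 3)      # standard normal, may be negative

in_range = ((uniform_vals >= 0.0) & (uniform_vals < 1.0)).all()

print(uniform_vals.shape)  # (2, 4)
print(in_range)            # True
print(normal_vals.shape)   # (2, 3)
```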

The np.random.random function generates random floats in the half-open interval [0.0, 1.0). This means that the generated values can take any float value from 0.0 (inclusive) to 1.0 (exclusive). In mathematical notation, this can be expressed as:

$0.0 \le \text{random value} < 1.0$

So, the range of values produced by np.random.random is from 0.0 (inclusive) to 1.0 (exclusive).

If you want random values from a different range, you can use the `np.random.uniform` function, which lets you specify the lower and upper bounds of the desired range.
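
A minimal sketch of `np.random.uniform` with explicit bounds. The bounds 5.0 and 10.0, the shape, and the seed are arbitrary values chosen for illustration:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, for repeatability only

low, high = 5.0, 10.0
samples = np.random.uniform(low, high, size=(2, 3))

within_bounds = ((samples >= low) & (samples < high)).all()

print(samples.shape)  # (2, 3)
print(within_bounds)  # True
```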

# Arange

`arange` is an array-valued version of the built-in Python `range` function.

In [18]:
np.arange(5) 

array([0, 1, 2, 3, 4])
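
Like `range`, `np.arange` also accepts start, stop, and step arguments. Unlike `range`, it accepts a floating-point step. A short sketch with arbitrary values:

```python
import numpy as np

evens = np.arange(2, 10, 2)       # start=2, stop=10 (exclusive), step=2
quarters = np.arange(0, 1, 0.25)  # float step, which range() does not allow

print(evens)     # [2 4 6 8]
print(quarters)  # [0.   0.25 0.5  0.75]
```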

Since NumPy is focused on numerical computing, the data type, if not specified, will in many cases be `float64` (floating point). `np.arange` is an exception: with integer arguments, as above, it returns an integer array.
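
The default dtypes can be inspected directly. In this sketch, `dtype.kind` is used for the integer cases because the exact integer width (for example `int32` or `int64`) depends on the platform:

```python
import numpy as np

zeros_dtype = np.zeros(3).dtype         # creation functions default to float64
mixed_dtype = np.array([1, 2.5]).dtype  # int mixed with float becomes float64
arange_kind = np.arange(5).dtype.kind   # 'i' means a signed integer type
ints_kind = np.array([1, 2, 3]).dtype.kind

print(zeros_dtype)  # float64
print(mixed_dtype)  # float64
print(arange_kind)  # i
print(ints_kind)    # i
```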

| Function      | Description                                       | Details and Conditions                               | Input Type               |
|---------------|---------------------------------------------------|--------------------------------------------------------|--------------------------|
| `array`       | Create an array from a given object.              | Input can be a list, tuple, or array-like object.      | Array-like, List, Tuple  |
| `asarray`     | Convert input to an array.                        | If input is already an array, it is returned as is.     | Array-like, ndarray      |
| `arange`      | Return evenly spaced values within a given range. | Similar to Python's built-in `range`.                   | Scalar, int, float       |
| `ones`        | Return a new array of given shape and type, filled with ones. | Specify shape as a tuple, e.g., `(2, 3)`.          | Tuple (shape)            |
| `ones_like`   | Return an array of ones with the same shape and type as a given array. | Takes an existing array as an argument.          | ndarray                  |
| `zeros`       | Return a new array of given shape and type, filled with zeros. | Specify shape as a tuple, e.g., `(2, 3)`.         | Tuple (shape)            |
| `zeros_like`  | Return an array of zeros with the same shape and type as a given array. | Takes an existing array as an argument.        | ndarray                  |
| `empty`       | Return a new uninitialized array.                | Shape must be specified. Values are not initialized.   | Tuple (shape)            |
| `empty_like`  | Return a new uninitialized array with the same shape and type as a given array. | Takes an existing array as an argument.        | ndarray                  |
| `full`        | Return a new array of given shape and type, filled with a fill value. | Specify shape and fill value, e.g., `(2, 3), 7`.     | Tuple (shape), Scalar    |
| `full_like`   | Return an array with the same shape and type as a given array, filled with a fill value. | Takes an existing array as an argument and a fill value. | ndarray, Scalar          |
| `eye`, `identity` | Return a 2D array with ones on the diagonal and zeros elsewhere. | Creates an identity matrix with specified size.      | int                      |
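
A few of the functions in the table, sketched on a made-up array named `template`:

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])

ones_like = np.ones_like(template)    # same shape and dtype as template
zeros_like = np.zeros_like(template)
sevens = np.full((2, 2), 7)           # 2x2 array filled with 7
identity3 = np.eye(3)                 # 3x3 identity matrix
same = np.asarray(template)           # already an ndarray, so no copy is made

print(ones_like.shape)   # (2, 3)
print(sevens)
print(identity3)
print(same is template)  # True
```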


### NumPy ndarray vs list

One of the key features of NumPy is its N-dimensional array object, or ndarray, which is a fast, flexible container for large datasets in Python.  
Whenever you see “array,” “NumPy array,” or “ndarray” in the text, with few exceptions they all refer to the same thing: the ndarray object.
NumPy-based algorithms are generally 10 to 100 times faster (or more) than their pure Python counterparts and use significantly less memory.



*   `ndarray` is used for storage of homogeneous data, i.e., all elements are the same type.
*   Every array must have a shape and a dtype.
*   Supports convenient slicing, indexing, and efficient vectorized computation.


**Python lists**

Can contain many different data types

In [19]:
python_list = ["beep", False, 56, .945, [3, 2, 5]]

**NumPy arrays**

Can contain only a single data type and use less space in memory

In [20]:
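# Wrap the lists in np.array so these are actual NumPy arrays, not Python lists
numpy_boolean_array = np.array([[True, False], [True, True], [False, True]])
numpy_float_array = np.array([1.9, 5.4, 8.8, 3.6, 3.2])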

In [21]:
import numpy as np
my_arr = np.arange(1000000) # NumPy array
my_list = list(range(1000000))

The `%time` magic command is used in Jupyter notebooks or IPython environments to measure the execution time of a single statement. In the examples below, each loop runs its operation 10 times, and the total time for all 10 runs is reported.

The purpose of using the underscore (_) as the loop variable is to indicate that the loop variable itself is not being used in the loop body. It's a common convention in Python to use _ as a variable name when the variable is intentionally unused, and it helps to make the code more readable and self-explanatory.

In [22]:
# Time for NumPy operation
%time for _ in range(10): my_arr2 = my_arr * 2

CPU times: total: 31.2 ms
Wall time: 32.9 ms


In [23]:
# Time for List operation
%time for _ in range(10): my_list2 = [x * 2 for x in my_list]

CPU times: total: 984 ms
Wall time: 988 ms


#### Huge difference in run time between the two: here NumPy is roughly 30 times faster (32.9 ms vs. 988 ms wall time), within the typical 10x to 100x range.
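
Outside Jupyter or IPython, where `%time` is not available, a similar measurement can be sketched with the standard-library `timeit` module. The timings will differ from machine to machine, so no expected values are shown:

```python
import timeit
import numpy as np

my_arr = np.arange(1_000_000)
my_list = list(range(1_000_000))

# Each call runs the operation 10 times and returns the total seconds.
numpy_seconds = timeit.timeit(lambda: my_arr * 2, number=10)
list_seconds = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print(f"NumPy: {numpy_seconds:.3f} s, list: {list_seconds:.3f} s")
```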

# NumPy Data Types
*   A dtype is a special object containing the information (metadata) the ndarray needs to interpret a chunk of memory as a particular data type.
*   dtypes make NumPy flexible when interacting with data from other systems, because they map directly onto an underlying disk or memory representation.
*   This makes it easy to read and write binary streams of data to disk and to connect to code written in low-level languages like C.
*   NumPy tries to infer a good data type for any array that it creates.
*   The naming convention is the type of data followed by the number of bits per element, e.g. int32, float64.
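
The naming convention can be verified from the dtype itself: `itemsize` reports bytes per element, so multiplying by 8 gives the number of bits in the name. The three dtypes below are arbitrary examples:

```python
import numpy as np

bits = {}
for name in ("int8", "int32", "float64"):
    dt = np.dtype(name)
    bits[name] = dt.itemsize * 8  # itemsize is in bytes
    print(name, bits[name])
```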

### Assigning dtype while creating array

In [25]:
arr1 = np.array([1,2,3], dtype = np.float64)
arr1

array([1., 2., 3.])

In [26]:
arr2 = np.array([1,2,3], dtype = np.int32)
arr2

array([1, 2, 3])

In [27]:
arr1.dtype

dtype('float64')

In [28]:
arr2.dtype

dtype('int32')

*   We can convert, or cast, an array from one dtype to another with `astype`.
*   We can also pass the dtype of another array directly.
*   Data may be lost when converting a larger dtype to a smaller one.
*   Data can also be lost because of the nature of the data, e.g. with the fixed-size `string_` type.
*   If casting fails, for example when a string cannot be parsed as a number, a ValueError is raised.
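
Two of these cases can be sketched with made-up values: a number too large for the target dtype, and a string that cannot be parsed as a float.

```python
import numpy as np

# Data loss: int8 can only hold -128..127, so 300 does not survive the cast.
big = np.array([300, 5], dtype=np.int64)
small = big.astype(np.int8)
print(small)  # the first element is no longer 300; the second is still 5

# Failure: 'abc' cannot be parsed as a float, so astype raises ValueError.
cast_failed = False
try:
    np.array(["1.5", "abc"]).astype(float)
except ValueError as exc:
    cast_failed = True
    print("ValueError:", exc)
```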

In [29]:
arr = np.array([1,2,3,4])
arr.dtype

dtype('int32')

### Changing dtype of already existing array

In [30]:
float_arr = arr.astype(np.float64)  #Calling astype always creates a new array (a copy of the data), even if the new dtype is the same as the old dtype.
float_arr

array([1., 2., 3., 4.])

In [31]:
arr.astype(float_arr.dtype)

array([1., 2., 3., 4.])
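
That `astype` returns a copy, even when the dtype is unchanged, can be checked with a small sketch. The names and values are illustrative:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
same_dtype_copy = arr.astype(arr.dtype)  # same dtype, but still a new array

same_dtype_copy[0] = 99  # modify the copy only

print(arr[0])                                  # 1
print(np.shares_memory(arr, same_dtype_copy))  # False
```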

In [32]:
arr = np.array([3.4, -1.4, -4.2, 0.4, 10.4])
arr.astype(np.int32) 

array([ 3, -1, -4,  0, 10])

In [33]:
# np.bytes_ is used here because its alias np.string_ was removed in NumPy 2.0
strings = np.array(['1.23','-9.6','43'], dtype=np.bytes_)

strings.astype(float)

array([ 1.23, -9.6 , 43.  ])

---
## Arithmetic with NumPy arrays
*   Arrays let you express batch operations without writing for loops.
*   Arithmetic operations between equal-sized arrays are applied element-wise.
*   Scalar operations propagate the scalar argument to each element in the array.
*   Comparisons between equal-sized arrays yield boolean arrays.
*   Operations between differently sized arrays are called broadcasting.

---

### Between arrays of equal size

In [34]:
arr = np.array([[1.,2.,3.],[4.,5.,6.]])
arr

array([[1., 2., 3.],
       [4., 5., 6.]])

In [35]:
arr * arr

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [36]:
arr - arr

array([[0., 0., 0.],
       [0., 0., 0.]])

### Arithmetic operations with scalars

Arithmetic operations with scalars propagate the scalar argument to each element in the array.

In [37]:
1 / arr

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [38]:
arr ** 0.5

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])

In [39]:
arr2 = np.array([[0., 4., 1.],[7., 2., 12.]])
arr2

array([[ 0.,  4.,  1.],
       [ 7.,  2., 12.]])

### Comparisons between arrays of the same size yield boolean arrays

In [40]:
arr2 > arr  # compare with arr from In [34], which has the same shape as arr2

array([[False,  True, False],
       [ True, False,  True]])
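
The result of such a comparison is itself an ndarray with dtype `bool` and the same shape as its inputs. A self-contained sketch, where `a` and `b` repeat the values of the two equal-sized arrays used above:

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])
b = np.array([[0., 4., 1.], [7., 2., 12.]])

mask = b > a    # element-wise comparison
equal = a == b  # other comparison operators work the same way

print(mask.dtype)   # bool
print(mask.shape)   # (2, 3)
print(equal.any())  # False
```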

Operations between differently sized arrays are called **broadcasting**. They will be discussed later, as they are not needed for now.

![Weekend_funny_post.PNG](attachment:Weekend_funny_post.PNG)

**Let's connect** 
> **Course Instructor:** Zartashia Afzal: https://www.linkedin.com/in/zartashiaafzal/
>
> 
> **Moderators:**
> Muhammad Qasim Ali: https://www.linkedin.com/in/muhammad-qasim-ali/
>
> Ayesha Mehboob: https://www.linkedin.com/in/ayesha-mehboob-379643284/
  
