# NumPy

* NumPy is a powerful Python library for numerical computing.

* NumPy provides a multi-dimensional array object and a collection of functions for working with these arrays.

* It is widely used in data science for tasks such as data manipulation and mathematical operations.



* NumPy's official documentation (https://numpy.org/doc/) is an excellent resource to dive deeper and learn more about the library.


## Installation


`pip install numpy`

or 

`conda install -c anaconda numpy`

To use NumPy, you need to import it in your Python script or notebook:

`import numpy as np`

## Creating NumPy Arrays

NumPy arrays are similar to Python lists, but they store elements of a single data type and can have multiple dimensions. Here are a few ways to create NumPy arrays:

* From a Python list:

In [2]:
import numpy as np

my_list = [1, 2, 3, 4, 5]
arr = np.array(my_list)
print(arr)

[1 2 3 4 5]


* Using the arange function:

In [3]:
arr = np.arange(1, 6)  # Creates an array from 1 to 5
print(arr)

[1 2 3 4 5]


* Using the zeros function:

In [11]:
arr = np.zeros(5)  # Creates an array of zeros with length 5
print(arr)

[0. 0. 0. 0. 0.]


* Using the ones function:

In [14]:
arr = np.ones((2, 4))  # Creates a 2x4 array of ones
print(arr)


[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
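
The creation functions above also cover multi-dimensional arrays, custom step sizes, and other element types. The following is a minimal sketch; the variable names `matrix`, `evens`, and `int_zeros` are chosen here only for illustration.

```python
import numpy as np

# A nested list becomes a 2-D array with 2 rows and 3 columns
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim)   # 2
print(matrix.shape)  # (2, 3)

# arange accepts a step as its third argument; the stop value is still excluded
evens = np.arange(0, 10, 2)
print(evens)         # [0 2 4 6 8]

# zeros and ones produce floats by default; dtype requests another element type
int_zeros = np.zeros((2, 3), dtype=np.int32)
print(int_zeros.dtype)  # int32
```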


## Accessing Array Elements

* You can access elements of a NumPy array using indexing, similar to Python lists. 

* Remember, the indexing starts from 0.

In [16]:
arr = np.array([1, 2, 3, 4, 5])
print(arr[0])  # Accessing the first element
print(arr[2:4])  # Accessing a slice from index 2 to 4 (exclusive)


1
[3 4]
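
Indexing extends naturally to negative indices, strided slices, and two-dimensional arrays. The sketch below uses made-up arrays (`arr` and `grid`) to show a few common patterns.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
print(arr[-1])   # last element: 50
print(arr[::2])  # every second element: [10 30 50]

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(grid[1, 2])  # row 1, column 2: 6
print(grid[0])     # first row: [1 2 3]
print(grid[:, 0])  # first column: [1 4]
```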


## Array Operations

NumPy arrays support arithmetic that is applied element-wise, as well as mathematical functions that work on each element or summarize a whole array.

* Element-wise operations:

In [17]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Addition
result = arr1 + arr2
print(result)

# Multiplication
result = arr1 * arr2
print(result)


[5 7 9]
[ 4 10 18]


* Mathematical functions:

In [18]:
arr = np.array([1, 2, 3, 4, 5])

# Square root
result = np.sqrt(arr)
print(result)

# Mean
result = np.mean(arr)
print(result)


[1.         1.41421356 1.73205081 2.         2.23606798]
3.0
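
Element-wise arithmetic also works when the operands have different shapes, as long as NumPy can broadcast one to the other. The sketch below illustrates this with a scalar and with a row added to a 2-D array, and contrasts it with `np.dot`, which reduces two arrays to a single number. The array `matrix` is an assumption introduced for the example.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# A scalar is broadcast to every element
scaled = arr1 * 10
print(scaled)  # [10 20 30]

# A 1-D array is broadcast across each row of a 2-D array
matrix = np.array([[1, 2, 3], [4, 5, 6]])
shifted = matrix + arr1
print(shifted)
# [[2 4 6]
#  [5 7 9]]

# The dot product multiplies element-wise and sums: 1*4 + 2*5 + 3*6
dot = np.dot(arr1, arr2)
print(dot)  # 32
```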


## Array Shape and Reshaping

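* You can check the shape of a NumPy array using the shape attribute, which returns a tuple of (rows, columns) for a 2-D array.

* You can change the shape with the reshape method, as long as the new shape holds the same number of elements.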

In [20]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)  # Prints (2, 3)


print("---")
reshaped_arr = arr.reshape(3, 2)
print(reshaped_arr)


(2, 3)
---
[[1 2]
 [3 4]
 [5 6]]
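
Two details of reshaping are worth a short sketch: passing `-1` lets NumPy infer one dimension, and a shape with the wrong number of elements raises a `ValueError`. The names `two_rows` and `flat` are illustrative only.

```python
import numpy as np

arr = np.arange(1, 7)  # [1 2 3 4 5 6]

# -1 asks NumPy to work out the missing dimension (here 3 columns)
two_rows = arr.reshape(2, -1)
print(two_rows.shape)  # (2, 3)

# ravel flattens the array back to one dimension
flat = two_rows.ravel()
print(flat)  # [1 2 3 4 5 6]

# 6 elements cannot fill a 4x2 array, so this raises ValueError
try:
    arr.reshape(4, 2)
except ValueError as exc:
    print("Cannot reshape:", exc)
```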


## Basic Data Exploration with NumPy

The sections below work with the following sample array of 35 rows and 4 columns.

In [35]:
data = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
    [5.4, 3.9, 1.7, 0.4],
    [4.6, 3.4, 1.4, 0.3],
    [5.0, 3.4, 1.5, 0.2],
    [4.4, 2.9, 1.4, 0.2],
    [4.9, 3.1, 1.5, 0.1],
    [5.4, 3.7, 1.5, 0.2],
    [4.8, 3.4, 1.6, 0.2],
    [4.8, 3.0, 1.4, 0.1],
    [4.3, 3.0, 1.1, 0.1],
    [5.8, 4.0, 1.2, 0.2],
    [5.7, 4.4, 1.5, 0.4],
    [5.4, 3.9, 1.3, 0.4],
    [5.1, 3.5, 1.4, 0.3],
    [5.7, 3.8, 1.7, 0.3],
    [5.1, 3.8, 1.5, 0.3],
    [5.4, 3.4, 1.7, 0.2],
    [5.1, 3.7, 1.5, 0.4],
    [4.6, 3.6, 1.0, 0.2],
    [5.1, 3.3, 1.7, 0.5],
    [4.8, 3.4, 1.9, 0.2],
    [5.0, 3.0, 1.6, 0.2],
    [5.0, 3.4, 1.6, 0.4],
    [5.2, 3.5, 1.5, 0.2],
    [5.2, 3.4, 1.4, 0.2],
    [4.7, 3.2, 1.6, 0.2],
    [4.8, 3.1, 1.6, 0.2],
    [5.4, 3.4, 1.5, 0.4],
    [5.2, 4.1, 1.5, 0.1],
    [5.5, 4.2, 1.4, 0.2],
    [4.9, 3.1, 1.5, 0.2]])


## Aggregation Functions

* NumPy provides various aggregation functions that summarize data, such as sum(), mean(), median(), min(), max(), var(), and std(). 

* These functions can be applied along specific axes or across the entire array.

In [36]:


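# Mean of all values in the array
mean = np.mean(data)
print("mean =", mean)

# Standard deviation of all values
std_dev = np.std(data)
print("std dev =", std_dev)

# Smallest and largest values
min_val = np.min(data)
max_val = np.max(data)
print("min val =", min_val)
print("max val =", max_val)


mean = 2.5585714285714283
std dev = 1.8607212777494075
min val = 0.1
max val = 5.8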
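
The calls above aggregate over every value in the array. To aggregate along an axis instead, pass the `axis` argument: `axis=0` collapses the rows and gives one result per column, while `axis=1` gives one result per row. The sketch below uses a three-row array named `sample`, a name introduced here for illustration, so that the shapes are easy to follow.

```python
import numpy as np

sample = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
])

# axis=0: one mean per column
col_means = sample.mean(axis=0)
print(col_means.shape)  # (4,)

# axis=1: one maximum per row
row_max = sample.max(axis=1)
print(row_max.shape)    # (3,)

# Functions such as np.median accept the same axis argument
col_medians = np.median(sample, axis=0)
print(col_medians)
```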


## Filtering Data
NumPy provides the capability to filter data based on specific conditions using boolean indexing. 



In [39]:
filtered_data = data[data > 0.5]

print(filtered_data)

[5.1 3.5 1.4 4.9 3.  1.4 4.7 3.2 1.3 4.6 3.1 1.5 5.  3.6 1.4 5.4 3.9 1.7
 4.6 3.4 1.4 5.  3.4 1.5 4.4 2.9 1.4 4.9 3.1 1.5 5.4 3.7 1.5 4.8 3.4 1.6
 4.8 3.  1.4 4.3 3.  1.1 5.8 4.  1.2 5.7 4.4 1.5 5.4 3.9 1.3 5.1 3.5 1.4
 5.7 3.8 1.7 5.1 3.8 1.5 5.4 3.4 1.7 5.1 3.7 1.5 4.6 3.6 1.  5.1 3.3 1.7
 4.8 3.4 1.9 5.  3.  1.6 5.  3.4 1.6 5.2 3.5 1.5 5.2 3.4 1.4 4.7 3.2 1.6
 4.8 3.1 1.6 5.4 3.4 1.5 5.2 4.1 1.5 5.5 4.2 1.4 4.9 3.1 1.5]
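
Applying a condition to the whole 2-D array returns a flat 1-D result, as in the output above. To keep whole rows, build the mask from a single column instead. The sketch below assumes a small array named `sample` and shows how to combine conditions with `&` (each condition needs its own parentheses).

```python
import numpy as np

sample = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
])

# Mask built from the first column selects entire rows
mask = sample[:, 0] > 4.8
rows = sample[mask]
print(rows.shape)  # (2, 4)

# Combine conditions with & (and) or | (or)
both = sample[(sample[:, 0] > 4.8) & (sample[:, 1] < 3.4)]
print(both)

# np.where returns the indices of the rows that match
matching_rows = np.where(mask)[0]
print(matching_rows)  # [0 1]
```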


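NumPy can also compute percentiles, the values below which a given percentage of the data falls.

In [32]:
p_25 = np.percentile(data, 25)
p_75 = np.percentile(data, 75)

print(f"25th percentile: {p_25:.3f}")
print(f"75th percentile: {p_75:.3f}")

25th percentile: 0.875
75th percentile: 4.325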
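
`np.percentile` also accepts a list of percentiles, which is convenient for computing quartiles in one call. The following sketch uses a tiny made-up array, `values`, so the results can be checked by hand. With the default linear interpolation, the quartiles of five evenly spaced values fall exactly on data points.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])

# Request several percentiles at once
q1, median, q3 = np.percentile(values, [25, 50, 75])
print(q1, median, q3)  # 2.0 3.0 4.0

# Interquartile range: the spread of the middle half of the data
iqr = q3 - q1
print(iqr)  # 2.0
```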


## Array Sorting and Searching

* NumPy provides functions for sorting arrays, finding unique elements, and searching for specific values.

* Some commonly used functions include sort(), argsort(), unique(), and searchsorted().

In [44]:

# Sample data
da = np.array([2, 1, 5, 3, 4, 1, 5, 2, 4])

# Sort an array in ascending order
# (for descending order: sorted_arr_desc = np.sort(da)[::-1])
sorted_arr = np.sort(da)
print("Sorted Array:", sorted_arr)

# Find unique elements in an array
unique_elements = np.unique(da)
print("Unique Elements:", unique_elements)

# Find the index where a value should be inserted to maintain the sorted order
insertion_index = np.searchsorted(sorted_arr, 5.0)
print("Insertion Index for 5.0:", insertion_index)


Sorted Array: [1 1 2 2 3 4 4 5 5]
Unique Elements: [1 2 3 4 5]
Insertion Index for 5.0: 7


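The searchsorted function returns the index at which the value should be inserted to keep the array sorted. By default it returns the leftmost such position, which is why 5.0 maps to index 7, the position of the first 5.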

This is useful when you want to determine the position of a value in a sorted array or if you want to insert a new value while keeping the array sorted.
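
The remaining functions from the list can be sketched on the same sample data. `argsort` returns the indices that would sort the array, the `side` argument of `searchsorted` chooses between the leftmost and rightmost insertion point, and `np.insert` performs the insertion. The variable names are illustrative only.

```python
import numpy as np

da = np.array([2, 1, 5, 3, 4, 1, 5, 2, 4])

# argsort gives the indices that put the array in ascending order
order = np.argsort(da)
sorted_arr = da[order]
print(sorted_arr)  # [1 1 2 2 3 4 4 5 5]

# side="left" (the default) inserts before equal values, side="right" after them
left = np.searchsorted(sorted_arr, 5)
right = np.searchsorted(sorted_arr, 5, side="right")
print(left, right)  # 7 9

# Insert a new value at the position that keeps the array sorted
with_three = np.insert(sorted_arr, np.searchsorted(sorted_arr, 3), 3)
print(with_three)  # [1 1 2 2 3 3 4 4 5 5]

# unique can also report how often each value occurs
values, counts = np.unique(da, return_counts=True)
print(values)  # [1 2 3 4 5]
print(counts)  # [2 2 1 2 2]
```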

