
# Introduction to NumPy and Pandas

### What is NumPy?
- NumPy stands for Numerical Python.
- It is a powerful library for numerical and matrix operations in Python.
### Key Features:
- Multi-dimensional arrays.
- Mathematical functions for operating on these arrays.

## Importing NumPy:

In [4]:
# pip install numpy

In [6]:
import numpy as np


In [48]:
list_a=['a','b','c','d','e','f']
arr_from_list_a = np.array(list_a)



In [9]:
print(arr_from_list_a)

['a' 'b' 'c' 'd' 'e' 'f']


In [14]:
len(arr_from_list_a)

6

In [25]:
arr_zeros = np.zeros((1,3))
arr_zeros

array([[0., 0., 0.]])

In [22]:
arr_ones = np.ones((1, 4))
arr_ones

array([[1., 1., 1., 1.]])
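The shape argument decides how many dimensions the result has. The short sketch below (variable names `zeros_2d` and `ones_1d` are illustrative, not from the cells above) shows a tuple shape, a plain integer shape, and the optional `dtype` argument.

```python
import numpy as np

# A tuple sets the shape: 2 rows and 3 columns
zeros_2d = np.zeros((2, 3))

# A plain integer gives a 1-D array; dtype=int replaces the default float type
ones_1d = np.ones(4, dtype=int)

print(zeros_2d.shape, zeros_2d.dtype)  # (2, 3) float64
print(ones_1d.shape)                   # (4,)
```

Note that `np.zeros((1, 3))` in the cell above is a 2-D array with one row, while `np.zeros(3)` would be 1-D.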

- np.arange excludes the stop value from the generated array.


In [34]:
np.arange(0, 11)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [39]:
arr_range = np.arange(0, 50,5)  # Generates an array with values from 0 to 50 (exclusive) with a step of 5
arr_range

array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])

In this example, np.arange(0, 50, 5) generates an array starting from 0, incrementing by 5, and stopping before 50. Notice that the stop value (50) is not included in the output array.
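If the stop value should appear in the result, one common workaround is to extend the stop by one step. The sketch below assumes integer steps; the name `inclusive` is illustrative.

```python
import numpy as np

step = 5

# The stop value is excluded, so push it one step further to include 50
inclusive = np.arange(0, 50 + step, step)
print(inclusive)       # values 0, 5, ..., 50

# With integer arguments the length is (stop - start) / step, rounded up
print(len(np.arange(0, 50, step)))  # 10
print(len(inclusive))               # 11
```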



In [47]:
arr_linspace = np.linspace(0, 100,5)  # Generates an array with 5 values evenly spaced between 0 and 100
arr_linspace

array([  0.,  25.,  50.,  75., 100.])

In this example, np.linspace(0, 100, 5) generates an array with 5 evenly spaced values between 0 and 100 (inclusive). endpoint=True is the default, so the stop value (100) is included in the output array.
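To see the effect of the endpoint setting, the following sketch compares the default with `endpoint=False` and uses `retstep=True` to report the spacing. The variable names are illustrative.

```python
import numpy as np

with_stop = np.linspace(0, 100, 5)                     # 0, 25, 50, 75, 100
without_stop = np.linspace(0, 100, 5, endpoint=False)  # 0, 20, 40, 60, 80

# retstep=True also returns the spacing between consecutive values
values, spacing = np.linspace(0, 100, 5, retstep=True)

print(with_stop)
print(without_stop)
print(spacing)  # 25.0
```

Unlike `np.arange`, where you choose the step, `np.linspace` lets you choose the number of values and computes the step for you.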


### Array Operations

In [52]:
list_b=[1,2,3,4,5]
arr_from_list= np.array(list_b)
arr_from_list

array([1, 2, 3, 4, 5])

In [55]:
arr_addition = arr_from_list + 10  # Adds 10 to each element of the array
arr_addition

array([11, 12, 13, 14, 15])

In [57]:
arr_multiply = arr_from_list * 2  # Multiplies each element of the array by 2
arr_multiply

array([ 2,  4,  6,  8, 10])

In [58]:
arr_square = np.square(arr_from_list)  # Calculates the element-wise square of the array
arr_square

array([ 1,  4,  9, 16, 25])
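The same element-wise behaviour applies when both operands are arrays of the same shape. A minimal sketch, with the arrays `a` and `b` made up for illustration:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40, 50])

# Operators pair up elements at the same position
total = a + b      # 11, 22, 33, 44, 55
product = a * b    # 10, 40, 90, 160, 250

# Aggregations reduce the whole array to a single number
print(total)
print(product)
print(a.sum(), a.mean())  # 15 3.0
```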

### Random Values

In [70]:
random_arr = np.random.rand(1, 3)  # Generates a 1x3 (2-D) array of random values from a uniform distribution over [0, 1)
random_arr

array([[0.97831609, 0.3151603 , 0.29085362]])

In [72]:
random_integers = np.random.randint(-2, 11, size=(1, 31))  # Generates a 1x31 array of random integers from -2 (inclusive) to 11 (exclusive)
random_integers

array([[ 2,  9,  0,  6,  4,  8,  5,  3, -2,  9,  2,  8,  1,  4,  1,  3,
         0,  7,  7, 10,  5,  1,  3,  3,  8,  3,  6,  8,  2,  0,  3]])
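The outputs above change on every run. If reproducible numbers are wanted, one option (not used in the cells above) is NumPy's `Generator` API with a fixed seed. The seed value 42 below is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

uniform_values = rng.random((1, 3))            # uniform over [0, 1), shape (1, 3)
integers = rng.integers(-2, 11, size=(1, 31))  # -2 inclusive, 11 exclusive

print(uniform_values)
print(integers)
```

Creating a second generator with the same seed and making the same calls in the same order should reproduce these arrays exactly.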

## Pandas 

### What is Pandas?
- Pandas is an open-source data manipulation and analysis library for Python.
- It provides data structures like Series and DataFrame for working with structured data.


In [74]:
import pandas as pd

# Creating a Pandas DataFrame
data = {
    
    'StudyHour': [2, 4, 1, 2.5, 4],
    'Score': [85, 92, 78, 89, 95]
}

data

{'StudyHour': [2, 4, 1, 2.5, 4], 'Score': [85, 92, 78, 89, 95]}

In [75]:

df = pd.DataFrame(data)  # Creates a Pandas DataFrame from a dictionary
df

   StudyHour  Score
0        2.0     85
1        4.0     92
2        1.0     78
3        2.5     89
4        4.0     95
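Each column of a DataFrame is a Series, which ties together the two data structures mentioned earlier. The sketch below rebuilds the same small table and pulls out one column. The name `scores` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'StudyHour': [2, 4, 1, 2.5, 4],
    'Score': [85, 92, 78, 89, 95],
})

# Selecting a single column returns a Series
scores = df['Score']
print(type(scores).__name__)  # Series

print(df.shape)       # (5, 2): 5 rows, 2 columns
print(scores.max())   # 95
print(df.dtypes)      # StudyHour is float because of the 2.5 entry
```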


In [81]:
study_hours=np.random.randint(1, 8, 50)
study_hours

array([2, 4, 4, 6, 7, 2, 4, 4, 5, 1, 2, 2, 7, 5, 3, 1, 6, 2, 2, 3, 7, 6,
       3, 5, 3, 7, 7, 1, 3, 6, 5, 3, 3, 6, 2, 2, 2, 2, 7, 7, 6, 4, 6, 6,
       4, 3, 5, 5, 6, 4])

In [82]:
_scores=np.random.randint(40, 100, 50)
_scores

array([73, 71, 95, 78, 91, 91, 81, 90, 56, 53, 60, 71, 41, 86, 72, 63, 93,
       98, 51, 86, 83, 45, 82, 42, 74, 81, 46, 40, 84, 46, 87, 67, 76, 68,
       41, 85, 67, 77, 68, 43, 67, 88, 61, 72, 93, 81, 86, 66, 98, 83])

In [83]:
data_exams = {
    'Study Hours': study_hours,  # Random integer study hours from 1 to 7
    'Score':   _scores      # Random integer scores from 40 to 99
}
data_exams

{'Study Hours': array([2, 4, 4, 6, 7, 2, 4, 4, 5, 1, 2, 2, 7, 5, 3, 1, 6, 2, 2, 3, 7, 6,
        3, 5, 3, 7, 7, 1, 3, 6, 5, 3, 3, 6, 2, 2, 2, 2, 7, 7, 6, 4, 6, 6,
        4, 3, 5, 5, 6, 4]),
 'Score': array([73, 71, 95, 78, 91, 91, 81, 90, 56, 53, 60, 71, 41, 86, 72, 63, 93,
        98, 51, 86, 83, 45, 82, 42, 74, 81, 46, 40, 84, 46, 87, 67, 76, 68,
        41, 85, 67, 77, 68, 43, 67, 88, 61, 72, 93, 81, 86, 66, 98, 83])}

In [84]:

df_exams = pd.DataFrame(data_exams)
df_exams

   Study Hours  Score
0            2     73
1            4     71
2            4     95
3            6     78
4            7     91
5            2     91
6            4     81
7            4     90
8            5     56
9            1     53
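With 50 rows it is often easier to inspect a slice or a summary than the full table. The sketch below is one way to do that. It rebuilds a comparable DataFrame with a seeded generator (an assumption, so its values differ from the output above) and then calls `head` and `describe`.

```python
import numpy as np
import pandas as pd

# Arbitrary seed for repeatability; not part of the original notebook
rng = np.random.default_rng(0)

df_exams = pd.DataFrame({
    'Study Hours': rng.integers(1, 8, 50),   # integers from 1 to 7
    'Score': rng.integers(40, 100, 50),      # integers from 40 to 99
})

first_rows = df_exams.head(10)   # first 10 rows only
summary = df_exams.describe()    # count, mean, std, min, quartiles, max

print(first_rows)
print(summary)
```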
