
**NUMPY IN PYTHON**

NumPy (Numerical Python) is a library used for:
- Fast mathematical operations
- Handling large datasets
- Creating arrays, matrices, and doing scientific computing

In data science, NumPy underpins Pandas and interoperates with TensorFlow and many other libraries.
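
As a rough illustration of the "fast mathematical operations" point, the sketch below squares ten numbers first with a Python loop and then with a single NumPy expression. The names `values`, `squares_loop`, and `squares_vec` are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# Illustrative data: the integers 0..9.
values = np.arange(10)

# Plain-Python approach: square each element one at a time.
squares_loop = [int(v) ** 2 for v in values]

# Vectorized approach: one expression applied to the whole array.
squares_vec = values ** 2

print(squares_loop)
print(squares_vec)
```

Both approaches give the same numbers; the vectorized form avoids the explicit loop, which is where NumPy usually gains its speed on large arrays.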

In [None]:
%pip install numpy



**Creating Arrays**

In [None]:
import numpy as np

arr1 = np.array([1,2,3,4])
arr2 = np.array([[1,2,3],[4,5,6]])

print(arr1)
print(arr2)

[1 2 3 4]
[[1 2 3]
 [4 5 6]]


In [None]:
print(arr1.ndim)
print(arr2.ndim)
print(arr1.shape)
print(arr2.shape)
print(arr1.dtype)
print(arr2.dtype)

1
2
(4,)
(2, 3)
int64
int64


**Creating Arrays with arange and ones**

In [None]:
arr3 = np.arange(1,11)
print(arr3)

[ 1  2  3  4  5  6  7  8  9 10]


In [None]:
print(arr3.ndim)
print(arr3.shape)
print(arr3.size)
print(arr3.dtype)

1
(10,)
10
int64


In [None]:
arr4 = np.ones((3 , 3)) * 5
print(arr4)

[[5. 5. 5.]
 [5. 5. 5.]
 [5. 5. 5.]]


In [None]:
print(np.arange(2, 21, 2))

[ 2  4  6  8 10 12 14 16 18 20]


**INDEXING**
- Indexing selects a single element from an array by position.
- NumPy indexing works like Python list indexing, with extra features such as multi-dimensional and Boolean indexing.

In [None]:
arr = np.array([10,20,30,40,50,60])
print(arr)

print(arr[0])
print(arr[2])
print(arr[-1])

[10 20 30 40 50 60]
10
30
60


**SLICING**
- Slicing selects a range of elements using the [start:end] format.
- The end index is exclusive (not included).

In [None]:
print(arr[3:5])
print(arr[0:])
print(arr[:-2])

[40 50]
[10 20 30 40 50 60]
[10 20 30 40]
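
One way NumPy slicing differs from list slicing is that a basic slice of a NumPy array is a view of the same data rather than a copy. The sketch below uses a separate, illustrative array called `data` to show a step value and the view behaviour; `window` and `safe` are hypothetical names.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50, 60])

# A third value sets the step: take every second element.
every_other = data[::2]
print(every_other)

# A basic slice is a view, so writing to it also changes `data`.
window = data[1:3]
window[0] = 99
print(data)

# Use .copy() when an independent array is needed.
safe = data[1:3].copy()
safe[0] = -1
print(data)
```

The second and third `print(data)` calls show the same values, because the write to `safe` does not reach the original array.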


**Indexing and Slicing in 2D Arrays**

In [None]:
arr2d = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(arr2d)
print(arr2d[0, 1])
print(arr2d[1, 0])
print(arr2d[-1, -1])

[[10 20 30]
 [40 50 60]
 [70 80 90]]
20
40
90


In [None]:
print(arr2d[0:2, 1:3])
print(arr2d[1, :])
print(arr2d[:, 2])

[[20 30]
 [50 60]]
[40 50 60]
[30 60 90]


**RESHAPING**

In [None]:
import numpy as np
arr = np.arange(1, 13)
print(arr)

reshaped = arr.reshape(3,4)
print("reshaped:\n", reshaped)

[ 1  2  3  4  5  6  7  8  9 10 11 12]
reshaped:
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
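
Reshaping only works when the new shape holds exactly the same number of elements. As a small sketch (the names `nums` and `two_rows` are illustrative), one dimension can be given as -1 so NumPy infers it, and an incompatible shape raises a ValueError:

```python
import numpy as np

nums = np.arange(1, 13)  # 12 elements

# -1 asks NumPy to infer that dimension from the total size.
two_rows = nums.reshape(2, -1)
print(two_rows.shape)

# 5 * 3 = 15 does not match 12 elements, so this raises ValueError.
try:
    nums.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("Cannot reshape:", err)
```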


**FLATTENING**
- flatten() converts a 2D array into a 1D array (it returns a copy).

In [None]:
arr2d = np.array([[1,2,3],[4,5,6]])
print(arr2d)

flat = arr2d.flatten()
print("Flatten", flat)

[[1 2 3]
 [4 5 6]]
Flatten [1 2 3 4 5 6]
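
A related method, ravel(), also produces a 1D array but returns a view when it can. The sketch below contrasts the two on an illustrative array named `grid`; it assumes the array is contiguous in memory, which is the case for a freshly created array like this one.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = grid.flatten()  # always a new, independent array
flat_view = grid.ravel()    # a view here, because `grid` is contiguous

# Writing to the copy leaves `grid` unchanged.
flat_copy[0] = 100
value_after_copy_write = int(grid[0, 0])
print(value_after_copy_write)

# Writing to the view changes `grid` as well.
flat_view[0] = 100
print(grid[0, 0])
```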


**STACKING ARRAYS**
- Arrays can be stacked vertically (vstack) or horizontally (hstack).

In [None]:
a = np.array([1,2,3])
b = np.array([4,5,6])

v_stack = np.vstack((a,b))
print(v_stack)

h_stack = np.hstack((a,b))
print(h_stack)

[[1 2 3]
 [4 5 6]]
[1 2 3 4 5 6]
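
The difference between the two directions is easier to see with 2D inputs. In this sketch, `m1` and `m2` are illustrative 2x2 arrays; np.column_stack is shown as one option for turning 1D arrays into columns.

```python
import numpy as np

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

v = np.vstack((m1, m2))  # rows of m2 are appended below m1
h = np.hstack((m1, m2))  # columns of m2 are appended to the right of m1
print(v.shape)
print(h.shape)

# column_stack places each 1D array in its own column.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
cols = np.column_stack((a, b))
print(cols)
```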


**SPLITTING ARRAYS**

In [None]:
arr = np.arange(1,13).reshape(3,4)
print(arr)

split = np.hsplit(arr, 2)
print(split[0])
print(split[1])

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[ 1  2]
 [ 5  6]
 [ 9 10]]
[[ 3  4]
 [ 7  8]
 [11 12]]
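
hsplit needs the number of columns to divide evenly. As a sketch of two alternatives on an illustrative array named `table`, np.vsplit splits along rows, and np.array_split accepts uneven splits:

```python
import numpy as np

table = np.arange(1, 13).reshape(3, 4)

# Split the 3 rows into 3 single-row arrays.
top, middle, bottom = np.vsplit(table, 3)
print(middle)

# 4 columns do not divide evenly by 3; array_split allows unequal parts.
parts = np.array_split(table, 3, axis=1)
shapes = [p.shape for p in parts]
print(shapes)
```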


**FILTERING ARRAYS WITH CONDITIONS**
- This is called Boolean indexing, a common technique in data science.

In [None]:
arr = np.array([10,20,30,40,50,60])

condition = arr > 25
print(condition)

# Named "filtered" to avoid shadowing Python's built-in filter()
filtered = arr[condition]
print("Filtered:", filtered)

[False False  True  True  True  True]
Filtered: [30 40 50 60]
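
Conditions can also be combined. The sketch below uses an illustrative array named `scores`; note that NumPy expects the element-wise operators `&`, `|`, and `~` with parentheses around each comparison, not Python's `and`, `or`, and `not`.

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50, 60])

# Both conditions must hold.
in_range = scores[(scores > 25) & (scores < 55)]

# Either condition may hold.
extremes = scores[(scores < 20) | (scores > 50)]

# Negate a condition.
not_big = scores[~(scores > 25)]

print(in_range)
print(extremes)
print(not_big)
```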


**RANDOM ARRAY**

In [None]:
arr = np.random.randint(50, 101, 10)
print(arr)

print("Max:", np.max(arr))
print("Mean:", np.mean(arr))

[54 99 73 56 76 66 50 75 67 84]
Max: 99
Mean: 70.0
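
The numbers above change on every run. If repeatable output is needed, one option is a seeded generator from np.random.default_rng. This is a sketch of an alternative to the np.random.randint call used in this notebook; the seed value 42 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run.
rng = np.random.default_rng(seed=42)
sample = rng.integers(50, 101, size=10)  # upper bound 101 is excluded
print(sample)
print("Max:", sample.max())
print("Mean:", sample.mean())

# A second generator with the same seed reproduces the same values.
again = np.random.default_rng(seed=42).integers(50, 101, size=10)
print(np.array_equal(sample, again))
```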


**PANDAS**


In [None]:
#Create a Series of 5 temperatures
import pandas as pd
temperatures = [34,24,26,36,29]
s = pd.Series(temperatures)
print(s)

0    34
1    24
2    26
3    36
4    29
dtype: int64


In [None]:
#Create a Series using a dictionary of student names & marks
Students = {
    "Alvin" : 80,
    "Ben" : 75,
    "Champ" : 86
}
s = pd.Series(Students)
print(s)

Alvin    80
Ben      75
Champ    86
dtype: int64


In [None]:
# Print the highest mark, then add 10 to all values
print(s.max())
marks = s + 10
print(marks)

86
Alvin    90
Ben      85
Champ    96
dtype: int64
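
Because a Series built from a dictionary is indexed by its keys, values can be looked up by label. A minimal sketch, rebuilding the same three marks under the illustrative name `student_marks`:

```python
import pandas as pd

student_marks = pd.Series({"Alvin": 80, "Ben": 75, "Champ": 86})

# Look up one value by its label.
print(student_marks["Ben"])

# idxmax() returns the label of the highest value, not the value itself.
best = student_marks.idxmax()
print(best)

# Boolean filtering works as it does for NumPy arrays.
high = student_marks[student_marks >= 80]
print(high)
```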


In [None]:
#Create a DataFrame of students & marks , Filter students with marks > 80 , Add a Grade column
students = {
    "Name" : ["Adam", "Bretskey", "Carbin"],
    "Mark" : [85, 78, 89]
}
df = pd.DataFrame(students)
print(df)

toppers = df[df["Mark"] > 80]
print(toppers)

df["Grade"] = ["A", "B", "A"]
print(df)

       Name  Mark
0      Adam    85
1  Bretskey    78
2    Carbin    89
     Name  Mark
0    Adam    85
2  Carbin    89
       Name  Mark Grade
0      Adam    85     A
1  Bretskey    78     B
2    Carbin    89     A
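
The Grade column above is typed in by hand. As a sketch of deriving it from the marks instead, np.where can assign a label per row. The cut-off here is an assumption borrowed from the filter in the same cell (above 80 is "A", otherwise "B"), and `df_demo` is an illustrative name.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({
    "Name": ["Adam", "Bretskey", "Carbin"],
    "Mark": [85, 78, 89],
})

# Assumed rule: marks above 80 get "A", everything else gets "B".
df_demo["Grade"] = np.where(df_demo["Mark"] > 80, "A", "B")
print(df_demo)
```

With this approach the grades stay correct if the marks change or more rows are added.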


# **PANDAS STUDENT PERFORMANCE ANALYSIS**

In [None]:
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Class": ["A", "A", "B"],
    "Age": [20, 21, 22],
    "Marks": [85, 90, 78]
}

df = pd.DataFrame(data)
df.to_csv("students.csv", index=False)

# Colab only: download the CSV to the local machine.
from google.colab import files
files.download("students.csv")

df = pd.read_csv("students.csv")
print(df)

# Assign the result back; calling fillna(..., inplace=True) on a selected
# column raises a FutureWarning and will stop working in pandas 3.0.
df["Marks"] = df["Marks"].fillna(df["Marks"].mean())
df["Age"] = df["Age"].fillna(df["Age"].mean())

print("Average Marks:", df["Marks"].mean())

topper = df.loc[df["Marks"].idxmax()]
print("Topper:\n", topper)

class_avg = df.groupby("Class")["Marks"].mean()
print("Class-wise Average Marks:\n", class_avg)

sorted_df = df.sort_values(by="Marks", ascending=False)
print(sorted_df)

def grade(m):
    # Map a numeric mark to a letter grade.
    if m >= 80:
        return "A"
    elif m >= 70:
        return "B"
    else:
        return "C"

df["Grade"] = df["Marks"].apply(grade)
print(df)

df.to_csv("cleaned_students.csv", index=False)

      Name Class  Age  Marks
0    Alice     A   20     85
1      Bob     A   21     90
2  Charlie     B   22     78
Average Marks: 84.33333333333333
Topper:
 Name     Bob
Class      A
Age       21
Marks     90
Name: 1, dtype: object
Class-wise Average Marks:
 Class
A    87.5
B    78.0
Name: Marks, dtype: float64
      Name Class  Age  Marks
1      Bob     A   21     90
0    Alice     A   20     85
2  Charlie     B   22     78
      Name Class  Age  Marks Grade
0    Alice     A   20     85     A
1      Bob     A   21     90     A
2  Charlie     B   22     78     B
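
The sample data in this analysis has no missing values, so the fillna step has nothing to fill. The sketch below uses a hypothetical frame named `demo` with one missing mark to show the effect, using the assign-back form that the pandas message recommends.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Bob's mark is missing.
demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Marks": [85.0, np.nan, 78.0],
})

# mean() skips NaN by default, so this is (85 + 78) / 2.
fill_value = demo["Marks"].mean()

# Assign the filled column back to the DataFrame.
demo["Marks"] = demo["Marks"].fillna(fill_value)
print(demo)
```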
