# Date : 07/01/2026

# Searching Algorithms

## 1. Linear Search (Sequential Search)
The simplest searching algorithm that checks every element in the list sequentially until the desired element is found or the list ends.

* **Prerequisite:** None (Works on both sorted and unsorted lists).
* **Mechanism:** It starts at the first element and checks items one by one. If it reaches the end without finding the target, it returns a "not found" indicator (usually -1).
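
The mechanism above can be sketched as a function that returns the index of the first match, or -1 as the "not found" indicator. The name `linear_search_index` is illustrative and not part of the original notebook.

```python
def linear_search_index(items, key):
    """Return the index of the first element equal to key, or -1 if absent."""
    for index, value in enumerate(items):
        if value == key:
            return index
    return -1          # reached the end without finding the key


print(linear_search_index([1, 2, 3, 4, 5], 4))   # 3
print(linear_search_index([1, 2, 3, 4, 5], 8))   # -1
```

Returning an integer in both cases lets the caller test `result != -1` without having to handle two different return types.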



### Complexity
* **Time Complexity:** $O(n)$ (Linear time).
    * *Why?* In the worst case (if the item is at the very end or missing), the algorithm must check every single item in the list.
* **Space Complexity:** $O(1)$.

In [None]:
# 1. Linear Search

def linearSearch(num, key):
    # Walk the list while tracking the current index.
    count = 0
    for i in num:
        if i == key:
            return f'{key} found at index[{count}]'
        count += 1
    return False                                # key is not in the list

    # pos = 0                                   # --> Another logic
    # found = False
    # while pos < len(num) and not found:
    #     if num[pos] == key:
    #         found = True
    #     else:
    #         pos += 1
    # return found, pos                         # return only after the loop ends

num = [1,2,3,4,5]
res = linearSearch(num, 8)
res

False

In [10]:
# linear search on a list of strings

def linearSearch(str1, key):
    count = 0
    for i in str1:
        if i == key:
            return f'{key} found at index[{count}]'
        count += 1
    return False

str1 = ['qwerty', 'animal', 'pet', 'Cow', 'Donkey', 'shreekant']
res = linearSearch(str1, 'shreekant')
res

'shreekant found at index[5]'


## 2. Binary Search (Divide and Conquer)
An efficient algorithm that finds the position of a target value within a **sorted array** by repeatedly dividing the search interval in half.

* **Prerequisite:** The list **MUST be sorted**. The steps below assume ascending order; for a descending list, the two comparisons are reversed.
* **Mechanism:**
    1.  Compare the target value to the **middle** element of the array.
    2.  If they match, the position is returned.
    3.  If the target is **less than** the middle element, the search continues in the **left half** (ignoring the right).
    4.  If the target is **greater than** the middle element, the search continues in the **right half** (ignoring the left).
    5.  Repeat until the element is found or the interval is empty.
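
The steps above assume ascending order. As a minimal sketch of how the comparison flips for a descending list, the function below takes a hypothetical `descending` flag; the function name and the flag are assumptions made for illustration.

```python
def binary_search(items, target, descending=False):
    """Return the index of target in a sorted list, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        # Ascending: a too-small middle value means the target is to the right.
        # Descending: the same comparison means the target is to the left.
        go_right = items[mid] < target
        if descending:
            go_right = not go_right
        if go_right:
            low = mid + 1
        else:
            high = mid - 1
    return -1


print(binary_search([1, 3, 5, 7, 9], 7))                    # 3
print(binary_search([9, 7, 5, 3, 1], 7, descending=True))   # 1
```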



### Complexity
* **Time Complexity:** $O(\log n)$ (Logarithmic time).
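    * *Why?* The search space is cut in half with every step. For 1,000,000 items, Linear Search needs up to 1,000,000 comparisons in the worst case; Binary Search needs at most about 20, since $\log_2 10^6 \approx 20$.
* **Space Complexity:** $O(1)$ for the iterative approach; the recursive approach uses $O(\log n)$ call-stack space.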
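
To make the "about 20 steps" figure concrete, here is a small sketch that counts how many middle elements binary search inspects. The helper `count_binary_steps` is hypothetical, and a `range` object stands in for a sorted list of 1,000,000 integers.

```python
def count_binary_steps(items, target):
    """Return how many middle elements binary search inspects for target."""
    low, high, steps = 0, len(items) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


n = 1_000_000
data = range(n)        # sorted values 0 .. n-1; supports len() and indexing
missing = n            # larger than every element, so the search never succeeds

print(count_binary_steps(data, missing))   # 20
```

A linear scan for the same missing value would have to check all 1,000,000 elements before giving up.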

---

### Comparison Summary

| Feature | Linear Search | Binary Search |
| :--- | :--- | :--- |
| **Data Structure** | Unsorted or Sorted | **Sorted Only** |
| **Approach** | Sequential (One by one) | Divide and Conquer (Split in half) |
| **Time Complexity** | $O(n)$ (Slow for large data) | $O(\log n)$ (Very Fast) |
| **Efficiency** | Good for small or unsorted lists | Better for large sorted collections that are searched repeatedly |

In [14]:
# Binary Search (iterative)

def binarySearchIterative(array, target):
    low = 0
    high = len(array) - 1

    while low <= high:
        mid = (low + high) // 2

        if array[mid] == target:
            return True, mid
        elif array[mid] < target:
            low = mid + 1                       # target is in the right half
        else:
            high = mid - 1                      # target is in the left half

    return False, -1                            # interval is empty: not found

num = [1,2,3,4,5]
res, index = binarySearchIterative(num, 4)
print(f'{res} : {index}')

True : 3


In [None]:
# binary search on a sorted list of strings (strings compare alphabetically)

def binarySearchIterative(array, target):
    low = 0
    high = len(array) - 1

    while low <= high:
        mid = (low + high) // 2

        if array[mid] == target:
            return True, mid
        elif array[mid] < target:
            low = mid + 1                       # target is in the right half
        else:
            high = mid - 1                      # target is in the left half

    return False, -1                            # interval is empty: not found

arr = ['a', 'b', 'c', 'd', 'e', 'f']
res, index = binarySearchIterative(arr, 'f')
print(f'{res} : {index}')

True : 5


In [5]:
# binary search using recursion

def binarySearchRecursion(arr, target, low, high):
    if low > high:
        return False, -1
    
    mid = (low + high) // 2

    if arr[mid] == target:
        return True, mid
    elif arr[mid] < target:
        return binarySearchRecursion(arr, target, mid +1, high)
    else : 
        return binarySearchRecursion(arr, target, low, mid -1)
    
num = [1,2,3,4,5]
res, index = binarySearchRecursion(num, 4, 0, len(num)-1)   # bound by the list being searched
print(f'{res} : {index}')

True : 3


In [6]:
def binarySearchRecursion(arr, target, low, high):
    if low > high:
        return False, -1
    
    mid = (low + high) // 2

    if arr[mid] == target:
        return True, mid
    elif arr[mid] < target:
        return binarySearchRecursion(arr, target, mid +1, high)
    else : 
        return binarySearchRecursion(arr, target, low, mid -1)
    
arr = ['a', 'b', 'c', 'd', 'e', 'f']
res, index = binarySearchRecursion(arr, 'f', 0, len(arr)-1)
print(f'{res} : {index}')

True : 5


# Modules in Python

A **Module** is simply a file containing Python code (functions, classes, and variables). The file name is the module name with the suffix `.py`.

* **Analogy:** If a **Function** is like a tool (e.g., a screwdriver), a **Module** is like a **Toolbox** that contains many related tools (screwdrivers, hammers, wrenches) organized together.

## Why use Modules?
1.  **Code Reusability:** Write code once in a module and reuse it in multiple programs without rewriting it.
2.  **Organization:** Instead of one giant file with 1000 lines, you break it into 5 files of 200 lines each, grouped by functionality (e.g., `database.py`, `calculations.py`).
3.  **Namespace Management:** It prevents variable name clashes. A variable named `x` in `module A` is different from `x` in `module B`.
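
As a small illustration of namespace management, the standard-library modules `math` and `cmath` both define a function named `sqrt`, yet the two never collide because each name lives inside its own module. The local variable `pi` below is an assumption added for the example.

```python
import math
import cmath

# Same function name, two different namespaces.
print(math.sqrt(4))     # 2.0
print(cmath.sqrt(-4))   # 2j

# A variable with the same name in this file does not disturb the module's own value.
pi = 3
print(pi)               # 3
print(math.pi)          # 3.141592653589793
```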



---

## Ways to Import Modules

### 1. Simple Import
Imports the entire toolbox. You must use the module name to access the tools.
* **Syntax:** `import math`
* **Usage:** You must type `math.sqrt(16)`.

### 2. Import with Alias (Renaming)
Imports the module but gives it a shorter nickname to save typing.
* **Syntax:** `import pandas as pd`
* **Usage:** You can type `pd.read_csv()` instead of the full name.

### 3. From ... Import (Specific Items)
Binds **only** the specific names you ask for in your own namespace. Python still loads and runs the whole module; the other names are simply not brought into your file.
* **Syntax:** `from math import sqrt, pi`
* **Usage:** You can access `sqrt(16)` directly without typing `math.`.

### 4. Import All (The Wildcard `*`)
Imports everything from the module.
* **Syntax:** `from math import *`
* **Warning:** This is **bad practice** in large projects because it pollutes your namespace (it might overwrite your own variables with the same name).
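
The import styles can be compared side by side in a short sketch. The alias `m` and the variable `e` are assumptions chosen for illustration; the name clash is shown with a single named import, and a wildcard import would do the same for every public name in the module at once.

```python
import math                  # 1. simple import
import math as m             # 2. import with alias
from math import sqrt, pi    # 3. import specific names

# All three spellings reach the same function.
print(math.sqrt(16), m.sqrt(16), sqrt(16))   # 4.0 4.0 4.0
print(m is math)                             # True: the alias is the same module object

# Name clash: an imported name silently replaces an existing variable.
e = "my own variable"
from math import e
print(e)                                     # 2.718281828459045
```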

---

## Types of Modules

| Type | Description | Examples |
| :--- | :--- | :--- |
| **Built-in Modules** | Shipped with Python as part of the standard library. | `math`, `os`, `sys`, `datetime`, `random` |
| **User-Defined Modules** | Modules created by you. | `my_project.py`, `utils.py` |
| **External Modules** | Installed via pip (package manager). | `numpy`, `pandas`, `requests` |

In [7]:
help('modules')    # help() prints the list itself; wrapping it in print() only adds a trailing "None"


Please wait a moment while I gather a list of all available modules...

Crypto              base64              logging             start_pythonwin
IPython             bdb                 lzma                stat
PIL                 binascii            mailbox             statistics
__future__          bisect              markupsafe          statsmodels
__hello__           bitarray            marshal             streamlit
__phello__          blinker             math                string
_abc                builtins            matplotlib          stringprep
_aix_support        bz2                 matplotlib_inline   struct
_android_support    cProfile            mimetypes           subprocess
_apple_support      cachetools          mmap                sympy
_ast                calendar            mmapfile            symtable
_asyncio            certifi             mmsystem            sys
_bisect             charset_normalizer  modulefinder        sysconfig
_blake2             ckzg                mpmath              tabnanny
_bz2                click       

In [9]:
import math
print(dir(math))
print(math.pi)

['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'cbrt', 'ceil', 'comb', 'copysign', 'cos', 'cosh', 'degrees', 'dist', 'e', 'erf', 'erfc', 'exp', 'exp2', 'expm1', 'fabs', 'factorial', 'floor', 'fma', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'isqrt', 'lcm', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'nextafter', 'perm', 'pi', 'pow', 'prod', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'sumprod', 'tan', 'tanh', 'tau', 'trunc', 'ulp']
3.141592653589793


In [12]:
from math import pi, factorial
print(pi)
print(factorial(5))

3.141592653589793
120


In [13]:
print(__name__)

__main__


In [5]:
import student

s1 = student.student('Noob', 'noob@gmail.com')
s1.display()


name : Noob
email: noob@gmail.com


### Create an Employee module and display the employee information using `__init__()` and `__str__()`

In [None]:
import employee as emp

emp1 = emp.employee('ROM', 1, 1000)
emp2 = emp.employee('FIN', 2, 2000)

print(emp1)
print(emp2)

Name : ROM
ID : 1
Salary : ₹1000
Name : FIN
ID : 2
Salary : ₹2000
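
The `employee` module itself is not shown in the notebook. A minimal sketch of what `employee.py` might contain, assuming attribute names `name`, `emp_id`, and `salary` (all hypothetical) and a `__str__` format inferred from the output above:

```python
# employee.py (hypothetical file contents)

class employee:
    """Stores one employee's name, ID and salary."""

    def __init__(self, name, emp_id, salary):
        self.name = name
        self.emp_id = emp_id
        self.salary = salary

    def __str__(self):
        # print(obj) calls __str__ to get the text to display.
        return f'Name : {self.name}\nID : {self.emp_id}\nSalary : ₹{self.salary}'


emp1 = employee('ROM', 1, 1000)
print(emp1)
```

Saved as `employee.py` next to the notebook, the class would be reachable as `emp.employee` after `import employee as emp`.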


# NumPy (Numerical Python)

NumPy is the fundamental package for scientific computing in Python. It provides support for **large, multi-dimensional arrays and matrices**, along with a large collection of high-level mathematical functions to operate on these arrays.

* **Core Object:** The `ndarray` (n-dimensional array). Unlike Python lists, this object requires all elements to be of the **same data type** (homogeneous).

## Why use NumPy over Python Lists?
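NumPy is often quoted as being up to **50x faster** than standard Python lists for numerical work; the actual speedup depends on the operation and the size of the array.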

### 1. Speed (Vectorization)
* **Python Lists:** `List_A + List_B` concatenates the two lists rather than adding them. To add element by element you need a Python-level loop (or list comprehension), and the interpreter checks the type of every single number before adding. This is slow.
* **NumPy:** Uses **Vectorization**. It pushes the loop down to the C level (compiled code). It knows all data is of the same type (e.g., `int32`), so it can process entire blocks of memory without type-checking each element.
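
A short sketch of the difference, using small illustrative lists (`list_a` and `list_b` are assumed names):

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [10, 20, 30]

# `+` on lists concatenates; it does not add element-wise.
print(list_a + list_b)                               # [1, 2, 3, 10, 20, 30]

# Element-wise addition of lists needs an explicit Python-level loop.
list_sum = [a + b for a, b in zip(list_a, list_b)]
print(list_sum)                                      # [11, 22, 33]

# NumPy arrays add element-wise in compiled code, with no Python loop.
arr_sum = np.array(list_a) + np.array(list_b)
print(arr_sum)                                       # [11 22 33]
```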

### 2. Contiguous Memory Allocation
* **Python Lists:** An array of pointers to objects scattered across memory. The computer has to jump around memory to read the data (Cache Misses).
* **NumPy:** Stores data in one continuous block of memory. The CPU can read the data much faster (Locality of Reference).



### 3. Memory Efficiency
NumPy arrays use much less memory because they store the raw data (e.g., `10010101`) rather than heavy Python objects with metadata (Reference counts, type info, etc.).
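
A rough way to see the difference is to compare the raw data buffer of an array with an estimate of a list's footprint. This is a sketch under assumptions: the numbers are specific to CPython, and the list estimate simply adds the size of the container to the size of each integer object.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# List: the container (an array of pointers) plus one int object per element.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# ndarray: the raw data buffer holds n values of itemsize bytes each.
array_bytes = arr.nbytes

print(arr.itemsize)    # 8
print(array_bytes)     # 8000
print(list_bytes > array_bytes)   # True
```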

---

## Key Concepts

### 1. Dimensions (Axes)
NumPy refers to dimensions as **axes**.
* **0-D:** A single scalar (e.g., `42`).
* **1-D:** A simple array (like a list).
* **2-D:** A matrix (rows and columns).
* **3-D:** A cube (or a stack of matrices).
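
The four cases can be checked with the `ndim` and `shape` attributes. The variable names below are illustrative only.

```python
import numpy as np

scalar = np.array(42)                         # 0-D
vector = np.array([1, 2, 3])                  # 1-D
matrix = np.array([[1, 2, 3], [4, 5, 6]])     # 2-D: 2 rows, 3 columns
cube = np.zeros((2, 3, 4))                    # 3-D: a stack of 2 matrices, each 3 x 4

for name, a in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # ndim is the number of axes; shape gives the length along each axis.
    print(name, a.ndim, a.shape)
```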

In [2]:
import numpy as np

arr = np.array([1,2,3,4,5])
arr

array([1, 2, 3, 4, 5])

In [32]:
val1 = np.arange(15).reshape(3,5)      # 2D Array
val1

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [40]:
print(val1[2][2])
print(val1.ndim)

12
2


In [51]:
print(val1.shape)

(3, 5)


In [57]:
val1.itemsize

8

In [58]:
val1.size

15

In [60]:
type(val1)

numpy.ndarray

In [52]:
val2 = np.arange(24).reshape(2,6,2)    # 3D Array
val2

array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17],
        [18, 19],
        [20, 21],
        [22, 23]]])

In [53]:
np.arange(24).reshape(2,6,2).ndim

3

In [None]:
c = np.array([1,2,2.2])
print(c.dtype)
print(c.itemsize)

float64
8

In [62]:
z = np.zeros((3,4))
z

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [70]:
ones = np.ones((2,3), dtype=np.int8)
print(ones)

[[1 1 1]
 [1 1 1]]


In [72]:
empty = np.empty((2,3))                # not initialized: the values are whatever was already in memory
print(empty)

[[5.e-324 5.e-324 5.e-324]
 [5.e-324 5.e-324 5.e-324]]


In [24]:
# universal functions

d = np.arange(3, dtype=int)
print(d)
print(np.exp(d))
print(np.sqrt(d))
print(d ** 2)

[0 1 2]
[1.         2.71828183 7.3890561 ]
[0.         1.         1.41421356]
[0 1 4]


## Reading a CSV file

In [9]:
import pandas as pd

df = pd.read_csv('results.csv')
print(df)

        Name  Maths  Science  English
0         Om     79       56       34
1  Shreekant     12       23       12
2     Yogesh     55       87       87


In [12]:
df['Total'] = df['Maths'] + df['Science'] + df['English']
df

        Name  Maths  Science  English  Total
0         Om     79       56       34    169
1  Shreekant     12       23       12     47
2     Yogesh     55       87       87    229


In [15]:
df['Average'] = df['Total'] / 3
df

        Name  Maths  Science  English  Total    Average
0         Om     79       56       34    169  56.333333
1  Shreekant     12       23       12     47  15.666667
2     Yogesh     55       87       87    229  76.333333


In [18]:
df['Result'] = (df[['Maths', 'Science', 'English']] >= 40).all(axis=1)   # pass only if every subject is >= 40
df

        Name  Maths  Science  English  Total    Average  Result
0         Om     79       56       34    169  56.333333   False
1  Shreekant     12       23       12     47  15.666667   False
2     Yogesh     55       87       87    229  76.333333    True


In [21]:
topper = df.loc[df["Total"].idxmax()]
print("Topper: ")
print(topper['Name'], " with ", topper['Total'], 'marks')

Topper: 
Yogesh  with  229 marks


## Making a dice game using pandas

In [138]:
import pandas as pd
import random as rnd

class diceGame:
    def __init__(self, players):
        self.players = players
        self.score = pd.DataFrame(0, index=players, columns=['Scores'])
        
    def rollDice(self):
        return rnd.randint(1,6)
    
    def play(self):
        for player in self.players:
            rollscore = self.rollDice()
            self.score.loc[player, 'Scores'] += rollscore
    
    def winner(self):
        return self.score.sort_values(by='Scores', ascending=False)

if __name__ == "__main__":
    players = ["Prathamesh", "Shreekant"]
    game = diceGame(players)
    for _ in range(5):
        game.play()
    print(game.winner())

            Scores
Prathamesh      17
Shreekant       17


## Create a DataFrame: Name: Jaya, Vinay, Tina, Gana, Era. Age: 23, 25, 15, 13, 21
- create a DataFrame using the above data
- show the names of those who are eligible to vote (age >= 18)
- display the DataFrame without Tina
- add a 'city' column to the DataFrame
- display the rows whose city is not Pune
- display only the names that start with 'G'
- print only the top 3 rows
- print only the last 2 rows
- display the rows whose city name starts with 'M'

In [None]:
import pandas as pd

data = {
    'Name' : ['Jaya', 'Vinay', 'Tina', 'Gana', 'Era'],
    'Age' : [23, 25, 15, 13, 21]
}

# create a dataframe using above data
df = pd.DataFrame(data)

# print only top 3 rows
df.head(3)

    Name  Age
0   Jaya   23
1  Vinay   25
2   Tina   15


In [None]:
# print only last 2 rows
df.tail(2)

   Name  Age
3  Gana   13
4   Era   21


In [None]:
# flag who is eligible to vote (age >= 18)
df['Voting'] = df['Age'] >= 18
df

    Name  Age  Voting
0   Jaya   23    True
1  Vinay   25    True
2   Tina   15   False
3   Gana   13   False
4    Era   21    True
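
The cell above adds a flag column rather than listing names. As a sketch of how the names alone could be shown, a boolean mask can be passed to `.loc`; the DataFrame is rebuilt here so the example runs on its own, and `eligible_names` is an assumed variable name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jaya', 'Vinay', 'Tina', 'Gana', 'Era'],
    'Age': [23, 25, 15, 13, 21],
})

# Keep the rows where Age >= 18, then select only the Name column.
eligible_names = df.loc[df['Age'] >= 18, 'Name']
print(eligible_names.tolist())   # ['Jaya', 'Vinay', 'Era']
```

Filtering this way leaves `df` unchanged, so later cells see the same columns as before.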


In [None]:
# add a 'city' column to the dataframe (named 'Address' here)
df['Address'] = ['Delhi', 'Pune', 'Mumbai', 'Pune', 'Bengaluru']
df

    Name  Age    Address
0   Jaya   23      Delhi
1  Vinay   25       Pune
2   Tina   15     Mumbai
3   Gana   13       Pune
4    Era   21  Bengaluru


In [None]:
# display the rows whose city is not Pune
notPune = df[df['Address'] != 'Pune']
notPune

   Name  Age    Address
0  Jaya   23      Delhi
2  Tina   15     Mumbai
4   Era   21  Bengaluru


In [None]:
# display the dataframe without Tina
noTina = df[df['Name'] != 'Tina']
noTina

    Name  Age    Address
0   Jaya   23      Delhi
1  Vinay   25       Pune
3   Gana   13       Pune
4    Era   21  Bengaluru


In [None]:
# display only the names that start with 'G'
nameStartsG = df[df['Name'].str.startswith('G')]
nameStartsG

   Name  Age  Address
3  Gana   13     Pune


In [None]:
# display the rows whose city name starts with 'M'
addressStartM = df[df['Address'].str.startswith('M')]
addressStartM

   Name  Age  Address
2  Tina   15   Mumbai
