Linear Search
Linear search, also known as sequential search, checks each element of a list one by one until the desired element is found or the list ends. It is a straightforward algorithm that does not require the list to be sorted. Its time complexity is O(n), where n is the number of elements in the list: in the worst case, every element must be checked before the target is found or shown to be absent. In the best case, the target is the first element and a single comparison is enough. Linear search is useful for unsorted or small datasets, or when the cost of sorting the data is not justified.

Binary Search
Binary search works on sorted lists and repeatedly divides the search interval in half. At each step the target is compared with the middle element of the interval. If they are equal, the search ends. If the target is less than the middle element, the search continues in the lower half; otherwise, it continues in the upper half. This halving continues until the element is found or the interval is empty. The time complexity of binary search is O(log n), where n is the number of elements in the list, because each step halves the search interval. For large, sorted datasets this makes it much faster than linear search.
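
To make the O(n) versus O(log n) difference concrete, the sketch below counts loop steps instead of measuring time. The helper names `linear_search_steps` and `binary_search_steps` and the synthetic list of one million integers are assumptions made for illustration; they are not part of the dataset used later.

```python
import math

def linear_search_steps(arr, target):
    """Return (index, comparisons) for a left-to-right scan."""
    comparisons = 0
    for index, value in enumerate(arr):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons

def binary_search_steps(arr, target):
    """Return (index, iterations) for binary search on a sorted list."""
    left, right = 0, len(arr) - 1
    iterations = 0
    while left <= right:
        iterations += 1
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid, iterations
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1, iterations

# Synthetic, already sorted data (illustrative only)
data = list(range(1_000_000))
target = data[-1]  # last element: worst case for the linear scan

_, linear_steps = linear_search_steps(data, target)
_, binary_steps = binary_search_steps(data, target)

# Binary search needs at most floor(log2(n)) + 1 iterations
max_binary_steps = math.floor(math.log2(len(data))) + 1

print(f'Linear search comparisons: {linear_steps}')
print(f'Binary search iterations: {binary_steps} (bound: {max_binary_steps})')
```

For one million elements the linear scan needs one million comparisons in this worst case, while binary search needs at most 20 iterations.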

Conclusion
Binary search is faster than linear search on large datasets because of its logarithmic time complexity. However, binary search requires the dataset to be sorted, whereas linear search does not. If the dataset is unsorted and small, or only needs to be searched once, linear search is a simple and effective option: sorting alone costs O(n log n), which is more than the O(n) of a single linear scan. For large datasets that are searched many times, sorting the data once and then using binary search is more efficient.
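
The trade-off between sorting first and scanning directly can be estimated with a rough cost model. The sketch below is an approximation under stated assumptions: it counts about n comparisons per linear search and about n log2 n for the sort plus log2 n per binary search, and it ignores constant factors and memory effects. The function names are hypothetical.

```python
import math

def linear_cost(n, k):
    """Approximate comparisons for k linear searches over n elements."""
    return k * n

def sort_then_binary_cost(n, k):
    """Approximate comparisons for one sort plus k binary searches."""
    return n * math.log2(n) + k * math.log2(n)

def break_even_searches(n):
    """Smallest number of searches for which sorting first is cheaper."""
    k = 1
    while sort_then_binary_cost(n, k) >= linear_cost(n, k):
        k += 1
    return k

for n in (1_000, 1_000_000):
    print(f'n = {n}: sorting pays off from {break_even_searches(n)} searches')
```

Under this model the break-even point is close to log2 n searches, so sorting is only worthwhile when the same data is searched repeatedly.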




In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px


# Load the dataset
file_path = r"C:\Users\pac computers\OneDrive - Mangosuthu University of Technology\Desktop\TP2 Test2\InformalDataset.csv.csv"
df = pd.read_csv(file_path)

In [2]:
# Extract the 'Settlement Name' column for the search
settlement_names = df['Settlement Name'].dropna().tolist()

# Define the element to search for
search_element = "Khayelitsha"

# Linear Search Implementation
def linear_search(arr, target):
    for index, value in enumerate(arr):
        if value == target:
            return index
    return -1

# Binary Search Implementation
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# Perform Linear Search
linear_result = linear_search(settlement_names, search_element)
if linear_result != -1:
    print(f'Linear Search: Element {search_element} found at index {linear_result}')
else:
    print(f'Linear Search: Element {search_element} not found')

# Perform Binary Search
# Note: Binary search requires a sorted list, so the index refers to the sorted copy
settlement_names_sorted = sorted(settlement_names)
binary_result = binary_search(settlement_names_sorted, search_element)
if binary_result != -1:
    print(f'Binary Search: Element {search_element} found at index {binary_result} in the sorted list')
else:
    print(f'Binary Search: Element {search_element} not found')

Linear Search: Element Khayelitsha found at index 0
Binary Search: Element Khayelitsha found at index 18 in the sorted list
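
As a cross-check on the hand-written binary search, the same lookup can be sketched with `bisect_left` from Python's standard `bisect` module. The wrapper name `binary_search_bisect` and the three-name sample list are assumptions for illustration; the sample is not taken from the dataset.

```python
from bisect import bisect_left

def binary_search_bisect(sorted_arr, target):
    """Return the index of target in sorted_arr, or -1 if it is absent."""
    # bisect_left returns the leftmost position where target could be inserted
    i = bisect_left(sorted_arr, target)
    if i < len(sorted_arr) and sorted_arr[i] == target:
        return i
    return -1

# Hypothetical sample data, not from the dataset
sample_names = sorted(["Khayelitsha", "Alexandra", "Diepsloot"])

print(binary_search_bisect(sample_names, "Khayelitsha"))
print(binary_search_bisect(sample_names, "Soweto"))
```

When a name occurs more than once, `bisect_left` always lands on the first occurrence in the sorted list, whereas the loop-based version may return the index of any matching entry.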
