Challenge : Traverse and search an array with infinite elements without checking each element. 

Solution lies in Hash Table. 

**Implementation Code available below**

![image.png](attachment:bcbdfc34-dd32-4a2a-8dc7-a8353042173c.png)

**Hash tables, also known as hash maps**

are fundamental data structures used in a wide range of real-world applications for efficient data storage and retrieval. They are known for their fast average-case time complexity for basic operations. Here are some common real-world applications of hash tables along with algorithms associated with this data structure:

**Real-World Applications:**

Databases: Hash tables are used in database systems to implement indexes, enabling fast lookup of records based on keys.

Caching: Caches in web servers, content delivery networks (CDNs), and databases use hash tables to store frequently accessed data for rapid retrieval.

Distributed Systems: Distributed hash tables (DHTs) are used to distribute data across a network of nodes for efficient decentralized data storage and retrieval.

Symbol Tables: Compilers and interpreters use hash tables to store and look up variables, functions, and symbols during program execution.

File Systems: Hash tables are used in file systems to map file names or directory paths to file locations, improving file access efficiency.

Deduplication: Backup and storage systems use hash tables to identify duplicate data blocks and store them only once, reducing storage requirements.

Network Routing: Hash tables are used in routing tables to efficiently determine the next hop for forwarding network packets.

Cryptography: Hash functions are fundamental in cryptography for creating digital signatures, password hashing, and message authentication codes.

Natural Language Processing: Hash tables can be used to store word frequencies, making it easier to analyze and process text data.

Counting and Statistics: Hash tables can count occurrences of items in large datasets, making them useful for generating histograms and frequency analysis.

**Algorithms and Operations:**

Insertion and Retrieval (Hashing): Hash tables use a hash function to convert keys into array indices. Algorithms for insertion and retrieval involve hashing the key to determine the index and handling collisions (e.g., chaining or open addressing).

Collision Resolution: Algorithms like chaining (using linked lists) or open addressing (probing or rehashing) are used to address collisions when multiple keys hash to the same index.

Hash Functions: Designing efficient hash functions is crucial. Algorithms for hash function design aim to distribute keys uniformly across the table to minimize collisions.

Resizing: As the number of stored elements increases, hash tables may need to be resized. Algorithms for resizing involve creating a new, larger table and rehashing existing elements.

Deletion: Deleting an element from a hash table involves locating the key's index, handling any collisions, and updating the table accordingly.

Load Factor and Rehashing: Algorithms for load factor management determine when to resize the hash table to maintain efficient performance. Rehashing involves redistributing elements when resizing.

Open Addressing Techniques: Algorithms like linear probing, quadratic probing, and double hashing are used in open addressing collision resolution to find the next available slot in case of collisions.

Hash Functions for Specific Data Types: Algorithms for designing hash functions for strings, integers, or custom data types aim to minimize collisions and distribute data uniformly.

Distributed Hash Tables (DHTs): DHT algorithms enable efficient distributed storage and retrieval of data across a network of nodes, often using consistent hashing techniques.

**Python provides several libraries and modules that use hash tables (dictionaries) extensively**

for efficient data storage and retrieval. Here are some of the Python libraries and modules that use hash tables in their methods or classes:

Built-in Dictionary (dict): Python's built-in dictionary data structure is implemented as a hash table. It is widely used throughout Python programming.

collections.defaultdict: This module provides a dictionary subclass, defaultdict, that allows you to specify a default value for missing keys. It internally uses a hash table for storage.

collections.Counter: The Counter class is used for counting the occurrences of elements in iterable objects. It uses a dictionary (hash table) for storage.

collections.OrderedDict: An OrderedDict is a dictionary that remembers the order of key-value pairs. It is implemented using a hash table with additional data structures to maintain order.

pickle: The pickle module uses hash tables to efficiently serialize and deserialize Python objects.

shelve: The shelve module is a built-in database system that uses dictionaries (hash tables) to store key-value pairs persistently.

sqlite3: The sqlite3 module, used for working with SQLite databases, internally uses hash tables to optimize query execution.

json: The json module uses dictionaries to parse and serialize JSON data.

yaml: Libraries like PyYAML that parse and generate YAML documents often use hash tables for storing key-value mappings.

Pandas: The Pandas library, used for data manipulation and analysis, utilizes hash tables for indexing and managing DataFrames.

NumPy: NumPy arrays can be indexed using dictionaries (hash tables) to retrieve values based on specific keys or conditions.

SciPy: The SciPy library uses hash tables for various purposes, including sparse matrix representations.

requests: The requests library, used for making HTTP requests, can parse and generate JSON data, which internally uses dictionaries.

Flask: Flask, a popular web framework, uses dictionaries (hash tables) to store configuration settings and manage request context.

Django: Django, a web framework, uses dictionaries (hash tables) for various purposes, including caching and session management.

SQLAlchemy: SQLAlchemy, a database toolkit and ORM, uses dictionaries in various components for query optimization and caching.

Tornado: The Tornado web framework utilizes dictionaries (hash tables) for request and response handling, including URL routing.

Celery: Celery, a distributed task queue, uses dictionaries (hash tables) for task results storage and result backend.

Distributed Computing Libraries: Libraries like Dask and Ray often use dictionaries (hash tables) in their distributed data structures and task scheduling.

Natural Language Processing Libraries: Libraries like spaCy and NLTK may use hash tables for various language processing tasks, such as word frequency counting.

Collision is when two different value is returned the same Hash

**Seperate Chaining**
![image.png](attachment:30847b90-3de3-4513-aea6-0371df7ad782.png)

**Open Addressing with Probing**
![image.png](attachment:3dfcd4de-6d75-4101-be4b-e430f502cb47.png)

The Hash functions can be bad, better and good depending on how well the hashing **avoids collision**

The idea of Pigeon hole will force multiple values into a particular Hash key

After check if we have to rehash 

![image.png](attachment:8b764319-4a0e-46da-bbc2-6a14575638a1.png)

![image.png](attachment:d74bc4f7-98ae-4693-9953-44acf25f6bc3.png)

load factor  :

![image.png](attachment:38350c06-cac5-433b-b6f0-3a14442d9c15.png)

![image.png](attachment:ba6e0653-3388-46b5-9a33-1d0cea3b2ba4.png)

![image.png](attachment:b46e550f-1ad6-4065-9f95-1fd917d5b5fc.png)

## Implementation Code

In [7]:
class Node:
    def __init__(self, key,value,nxt=None):
        self.key: str = key
        self.value: str = key
        self.nxt: Node = nxt

In [31]:
class LinkedList:
    def __init(self, head=None):
        self.head: Node = head

    def insert(self, key, value):
        new_node = Node(key, value)
        self.head = new_node

    def search(self, key):
        node = self.head
        while node:
            if node.key == key:
                return node #If we want the value,then use the get 
            node = node.nxt
        return None

    def delete(self,key):
        if self.head and self.head.key == key:
            self.head == self.head.nxt
            return True

        else:
            node = self.head
            while node and node.nxt and node.nxt.key != key:
                node = node.nxt

            if node:
                node.nxt == node.nxt.nxt
                return true

        return False

    def get(self, key):
        node = self.head
        while node:
            if node.key == key:
                return node.value #After all we want the value
            node = node.nxt
        return None

In [32]:
tryll = LinkedList()

In [33]:
tryll.insert(5,5)

In [34]:
tryll.get(5)

5

In [56]:
tryll.insert(7,6)

In [62]:
tryll.search(7) is None

False

In [52]:
from typing import List

class HashTable:
    def __init__(self):
        self.capacity: int = 30
        self.size: int = 0
        self.load_factor: float = 0.75
        self.slots: List[LinkedList] = [LinkedList() for _ in range(self.capacity)]

    def hash_func(self, key):
        total = 0
        for ch in key:
            total += ord(ch)
        return total % self.capacity

    def rehash(self):
        old = self.slots
        self.capacity *= 2
        self.size = 0
        self.slots = [LinkedList() for _ in range(self.capacity)]
        for slot in old:
            node = slot.head
            while node:
                self.insert(node.key, node.value)
                node = node.nxt

    def insert(self, key, value):
        index = self.hash_func(key)
        print(index)
        print(type(self.slots[index]))
        node = self.slots[index].search(value)
        if node:
            node.value = value
        else:
            self.slots[index].insert(key,value)
            self.size += 1
            if self.size/self.capacity > self.load_factor:
                self.rehash()

    def search(self,key):
        index = self.hash_func(key)
        node = self.slots[index].search(key)
        return node is not None

    def get(self,key):
        index = self.hash_func(key)
        node = self.slots[index].search(key)
        return node.value if node else None

    def delete(self,key):
        index = self.hash_func(key)
        if self.slots[index].delete(key):
            self.size -= 1
        

In [53]:
agenda = HashTable()

In [54]:
agenda.insert('new task','There is more to be done')

17
<class '__main__.LinkedList'>


AttributeError: 'LinkedList' object has no attribute 'head'