## **Interpolation Search**

Interpolation search is an algorithm for finding a specific value (key) in a sorted array. It uses an interpolation formula to estimate the position of the key from its value and the values at the two ends of the current search range. This makes it a more efficient alternative to binary search when the keys are roughly uniformly distributed.

* **Data Structure**: Requires a sorted array whose keys are numeric, since the probe formula does arithmetic on them.
* **Average Performance**: O(log(log(n))) when the keys are uniformly distributed, compared with O(log(n)) for binary search and O(n) for linear search.
* **Best-Case Performance**: O(1), if the first probe lands on the target key.
* **Worst-Case Performance**: O(n), if the keys are far from uniformly distributed (for example, values that grow exponentially).

**Strengths**:
* Needs fewer probes than binary search when data is uniformly distributed.
* Uses the value of the key to estimate its position, unlike binary search, which always probes the midpoint.

**Weaknesses**:
* Can degrade to O(n) and be slower than binary search when data is not uniformly distributed.
* More complex to implement than binary search.
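To make the performance claims above concrete, here is a minimal sketch that counts probes for both searches on two hand-built lists, using the probe formula given later in this section. The helper names `interpolation_probes` and `binary_probes` and the two sample lists are illustrative assumptions, not part of this notebook.

```python
def interpolation_probes(arr, key):
    """Count the probes an interpolation search makes on a sorted list."""
    lo, hi = 0, len(arr) - 1
    probes = 0
    while lo <= hi and arr[lo] <= key <= arr[hi]:
        if arr[lo] == arr[hi]:
            # All remaining values are equal, so one look settles it
            return probes + 1
        # Estimate the position from the key's value (integer division)
        pos = lo + (hi - lo) * (key - arr[lo]) // (arr[hi] - arr[lo])
        probes += 1
        if arr[pos] == key:
            return probes
        if arr[pos] < key:
            lo = pos + 1
        else:
            hi = pos - 1
    return probes


def binary_probes(arr, key):
    """Count the probes a binary search makes on a sorted list."""
    lo, hi = 0, len(arr) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if arr[mid] == key:
            return probes
        if arr[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes


# Evenly spaced keys: the estimate is exact on the first probe
uniform = [7 * i for i in range(10_000)]
uniform_interp = interpolation_probes(uniform, 7 * 6789)
uniform_binary = binary_probes(uniform, 7 * 6789)

# Exponentially growing keys: the estimate barely moves each step
skewed = [2 ** i for i in range(40)]
skew_interp = interpolation_probes(skewed, 2 ** 20)
skew_binary = binary_probes(skewed, 2 ** 20)

print("uniform:", uniform_interp, "vs", uniform_binary)
print("skewed: ", skew_interp, "vs", skew_binary)
```

On the evenly spaced list the interpolation variant finds the key in a single probe. On the powers-of-two list it advances one index at a time and needs 21 probes, while binary search needs 5.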

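Interpolation search finds a particular item by repeatedly computing a probe position. Unlike binary search, which always probes the middle item of the current range, it estimates where the key should lie from the key's value and the values at both ends of the range: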

**mid = lo + ((hi - lo) * (X - A[lo]) / (A[hi] - A[lo]))**

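where\
mid -> Estimated (probe) index of the key\
X -> Key (value) being searched for\
lo -> Lowest index of the current search range\
hi -> Highest index of the current search range\
A[n] -> Value stored at index n in the list

Because mid is used as an index, the result of the division is rounded down to an integer.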
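The formula above can be turned into a search loop along the following lines. This is a minimal sketch, not necessarily the implementation this notebook goes on to use: the function name `interpolation_search`, the parameter names, and the convention of returning `-1` for a missing key are all assumptions.

```python
def interpolation_search(arr, key):
    """Return the index of `key` in the sorted list `arr`, or -1 if absent."""
    lo, hi = 0, len(arr) - 1

    # The key can only be present while it lies within [arr[lo], arr[hi]]
    while lo <= hi and arr[lo] <= key <= arr[hi]:
        if arr[lo] == arr[hi]:
            # Avoid dividing by zero when every remaining value is equal
            return lo if arr[lo] == key else -1

        # Probe position from the interpolation formula (integer division)
        mid = lo + (hi - lo) * (key - arr[lo]) // (arr[hi] - arr[lo])

        if arr[mid] == key:
            return mid
        if arr[mid] < key:
            lo = mid + 1   # key lies to the right of the probe
        else:
            hi = mid - 1   # key lies to the left of the probe

    return -1


found = interpolation_search([10, 20, 30, 40, 50], 40)
missing = interpolation_search([10, 20, 30, 40, 50], 35)
print(found, missing)
```

The range check in the loop condition keeps the probe inside `lo..hi` and lets the search stop early when the key falls outside the remaining values.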

In [1]:
import time
import pandas as pd
import numpy as np

### Generic Functions to Evaluate Test Cases

In [2]:
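def evaluate_test_case(function, test):
    """Run one test case and print its output, pass/fail status, and execution time.

    `test` is a dict with 'input' (keyword arguments for `function`)
    and 'output' (the expected return value).
    """
    # perf_counter is a monotonic, high-resolution clock suited to timing short calls
    start_time = time.perf_counter()
    output = function(**test['input'])
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print("Test Output is ", output)
    if test['output'] == output:
        print("\033[32mTEST PASSED\033[0m")  # green
    else:
        print("\033[31mTEST FAILED\033[0m")  # red
    print("Function Execution Time: ", execution_time, " seconds")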

In [3]:
def evaluate_test_cases(function, tests):
    """Run every test case in `tests`, printing output, status, and execution time for each."""
    for test in tests:
        # Reuse the single-test helper so both functions report in the same format
        evaluate_test_case(function, test)
        print(50 * "===")
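Both helpers call `function(**test['input'])` and compare the result with `test['output']`, so each test case is expected to be a dict with those two keys. The sketch below shows what such a test case could look like. The function `find_index` and its parameter names `items` and `key` are hypothetical stand-ins chosen for illustration.

```python
def find_index(items, key):
    """Hypothetical function under test: linear scan, returns -1 if absent."""
    for i, item in enumerate(items):
        if item == key:
            return i
    return -1


# The keys of 'input' must match the parameter names of the function under test,
# because the helpers unpack the dict as keyword arguments.
test = {
    'input': {'items': [3, 8, 15, 23, 42], 'key': 15},
    'output': 2,
}

# A list of such dicts is the shape evaluate_test_cases iterates over
tests = [
    test,
    {'input': {'items': [3, 8, 15, 23, 42], 'key': 7}, 'output': -1},
]

result = find_index(**test['input'])
passed = result == test['output']
print(result, passed)
```

With these definitions in place, a call such as `evaluate_test_case(find_index, test)` would run the same comparison and also report the timing.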

### Loading data from file

In [4]:
data = pd.read_csv("../ISBN_Example.csv", sep="|")
data

| Unnamed: 0 | isbn | name | author |
|---|---|---|---|
| 0 | 9781492032649 | Hands-On Machine Learning with Scikit-Learn, K... | Aurélien Géron |
| 1 | 9781789955750 | Python Machine Learning | Sebastian Raschka, Vahid Mirjalili |
| 2 | 9780262035613 | Deep Learning | Ian Goodfellow, Yoshua Bengio, Aaron Courville |
| 3 | 9780596529321 | Programming Collective Intelligence | Toby Segaran |
| 4 | 9781491957660 | Python for Data Analysis | Wes McKinney |
| 5 | 9781449361327 | Data Science for Business | Foster Provost, Tom Fawcett |
| 6 | 9781449369415 | Introduction to Machine Learning with Python | Andreas C. Müller, Sarah Guido |
| 7 | 9780999247108 | Machine Learning Yearning | Andrew Ng |
| 8 | 9781617294631 | Natural Language Processing in Action | Lane, Howard, and Hapke |
| 9 | 9781492041139 | Data Science from Scratch | Joel Grus |
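The `isbn` values shown above are not in ascending order, so they would need to be sorted before interpolation search can be applied (with the DataFrame, something like `data.sort_values("isbn")` would do this). As a small worked example, the sketch below sorts the ten ISBNs from the output as a plain Python list and computes only the first probe position for one of them. The variable names and the choice of key are assumptions made for illustration.

```python
# ISBN values copied from the output above, in file order
isbns = [
    9781492032649, 9781789955750, 9780262035613, 9780596529321,
    9781491957660, 9781449361327, 9781449369415, 9780999247108,
    9781617294631, 9781492041139,
]

# Interpolation search requires ascending order
sorted_isbns = sorted(isbns)
already_sorted = isbns == sorted_isbns

key = 9781492032649  # ISBN from row 0 of the output
lo, hi = 0, len(sorted_isbns) - 1

# First probe position from the interpolation formula (integer division)
first_probe = lo + (hi - lo) * (key - sorted_isbns[lo]) // (
    sorted_isbns[hi] - sorted_isbns[lo]
)

print("already sorted:", already_sorted)
print("first probe index:", first_probe)
print("actual index:", sorted_isbns.index(key))
```

Here the first estimate lands on index 7 while the key sits at index 6, so the search would continue in the range to the left of the probe.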
