<a href="https://colab.research.google.com/github/2003Yash/Fuzzy-Search---RapidFuzz/blob/main/Fuzzy_Search_RapidFuzz.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install rapidfuzz

Collecting rapidfuzz
  Downloading rapidfuzz-3.12.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Downloading rapidfuzz-3.12.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
Installing collected packages: rapidfuzz
Successfully installed rapidfuzz-3.12.2


In [3]:
from rapidfuzz import process, fuzz

In [4]:
# Custom dataset of 50 product names
product_names = [
    "Apple iPhone 15", "Samsung Galaxy S23", "Google Pixel 8", "OnePlus 11", "Xiaomi Redmi Note 12",
    "Sony PlayStation 5", "Microsoft Xbox Series X", "Nintendo Switch OLED", "Apple MacBook Pro 14",
    "Dell XPS 13", "HP Spectre x360", "Asus ROG Strix G16", "Lenovo ThinkPad X1 Carbon", "Acer Predator Helios 300",
    "Bose QuietComfort 45", "Sony WH-1000XM5", "Apple AirPods Pro", "Samsung Galaxy Buds 2", "JBL Flip 6",
    "Amazon Echo Dot 5th Gen", "Google Nest Hub", "Roku Streaming Stick 4K", "Nvidia RTX 4090", "AMD Ryzen 9 7950X",
    "Intel Core i9-13900K", "Samsung 980 Pro SSD", "Crucial DDR5 RAM", "Logitech MX Master 3", "Razer DeathAdder V3",
    "Corsair K95 RGB Keyboard", "HyperX Cloud Alpha", "SteelSeries Arctis Pro", "Oculus Quest 3",
    "Fitbit Charge 5", "Apple Watch Ultra", "Garmin Fenix 7", "Samsung Galaxy Watch 5",
    "GoPro Hero 11", "DJI Mini 3 Pro", "Canon EOS R6", "Sony A7 IV", "Fujifilm X-T5",
    "LG C3 OLED TV", "Samsung QN90B QLED TV", "Sony Bravia XR A80K", "TCL 6-Series Roku TV",
    "Philips Hue Smart Lights", "TP-Link WiFi 6 Router", "Eufy Smart Lock", "Ring Video Doorbell 4"
]

In [5]:
# Function to perform fuzzy search
def fuzzy_product_search(query, products, top_n=5):
    # process.extract() returns a list of (name, score, index) tuples, best match first;
    # index is the position of the match in the original `products` list.
    matches = process.extract(query, products, scorer=fuzz.ratio, limit=top_n)
    return matches

In [6]:
# User input for search query
user_query = input("Enter product name: ")

# Perform fuzzy search
results = fuzzy_product_search(user_query, product_names)

# Display results
print("\n🔍 Top Matches:")
for name, score, _ in results:
    print(f"✅ {name} (Similarity: {score}%)")

Enter product name: Smart

🔍 Top Matches:
✅ Eufy Smart Lock (Similarity: 50.0%)
✅ Philips Hue Smart Lights (Similarity: 34.48275862068966%)
✅ Samsung Galaxy Watch 5 (Similarity: 29.629629629629626%)
✅ Samsung Galaxy S23 (Similarity: 26.086956521739136%)
✅ Sony PlayStation 5 (Similarity: 26.086956521739136%)


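# How fuzz.ratio (Similarity) Works
`fuzz.ratio` is a normalized edit-distance score. Classic Levenshtein distance counts the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into another. RapidFuzz's `fuzz.ratio` uses the closely related Indel distance, which allows only insertions and deletions (so a substitution costs 2), and scales it by the combined length of both strings: ratio = 100 × (1 − distance / (len(a) + len(b))). For the query "Smart" against "Eufy Smart Lock", 10 characters have to be deleted, so the score is 100 × (1 − 10/20) = 50.0, which matches the output above.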
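
To make the normalization concrete, here is a minimal pure-Python sketch that reproduces the 50.0 score from the search output above. It assumes the score is derived from insertions and deletions only, computed through the longest common subsequence (LCS). The helper names `lcs_length` and `indel_ratio` are hypothetical and not part of RapidFuzz, whose own implementation is far more optimized.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_ratio(a: str, b: str) -> float:
    """Similarity in [0, 100] based on insert/delete distance (hypothetical helper)."""
    total = len(a) + len(b)
    if total == 0:
        return 100.0  # assumption of this sketch: two empty strings count as identical
    # Every character outside the common subsequence must be inserted or deleted.
    distance = total - 2 * lcs_length(a, b)
    return 100.0 * (1 - distance / total)


print(indel_ratio("Smart", "Eufy Smart Lock"))  # 50.0
```

Because the comparison is case-sensitive and length-sensitive, a short query scores low against a long product name even when the query appears in it verbatim.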

------------------------------------------------------------------------

## Enhancing the Search with Multiple Fuzzy Ratio Scores

RapidFuzz offers several ratio scorers, each measuring text similarity in a different way:

1.   Ratio – Direct character-level similarity between query and product.
2.   Partial Ratio – Matches substrings when input is incomplete or extra words exist.
3.   Token Sort Ratio – Compares words after sorting to ignore word order differences.
4.   Token Set Ratio – Handles duplicate/missing words by comparing unique word sets.


These scores help rank results based on relevance in fuzzy searching. 🚀
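
The token-based scorers are easier to follow once you see which strings they end up comparing. The sketch below shows only the preprocessing step, following the approach commonly described for token sort and token set matching. The helper names `sort_tokens` and `token_set_strings` are hypothetical, and RapidFuzz's actual implementation may differ in detail.

```python
def sort_tokens(text: str) -> str:
    """Split on whitespace, sort the words, and rejoin them (token sort idea)."""
    return " ".join(sorted(text.split()))


def token_set_strings(a: str, b: str):
    """Build the three strings a token-set comparison works with (hypothetical helper)."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    common = " ".join(sorted(tokens_a & tokens_b))   # words shared by both inputs
    only_a = " ".join(sorted(tokens_a - tokens_b))   # words unique to a
    only_b = " ".join(sorted(tokens_b - tokens_a))   # words unique to b
    return common, f"{common} {only_a}".strip(), f"{common} {only_b}".strip()


print(sort_tokens("galaxy samsung s23"))
# galaxy s23 samsung
print(token_set_strings("samsung", "samsung galaxy s23"))
# ('samsung', 'samsung', 'samsung galaxy s23')
```

A token set score is then the best plain ratio among pairs of these three strings. In the second call the first two strings are identical, which is why a one-word query that matches a whole word of the product name gets a token set score of 100.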


In [10]:
# Function to compute all fuzzy factors for each product
def compute_fuzzy_scores(query, product):
    query = query.lower().strip()
    product = product.lower().strip()

    score1 = fuzz.ratio(query, product)
    score2 = fuzz.partial_ratio(query, product)
    score3 = fuzz.token_sort_ratio(query, product)
    score4 = fuzz.token_set_ratio(query, product)

    # Weighted score combination
    weighted_score = (0.4 * score1) + (0.2 * score2) + (0.2 * score3) + (0.2 * score4)
    return score1, score2, score3, score4, weighted_score
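
As a quick check of the weighting, this sketch recomputes the final score for "Samsung Galaxy S23" from the four component scores shown in the results table at the end of this notebook. `WEIGHTS` and `weighted_score` are hypothetical names used only for this illustration.

```python
# Weights in the same order as the scores: ratio, partial, token sort, token set
WEIGHTS = (0.4, 0.2, 0.2, 0.2)


def weighted_score(scores, weights=WEIGHTS):
    """Combine component scores into a single value using fixed weights."""
    return sum(w * s for w, s in zip(weights, scores))


# Component scores for the query "samsung" against "Samsung Galaxy S23"
final = weighted_score((56.0, 100.0, 56.0, 100.0))
print(round(final, 1))  # 73.6
```

Because the weights sum to 1, the combined score stays on the same 0 to 100 scale as its components, so it can be compared directly against a percentage threshold.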


In [11]:
# Function to perform fuzzy search with detailed output
def fuzzy_product_search(query, products, top_n=5, initial_threshold=60):
    query = query.lower().strip()  # Normalize user input

    # Scores do not depend on the threshold, so compute and sort them once.
    scored_results = sorted(
        ((product, *compute_fuzzy_scores(query, product)) for product in products),
        key=lambda x: x[-1],
        reverse=True
    )

    # Dynamic thresholding: start at initial_threshold (60 by default) and,
    # if nothing qualifies, lower it by 10 per pass down to a floor of 30.
    threshold = initial_threshold
    while threshold >= 30:
        # Keep matches whose final weighted score meets the current threshold
        filtered_results = [res for res in scored_results if res[-1] >= threshold]

        if filtered_results:  # If results found, return them
            return filtered_results[:top_n]
        threshold -= 10  # Relax the threshold and try again

    return []
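
The threshold relaxation can be studied in isolation on scores that are already computed. The sketch below is a standalone illustration: `relax_threshold` is a hypothetical helper, and the item names and scores are made up for the example.

```python
def relax_threshold(scored, start=60, floor=30, step=10):
    """Return (threshold_used, matches) for the first threshold that yields matches."""
    threshold = start
    while threshold >= floor:
        matches = [(name, score) for name, score in scored if score >= threshold]
        if matches:
            return threshold, matches
        threshold -= step  # nothing qualified, so relax the cutoff and retry
    return None, []


# Made-up scores, for illustration only
scored = [("Item A", 44.0), ("Item B", 31.5), ("Item C", 12.0)]
print(relax_threshold(scored))  # (40, [('Item A', 44.0)])
```

Note that the loop stops at the first threshold that produces any match. "Item B" would pass the floor of 30, but it is not returned because "Item A" already satisfied the threshold of 40.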

In [12]:
# User input for search query
user_query = input("Enter product name: ").strip()

# Perform fuzzy search
results = fuzzy_product_search(user_query, product_names)

# Display results
print("\n🔍 **Top Matches with Detailed Fuzzy Scores:**")
if results:
    print(f"{'Product Name':<35} | {'Ratio':<6} | {'Partial':<8} | {'Sort':<6} | {'Set':<6} | {'Final Score':<6}")
    print("-" * 90)

    for name, ratio, partial, sort, set_score, final in results:
        print(f"{name:<35} | {ratio:<6.1f} | {partial:<8.1f} | {sort:<6.1f} | {set_score:<6.1f} | {final:<6.1f}")
else:
    print("❌ No relevant matches found!")


Enter product name: samsung

🔍 **Top Matches with Detailed Fuzzy Scores:**
Product Name                        | Ratio  | Partial  | Sort   | Set    | Final Score
------------------------------------------------------------------------------------------
Samsung Galaxy S23                  | 56.0   | 100.0    | 56.0   | 100.0  | 73.6  
Samsung 980 Pro SSD                 | 53.8   | 100.0    | 53.8   | 100.0  | 72.3  
Samsung Galaxy Buds 2               | 50.0   | 100.0    | 50.0   | 100.0  | 70.0  
Samsung QN90B QLED TV               | 50.0   | 100.0    | 50.0   | 100.0  | 70.0  
Samsung Galaxy Watch 5              | 48.3   | 100.0    | 48.3   | 100.0  | 69.0  
