#### Fuzzywuzzy for String Matching
It uses the _Levenshtein distance_ to measure the difference between two sequences.
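
For reference, the classic Levenshtein distance (insertion, deletion and substitution each cost 1) can be computed with a small dynamic-programming routine. The sketch below is a from-scratch illustration, not fuzzywuzzy's own code; the function name `levenshtein` is chosen here for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insert, delete and substitute each cost 1."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("ab", "ac"))           # 1
```

Note that the ratio used in the next section weights substitutions differently from this classic distance.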

In [2]:
from fuzzywuzzy import fuzz

#### Simple Ratio

It measures the Levenshtein ratio, i.e.

$$
\text{similarity} = 1 - \frac{\text{lev}(a, b)}{\text{len}(a) + \text{len}(b)}
$$


***When calculating the ratio, a substitution counts as one deletion plus one insertion, so it increases the distance by 2 rather than 1.***

In [None]:
string1 = "ab"
string2 = "ac"

# Substitution b -> c: counted as 1 + 1 = 2
print(fuzz.ratio(string1, string2))

(1 - ((1+1) / (2+2)))*100

50


50.0

In [11]:
string1 = "ab"
string2 = "a"

# Deletion b: 1
print(fuzz.ratio(string1, string2))

(1 - ((1) / (1+2)))*100

67


66.66666666666667

In [12]:
string1 = "Hello"
string2 = "Hello Worlds"

# Addition " Worlds" -> 7
print(fuzz.ratio(string1, string2))

(1 - ((7) / (5+12)))*100

59


58.82352941176471

In [15]:
string1 = "Apple iPhone 14 Pro Max"
string2 = "iPhone 14"

# Deletion of "Apple " (6) and " Pro Max" (8) -> 14
print(fuzz.ratio(string1, string2))

(1 - ((14) / (9+23)))*100

56


56.25
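
The four results above can be reproduced without the library. The sketch below assumes the distance in the formula is the insert/delete-only edit distance (so a substitution costs 2), which equals `len(a) + len(b) - 2 * LCS(a, b)`. The helper names `lcs_length` and `simple_ratio` are hypothetical, and the real `fuzz.ratio` may differ on other inputs depending on which backend is installed.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for ch_a in a:
        current = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def simple_ratio(a: str, b: str) -> int:
    """Similarity on a 0-100 scale, with a substitution counted as 2 edits."""
    total = len(a) + len(b)
    if total == 0:
        return 100  # assumption of this sketch: two empty strings are identical
    distance = total - 2 * lcs_length(a, b)  # insertions + deletions only
    return round((1 - distance / total) * 100)


pairs = [
    ("ab", "ac"),
    ("ab", "a"),
    ("Hello", "Hello Worlds"),
    ("Apple iPhone 14 Pro Max", "iPhone 14"),
]
for left, right in pairs:
    print(left, "|", right, "->", simple_ratio(left, right))  # 50, 67, 59, 56
```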

#### Partial Ratio

It is a string similarity metric that measures how well a **shorter string** matches a **substring of a longer string**.

#### How it Works

String 1: Apple iPhone 14 Pro Max

String 2: iPhone 14

Step 1: Identify the shorter and longer string
- Shorter = "iPhone 14"
- Longer = "Apple iPhone 14 Pro Max"

Step 2: Slide the shorter string over the substrings of the longer one
- It compares "iPhone 14" with every possible substring of "Apple iPhone 14 Pro Max" that’s roughly the same length, and calculates a similarity ratio for each (using Levenshtein distance).

Step 3: Return the highest similarity score found in Step 2.
- The algorithm finds the substring "iPhone 14" inside the longer string, which is an exact match, so the score is 100.
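
The three steps can be sketched with the standard library alone. This is a simplified illustration, not fuzzywuzzy's implementation: it only tries windows of exactly the shorter string's length and scores them with `difflib.SequenceMatcher`. The names `window_ratio` and `partial_ratio_sketch` are hypothetical.

```python
from difflib import SequenceMatcher


def window_ratio(a: str, b: str) -> int:
    """Similarity of two strings on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def partial_ratio_sketch(s1: str, s2: str):
    """Return (best score, best window) for the shorter string inside the longer."""
    # Step 1: identify the shorter and the longer string.
    shorter, longer = sorted((s1, s2), key=len)
    width = len(shorter)

    best_score, best_window = 0, ""
    # Step 2: slide a window of the shorter string's length over the longer one.
    for start in range(len(longer) - width + 1):
        window = longer[start:start + width]
        score = window_ratio(shorter, window)
        # Step 3: keep the highest score seen so far.
        if score > best_score:
            best_score, best_window = score, window
    return best_score, best_window


print(partial_ratio_sketch("Apple iPhone 14 Pro Max", "iPhone 14"))
# (100, 'iPhone 14')
```

Restricting the search to same-length windows keeps the number of comparisons linear in the length of the longer string.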

In [18]:
def all_substrings(text):
    """Return every contiguous substring of `text`."""
    substrings = []

    # Generate all contiguous substrings
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            substrings.append(text[start:end])

    return substrings


In [None]:
string1 = "Apple iPhone 14 Pro Max"
string2 = "iPhone 14"

print(f"Partial Ratio between {string1} and {string2} is {fuzz.partial_ratio(string1, string2)}")


# Step 1: Identify the shorter and longer string
    # Shorter: iPhone 14
    # Longer: Apple iPhone 14 Pro Max

# Step 2 & 3: Compare the shorter string with every substring of the longer one,
# and return the maximum Levenshtein ratio
substrings = all_substrings(string1)
substring_ratios = [fuzz.ratio(string2, sub_str) for sub_str in substrings]
max_ratio = max(substring_ratios)

print(f"Maximum Levenshtein Ratio from the substring: {substrings[substring_ratios.index(max_ratio)]} is: {max_ratio}")

Partial Ratio between Apple iPhone 14 Pro Max and iPhone 14 is 100
Maximum Levenshtein Ratio from the substring: iPhone 14 is: 100


(Truncated listing of the contiguous substrings of "Apple iPhone 14 Pro Max" in generation order: A, Ap, App, ..., Apple iPhone 14 Pro Max, p, pp, ppl, ...)