# Spell Checker with Levenshtein Distance

This code is a **spell checker** that uses the **Levenshtein distance** algorithm to find and suggest corrections for misspelled words. It loads a dictionary of words from a file, checks if a given word is in the dictionary, and suggests similar words if the word is not found.

---

## Code Overview

The code consists of the following main parts:

1. **Importing Required Modules:**
   - The necessary modules (`typing`) and the `optimized_levenshtein_distance` function are imported.

2. **Loading the Dictionary:**
   - A function to load a dictionary of words from a file and check if it is empty.

3. **Finding Suggestions:**
   - A function to find the most similar words from the dictionary based on the Levenshtein distance.

4. **Spell Checking:**
   - A function to check if a word is in the dictionary and suggest corrections if it is not found.

5. **Main Function:**
   - The main function that loads the dictionary, takes user input, and performs the spell check.

## Detailed Explanation

### 1. Importing Required Modules

```python
from opt_levenshtein import optimized_levenshtein_distance
from typing import List, Set
```

- **optimized_levenshtein_distance**: This is a function from another file (`opt_levenshtein.py`). It measures how different two words are, counted as the minimum number of single-letter insertions, deletions, or substitutions needed to turn one word into the other. For example, turning "cat" into "dog" takes 3 substitutions.

- **typing**: This provides type hints such as `List` and `Set`, which make the code easier to read and maintain.
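The imported distance function is not shown in this notebook. As a rough illustration of what such a function computes, here is a minimal sketch of the standard dynamic-programming Levenshtein distance that keeps only two rows of the table. The name `levenshtein_sketch` is hypothetical, and the real `optimized_levenshtein_distance` may be implemented differently.

```python
def levenshtein_sketch(a: str, b: str) -> int:
    """Edit distance between a and b (hypothetical stand-in, two-row DP)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in b so each row stays small

    # previous[j] = distance from the empty prefix of a to b[:j]
    previous = list(range(len(b) + 1))

    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current

    return previous[-1]


print(levenshtein_sketch("cat", "cot"))  # 1
print(levenshtein_sketch("cat", "dog"))  # 3
```

Keeping two rows instead of the full table reduces memory to the length of the shorter word, which matters little for single words but is a common optimization.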

### 2. Loading the Dictionary

This function reads a file (such as `english_words.txt`) and builds a set of words to use as the dictionary. Each word is lowercased and stripped of surrounding whitespace, and blank lines are skipped.

Note: I prepared a small dictionary file called `english_words.txt` with 604 English words. It works for testing, but for better accuracy you can use a larger dictionary file.

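```python
def load_dictionary(file_path: str) -> Set[str]:
    """Load a word list (one word per line) into a set of lowercase words."""
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            # Lowercase each line, trim whitespace, and skip blank lines.
            dictionary = {word.strip().lower() for word in f if word.strip()}
    except FileNotFoundError:
        raise FileNotFoundError(f"Error: File '{file_path}' not found.")
    except Exception as e:
        raise Exception(f"Error while loading the dictionary: {e}") from e

    # Checked outside the try block so the ValueError is not re-wrapped above.
    if not dictionary:
        raise ValueError("Error: The dictionary is empty!")

    return dictionary
```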
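To see what the normalization does, the same set comprehension can be run on an in-memory file. This is only a small sketch: the sample lines are made up for illustration, and `io.StringIO` stands in for the real dictionary file.

```python
import io

# Made-up file contents: mixed case, padding, a blank line, and a duplicate.
raw_file = io.StringIO("Apple\n  banana  \n\nAPPLE\ncherry\n")

# Same comprehension as in load_dictionary, applied to the fake file.
words = {line.strip().lower() for line in raw_file if line.strip()}

print(sorted(words))  # ['apple', 'banana', 'cherry']
```

Because the result is a set, duplicates such as "Apple" and "APPLE" collapse into one entry, and membership checks like `word in dictionary` take constant time on average.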

### 3. Finding Suggestions

This function finds words from the dictionary that are similar to the input word using the Levenshtein distance.

- **`max_suggestions: int = 5`**: The maximum number of suggestions to return. The default is 5; fewer are returned when fewer words fall within `max_distance`.
- **`max_distance: int = 3`**: The largest Levenshtein distance at which a word still counts as similar. The default is 3, meaning up to 3 single-letter edits (adding, removing, or changing a letter).
- **`distances = [(dict_word, optimized_levenshtein_distance(word, dict_word)) for dict_word in dictionary]`**: This builds a list called `distances`. Each entry is a pair: a word from the dictionary (`dict_word`) and its Levenshtein distance from the word you entered (`word`).


###### Why Sorting Helps `distances = sorted(distances, key=lambda x: (x[1], x[0]))`
- Sorting by distance puts the most similar words at the start. For example, if "cat" is 1 step away from "cot" and 3 steps away from "dog", "cot" will be first because it is closer.
- Sorting alphabetically breaks ties when distances are the same. For example, if "cot" and "cut" both have a distance of 1, "cot" comes before "cut" because "o" comes before "u" in the alphabet.
- This makes it easier to pick the best suggestions when we take the top `max_suggestions` words.

###### Example
- Before sorting: `[("dog", 3), ("cot", 1), ("cut", 1)]`
- After sorting: `[("cot", 1), ("cut", 1), ("dog", 3)]`
- This shows "cot" and "cut" are the closest to your word. They come before "dog" because their distance is smaller, and "cot" comes before "cut" because of alphabetical order.

In short, sorting helps the function give you the best and most organized word suggestions.
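The sorting step and the final cut can be tried on their own with hard-coded pairs. The sketch below uses distances for the input "cat"; the word list and the values chosen for `max_suggestions` and `max_distance` are picked only for illustration.

```python
# (word, distance from "cat") pairs, in arbitrary order.
pairs = [("dog", 3), ("cut", 1), ("banana", 5), ("cot", 1)]

# Sort by distance first, then alphabetically to break ties.
ranked = sorted(pairs, key=lambda pair: (pair[1], pair[0]))
print(ranked)  # [('cot', 1), ('cut', 1), ('dog', 3), ('banana', 5)]

# Take the top candidates, then drop any that are too far away.
max_suggestions, max_distance = 3, 2
suggestions = [w for w, d in ranked[:max_suggestions] if d <= max_distance]
print(suggestions)  # ['cot', 'cut']
```

Since the list is already sorted by distance, slicing before filtering gives the same result as filtering before slicing.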



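```python
def get_suggestions_levenshtein(word: str, dictionary: Set[str], max_suggestions: int = 5, max_distance: int = 3) -> List[str]:
    """Return up to max_suggestions dictionary words within max_distance of word."""
    # Score every dictionary word against the input word.
    distances = [(dict_word, optimized_levenshtein_distance(word, dict_word)) for dict_word in dictionary]
    # Closest first; ties are broken alphabetically.
    distances = sorted(distances, key=lambda x: (x[1], x[0]))

    # Keep the top candidates that are within the allowed distance.
    return [candidate for candidate, dist in distances[:max_suggestions] if dist <= max_distance]
```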
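For a large dictionary, sorting every scored word just to keep five of them is more work than needed. One possible alternative, sketched here and not part of the original code, is `heapq.nsmallest`, which selects the smallest items without a full sort. The names `top_suggestions` and `distance_sketch` are hypothetical; `distance_sketch` is a simple memoized stand-in for `optimized_levenshtein_distance`.

```python
import heapq
from functools import lru_cache
from typing import Callable, List, Set


@lru_cache(maxsize=None)
def distance_sketch(a: str, b: str) -> int:
    """Recursive Levenshtein distance (hypothetical stand-in, fine for short words)."""
    if not a:
        return len(b)
    if not b:
        return len(a)
    cost = 0 if a[0] == b[0] else 1
    return min(
        distance_sketch(a[1:], b) + 1,         # delete from a
        distance_sketch(a, b[1:]) + 1,         # insert into a
        distance_sketch(a[1:], b[1:]) + cost,  # substitute or match
    )


def top_suggestions(word: str, dictionary: Set[str],
                    distance: Callable[[str, str], int],
                    max_suggestions: int = 5, max_distance: int = 3) -> List[str]:
    """Pick the closest words without sorting the whole dictionary."""
    scored = ((distance(word, w), w) for w in dictionary)
    within = (pair for pair in scored if pair[0] <= max_distance)
    # Tuples compare by distance first, then alphabetically by word.
    best = heapq.nsmallest(max_suggestions, within)
    return [w for _, w in best]


words = {"eat", "exam", "hit", "it", "next", "banana"}
print(top_suggestions("exit", words, distance_sketch))
# ['eat', 'exam', 'hit', 'it', 'next']
```

The ordering matches the sorted version because `(distance, word)` tuples compare the same way as the `(x[1], x[0])` sort key.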

### 4. Spell Checking

The `spell_check_levenshtein` function checks whether a given word is spelled correctly by comparing it to a provided dictionary. If the word is not found in the dictionary, the function suggests possible corrections based on the Levenshtein distance algorithm.

##### Parameters
- `word` (`str`): The word to check. It is converted to lowercase and stripped of leading and trailing spaces before the lookup.
- `dictionary` (`Set[str]`): A set of correctly spelled words to check against.
- `max_distance` (`int`, optional): The maximum allowed Levenshtein distance for the suggested corrections. Default is 3.

##### Returns
A message string:
- If the word is empty after stripping: "❌ Please enter a non-empty word!"
- If the word is found in the dictionary: "✅ The word 'word' is spelled correctly."
- If the word is not found, and suggestions are available: "❌ The word 'word' was not found. Possible corrections:" followed by a numbered list of up to 5 suggestions, one per line.
- If no suggestions are found: "❌ The word 'word' was not found. No similar words found."

```python
def spell_check_levenshtein(word: str, dictionary: Set[str], max_distance: int = 3) -> str:
    """Return a message saying whether word is in dictionary, with suggestions if not."""
    # Normalize the input the same way the dictionary words were normalized.
    word = word.lower().strip()
    if not word:
        return "❌ Please enter a non-empty word!"

    if word in dictionary:
        return f"✅ The word '{word}' is spelled correctly."

    suggestions = get_suggestions_levenshtein(word, dictionary, max_suggestions=5, max_distance=max_distance)

    if suggestions:
        # Format the suggestions as a numbered list, one per line.
        return f"❌ The word '{word}' was not found. Possible corrections:\n" + \
               "\n".join(f"  {i+1}. {s}" for i, s in enumerate(suggestions))
    else:
        return f"❌ The word '{word}' was not found. No similar words found."
```

### 5. Main Function

The `main` function loads a dictionary of English words from `english_words.txt`, prompts the user to input a word, and then checks the spelling of that word using the `spell_check_levenshtein` function. If any error occurs during execution (e.g., the file is not found or has the wrong format), the function catches the exception and prints an error message.

```python
def main():
    try:
        dictionary = load_dictionary("english_words.txt")
        print(f"✅ Loaded {len(dictionary)} words.", flush=True)
        word = input("Enter a word: ").strip()
        print(spell_check_levenshtein(word, dictionary), flush=True)
    except Exception as e:
        print(f"🚨 Error: {e}", flush=True)
```

`if __name__ == "__main__":`  
  This condition is true only when the script is run directly rather than imported as a module, so `main()` is called only in that case.

```python
if __name__ == "__main__":
    main()
```

```text
✅ Loaded 604 words.
❌ The word 'exit' was not found. Possible corrections:
  1. eat
  2. exam
  3. hit
  4. it
  5. next
```


## Conclusion

This program provides a simple, user-friendly way to check a word's spelling against a word list. When the word is not found, it suggests the closest dictionary words, ranked by Levenshtein distance.