<a href="https://colab.research.google.com/github/is5558/colab_samples/blob/main/tutorials/streamlit_notebooks/SPELL_CHECKER_EN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/SPELL_CHECKER_EN.ipynb)




# **Spell check your text documents**

## 1. Colab Setup

Install dependencies

In [1]:
# Install PySpark and Spark NLP
! pip install -q pyspark spark-nlp


In [6]:
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

def initialize_spark_nlp():
    try:
        spark = sparknlp.start()
        print("Spark NLP version:", sparknlp.version())
        return spark
    except Exception as e:
        print("Error initializing Spark NLP session:", str(e))
        raise

def load_pipeline(pipeline_name='check_spelling', lang='en'):
    """Download (on first use) and load a pretrained Spark NLP pipeline."""
    try:
        return PretrainedPipeline(pipeline_name, lang=lang)
    except Exception as e:
        print(f"Error loading pipeline '{pipeline_name}':", str(e))
        raise

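def get_corrected_text(annotations):
    """Rebuild a sentence from the tokens in the pipeline's 'checked' output."""
    try:
        corrected_tokens = [token.result for token in annotations['checked']]
        # Joining on spaces detaches punctuation; re-attach commas and periods
        return " ".join(corrected_tokens).replace(" ,", ",").replace(" .", ".")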
    except KeyError:
        print("Error: 'checked' key not found in annotations.")
        return ""

def main():
    text = (
        "Yesturday, I went to the libary to borow a book about anciant civilizations. "
        "The wether was pleasent, so I decidid to walk insted of taking the buss. On the way, "
        "I saw a restuarent that lookt intresting, and I plan to viset it soon."
    )

    try:
        # Initialize Spark NLP and load the pipeline
        spark = initialize_spark_nlp()
        pipeline = load_pipeline()

        # Annotate text
        annotations = pipeline.fullAnnotate(text)[0]

        # Get and print corrected text
        corrected_text = get_corrected_text(annotations)
        print("*"*77)
        print("Original Text:\n", text)
        print("Corrected Text:\n", corrected_text)
        print("*"*77)

    except Exception as e:
        print("An unexpected error occurred:", str(e))

main()

Spark NLP version: 6.0.4
check_spelling download started this may take some time.
Approx size to download 884.9 KB
[OK!]
*****************************************************************************
Original Text:
 Yesturday, I went to the libary to borow a book about anciant civilizations. The wether was pleasent, so I decidid to walk insted of taking the buss. On the way, I saw a restuarent that lookt intresting, and I plan to viset it soon.
Corrected Text:
 Yesterday, I went to the library to borrow a book about ancient civilizations. The whether was pleasant, so I decided to walk instead of taking the bus. On the way, I saw a restuarent that looks interesting, and I plan to visit it soon.
*****************************************************************************
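
To see which words the pipeline actually changed, the two texts printed above can be compared word by word. The following is a minimal sketch in plain Python, not part of Spark NLP: `changed_words` is a hypothetical helper, and it assumes the corrected text has exactly one word for each word of the original, which holds for the run above but may not hold in general.

```python
import string


def changed_words(original, corrected):
    """Return (before, after) pairs for every word that differs."""
    orig_words = original.split()
    corr_words = corrected.split()
    # Assumption: corrections are one-for-one, so the two lists line up
    if len(orig_words) != len(corr_words):
        raise ValueError("texts do not align word for word")
    punct = string.punctuation
    return [
        (o.strip(punct), c.strip(punct))  # drop attached commas and periods
        for o, c in zip(orig_words, corr_words)
        if o != c
    ]


# Texts copied from the output above
original = (
    "Yesturday, I went to the libary to borow a book about anciant civilizations. "
    "The wether was pleasent, so I decidid to walk insted of taking the buss. On the way, "
    "I saw a restuarent that lookt intresting, and I plan to viset it soon."
)
corrected = (
    "Yesterday, I went to the library to borrow a book about ancient civilizations. "
    "The whether was pleasant, so I decided to walk instead of taking the bus. On the way, "
    "I saw a restuarent that looks interesting, and I plan to visit it soon."
)

changes = changed_words(original, corrected)
for before, after in changes:
    print(f"{before} -> {after}")
```

A listing like this makes the pipeline's limits easier to spot. In the run above, "wether" became "whether" rather than "weather", "lookt" became "looks" rather than "looked", and "restuarent" was left unchanged, so the corrected text still benefits from a manual review.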
