# Language Processing App

This notebook demonstrates how to create a language processing application using various libraries including Streamlit, Tesseract, Hugging Face Transformers, Autocorrect, Langdetect, and Googletrans.

## Installation

First, install the required libraries. Note that all of the code below must be run as a regular Python script, because Streamlit apps are not supported inside a Jupyter notebook.

```bash
pip install streamlit pillow pytesseract transformers autocorrect langdetect googletrans==4.0.0-rc1
```

### Importing the required libraries:

```python
import streamlit as st
from PIL import Image
import pytesseract
from transformers import T5Tokenizer, TFT5ForConditionalGeneration
from autocorrect import Speller
from langdetect import detect
from googletrans import Translator
```

### The design and colour scheme of the interface (optional):
Here we set the design and colour scheme of the Streamlit interface. This is only done to make the app more eye-catching and can be changed to suit one's taste.

```python
st.markdown(
    """
    <style>
    .stApp {
        background: linear-gradient(to right, #87ceeb, #4682b4, #87ceeb, #ffffff);
        background-attachment: fixed;
        color: white;
    }
    .css-1cpxqw2, .css-1d391kg, .css-1v0mbdj, .css-1a32fsj, .css-1dq8tca, .css-1fawtly {
        background-color: white !important;
        color: black !important;
    }
    .css-1d391kg {
        border: 1px solid #ccc;
    }
    .stTextArea textarea {
        background-color: white !important;
        color: black !important;
    }
    .stButton>button {
        background-color: white;
        color: black;
        border: 1px solid #ccc;
        border-radius: 4px;
    }
    .stButton>button:hover {
        background-color: #f0f0f0;
        color: black;
    }
    .stSelectbox div[role='combobox'] {
        background-color: white !important;
        color: black !important;
    }
    .stSelectbox div[role='listbox'] ul {
        background-color: white !important;
    }
    .stSelectbox div[role='listbox'] ul li {
        color: black !important;
    }
    /* Hide the header and footer */
    header {visibility: hidden;}
    footer {visibility: hidden;}
    </style>
    """,
    unsafe_allow_html=True
)
```

### Loading the first model (Grammar Error Correction):
The first model is loaded here. It provides the Grammar Error Correction functionality.

`@st.cache_resource` makes Streamlit run the function only once and store the result in a cache. Subsequent calls to the function with the same arguments return the cached result instead of recomputing it.

```python
@st.cache_resource
def load_t5_model():
    try:
        model = TFT5ForConditionalGeneration.from_pretrained('t5-small')
        # Load the saved weights inside the try block so that a missing or
        # incompatible weights file is reported instead of crashing the app.
        model.load_weights('GrammarErrorCorrection.h5')
        tokenizer = T5Tokenizer.from_pretrained('t5-small')
        return model, tokenizer
    except Exception as e:
        st.error(f"Error loading T5 model: {e}")
        return None, None

model, tokenizer = load_t5_model()
```

### Loading the second model (Tone enhancement):

The second model is loaded here. It provides the Tone Enhancement functionality.

```python
# Cache the tone model as well so it is not reloaded on every rerun
@st.cache_resource
def load_tone_model():
    try:
        model_2 = TFT5ForConditionalGeneration.from_pretrained('t5-small')
        model_2.load_weights('tone_enhancement.h5')
        tokenizer_2 = T5Tokenizer.from_pretrained('t5-small')
        return model_2, tokenizer_2
    except Exception as e:
        st.error(f"Error loading tone model: {e}")
        return None, None

model_2, tokenizer_2 = load_tone_model()
```

### Defining the function for Grammar Error Correction:

This function takes the incorrect text from the user and returns the corrected text after running it through the model loaded above.

The **text** parameter is the sentence given by the user, while the **model** and **tokenizer** parameters are the loaded model and its tokenizer.
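
As a rough illustration of the text handling around the model call, the sketch below shows only the two string steps: adding a task prefix before generation and capitalizing the first letter afterwards. The helper names `add_task_prefix` and `capitalize_first` are hypothetical, and the model call itself is left out.

```python
def add_task_prefix(text, prefix="correct grammar: "):
    """Prepend the task prefix to the user's sentence (hypothetical helper)."""
    return prefix + text


def capitalize_first(text):
    """Upper-case the first character; slicing keeps this safe for empty strings."""
    return text[:1].upper() + text[1:]


prompt = add_task_prefix("he go to school")
cleaned = capitalize_first("he goes to school")
empty = capitalize_first("")

print(prompt)
print(cleaned)
```

Using `text[:1]` rather than `text[0]` means an empty model output does not raise an `IndexError`.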

```python
def preprocess_text(text):
    # Prepend the task prefix used for grammar correction
    return "correct grammar: " + text

def correct_grammar(text, model, tokenizer):
    if model is None or tokenizer is None:
        return "Model not loaded"

    input_text = preprocess_text(text)
    input_ids = tokenizer.encode(input_text, return_tensors='tf')
    outputs = model.generate(input_ids)
    corrected_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Capitalize the first letter; slicing avoids an IndexError on empty output
    corrected_text = corrected_text[:1].upper() + corrected_text[1:]
    # Run a spell check over the model output
    spell = Speller(lang='en')
    corrected_text = spell(corrected_text)

    return corrected_text
```

### Defining the function for Tone Enhancement:

This function takes the informal text from the user and returns the enhanced text after running it through the tone model loaded above.

The **text** parameter is the informal text, while the **model_2** and **tokenizer_2** parameters are the loaded tone model and its tokenizer.

```python
def enhance_tone(text, model_2, tokenizer_2):
    if model_2 is None or tokenizer_2 is None:
        return "Model not loaded"
    
    input_ids = tokenizer_2.encode(text, return_tensors='tf')
    outputs = model_2.generate(input_ids)
    enhanced_text = tokenizer_2.decode(outputs[0], skip_special_tokens=True)
    
    return enhanced_text
```

### Defining the function for translator:

This function translates a grammatically correct statement into another language using the Google Translate API. It takes two parameters:

**text** - the sentence that needs to be translated.<br>
**target_language** - the desired target language, chosen from the selection.

```python
# Display names mapped to the language codes shared by langdetect and googletrans
LANGUAGE_CODES = {
    'English': 'en', 'Chinese': 'zh-cn', 'Hindi': 'hi', 'Spanish': 'es',
    'German': 'de', 'French': 'fr', 'Bengali': 'bn', 'Portuguese': 'pt',
    'Russian': 'ru', 'Urdu': 'ur', 'Swedish': 'sv', 'Dutch': 'nl',
    'Indonesian': 'id', 'Italian': 'it', 'Gujarati': 'gu', 'Tamil': 'ta',
    'Telugu': 'te', 'Japanese': 'ja', 'Punjabi': 'pa',
}

def translator(text, target_language):
    # Look up the target language code (case-insensitive on the display name)
    codes_by_name = {name.lower(): code for name, code in LANGUAGE_CODES.items()}
    target_language_code = codes_by_name.get(target_language.lower())
    if target_language_code is None:
        return "Unsupported target language."

    # detect() raises an exception when the text has no detectable features
    try:
        original_language_code = detect(text)
    except Exception:
        return "Unable to recognize language."

    if original_language_code not in LANGUAGE_CODES.values():
        return "Unable to recognize language."

    google_translator = Translator()
    translated_text = google_translator.translate(text, src=original_language_code, dest=target_language_code)
    return translated_text.text
```

### Defining the function for text extraction from images:

Text can also be supplied as an image. The function below extracts the text from the image so that the selected functionality can then be run on it.
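
Tesseract accepts several languages at once when their codes are joined with `+`, which is the format of the `languages_codes` string in the function below. As a small sketch, such a string could be built from a list. The helper name `build_tesseract_lang` is hypothetical, and each listed language is assumed to have its trained data installed for Tesseract.

```python
def build_tesseract_lang(codes):
    """Join Tesseract language codes into one `lang` string (hypothetical helper)."""
    return "+".join(codes)


lang_string = build_tesseract_lang(["eng", "hin", "spa"])
print(lang_string)  # eng+hin+spa
```

Keeping the codes in a list makes it easier to add or remove a language than editing one long string by hand.
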
```python
def extract_text_from_image(image):
    languages_codes = 'eng+chi_sim+hin+spa+deu+fra+ben+por+rus+urd+swe+nld+ind+ita+guj+tam+tel+jpn+pan'
    text = pytesseract.image_to_string(image, lang=languages_codes)
    return text
```

### Initializing the Session state variables:

We initialize the session state variable to remember which functionality was selected, so that the app does not go back to its initial state when the script reruns.
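
Streamlit reruns the whole script from top to bottom on every interaction, which is why the selection has to be stored somewhere that survives a rerun. The sketch below imitates this with a plain dictionary in place of `st.session_state`. The `run_script` function and its `clicked` argument are hypothetical and only simulate one run of the script.

```python
def run_script(session_state, clicked=None):
    """Simulate one top-to-bottom run of the app script (hypothetical helper)."""
    # Initialize the key only if it is missing, as in the app
    if "input_selected" not in session_state:
        session_state["input_selected"] = False
    # A button click is only visible during the run that it triggers
    if clicked is not None:
        session_state["input_selected"] = clicked
    return session_state["input_selected"]


state = {}                             # stands in for st.session_state
first = run_script(state)              # initial page load
second = run_script(state, "grammar")  # the user clicks a button
third = run_script(state)              # a later rerun without a click

print(first, second, third)
```

Because the key is only initialized when it is missing, the stored selection is still `"grammar"` on the third run even though no button was clicked in that run.
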
```python
if "input_selected" not in st.session_state:
    st.session_state.input_selected = False
```

## STREAMLIT APP LAYOUT

### TITLE AND INTRODUCTION:
```python
st.header("About the App")
st.write("""
Our language processing app is a state-of-the-art tool that offers grammar correction, tone analysis, and language translation. 
It uses advanced natural language processing techniques to identify and correct errors in sentence structure, analyze the tone of the text, and translate text from one language to another. 
With its high accuracy and speed, this app is perfect for anyone looking to improve their writing skills or simply get instant feedback on their language.
""")
```

### Making interactive buttons:

Create a button for each functionality and give each one a unique key so that Streamlit can tell the buttons apart.

```python
grammar_button = st.button("Grammar Error Correction", key="grammar")
tone_button = st.button("Tone Enhancement", key="tone")
translate_button = st.button("Translator", key="translate")

if grammar_button:
    st.session_state.input_selected = "grammar"

if tone_button:
    st.session_state.input_selected = "tone"

if translate_button:
    st.session_state.input_selected = "translate"
```

### Handling all the functionalities

Here each functionality is connected to its button, which completes the program with all of its specifications and design. Each mode also provides an area to upload an image file, so that text can be extracted from the image and used as input.

**Note :-** When using the translator, the text that needs to be translated must be **GRAMMATICALLY CORRECT**.

```python
if st.session_state.input_selected:
    if st.session_state.input_selected == "grammar":
        input_option = st.radio("Choose input type:", ("Text Input", "Image Input"))

        if input_option == "Text Input":
            with st.form("grammar_form"):
                text_input = st.text_area("Enter your text here:")
                submit_button = st.form_submit_button("Correct Grammar")
                if submit_button:
                    corrected_text = correct_grammar(text_input, model, tokenizer)
                    st.write("Corrected Text:")
                    st.write(corrected_text)
        
        elif input_option == "Image Input":
            with st.form("grammar_form"):
                image_input = st.file_uploader("Upload an image:", type=["png", "jpg", "jpeg"])
                submit_button = st.form_submit_button("Correct Grammar")
                if submit_button and image_input:
                    extracted_text = extract_text_from_image(Image.open(image_input))
                    corrected_text = correct_grammar(extracted_text, model, tokenizer)
                    st.write("Extracted Text from Image:")
                    st.write(extracted_text)
                    st.write("Corrected Text:")
                    st.write(corrected_text)

    elif st.session_state.input_selected == "tone":
        input_option = st.radio("Choose input type:", ("Text Input", "Image Input"))

        if input_option == "Text Input":
            with st.form("tone_form"):
                text_input = st.text_area("Enter your text here:")
                submit_button = st.form_submit_button("Enhance Tone")
                if submit_button:
                    enhanced_text = enhance_tone(text_input, model_2, tokenizer_2)
                    st.write("Enhanced Text:")
                    st.write(enhanced_text)
        
        elif input_option == "Image Input":
            with st.form("tone_form"):
                image_input = st.file_uploader("Upload an image:", type=["png", "jpg", "jpeg"])
                submit_button = st.form_submit_button("Enhance Tone")
                if submit_button and image_input:
                    extracted_text = extract_text_from_image(Image.open(image_input))
                    enhanced_text = enhance_tone(extracted_text, model_2, tokenizer_2)
                    st.write("Extracted Text from Image:")
                    st.write(extracted_text)
                    st.write("Enhanced Text:")
                    st.write(enhanced_text)

    elif st.session_state.input_selected == "translate":
        input_option = st.radio("Choose input type:", ("Text Input", "Image Input"))

        if input_option == "Text Input":
            with st.form("translate_form"):
                text_input = st.text_area("Enter your text here (ensure it is grammatically correct):")
                languages = ['English', 'Chinese', 'Hindi', 'Spanish', 'German', 'French', 'Bengali', 'Portuguese', 'Russian', 'Urdu', 'Swedish', 'Dutch', 'Indonesian', 'Italian', 'Gujarati', 'Tamil', 'Telugu', 'Japanese', 'Punjabi']
                target_language = st.selectbox("Select the target language:", languages)
                submit_button = st.form_submit_button("Translate")
                if submit_button:
                    translated_text = translator(text_input, target_language)
                    st.write("Translated Text:")
                    st.write(translated_text)
        
        elif input_option == "Image Input":
            with st.form("translate_form"):
                image_input = st.file_uploader("Upload an image:", type=["png", "jpg", "jpeg"])
                languages = ['English', 'Chinese', 'Hindi', 'Spanish', 'German', 'French', 'Bengali', 'Portuguese', 'Russian', 'Urdu', 'Swedish', 'Dutch', 'Indonesian', 'Italian', 'Gujarati', 'Tamil', 'Telugu', 'Japanese', 'Punjabi']
                # Show the language selector before submission so the user's choice is actually used
                target_language = st.selectbox("Select the target language:", languages)
                submit_button = st.form_submit_button("Translate")
                if submit_button and image_input:
                    extracted_text = extract_text_from_image(Image.open(image_input))
                    translated_text = translator(extracted_text, target_language)
                    st.write("Extracted Text from Image:")
                    st.write(extracted_text)
                    st.write("Translated Text:")
                    st.write(translated_text)
```
