# Paraphrase Class

This notebook contains the class to paraphrase the paragraph or at least give recommendation of what to paraphrase just like Grammarly.

## Start Notebook
Run all of the command below to start the notebook training session of the model.

Uncomment and run the code below to kill the runtime in Google Colaboratory

In [None]:
# !kill -9 -1


Import necessary libraries

In [3]:
# Import libraries
import re
import nltk


## Paraphrasing Text (Under Development)
This function is used to paraphrase the sentences to avoid direct plagiarism.

In [124]:
def paraphrase_paragraph(paragraphs):
    '''

    Function to paraphrase list of paragraphs.

    @paragraph: A list containing the paragraphs

    return: Paraphrased paragraphs

    '''

    def tag(sentence):
        '''

        Function to tag a word with their type.

        @sentence: String sentence

        return: List of words with their tags

        '''

        words = nltk.tokenize.word_tokenize(sentence)
        words = nltk.tag.pos_tag(words)

        return words

    def paraphraseable(tag):
        return tag.startswith('NN') or tag == 'VB' or tag.startswith('JJ')

    def pos(tag):
        if tag.startswith('NN'):
            return nltk.corpus.wordnet.NOUN
        elif tag.startswith('V'):
            return nltk.corpus.wordnet.VERB

    def synonyms(word, tag): 
        lemma_lists = [ss.lemmas() for ss in nltk.corpus.wordnet.synsets(word, pos(tag))]
        lemmas = [lemma.name() for lemma in sum(lemma_lists, [])]
        return set(lemmas)

    def if_synonym_exists(sentence):
        for (word, t) in tag(sentence):
            if paraphraseable(t):
                syns = synonyms(word, t)
                if syns:
                    if len(syns) > 1:
                        yield [word, list(syns)[1]]
                        continue
            yield [word, '']

    def sentence_paraphrase(sentence):
        return [w for w in if_synonym_exists(sentence)]

    # 2D array of sentences in each paragraph
    list_sentences = []

    # Convert a list paragraph into lists of sentences
    for paragraph in paragraphs:
        sentences = [s for s in nltk.tokenize.sent_tokenize(paragraph)]

        # Loop over the sentences
        for sentence in sentences:
            sentence = sentence_paraphrase(sentence)

        list_sentences.append(sentences)

    return list_sentences


Testing the paraphraser function

In [125]:
test_paragraphs = ['At its core, AI is the branch of computer science that aims to answer Turing question in the affirmative. It is the endeavor to replicate or simulate human intelligence in machines.', 'The expansive goal of artificial intelligence has given rise to many questions and debates. So much so, that no singular definition of the field is universally accepted.']

print(paraphrase_paragraph(test_paragraphs))


[['At its core, AI is the branch of computer science that aims to answer Turing question in the affirmative.', 'It is the endeavor to replicate or simulate human intelligence in machines.'], ['The expansive goal of artificial intelligence has given rise to many questions and debates.', 'So much so, that no singular definition of the field is universally accepted.']]


## Grammar Check (Under Development)
This function is used to check and correct any grammatical error in the paragraph.