# NLP Miniproject

### Comprehensive Answer Evaluation system
**- Bodhisatya Ghosh (2021700026)** \
**- Anish Gade (2021700022)**

In [11]:
from functions import *

**System inputs**

- Evaluation type
    - Subjective
    - Keyword based
- Synoptic answer
- Submitted answer
- Maximum marks

### Functions used

<em><b>Evaluation function:</b></em>

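    def evaluate(evaluation: str, synoptic: list[str] | str, submitted: str, marks: int):
        score_factor = None

        if evaluation == 'keyword':
            # Accept either a list of key phrases or a comma-separated string
            phrases = synoptic.split(',') if isinstance(synoptic, str) else synoptic
            score_factor = keyphrase_match_score(phrases, submitted)

        elif evaluation == 'subjective':
            score_factor = get_similiarity(synoptic, submitted)

        else:
            raise ValueError(f"Unknown evaluation type: {evaluation!r}")

        # Scale the score factor by the maximum marks
        return marks * score_factor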

* Evaluates the submitted answer using the specified method
* Returns the score out of the maximum marks

<em><b>Keyword matching:</b></em>

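    import spacy
    from spacy.matcher import Matcher

    # clean_text is a helper defined elsewhere (not shown here)
    def keyphrase_match_score(list_of_key_phrases: list[str], answer: str):
        nlp = spacy.load('en_core_web_sm')
        # Clean and parse the answer once; it is the same for every phrase
        sentence = nlp(clean_text(answer))
        score = 0

        for phrase in list_of_key_phrases:
            phrase = clean_text(phrase)

            # One token pattern per phrase, matched on the lowercase form of each token
            m_tool = Matcher(nlp.vocab)
            vocab = [{'LOWER': word.lower()} for word in phrase.split()]
            m_tool.add('matcher', [vocab])

            # Count the phrase once if it occurs anywhere in the answer
            if len(m_tool(sentence)) > 0:
                score += 1

        # Fraction of key phrases found in the answer
        return score / len(list_of_key_phrases)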

* Takes the synoptic key phrases and cleans each one with `clean_text`, which uses regular expressions
* Creates a spaCy `Matcher` object with a lowercase token pattern for each key phrase
* Matches each phrase against the submitted answer and counts how many of the required phrases are found
* Returns the ratio of matched phrases to the total number of required phrases
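The matching steps above can be approximated without spaCy. The following is a minimal sketch, not the project's implementation: it assumes that cleaning amounts to lowercasing and keeping alphanumeric tokens, and the name `simple_keyphrase_score` is hypothetical.

```python
import re

def simple_keyphrase_score(key_phrases: list[str], answer: str) -> float:
    """Fraction of key phrases that occur in the answer as whole-word sequences."""
    # Assumed cleaning step: lowercase and keep only alphanumeric word tokens.
    tokens = re.findall(r"[a-z0-9]+", answer.lower())
    matched = 0
    for phrase in key_phrases:
        words = re.findall(r"[a-z0-9]+", phrase.lower())
        n = len(words)
        # Slide a window of len(words) tokens over the answer and compare exactly.
        if n and any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
            matched += 1
    return matched / len(key_phrases) if key_phrases else 0.0

phrases = ["Fossil fuels", "Greenhouse gases", "Mitigation"]
answer = "Burning fossil fuels releases greenhouse gases."
score = simple_keyphrase_score(phrases, answer)
print(score)  # two of the three phrases are present
```

As with the `LOWER` token patterns, matching is exact per token, so an inflected form such as "caused" does not count as a match for the key phrase "Causes".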

<em><b>Similarity matching:</b></em>

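    from sentence_transformers import SentenceTransformer
    from sklearn.metrics.pairwise import cosine_similarity

    def get_similiarity(synoptic, answer):
        sentences = [synoptic, answer]
        model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
        # One embedding per input text
        embeddings = model.encode(sentences)
        # cosine_similarity expects 2D arrays, hence the reshape to (1, n_features)
        similarity = cosine_similarity(embeddings[0].reshape(1, -1), embeddings[1].reshape(1, -1))

        # Unwrap the single value from the 1x1 result matrix
        return similarity[0][0]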

* Takes the synoptic model answer and the submitted answer
* Creates a SentenceTransformer object that uses 'paraphrase-multilingual-MiniLM-L12-v2'
* Encodes both texts into embedding vectors with the model
* Returns the cosine similarity between the model answer's and the submitted answer's embeddings as a metric of similarity
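To make the last step concrete, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below uses toy 2-dimensional vectors in place of real embeddings; the helper name `cosine` is illustrative and not part of the project.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

same = cosine([1.0, 0.0], [2.0, 0.0])        # same direction
orthogonal = cosine([1.0, 0.0], [0.0, 3.0])  # unrelated directions
partial = cosine([1.0, 1.0], [1.0, 0.0])     # 45 degrees apart
print(same, orthogonal, round(partial, 4))
```

Cosine similarity ranges from -1 to 1, so the score factor it produces is not guaranteed to lie between 0 and 1.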

SentenceTransformer( \
&nbsp;(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel \
&nbsp;(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)
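The printed architecture shows a `Pooling` layer with `pooling_mode_mean_tokens` set to `True`, meaning the sentence embedding is the mean of the token embeddings. Below is a minimal pure-Python sketch of that idea, assuming padding tokens are excluded through an attention mask. The function name `mean_pool` and the toy 2-dimensional vectors are illustrative; the model itself produces 384-dimensional vectors.

```python
def mean_pool(token_embeddings: list[list[float]], attention_mask: list[int]) -> list[float]:
    """Average the token vectors whose mask entry is 1 (real tokens, not padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j, value in enumerate(vector):
                total[j] += value
    # Guard against division by zero when every position is masked out.
    return [t / max(count, 1) for t in total]

tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]  # last row stands in for padding
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```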


## Sample question and answer

#### What are the main causes of climate change, and what can be done to mitigate its effects?

**Synoptic:** Climate change is primarily caused by human activities such as burning fossil fuels, deforestation, and industrial processes, which release greenhouse gases like carbon dioxide and methane into the atmosphere. These gases trap heat, leading to global warming and subsequent climate disruptions. To mitigate its effects, a multifaceted approach is necessary. This includes transitioning to renewable energy sources, implementing policies to reduce emissions, promoting afforestation and reforestation, adopting sustainable agricultural practices, enhancing energy efficiency, and encouraging international cooperation to set and achieve ambitious climate targets.

**Given answer:** Climate change is caused by various factors, including human activities such as burning fossil fuels and deforestation, which release greenhouse gases into the atmosphere. These gases trap heat, resulting in global warming and climate disruptions. To address climate change, efforts are needed to reduce emissions through transitioning to renewable energy sources, implementing policies to limit carbon emissions, and promoting sustainable practices in agriculture and industry.

**List of sample keywords**
* Climate change
* Causes
* Human activities
* Fossil fuels
* Greenhouse gases
* Mitigation
* Renewable energy
* Policies
* Afforestation
* Reforestation
* Sustainable practices
* International cooperation

In [12]:
synoptic = "Climate change is primarily caused by human activities such as burning fossil fuels, deforestation, and industrial processes, which release greenhouse gases like carbon dioxide and methane into the atmosphere. These gases trap heat, leading to global warming and subsequent climate disruptions. To mitigate its effects, a multifaceted approach is necessary. This includes transitioning to renewable energy sources, implementing policies to reduce emissions, promoting afforestation and reforestation, adopting sustainable agricultural practices, enhancing energy efficiency, and encouraging international cooperation to set and achieve ambitious climate targets."
imperfect = "Climate change is caused by various factors, including human activities such as burning fossil fuels and deforestation, which release greenhouse gases into the atmosphere. These gases trap heat, resulting in global warming and climate disruptions. To address climate change, efforts are needed to reduce emissions through transitioning to renewable energy sources, implementing policies to limit carbon emissions, and promoting sustainable practices in agriculture and industry."
incorrect = "Climate change is mainly caused by natural fluctuations in the Earth's temperature and solar radiation. Human activities play a minor role, but they are not significant contributors to the overall climate change phenomenon. Therefore, there's little need for mitigation efforts beyond what nature can naturally adjust."
keyword_answer = "Climate change is predominantly driven by human activities, such as the excessive use of fossil fuels and deforestation, resulting in the emission of greenhouse gases like carbon dioxide and methane. These gases trap heat in the atmosphere, leading to global warming and climate disruptions. To address this issue, it's imperative to implement comprehensive mitigation strategies. These include transitioning to renewable energy sources, enacting stringent policies to limit emissions, promoting afforestation and reforestation efforts, adopting sustainable practices across various sectors, and fostering international cooperation to tackle this global challenge effectively."
keywords = [
    "Climate change",
    "Causes",
    "Human activities",
    "Fossil fuels",
    "Greenhouse gases",
    "Mitigation",
    "Renewable energy",
    "Policies",
    "Afforestation",
    "Reforestation",
    "Sustainable practices",
    "International cooperation"
]
# keywords_string = "Climate change, Causes, Human activities, Fossil fuels, Greenhouse gases, Mitigation, Renewable energy, Policies, Afforestation, Reforestation, Sustainable practices, International cooperation"

marks = 10

### For imperfect subjective answer

Keyword based marking

In [13]:
evaluate("keyword", keywords, imperfect, marks)

'Score is: 5.8/10'

Semantic similarity based marking

In [14]:
evaluate("subjective", synoptic, imperfect, marks)

'Score is: 9.7/10'

### For keyword based answer

Keyword based marking

In [15]:
evaluate("keyword", keywords, keyword_answer, marks)

'Score is: 9.2/10'

Semantic similarity based marking

In [16]:
evaluate("subjective", synoptic, keyword_answer, marks)

'Score is: 9.5/10'

### For incorrect answer

Keyword based marking

In [17]:
evaluate("keyword", keywords, incorrect, marks)

'Score is: 2.5/10'

Semantic similarity based marking

In [18]:
evaluate("subjective", synoptic, incorrect, marks)

'Score is: 8.0/10'
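The keyword-based scores reported above are consistent with 7, 11 and 3 of the 12 key phrases being matched for the imperfect, keyword-based and incorrect answers respectively. These match counts are inferred from the reported scores, assuming the displayed value is rounded to one decimal place; the notebook does not print them. A quick arithmetic check (the helper `keyword_score` is hypothetical):

```python
marks = 10
total_phrases = 12

def keyword_score(matched: int) -> float:
    """Marks awarded when `matched` of the key phrases are found, to one decimal."""
    return round(marks * matched / total_phrases, 1)

scores = [keyword_score(7), keyword_score(11), keyword_score(3)]
print(scores)
```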