# Text Preprocessing on Student Answers
This notebook applies several text preprocessing techniques (sentence and word tokenization, stop-word removal, lemmatization, and stemming) to a dataset of open-ended student answers.

In [1]:
import pandas as pd
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer, PorterStemmer
import string

nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\melek\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\melek\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\melek\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

## 1. Reading the Dataset

In [2]:
df = pd.read_csv("C:/Users/melek/Downloads/large_student_answer_dataset.csv")
df.head()


| Unnamed: 0 | Question | Key_Answer | Student_Answer | Label |
|---|---|---|---|---|
| 0 | Explain how gravity affects objects on Earth. | Gravity pulls objects toward the center of the... | Gravity keeps us on the ground and makes thing... | correct |
| 1 | Describe the function of the heart in the huma... | The heart pumps blood throughout the body, sup... | It moves blood around so the body gets what it... | correct |
| 2 | Why is recycling important? | Recycling reduces waste, conserves resources, ... | It helps the Earth and stops trash from piling... | correct |
| 3 | Explain how gravity affects objects on Earth. | Gravity pulls objects toward the center of the... | Gravity keeps us on the ground and makes thing... | correct |
| 4 | Why is recycling important? | Recycling reduces waste, conserves resources, ... | It helps the Earth and stops trash from piling... | correct |


## 2. Text Preprocessing Functions

In [3]:
stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()
stemmer = PorterStemmer()

def preprocess_text(text):
    """Return (lemmas, stems): one list of cleaned tokens per sentence."""
    sentences = sent_tokenize(str(text))
    processed_lemma = []
    processed_stem = []

    for sent in sentences:
        words = word_tokenize(sent)
        # Keep purely alphabetic tokens, lowercase them, and drop English stop words
        words = [w.lower() for w in words if w.isalpha() and w.lower() not in stop_words]

        # Normalize the same tokens two ways so the results can be compared
        lemmatized = [lemmatizer.lemmatize(w) for w in words]
        stemmed = [stemmer.stem(w) for w in words]

        processed_lemma.append(lemmatized)
        processed_stem.append(stemmed)

    return processed_lemma, processed_stem
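
`preprocess_text` returns one token list per sentence, so each answer comes back as a list of lists. If a later step needs a single token sequence per answer, the nested lists could be flattened along these lines. This is only a sketch: the helper name `flatten_sentences` is hypothetical and not part of the notebook, and the input is copied from the "Text 1" lemma output in Section 3.

```python
from itertools import chain

def flatten_sentences(token_lists):
    """Merge per-sentence token lists into one flat list for the whole answer."""
    return list(chain.from_iterable(token_lists))

# Lemma output copied from "Text 1" in Section 3 of this notebook
lemmas = [['move', 'blood', 'around', 'body', 'get', 'need'], ['elaboration']]

flat = flatten_sentences(lemmas)   # one flat token list
joined = " ".join(flat)            # one space-separated string

print(flat)
print(joined)
```

Keeping the per-sentence structure in `preprocess_text` and flattening afterwards leaves both views available.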

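
The sampled answers printed in Section 3 end with a suffix such as `- elaboration 64239`. Because a period precedes it, `sent_tokenize` treats the suffix as a second sentence, and `isalpha()` drops the number, which is why each output carries a trailing `['elaboration']` list. If that suffix is a dataset artifact rather than part of the student's answer (an assumption; the notebook does not say), it could be stripped before tokenizing. The helper name `strip_elaboration_suffix` and the regular expression below are hypothetical and cover only the suffix shape seen in those samples.

```python
import re

# Assumed pattern: " - elaboration <digits>" at the very end of the answer.
SUFFIX_RE = re.compile(r"\s*-\s*elaboration\s+\d+\s*$", flags=re.IGNORECASE)

def strip_elaboration_suffix(text):
    """Remove a trailing '- elaboration <number>' marker, if present."""
    return SUFFIX_RE.sub("", str(text))

# Sample answer copied from "Text 1" in Section 3
sample = "It moves blood around so the body gets what it needs. - elaboration 64239"
cleaned = strip_elaboration_suffix(sample)
print(cleaned)
```

Answers without the suffix pass through unchanged, so the helper could be applied to the whole `Student_Answer` column before calling `preprocess_text`.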

## 3. Applying to Sample Answers and Comparing

In [4]:
sample_texts = df["Student_Answer"].dropna().sample(5, random_state=42).tolist()

for i, text in enumerate(sample_texts):
    lemmas, stems = preprocess_text(text)  # uses preprocess_text defined in Section 2
    print(f"\nText {i+1}:\n{text}\nLemmas: {lemmas}\nStems: {stems}")




Text 1:
It moves blood around so the body gets what it needs. - elaboration 64239
Lemmas: [['move', 'blood', 'around', 'body', 'get', 'need'], ['elaboration']]
Stems: [['move', 'blood', 'around', 'bodi', 'get', 'need'], ['elabor']]

Text 2:
It moves blood around so the body gets what it needs. - elaboration 35778
Lemmas: [['move', 'blood', 'around', 'body', 'get', 'need'], ['elaboration']]
Stems: [['move', 'blood', 'around', 'bodi', 'get', 'need'], ['elabor']]

Text 3:
It helps the Earth and stops trash from piling up. - elaboration 71241
Lemmas: [['help', 'earth', 'stop', 'trash', 'piling'], ['elaboration']]
Stems: [['help', 'earth', 'stop', 'trash', 'pile'], ['elabor']]

Text 4:
Gravity keeps us on the ground and makes things fall. - elaboration 74143
Lemmas: [['gravity', 'keep', 'u', 'ground', 'make', 'thing', 'fall'], ['elaboration']]
Stems: [['graviti', 'keep', 'us', 'ground', 'make', 'thing', 'fall'], ['elabor']]

Text 5:
It helps the Earth and stops trash from piling up. - elab