In this notebook, I experiment with hand-crafted features rather than fitting bag-of-words, TF-IDF, or sentence embeddings directly on the training data. The features come from a few libraries, additional datasets, and common sense. Here's a look at what I used:
- Textstat features to measure readability, complexity, and grade level.
- NER, POS, and TAG count features using spaCy.
- Sentiment analysis and other length/ratio features using NLTK and basic functions.
- Features derived from feedback data to assess cohesion, syntax, vocabulary, phraseology, grammar, and conventions.
    - A basic Ridge regression model, fit on sentence embeddings of the feedback essays, predicts these six scores for every essay, and the predictions are used as features.

All features then go into a simple GBDT model (LightGBM) with 15-fold stratified cross-validation to generate the final predictions.

I'm sharing this in the hope that it sparks some new ideas or helps you improve your current pipelines. I'm getting decent CV results, but I haven't reached the level I'm aiming for just yet.

In [1]:
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/huggingface_hub-0.23.0-py3-none-any.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/sentence_transformers-2.8.0.dev0-py3-none-any.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/tokenizers-0.19.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/transformers-4.40.2-py3-none-any.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/textstat-0.7.3-py3-none-any.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/pyphen-0.15.0-py3-none-any.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/einops-0.8.0-py3-none-any.whl
!pip install --no-index --no-deps /kaggle/input/aes-whls/aes_whls/pyspellchecker-0.8.1-py3-none-any.whl

In [2]:
import pandas as pd
import polars as pl
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

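# `models` is intentionally not imported from `tokenizers`: the same name is
# imported from `sentence_transformers` below, which would shadow it anyway.
from tokenizers import (
    decoders,
    pre_tokenizers,
    normalizers,
    processors,
    trainers,
    Tokenizer
)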

from datasets import Dataset
from tqdm.auto import tqdm
from transformers import PreTrainedTokenizerFast

import gc

import spacy
from collections import Counter

import nltk 

from nltk.sentiment.vader import SentimentIntensityAnalyzer
import textstat
from spellchecker import SpellChecker

from sentence_transformers import SentenceTransformer, models
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor

import lightgbm as lgb
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import cohen_kappa_score
import torch

tqdm.pandas()

nlp = spacy.load("en_core_web_sm")



In [3]:
train = pd.read_csv('/kaggle/input/learning-agency-lab-automated-essay-scoring-2/train.csv')
test = pd.read_csv('/kaggle/input/learning-agency-lab-automated-essay-scoring-2/test.csv')
sample_submission = pd.read_csv('/kaggle/input/learning-agency-lab-automated-essay-scoring-2/sample_submission.csv')

## Textstat Features

In [4]:
def textstat_features(text):
    features = {}
    features['flesch_reading_ease'] = textstat.flesch_reading_ease(text)
    features['flesch_kincaid_grade'] = textstat.flesch_kincaid_grade(text)
    features['smog_index'] = textstat.smog_index(text)
    features['coleman_liau_index'] = textstat.coleman_liau_index(text)
    features['automated_readability_index'] = textstat.automated_readability_index(text)
    features['dale_chall_readability_score'] = textstat.dale_chall_readability_score(text)
    features['difficult_words'] = textstat.difficult_words(text)
    features['linsear_write_formula'] = textstat.linsear_write_formula(text)
    features['gunning_fog'] = textstat.gunning_fog(text)
    features['text_standard'] = textstat.text_standard(text, float_output=True)
    features['spache_readability'] = textstat.spache_readability(text)
    features['mcalpine_eflaw'] = textstat.mcalpine_eflaw(text)
    features['reading_time'] = textstat.reading_time(text)
    features['syllable_count'] = textstat.syllable_count(text)
    features['lexicon_count'] = textstat.lexicon_count(text)
    features['monosyllabcount'] = textstat.monosyllabcount(text)

    return features

train['textstat_features'] = train['full_text'].apply(textstat_features)
train_textstat = pd.DataFrame(train['textstat_features'].tolist())

test['textstat_features'] = test['full_text'].apply(textstat_features)
test_textstat = pd.DataFrame(test['textstat_features'].tolist())

train_textstat.head()

| | flesch_reading_ease | flesch_kincaid_grade | smog_index | coleman_liau_index | automated_readability_index | dale_chall_readability_score | difficult_words | linsear_write_formula | gunning_fog | text_standard | spache_readability | mcalpine_eflaw | reading_time | syllable_count | lexicon_count | monosyllabcount |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 57.98 | 14.7 | 11.7 | 8.19 | 18.3 | 8.74 | 60 | 13.0 | 17.33 | 9.0 | 7.28 | 54.5 | 31.97 | 634 | 498 | 404 |
| 1 | 87.55 | 5.4 | 6.8 | 4.99 | 6.2 | 6.31 | 24 | 6.714286 | 7.48 | 7.0 | 3.92 | 25.7 | 19.6 | 398 | 332 | 275 |
| 2 | 65.15 | 9.9 | 11.5 | 8.94 | 11.6 | 7.24 | 67 | 15.5 | 11.49 | 12.0 | 5.12 | 32.6 | 36.96 | 767 | 550 | 417 |
| 3 | 58.32 | 10.4 | 13.2 | 10.97 | 12.9 | 8.5 | 78 | 15.75 | 11.91 | 11.0 | 5.34 | 29.6 | 33.01 | 685 | 448 | 291 |
| 4 | 54.66 | 11.8 | 13.0 | 10.57 | 13.9 | 7.79 | 55 | 19.666667 | 12.64 | 13.0 | 5.61 | 35.7 | 26.71 | 562 | 373 | 241 |
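
For intuition about what the first two columns measure, the classic Flesch formulas can be computed by hand from three counts. This is only a sketch: textstat's own sentence and syllable counting differs in detail, so the values will not reproduce the table, and the function names and toy counts below are illustrative rather than taken from the notebook.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Classic Flesch Reading Ease: higher values mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level: an approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Toy counts, made up for illustration (not from the dataset).
words, sentences, syllables = 100, 5, 150
ease = flesch_reading_ease(words, sentences, syllables)
grade = flesch_kincaid_grade(words, sentences, syllables)
print(round(ease, 3), round(grade, 2))
```

Longer sentences and more syllables per word both push the reading-ease score down and the grade level up, which is why the two columns move in opposite directions in the table.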


## Linguistic Features

In [5]:
def extract_linguistic_features(text):

    doc = nlp(text)
    features = {}

    # NER Features
    entity_counts = {"GPE": 0, "PERCENT": 0, "NORP": 0, "ORG": 0, "CARDINAL": 0, "MONEY": 0, "DATE": 0, 
                    "LOC": 0, "PERSON": 0, "QUANTITY": 0, "EVENT": 0, "ORDINAL": 0, "WORK_OF_ART": 0, 
                    "LAW": 0, "PRODUCT": 0, "TIME": 0, "FAC": 0, "LANGUAGE": 0}
    for entity in doc.ents:
        if entity.label_ in entity_counts:
            entity_counts[entity.label_] += 1
    features['NER_Features'] = entity_counts

    # POS Features
    pos_counts = {"ADJ": 0, "NOUN": 0, "VERB": 0, "SCONJ": 0, "PRON": 0, "PUNCT": 0, "DET": 0, "AUX": 0, 
                "PART": 0, "ADP": 0, "SPACE": 0, "CCONJ": 0, "PROPN": 0, "NUM": 0, "ADV": 0, 
                "SYM": 0, "INTJ": 0, "X": 0}
    for token in doc:
        if token.pos_ in pos_counts:
            pos_counts[token.pos_] += 1
    features['POS_Features'] = pos_counts

    # tag Features
    tags = {"RB": 0, "-RRB-": 0, "PRP$": 0, "JJ": 0, "TO": 0, "VBP": 0, "JJS": 0, "DT": 0, "''": 0, "UH": 0, "RBS": 0, "WRB": 0, ".": 0, 
        "HYPH": 0, "XX": 0, "``": 0, "SYM": 0, "VB": 0, "VBN": 0, "WP": 0, "CC": 0, "LS": 0, "POS": 0, "NN": 0, ",": 0, "NNPS": 0,
          "RP": 0, ":": 0, "$": 0, "PDT": 0, "VBZ": 0, "VBD": 0, "JJR": 0, "-LRB-": 0, "IN": 0, "RBR": 0, "WDT": 0, "EX": 0, "MD": 0,
            "_SP": 0, "NNP": 0, "CD": 0, "VBG": 0, "NNS": 0, "PRP": 0}
    
    for token in doc:
        if token.tag_ in tags:
            tags[token.tag_] += 1
    features['tag_Features'] = tags

    # tense features
    tenses = [i.morph.get("Tense") for i in doc]
    tenses = [i[0] for i in tenses if i]
    tense_counts = Counter(tenses)
    features['past_tense_ratio'] = tense_counts.get("Past", 0) / (tense_counts.get("Pres", 0) + tense_counts.get("Past", 0) + 1e-5)
    features['present_tense_ratio'] = tense_counts.get("Pres", 0) / (tense_counts.get("Pres", 0) + tense_counts.get("Past", 0) + 1e-5)
    
    
    # len features

    features['word_count'] = len(doc)
    features['sentence_count'] = len([sentence for sentence in doc.sents])
    features['words_per_sentence'] = features['word_count'] / features['sentence_count']
    features['std_words_per_sentence'] = np.std([len(sentence) for sentence in doc.sents])

    features['unique_words'] = len(set([token.text for token in doc]))
    features['lexical_diversity'] = features['unique_words'] / features['word_count']

    paragraphs = text.split('\n\n')

    features['paragraph_count'] = len(paragraphs)

    features['avg_chars_by_paragraph'] = np.mean([len(p) for p in paragraphs])
    features['avg_words_by_paragraph'] = np.mean([len(nltk.word_tokenize(p)) for p in paragraphs])
    features['avg_sentences_by_paragraph'] = np.mean([len(nltk.sent_tokenize(p)) for p in paragraphs])

    # sentiment features
    analyzer = SentimentIntensityAnalyzer()
    sentences = nltk.sent_tokenize(text)

    compound_scores, negative_scores, positive_scores, neutral_scores = [], [], [], []
    for sentence in sentences:
        scores = analyzer.polarity_scores(sentence)
        compound_scores.append(scores['compound'])
        negative_scores.append(scores['neg'])
        positive_scores.append(scores['pos'])
        neutral_scores.append(scores['neu'])

    features["mean_compound"] = np.mean(compound_scores)
    features["mean_negative"] = np.mean(negative_scores)
    features["mean_positive"] = np.mean(positive_scores)
    features["mean_neutral"] = np.mean(neutral_scores)

    features["std_compound"] = np.std(compound_scores)
    features["std_negative"] = np.std(negative_scores)
    features["std_positive"] = np.std(positive_scores)
    features["std_neutral"] = np.std(neutral_scores)

    return features

train['linguistic_features'] = train['full_text'].progress_apply(extract_linguistic_features)

train_linguistic = pd.json_normalize(train['linguistic_features'])



test['linguistic_features'] = test['full_text'].progress_apply(extract_linguistic_features)

test_linguistic = pd.json_normalize(test['linguistic_features'])

train_linguistic.head()

| | past_tense_ratio | present_tense_ratio | word_count | sentence_count | words_per_sentence | std_words_per_sentence | unique_words | lexical_diversity | paragraph_count | avg_chars_by_paragraph | ... | tag_Features.RBR | tag_Features.WDT | tag_Features.EX | tag_Features.MD | tag_Features._SP | tag_Features.NNP | tag_Features.CD | tag_Features.VBG | tag_Features.NNS | tag_Features.PRP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.275362 | 0.724638 | 552 | 13 | 42.461538 | 34.078225 | 248 | 0.449275 | 1 | 2677.0 | ... | 1 | 4 | 2 | 10 | 6 | 26 | 12 | 6 | 35 | 29 |
| 1 | 0.160714 | 0.839286 | 377 | 20 | 18.85 | 11.127781 | 169 | 0.448276 | 5 | 332.2 | ... | 0 | 6 | 5 | 10 | 4 | 10 | 2 | 10 | 15 | 28 |
| 2 | 0.15873 | 0.84127 | 611 | 25 | 24.44 | 8.168623 | 246 | 0.402619 | 4 | 767.75 | ... | 2 | 9 | 4 | 19 | 4 | 0 | 2 | 11 | 39 | 20 |
| 3 | 0.090909 | 0.909091 | 516 | 21 | 24.571429 | 10.135141 | 242 | 0.468992 | 5 | 538.6 | ... | 1 | 2 | 2 | 11 | 4 | 20 | 6 | 14 | 27 | 19 |
| 4 | 0.183673 | 0.816326 | 428 | 16 | 26.75 | 18.122845 | 159 | 0.371495 | 6 | 366.333333 | ... | 3 | 3 | 0 | 3 | 7 | 31 | 4 | 10 | 13 | 10 |


In [6]:
# pd.json_normalize names the nested count columns 'NER_Features.*', 'POS_Features.*'
# and 'tag_Features.*'. startswith is case-sensitive, so only `tag_cols` is non-empty
# here; `col_cols` and `pos_cols` select nothing and add no ratio columns.
tag_cols = [col for col in train_linguistic.columns if col.startswith('tag')]
col_cols = [col for col in train_linguistic.columns if col.startswith('col')]
pos_cols = [col for col in train_linguistic.columns if col.startswith('pos')]

# Apply every ratio to both train and test so the two frames keep identical columns.
for col in tag_cols + col_cols + pos_cols:
    train_linguistic[f"{col}_ratio"] = train_linguistic[col] / train_linguistic['word_count']
    test_linguistic[f"{col}_ratio"] = test_linguistic[col] / test_linguistic['word_count']

train_linguistic.head()

| | past_tense_ratio | present_tense_ratio | word_count | sentence_count | words_per_sentence | std_words_per_sentence | unique_words | lexical_diversity | paragraph_count | avg_chars_by_paragraph | ... | tag_Features.RBR_ratio | tag_Features.WDT_ratio | tag_Features.EX_ratio | tag_Features.MD_ratio | tag_Features._SP_ratio | tag_Features.NNP_ratio | tag_Features.CD_ratio | tag_Features.VBG_ratio | tag_Features.NNS_ratio | tag_Features.PRP_ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.275362 | 0.724638 | 552 | 13 | 42.461538 | 34.078225 | 248 | 0.449275 | 1 | 2677.0 | ... | 0.001812 | 0.007246 | 0.003623 | 0.018116 | 0.01087 | 0.047101 | 0.021739 | 0.01087 | 0.063406 | 0.052536 |
| 1 | 0.160714 | 0.839286 | 377 | 20 | 18.85 | 11.127781 | 169 | 0.448276 | 5 | 332.2 | ... | 0.0 | 0.015915 | 0.013263 | 0.026525 | 0.01061 | 0.026525 | 0.005305 | 0.026525 | 0.039788 | 0.074271 |
| 2 | 0.15873 | 0.84127 | 611 | 25 | 24.44 | 8.168623 | 246 | 0.402619 | 4 | 767.75 | ... | 0.003273 | 0.01473 | 0.006547 | 0.031097 | 0.006547 | 0.0 | 0.003273 | 0.018003 | 0.06383 | 0.032733 |
| 3 | 0.090909 | 0.909091 | 516 | 21 | 24.571429 | 10.135141 | 242 | 0.468992 | 5 | 538.6 | ... | 0.001938 | 0.003876 | 0.003876 | 0.021318 | 0.007752 | 0.03876 | 0.011628 | 0.027132 | 0.052326 | 0.036822 |
| 4 | 0.183673 | 0.816326 | 428 | 16 | 26.75 | 18.122845 | 159 | 0.371495 | 6 | 366.333333 | ... | 0.007009 | 0.007009 | 0.0 | 0.007009 | 0.016355 | 0.07243 | 0.009346 | 0.023364 | 0.030374 | 0.023364 |
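
Only the `tag_Features.*` columns received ratio versions above. The reason is the column naming: `pd.json_normalize` produces `NER_Features.*` and `POS_Features.*`, which a case-sensitive `startswith('pos')` never matches. Below is a minimal, dependency-free sketch of case-insensitive selection; `select_count_columns` is a hypothetical helper, not something the notebook defines.

```python
def select_count_columns(columns, prefixes):
    """Return the columns whose name starts with any prefix, ignoring case."""
    lowered = tuple(p.lower() for p in prefixes)
    return [c for c in columns if c.lower().startswith(lowered)]


# Column names in the style produced by pd.json_normalize on the nested dicts.
columns = ["word_count", "NER_Features.GPE", "POS_Features.NOUN", "tag_Features.NN"]

case_sensitive = [c for c in columns if c.startswith("pos")]
case_insensitive = select_count_columns(columns, ["pos", "ner", "tag"])

print(case_sensitive)    # []
print(case_insensitive)  # ['NER_Features.GPE', 'POS_Features.NOUN', 'tag_Features.NN']
```

Using such a helper in the notebook would add NER and POS ratio columns to both frames and therefore change the feature count reported further down, so treat it as an optional experiment rather than a drop-in fix.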


In [7]:
merged_df = pd.concat([train_textstat, train_linguistic], axis=1)

merged_df_test = pd.concat([test_textstat, test_linguistic], axis=1)

## Error Counts

In [8]:
spell = SpellChecker()

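def spell_check(text):
    words = nltk.word_tokenize(text)
    # `unknown` returns a set, so this counts distinct unrecognized tokens.
    misspelled = spell.unknown(words)

    misspelled_count = len(misspelled)
    # Guard against empty essays to avoid a division by zero.
    misspelled_ratio = misspelled_count / max(len(words), 1)

    return misspelled_count, misspelled_ratio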

train['spell_check_features'] = train['full_text'].progress_apply(spell_check)

spell_check_df = pd.DataFrame(train['spell_check_features'].tolist(), columns=['misspelled_count', 'misspelled_ratio'])

test['spell_check_features'] = test['full_text'].progress_apply(spell_check)

test_check_df = pd.DataFrame(test['spell_check_features'].tolist(), columns=['misspelled_count', 'misspelled_ratio'])

spell_check_df.head()

| | misspelled_count | misspelled_ratio |
|---|---|---|
| 0 | 30 | 0.055046 |
| 1 | 13 | 0.03504 |
| 2 | 12 | 0.019835 |
| 3 | 16 | 0.031311 |
| 4 | 15 | 0.035885 |
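
One subtlety in these two columns: `SpellChecker.unknown` returns a set, so `misspelled_count` is the number of distinct unknown tokens, while the denominator is the total token count. The sketch below contrasts distinct and total counts on a toy sentence. It uses a hypothetical five-word vocabulary and naive whitespace tokenization instead of pyspellchecker and NLTK, so it illustrates the counting only, not the notebook's exact numbers.

```python
def misspelling_stats(text, vocabulary):
    """Count distinct and total out-of-vocabulary words in `text`."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    words = [w for w in words if w]
    unknown = [w for w in words if w not in vocabulary]
    distinct = len(set(unknown))
    total = len(unknown)
    n = max(len(words), 1)  # guard against empty input
    return distinct, total, distinct / n, total / n


vocab = {"the", "cat", "sat", "on", "mat"}  # hypothetical toy vocabulary
distinct, total, distinct_ratio, total_ratio = misspelling_stats(
    "The catt sat on the matt. The catt!", vocab
)
print(distinct, total, distinct_ratio, total_ratio)
```

An essay that repeats the same misspelling many times looks cleaner under the distinct count than under the total count, so both variants may be worth trying as features.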


In [9]:
merged_df = pd.concat((merged_df, spell_check_df), axis=1)

merged_df_test = pd.concat((merged_df_test, test_check_df), axis=1)

## Feedback Features

In [10]:
feedback_df = pd.read_csv('/kaggle/input/feedback-data/feedback_data.csv')

feed_embeds = []

merged_embeds = []

test_embeds = []

for i in range(5):
    model_path = f'/kaggle/input/sent-debsmall/deberta_small_trained/temp_fold{i}_checkpoints'
    word_embedding_model = models.Transformer(model_path, max_seq_length=1024)
    pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
    model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

    model.half()
    model = model.to('cuda')
    
    feed_custom_embeddings_train = model.encode(feedback_df.loc[:, 'full_text'].values, device='cuda',
                                                show_progress_bar=True, normalize_embeddings=True)
    
    feed_embeds.append(feed_custom_embeddings_train)
    
    merged_custom_embeddings = model.encode(train.loc[:, 'full_text'].values, device='cuda',
                                            show_progress_bar=True, normalize_embeddings=True)

    merged_embeds.append(merged_custom_embeddings)
    
    
    test_custom_embeddings = model.encode(test.loc[:, 'full_text'].values, device='cuda',
                                            show_progress_bar=True, normalize_embeddings=True)
    
    test_embeds.append(test_custom_embeddings)
    
feed_embeds = np.mean(feed_embeds, axis=0)
merged_embeds = np.mean(merged_embeds, axis=0)
test_embeds = np.mean(test_embeds, axis=0)
    
    
    

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Batches:   0%|          | 0/123 [00:00<?, ?it/s]

Batches:   0%|          | 0/541 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

In [11]:
targets = ['cohesion', 'syntax', 'vocabulary', 'phraseology', 'grammar', 'conventions']


ridge = Ridge(alpha=1.0)

multioutputregressor = MultiOutputRegressor(ridge)



multioutputregressor.fit(feed_embeds, feedback_df.loc[:, targets])

In [12]:
feedback_predictions = multioutputregressor.predict(merged_embeds)

feedback_predictions_df = pd.DataFrame(feedback_predictions, columns=targets)

test_feedback_predictions = multioutputregressor.predict(test_embeds)

test_feedback_predictions_df = pd.DataFrame(test_feedback_predictions, columns=targets)

feedback_predictions_df.head()

| | cohesion | syntax | vocabulary | phraseology | grammar | conventions |
|---|---|---|---|---|---|---|
| 0 | 2.768141 | 2.728853 | 3.093328 | 2.903788 | 2.816993 | 2.510288 |
| 1 | 3.330801 | 3.268913 | 3.471339 | 3.344035 | 3.400758 | 3.189666 |
| 2 | 4.136734 | 4.14654 | 4.322047 | 4.29542 | 4.234249 | 4.076426 |
| 3 | 3.830386 | 3.729886 | 4.011004 | 3.816863 | 3.779604 | 3.744659 |
| 4 | 3.383255 | 3.411402 | 3.551323 | 3.412766 | 3.53728 | 3.339459 |
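
A note on the embeddings behind these predictions: each fold checkpoint returns L2-normalized vectors, and the notebook then averages them across the five folds. The mean of unit vectors generally has a norm below 1. The pure-Python sketch below shows the effect on two toy vectors, together with an optional re-normalization step. Re-normalizing is my own assumption about something one might try; the notebook does not do it.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_embedding(per_fold):
    """Element-wise mean over a list of equally sized vectors."""
    n = len(per_fold)
    return [sum(vals) / n for vals in zip(*per_fold)]


# Two toy "fold" embeddings for one essay, each already unit length.
fold_a = l2_normalize([1.0, 0.0])
fold_b = l2_normalize([0.0, 1.0])

averaged = mean_embedding([fold_a, fold_b])
avg_norm = math.sqrt(sum(x * x for x in averaged))
renormalized = l2_normalize(averaged)
renorm_norm = math.sqrt(sum(x * x for x in renormalized))
print(avg_norm, renorm_norm)
```

Since Ridge is fit and applied on embeddings averaged the same way, the shrinkage is at least consistent between the feedback, train, and test sets.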


In [13]:
merged_df = pd.concat((merged_df, feedback_predictions_df), axis=1)

merged_df_test = pd.concat((merged_df_test, test_feedback_predictions_df), axis=1)

In [14]:
merged_df.shape

(17307, 170)

In [15]:
merged_df_test.shape

(3, 170)
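
The custom objective `qwk_obj` in the training cell that follows is easier to read once written out. The derivation below is my reading of that code, under the assumption that it minimizes the ratio $f/g$; the code only contains the gradient, never the loss itself. Here $p_i$ are the clipped predictions and $y_i$ the labels, both shifted back by $a$, and $n$ is the number of samples.

```latex
\begin{aligned}
f &= \tfrac{1}{2}\sum_{i=1}^{n} (p_i - y_i)^2, \qquad
g = \tfrac{1}{2}\sum_{i=1}^{n} \bigl((p_i - a)^2 + b\bigr), \qquad
L = \frac{f}{g}, \\
\frac{\partial L}{\partial p_i}
  &= \frac{1}{g}\,\frac{\partial f}{\partial p_i} - \frac{f}{g^2}\,\frac{\partial g}{\partial p_i}
   = \frac{p_i - y_i}{g} - \frac{f\,(p_i - a)}{g^2}, \\
\mathrm{grad}_i &= n \cdot \frac{\partial L}{\partial p_i}, \qquad
\mathrm{hess}_i = 1 .
\end{aligned}
```

The constant Hessian of ones is an approximation rather than the true second derivative, and the gradient ignores the effect of clipping; both are taken as they appear in the code.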

In [16]:
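# Constants taken from public notebooks. `a` is the offset subtracted from the
# scores before training; `b` is added inside the denominator term of the objective.
a = 2.998
b = 1.092


def quadratic_weighted_kappa(y_true, y_pred):
    # Undo the shift by `a`, then clip predictions to the score range and round.
    y_true = y_true + a
    y_pred = (y_pred + a).clip(1, 6).round()
    qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")
    # LightGBM eval metric format: (name, value, is_higher_better)
    return 'QWK', qwk, True


# metric and objective based on public notebooks

def qwk_obj(y_true, y_pred):
    labels = y_true + a
    preds = y_pred + a
    preds = preds.clip(1, 6)
    # f: squared-error term; g: spread of predictions around `a`, plus `b` per sample
    f = 1/2*np.sum((preds-labels)**2)
    g = 1/2*np.sum((preds-a)**2+b)
    df = preds - labels
    dg = preds - a
    # Quotient rule for f/g, scaled by the number of samples
    grad = (df/g - f*dg/g**2)*len(labels)
    # Constant Hessian approximation
    hess = np.ones(len(labels))
    return grad, hess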



skf = StratifiedKFold(n_splits=15, shuffle=True, random_state=42)

scores = []

train['oof'] = 0.0  # float column so fractional out-of-fold predictions can be stored

test_preds = []

for fold, (train_idx, valid_idx) in enumerate(skf.split(train['full_text'], train['score'])):
    print(f"Fold: {fold}")
    print(f"Train size: {len(train_idx)}")
    print(f"Valid size: {len(valid_idx)}")
    print()


    X_train = merged_df.iloc[train_idx].values
    X_valid = merged_df.iloc[valid_idx].values


    y_train = train['score'].astype('float32').values[train_idx]
    y_valid = train['score'].astype('float32').values[valid_idx]


    y_train = y_train -a
    y_valid = y_valid -a

 

    model = lgb.LGBMRegressor(
                objective = qwk_obj,
                metrics = 'None',
                learning_rate = 0.01,
                n_estimators=10000,
                random_state=42,
                extra_trees=True,
                class_weight='balanced',
                verbosity = - 1)
    
    callbacks = [lgb.early_stopping(500, verbose=True, first_metric_only=True), lgb.log_evaluation(period=500)]

    
    predictor = model.fit(X_train,
                                  y_train,
                                  eval_names=['train', 'valid'],
                                  eval_set=[(X_train, y_train), (X_valid, y_valid)],
                                  eval_metric=quadratic_weighted_kappa,
                                  callbacks=callbacks,)

    valid_preds = predictor.predict(X_valid)

    train.loc[valid_idx, 'oof'] = valid_preds + a

    score = quadratic_weighted_kappa(y_valid, valid_preds)
    scores.append(score[1])
    
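    # Pass a NumPy array, matching the format the model was trained on.
    test_preds.append(predictor.predict(merged_df_test.values) + a)

    # Note: `score` is the QWK on the validation fold, despite the "Train QWK" label.
    print(f"Train QWK: {score}")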

print(f"Mean QWK: {np.mean(scores)}")

Fold: 0
Train size: 16153
Valid size: 1154

[LightGBM] [Info] Using self-defined objective function
Training until validation scores don't improve for 500 rounds
[500]	train's QWK: 0.846852	valid's QWK: 0.824621
[1000]	train's QWK: 0.864896	valid's QWK: 0.832841
[1500]	train's QWK: 0.878967	valid's QWK: 0.837373
[2000]	train's QWK: 0.890063	valid's QWK: 0.839161
[2500]	train's QWK: 0.899628	valid's QWK: 0.839297
Early stopping, best iteration is:
[2309]	train's QWK: 0.896081	valid's QWK: 0.84143
Evaluated only: QWK


Train QWK: ('QWK', 0.8414302089699038, True)
Fold: 1
Train size: 16153
Valid size: 1154

[LightGBM] [Info] Using self-defined objective function
Training until validation scores don't improve for 500 rounds
[500]	train's QWK: 0.84546	valid's QWK: 0.837994
[1000]	train's QWK: 0.864177	valid's QWK: 0.846486
[1500]	train's QWK: 0.877871	valid's QWK: 0.847725
[2000]	train's QWK: 0.889757	valid's QWK: 0.846564
Early stopping, best iteration is:
[1549]	train's QWK: 0.879226	valid's QWK: 0.849067
Evaluated only: QWK
Train QWK: ('QWK', 0.849066717135702, True)
Fold: 2
Train size: 16153
Valid size: 1154

[LightGBM] [Info] Using self-defined objective function
Training until validation scores don't improve for 500 rounds
[500]	train's QWK: 0.846838	valid's QWK: 0.835908
[1000]	train's QWK: 0.865269	valid's QWK: 0.837834
Early stopping, best iteration is:
[740]	train's QWK: 0.856842	valid's QWK: 0.839188
Evaluated only: QWK
Train QWK: ('QWK', 0.8391878158786482, True)
Fold: 3
Train size: 16153
Va
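
The QWK values in the log come from `cohen_kappa_score(..., weights="quadratic")`. For readers who want to see what that number measures, here is a dependency-free sketch over a fixed 1 to 6 rating scale. `qwk_reference` is a hypothetical helper, not part of the notebook. It can also differ from scikit-learn when some ratings never occur, since scikit-learn, as far as I know, builds its weight matrix only over the labels that are present.

```python
def qwk_reference(y_true, y_pred, min_rating=1, max_rating=6):
    """Quadratic weighted kappa for integer ratings in [min_rating, max_rating]."""
    k = max_rating - min_rating + 1
    n = len(y_true)

    # Observed confusion matrix: rows are true ratings, columns are predictions.
    observed = [[0.0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / n
            num += weight * observed[i][j]
            den += weight * expected

    # Undefined when both raters use a single identical rating throughout.
    return 1.0 - num / den if den > 0 else float("nan")


perfect = qwk_reference([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6])
chance = qwk_reference([1, 1, 6, 6], [1, 6, 1, 6])
opposite = qwk_reference([1, 6], [6, 1])
print(perfect, chance, opposite)
```

Because the penalty grows with the squared distance between ratings, predicting a 5 for a true 2 costs far more than predicting a 3, which is why the notebook optimizes a regression-style objective and only rounds at the end.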

In [17]:
final_preds = (np.mean(test_preds, axis=0))

In [18]:
light_gbm_preds = np.round(np.clip(final_preds, 1, 6))

In [19]:
sample_submission['score'] = light_gbm_preds

sample_submission['score']

0    2.0
1    3.0
2    5.0
Name: score, dtype: float64

In [20]:
sample_submission.to_csv('submission.csv', index=False)