Sentiment classification tasks

Clone this repository:

git clone https://github.com/nvanva/filimdb_evaluation.git

Run init.sh to prepare the dataset:

./init.sh

Create classifier.py and implement the following functions:

def pretrain(texts):
    """
    Pretrain the classifier on unlabeled texts. If your classifier cannot train on unlabeled data, skip this.
    :param texts: a list of texts (str objects), one str per example
    :return: learnt parameters, or any object you like (it will be passed to the train function)
    """
   
def train(texts, labels, pretrain_params=None):
    """
    Train the classifier on the given train set, represented as parallel lists of texts and corresponding labels.
    :param texts: a list of texts (str objects), one str per example
    :param labels: a list of labels, one label per example
    :param pretrain_params: parameters returned by the pretrain function, if any
    :return: learnt parameters, or any object you like (it will be passed to the classify function)
    """

def classify(texts, params):
    """
    Classify texts given previously learnt parameters.
    :param texts: texts to classify
    :param params: parameters received from train function
    :return: list of labels corresponding to the given list of texts
    """

Place classifier.py in the same folder as evaluate.py and run evaluate.py. It will score your classifier and write its predictions to the file preds.tsv.

python evaluate.py

If you need to pretrain your model on all sets of texts (train, test, dev, unlabeled, dev-b, test-b), use the --transductive command-line argument:

python evaluate.py --transductive
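
As one possible use of pretraining, the sketch below has pretrain collect a vocabulary from every text it is given and keep only tokens that occur in at least two texts. Going by the docstrings above, whatever pretrain returns reaches train as pretrain_params. The min_df parameter, the featurize helper, and the whitespace tokenization are hypothetical choices made for illustration.

```python
from collections import Counter


def pretrain(texts, min_df=2):
    # min_df is a hypothetical extra parameter with a default value, so the
    # single-argument call pretrain(texts) still works.
    doc_freq = Counter()
    for text in texts:
        # Count each token once per text (document frequency).
        doc_freq.update(set(text.lower().split()))
    return {"vocab": {tok for tok, df in doc_freq.items() if df >= min_df}}


def featurize(text, pretrain_params=None):
    # Hypothetical helper for use inside train and classify: keep only tokens
    # from the pretrained vocabulary, or every token if nothing was pretrained.
    tokens = text.lower().split()
    if pretrain_params is None:
        return tokens
    return [tok for tok in tokens if tok in pretrain_params["vocab"]]
```

Because the vocabulary is built from unlabeled text only, it uses no labels from the evaluation sets. It does, however, adapt the feature set to the words that occur in them.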

Language modeling tasks

Clone this repository:

git clone https://github.com/nvanva/filimdb_evaluation.git

Run init.sh to prepare the dataset:

./init.sh

Edit lm.py and write the following functions:

Run evaluate_lm.py:

python evaluate_lm.py evaluate --ptb-path='PTB'

To sample from the language model:

python evaluate_lm.py sampling --size=20 --start-text='the meaning of life is'
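
For intuition about what sampling involves, the following sketch extends a start text token by token with a toy bigram model. It is not the interface lm.py has to implement, which is not reproduced here: fit_bigrams and sample are hypothetical names, and their size and start_text arguments only mirror the command-line flags by name.

```python
import random
from collections import Counter, defaultdict


def fit_bigrams(token_lists):
    # Hypothetical helper: count which tokens follow each token.
    counts = defaultdict(Counter)
    for tokens in token_lists:
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample(counts, start_text, size, rng):
    # Extend start_text by up to `size` tokens, drawing each next token in
    # proportion to its bigram count; stop early if no continuation is known.
    tokens = start_text.split()
    if not tokens:
        return ""
    for _ in range(size):
        followers = counts.get(tokens[-1])
        if not followers:
            break
        choices, weights = zip(*followers.items())
        tokens.append(rng.choices(choices, weights=weights)[0])
    return " ".join(tokens)
```

Passing a seeded random.Random instance as rng makes the samples reproducible across runs.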

Transliteration task

Clone this repository:

git clone https://github.com/skoltech-nlp/filimdb_evaluation.git

Run init.sh to prepare the dataset:

./init.sh

Check the baseline implementation in translit_baseline.py and evaluate it:

python evaluate_translit.py

Modify translit.py according to the assignment description, then change the import line in evaluate_translit.py from "from translit_baseline import train, classify" to "from translit import train, classify".
Check the results again:

python evaluate_translit.py
