Skip to content

Pieter-Jacobs/bachelor-thesis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Bachelor Thesis: Active Learning and its applications to reduce text classification labeling effort

This repository contains the code used for my bachelor thesis on active learning. Which you can find here:

Installation

Python 3 is required to run this project. Moreover, use the package manager pip to install the following dependencies:

pip install dill hydra-core matplotlib nltk numba pandas sentence_transformers sklearn scipy torch torchtext transformers

Usage

For the experiment that compared active learning and random sampling, run:

    python main.py -m dataset=sst +heuristic=random,random,random
    python main.py -m dataset=sst query_function=variation_ratio,variation_ratio,variation_ratio,predictive_entropy,predictive_entropy,predictive_entropy,predictive_entropy,mutual_information,mutual_information,mutual_information
    python plot_data.py 1
    python compute_deficiencies.py 1

For the experiment that compared different query sizes, run:

    python main.py -m dataset=sst parameters.Q=85,85,85,42,42,42,425,425,425 metric_file=scaling
    python plot_data.py 2
    python compute_deficiencies.py 2

For the third experiment that examined the performance of different heuristics, run:

    python main.py -m query_function=variation_ratio,variation_ratio,variation_ratio
    python main.py -m dataset=sst +heuristic=ret,ret,ret,rect,rect,rect,sud,sud,sud metric_file=heuristics
    python plot_data.py 3
    python compute_deficiencies.py 3

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages