Additional files, such as Q&A system for Russian language itself + Training & Test Corpora

Pythonimous/Q-A-System

Deep Learning folder of the 2016-2019 BA thesis:

///

Corpus folder: fewer classes, semi-normalized distribution; sentences are truncated at the average sentence length

Classification.docx: explanation of the classes
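The truncation step could look roughly like the sketch below. It is not taken from Preprocess.py: whitespace tokenization, rounding the mean to the nearest integer, and the function name `truncate_to_average` are all assumptions made for illustration.

```python
def truncate_to_average(sentences):
    """Cut every tokenized sentence to the rounded mean sentence length."""
    if not sentences:
        return []
    # Mean length in tokens over the whole corpus (assumption: rounded,
    # and never shorter than one token).
    avg_len = max(1, round(sum(len(s) for s in sentences) / len(sentences)))
    # Shorter sentences are left as they are; longer ones lose their tail.
    return [s[:avg_len] for s in sentences]


# Toy corpus, tokenized on whitespace (assumption).
corpus = [
    "кто написал этот роман".split(),
    "где находится ближайшая станция метро в городе".split(),
    "сколько стоит билет".split(),
]
truncated = truncate_to_average(corpus)
print([len(s) for s in truncated])
```

With sentence lengths of 4, 7 and 3 tokens the mean rounds to 5, so only the second sentence is shortened.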

Training.txt, Test.txt - corpora

TestFeatures.pkl, TestLabels.txt, TrainingFeatures.pkl, TrainingLabels.txt - preprocessed files for 2D ML training (reshaping is done in TrainModel.py and TestModel.py)

Preprocess.py - preprocessing algorithm

TrainModel.py - trains the model

TestModel.py - loads and tests the model separately

Top Model - contains the best model and a .txt file explaining its name

CLASSIFICATION.txt - maps the classes used inside the model (Python) to their respective real classes


With Dictionaries folder:


Dictionary Preprocessing: folder with differently extracted features and the feature extraction algorithm: parts of a sentence exceeding the average size are averaged with numpy.mean; dictionaries are used.

To Word Vectors.py: represents sentences as lists of POS tags (creates ~Raw.pkl -> ~POS.pkl for training / test for any given file) and creates FeaturesDictionary.pkl, a dictionary of pairs: POS-tagged word - embedding

To Vectors.py, To Matrices.py, To Tensors.py: use the created POS.pkl and Dictionary.pkl to build representations for the 1D, 2D and 3D CNN respectively

Training.txt, Test.txt - training and test corpora
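One possible reading of "parts exceeding the average size are numpy.mean'ed" is sketched below: the words beyond the target length are collapsed into a single averaged row. The README does not say how the overflow is placed or how short sentences are handled, so the function name `fit_to_length`, the choice to put the mean in the last row, and the zero padding are assumptions.

```python
import numpy as np


def fit_to_length(word_vectors, target_len):
    """Return a (target_len, dim) matrix for one sentence."""
    word_vectors = np.asarray(word_vectors, dtype=float)
    n_words, dim = word_vectors.shape
    if n_words > target_len:
        # Keep the first target_len - 1 rows and replace everything after
        # them with one row holding their mean (assumption).
        head = word_vectors[: target_len - 1]
        tail_mean = np.mean(word_vectors[target_len - 1:], axis=0, keepdims=True)
        return np.vstack([head, tail_mean])
    # Assumption: shorter sentences are padded with zero rows.
    padding = np.zeros((target_len - n_words, dim))
    return np.vstack([word_vectors, padding])


long_sentence = np.arange(10).reshape(5, 2)   # 5 words, 2-dim embeddings
short_sentence = np.ones((2, 2))              # 2 words, 2-dim embeddings
fitted_long = fit_to_length(long_sentence, 3)
fitted_short = fit_to_length(short_sentence, 3)
print(fitted_long.shape, fitted_short.shape)
```

Either way every sentence ends up with the same shape, which is what a fixed-size CNN input needs.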
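As a rough illustration of how a POS-tagged-word dictionary can turn a sentence into CNN input, the sketch below looks each token up and stacks the embeddings. The `word_POS` key format, the toy embeddings, the zero vector for unknown words, and flattening as the 1D form are all assumptions; the README does not describe the actual layouts produced by To Vectors.py, To Matrices.py or To Tensors.py.

```python
import numpy as np

EMBEDDING_DIM = 3

# Hypothetical stand-in for FeaturesDictionary.pkl: "word_POS" -> embedding.
features_dictionary = {
    "кто_PRON": np.array([0.1, 0.2, 0.3]),
    "написать_VERB": np.array([0.4, 0.5, 0.6]),
    "роман_NOUN": np.array([0.7, 0.8, 0.9]),
}


def sentence_to_matrix(tagged_words, dictionary, dim=EMBEDDING_DIM):
    """2D form: one embedding row per POS-tagged word."""
    if not tagged_words:
        return np.zeros((0, dim))
    # Assumption: words missing from the dictionary become zero vectors.
    rows = [dictionary.get(word, np.zeros(dim)) for word in tagged_words]
    return np.vstack(rows)


def sentence_to_vector(tagged_words, dictionary, dim=EMBEDDING_DIM):
    """1D form (assumption): the same matrix flattened row by row."""
    return sentence_to_matrix(tagged_words, dictionary, dim).reshape(-1)


sentence = ["кто_PRON", "написать_VERB", "роман_NOUN", "вчера_ADV"]
matrix = sentence_to_matrix(sentence, features_dictionary)
vector = sentence_to_vector(sentence, features_dictionary)
print(matrix.shape, vector.shape)
```

The last token is deliberately absent from the toy dictionary to show the fallback row.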


New Corpus Dict folder: uses a different preprocessing mechanism (described above).

TestMatrices.pkl, TestLabels.txt, TrainingMatrices.pkl, TrainingLabels.txt - preprocessed files for 2D ML training

TrainModel 2D.py - trains the model

TestModel 2D.py - loads and tests the model separately

Calibrate 2D.py - tunes parameters (GridSearchCV)

CLASSIFICATION.txt - similar to the one in New Corpus
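For readers unfamiliar with GridSearchCV, the sketch below shows the general shape of such a tuning run with scikit-learn. It is not the contents of Calibrate 2D.py: the random data stands in for TrainingMatrices.pkl, and a LogisticRegression on flattened matrices stands in for the thesis model, whose architecture and parameter grid are not given in this README.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)

# Hypothetical data: 60 sentences, each a 5 x 4 matrix, three classes.
labels = np.repeat([0, 1, 2], 20)
matrices = rng.normal(size=(60, 5, 4)) + labels[:, None, None]

# scikit-learn estimators expect 2D input, so each matrix is flattened.
X = matrices.reshape(len(matrices), -1)

search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),  # stand-in model (assumption)
    param_grid={"C": [0.1, 1.0, 10.0]},           # illustrative grid (assumption)
    cv=3,
    scoring="accuracy",
)
search.fit(X, labels)
print(search.best_params_, search.best_score_)
```

`best_params_` holds the winning combination and `best_estimator_` the model refitted on all of the training data with it.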
