Since this was an Industry-University Cooperation Project, the dataset is no longer accessible now that the project has ended :)
Note that this project was done as a team; I was mainly in charge of text preprocessing, the machine learning methods (XGBoost), ensembling, and result analysis.
- Cleaning dirty text
- Machine learning methods to classify math word problems by their equations (Korean + English + math symbols + numbers)
- Classification by fine-tuning the pretrained KoBERT and KoELECTRA models
- Ensembling
- Analysis of results: correct and incorrect predictions
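As an illustration of the cleaning step, here is a minimal sketch; the specific rules (replacing numbers with a placeholder token, keeping only Korean, English, and math symbols) are hypothetical examples, not the project's actual rules:

```python
import re

def clean_problem_text(text: str) -> str:
    """Normalize a math word problem for vectorization.

    Hypothetical cleaning rules for illustration:
    replace every number with a <NUM> placeholder, drop
    characters other than Korean, Latin letters, math
    symbols, and the placeholder, then collapse whitespace.
    """
    text = re.sub(r"\d+(\.\d+)?", "<NUM>", text)            # numbers -> placeholder
    text = re.sub(r"[^가-힣a-zA-Z<>+\-*/=(). ]", " ", text)  # keep Korean/English/math
    text = re.sub(r"\s+", " ", text).strip()                 # collapse whitespace
    return text

print(clean_problem_text("철수는 사과 3개와 배 12개를 샀다!!  3+12=?"))
```

In the actual comparison (ex1 ~ ex4 below), rules like these would be enabled incrementally to vary how aggressively the text is stripped.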
- Comparison of different text preprocessing schemes, each applied before CountVectorizer and then XGBoost
- From ex1 to ex4, the amount of text removal and substitution increases
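The comparison pipeline can be sketched as below. The toy texts and labels stand in for the real (no longer accessible) dataset, and scikit-learn's `GradientBoostingClassifier` stands in for the project's XGBoost classifier (`xgboost.XGBClassifier` is a drop-in replacement with the same `fit`/`predict` interface):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the real dataset: preprocessed problem texts and
# equation-type labels (0 = addition, 1 = subtraction).
texts = ["add <NUM> and <NUM>", "subtract <NUM> from <NUM>",
         "sum of <NUM> <NUM>", "difference of <NUM> <NUM>"] * 5
labels = [0, 1, 0, 1] * 5

# Bag-of-words features fed into gradient-boosted trees.
clf = make_pipeline(CountVectorizer(), GradientBoostingClassifier(random_state=0))
clf.fit(texts, labels)
preds = clf.predict(texts)
print((preds == labels).mean())
```

To compare ex1 ~ ex4, the same pipeline would be refit on each preprocessed version of the corpus and the validation scores compared.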
- Comparison of different models with a fixed text preprocessing scheme
- The ensemble method aggregates the outputs of selected models with weights
- Two weighting strategies were compared:
- using all model outputs and training a small neural network to learn the best way to combine them
- using only the best-performing model of each method type, with the weights found through trial and error
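The trial-and-error weighting can be sketched as a scan over convex weights applied to each model's predicted class probabilities. The probabilities below are made up for illustration, not the project's actual model outputs:

```python
import numpy as np

# Toy predicted class probabilities from two models (4 samples, 3 classes).
proba_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3],
                    [0.1, 0.2, 0.7], [0.5, 0.4, 0.1]])
proba_b = np.array([[0.4, 0.4, 0.2], [0.1, 0.7, 0.2],
                    [0.3, 0.3, 0.4], [0.2, 0.6, 0.2]])
y_true = np.array([0, 1, 2, 0])

def ensemble_accuracy(w: float) -> float:
    """Accuracy of the weighted average w * A + (1 - w) * B."""
    blended = w * proba_a + (1 - w) * proba_b
    return float((blended.argmax(axis=1) == y_true).mean())

# Trial and error: scan candidate weights and keep the best one.
weights = np.linspace(0, 1, 11)
best_w = max(weights, key=ensemble_accuracy)
print(best_w, ensemble_accuracy(best_w))
```

The neural-network variant replaces this scan with a small model (e.g. one hidden layer) trained on the concatenated model outputs to predict the final class.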
Another method tested, not discussed here, was further pretraining a BERT model on the data at hand.
However, the dataset was too small, so it did not outperform any of the results above.
With enough data, this approach would be expected to produce the best results.