InterviewQuestionTagging

This is a helper project for InterviewInsightMine that tags the questions in its scraped data.

Problem

InterviewInsightMine has collected more than 9,000 scraped entries, but analyzing them is difficult because they are not tagged.

Solution

Train a model on data from the StackOverflow and StackExchange websites, which publish the Stack Exchange Data Dumps.
In this project, we are interested in the Posts file, which contains the questions and their tags. The first iteration of this project was done on stats.meta.stackexchange.com.7z. Because GPU power is limited and not all tags are needed, I extracted only the top 50 tags.
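The tag-extraction step could be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks. It assumes that "top 50" means the 50 most frequent tags, that the Posts file is an XML document of `row` elements, that questions carry `PostTypeId="1"`, and that the `Tags` attribute looks like `<regression><bayesian>` (some dumps use `|` as the separator instead). The names `parse_tags`, `top_tags`, and `SAMPLE` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def parse_tags(raw):
    """Split a Tags attribute such as '<regression><bayesian>' into a list."""
    # Assumption: tags are wrapped in angle brackets or separated by '|'.
    cleaned = raw.replace("><", "|").strip("<>|")
    return [tag for tag in cleaned.split("|") if tag]


def top_tags(posts_xml, k=50):
    """Return the k most frequent tags over question rows (PostTypeId == '1')."""
    counts = Counter()
    # iterparse streams the file, so the whole dump never sits in memory.
    for _, elem in ET.iterparse(posts_xml, events=("end",)):
        if elem.tag == "row" and elem.get("PostTypeId") == "1":
            counts.update(parse_tags(elem.get("Tags", "")))
        elem.clear()  # drop the element's contents once it has been counted
    return [tag for tag, _ in counts.most_common(k)]


# Tiny in-memory stand-in for a Posts file (made-up rows, for illustration only).
SAMPLE = (
    b'<posts>'
    b'<row Id="1" PostTypeId="1" Tags="&lt;r&gt;&lt;regression&gt;" />'
    b'<row Id="2" PostTypeId="2" />'
    b'<row Id="3" PostTypeId="1" Tags="&lt;regression&gt;" />'
    b'</posts>'
)

if __name__ == "__main__":
    print(top_tags(io.BytesIO(SAMPLE), k=50))
```

`top_tags` also accepts a file path, so it could be pointed at an extracted Posts file directly.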

Preprocessing

The preprocessing is basic:

Removing stop words
Lowercasing all strings
Stemming the words
Removing slashes and other symbols
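The four steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's code: the stop-word list is a deliberately tiny, hypothetical one, and `naive_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's. The order of the steps here (lowercase, strip symbols, drop stop words, stem) is also a choice made for the sketch.

```python
import re

# Hypothetical, deliberately small stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "or", "for", "what", "how"}

# Crude suffix list used as a stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, replace symbols with spaces, drop stop words, then stem."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # slashes and other symbols become spaces
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return [naive_stem(tok) for tok in tokens]


if __name__ == "__main__":
    print(preprocess("Explaining random forests/models"))
```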

Model

The data is then vectorized with a TF-IDF vectorizer and fed into the convolutional model.
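To make the vectorization step concrete, here is a plain-Python sketch of TF-IDF weighting. The README does not say which library or settings were used; this sketch assumes the smoothed IDF, ln((1 + n) / (1 + df)) + 1, followed by L2 normalization of each row, which is the default behaviour of scikit-learn's `TfidfVectorizer`. The function name `tfidf_matrix` is hypothetical, and the convolutional model itself is not sketched because its architecture is not described here.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """docs: list of token lists. Returns (vocabulary, L2-normalised TF-IDF rows)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency (assumed variant, see lead-in).
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1 for tok in vocab}
    rows = []
    for doc in docs:
        term_freq = Counter(doc)
        row = [term_freq[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(w * w for w in row)) or 1.0  # avoid dividing by zero
        rows.append([w / norm for w in row])
    return vocab, rows


if __name__ == "__main__":
    vocab, rows = tfidf_matrix([["linear", "regress"], ["regress", "tree"]])
    print(vocab)
    print(rows[0])
```

Tokens that appear in fewer documents get a larger weight, so in the example "linear" outweighs "regress" in the first row.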

Prediction

The prediction is a vector of 50 elements, each between 0 and 1, giving the probability that the corresponding tag is associated with the question.
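One way to turn such a vector back into tag names is to keep every tag whose probability clears a cutoff. The sketch below assumes a threshold of 0.5, which is not stated in this README, and uses a hypothetical helper name `decode_tags` with a made-up three-tag example in place of the full 50.

```python
def decode_tags(probabilities, tag_names, threshold=0.5):
    """Return the tags whose predicted probability is at least `threshold`."""
    if len(probabilities) != len(tag_names):
        raise ValueError("probabilities and tag_names must have the same length")
    return [name for name, p in zip(tag_names, probabilities) if p >= threshold]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    tags = ["regression", "r", "bayesian"]
    print(decode_tags([0.91, 0.12, 0.67], tags))
```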

Loss Function

MSE loss between the actual and predicted vectors. Current testing loss: 0.0268
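For a single question, the loss can be illustrated as the mean of the squared differences between the multi-hot target vector and the predicted vector. This is a minimal sketch with made-up numbers and a hypothetical function name `mse_loss`. How the reported testing loss is averaged over the test set is not specified here.

```python
def mse_loss(actual, predicted):
    """Mean squared error between two equal-length vectors."""
    if len(actual) != len(predicted):
        raise ValueError("vectors must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Four tags instead of 50, to keep the example readable.
    target = [1, 0, 0, 1]
    prediction = [0.5, 0.5, 0.0, 1.0]
    print(mse_loss(target, prediction))
```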

AhmedTammaa/InterviewQuestionTagging
