Predict the tags associated with Stack Overflow questions so that each question can be sent to the right set of people to answer it.

rohitgurjar058/StackOverflow-Tag-Prediction

StackOverflow-Tag-Prediction

The goal is to predict the tags of a question from its title and description, so that questions can be routed to the right set of people to answer them.

This is a multi-label classification problem: each question can carry several tags at once.
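
To make the multi-label setup concrete, here is a minimal sketch of how tag strings could be turned into a binary indicator matrix, with one column per tag and one row per question. It assumes tags are stored as space-separated strings; the function name `binarize_tags` and the example questions are hypothetical, not taken from this repository.

```python
def binarize_tags(tag_strings):
    """Turn space-separated tag strings into a 0/1 indicator matrix."""
    # Assumption: each question's tags arrive as one space-separated string.
    tag_lists = [s.split() for s in tag_strings]

    # Fixed, sorted vocabulary so every column has a stable meaning.
    vocab = sorted({tag for tags in tag_lists for tag in tags})
    index = {tag: i for i, tag in enumerate(vocab)}

    matrix = []
    for tags in tag_lists:
        row = [0] * len(vocab)
        for tag in tags:
            row[index[tag]] = 1  # a question may switch on several columns
        matrix.append(row)
    return vocab, matrix


# Hypothetical questions, for illustration only.
example_tags = ["python pandas", "java", "python django orm"]
vocab, Y = binarize_tags(example_tags)
```

Each column of `Y` can then be treated as its own binary target, which is what a one-vs-rest scheme does.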

Data Description

Source - https://www.kaggle.com/c/facebook-recruiting-iii-keyword-extraction/data

Attribute Information

| Field Name | Description |
| --- | --- |
| Id | Unique identifier for each question |
| Title | Title of the question |
| Body | Description of the question |
| Tags | Tags assigned by the person who posted the question |

Analysis

  • Frequency of the tags

  • Most common number of tags per question

  • Word cloud showing which tags occur most frequently
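
The first two analyses above can be sketched with the standard library alone. This is an illustrative sketch, not the repository's code: it assumes space-separated tag strings, and `tag_statistics` and the sample data are hypothetical.

```python
from collections import Counter


def tag_statistics(tag_strings):
    """Count how often each tag occurs and how many tags each question has."""
    tag_counts = Counter()         # tag -> number of questions using it
    tags_per_question = Counter()  # number of tags -> number of questions
    for s in tag_strings:
        tags = s.split()  # assumption: tags are space-separated
        tag_counts.update(tags)
        tags_per_question[len(tags)] += 1
    return tag_counts, tags_per_question


# Hypothetical sample, for illustration only.
sample = [
    "c# .net",
    "python",
    "python pandas numpy",
    "java android",
    "python django",
]
tag_counts, tags_per_question = tag_statistics(sample)

top_tags = tag_counts.most_common(3)
most_common_count = tags_per_question.most_common(1)[0][0]
```

The `tag_counts` mapping is also the kind of frequency table a word-cloud tool would typically take as input.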

Model Comparison

| Model | Featurization | Loss | Micro F1 Score |
| --- | --- | --- | --- |
| OneVsRest + SGD Classifier | Tf-idf | log | 0.3738 |
| OneVsRest + Logistic Regression Classifier | Tf-idf | log | 0.395 |
| OneVsRest + SGD Classifier | Bag-of-words | log | 0.292 |
| OneVsRest + SGD Classifier | Bag-of-words | hinge | 0.3026 |
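
For readers unfamiliar with the metric, here is a minimal sketch of micro-averaged F1 for multi-label predictions. It pools true positives, false positives, and false negatives over all questions and tags before computing precision and recall. The function `micro_f1` and the example tag sets are hypothetical and are not how the scores in the table were produced.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over per-question sets of true and predicted tags."""
    tp = fp = fn = 0
    for true_tags, pred_tags in zip(y_true, y_pred):
        true_set, pred_set = set(true_tags), set(pred_tags)
        tp += len(true_set & pred_set)  # tags predicted and correct
        fp += len(pred_set - true_set)  # tags predicted but wrong
        fn += len(true_set - pred_set)  # correct tags that were missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correct
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions, for illustration only.
y_true = [{"python", "pandas"}, {"java"}, {"c++", "pointers"}]
y_pred = [{"python"}, {"java", "android"}, {"c++", "pointers"}]
score = micro_f1(y_true, y_pred)
```

Because the counts are pooled, frequent tags influence the micro-averaged score more than rare ones.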

References

Applied AI Course
