Task: Suggest tags for a question based on the content of the question posted on Stack Overflow.

Stack Overflow is the largest, most trusted online community for developers to learn, share their programming knowledge, and build their careers, and nearly every programmer uses it in one way or another. Each month, over 50 million developers visit the site. It features questions and answers on a wide range of topics in computer programming. The website serves as a platform for users to ask and answer questions and, through membership and active participation, to vote questions and answers up or down and edit them in a fashion similar to a wiki or Digg. As of April 2014, Stack Overflow had over 4,000,000 registered users, and it exceeded 10,000,000 questions in late August 2015. Based on the tags assigned to questions, the eight most discussed topics on the site are Java, JavaScript, C#, PHP, Android, jQuery, Python and HTML.

Data Source:

Steps followed:

Step_1: The dataset is hosted on Kaggle, so I followed some references from the web to load the Kaggle dataset in Google Colab.

Step_2: Analyzed the data and determined which features could be built from the dataset.

Step_3: Used TF-IDF and bag of words (BoW) for feature extraction.

Step_4: Split the data into train and test sets (80:20).

Step_5: Used bag of words with up to 4-grams and computed the micro F1 score with logistic regression (OvR, one-vs-rest).

Step_6: Applied logistic regression and linear SVM.

Step_7: Computed the micro F1 score for each model.
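
Steps 3 to 7 can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the notebook's actual code: the toy questions and tags are made up, the helper name `micro_f1` is hypothetical, "up to 4-grams" is assumed to mean `ngram_range=(1, 4)`, and pairing TF-IDF with each classifier is an assumption about how the features and models were combined.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy stand-in for the Kaggle data (hypothetical questions and tags).
questions = [
    "how to read a csv file with pandas in python",
    "python list comprehension with a condition",
    "java null pointer exception in array list",
    "java string to int conversion",
    "javascript click event handler with jquery",
    "jquery ajax post request in javascript",
    "python pandas merge two dataframes",
    "java hashmap iteration example",
    "javascript array map and filter",
    "python dictionary sort by value",
]
tags = [
    ["python", "pandas"], ["python"], ["java"], ["java"],
    ["javascript", "jquery"], ["javascript", "jquery"],
    ["python", "pandas"], ["java"], ["javascript"], ["python"],
]

# Each question can carry several tags, so encode the tag lists as a
# binary indicator matrix with one column per tag.
Y = MultiLabelBinarizer().fit_transform(tags)

# Step 4: 80:20 train/test split.
q_train, q_test, y_train, y_test = train_test_split(
    questions, Y, test_size=0.2, random_state=42
)

def micro_f1(vectorizer, base_estimator):
    """Fit a one-vs-rest model on the given features; return test micro F1."""
    x_train = vectorizer.fit_transform(q_train)
    x_test = vectorizer.transform(q_test)
    model = OneVsRestClassifier(base_estimator).fit(x_train, y_train)
    predictions = model.predict(x_test)
    return f1_score(y_test, predictions, average="micro", zero_division=0)

# Steps 3, 5, 6, 7: compare feature/model combinations by micro F1.
scores = {
    "BoW (1-4 grams) + logistic regression": micro_f1(
        CountVectorizer(ngram_range=(1, 4)), LogisticRegression(max_iter=1000)
    ),
    "TF-IDF + logistic regression": micro_f1(
        TfidfVectorizer(), LogisticRegression(max_iter=1000)
    ),
    "TF-IDF + linear SVM": micro_f1(TfidfVectorizer(), LinearSVC()),
}

for name, score in scores.items():
    print(f"{name}: micro F1 = {score:.3f}")
```

On data this small the scores are not meaningful; the sketch only shows how the pieces fit together. The one-vs-rest wrapper trains one binary classifier per tag, which is one common way to handle questions with multiple tags.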

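
For readers unfamiliar with the micro F1 score mentioned in Steps 5 and 7, here is a small pure-Python sketch of how it can be computed for multi-label tag predictions. The function name and the example tags are hypothetical. Micro averaging pools true positives, false positives and false negatives over all questions and tags before computing precision and recall, so frequent tags weigh more than rare ones.

```python
def micro_f1(true_tags, predicted_tags):
    """Micro-averaged F1 over paired collections of tag sets."""
    tp = fp = fn = 0
    for truth, predicted in zip(true_tags, predicted_tags):
        truth, predicted = set(truth), set(predicted)
        tp += len(truth & predicted)   # tags predicted and correct
        fp += len(predicted - truth)   # tags predicted but wrong
        fn += len(truth - predicted)   # tags that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical true and predicted tags for three questions.
true_tags = [{"python", "pandas"}, {"java"}, {"javascript", "jquery"}]
predicted_tags = [{"python"}, {"java", "android"}, {"javascript", "jquery"}]

score = micro_f1(true_tags, predicted_tags)
print(f"{score:.2f}")  # prints 0.80
```

Here there are 4 true positives, 1 false positive and 1 false negative, so precision and recall are both 4/5 and the micro F1 is 0.8.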