Identify which questions asked on Quora are duplicates of questions that have already been asked. This could be useful for instantly providing answers to questions that have already been answered. We will predict whether the two questions in a pair are duplicates using various machine learning techniques.
https://www.kaggle.com/c/quora-question-pairs/data
Identify which questions asked on Quora are duplicates of questions that have already been asked. This could be useful for instantly providing answers to questions that have already been answered. We want to predict whether the two questions in a pair are duplicates.
- The cost of a misclassification can be very high.
- The model should output the probability that a pair of questions is a duplicate, so that any decision threshold can be chosen.
- No strict latency concerns.
- Interpretability is partially important.
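
To illustrate the probability and threshold constraints above, here is a minimal sketch in plain Python. The labels and probabilities are made-up toy values, not results from this project, and the helper names (`log_loss`, `classify`, `precision_recall`) are hypothetical. Log loss is shown as one common way to score predicted probabilities; raising the threshold trades recall for precision, which matters when misclassifications are costly.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy of predicted probabilities (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def classify(y_prob, threshold):
    """Label a pair as duplicate (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in y_prob]

def precision_recall(y_true, y_pred):
    """Precision and recall for the duplicate (positive) class."""
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy data (assumed for illustration): 1 = duplicate pair, 0 = not duplicate.
y_true = [1, 0, 1, 0, 1]
y_prob = [0.9, 0.6, 0.7, 0.2, 0.4]

print("log loss:", round(log_loss(y_true, y_prob), 4))
for threshold in (0.5, 0.8):
    y_pred = classify(y_prob, threshold)
    precision, recall = precision_recall(y_true, y_pred)
    print(threshold, y_pred, round(precision, 3), round(recall, 3))
```

On this toy data, the stricter threshold of 0.8 removes the one false positive but also misses two true duplicates, so the right threshold depends on which kind of error is more expensive.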
I have written a blog post explaining the approach I used to solve this duplicate question detection problem, from basic EDA to model creation. You can read it on my Medium blog.