Questions Similarity using Siamese Neural Network

Problem

The task is to identify similar questions on Quora.
With millions of questions posted, it can be hard for a user to find similar questions that have already been answered.

Solution

To solve this problem, I used natural language processing techniques to clean and preprocess the data, and a Siamese Neural Network with LSTM layers that uses the Manhattan distance to measure the similarity between pairs of questions.
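As a rough illustration of the Manhattan distance step, the sketch below turns the L1 distance between two question encodings into a similarity score. It assumes each question has already been encoded into a fixed-length vector by the shared LSTM branch, and it assumes the common exp(-distance) mapping to a score in (0, 1]; the notebook may scale or threshold the distance differently. The function name `manhattan_similarity` and the example vectors are hypothetical.

```python
import math
from typing import Sequence


def manhattan_similarity(h_left: Sequence[float], h_right: Sequence[float]) -> float:
    """Map the L1 (Manhattan) distance between two encodings to a score in (0, 1].

    Assumption: exp(-distance) is used as the mapping, so identical
    encodings score 1.0 and distant encodings approach 0.
    """
    if len(h_left) != len(h_right):
        raise ValueError("encodings must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1_distance)


# Hypothetical encodings standing in for the final LSTM hidden states.
same = manhattan_similarity([0.2, -0.1, 0.5], [0.2, -0.1, 0.5])
different = manhattan_similarity([0.2, -0.1, 0.5], [-0.6, 0.4, 0.0])
print(same, different)
```

Because both questions pass through the same LSTM weights, the score is symmetric: swapping the two inputs gives the same value.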

Implementation

Data import and exploratory data analysis.
Splitting the data, then cleaning and preprocessing (tokenize sentences, convert to lowercase, remove stopwords, remove non-alphanumeric characters, lemmatize the tokens).
Vectorizing the train and test sets using Word2Vec.
Training and testing the Siamese Neural Network model.
Visualization of the performance.
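The cleaning steps listed above can be sketched in plain Python as follows. This is a minimal illustration, not the notebook's code: the stopword set is a tiny hypothetical sample, tokenization is a simple whitespace split, and lemmatization is left as a pluggable callable (identity by default) so that a real lemmatizer can be passed in. Punctuation is stripped before the stopword check so that a token such as "the?" is still filtered.

```python
import re
from typing import Callable, List

# Tiny illustrative stopword set (an assumption); a real run would use a full list.
STOPWORDS = {"a", "an", "the", "is", "are", "what", "how", "do", "i", "to", "of", "in"}


def preprocess(question: str, lemmatize: Callable[[str], str] = lambda t: t) -> List[str]:
    """Clean one question and return its list of tokens."""
    tokens = question.split()                                # tokenize on whitespace
    tokens = [t.lower() for t in tokens]                     # remove capital letters
    tokens = [re.sub(r"[^a-z0-9]", "", t) for t in tokens]   # drop non-alphanumeric characters
    tokens = [t for t in tokens if t and t not in STOPWORDS] # drop empties and stopwords
    return [lemmatize(t) for t in tokens]                    # lemmatize (identity by default)


print(preprocess("How do I learn Python quickly?"))
```

The resulting token lists are what would then be passed to the Word2Vec vectorization step.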


Salma-Gaam/QuestionsSimilarity_Siamese
