
This repository contains the solution to the Quora Question Pair Challenge, a natural language processing task that consists of predicting whether a pair of questions asked on Quora are duplicates or not.


davidrosado4/quora-question-pairs

 
 


Quora Question Pair Challenge Solution

This repository contains the solution to the Quora Question Pair Challenge, a natural language processing task that consists of predicting whether a pair of questions asked on Quora are duplicates or not.

Collaborators

The solution was developed by the following collaborators:

Repository Structure

The repository is organized as follows:

  • train_models.ipynb: a Jupyter notebook that contains the code to preprocess the data and train the models. When executed, it creates a folder called model_artifacts that contains all the necessary information to reproduce the results.
  • reproduce_results.ipynb: a Jupyter notebook that contains the code to reproduce the results obtained by our models. It reads from the model_artifacts folder.
  • utils.py: a Python module that contains all the helper functions for the two previous notebooks.
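Because reproduce_results.ipynb reads from model_artifacts, it can help to confirm that the folder exists before opening that notebook. The following is a minimal sketch, not part of the repository: the helper name list_artifacts is hypothetical, and the README does not say which files the folder contains, so the code only lists whatever it finds.

```python
from pathlib import Path
from typing import List

# Folder name taken from the README; its contents are not documented there.
ARTIFACTS_DIR = Path("model_artifacts")


def list_artifacts(artifacts_dir: Path = ARTIFACTS_DIR) -> List[str]:
    """Return sorted relative paths of all files under artifacts_dir.

    Returns an empty list if the folder does not exist (hypothetical helper).
    """
    if not artifacts_dir.is_dir():
        return []
    return sorted(
        p.relative_to(artifacts_dir).as_posix()
        for p in artifacts_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    files = list_artifacts()
    if files:
        print(f"Found {len(files)} artifact file(s):")
        for name in files:
            print("  ", name)
    else:
        print("model_artifacts is missing or empty; run train_models.ipynb first.")
```

Run from the repository root, the script either prints the files it found or reminds you to execute train_models.ipynb first.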

Reproducing the Results

To reproduce the results obtained by our models, follow these steps:

  1. Clone this repository:
git clone https://github.com/sarabase/quora-question-pairs.git
  2. Create a conda environment from requirements.txt and activate it:
conda create --name quora_test_env --file requirements.txt
conda activate quora_test_env
  3. Run the train_models.ipynb notebook. This creates the model_artifacts folder.
  4. Open the reproduce_results.ipynb notebook in Jupyter and execute its cells.
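The two notebook steps can also be run without opening Jupyter. The sketch below is one possible approach rather than the authors' documented workflow: it assumes that jupyter nbconvert is available in the activated environment (the README does not confirm that requirements.txt provides it), and the function names build_command and run_all are hypothetical. By default it only returns the commands; pass dry_run=False to execute them.

```python
import subprocess
from typing import List, Sequence

# Notebook names and their order are taken from the README.
NOTEBOOKS = ["train_models.ipynb", "reproduce_results.ipynb"]


def build_command(notebook: str) -> List[str]:
    """Build an nbconvert command that executes a notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_all(notebooks: Sequence[str] = NOTEBOOKS, dry_run: bool = True) -> List[List[str]]:
    """Return the commands for each notebook; run them in order unless dry_run."""
    commands = [build_command(nb) for nb in notebooks]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing notebook, so
            # reproduce_results.ipynb never runs on a failed training step.
            subprocess.run(cmd, check=True)
    return commands
```

Calling run_all(dry_run=False) from the repository root executes train_models.ipynb first and reproduce_results.ipynb second, matching the order of the steps above.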
