banyous/Quora-and-Twitter-crawler-and-user-matcher

This repository helps collect true matching accounts between the Quora and Twitter networks.

Please note that Quora and Twitter change the layout and structure of their websites every now and then, so please update the code if it doesn't work as expected.

A dataset of 27k true matching Quora-Twitter accounts produced by ULSN is available at the following link: https://zenodo.org/record/3837711#.Xvr1uJZRU-I


Our matching process is composed of three steps (modules):

1-Quora-scrapping

The goal of this module is to retrieve Quora user IDs. This is done in two steps: (1) we crawl Quora question URLs based on a file of Quora topic keywords; (2) we crawl all the answers to each question and extract their authors (the Quora user IDs).
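The two steps can be illustrated with a minimal sketch. It is not the module's actual code: the function names, the search URL shape, and the assumption that author links look like /profile/<user-id> are all illustrative, and downloading the pages is left out.

```python
import re
from urllib.parse import quote_plus

# Assumption: answer authors are linked as /profile/<user-id> in the page HTML.
PROFILE_LINK = re.compile(r"/profile/([\w.-]+)")


def search_url(keyword):
    """Step 1 helper: build a search URL for one topic keyword (URL shape assumed)."""
    return "https://www.quora.com/search?q=" + quote_plus(keyword)


def extract_user_ids(question_html):
    """Step 2 helper: return distinct author IDs found in a question page, in first-seen order."""
    seen = {}
    for user_id in PROFILE_LINK.findall(question_html):
        seen.setdefault(user_id, None)  # a dict keeps insertion order and drops repeats
    return list(seen)
```

In practice the HTML would come from whatever crawler fetches the question pages. Since Quora's markup changes over time, the link pattern may need adjusting.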

2-Matching-Quora-twitter-users

Based on the Quora user IDs extracted by the previous module, we run an image-matching algorithm to link the Quora accounts with their corresponding Twitter accounts.
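This README does not say which image-matching algorithm the module uses. As one possible illustration only, the sketch below compares two images with an average hash and a Hamming distance. The function names, the threshold, and the input format are assumptions: each image is taken to be a small grayscale grid of 0-255 values, already resized to the same dimensions.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the image mean, else 0."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("images must have the same dimensions")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_match(image_a, image_b, max_distance=2):
    """Treat two images as the same picture when their hashes differ in few bits.

    max_distance is a hypothetical threshold; it would need tuning on real data.
    """
    return hamming(average_hash(image_a), average_hash(image_b)) <= max_distance
```

A hash of this kind tolerates small brightness or compression differences between two copies of the same picture, which is why it is a common first choice for comparing images across sites.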

3-Crawl-twitter-quora-profiles

All the true matching Quora/Twitter pairs verified by 2-Matching-Quora-twitter-users have their Twitter and Quora accounts crawled in this module.
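A rough sketch of this step, under stated assumptions, is shown here. The function names and record fields are hypothetical, and the page download is passed in as a fetch callable so the sketch does not commit to any particular HTTP or browser automation library.

```python
def profile_urls(quora_id, twitter_handle):
    """Build the public profile URLs for one matched pair."""
    return {
        "quora": "https://www.quora.com/profile/" + quora_id,
        "twitter": "https://twitter.com/" + twitter_handle,
    }


def crawl_pairs(pairs, fetch):
    """Fetch both profiles of every matched pair.

    pairs: iterable of (quora_id, twitter_handle) tuples.
    fetch: callable taking a URL and returning the page content (supplied by the caller).
    """
    records = []
    for quora_id, twitter_handle in pairs:
        urls = profile_urls(quora_id, twitter_handle)
        records.append({
            "quora_id": quora_id,
            "twitter_handle": twitter_handle,
            "quora_page": fetch(urls["quora"]),
            "twitter_page": fetch(urls["twitter"]),
        })
    return records
```

Keeping the download behind a callable also makes it easy to swap the fetching code when either site changes its layout.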

Please note that each module can be used separately. A more detailed description of each module can be found in the notes.txt file within each corresponding folder.

