Twitter-Occupation-Prediction

Code and data accompanying paper "Twitter Homophily: Network Based Prediction of User’s Occupation": ACL 2019.

Dataset

  • The Twi_data folder contains the training/development/test split.
  • It also contains the processed dataset used by the GCN and DeepWalk models, extracted around February 2018.
  • The dataset is derived from Twitter ego-networks collected for a sample of Twitter users whose occupational classes are labeled. Please refer to the paper for collection and processing details.
  • Due to privacy concerns, we are not releasing the raw Twitter network with user bios.

Statistics

  • Total number of edges: 586303
  • Total number of main users (with real labels): 4557
  • Total number of users (including main users): 34603
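
As a rough illustration of how counts like these can be computed, here is a minimal sketch. It assumes the network is available as an in-memory list of (user, user) pairs and the labels as a dict from main user to occupational class; the function name network_stats and both input formats are hypothetical and not taken from the repository. It also assumes edges are undirected, which may not match how the released files count them.

```python
from typing import Dict, Hashable, Iterable, Tuple


def network_stats(
    edges: Iterable[Tuple[Hashable, Hashable]],
    labels: Dict[Hashable, int],
) -> Dict[str, int]:
    """Count edges, labeled (main) users, and all users.

    Assumption: edges are undirected, so (u, v) and (v, u) count once,
    and self-loops are ignored.
    """
    unique_edges = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    # Every main user counts as a user, even if it has no edges.
    users = {u for e in unique_edges for u in e} | set(labels)
    return {
        "edges": len(unique_edges),
        "main_users": len(labels),
        "users": len(users),
    }


# Toy data, not the released dataset.
toy_edges = [(1, 2), (2, 1), (2, 3), (3, 4)]
toy_labels = {1: 0, 3: 2}
stats = network_stats(toy_edges, toy_labels)
```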

Code

  • The src folder contains the code for running the GCN model on the processed dataset.
  • To run it, change into the src folder (cd src) and then run python train_model.py.
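
For readers unfamiliar with GCNs, the sketch below shows the standard propagation rule from Kipf and Welling, ReLU(D^-1/2 (A + I) D^-1/2 H W), on a dense NumPy adjacency matrix. It illustrates the general technique only; the function names normalize_adjacency and gcn_layer are hypothetical, and the code in src may be organized differently (for example, with sparse matrices and torch tensors).

```python
import numpy as np


def normalize_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix A."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    deg = a_hat.sum(axis=1)             # degree >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale row i by d_i^-1/2 and column j by d_j^-1/2.
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm: np.ndarray, features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weights, 0.0)


# Toy graph with two connected users and one-hot input features.
adj = np.array([[0.0, 1.0], [1.0, 0.0]])
a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, np.eye(2), np.eye(2))
```

Each row of the output mixes a user's own features with those of its neighbours, which is how network homophily enters the prediction.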

Data Processing Script

We include the Jupyter notebook that processes the raw network into a densely connected network, which is then used with GCN and DeepWalk.
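
The exact filtering rule is defined in the notebook and the paper. Purely as an illustration of what such a densification step can look like, the sketch below keeps all main users and drops any other user linked to fewer than a threshold number of main users. The rule itself, the function name densify, and the parameter min_main_neighbors are assumptions made for this example, not the authors' procedure.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def densify(
    edges: Iterable[Edge],
    main_users: Set[Hashable],
    min_main_neighbors: int = 2,
) -> List[Edge]:
    """Keep edges among main users and well-connected non-main users.

    Hypothetical rule: a non-main user survives only if it is linked to
    at least `min_main_neighbors` distinct main users.
    """
    edges = list(edges)
    main_users = set(main_users)
    main_neighbors = defaultdict(set)
    for u, v in edges:
        if u in main_users and v not in main_users:
            main_neighbors[v].add(u)
        elif v in main_users and u not in main_users:
            main_neighbors[u].add(v)
    keep = main_users | {
        node for node, mains in main_neighbors.items()
        if len(mains) >= min_main_neighbors
    }
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Toy example: "x" touches two main users and stays, "y" touches one and goes.
toy = [("a", "x"), ("b", "x"), ("a", "y"), ("a", "b")]
dense = densify(toy, {"a", "b"}, min_main_neighbors=2)
```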

Requirements

  • 'python=3.6'
  • 'torch>=1.0'
  • 'scipy'
  • 'numpy'
  • 'pandas'

Acknowledgments

The code for the GCN model is forked and modified from https://github.com/tkipf/gcn

Citation

If you find this resource useful, please cite:

@InProceedings{pan-19-homophily,
author = {Pan, Jiaqi and Bhardwaj, Rishabh and Lu, Wei and Chieu, Hai Leong and Pan, Xinghao and Puay, Ni Yi},
title = {Twitter Homophily: Network Based Prediction of User’s Occupation},
booktitle = {Proceedings of ACL},
year = {2019}
}

Contact

If you have any questions, please feel free to contact jiaqi.pan1019@gmail.com or rishabhbhardwaj15@gmail.com.
