CoSyn

Implementation of CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network.

🧱 Dependencies and Setup

🧰 Dependencies

torch 1.11.0
geoopt 0.1.0
dgl 1.0.0
other required packages in requirements.txt

🛠️ Setup

# git clone this repository
git clone https://github.com/MananSuri27/CoSyn
cd CoSyn

# install python dependencies
pip3 install -r requirements.txt

The Hyperbolic Graph Convolution code in models/hgconv.py needs to be moved into the dgl library:

mv CoSyn/models/hgconv.py path-to-dgl/dgl/python/dgl/nn/pytorch/conv/

Then update the export statement in path-to-dgl/dgl/python/dgl/nn/pytorch/conv/__init__.py accordingly.
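If dgl was installed with pip rather than built from a source checkout, the conv directory usually sits inside site-packages instead of under path-to-dgl. The following is a minimal sketch for locating it; the helper name `conv_dir` is hypothetical and not part of this repository or of dgl.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import Optional


def conv_dir(package: str = "dgl") -> Optional[Path]:
    """Return <package>/nn/pytorch/conv for an installed package, or None if not found."""
    try:
        spec = find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        # Package is missing, or it is a namespace package without an __init__.py.
        return None
    # spec.origin points at <package>/__init__.py; its parent is the package root.
    return Path(spec.origin).parent / "nn" / "pytorch" / "conv"
```

Calling `conv_dir()` returns the candidate directory for the dgl found in the active environment, or `None` when dgl is not installed. The function only builds the path; it does not check that the directory exists.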

🔌 Dataset Processing

💬 Conversation Trees

Required Files:

Node and edge features, stored in the members and interactions directories for each split (train, dev, test). For each conversation tree, identified by the tweet ID tweet_id of its parent node, members/tweet_id.csv holds the node features and interactions/tweet_id.csv holds the edge list.
A file, username2id.csv, which maps each username to an ID in [0, n), where n is the number of users.
Post embeddings for each post, saved in the embeds directory.
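Before building the trees, it can help to confirm that every conversation has both of its files. The sketch below assumes the layout is `<data_dir>/<split>/members/` and `<data_dir>/<split>/interactions/`; that nesting is an assumption, and `unmatched_trees` is a hypothetical helper, not part of utils/graphs.py.

```python
from pathlib import Path


def unmatched_trees(data_dir, split):
    """Return tweet IDs that have a members file or an interactions file, but not both."""
    # Assumed layout: <data_dir>/<split>/{members,interactions}/<tweet_id>.csv
    root = Path(data_dir) / split
    members = {p.stem for p in (root / "members").glob("*.csv")}
    interactions = {p.stem for p in (root / "interactions").glob("*.csv")}
    # Symmetric difference: IDs present in exactly one of the two directories.
    return sorted(members ^ interactions)
```

An empty list means each members/tweet_id.csv has a matching interactions/tweet_id.csv in that split.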

To generate the conversation trees and load them as a pickle file, run the following code:

python3 utils/graphs.py

🌐 Social Graph

Required Files:

User relation matrices, stored as [test/train/val]/matrix/file. Each split can contain multiple files, and each file is an adjacency-list representation of the edges between the given users.
Post embeddings of the last m posts by each user (m = 100 in our paper), stored in the embeds directory and referenced by user ID.
A file, username2id.csv, which maps each username to an ID in [0, n), where n is the number of users.
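As an illustration of the [0, n) mapping described above, here is a minimal sketch that assigns contiguous IDs to usernames. The CSV header names (`username`, `id`) and both function names are assumptions; the repository may expect a different column layout.

```python
import csv
import io


def build_username2id(usernames):
    """Map each distinct username to an ID in [0, n), in first-seen order."""
    mapping = {}
    for name in usernames:
        if name not in mapping:
            mapping[name] = len(mapping)
    return mapping


def to_csv(mapping):
    """Serialize the mapping as CSV text; the header names are hypothetical."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["username", "id"])
    writer.writerows(mapping.items())
    return buffer.getvalue()
```

Because IDs are handed out in first-seen order, repeated usernames keep their original ID and the resulting IDs cover 0 through n-1 without gaps.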

To generate the social graph and load it as a pickle file, run the following code:

python3 utils/socialgraph.py

🏋️ Training

To run the training script, run the following code:

python3 main.py

The arguments for main.py are as follows:

 Arguments:  
  --x-size DIM          Embedding Dimension of Post
  --u-size DIM          Embedding Dimension of User 
  --g-size DIM          Output Dimension of HGCN
  --h-size DIM          Hidden Dimension of CHST
  --c C                 Curvature of Hyperbolic Space
  --batch-size  BS      Batch size
  --data-dir DIR        Directory for data
  --device DEVICE       Device
  --lr LR               Learning rate
  --dropout DROPOUT     Dropout probability
  --epochs EPOCHS       Maximum number of epochs to train for
  --weight-decay WEIGHT_DECAY
                        L2 regularization strength
  --optimizer OPTIMIZER
                        Which optimizer to use
  --patience PATIENCE   Patience for early stopping
  --save                Save computed results
  --save-dir SAVE_DIR   Path to save results
  --min-epochs MIN_EPOCHS
                        Do not early stop before min-epochs
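The options above can be mirrored with argparse as in the sketch below. This is not the parser in main.py: the argument types are inferred from the help text, no defaults are set, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags listed above (types are assumptions)."""
    parser = argparse.ArgumentParser(description="Sketch of the main.py options")
    # Dimension flags, assumed to be integers.
    for flag, text in [
        ("--x-size", "Embedding dimension of post"),
        ("--u-size", "Embedding dimension of user"),
        ("--g-size", "Output dimension of HGCN"),
        ("--h-size", "Hidden dimension of CHST"),
    ]:
        parser.add_argument(flag, type=int, metavar="DIM", help=text)
    parser.add_argument("--c", type=float, help="Curvature of hyperbolic space")
    parser.add_argument("--batch-size", type=int, help="Batch size")
    parser.add_argument("--data-dir", type=str, help="Directory for data")
    parser.add_argument("--device", type=str, help="Device")
    parser.add_argument("--lr", type=float, help="Learning rate")
    parser.add_argument("--dropout", type=float, help="Dropout probability")
    parser.add_argument("--epochs", type=int, help="Maximum number of epochs to train for")
    parser.add_argument("--weight-decay", type=float, help="L2 regularization strength")
    parser.add_argument("--optimizer", type=str, help="Which optimizer to use")
    parser.add_argument("--patience", type=int, help="Patience for early stopping")
    parser.add_argument("--save", action="store_true", help="Save computed results")
    parser.add_argument("--save-dir", type=str, help="Path to save results")
    parser.add_argument("--min-epochs", type=int, help="Do not early stop before min-epochs")
    return parser
```

For example, `build_parser().parse_args(["--x-size", "768", "--lr", "1e-3", "--save"])` yields a namespace with `x_size`, `lr`, and `save` set; the values 768 and 1e-3 are placeholders, not recommended settings.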

🪛 Bias Invariant Encoder

📑 Citation

@misc{ghosh2023cosyn,
      title={CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network}, 
      author={Sreyan Ghosh and Manan Suri and Purva Chiniya and Utkarsh Tyagi and Sonal Kumar and Dinesh Manocha},
      year={2023},
      eprint={2303.03387},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
