Code, sample data, and other supplementary material for the paper "Beyond Digital “Echo Chambers”: The Role of Viewpoint Diversity in Political Discussion", accepted at WSDM '23.
The paper can be found here:
- ACM DL: Beyond Digital “Echo Chambers”: The Role of Viewpoint Diversity in Political Discussion (https://doi.org/10.1145/3539597.3570487)
- arXiv: Beyond Digital “Echo Chambers”: The Role of Viewpoint Diversity in Political Discussion
Our WSDM 2023 talk is available on the ACM DL page.
If you use our work, please cite us:
@inproceedings{10.1145/3539597.3570487,
  author = {Hada, Rishav and Ebrahimi Fard, Amir and Shugars, Sarah and Bianchi, Federico and Rossini, Patricia and Hovy, Dirk and Tromble, Rebekah and Tintarev, Nava},
  title = {Beyond Digital "Echo Chambers": The Role of Viewpoint Diversity in Political Discussion},
  year = {2023},
  isbn = {9781450394079},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3539597.3570487},
  doi = {10.1145/3539597.3570487},
  booktitle = {Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
  pages = {33–41},
  numpages = {9},
  keywords = {conversation network, Twitter, viewpoint diversity, echo chambers},
  location = {Singapore, Singapore},
  series = {WSDM '23}
}
Each folder in this repository contains a separate README with instructions.
fragmentation_computation.py: code to compute fragmentation values. Takes the conversation network constructed in conversation_retrieval/3_conversation_reconstruction.py as input.
python fragmentation_computation.py
representation.py: code to compute representation values. Takes a list of conversations as input, where each conversation is a list of labels, one per tweet, e.g. [[L1, L2, L2, L4], ..., [L4, L3, L1, L1, L2]] (see the sketch after the usage line below).
python representation.py
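For instance, the input list can be assembled from per-tweet label predictions. A minimal sketch, assuming a hypothetical labeled_tweets.csv with one row per tweet (the file and column names are illustrative, not part of the script's interface):

import pandas as pd

# hypothetical file with one row per tweet: conversation_id, label
df = pd.read_csv("labeled_tweets.csv")

# group the tweets by conversation, keeping their original order,
# and collect the label sequence of each conversation
conversations = [group["label"].tolist() for _, group in df.groupby("conversation_id")]
# e.g. [["L1", "L2", "L2", "L4"], ..., ["L4", "L3", "L1", "L1", "L2"]]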
dyadic_interaction.py: code to compute dyadic interaction values.
To train the four classifiers (immigration relevance, immigration claim, daylight relevance, daylight claim), we use the standard HuggingFace fine-tuning interface. The model we fine-tuned is BERTweet. Note that for immigration claim prediction, we enforced dataset balancing during training. Nonetheless, all our models are trained with a weighted cross-entropy loss, which can be replicated with the following custom Trainer:
import torch
from torch import nn
from transformers import Trainer

class WeightedTrainer(Trainer):
    def __init__(self, internal_weights=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.internal_weights = internal_weights

    # **kwargs absorbs extra arguments (e.g. num_items_in_batch)
    # passed by newer transformers versions
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # the Trainer has already moved the inputs to the model's device
        labels = inputs.get("labels")
        # forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits").double()
        # compute the class-weighted cross-entropy loss
        loss_fct = nn.CrossEntropyLoss(
            weight=torch.tensor(self.internal_weights, dtype=logits.dtype, device=logits.device)
        )
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
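The trainer above also expects a model, a tokenizer, training arguments, and tokenized datasets; none of these are fixed by our code. A minimal sketch of the setup with BERTweet (the hyperparameters and the compute_metrics body are illustrative, not the ones used in the paper):

import numpy as np
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments

# BERTweet base checkpoint from the HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
model = AutoModelForSequenceClassification.from_pretrained("vinai/bertweet-base", num_labels=2)

training_args = TrainingArguments(
    output_dir="checkpoints",
    num_train_epochs=3,              # illustrative values
    per_device_train_batch_size=16,
)

def compute_metrics(eval_pred):
    # simple accuracy; replace with the metric of interest
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

# tokenized_train and tokenized_valid used below are the train and
# validation splits mapped through this tokenizer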
To create the weights for the labels, you can use scikit-learn:
from sklearn.preprocessing import LabelEncoder
from sklearn.utils.class_weight import compute_class_weight
import pandas as pd

train = pd.read_csv("train_data.csv")

# encode the string labels as consecutive integers 0..n_classes-1
le = LabelEncoder()
train["labels"] = le.fit_transform(train["labels"])

# one weight per class, inversely proportional to its frequency in the training set
class_labels_for_w = list(range(len(le.classes_)))
weights = compute_class_weight(class_weight="balanced",
                               classes=class_labels_for_w,
                               y=train["labels"].values.tolist())
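As a sanity check, class_weight="balanced" returns n_samples / (n_classes * bincount(y)) for each class. For example:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# three samples of class 0, one sample of class 1
compute_class_weight(class_weight="balanced", classes=np.array([0, 1]), y=[0, 0, 0, 1])
# -> array([0.66666667, 2.        ]), i.e. 4 / (2 * 3) and 4 / (2 * 1)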
These weights can then be passed to the WeightedTrainer:
trainer = WeightedTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_valid,
    compute_metrics=compute_metrics,
    internal_weights=weights,
)
trainer.train()
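After training, predictions for a held-out split can be obtained through the same trainer. A minimal sketch, where tokenized_test is a hypothetical tokenized test split:

import numpy as np

# logits for every example in the test split
predictions = trainer.predict(tokenized_test)
pred_labels = np.argmax(predictions.predictions, axis=-1)

# map the integer predictions back to the original label names
pred_label_names = le.inverse_transform(pred_labels)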