Request for datasets #1

Closed
lizhou21 opened this issue Feb 2, 2023 · 4 comments

Comments


lizhou21 commented Feb 2, 2023

Hi, I'm very interested in your work. Could you provide your dataset? The one I downloaded from the OSF website has no data description and looks incomplete: it contains only labels.

Looking forward to your reply.

vinid assigned vinid and unassigned vinid Feb 2, 2023
Owner

vinid commented Feb 2, 2023

Hello!

The shared data contains tweet IDs and labels. Due to Twitter guidelines we cannot share the text directly, so you need to reconstruct the tweets yourself from their IDs.
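For readers unfamiliar with reconstructing tweets from IDs, here is a minimal sketch of the batching side of that step. It is not the authors' pipeline. The names `chunked`, `rehydrate`, and `fetch_batch` are hypothetical, and `fetch_batch` stands in for whatever lookup call your own Twitter API access provides. The batch size of 100 is an assumption that you should check against the limits of the endpoint you use.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids` with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def rehydrate(
    tweet_ids: Iterable[str],
    fetch_batch: Callable[[List[str]], Dict[str, str]],
    batch_size: int = 100,  # assumed per-request limit, adjust to your endpoint
) -> Dict[str, str]:
    """Map tweet IDs to their text, one batch of IDs at a time.

    `fetch_batch` is a hypothetical callable supplied by the caller. It takes
    a list of IDs and returns {tweet_id: text} for the tweets it could
    retrieve. IDs that are no longer available are simply absent.
    """
    texts: Dict[str, str] = {}
    for batch in chunked(list(tweet_ids), batch_size):
        texts.update(fetch_batch(batch))
    return texts
```

Tweets that were deleted or made private since annotation will not come back, so the reconstructed set may be smaller than the list of IDs.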

Let me know if you have further questions!

Author

lizhou21 commented Feb 3, 2023

Thank you for your reply.
I see that there are many columns in the datasets.
Could you provide a README file describing them,
so that I don't misinterpret these columns?

Owner

vinid commented Feb 6, 2023

You are right, that is something I have to do. In the meantime, the most important columns to consider are the following:

tweet_id to reconstruct the tweet

and

'Aggregated - Incivility - Insults',
'Aggregated - Incivility - Character Assassination',
'Aggregated - Incivility - Outrage',
'Aggregated - Hostility - Hateful Speech',
'Aggregated - Hostility - Dehumanisation',
'Aggregated - Hostility - Serious Threat, Personal Abuse & Harassment',
'Aggregated - Discrimination',
'Aggregated - Hostility - Democratic Threat',

These are the aggregated annotations (the other columns mainly hold the scores of individual annotators), and they correspond to the labels described in the paper. Note that in the paper the following four labels

'Aggregated - Discrimination',
'Aggregated - Hostility - Democratic Threat',
'Aggregated - Hostility - Serious Threat, Personal Abuse & Harassment',
'Aggregated - Hostility - Hateful Speech',

are aggregated into a single label referred to as Hostility. You'll find information about this at the end of the paper.
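As a rough illustration of how these columns might be read, the sketch below loads `tweet_id` and the eight aggregated columns from a CSV and derives a combined Hostility flag. Two things here are assumptions, not facts from this thread: that each aggregated column holds a numeric value where 1 means the label applies, and that the combined label is positive when any of the four listed labels is positive. Check the paper for the actual aggregation rule. The helper names `is_positive` and `load_labels` are hypothetical.

```python
import csv
import io

# Column names as listed in this thread.
LABEL_COLUMNS = [
    "Aggregated - Incivility - Insults",
    "Aggregated - Incivility - Character Assassination",
    "Aggregated - Incivility - Outrage",
    "Aggregated - Hostility - Hateful Speech",
    "Aggregated - Hostility - Dehumanisation",
    "Aggregated - Hostility - Serious Threat, Personal Abuse & Harassment",
    "Aggregated - Discrimination",
    "Aggregated - Hostility - Democratic Threat",
]

# The four labels that the paper merges into "Hostility".
HOSTILITY_COLUMNS = [
    "Aggregated - Discrimination",
    "Aggregated - Hostility - Democratic Threat",
    "Aggregated - Hostility - Serious Threat, Personal Abuse & Harassment",
    "Aggregated - Hostility - Hateful Speech",
]


def is_positive(value):
    """ASSUMPTION: a numeric value of 1 or more means the label applies."""
    try:
        return float(value) >= 1.0
    except (TypeError, ValueError):
        return False  # empty or non-numeric cells count as "not labelled"


def load_labels(handle):
    """Read rows from an open CSV handle and add a derived Hostility flag."""
    rows = []
    for row in csv.DictReader(handle):
        labels = {col: is_positive(row[col]) for col in LABEL_COLUMNS}
        # ASSUMPTION: logical OR over the four merged labels.
        labels["Hostility"] = any(labels[col] for col in HOSTILITY_COLUMNS)
        rows.append({"tweet_id": row["tweet_id"], **labels})
    return rows
```

Keeping `tweet_id` as a string avoids precision loss, since tweet IDs can exceed what a float represents exactly.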

Let me know if you have additional questions! Happy to help!

vinid closed this as completed Feb 10, 2023
Author

Thank you very much for your help
