WeimingWen/CCRV

Cross-Lingual Cross-Platform Rumor Verification Pivoting on Multimedia Content

Overview

This repository contains the implementation of the methods from "Cross-Lingual Cross-Platform Rumor Verification Pivoting on Multimedia Content".

Library Dependencies

  • Python 2.7
  • PyTorch
  • scikit-learn
  • Theano
  • Keras (with Theano backend)
  • Pandas
  • ...

Data

The three sub-datasets of our CCMR dataset are saved in the folder CCMR as three JSON files (each a list of JSON objects): "CCMR/CCMR_Twitter.txt", "CCMR/CCMR_Google.txt" and "CCMR/CCMR_Baidu.txt".

For CCMR Twitter, each tweet is saved as a JSON object with keys "tweet_id", "content", "image_id", "event", and "timestamp". For CCMR Google and CCMR Baidu, each webpage is saved as a JSON object with keys "url", "title", "image_id", and "event". The values of "image_id" are lists of image or video names from the VMU 2015 dataset. All of those image files and video URLs are available in "images.zip".
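
The record layout described above can be loaded and sanity-checked with a few lines of Python. This is only a minimal sketch, not code from the repository: it assumes each file is a single UTF-8 encoded JSON array, and the helper names `load_ccmr`, `missing_keys` and `index_by_image` are hypothetical. Only the key names come from the description above.

```python
import io
import json

# Key sets taken from the dataset description above.
TWEET_KEYS = {"tweet_id", "content", "image_id", "event", "timestamp"}
WEBPAGE_KEYS = {"url", "title", "image_id", "event"}


def load_ccmr(path):
    """Load one CCMR sub-dataset.

    Assumption: the file holds one JSON array of objects, encoded as UTF-8.
    """
    with io.open(path, encoding="utf-8") as f:
        return json.load(f)


def missing_keys(records, expected_keys):
    """Return (index, missing key names) for every record lacking a key."""
    problems = []
    for i, record in enumerate(records):
        missing = sorted(set(expected_keys) - set(record))
        if missing:
            problems.append((i, missing))
    return problems


def index_by_image(records):
    """Group records by each image or video name listed under "image_id"."""
    index = {}
    for record in records:
        for name in record["image_id"]:
            index.setdefault(name, []).append(record)
    return index
```

For example, `index_by_image(load_ccmr("CCMR/CCMR_Twitter.txt"))` would map each image or video name to the tweets that mention it. Building the same index for the Google and Baidu files gives one way to line up tweets and webpages that share multimedia content.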

Procedure

  1. To reproduce the experimental results, simply run main.py.

  2. Download the parallel English and Mandarin sentences for news and microblogs from UM-Corpus and save them in a folder named 'UM_Corpus'.

  3. Run prepare_UM_Corpus.py to split and tokenize the data in UM-Corpus.

  4. Run train_multilingual_embedding.py to train the multilingual sentence embedding.

  5. Run prepare_FNC_split.py to tokenize, embed and split the data from Fake News Challenge.

  6. Run train_agreement_classifier.py to train the agreement classifier.

  7. Run prepare_CCMR.py to tokenize the CCMR dataset.

  8. Run extract_clcp_feats.py to extract all cross-lingual cross-platform features and the data splits needed for the experiments. The available output is saved in CLCP.

  9. Play with main.py and the other scripts to test everything from the paper.
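
The script order in steps 3 to 8, followed by main.py, can be captured in a small driver. This is a hedged sketch rather than part of the repository: it assumes every script runs without command-line arguments from the repository root and that UM-Corpus has already been downloaded as in step 2. The helpers `build_commands` and `run_pipeline` are hypothetical names.

```python
import subprocess
import sys

# Script names and order as listed in the procedure above.
PIPELINE = [
    "prepare_UM_Corpus.py",
    "train_multilingual_embedding.py",
    "prepare_FNC_split.py",
    "train_agreement_classifier.py",
    "prepare_CCMR.py",
    "extract_clcp_feats.py",
    "main.py",
]


def build_commands(python=sys.executable, scripts=PIPELINE):
    """Build one command per script.

    Assumption: scripts take no arguments. The repository targets
    Python 2.7, so `python` may need to point at a 2.7 interpreter.
    """
    return [[python, script] for script in scripts]


def run_pipeline(commands, runner=subprocess.check_call):
    """Run the commands in order; check_call raises on a non-zero exit,
    so the first failing script stops the remaining steps."""
    for command in commands:
        print("Running: " + " ".join(command))
        runner(command)
```

Calling `run_pipeline(build_commands("python2.7"))` would then execute the steps in sequence. Passing a different `runner` makes it possible to inspect the commands without launching anything.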
