LCS2-IIITD/MOMENTA

MOMENTA

This is the repo for "MOMENTA: A Multimodal Framework for Detecting Harmful Memes and Their Targets" accepted at Findings of EMNLP '21.

Setting up dependencies

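# CUDA_version must already hold the installed CUDA release as a string, e.g. "10.2".
# The suffix selects the matching prebuilt torch/torchvision wheels.
if CUDA_version == "10.0":
    torch_version_suffix = "+cu100"
elif CUDA_version == "10.1":
    torch_version_suffix = "+cu101"
elif CUDA_version == "10.2":
    torch_version_suffix = ""  # the default wheels target CUDA 10.2
else:
    torch_version_suffix = "+cu110"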

For installing CLIP

! pip3 install torch==1.7.1{torch_version_suffix} torchvision==0.8.2{torch_version_suffix} -f https://download.pytorch.org/whl/torch_stable.html ftfy regex --user
! wget https://openaipublic.azureedge.net/clip/bpe_simple_vocab_16e6.txt.gz -O bpe_simple_vocab_16e6.txt.gz
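The suffix selection and the install command above can also be combined in a small helper. This is only a sketch: the function names `torch_suffix` and `pip_install_command` are hypothetical and not part of this repo. The version pins and the fallback to `+cu110` are copied from the snippets above.

```python
def torch_suffix(cuda_version: str) -> str:
    """Return the wheel suffix for a CUDA release (hypothetical helper)."""
    suffixes = {"10.0": "+cu100", "10.1": "+cu101", "10.2": ""}
    # Any other version falls back to the CUDA 11.0 build, as in the snippet above.
    return suffixes.get(cuda_version, "+cu110")


def pip_install_command(cuda_version: str) -> str:
    """Build the pip command shown above for the given CUDA release."""
    suffix = torch_suffix(cuda_version)
    return (
        f"pip3 install torch==1.7.1{suffix} torchvision==0.8.2{suffix} "
        "-f https://download.pytorch.org/whl/torch_stable.html ftfy regex --user"
    )


if __name__ == "__main__":
    # Print the command for CUDA 10.1; run it in a shell or a notebook cell.
    print(pip_install_command("10.1"))
```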

For Sentence-Transformers: follow the installation steps at https://github.com/UKPLab/sentence-transformers

Instructions

The .py file contains the complete set of steps, which are meant to be run in sequence.

  1. It contains code for loading pre-saved ROI and entity features, which can be used if they are available.
  2. Otherwise, code for extracting the features on demand is also included.
  3. Initializing the dataset and data loader for PyTorch: load the dataset for training and testing as required by the run.
  4. Experimental settings:
    Configurations for the binary/multi-class setting (training/testing/evaluation) have to be chosen as required by the run; the corresponding code blocks are provided and suitably commented out.
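The "load pre-saved features if available, otherwise extract on demand" pattern from steps 1 and 2 can be sketched as follows. This is an illustration only, not the code in the .py file: the helper name `load_or_extract`, the pickle format, and the cache path are all assumptions.

```python
import os
import pickle
from typing import Any, Callable


def load_or_extract(cache_path: str, extract_fn: Callable[[], Any]) -> Any:
    """Load features from cache_path if present, else compute and save them.

    Hypothetical helper; the pickle format is an assumption for this sketch.
    """
    if os.path.exists(cache_path):
        # Pre-saved features are available: load them and skip extraction.
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    # No saved features: extract on demand and cache them for the next run.
    features = extract_fn()
    with open(cache_path, "wb") as fh:
        pickle.dump(features, fh)
    return features
```

A call would pass the path of the saved feature file and a zero-argument function that runs the (possibly slow) extraction, so that extraction happens at most once per cache file.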

Dataset, Features and Meta-info:

Please note: TWO versions of the Harm-P data for "Harmfulness" are provided as part of this repo: HarMeme-V0 (has duplicates in Harm-P) and HarMeme-V1 (completed set for Harm-P). We recommend HarMeme-V1 as the updated and correct version of the "Harmfulness" data for the US Politics category. Both V0 and V1 contain the original, ready-to-use data for Harm-C, which covers the Covid-19 category. The "Target" data for both categories is available through the HarMeme-V0 link given below.

  1. HarMeme Images
  2. HarMeme-V0: CAUTION! OBSOLETE FOR HARM-P "Harmfulness" - Contains duplicates in Harm-P. See the upgraded version (V1) below for the deduplicated version of Harm-P (Harmfulness) data. HarMeme-V0 content (including Target data) can be accessed via the following links:
  3. HarMeme-V1: Updated + Complete Version (for "Harmfulness"). For additional details about HarMeme-V1, refer to the README in the "HarMeme_V1" folder of this repo. Contents of "HarMeme_V1":
    • Annotations (Same format as V0: [id, image, labels, text]) - Duplicates Removed.
    • Meta-info (Collected using GCV API): Meme id, OCR Text, Web Entities, Best labels, Titles, Objects, ROI Info.
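As a rough sketch of working with the annotation format [id, image, labels, text], the snippet below parses records and flags repeated image names. It assumes the annotations are stored as JSON Lines (one JSON object per line), which should be checked against the actual files. The helper names are hypothetical, and "same image name" is just one simple notion of a duplicate, not necessarily the criterion used to build V1.

```python
import json
from typing import Dict, Iterable, List

# Field names taken from the annotation format listed above.
REQUIRED_FIELDS = ("id", "image", "labels", "text")


def parse_annotations(lines: Iterable[str]) -> List[Dict]:
    """Parse JSON Lines annotations (assumed format) and check the fields."""
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        missing = [f for f in REQUIRED_FIELDS if f not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        records.append(record)
    return records


def find_duplicate_images(records: List[Dict]) -> List[str]:
    """Return image names that occur more than once, in order of reappearance."""
    seen, duplicates = set(), []
    for record in records:
        if record["image"] in seen:
            duplicates.append(record["image"])
        seen.add(record["image"])
    return duplicates
```

With a file on disk, `parse_annotations(open(path, encoding="utf-8"))` would yield the records, where `path` points to one of the annotation files.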

Acknowledgement: Thanks to mingshanhee and uprihtness for pointing out the discrepancies.
