hateful-meme-detection

9th place in the Online Safety Prize Challenge (AI Singapore): Low-Resource Detection of Harmful Memes with Social Bias.

Resources

  1. Official site of OSPC
  2. Submission Guide for OSPC

Reference Repos

Research Papers

Each entry below gives the model name, a short description, and links to the paper and Git repo.

NA
  1. Used VisualBERT.

Paper | Git Repo

MOMENTA
Uses online Google Vision APIs for OCR, object detection, and attribute detection (a minimal OCR sketch follows the links below).

Paper | Git Repo
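For reference, a minimal sketch (not taken from the MOMENTA repo) of how the Google Cloud Vision client can be used for the OCR step; the `google-cloud-vision` package and credential setup are assumed:

```python
# Minimal sketch (assumed setup): OCR on a meme image with the Google Cloud
# Vision API, the kind of call MOMENTA relies on for text extraction.
# Requires `pip install google-cloud-vision` and GOOGLE_APPLICATION_CREDENTIALS.
from google.cloud import vision

def ocr_meme(image_path: str) -> str:
    """Return all text detected in the image (empty string if none)."""
    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    annotations = response.text_annotations
    # The first annotation holds the full concatenated text block.
    return annotations[0].description if annotations else ""

# Object and label detection use the same client, e.g.:
#   client.object_localization(image=image)
#   client.label_detection(image=image)
```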

PromptHate
  1. Extracts image text using EasyOCR (see the OCR sketch after the links below).
  2. In-painting then removes the extracted text from the image using MMEditing.
  3. Generates an image caption using ClipCap (a pre-trained model that works well on low-resolution images).
  4. Uses the Google Vision web-entity detection API and FairFace (a pre-trained model that extracts demographic information about people in the image).
  5. Finally, the image caption and image text are passed through a RoBERTa model, which produces the final prediction via MLM prompting.

Paper | Git Repo
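A minimal sketch of the EasyOCR text-extraction step (step 1 above); the image path and language list are assumptions, and the in-painting/captioning stages are only indicated in comments:

```python
# Minimal sketch (assumed paths): the text-extraction step of a PromptHate-style
# pipeline using EasyOCR. In-painting (MMEditing) and captioning (ClipCap)
# would consume the text/boxes produced here.
import easyocr

reader = easyocr.Reader(["en"], gpu=False)  # build once, reuse for all images

def extract_meme_text(image_path: str) -> str:
    # detail=0 returns just the recognized strings, in reading order.
    pieces = reader.readtext(image_path, detail=0)
    return " ".join(pieces)

print(extract_meme_text("memes/example.png"))  # hypothetical path
```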

Hate-CLIPper
  1. Image $i$ and text $t$ are passed through the CLIP image and text encoders to obtain unimodal features $f_i$ and $f_t$.
  2. To align the image and text feature spaces, $f_i$ and $f_t$ are passed through trainable projection layers.
  3. This yields $p_i$ and $p_t$, both of dimensionality $n$.
  4. A feature interaction matrix (FIM) is computed as the outer product of $p_i$ and $p_t$, i.e., $\text{FIM} = p_i \otimes p_t$.
  5. Three fusion options are then possible:
    • Concat: concatenate $p_i$ and $p_t$ into a vector of dimension $2n$.
    • Cross-fusion: flatten the FIM into a vector of dimension $n^2$.
    • Align-fusion: take the diagonal of the FIM, a vector of dimension $n$.
  6. The fused vector is passed through an FFN to obtain the final classification (a PyTorch sketch of this head follows the links below).

Hate-CLIPper does not use additional input features such as object bounding boxes, face detection, or text attributes.

Paper | Git Repo
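A minimal PyTorch sketch of the projection, FIM, and fusion steps described above (not the official Hate-CLIPper code; the projection dimension $n$ and the FFN sizes are assumptions):

```python
# Minimal sketch of a Hate-CLIPper-style fusion head. CLIP features f_i, f_t
# are assumed to be precomputed; dims (clip_dim, n, hidden) are placeholders.
import torch
import torch.nn as nn

class HateClipperHead(nn.Module):
    def __init__(self, clip_dim: int = 768, n: int = 256,
                 fusion: str = "cross"):           # "concat" | "cross" | "align"
        super().__init__()
        self.fusion = fusion
        self.proj_i = nn.Linear(clip_dim, n)        # trainable image projection
        self.proj_t = nn.Linear(clip_dim, n)        # trainable text projection
        in_dim = {"concat": 2 * n, "cross": n * n, "align": n}[fusion]
        self.ffn = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(),
                                 nn.Linear(512, 2))  # hateful / not-hateful logits

    def forward(self, f_i: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        p_i, p_t = self.proj_i(f_i), self.proj_t(f_t)         # (B, n) each
        if self.fusion == "concat":
            fused = torch.cat([p_i, p_t], dim=-1)             # (B, 2n)
        elif self.fusion == "cross":
            fim = torch.einsum("bi,bj->bij", p_i, p_t)        # outer-product FIM
            fused = fim.flatten(1)                            # (B, n^2)
        else:  # "align": diagonal of the FIM = element-wise product
            fused = p_i * p_t                                 # (B, n)
        return self.ffn(fused)
```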

Datasets

All the datasets for OSPC AI Singapore can be found here. The collection contains the following datasets:

  1. Facebook harmful meme detection challenge dataset
  2. Total Defence Memes - Singapore
  3. Palash Sir's dataset (RMMHS)
  4. Propaganda Meme Dataset

Notes

  1. Tesseract OCR (tessdata_best): takes around 1 hr 30 mins (~2.9 s/it) for 1800 images. Quite slow!

  2. Tesseract OCR (tessdata): takes around 1 hr 10 mins (~2.3 s/it). Faster than tessdata_best.

  3. In the above two cases, turbo boost was on. With turbo boost off, tessdata_best was run on 272 images: with multiprocessing.Pool(4) it took 8 mins 33 secs, with a simple for-loop it takes >20 mins, and with multiprocessing.Pool(3) it took 9 mins 06 secs (a minimal sketch of this setup is at the end of these notes).

  4. Scaling the above timings to 1800 images, multiprocessing.Pool(4) would take only around 1 hr.

  5. CLIP can handle images of size 224x224 up to 336x336, depending on the model variant.

  6. Using HateCLIPper, the (auroc, acc) obtained on the fb-meme validation set are listed below (how these metrics are computed is sketched at the end of these notes):

    • run_4_easyocr : (0.729, 0.614)
    • run-3-easyocr : (0.739, 0.634)
    • run-2-easyocr : (0.743, 0.646)
    • run-1-easyocr : (0.740, 0.632)
    • run-10 : (0.70, 0.656)
    • run-9 : (0.733, 0.634)
    • run-8 : (0.5, 0.5)
    • run-7 : (0.737, 0.642)
    • run-6 : (0.7408, 0.666)
  7. Using HateCLIPper, the (auroc, acc) obtained on RMMHS data are:

    • run_4_easyocr : (0.815, 0.68)
    • run-3-easyocr : (0.843, 0.67)
    • run-2-easyocr : (0.865, 0.75)
    • run-1-easyocr : (0.87, 0.73)
    • run-10 : (0.888, 0.789)
    • run-9 : (0.854, 0.68)
    • run-8 : (0.5, 0.45)
    • run-7 : (0.847, 0.789)
    • run-6 : (0.872, 0.835)
  8. Based on the above data and the charts from wandb, I decided to go with run-1-easyocr (run-9 was the second-best contender).

  9. Running this model on the translated val-set of fb-meme data, the (auroc, acc) obtained was (0.7456, 0.632).

  10. Running this model on a random sample of 500 examples from the translated val-set of fb-meme data, the (auroc, acc) obtained was (0.761, 0.722).
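For reference, a minimal sketch of the Tesseract + multiprocessing.Pool(4) setup timed in notes 1-4; the image folder, language, and tessdata location are assumptions:

```python
# Minimal sketch: pytesseract over a folder of memes with multiprocessing.Pool(4).
# To compare tessdata vs tessdata_best, pass e.g.
#   config="--tessdata-dir /path/to/tessdata_best"
# to image_to_string. Paths below are placeholders.
import glob
import time
from multiprocessing import Pool

import pytesseract
from PIL import Image

def ocr_one(path: str) -> str:
    return pytesseract.image_to_string(Image.open(path), lang="eng")

if __name__ == "__main__":
    paths = sorted(glob.glob("memes/*.png"))   # hypothetical image folder
    start = time.time()
    with Pool(processes=4) as pool:            # Pool(4) as in note 3
        texts = pool.map(ocr_one, paths)
    print(f"{len(paths)} images OCR'd in {time.time() - start:.1f}s")
```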
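And a minimal sketch of how the (auroc, acc) pairs in notes 6-10 can be computed from model scores, using scikit-learn with placeholder labels and scores:

```python
# Minimal sketch (placeholder data): AUROC and accuracy from predicted
# probabilities of the "harmful" class, thresholded at 0.5 for accuracy.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = np.array([0, 1, 1, 0, 1])              # ground-truth labels (placeholder)
y_score = np.array([0.2, 0.8, 0.6, 0.4, 0.3])   # predicted P(harmful) (placeholder)

auroc = roc_auc_score(y_true, y_score)
acc = accuracy_score(y_true, (y_score >= 0.5).astype(int))
print(f"(auroc, acc) = ({auroc:.3f}, {acc:.3f})")
```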
