# Input Image Pipeline

This notebook collects the commands used to gather images from the internet with the scripts in the `Input Image Pipeline` directory.

Import necessary modules.

In [2]:
import os
import sys
from extract_frame import filter_faces

## Download Images from the Internet

In [None]:
# !pip install git+https://github.com/ostrolucky/Bulk-Bing-Image-downloader

Download images from the internet with a search query. Refer to `bing_scraper_better.ipynb` for detailed instructions.

In [8]:
download_dir = "../images/input_images/food/Myanmar"
!bbid.py "Myanmar People eating Burmese Traditional Food" -o $download_dir --limit 100

{'search_string': ['Myanmar People eating Burmese Traditional Food'], 'search_file': False, 'output': '../images/input_images/food/Myanmar', 'adult_filter_off': False, 'filters': '', 'limit': 100, 'threads': 20}
 OK : Burmese-Street-Food-Featured-Image.jpg
 OK : traditional-burmese-food-in-bagan-my.jpg
 OK : people-having-meal-in-traditional-bu.jpg
 OK : 8d03059ccb26f47372092fccec94d3bb.jpg
 OK : Myanmar-Traditional-Dishes-To-Try-Na.jpg
 OK : 5511189957_d050fae898_o.jpg
 OK : alimentos-fritos-myanmar-678x500.jpg
 OK : Myanmar-Traditional-Dishes-To-Try-Ma.jpg
 OK : b979662260ef514033b369b4a90e3bac.jpg
SKIP: Already checked url, skipping
 OK : 5b9e5d712100003000c5e4f8.jpg
FAIL: yangon-myanmar-people-eating-burmese HTTP Error 400: Bad Request
 OK : 6d67a25123cd750e618e8255caf313a3.jpg
 OK : a086328fa9a43bf5467dfa51a6662024.jpg
 OK : e775afebea718e4f74cc40ce2b271bed08e7.jpg
 OK : burmese-local-food-set-lunch-meal-se.jpg
 OK : CIMG8047.jpg
 OK : slide_378046_4456822_free.jpg
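The log above mixes `OK`, `SKIP`, and `FAIL` lines, so the number of files that actually reach disk can be lower than `--limit`. A minimal sketch for tallying what was saved, assuming the downloader writes files directly into `download_dir`; the helper name `count_images` is hypothetical and not part of the pipeline scripts.

```python
from collections import Counter
from pathlib import Path


def count_images(directory):
    """Return a Counter of lower-cased file extensions found directly in `directory`."""
    folder = Path(directory)
    if not folder.is_dir():
        # A missing folder simply means nothing was downloaded yet.
        return Counter()
    return Counter(p.suffix.lower() for p in folder.iterdir() if p.is_file())


# Example use in the notebook:
# counts = count_images(download_dir)
# print(sum(counts.values()), "files:", dict(counts))
```

Subdirectories are ignored, so the per-food folders created later in this notebook do not inflate the count.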

Filter out images without faces.

In [9]:
filter_faces(download_dir)
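`filter_faces` is imported from `extract_frame.py`, and its implementation is not shown in this notebook. As a rough sketch of the general idea only, the hypothetical helper below removes every image for which a caller-supplied predicate reports no face. The `has_face` argument stands in for a real detector, and the `dry_run` flag and the extension list are assumptions made for illustration.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed set of image types


def remove_images_without_faces(directory, has_face, dry_run=True):
    """Return the image paths rejected by `has_face`; delete them unless dry_run is True."""
    rejected = []
    for path in sorted(Path(directory).iterdir()):
        # Skip subdirectories and anything that is not an image file.
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        if not has_face(path):
            rejected.append(path)
            if not dry_run:
                path.unlink()
    return rejected
```

Running once with `dry_run=True` shows which files would be deleted before anything is removed, which is useful because the deletion cannot be undone.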

## Extract Frames from YouTube

In [None]:
# !pip install yt-dlp ffmpeg-python python-dotenv

Search YouTube for videos relevant to a query and extract the frames that match a text prompt. Run `!python3 extract_frame.py -h` to list the command's options.

> If you don't pass a YouTube Data API key with the `-ya` flag, the script looks for it in the `.env` file, which is deliberately not committed to our GitHub repo so that the key stays private.
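The lookup order described above (explicit argument first, then `.env`) can be sketched as follows. This is an illustration using only the standard library, not the code in `extract_frame.py`, which may rely on `python-dotenv` instead. The variable name `YOUTUBE_API_KEY`, the function name `resolve_api_key`, and the extra check of the process environment are all assumptions.

```python
import os
from pathlib import Path


def resolve_api_key(cli_value=None, env_path=".env", name="YOUTUBE_API_KEY"):
    """Return the key from the CLI value, the environment, or a .env file, in that order."""
    if cli_value:
        return cli_value
    if os.environ.get(name):
        return os.environ[name]
    env_file = Path(env_path)
    if env_file.is_file():
        for line in env_file.read_text().splitlines():
            line = line.strip()
            # Ignore blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == name:
                return value.strip().strip("'\"")
    return None
```

Returning `None` rather than raising lets the caller decide how to report a missing key.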

In [None]:
# list of traditional food
foods = ['Laphet Thoke', 'Mohinga']
for food in foods:
    food = food.replace(' ', '_')
    extract_dir = f'../images/input_images/food/Myanmar/{food}'
    os.makedirs(extract_dir, exist_ok=True)
    !python3 extract_frame.py -s 'Burmese Mukbang' -f $food -o $extract_dir
    # To pass the key explicitly instead (never commit a real key to the repo):
    # !python3 extract_frame.py -s 'Burmese Mukbang' -f $food -ya '<YOUR_API_KEY>' -o $extract_dir
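The `!` shell magic interpolates `$food` and `$extract_dir` without quoting, which works here only because spaces are replaced with underscores first. A sketch of the same call through `subprocess`, which passes each argument as its own list element and so avoids shell quoting entirely. Only the `-s`, `-f`, and `-o` flags used above are relied on; the helper names `build_extract_command` and `run_extract` are hypothetical.

```python
import subprocess
import sys


def build_extract_command(search, food, out_dir, script="extract_frame.py"):
    """Build the argument list for one extract_frame.py run."""
    return [sys.executable, script, "-s", search, "-f", food, "-o", out_dir]


def run_extract(search, food, out_dir):
    """Run the script and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_extract_command(search, food, out_dir), check=True)


# Example use in the notebook:
# run_extract("Burmese Mukbang", "Laphet_Thoke", "../images/input_images/food/Myanmar/Laphet_Thoke")
```

Using `sys.executable` runs the script with the same interpreter as the notebook kernel, so packages installed from the notebook are visible to it.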

Filter out images without faces.

In [None]:
for food in foods:
    food = food.replace(' ', '_')
    # Filter the same directories the frames were extracted into above
    filter_faces(f'../images/input_images/food/Myanmar/{food}')

## Mask Images

In [None]:
!{sys.executable} -m pip install opencv-python
!{sys.executable} -m pip install 'git+https://github.com/facebookresearch/segment-anything.git'
!{sys.executable} -m pip install ipympl

sam_checkpoint = "sam_vit_h_4b8939.pth"
face_detection_model = "face_detection_yunet_2023mar.onnx"

os.makedirs('../models', exist_ok=True)
os.makedirs('../images/input_images', exist_ok=True)
os.makedirs('../images/mask_images', exist_ok=True)

# download SAM
if not os.path.isfile(f'../models/{sam_checkpoint}'):
    !wget https://dl.fbaipublicfiles.com/segment_anything/$sam_checkpoint
    !mv $sam_checkpoint ../models/

# download YuNet Face Detection Model
if not os.path.isfile(f'../models/{face_detection_model}'):
    !wget https://github.com/astaileyyoung/CineFace/raw/main/research/data/face_detection_yunet_2023mar.onnx
    !mv $face_detection_model ../models/
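The `wget` plus `mv` pattern above needs both tools on the system path. A portable sketch of the same download-if-missing check using only the standard library; the helper name `ensure_file` and its injectable `fetch` parameter are assumptions added for illustration and testing.

```python
import os
from urllib.request import urlretrieve


def ensure_file(url, dest_dir, fetch=urlretrieve):
    """Download `url` into `dest_dir` unless a file of that name already exists."""
    os.makedirs(dest_dir, exist_ok=True)
    # Use the last URL path segment as the local file name.
    dest = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
    if not os.path.isfile(dest):
        fetch(url, dest)
    return dest


# Example use in the notebook:
# ensure_file("https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth", "../models")
```

Because the file is written straight to its final location, no separate move step is needed.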

Mask images for inpainting.

In [None]:
!python3 mask.py -i '../images/input_images/clothes/Myanmar' -o '../images/mask_images/clothes/Myanmar'