# Downloading Filipino Food Images

This is a helper context manager that suppresses printing from an imported function.

In [1]:
import os, sys

class HiddenPrints:
    def __enter__(self):
        self._original_stdout = sys.stdout
        sys.stdout = open(os.devnull, 'w')

    def __exit__(self, exc_type, exc_val, exc_tb):
        sys.stdout.close()
        sys.stdout = self._original_stdout
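
For comparison, the same idea can be sketched with the standard library's `contextlib.redirect_stdout`. This is only an alternative under stated assumptions, not what the rest of this notebook uses: `hidden_prints` and `noisy_add` are hypothetical names made up for the illustration.

```python
import contextlib
import os


def noisy_add(a, b):
    # Hypothetical stand-in for an imported function that prints as it works.
    print(f"adding {a} and {b}")
    return a + b


@contextlib.contextmanager
def hidden_prints():
    # Send everything written to sys.stdout to the null device until the
    # with-block ends; redirect_stdout puts the original stream back afterwards.
    with open(os.devnull, "w") as devnull, contextlib.redirect_stdout(devnull):
        yield


with hidden_prints():
    total = noisy_add(2, 3)  # the "adding 2 and 3" line is discarded

print(total)  # printing works again outside the block
```

Like `HiddenPrints`, this only silences Python-level writes to `sys.stdout`; anything written to `sys.stderr` still appears.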

In [1]:
from fastai.vision import *

This is the list of foods that we will find images for. It combines foods that I know are common in the Philippines with others taken from lists found on the internet.

In [3]:
filipino_foods = ['lumpia', 'sinigang', 'afritada', 'cassava_cake', 
                  'pancit_palabok', 'ube_milkshake', 'pork_adobo', 
                  'chicken_bistek', 'bistek_talagog', 'chicharon', 
                  'bibingka', 'pork_sisig', 'kare-kare', 
                  'halo-halo', 'lechon', 'kaldereta', 'arroz_caldo', 
                  'longganisa', 'tocino', 'leche_flan', 'filipino_spaghetti', 
                  'chicken_adobo', 'crispy_pata', 'chicken_inasal', 
                  'bulalo', 'tinola', 'tapa', 'laing', 
                  'pinakbet', 'bagnet', 'bicol_express', 'balut', 
                  'kinilaw', 'liempo', 'champorado', 'buco_pie', 
                  'turon', 'pandesal', 'taho', 'pancit_guisado', 
                  'pork_barbecue', 'ginataang_gulay',
                  'lechon_kawali']
filipino_foods = sorted(filipino_foods)
len(filipino_foods)

43

To create CSV files with a list of URLs to download, go to Google Images, type the search you want to perform (e.g. "lumpia filipino food") and scroll down until you think you have enough images. Then run this JavaScript code in the browser console:

`urls=Array.from(document.querySelectorAll('.rg_i')).map(el=> el.hasAttribute('data-src')?el.getAttribute('data-src'):el.getAttribute('data-iurl'));
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));`

A CSV file will be saved to your default download folder. Name it after the food, matching the entries in the list above (e.g. "lumpia.csv"), and move it into "data/filipino_food", relative to where this notebook is located.
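
Before downloading, it can help to confirm that every food has its CSV file and to see how many URLs each one holds. The following is a minimal sketch using only the standard library; `read_urls` and `summarize_csvs` are hypothetical helpers written for this illustration, and it assumes each file contains one URL per line.

```python
from pathlib import Path


def read_urls(csv_path):
    # One URL per line; drop blank lines and surrounding whitespace.
    text = Path(csv_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def summarize_csvs(folder, names):
    # Map each name to its URL count, or None when <name>.csv is missing.
    folder = Path(folder)
    summary = {}
    for name in names:
        csv_path = folder / f"{name}.csv"
        summary[name] = len(read_urls(csv_path)) if csv_path.exists() else None
    return summary


if __name__ == "__main__":
    # Example call with two of the foods; pass the full list in practice.
    report = summarize_csvs("data/filipino_food", ["lumpia", "sinigang"])
    for name, count in report.items():
        print(name, "missing" if count is None else f"{count} URLs")
```

A food reported as missing most likely means its CSV was not renamed or moved into the data folder yet.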

In [4]:
path = Path('data/filipino_food')

We download the images listed in each CSV file. Only the first 100 are downloaded even when more are available: the classifier did not work as well with more images, because those further down the search results were of poorer quality and, in some instances, showed completely different foods from the ones searched for.

In [6]:
for food in filipino_foods:
    file = food + '.csv'
    dest = path/food
    dest.mkdir(parents=True, exist_ok=True)
    with HiddenPrints():
        download_images(path/file, dest, max_pics=100)
    print(f"{food} done")

afritada done
arroz_caldo done
bagnet done
balut done
bibingka done
bicol_express done
bistek_talagog done
buco_pie done
bulalo done
cassava_cake done
champorado done
chicharon done
chicken_adobo done
chicken_bistek done
chicken_inasal done
crispy_pata done
filipino_spaghetti done
ginataang_gulay done
halo-halo done
kaldereta done
kare-kare done
kinilaw done
laing done
leche_flan done
lechon done
lechon_kawali done
liempo done
longganisa done
lumpia done
pancit_guisado done
pancit_palabok done
pandesal done
pinakbet done
pork_adobo done
pork_barbecue done
pork_sisig done
sinigang done
taho done
tapa done
tinola done
tocino done
turon done
ube_milkshake done


We check that the images aren't "broken", deleting any that cannot be opened, and do a little resizing so that no image is larger than 500 pixels on its longest side.

In [7]:
for food in filipino_foods:
    verify_images(path/food, delete=True, max_size=500)
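
As far as I know, `verify_images` does not compare files against each other, so exact duplicates have to be caught separately if they matter. Here is a minimal sketch that hashes file contents with the standard library; `find_exact_duplicates` is a hypothetical helper, and it only detects byte-identical files, not resized or recompressed copies of the same picture.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(folder):
    # Group files by the SHA-256 of their bytes; identical bytes share a digest.
    groups = defaultdict(list)
    for file in sorted(Path(folder).iterdir()):
        if file.is_file():
            groups[hashlib.sha256(file.read_bytes()).hexdigest()].append(file)
    # Keep the first file of each group and report the rest as duplicates.
    return [extra for files in groups.values() for extra in files[1:]]
```

Calling `find_exact_duplicates(path/food)` for each food returns the paths that could then be removed with `Path.unlink()`.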

And we are done. On to the classifier model.