# Batik Image Scraping Using BeautifulSoup4

In [None]:
import requests
from bs4 import BeautifulSoup
import os

We want to save every downloaded image to a folder in Google Drive, so we first set a path that points there.
The current working directory is `/content`.

In [None]:
os.getcwd()

'/content'

We name the folder `Scrapped Images` and store all the scraped images in it. The path below assumes that Google Drive is already mounted at `/content/drive`. We then use the `os` module to make the new folder our current working directory.

In [None]:
folder_name = 'Scrapped Images'
path = os.path.join('/content/drive/MyDrive/BOOTCAMP FINAL SEASON/Scholarship/Batik Research', folder_name)
os.makedirs(path, exist_ok=True)  # create the folder; no error if it already exists
os.chdir(path)

In [None]:
os.getcwd()

'/content/drive/MyDrive/BOOTCAMP FINAL SEASON/Scholarship/Batik Research/Scrapped Images'

After setting the working directory, we can begin scraping several websites that contain Batik images. To keep things simple, we define a function that takes a website `url` as a parameter and scrapes all of the images on that page. The `web_number` parameter is a number unique to each website; it is used as a prefix in the saved file names.

In [None]:
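from urllib.parse import urljoin

def scrape_images(url, web_number):
    # Download the page and parse its HTML
    index_html = requests.get(url).text
    soup = BeautifulSoup(index_html, 'html.parser')

    # Find all <img> tags
    images = soup.find_all('img')

    # Download each image and write it to the current directory
    i = 1
    for im in images:
        src = im.get('src')
        if not src:
            continue  # skip <img> tags without a src attribute
        image_url = urljoin(url, src)  # resolve relative paths
        if not image_url.startswith(('http://', 'https://')):
            continue  # skip inline data: URIs and other schemes
        image = requests.get(image_url)
        if not image.ok:
            continue  # skip images that failed to download
        file_name = 'web' + str(web_number) + '_' + str(i) + '.jpg'
        with open(file_name, 'wb') as f:
            f.write(image.content)
        i += 1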
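
The function above saves every file with a `.jpg` extension, whatever format the site actually serves. As a rough sketch only, the file name could instead take its extension from the image URL. The helper `build_file_name` and the `KNOWN_EXTENSIONS` set are hypothetical names introduced here for illustration; they are not part of the original notebook.

```python
import os
from urllib.parse import urljoin, urlparse

# Hypothetical set of extensions we are willing to keep (an assumption).
KNOWN_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif', '.webp'}

def build_file_name(page_url, src, web_number, index):
    """Return (absolute_image_url, file_name) for one <img> src value."""
    # Resolve a relative src against the page it was found on
    image_url = urljoin(page_url, src)
    # Take the extension from the URL path only, ignoring any query string
    ext = os.path.splitext(urlparse(image_url).path)[1].lower()
    if ext not in KNOWN_EXTENSIONS:
        ext = '.jpg'  # fall back to the notebook's original choice
    return image_url, 'web{}_{}{}'.format(web_number, index, ext)
```

The URL extension is only a hint about the real format, so files named this way may still need checking during the cleaning step.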

With the function defined, we call it once for each website.

### Web 1

In [None]:
scrape_images('https://review.bukalapak.com/fashion/motif-batik-populer-1542', 1)

### Web 2

In [None]:
scrape_images('https://seruni.id/batik-indonesia/', 2)

### Web 3

In [None]:
scrape_images('https://obatrindu.com/motif-batik-tradisional-indonesia/', 3)

### Web 4

In [None]:
scrape_images('https://hidupsimpel.com/macam-macam-motif-batik-nusantara/', 4)

### Web 5

In [None]:
scrape_images('https://carakus.com/macam-motif-batik-modern/', 5)
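
The five calls above differ only in the URL and the number, so they could also be written as a loop. The sketch below is one possible way to do that; `BATIK_URLS` and `scrape_all` are hypothetical names, and the scraping function is passed in as an argument so that one failing website does not stop the others.

```python
# The five URLs used in the cells above
BATIK_URLS = [
    'https://review.bukalapak.com/fashion/motif-batik-populer-1542',
    'https://seruni.id/batik-indonesia/',
    'https://obatrindu.com/motif-batik-tradisional-indonesia/',
    'https://hidupsimpel.com/macam-macam-motif-batik-nusantara/',
    'https://carakus.com/macam-motif-batik-modern/',
]

def scrape_all(urls, scrape_fn):
    """Call scrape_fn(url, web_number) for each URL, numbering from 1.

    Returns a list of (web_number, url) pairs whose call raised an error.
    """
    failed = []
    for web_number, url in enumerate(urls, start=1):
        try:
            scrape_fn(url, web_number)
        except Exception as exc:
            # Report the problem and continue with the next website
            print('web{} failed: {}'.format(web_number, exc))
            failed.append((web_number, url))
    return failed
```

In the notebook this would be called as `failed = scrape_all(BATIK_URLS, scrape_images)`.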

#### Note:
Because `scrape_images()` uses the `find_all()` method, it downloads every image on a page, including images that are not Batik. To make sure the final set contains only Batik images, we still have to clean it up by removing the unwanted images from the folder in Google Drive.
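
As a possible first pass for that cleaning, very small files could be listed for review before anything is deleted. This is only a sketch under an assumption: file size is a crude proxy for "not a Batik photo", and the `find_small_files` helper and its `min_bytes` threshold of 10,000 bytes are hypothetical choices, not something the original notebook uses. Manual inspection is still needed.

```python
import os

def find_small_files(folder, min_bytes=10_000):
    """Return the sorted names of files in folder smaller than min_bytes."""
    small = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Only regular files are considered; subfolders are ignored
        if os.path.isfile(full_path) and os.path.getsize(full_path) < min_bytes:
            small.append(name)
    return small
```

Calling `find_small_files(os.getcwd())` from the scraped-images folder would return candidate file names to inspect; the function itself deletes nothing.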