# Image Scraping and Classification Project

### Problem Statement: 
    
Images are one of the major sources of data in data science and AI. The field extracts information from images by examining their features and details. This project gives you exposure to how an end-to-end project in this field is developed.

The idea behind this project is to build a deep learning-based image classification model on images scraped from an e-commerce portal. Training on real product images is intended to make the model more robust.

This task is divided into two phases: Data Collection and Model Building. 
    
1. Data Collection Phase: In this section, you need to scrape images from the e-commerce portal Amazon.com. The clothing categories used for scraping are:

-	Sarees (women)
-	Trousers (men)
-	Jeans (men)

You need to scrape images of these 3 categories and build your dataset from them. This data will be the input to your deep learning problem. You need to scrape a minimum of 200 images for each category. There is no upper limit on data collection. You are free to apply image augmentation techniques to increase the size of your data, but make sure the quality of the data is not compromised.

Remember that deep learning models need a large amount of data to perform well. In general, the more data, the better the results.


2. Model Building Phase: After data collection and preparation are done, you need to build an image classification model that classifies between the 3 categories mentioned above. You can experiment with optimizers and learning rates to improve your model’s performance.


## Importing Libraries

In [1]:
import selenium
import pandas as pd
import time
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, StaleElementReferenceException
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import warnings
warnings.filterwarnings("ignore")

In [2]:
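# Recent Selenium 4 releases expect the driver path in a Service object
# rather than as a positional argument to webdriver.Chrome
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service('chromedriver.exe'))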
driver.maximize_window()
url = "https://www.amazon.in/"
driver.get(url)
time.sleep(3)

In [4]:
img_url1 = []
driver.find_element(By.ID, "twotabsearchtextbox").send_keys('Sarees for Women')  # typing the search term
driver.find_element(By.ID, "nav-search-submit-button").click()  # clicking on search button
time.sleep(1)
# scrolling the web page to get more images
for j in range(0,10):
    driver.execute_script("window.scrollBy(0,1000)")
    try:
        # every pass re-reads all images on the page, so URLs repeat across passes
        imgs = driver.find_elements(By.XPATH,"//div[@class='a-section aok-relative s-image-tall-aspect']/img")
        for image in imgs:
            source = image.get_attribute('src')
            img_url1.append(source)
    except StaleElementReferenceException:
        # an element was re-rendered while being read; skip this pass
        pass
    time.sleep(5)

In [5]:
# going back to the home page; click() returns None, so nothing is assigned
driver.find_element(By.XPATH, '//a[@class="nav-logo-link nav-progressive-attribute"]').click()
time.sleep(2)

In [6]:
img_url2 = []
driver.find_element(By.ID, "twotabsearchtextbox").send_keys('Trousers for Men')
driver.find_element(By.ID, "nav-search-submit-button").click()  #clicking on search button
time.sleep(1)
# scrolling the web page to get more images
for j in range(0,10):
    driver.execute_script("window.scrollBy(0,1000)")
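    try:
        # every pass re-reads all images on the page, so URLs repeat across passes
        imgs = driver.find_elements(By.XPATH,"//div[@class='a-section aok-relative s-image-tall-aspect']/img")
        for image in imgs:
            source = image.get_attribute('src')
            img_url2.append(source)
    except StaleElementReferenceException:
        # an element was re-rendered while being read; skip this pass
        pass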
    time.sleep(5)

In [7]:
# going back to the home page; click() returns None, so nothing is assigned
driver.find_element(By.XPATH, '//a[@class="nav-logo-link nav-progressive-attribute"]').click()
time.sleep(2)

In [8]:
img_url3 = []
driver.find_element(By.ID, "twotabsearchtextbox").send_keys('Jeans for Men')  # typing the search term
driver.find_element(By.ID, "nav-search-submit-button").click()  # clicking on search button
time.sleep(1)
# scrolling the web page to get more images
for j in range(0,10):
    driver.execute_script("window.scrollBy(0,1000)")
    try:
        # every pass re-reads all images on the page, so URLs repeat across passes
        imgs = driver.find_elements(By.XPATH,"//div[@class='a-section aok-relative s-image-tall-aspect']/img")
        for image in imgs:
            source = image.get_attribute('src')
            img_url3.append(source)
    except StaleElementReferenceException:
        # an element was re-rendered while being read; skip this pass
        pass
    time.sleep(5)

In [10]:
img_url=img_url1+img_url2+img_url3
len(img_url)

2196

In [11]:
img_url

['https://m.media-amazon.com/images/I/71gcK8AHO0L._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/51tQ9Tog6nL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/9104TMMu43L._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/810fvmKzXML._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/71XAzGJXvkL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/81dkSgs7nsL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/711vlC6N6bL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/913bt-+KNEL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/919lNOaU-GL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/6154v1-RXBL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/71rjp9EB2wL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/71Qwc3PkyYL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/71Mf7Es+qiL._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/711PQqpJ-rS._AC_UL320_.jpg',
 'https://m.media-amazon.com/images/I/8146BLEzqPL._AC_UL320_.j

In [17]:
print(" Length of img_url1: ",len(img_url1),"\n Length of img_url2: ",len(img_url2),"\n Length of img_url3: ",len(img_url3))

 Length of img_url1:  796 
 Length of img_url2:  720 
 Length of img_url3:  680
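
Each pass of the scroll loop calls `find_elements` on the whole page, so the same `src` is probably appended more than once and the lengths above likely overstate the number of distinct images. A minimal sketch of order-preserving deduplication follows; the helper name `unique_urls` and the example URLs are hypothetical and not part of the original notebook.

```python
def unique_urls(urls):
    """Return urls without duplicates or empty entries, keeping first-seen order."""
    # dict keys keep insertion order (Python 3.7+), so the first occurrence wins.
    return list(dict.fromkeys(u for u in urls if u))


# Hypothetical sample standing in for img_url1; the real list comes from the scraping cell.
sample = [
    "https://example.com/a.jpg",
    "https://example.com/b.jpg",
    "https://example.com/a.jpg",
    None,  # get_attribute('src') returns None when an image has no src
]
deduped = unique_urls(sample)
print(len(sample), "->", len(deduped))
```

Applying the same helper to `img_url1`, `img_url2` and `img_url3` before slicing would show how many distinct images each category really has.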


In [18]:
Sarees=img_url1[:680]
Trousers=img_url2[:680]
Jeans=img_url3[:680]
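
The cut-off of 680 matches the shortest of the three lists, which keeps the classes balanced. As a hedged sketch, the limit could be computed rather than hard-coded; `balance` is a hypothetical helper, and the toy lists below only mimic the lengths printed above.

```python
def balance(*classes):
    """Truncate every list to the length of the shortest one."""
    limit = min(len(c) for c in classes)
    return [c[:limit] for c in classes]


# Toy stand-ins with the same lengths as img_url1, img_url2 and img_url3.
sarees, trousers, jeans = balance(list(range(796)), list(range(720)), list(range(680)))
print(len(sarees), len(trousers), len(jeans))  # 680 680 680
```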

In [19]:
# Creating directories
import os
import requests

def directory(name):
    """Create the folder `name` under the current working directory if it is missing."""
    new = os.path.join(os.getcwd(), name)
    os.makedirs(new, exist_ok=True)

directory('Sarees_Images')
directory('Trousers_Images')
directory('Jeans_Images')

In [20]:
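# Downloading images
for index, link in enumerate(Sarees):
    print('Downloading {0} of 680 saree images'.format(index+1))
    response=requests.get(link, timeout=30)  # timeout so a stalled request cannot hang the loop
    response.raise_for_status()  # stop on HTTP errors instead of saving an error page as an image
    with open('Sarees_Images/img{0}.jpeg'.format(index+1),"wb") as file:
        file.write(response.content)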

Downloading 1 of 680 saree images
Downloading 2 of 680 saree images
Downloading 3 of 680 saree images
...

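The download loop is repeated almost verbatim for each category. One possible way to factor it out is sketched below, under stated assumptions: `download_images` and `fetch_with_requests` are hypothetical names, the 10-second timeout is an arbitrary choice, and failed URLs are skipped and reported rather than stopping the run.

```python
import os


def fetch_with_requests(url, timeout=10):
    """Default fetcher: return the response body, raising on HTTP errors."""
    import requests  # imported here so the helper below can be tried with a stub fetcher

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.content


def download_images(urls, folder, fetch=fetch_with_requests):
    """Save each URL as folder/img<N>.jpeg and return the URLs that failed."""
    os.makedirs(folder, exist_ok=True)
    failed = []
    for index, url in enumerate(urls, start=1):
        try:
            data = fetch(url)
        except Exception as exc:  # broad on purpose: one bad URL should not stop the run
            print("Skipping {0}: {1}".format(url, exc))
            failed.append(url)
            continue
        with open(os.path.join(folder, "img{0}.jpeg".format(index)), "wb") as file:
            file.write(data)
    return failed
```

With this in place, a call such as `failed = download_images(Trousers, 'Trousers_Images')` would do the work of one download cell and leave a list of URLs worth retrying.
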
In [21]:
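for index, link in enumerate(Trousers):
    print('Downloading {0} of 680 trouser images'.format(index+1))
    response=requests.get(link, timeout=30)  # timeout so a stalled request cannot hang the loop
    response.raise_for_status()  # stop on HTTP errors instead of saving an error page as an image
    with open('Trousers_Images/img{0}.jpeg'.format(index+1),"wb") as file:
        file.write(response.content)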

Downloading 1 of 680 trouser images
Downloading 2 of 680 trouser images
Downloading 3 of 680 trouser images
...
Downloading 680 of 680 trouser images


In [23]:
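for index, link in enumerate(Jeans):
    print('Downloading {0} of 680 jeans images'.format(index+1))
    response=requests.get(link, timeout=30)  # timeout so a stalled request cannot hang the loop
    response.raise_for_status()  # stop on HTTP errors instead of saving an error page as an image
    with open('Jeans_Images/img{0}.jpeg'.format(index+1),"wb") as file:
        file.write(response.content)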

Downloading 1 of 680 jeans images
Downloading 2 of 680 jeans images
Downloading 3 of 680 jeans images
...

#### All images have been downloaded and saved
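
Since the data quality should not be compromised, a quick sanity check on the saved files may be worthwhile before model building. The sketch below counts the files in each folder and flags any that do not begin with the JPEG signature bytes `FF D8 FF`. It assumes the three folders created earlier contain only image files; `check_folder` and `JPEG_MAGIC` are hypothetical names.

```python
import os

JPEG_MAGIC = b"\xff\xd8\xff"  # JPEG files start with these three bytes


def check_folder(folder):
    """Return (file count, names of files that lack the JPEG signature)."""
    names = sorted(os.listdir(folder))
    bad = []
    for name in names:
        with open(os.path.join(folder, name), "rb") as file:
            if file.read(3) != JPEG_MAGIC:
                bad.append(name)
    return len(names), bad


for folder in ("Sarees_Images", "Trousers_Images", "Jeans_Images"):
    if os.path.isdir(folder):  # skip folders that are not present
        total, bad = check_folder(folder)
        print("{0}: {1} files, {2} not JPEG".format(folder, total, len(bad)))
```

Files reported as not JPEG are candidates for re-downloading or removal, for example when a request returned an error page instead of an image.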