#Kuler: Scraping XML

Using a browser's developer tools (e.g., Chrome's), I can locate the color-theme information on the <a href = "https://color.adobe.com/explore/most-popular/?time=all">Kuler</a> website by inspecting the DOM tree. With browser automation (Selenium, in my case), I can then scrape the RGB color codes of each theme.
<p>
However, the markup on the "Explore" page does not contain much information about each theme. To get more, you have to click into every theme and scrape it individually. For this reason, I prefer scraping the JSON response.
<p>
For more information, check out [this post](http://www.hongsup.com/blog/2015/6/12/data-wrangling).

####1. Open a browser (Firefox) using Selenium and go to the Kuler page

In [1]:
from selenium import webdriver
# Launch Firefox (the browser used throughout this notebook)
driver = webdriver.Firefox()
# Open the all-time most popular themes; we scrape the top n of them
driver.get("https://color.adobe.com/explore/most-popular/?time=all")

####2. Go to the bottom of the page n_reloads times to load more data

In [2]:
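import time
n_reloads = 5  # number of times to scroll to the bottom of the page
pause = 0      # seconds to wait after each scroll; increase if new themes load slowly
for _ in range(n_reloads):
    # Scrolling to the bottom triggers the page to load more themes
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(pause)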
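
With `pause = 0`, the next scroll may fire before the page has finished loading new themes. As a minimal sketch of one alternative, assuming the page grows taller whenever more themes are appended, the loop can stop as soon as `document.body.scrollHeight` stops changing. The helper name `scroll_until_stable` and its parameters are hypothetical; the only Selenium call it relies on is `driver.execute_script`.

```python
import time


def scroll_until_stable(driver, max_scrolls=5, pause=1.0):
    """Scroll to the bottom until the page height stops growing.

    Hypothetical helper. Returns the number of scrolls that loaded new content.
    """
    loaded = 0
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to fetch and render more themes
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing was appended; assume there is no more data
        last_height = new_height
        loaded += 1
    return loaded
```

It would be called as `scroll_until_stable(driver, max_scrolls=n_reloads, pause=2)`; the right pause depends on the connection and is a guess here.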

####3. Find an XML element by using its XPath address (check Chrome Inspector)
- example here is the 4th color from the 6th theme

In [3]:
# Color information is stored in "style"
driver.find_element_by_xpath('//*[@id="content"]/div/div/div[6]/div/div/div[4]').get_attribute('style')

u'background: rgb(243, 255, 226) none repeat scroll 0% 0%;'
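
The `style` attribute comes back as a full CSS declaration, so the RGB values still have to be pulled out of the string. A minimal sketch using a regular expression, assuming the value always contains an `rgb(r, g, b)` triple as in the output above; the names `RGB_PATTERN` and `parse_rgb` are hypothetical.

```python
import re

# Matches "rgb(243, 255, 226)" and captures the three channel values.
RGB_PATTERN = re.compile(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_rgb(style):
    """Return the (r, g, b) tuple found in a style string, or None."""
    match = RGB_PATTERN.search(style)
    if match is None:
        return None
    return tuple(int(channel) for channel in match.groups())


style = "background: rgb(243, 255, 226) none repeat scroll 0% 0%;"
print(parse_rgb(style))  # (243, 255, 226)
```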

####4. Loop over themes and colors to scrape RGB codes

In [4]:
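# j indexes themes (the first two here); i indexes the five colors in a theme.
# XPath positions are 1-based.
for j in range(1, 3):
    for i in range(1, 6):
        xpath = '//*[@id="content"]/div/div/div[{}]/div/div/div[{}]'.format(j, i)
        print(driver.find_element_by_xpath(xpath).get_attribute('style'))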

background: rgb(230, 226, 175) none repeat scroll 0% 0%;
background: rgb(167, 163, 126) none repeat scroll 0% 0%;
background: rgb(239, 236, 202) none repeat scroll 0% 0%;
background: rgb(4, 99, 128) none repeat scroll 0% 0%;
background: rgb(0, 47, 47) none repeat scroll 0% 0%;
background: rgb(70, 137, 102) none repeat scroll 0% 0%;
background: rgb(255, 240, 165) none repeat scroll 0% 0%;
background: rgb(255, 176, 59) none repeat scroll 0% 0%;
background: rgb(182, 73, 38) none repeat scroll 0% 0%;
background: rgb(142, 40, 0) none repeat scroll 0% 0%;
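
The printed lines are one flat list, five colors per theme. As a rough sketch of a follow-up step, the strings can be converted to hex codes and grouped back into themes. The function names `style_to_hex` and `group_into_themes` are hypothetical, and the sample data is copied from the output above; every string is assumed to contain an `rgb(...)` triple.

```python
import re

RGB_PATTERN = re.compile(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)")


def style_to_hex(style):
    """Convert the rgb(...) triple in a style string to a hex code."""
    r, g, b = (int(c) for c in RGB_PATTERN.search(style).groups())
    return "#{:02x}{:02x}{:02x}".format(r, g, b)


def group_into_themes(styles, colors_per_theme=5):
    """Split a flat list of style strings into lists of hex codes, one per theme."""
    hex_codes = [style_to_hex(s) for s in styles]
    return [hex_codes[i:i + colors_per_theme]
            for i in range(0, len(hex_codes), colors_per_theme)]


# Style strings as printed by the scraping loop (two themes, five colors each)
scraped = [
    "background: rgb(230, 226, 175) none repeat scroll 0% 0%;",
    "background: rgb(167, 163, 126) none repeat scroll 0% 0%;",
    "background: rgb(239, 236, 202) none repeat scroll 0% 0%;",
    "background: rgb(4, 99, 128) none repeat scroll 0% 0%;",
    "background: rgb(0, 47, 47) none repeat scroll 0% 0%;",
    "background: rgb(70, 137, 102) none repeat scroll 0% 0%;",
    "background: rgb(255, 240, 165) none repeat scroll 0% 0%;",
    "background: rgb(255, 176, 59) none repeat scroll 0% 0%;",
    "background: rgb(182, 73, 38) none repeat scroll 0% 0%;",
    "background: rgb(142, 40, 0) none repeat scroll 0% 0%;",
]

themes = group_into_themes(scraped)
for theme in themes:
    print(theme)
```

Grouping by a fixed size of five matches the inner loop's `range(1, 6)`; a theme with a different number of colors would need the count passed in explicitly.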
