# Selenium On Docker
This notebook does the following:
1. Spins up an external Selenium Docker container on the host.
2. Configures the remote Selenium WebDriver.
3. Sends commands to the Selenium WebDriver:
    here, downloading MP3s from the BBC's daily Wake Up to Money podcast.
4. Removes the container.

## Imports

In [8]:
import docker
import os
import time
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.chrome.options import Options

In [9]:
entrypoint = {}

## Start Selenium container

In [30]:
cwd = os.getcwd()
local_downloads = '{}/downloads'.format(cwd)
sel_downloads = '/home/seluser/downloads'
# Create the host download directory up front so the bind mount has a source.
os.makedirs(local_downloads, exist_ok=True)
client = docker.from_env()
# No command override here. Passing command=['mkdir -p {}'.format(sel_downloads)]
# raised "APIError: 400 Client Error: Bad Request (OCI runtime create failed ...
# no such file or directory)": a single list item is treated as the executable
# name, and any override would also replace the image's default start command.
container = client.containers.run('selenium/standalone-chrome',
        volumes=['{}:{}'.format(local_downloads, sel_downloads),
                 '/dev/shm:/dev/shm'],
        ports={'4444/tcp': 4444},
        network='container_bridge',
        detach=True)
# Low-level API client (not used in the cells below).
cli = docker.APIClient()
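
The `network='container_bridge'` argument assumes a Docker network with that name already exists on the host. A minimal sketch of a guard for that is shown here; `ensure_network` is a hypothetical helper (not part of the original notebook), and the `bridge` driver is an assumption. It relies only on `client.networks.list` and `client.networks.create` from the Docker SDK for Python.

```python
def ensure_network(client, name, driver="bridge"):
    """Return the Docker network called `name`, creating it if it is missing.

    `client` is expected to be a docker.DockerClient, e.g. docker.from_env().
    The helper name and the default driver are illustrative assumptions.
    """
    # The name filter can match partial names, so compare names exactly.
    for network in client.networks.list(names=[name]):
        if network.name == name:
            return network
    return client.networks.create(name, driver=driver)
```

It could be called as `ensure_network(client, 'container_bridge')` before `client.containers.run(...)`.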

## Configure webdriver

In [3]:
options = Options()
options.add_argument("--headless")
options.add_argument("--window-size=1920,1080")  # Chrome expects width,height
chrome_driver = '{}:4444/wd/hub'.format('http://127.0.0.1') # This is only required for local development

# Wait for the remote to accept sessions, giving up after a timeout.
timeout = 120  # seconds
deadline = time.time() + timeout
while True:
    try:
        driver = webdriver.Remote(
            command_executor=chrome_driver,
            desired_capabilities=DesiredCapabilities.CHROME, options=options)
        print('remote ready')
        break
    except Exception:
        # Re-raise the last error once the deadline has passed.
        if time.time() > deadline:
            raise
        print('remote not ready, sleeping for ten seconds.')
        time.sleep(10)
        
# Enable downloads in headless chrome.
driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': sel_downloads}}
command_result = driver.execute("send_command", params)

remote not ready, sleeping for ten seconds.
remote ready
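
As an alternative to retrying session creation, readiness can be checked against the server's status endpoint before any session is opened. The sketch below uses only the standard library; `is_ready` and `wait_for_remote` are hypothetical names, and it assumes the endpoint returns a JSON body of the form `{"value": {"ready": true, ...}}`, as described by the WebDriver status command.

```python
import http.client
import json
import time
import urllib.request


def is_ready(status_payload):
    """Return True if a WebDriver status payload reports the server as ready."""
    value = status_payload.get("value") if isinstance(status_payload, dict) else None
    return isinstance(value, dict) and bool(value.get("ready"))


def wait_for_remote(status_url, timeout=120, interval=2):
    """Poll `status_url` until the server reports ready or `timeout` seconds pass.

    Returns True when ready, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(status_url, timeout=5) as response:
                if is_ready(json.load(response)):
                    return True
        except (OSError, ValueError, http.client.HTTPException):
            # Connection refused/reset or a non-JSON body: server not up yet.
            pass
        time.sleep(interval)
    return False
```

With the port mapping used in this notebook, the call would presumably be `wait_for_remote('http://127.0.0.1:4444/wd/hub/status')`; the exact path is an assumption and may differ between image versions.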


## Download Wake up to Money MP3 files
The configured webdriver is used to download MP3s from the BBC podcast Wake Up to Money.

In [4]:
from selenium_scripts.wake_up_to_money import download_podcast

In [6]:
download_podcast(driver,
                 'https://www.bbc.co.uk/programmes/b0070lr5/episodes/downloads',
                 local_downloads,
                 '20191008')

file successfully downloded to /Users/harry.daniels/Documents/medium/airflow_selenium/downloads/WakeUpToMoney-20191008.mp3
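
Because the browser writes the file inside the container and it only appears in `local_downloads` through the volume mount, a caller may want to confirm the download has finished before using it. The following is a minimal sketch of such a check, not the logic inside `download_podcast` (which is not shown here). `wait_for_download` is a hypothetical helper, and it assumes Chrome marks an in-progress download with a `.crdownload` suffix on the target filename.

```python
import os
import time


def wait_for_download(directory, filename, timeout=300, interval=1):
    """Wait until `filename` exists in `directory` with no partial file beside it.

    Returns the full path on success, or None if `timeout` seconds pass first.
    """
    target = os.path.join(directory, filename)
    partial = target + ".crdownload"  # assumed suffix for in-progress downloads
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(target) and not os.path.exists(partial):
            return target
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

For the run above, this might be called as `wait_for_download(local_downloads, 'WakeUpToMoney-20191008.mp3')`.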


## Remove docker container

In [7]:
container.remove(force=True)
print('Removed container: {}'.format(container.id))

Removed container: 805815c59b0a8c161d491415ca5d064bc5d9f51fbef88a94e4047a8cbcfeee52
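
If an earlier cell fails, the removal cell may never run and the container is left behind. One way to guard against that in a script is a context manager that always force-removes the container. This is a sketch under the assumption that the object passed in exposes `remove(force=True)`, as the `container` above does; `removed_on_exit` is a hypothetical name.

```python
from contextlib import contextmanager


@contextmanager
def removed_on_exit(container):
    """Yield `container` and force-remove it afterwards, even if the body raises."""
    try:
        yield container
    finally:
        container.remove(force=True)
```

Usage would look like `with removed_on_exit(client.containers.run(..., detach=True)) as container:` wrapped around the webdriver and download steps.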
