sgreszcz/Selenium-Python-Docker-scraper

Useful Docker Selenium container image for automation

Forked from: dimmg/dockselpy

BUILD IMAGE:

Run from the directory that contains the Dockerfile:

docker build -t selenium .

Running with Fabric

Run "fab help" to list the available tasks:

  • fab run
    • Runs a container, mounting ./code and local /etc/hosts, runs ./code/main.py and removes the container when done
  • fab test
    • Runs a container, mounting ./code and local /etc/hosts, runs ./code/test.py and removes the container when done

RUN CONTAINER:

Run a throwaway container from the selenium image, mounting ./code at /code and the host's /etc/hosts file, and execute /code/main.py with Python:

docker run --rm -v "$(pwd)/code":/code -v /etc/hosts:/etc/hosts -it selenium python /code/main.py
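The command above can also be assembled from Python, which makes each flag explicit and lets the same helper cover both main.py and test.py. This is only a sketch: the function name build_run_command and its parameters are hypothetical and are not taken from the repository's fabfile.

```python
import os
import shlex


def build_run_command(script="main.py", image="selenium", code_dir=None):
    """Return the docker argv that runs /code/<script> in a throwaway container.

    Hypothetical helper for illustration; not part of this repository.
    """
    if code_dir is None:
        code_dir = os.path.join(os.getcwd(), "code")
    return [
        "docker", "run",
        "--rm",                         # remove the container when it exits
        "-v", code_dir + ":/code",      # mount the Python code at /code
        "-v", "/etc/hosts:/etc/hosts",  # reuse the host's /etc/hosts entries
        "-it",                          # interactive terminal
        image,                          # image name given to docker build -t
        "python", "/code/" + script,
    ]


command = build_run_command("test.py", code_dir="/work/code")
# Print a copy-pasteable shell form of the command.
print(" ".join(shlex.quote(part) for part in command))
```

Passing the list to subprocess.run(command, check=True) would start the container without going through a shell.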

EXAMPLE OF CODE WITH SELENIUM AND FIREFOX:

from pyvirtualdisplay import Display
from selenium import webdriver

display = Display(visible=0, size=(800, 600))
display.start()

# now Firefox will run in a virtual display. 
# you will not see the browser.
browser = webdriver.Firefox()
browser.get('http://www.google.com')
print(browser.title)
browser.quit()

display.stop()
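In the example above, browser.quit() and display.stop() are skipped if browser.get() raises, which can leave a browser or Xvfb process running in the container. One possible way to guard against that is a small context manager. The name browser_session and its two factory parameters are assumptions made for this sketch; they are not part of Selenium or pyvirtualdisplay.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_display, make_browser):
    """Start a display and a browser, and always shut both down.

    make_display and make_browser are zero-argument callables (hypothetical
    parameters for this sketch) that return objects with start()/stop() and
    quit() methods respectively.
    """
    display = make_display()
    display.start()
    try:
        browser = make_browser()
        try:
            yield browser
        finally:
            browser.quit()   # runs even if the with-body raised
    finally:
        display.stop()       # runs even if the browser failed to start


# Possible usage inside the container (not executed here):
#
#   from pyvirtualdisplay import Display
#   from selenium import webdriver
#
#   with browser_session(lambda: Display(visible=0, size=(800, 600)),
#                        webdriver.Firefox) as browser:
#       browser.get('http://www.google.com')
#       print(browser.title)
```

Taking factories instead of importing Selenium directly keeps the helper usable with Firefox, Chrome, or a stand-in object in tests.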

EXAMPLE OF CODE WITH GOOGLE CHROME HEADLESS:

from pyvirtualdisplay import Display
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

display = Display(visible=0, size=(1200, 800))
display.start()
# start Chrome
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
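chrome_options.add_argument("--window-size=1200,800")  # Chrome expects width,height
# Newer Selenium releases take options=chrome_options; chrome_options= is deprecated there.
browser = webdriver.Chrome(chrome_options=chrome_options)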
browser.get('http://google.com/')
print(browser.title)
browser.quit()
display.stop()
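The Chrome flags in the example can be built in one place so that the window size stays consistent with the virtual display. The sketch below is an illustration only: chrome_arguments is a hypothetical helper, and it uses the comma-separated width,height form that Chrome's --window-size switch takes.

```python
def chrome_arguments(width=1200, height=800, headless=True, sandbox=False):
    """Return the list of Chrome command-line flags used in the example.

    Hypothetical helper for illustration; not part of Selenium.
    """
    args = []
    if headless:
        args.append("--headless")    # run Chrome without a visible window
    if not sandbox:
        args.append("--no-sandbox")  # commonly needed when Chrome runs as root in a container
    args.append("--window-size={},{}".format(width, height))
    return args


flags = chrome_arguments()
print(flags)
```

Each returned flag would then be passed to chrome_options.add_argument(...) before creating the driver.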

Example with PhantomJS

import time

from selenium import webdriver

browser = webdriver.PhantomJS("phantomjs")
browser.get("https://twitter.com/StackStatus")
print(browser.title)

pause = 3  # seconds to wait for new content after each scroll

lastHeight = browser.execute_script("return document.body.scrollHeight")
print(lastHeight)
i = 0
browser.save_screenshot("/code/screenshots/test03_1_" + str(i) + ".png")
# keep scrolling to the bottom until the page height stops growing
while True:
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(pause)
    newHeight = browser.execute_script("return document.body.scrollHeight")
    print(newHeight)
    if newHeight == lastHeight:
        break
    lastHeight = newHeight
    i += 1
    # one screenshot per newly loaded stretch of the page
    browser.save_screenshot("/code/screenshots/test03_1_" + str(i) + ".png")

browser.quit()
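The scroll loop above runs until the page height stops changing, so a feed that keeps loading content would never terminate. A variant with an upper bound might look like the sketch below. The function name scroll_to_bottom and the max_scrolls and on_scroll parameters are assumptions for illustration; the only driver method it relies on is execute_script, as in the example.

```python
import time


def scroll_to_bottom(browser, pause=3, max_scrolls=50, on_scroll=None):
    """Scroll until the page height stops growing or max_scrolls is reached.

    Returns the number of scrolls after which the page had grown.
    on_scroll, if given, is called with that running count, for example
    to save a screenshot.
    """
    last_height = browser.execute_script("return document.body.scrollHeight")
    grown = 0
    for _ in range(max_scrolls):
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load more content
        new_height = browser.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break          # nothing new was loaded, so this is the bottom
        last_height = new_height
        grown += 1
        if on_scroll is not None:
            on_scroll(grown)
    return grown
```

With the browser from the example, on_scroll could be a function that calls browser.save_screenshot with the count in the file name.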
