Render a page with Chrome

Niko Carpenter edited this page Feb 1, 2024 · 2 revisions

Render pages with Chrome

When Edbrowse cannot render a page, you can ask Chrome to render it and send the result back to Edbrowse. To accomplish this, we use Selenium to have Chrome load the page, wait until the page has finished loading or a maximum timeout has elapsed, whichever comes first, and then return the resulting HTML.

Requirements

You will need to have the following installed:

  • Google Chrome
  • Python 3.9 or later (the script uses str.removeprefix)
  • Selenium (pip3 install selenium)
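
As a quick sanity check before setting up the plugin, the requirements above could be probed with a short script along these lines. This is only a sketch: the helper name missing_requirements and the list of Chrome executable names are assumptions for illustration, and Chrome may be installed under a different name or location on your system, so a "not found" result for Chrome is a hint rather than proof.

```python
import importlib.util
import shutil
import sys

# Assumption: executable names Chrome commonly uses on Linux.
# Adjust this tuple for your system.
CHROME_NAMES = ("google-chrome", "google-chrome-stable", "chromium", "chromium-browser")


def missing_requirements(chrome_names=CHROME_NAMES):
    """Return a list describing requirements that could not be found."""
    missing = []
    # chrome-render.py uses str.removeprefix, which needs Python 3.9+.
    if sys.version_info < (3, 9):
        missing.append("Python 3.9 or later")
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec("selenium") is None:
        missing.append("Selenium (pip3 install selenium)")
    # Look for any of the candidate executables on PATH.
    if not any(shutil.which(name) for name in chrome_names):
        missing.append("Google Chrome (no known executable on PATH)")
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Not found: " + "; ".join(problems))
    else:
        print("All requirements found.")
```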

Setup

You will need to put the files chrome-render.ebrc and chrome-render.py somewhere on your system, e.g. ~/.config/edbrowse/plugins. If you place them somewhere else, you'll need to change the corresponding paths in chrome-render.ebrc and in your .ebrc.

Files

Place the following two files in the directory you chose:

chrome-render.ebrc

plugin {
    type = */*
    desc = Render page with Chrome
    protocol = chrmrndr
    program = ~/.config/edbrowse/plugins/chrome-render.py %i
    outtype = h
}

function+chrome {
    db0
    0A
    2,d
    s;^<br><a href='\(.*\)'>$;b chrmrndr://$1;f
    bw
    up
    db1
    <*-1
}

chrome-render.py

#!/usr/bin/env python3

import os
import pathlib
import sys
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

if len(sys.argv) < 2:
    print(f"Usage: {sys.argv[0]} <url>", file=sys.stderr)
    sys.exit(1)

# Edbrowse prepends `chrmrndr://` to the URL to call the plugin that executes this script.
url = sys.argv[1].removeprefix('chrmrndr://')
if os.path.exists(url):
    # This is a local file.
    # Chrome expects `file://...`
    url = pathlib.Path(os.path.abspath(url)).as_uri()

chrome_options = Options()
chrome_options.add_argument("--headless=new")
chrome_options.add_argument("--blink-settings=imagesEnabled=false")
driver = webdriver.Chrome(options=chrome_options)
try:
    driver.get(url)

    # Try to wait until the page is fully loaded by polling the page source
    # until it has not changed for 0.2 seconds. After 4 seconds, return what
    # we have, even if the page is still changing.
    start = time.perf_counter()
    html = ''
    last_changed = start
    while (now := time.perf_counter()) - start < 4.0 and now - last_changed < 0.2:
        time.sleep(0.05)
        new_html = driver.page_source
        if new_html != html:
            html = new_html
            last_changed = time.perf_counter()
finally:
    # Shut down Chrome and its driver so they do not linger after the script exits.
    driver.quit()

sys.stdout.write(html)
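
The waiting logic in the script can be easier to follow, and to experiment with, when pulled out into a function. The following is a sketch of the same polling idea, not part of the plugin itself: the name wait_until_stable and its parameters are hypothetical, and the clock and sleep arguments exist only so the loop can be exercised without a browser.

```python
import time


def wait_until_stable(get_source, max_wait=4.0, quiet_period=0.2,
                      poll_interval=0.05, clock=time.perf_counter,
                      sleep=time.sleep):
    """Poll get_source() until its result stops changing.

    Returns the last result once it has been unchanged for quiet_period
    seconds, or once max_wait seconds have passed, whichever comes first.
    """
    start = clock()
    html = ''
    last_changed = start
    while (now := clock()) - start < max_wait and now - last_changed < quiet_period:
        sleep(poll_interval)
        new_html = get_source()
        if new_html != html:
            # The page changed, so restart the quiet-period countdown.
            html = new_html
            last_changed = clock()
    return html
```

With a Selenium driver, this would be called as wait_until_stable(lambda: driver.page_source). The defaults mirror the values in the script: a 4 second overall limit, a 0.2 second quiet period, and a 0.05 second polling interval.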

Edbrowse config

In your .ebrc, add the following line:

include = ~/.config/edbrowse/plugins/chrome-render.ebrc

Usage

To render the current page with Chrome, type <chrome. The URL will change to something like chrmrndr://https://www.example.com. The prefix stays in the URL so that refreshing the page makes Chrome load and render it again.
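
To illustrate what happens to such a URL, the prefix handling in chrome-render.py can be written as a small function. This is a sketch that restates the logic of the script; the function name to_chrome_url and the constant PREFIX are introduced here for illustration only.

```python
import os
import pathlib

# The protocol name declared in chrome-render.ebrc, followed by "://".
PREFIX = "chrmrndr://"


def to_chrome_url(arg):
    """Turn the argument Edbrowse passes to the plugin into a URL for Chrome."""
    # Strip the plugin prefix if present (requires Python 3.9+).
    target = arg.removeprefix(PREFIX)
    if os.path.exists(target):
        # A local file: Chrome expects an absolute file:// URL.
        return pathlib.Path(os.path.abspath(target)).as_uri()
    return target


# A web address comes back unchanged once the prefix is removed.
print(to_chrome_url("chrmrndr://https://www.example.com"))
```

A path to an existing local file is converted to a file:// URL instead, which is why the plugin can also be used on pages stored on disk.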

Limitations

Currently, this script starts a new headless Chrome instance each time you render a page. As a result, load times can be a bit slow -- on the order of a couple of seconds.

The script only returns a snapshot of the page. JavaScript that responds to user interaction, and that Edbrowse cannot handle, will still not work in pages rendered by Chrome.