Get Datasette Lite working without loading external resources #40
For simplicity here I'm inclined to have this https://github.com/simonw/datasette-lite Git repository contain vendored copies of all of the dependencies needed to run Datasette Lite. That way anyone who clones this repo will have everything they need to run Datasette Lite themselves. I like the privacy benefits of having everything loaded from the same domain, with no requests made at all to other domains.
I used Firefox to fully load Datasette Lite, then copied out a HAR file from the network pane (13.5MB of JSON). Then I ran this:
Resulting in this:
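Pulling the wheel URLs out of a HAR export is straightforward, since a HAR file is plain JSON with every request recorded under `log.entries[].request.url`. A minimal sketch (the inline `har` dict below is a stand-in for the real 13.5MB export, and the PyPI URLs in it are hypothetical):

```python
# A HAR export is plain JSON; with a real file you would json.load() it.
# This tiny inline dict stands in for the 13.5MB export.
har = {
    "log": {
        "entries": [
            {"request": {"url": "https://cdn.jsdelivr.net/pyodide/v0.20.0/full/pyodide.js"}},
            {"request": {"url": "https://files.pythonhosted.org/example/h11-0.12.0-py3-none-any.whl"}},
            {"request": {"url": "https://files.pythonhosted.org/example/datasette-0.62-py3-none-any.whl"}},
        ]
    }
}

# Collect just the wheel downloads, de-duplicated and sorted
wheel_urls = sorted(
    {e["request"]["url"] for e in har["log"]["entries"] if e["request"]["url"].endswith(".whl")}
)
for url in wheel_urls:
    print(url)
```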
Downloading all those wheels:
I tried this:

```diff
diff --git a/webworker.js b/webworker.js
index e27ff78..6abe837 100644
--- a/webworker.js
+++ b/webworker.js
@@ -50,9 +50,28 @@ async function startDatasette(settings) {
         names.append(name)
     import micropip
-    # Workaround for Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed
-    await micropip.install("h11==0.12.0")
-    await micropip.install("datasette")
+    await micropip.install("wheels/certifi-2022.6.15-py3-none-any.whl")
+    await micropip.install("wheels/python_multipart-0.0.4-py3-none-any.whl")
+    await micropip.install("wheels/itsdangerous-2.1.2-py3-none-any.whl")
+    await micropip.install("wheels/click-8.1.3-py3-none-any.whl")
+    await micropip.install("wheels/click_default_group_wheel-1.2.2-py3-none-any.whl")
+    await micropip.install("wheels/asgiref-3.5.2-py3-none-any.whl")
+    await micropip.install("wheels/h11-0.12.0-py3-none-any.whl")
+    await micropip.install("wheels/idna-3.3-py3-none-any.whl")
+    await micropip.install("wheels/sniffio-1.2.0-py3-none-any.whl")
+    await micropip.install("wheels/anyio-3.6.1-py3-none-any.whl")
+    await micropip.install("wheels/aiofiles-0.8.0-py3-none-any.whl")
+    await micropip.install("wheels/asgi_csrf-0.9-py3-none-any.whl")
+    await micropip.install("wheels/Pint-0.18-py2.py3-none-any.whl")
+    await micropip.install("wheels/uvicorn-0.18.2-py3-none-any.whl")
+    await micropip.install("wheels/Jinja2-3.0.3-py3-none-any.whl")
+    await micropip.install("wheels/mergedeep-1.3.4-py3-none-any.whl")
+    await micropip.install("wheels/hupper-1.10.3-py2.py3-none-any.whl")
+    await micropip.install("wheels/httpcore-0.15.0-py3-none-any.whl")
+    await micropip.install("wheels/janus-1.0.0-py3-none-any.whl")
+    await micropip.install("wheels/rfc3986-1.5.0-py2.py3-none-any.whl")
+    await micropip.install("wheels/httpx-0.23.0-py3-none-any.whl")
+    await micropip.install("wheels/datasette-0.62-py3-none-any.whl")
     # Install any extra ?install= dependencies
     install_urls = ${JSON.stringify(settings.installUrls)}
     if install_urls:
```

You have to get the order of installation exactly right, or one of the wheels will trigger another wheel to be loaded from PyPI. I kept tweaking the order and watching the network pane in Firefox. The above diff almost gets it, but there are still three wheels coming from PyPI:
Not sure how best to automate the process of figuring out those dependencies, grabbing the right wheel versions, caching them locally in the repo AND figuring out the right import order for them. |
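One possible way to automate the ordering (a sketch, not something I'm running yet): read each wheel's `Requires-Dist` metadata and topologically sort, so that dependencies always install before their dependents. The dependency map below is a hypothetical subset of the real graph; in a real script it would be parsed out of each wheel's `METADATA` file:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical subset of the dependency graph, keyed package -> dependencies.
# In a real script this would come from each wheel's METADATA (the
# Requires-Dist lines), e.g. via zipfile + email.parser.
deps = {
    "datasette": ["httpx", "jinja2", "click"],
    "httpx": ["httpcore", "rfc3986"],
    "httpcore": ["h11"],
    "jinja2": ["markupsafe"],
    "click": [],
    "rfc3986": [],
    "h11": [],
    "markupsafe": [],
}

# static_order() yields each package only after all of its dependencies
order = list(TopologicalSorter(deps).static_order())
print(order)
```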
Just spotted this in the docs: `micropip.install()` can take a `deps=False` option to skip automatic dependency resolution.

Looks like that's new in Pyodide 0.21.0 (I am on 0.20.0 right now): https://pyodide.org/en/stable/project/changelog.html#version-0-21-0
Upgrading to 0.21.0 and using `deps=False` mostly worked, except `MarkupSafe` was still being loaded from the Pyodide CDN rather than from my `wheels/` folder.

My guess is that MarkupSafe is one of the packages Pyodide ships its own WASM-compiled build of. This looks like a useful way to look up files in there: https://cdn.jsdelivr.net/npm/pyodide@0.21.0/repodata.json

```json
"markupsafe": {
    "name": "MarkupSafe",
    "version": "2.1.1",
    "file_name": "MarkupSafe-2.1.1-cp310-cp310-emscripten_3_1_14_wasm32.whl",
    "install_dir": "site",
    "sha256": "c3cfbcc5f7927add3de4b3c698afe234730452cb0a2e566336b55ecdf16857c5",
    "depends": [],
    "imports": [
        "markupsafe"
    ]
},
```

And sure enough this is a working URL: https://cdn.jsdelivr.net/pyodide/v0.21.0/full/MarkupSafe-2.1.1-cp310-cp310-emscripten_3_1_14_wasm32.whl
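So given `repodata.json`, the CDN URL for any Pyodide-bundled wheel can be derived from its `file_name` entry. A sketch, using the shape of the snippet above (only the one entry is shown here):

```python
CDN_BASE = "https://cdn.jsdelivr.net/pyodide/v0.21.0/full/"

# Shape matches the repodata.json "packages" entries; one entry shown.
packages = {
    "markupsafe": {
        "name": "MarkupSafe",
        "version": "2.1.1",
        "file_name": "MarkupSafe-2.1.1-cp310-cp310-emscripten_3_1_14_wasm32.whl",
    }
}

def cdn_url(package: str) -> str:
    # repodata.json keys are lowercased package names
    return CDN_BASE + packages[package.lower()]["file_name"]

print(cdn_url("MarkupSafe"))
# → https://cdn.jsdelivr.net/pyodide/v0.21.0/full/MarkupSafe-2.1.1-cp310-cp310-emscripten_3_1_14_wasm32.whl
```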
Same problem for:
I'm just going to grab all of these wheels:
This loaded successfully (once I had the wheels in place):

```diff
diff --git a/webworker.js b/webworker.js
index e27ff78..cc9578e 100644
--- a/webworker.js
+++ b/webworker.js
@@ -1,4 +1,4 @@
-importScripts("https://cdn.jsdelivr.net/pyodide/v0.20.0/full/pyodide.js");
+importScripts("https://cdn.jsdelivr.net/pyodide/v0.21.0/full/pyodide.js");
 
 function log(line) {
   console.log({line})
@@ -29,7 +29,7 @@ async function startDatasette(settings) {
     toLoad.push(["content.db", "https://datasette.io/content.db"]);
   }
   self.pyodide = await loadPyodide({
-    indexURL: "https://cdn.jsdelivr.net/pyodide/v0.20.0/full/"
+    indexURL: "https://cdn.jsdelivr.net/pyodide/v0.21.0/full/"
   });
   await pyodide.loadPackage('micropip', log);
   await pyodide.loadPackage('ssl', log);
@@ -50,9 +50,35 @@ async function startDatasette(settings) {
         names.append(name)
     import micropip
-    # Workaround for Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed
-    await micropip.install("h11==0.12.0")
-    await micropip.install("datasette")
+    await micropip.install("wheels/packaging-21.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/pyparsing-3.0.7-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/typing_extensions-4.1.1-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/six-1.16.0-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/MarkupSafe-2.1.1-cp310-cp310-emscripten_3_1_14_wasm32.whl", deps=False)
+    await micropip.install("wheels/PyYAML-6.0-cp310-cp310-emscripten_3_1_14_wasm32.whl", deps=False)
+    await micropip.install("wheels/pluggy-1.0.0-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/certifi-2022.6.15-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/python_multipart-0.0.4-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/itsdangerous-2.1.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/click-8.1.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/click_default_group_wheel-1.2.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/asgiref-3.5.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/h11-0.12.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/idna-3.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/sniffio-1.2.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/anyio-3.6.1-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/aiofiles-0.8.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/asgi_csrf-0.9-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/Pint-0.18-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/uvicorn-0.18.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/Jinja2-3.0.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/mergedeep-1.3.4-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/hupper-1.10.3-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/httpcore-0.15.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/janus-1.0.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/rfc3986-1.5.0-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/httpx-0.23.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/datasette-0.62-py3-none-any.whl", deps=False)
     # Install any extra ?install= dependencies
     install_urls = ${JSON.stringify(settings.installUrls)}
     if install_urls:
```
I got it to work offline with my WiFi turned off! I had to run this in the root directory:
Then I had to grab the latest Pyodide release from https://github.com/pyodide/pyodide/releases/tag/0.21.0 - then uncompress it and move the `pyodide/` folder into the root of the project. Then I applied this diff and started the server and it worked:

```diff
diff --git a/index.html b/index.html
index de860bb..e2b1ede 100644
--- a/index.html
+++ b/index.html
@@ -2,7 +2,7 @@
 <html>
 <head>
 <title>Datasette</title>
-<link rel="stylesheet" href="https://latest.datasette.io/-/static/app.css?cead5a">
+<link rel="stylesheet" href="app.css">
 <style>
 #loading-indicator {
   text-align: center;
diff --git a/webworker.js b/webworker.js
index e27ff78..8b3a444 100644
--- a/webworker.js
+++ b/webworker.js
@@ -1,4 +1,4 @@
-importScripts("https://cdn.jsdelivr.net/pyodide/v0.20.0/full/pyodide.js");
+importScripts("/pyodide/pyodide.js");
 
 function log(line) {
   console.log({line})
@@ -25,11 +25,10 @@ async function startDatasette(settings) {
   if (needsDataDb) {
     toLoad.push(["data.db", 0]);
   } else {
-    toLoad.push(["fixtures.db", "https://latest.datasette.io/fixtures.db"]);
-    toLoad.push(["content.db", "https://datasette.io/content.db"]);
+    toLoad.push(["fixtures.db", "/fixtures.db"]);
   }
   self.pyodide = await loadPyodide({
-    indexURL: "https://cdn.jsdelivr.net/pyodide/v0.20.0/full/"
+    indexURL: "/pyodide/"
   });
   await pyodide.loadPackage('micropip', log);
   await pyodide.loadPackage('ssl', log);
@@ -50,9 +49,35 @@ async function startDatasette(settings) {
         names.append(name)
     import micropip
-    # Workaround for Requested 'h11<0.13,>=0.11', but h11==0.13.0 is already installed
-    await micropip.install("h11==0.12.0")
-    await micropip.install("datasette")
+    await micropip.install("wheels/packaging-21.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/pyparsing-3.0.7-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/typing_extensions-4.1.1-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/six-1.16.0-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/MarkupSafe-2.1.1-cp310-cp310-emscripten_3_1_14_wasm32.whl", deps=False)
+    await micropip.install("wheels/PyYAML-6.0-cp310-cp310-emscripten_3_1_14_wasm32.whl", deps=False)
+    await micropip.install("wheels/pluggy-1.0.0-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/certifi-2022.6.15-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/python_multipart-0.0.4-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/itsdangerous-2.1.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/click-8.1.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/click_default_group_wheel-1.2.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/asgiref-3.5.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/h11-0.12.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/idna-3.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/sniffio-1.2.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/anyio-3.6.1-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/aiofiles-0.8.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/asgi_csrf-0.9-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/Pint-0.18-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/uvicorn-0.18.2-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/Jinja2-3.0.3-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/mergedeep-1.3.4-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/hupper-1.10.3-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/httpcore-0.15.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/janus-1.0.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/rfc3986-1.5.0-py2.py3-none-any.whl", deps=False)
+    await micropip.install("wheels/httpx-0.23.0-py3-none-any.whl", deps=False)
+    await micropip.install("wheels/datasette-0.62-py3-none-any.whl", deps=False)
     # Install any extra ?install= dependencies
     install_urls = ${JSON.stringify(settings.installUrls)}
     if install_urls:
```
One catch: the decompressed Pyodide folder is HUGE:
That's because it bundles a wheel for every package that Pyodide provides a special WASM-compiled version of, and I only need a fraction of those. Maybe I don't need any of the wheels at all?
I tried deleting the fonts and ALL of the bundled wheels. Got it down to 21M for the `pyodide/` folder.

Evidently not every file in the distribution is needed.
I think the better way to approach this will be to create a
I used `shot-scraper` with this patch to log all of the requests that were made:

```diff
diff --git a/shot_scraper/cli.py b/shot_scraper/cli.py
index 018581c..d3e2fc6 100644
--- a/shot_scraper/cli.py
+++ b/shot_scraper/cli.py
@@ -663,6 +663,7 @@ def take_shot(
     if not use_existing_page:
         page = context_or_page.new_page()
+        page.on("request", lambda request: print(">>", request.method, request.url))
     else:
         page = context_or_page
```
I shipped that feature in
To generate this SQL file: https://gist.github.com/simonw/7f41a43ba0f177238ed7bdd95078a0d4

Which I can then open in Datasette Lite like this: https://lite.datasette.io/?sql=https://gist.githubusercontent.com/simonw/7f41a43ba0f177238ed7bdd95078a0d4/raw/4fc0f80decce4e1ea1e925cdc2bf3f05d73034ed/datasette-lite.sql#/data/stdin

And run this query to see just the wheel requests. Which returns:
I think the next step is to build a script which loops through those URLs and downloads them into the `wheels/` folder.
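A sketch of what that script could look like (the example URL is hypothetical; the `wheels/` destination matches the paths used in the diffs above):

```python
from pathlib import Path
from urllib.request import urlretrieve

def dest_path(url: str, wheels_dir: str = "wheels") -> Path:
    # Keep only the wheel's filename; drop the PyPI/CDN path prefix.
    return Path(wheels_dir) / url.rsplit("/", 1)[-1]

def download_all(urls, wheels_dir: str = "wheels"):
    Path(wheels_dir).mkdir(exist_ok=True)
    for url in urls:
        target = dest_path(url, wheels_dir)
        if not target.exists():  # don't re-fetch wheels we already have
            urlretrieve(url, target)

# Example (hypothetical PyPI URL):
print(dest_path("https://files.pythonhosted.org/packages/ab/cd/h11-0.12.0-py3-none-any.whl").name)
```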
Downloading all of those wheels produces a folder with 2.5M of total data in it. They don't compress well, since they are already compressed.
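That's expected: the files inside a wheel are deflate-compressed zip entries, so gzipping the archive again gains almost nothing. A quick demonstration with a throwaway wheel-like zip built in memory:

```python
import gzip
import io
import zipfile

# Wheels are zip archives whose members are stored with DEFLATE,
# so the archive bytes are already close to incompressible.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("pkg/__init__.py", "x = 1\n" * 1000)  # ~6KB of repetitive source
data = buf.getvalue()

regzipped = gzip.compress(data)
# The inner DEFLATE already shrank the 6KB payload; gzip on top barely helps.
print(len(data), len(regzipped))
```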
The size of those files:
Looking at those, there are four files it would be nice to get rid of.
Honestly though, it's not really worth the effort right now - those four only add up to ~400KB. And if this all works out they will be cached for the long term by the service worker.
Loading https://lite.datasette.io/ a second time only transfers 22.55KB of the 25.5MB total! The vast majority of those assets are correctly cached already, which is nice.
Here's a fun thing: I took a Firefox profile of the initial load experience and uploaded it here: https://profiler.firefox.com/public/48aw3a07gj8s0y81f4wyqywa69q4csbmykbs550/calltree/?globalTrackOrder=xu0wxt&hiddenGlobalTracks=1wxs&hiddenLocalTracksByPid=1272-02w8~48832-0~76615-0~2594-0~15591-0~72954-0~8975-0~8852-0~773-0~73168-01~55239-0w5~7928-0&implementation=js&thread=ym&v=7

Then I deleted it, because I wasn't sure whether the uploaded profile would include sensitive data like cookies.
If we're willing to let the client fetch external resources for itself on the first load, we can use the service worker cache. I implemented one in my fork:
If I'm going to eventually have it work offline as a PWA:

The first step is going to be having it work based entirely on files in this repository - with no external HTTP requests made at all.

I need that for both Pyodide and its WASM build, as well as all of the Python wheels that Datasette Lite needs to install using `micropip`.

Originally posted by @simonw in #26 (comment)