High-fidelity capture of Twitter threads as sealed PDFs @ social.perma.cc.
An experiment of the Harvard Library Innovation Lab.
- Google Chrome (`npx playwright install --force chrome` may be used).
⚠️ For now: Python dependencies are installed at machine level, as a post-install step of `npm install`.
curl bash gcc g++ python3 python3-pip python3-dev zlib1g zlib1g-dev libjpeg-dev libssl-dev libffi-dev ghostscript poppler-utils
⚠️ On Linux, this project is currently only compatible with Ubuntu, because it uses Playwright + Chrome.
- Node may be sourced from Nodesource.

A Brewfile is available. Run `brew bundle` to install the machine-level dependencies that can be provided by Homebrew.
Run the following commands to initialize the project and start the development server.
brew bundle # (Mac OS only) - See Linux dependencies above.
npm install # To install npm packages
npx playwright install chrome # To ensure Playwright has a version of Chrome to talk to
npm run generate-dev-cert # Will generate a certificate for self-signing PDFs. For testing purposes only.
npm run dev # Starts the development server on port 3000

The "Signatures Verification Page" lists the certificates that were used for signing PDFs with the app. You may provide that history by creating two files under `/data`:
- `signing-certs-history.json`
- `timestamping-certs-history.json`

Expected format:
[
{
"from": "2022-11-18 13:07:56 UTC",
"to": "present",
"domain": "domain.ext",
"info": "https://...",
"cert": "https://..."
},
...
]

- `npm run start`: Starts the app's server on port 3000 with warning-level logs.
- `npm run dev`: Starts the app's server on port 3000 with info-level logs. Watches for file changes.
- `npm run generate-dev-cert`: Generates `certs/cert.pem` and `certs/key.pem` for local development purposes.
- `npm run docgen`: Generates JSDoc-based code documentation under `/docs`.
- `npm run test`: Runs the test suite. Requires test fixtures (see the `fixtures` folder).

⚠️ At the moment, this codebase only features a very limited set of high-level integration tests.
| Name | Required? | Description |
|---|---|---|
| `CERTS_PATH` | No | If set, will be used as the path to the `.pem` files used for signing PDF files. |
| `DATA_PATH` | No | If set, will be used as the path to the folder used for storing app data. |
| `TEMPLATES_PATH` | No | If set, will be used as the templates path. Can be used to replace the website's UI with a custom one. |
| `REQUIRE_ACCESS_KEY` | No | If set to `"1"`, an access key will be required to make a capture. |
| `MAX_PARALLEL_CAPTURES_TOTAL` | No | If set and contains an integer, determines the maximum number of captures that the server can run in parallel. |
| `MAX_PARALLEL_CAPTURES_PER_IP` | No | If set and contains an integer, determines the maximum number of captures that a single client can run in parallel. |

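To make the rules in the table above concrete, here is a minimal sketch of how these variables could be read in Node. The `readCaptureSettings` function and the names of the fields it returns are hypothetical and are not taken from the app's codebase; `null` stands for "not set, use the app's default", since the defaults themselves are not documented here.

```javascript
// Hypothetical helper, for illustration only: not part of the app.
function readCaptureSettings(env = process.env) {
  // Returns a positive integer, or the fallback when the value is missing or invalid.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
  };

  return {
    certsPath: env.CERTS_PATH || null,
    dataPath: env.DATA_PATH || null,
    templatesPath: env.TEMPLATES_PATH || null,
    // Only the exact string "1" turns the access key requirement on.
    requireAccessKey: env.REQUIRE_ACCESS_KEY === "1",
    maxParallelCapturesTotal: toInt(env.MAX_PARALLEL_CAPTURES_TOTAL, null),
    maxParallelCapturesPerIp: toInt(env.MAX_PARALLEL_CAPTURES_PER_IP, null),
  };
}
```

Passing a plain object instead of `process.env` makes the sketch easy to try out, for example `readCaptureSettings({ REQUIRE_ACCESS_KEY: "1" })`.
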
If the `REQUIRE_ACCESS_KEY` environment variable is set to `"1"`, users will be required to use an access key to make captures.
Keys can be stored in a file named `access-keys.json` under the "data" folder.
Example: app/data/access-keys.json:
{
"BB67BBC4-1F4B-4353-8E6D-9927A10F4509": true
}

A key can be generated with `uuidgen`, for example:
$ uuidgen
BB67BBC4-1F4B-4353-8E6D-9927A10F4509
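
As an alternative to `uuidgen`, the sketch below generates a key from Node and checks a candidate key against an object shaped like the `access-keys.json` example above. Both `addAccessKey` and `isAccessKeyValid` are hypothetical helpers, not functions from the app. The key is uppercased only to match the format of the example; whether the app compares keys case-sensitively is not stated here.

```javascript
const { randomUUID } = require("crypto");

// Hypothetical helper: returns a new key and a copy of `keys` that includes it.
function addAccessKey(keys = {}) {
  const key = randomUUID().toUpperCase(); // Uppercased to mirror the uuidgen example.
  return { key, keys: { ...keys, [key]: true } };
}

// Hypothetical helper: a key is accepted only if it is listed and set to `true`.
function isAccessKeyValid(keys, candidate) {
  return (
    Object.prototype.hasOwnProperty.call(keys, candidate) &&
    keys[candidate] === true
  );
}

const { key, keys } = addAccessKey();
// Content that could be saved as access-keys.json under the data folder.
console.log(JSON.stringify(keys, null, 2));
console.log(isAccessKeyValid(keys, key));
```
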
