The html-to-file service converts an HTML page to a file (image or PDF) using Puppeteer.
The goal of this project is to convert any HTML page, or a custom EJS template, into a file.
The currently supported file types are PNG images (.png) and PDFs (.pdf).
This service is currently hosted on Heroku here: https://html-to-file.herokuapp.com/generate?url=https://www.google.com (you may change the url to the one you would like to capture).
Here are the service's supported endpoints:
- /generate: Generates a file from the webpage at the specified URL and responds with a link to the file. The supported query params for this endpoint are:
  - url (required): the URL of the page to be converted into a file
  - type: defaults to image, unless otherwise specified
  - selector (for images only): a CSS selector that targets an HTML element to be captured
  - responseKind: specifies what the service should respond with. It may be one of the following:
    - json (default): returns a JSON object with a resourceLink and a downloadLink that may be used to preview or download the resource, respectively
    - download: downloads the file to the browser that loaded the generate link
    - resource: returns the file itself for preview in the browser
    - buffer: returns the file as a buffer object. The shape of the response body is { buffer: Buffer }
  - fallbackUrl: a URL shown to users who try to open a link to a generated resource that is no longer available. This could be your own page that lets the user regenerate the resource, or a direct link that regenerates the same resource with this service. E.g. https://html-to-file.herokuapp.com/generate?url=https://www.google.com&fallbackUrl=https://html-to-file.herokuapp.com/generate?url=https://www.google.com
  - autoRegenerate: defaults to 'true'. When 'true', the resource is automatically regenerated for the user if it has been deleted or no longer exists. This functionality relies on the specific format of the file name, i.e. 'htf_***_***.(png|pdf)'
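As a minimal sketch of how a client might assemble a /generate link from the query params above: the GenerateOptions interface and the buildGenerateUrl helper are hypothetical names introduced for illustration, not part of the service. Only the parameter names and the host come from this README.

```typescript
// Hypothetical option bag mirroring the documented /generate query params.
interface GenerateOptions {
  url: string;             // required: page to capture
  type?: string;           // defaults to "image" on the server when omitted
  selector?: string;       // images only: CSS selector of the element to capture
  responseKind?: "json" | "download" | "resource" | "buffer";
  fallbackUrl?: string;    // shown when the generated file is no longer available
  autoRegenerate?: boolean;
}

const SERVICE_BASE = "https://html-to-file.herokuapp.com";

// Builds a /generate URL. URLSearchParams percent-encodes each value, so a
// nested URL (such as fallbackUrl) cannot be confused with the outer query string.
function buildGenerateUrl(options: GenerateOptions, base: string = SERVICE_BASE): string {
  const endpoint = new URL("/generate", base);
  endpoint.searchParams.set("url", options.url);
  if (options.type !== undefined) {
    endpoint.searchParams.set("type", options.type);
  }
  if (options.selector !== undefined) {
    endpoint.searchParams.set("selector", options.selector);
  }
  if (options.responseKind !== undefined) {
    endpoint.searchParams.set("responseKind", options.responseKind);
  }
  if (options.fallbackUrl !== undefined) {
    endpoint.searchParams.set("fallbackUrl", options.fallbackUrl);
  }
  if (options.autoRegenerate !== undefined) {
    endpoint.searchParams.set("autoRegenerate", String(options.autoRegenerate));
  }
  return endpoint.toString();
}

// Usage: build a link that asks for the default JSON response.
const exampleUrl = buildGenerateUrl({
  url: "https://www.google.com",
  type: "image",
  responseKind: "json",
});
console.log(exampleUrl);
```

Requesting the resulting link with responseKind=json should, per the description above, yield a JSON object carrying resourceLink and downloadLink. Encoding the values is a design choice of this sketch: it keeps a fallbackUrl that has its own query string intact as a single parameter.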
Files created by this service are deleted immediately after they are accessed, or shortly after they are created (no sooner than 30 seconds after creation).
Embedded in the names of files that are generated are the instructions needed to regenerate the file on request, in case it is no longer available.
This functionality is available by default, unless autoRegenerate=false is provided in the query string on the /generate endpoint.
If necessary, you may consider caching generated files or uploading to your own server for longer file persistence.
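One way to approach the caching suggestion is a small client-side cache that keeps the bytes of a file you have already downloaded, so later requests do not depend on the service's short-lived copy. The sketch below is an assumption-laden illustration: the GeneratedFileCache class, its key format, and the max-age policy are all hypothetical and not part of this service.

```typescript
// Hypothetical in-memory cache for files already downloaded from the service.
interface CachedFile {
  bytes: Uint8Array;
  storedAt: number; // milliseconds since the epoch
}

class GeneratedFileCache {
  private entries: { [key: string]: CachedFile } = {};
  private maxAgeMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(maxAgeMs: number, now: () => number = () => Date.now()) {
    this.maxAgeMs = maxAgeMs;
    this.now = now;
  }

  // Key by file type and page URL so a PNG and a PDF of the same page do not collide.
  static keyFor(pageUrl: string, type: string): string {
    return type + "|" + pageUrl;
  }

  put(key: string, bytes: Uint8Array): void {
    this.entries[key] = { bytes: bytes, storedAt: this.now() };
  }

  // Returns the cached bytes, or undefined when missing or older than maxAgeMs.
  get(key: string): Uint8Array | undefined {
    const entry = this.entries[key];
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() - entry.storedAt > this.maxAgeMs) {
      delete this.entries[key];
      return undefined;
    }
    return entry.bytes;
  }
}

// Usage: keep a downloaded image for up to one hour.
const fileCache = new GeneratedFileCache(60 * 60 * 1000);
const cacheKey = GeneratedFileCache.keyFor("https://www.google.com", "image");
fileCache.put(cacheKey, new Uint8Array([0x89, 0x50, 0x4e, 0x47])); // placeholder bytes
console.log(fileCache.get(cacheKey) !== undefined);
```

A cache miss would be the point at which to call /generate again and store the new bytes. For persistence across restarts, the same idea applies with files on disk or your own storage server in place of the in-memory object.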
Here are the steps to get you up and running in a development environment:
- Install node (and npm) if you haven't already, as well as tsc, ts-node and nodemon, using
npm install -g typescript ts-node nodemon
The tsc command is provided by the typescript package. Omit the -g flag if you only want to install them for this project.
- Install dependencies for the project using the command,
npm install
- Start the server in the development environment, using
npm run dev
Make sure to set BASE_URL in the .env file when deploying. This env variable is sent back to clients so that they can access the files they have generated.
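To illustrate why BASE_URL matters, here is a rough sketch of how a server could turn it into a link that is handed back to clients. The buildResourceLink helper, the "/files/" route, and the sample file name are hypothetical assumptions made for this example; the service's actual routes may differ.

```typescript
// Hypothetical helper: the "/files/" route and this function name are
// illustrative assumptions, not taken from the service's source.
function buildResourceLink(env: { BASE_URL?: string }, fileName: string): string {
  const baseUrl = env.BASE_URL;
  if (!baseUrl) {
    // Without BASE_URL the server cannot produce a link clients can reach.
    throw new Error("BASE_URL is not set; clients would receive unusable links");
  }
  return new URL("/files/" + encodeURIComponent(fileName), baseUrl).toString();
}

// Usage with a deployed base URL and a placeholder file name.
console.log(
  buildResourceLink({ BASE_URL: "https://html-to-file.herokuapp.com" }, "htf_a_b.png")
);
```

In a real server the first argument would come from the environment, for example an object built from process.env.BASE_URL after the .env file has been loaded. Failing loudly when the variable is missing is a choice made in this sketch so that a misconfigured deployment is noticed early.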
If deploying to Heroku, you may have to manually set some buildpacks for Puppeteer to run.
When deployed, the service can be started with the command npm run start, which compiles the TypeScript (.ts) files and starts up the node server.
To demo this locally, start the server and enter this URL into a browser:
http://localhost:4000/template/test?url=http://www.google.com&type=image
To demo on the live Heroku site, enter this URL into a browser:
https://html-to-file.herokuapp.com/template/test?url=http://www.google.com&type=image