Screenshot / webshot online service #63
Comments
I started something simple a while back ( https://github.com/rossjones/urlshotserver ) as a replacement for the ScraperWiki screenshot app (which uses PyQt and embedded WebKit). It isn't complete, but I seem to recall it did actually work. It should only need a few more lines to get the callback working. node-webshot is likely to be more complete. |
Having worked with node-webshot and node.js extensively in the past, I feel this is something I could take on and contribute back. Is Heroku the desired platform for okfn-related services, or would something that could be run in a stand-alone mode be preferred? I'm doing this partly to support @okfn and partly as a fun way to learn new stacks/services, so I wouldn't mind doing it on a totally new stack. |
@simong great to hear you could contribute here! In terms of the setup, our strong preference would be nodejs or python, with the express framework for nodejs (and flask for python). Let's assume you go with nodejs for the moment (seems a natural fit here - it's an IO-heavy, async-style app) then:
If you are looking for existing nodejs apps which run on Heroku, here are a couple that people have built in labs:
|
I happened to need something similar, so I built this on top of node-webshot over the weekend: https://github.com/opsb/node-webshot-server . It includes imagemagick for resizing and heroku config, and I use Amazon CloudFront in front of it for cache control. |
@grp sure, I've added a BSD license. |
Apologies for responding so late, this just got buried under some other work. I think I can build upon @opsb's work. I'd be happy to deploy the app for okfn on Heroku. In case we want something more "robust" we could also do the following things:
WDYT? |
@simong all sounds very good - suggest we start with simplest thing possible and progressively enhance. |
I use Amazon CloudFront in front of the service, which makes reuse nice and fast (it allows the cache key to include the query string). I was thinking of using a standard format in the query string to pass options through to webshot, something along the lines of: ?webshot[windowSize.width]=300&webshot[windowSize.height]=150 You could make a prettier version, but this way the translation from query string to webshot options would be simple. Adding in a queue sounds like a good idea. I don't have any experience with integrating node.js with queues. Is it possible to keep a request open while a queue worker does the job, or would you need to use a web hook for when the image is ready? |
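To illustrate how simple the translation from that query string format to nested webshot options could be, here is a minimal sketch. It is written in Python rather than node.js purely for illustration, and the function name webshot_options is hypothetical, not part of node-webshot or node-webshot-server. It assumes option keys look like webshot[dotted.path] and that purely numeric values should become integers.

```python
import re
from urllib.parse import parse_qsl

# Matches keys of the form webshot[some.dotted.path] (assumed format).
_OPTION_KEY = re.compile(r"^webshot\[([A-Za-z0-9_.]+)\]$")


def webshot_options(query_string):
    """Turn 'webshot[a.b]=1' style parameters into a nested dict."""
    options = {}
    for key, value in parse_qsl(query_string):
        match = _OPTION_KEY.match(key)
        if match is None:
            continue  # not a webshot option (e.g. the target url)
        *parents, leaf = match.group(1).split(".")
        node = options
        for name in parents:
            # Walk (or create) one nested dict per dotted segment.
            node = node.setdefault(name, {})
        # Purely numeric values become ints; everything else stays a string.
        node[leaf] = int(value) if value.isdecimal() else value
    return options
```

With the example above, webshot_options("webshot[windowSize.width]=300&webshot[windowSize.height]=150") gives {"windowSize": {"width": 300, "height": 150}}, and parameters that do not match the webshot[...] pattern are ignored.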
Although I feel that we should be using POSTs for this kind of thing, it certainly makes testing/using from the url bar easier.
I don't think it would be that much work to translate the query/form parameters into webshot parameters. Adding a queue in node can be done with something like amqp. The downside of adding a queue is that we'll need a way to transfer the image from the worker node to the user doing the request. This can open a bit of a can of worms, as then you need:
which moves away from the simple (but way less overhead) app that there is now. @rgrp Deploying the app (as it is now) is just a matter of following @opsb's README. |
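On the earlier question of whether a request can stay open while a queue worker does the job: within a single process it can, as the rough sketch below suggests. It uses Python's asyncio instead of node.js and amqp, and the names worker, handle_request and demo, as well as the placeholder "rendering" step, are all hypothetical. It does not address the harder case described above, where the worker runs on another node and the image still has to be transferred back.

```python
import asyncio


async def worker(jobs):
    """Pull jobs off the queue and resolve each job's future with a result."""
    while True:
        url, done = await jobs.get()
        try:
            # Placeholder for the real rendering step (hypothetical).
            done.set_result(f"screenshot-of:{url}".encode())
        finally:
            jobs.task_done()


async def handle_request(jobs, url, timeout=30.0):
    """Enqueue a job, then keep the 'request' open until the worker finishes."""
    done = asyncio.get_running_loop().create_future()
    await jobs.put((url, done))
    # The caller simply waits here; a timeout avoids hanging forever.
    return await asyncio.wait_for(done, timeout)


async def demo():
    jobs = asyncio.Queue()
    task = asyncio.create_task(worker(jobs))
    try:
        return await handle_request(jobs, "http://example.com")
    finally:
        task.cancel()  # stop the background worker once the demo is done
```

Running asyncio.run(demo()) returns the placeholder bytes produced by the worker. The design choice is that each job carries its own future, so the handler needs no polling and no web hook as long as worker and handler share a process.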
@simong it would be nice to have the app at webshot.okfnlabs.org. A CNAME has been set up from webshot.okfnlabs.org to your herokuapp so all you need to do is:
Could you also do (so other labs folks can have access as needed):
Caching etc
@simong @opsb I'm wondering about caching: what happens if a website changes in a week? I'd want to get the webshot from today, not last week. If one wants to default to the cached copy for ease of use, perhaps add some kind of refresh or latest flag to force a redo (e.g. ?refresh=1). If one is storing the content in S3 (for permanence and caching) we might want to have some structure like: url/width/height/date. You could then just drop the date if you don't need it.
Feature idea - specify your own filename
Relating to the caching stuff but somewhat different (and more of a feature) would be allowing users to specify a short name to save their screenshot under (a bit like bit.ly but for screenshots). E.g. you could do ?filename=... and then you could get that screenshot forever at:
Queue
Let's keep it simple for the moment. If we get a lot of traffic we can start worrying about it, but I think it should be fine for now. |
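The url/width/height/date layout and the refresh flag from the caching notes above could be sketched roughly as follows. This is only an illustration in Python: cache_key and should_render are hypothetical names, the URL is percent-encoded so it can sit in a single path segment (an assumption, not something the thread specifies), and stored_keys stands in for whatever lookup the real storage would provide.

```python
from datetime import date
from urllib.parse import quote


def cache_key(url, width, height, day=None):
    """Build a storage key shaped like url/width/height[/date]."""
    # Encode the URL fully so its own slashes do not split the key.
    parts = [quote(url, safe=""), str(width), str(height)]
    if day is not None:
        parts.append(day.isoformat())
    return "/".join(parts)


def should_render(key, stored_keys, refresh=False):
    """Render when forced by refresh or when nothing is stored under key."""
    return refresh or key not in stored_keys
```

For example, cache_key("http://example.com/", 300, 150, date(2013, 1, 5)) yields "http%3A%2F%2Fexample.com%2F/300/150/2013-01-05", and leaving out the date gives the shorter key for callers who do not care which day the shot was taken.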
@simong thanks for adding the domain alias. I wonder whether it would be worth making the base page of the site a proper homepage with a short intro and instructions (perhaps with a form where you can post a url, and instructions about the "api" ...) |
@simong any thoughts re the above? Also where's your repo - if I or others would like to contribute it would be nice to know what repo to fork :-) |
OK, the official repo for use in reporting issues/suggestions with webshot.okfnlabs.org etc is now at https://github.com/okfn/webshot/issues |
FIXED. Marking as fixed - we now have a functional service (major props to @simong and especially @opsb) and we have a repo where we can raise specific issues - viz https://github.com/okfn/webshot/issues - feel it is now time to mark as FIXED. w00t! |
I want a service for taking screenshots.
Research
What exists? Is there something open and free?
Existing libs we could use
brenden/node-webshot#25 (comment)