Allow taking a snapshot of a local file #35

Closed
pjamargh opened this issue Mar 13, 2022 · 7 comments
Labels
enhancement (New feature or request), research

Comments

@pjamargh

For an .html file stored locally, I run python -m http.server 80 and then shot-scraper http://localhost/file, but it would be handy to be able to point shot-scraper at the file directly.

@simonw simonw added the enhancement New feature or request label Mar 13, 2022
@simonw
Owner

simonw commented Mar 13, 2022

I agree, this would definitely be useful.

@simonw
Owner

simonw commented Mar 13, 2022

Deleting this check:

    if not (url.startswith("http://") or url.startswith("https://")):
        raise click.ClickException(
            "'url' must start http:// or https:// - got: \n{}".format(url)
        )

And then running the following worked:

shot-scraper file:/tmp/index.html

I had to provide both file:/ and a full path to the file though.

So the actual implementation should probably notice when the user provides a value that doesn't start with http:// or https:// AND is a file that exists on disk and, if so, do the work of turning it into a file: URL with the full absolute path, such as file:///full/path/to/filename.html.
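
The check described above could be sketched roughly as follows. This is a minimal illustration, not the code that landed in shot-scraper: the function name normalize_target is hypothetical, it raises ValueError where the real tool raises click.ClickException, and passing file: URLs through unchanged is an assumption based on the file:/tmp/index.html invocation that already worked.

```python
from pathlib import Path


def normalize_target(target: str) -> str:
    """Return something a browser can load for `target`.

    Hypothetical helper for illustration only; not shot-scraper's actual code.
    """
    # Assumption: http://, https:// and explicit file: URLs pass through untouched.
    if target.startswith(("http://", "https://", "file:")):
        return target

    path = Path(target)
    # Anything else is accepted only if it names a file that exists on disk.
    if path.is_file():
        # resolve() makes the path absolute; as_uri() builds the file:// URL
        # and percent-encodes characters such as spaces.
        return path.resolve().as_uri()

    raise ValueError(
        "'url' must start http:// or https://, or be a path to an "
        "existing file - got: \n{}".format(target)
    )
```

Under these assumptions a relative path such as index.html and an absolute path such as /tmp/index.html would both end up as an absolute file: URL, so the user would no longer need to type the file:/ prefix or the full path.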

@pjamargh
Author

Thanks for the quick response!

Shot-scraper has been a great help for transforming complex epub ebooks into an easier-to-display series of images. Thanks a lot for this tool!

@simonw
Owner

simonw commented Mar 13, 2022

I'm intrigued! Hadn't considered there might be epub applications.

@pjamargh
Author

They use XHTML. Some are simple (say, some images and text) and are totally OK to reflow; some are complex. In this last group you have some comics that have images but also transparent text on top so that it is readable for accessibility; others have images without the text, and text on top with complex transformations (tilting and such). These are rarely displayed properly in a simple epub reader. I'm still testing, but with a little fine-tuning shot-scraper is giving me exactly what I needed (rendering one HTML page into an image).

@simonw
Owner

simonw commented Mar 13, 2022

Tested this out using the Firefox "save complete web page" option on https://daringfireball.net/ and then running:

shot-scraper /tmp/Daring\ Fireball.html --height 1200

Got this output (because I had run it a few times before so the filename had to change):

Screenshot of 'file:/tmp/Daring Fireball.html' written to 'tmp-DaringFireball-html.2.png'

Result:

[Screenshot: tmp-DaringFireball-html 2]

@simonw simonw closed this as completed in df4498c Mar 13, 2022
simonw added a commit that referenced this issue Mar 13, 2022
@pjamargh
Author

pjamargh commented Mar 13, 2022 via email


2 participants