default WAYBACK_BASEURL may be incorrect #31
Comments
Looks like the thumbnails may need VNC, and I hadn't installed that.
…ozzler dashboard (#31); tweak arg parsing related stuff
Thanks for the report. c3b637d should fix the WAYBACK_BASEURL mismatch.
I still need to code up some pywb support for thumbnail and screenshot urls. I'll leave this issue open to track that.
The vnc thing is for watching the brozzler-controlled browsers in action. You can kinda see how to set that up if you look at ansible/roles/brozzler-worker. It almost, but not quite, works out of the box with the vagrant setup (iirc because the vagrant vm's idea of its hostname is not resolvable from outside). In any case, it's not related to the archived thumbnails. (n.b. 8091 vs 8901)
I eventually found code in brozzler where it looks like you need to enable warcprox features to get screenshots, but I was archiving a big site and didn't get a chance to see if flipping that switch made it work.
…On January 21, 2017 12:04:24 AM PST, Noah Levitt ***@***.***> wrote:
Thanks for the report.
c3b637d
should fix the WAYBACK_BASEURL mismatch.
> Doing that allowed the wayback links to work, but the thumbnail &
screenshot urls are still 404ing.
I still need to code up some pywb support for thumbnail and screenshot
urls. I'll leave this issue open to track that.
> Looks like the thumbnails may need VNC, and I hadn't installed that.
The vnc thing is for watching the brozzler-controlled browsers in
action. You can kinda see how to set that up if you look at
ansible/roles/brozzler-worker. It _almost_ but doesn't work out of the
box with the vagrant setup (iirc because the vagrant vm's idea of its
hostname is not resolvable from outside). In any case, it's not related
to the archived thumbnails. (n.b. 8091 vs 8901)
Oh, yeah, that too. But even if you get the screenshots I don't think replay will work.
* master:
  restore ping_timeout argument to WebSocketApp.run_forever to fix problem of leaking websocket receiver threads hanging forever on select()
  missed a spot
  improve brozzler-dashboard logging; fix default wayback baseurl in brozzler dashboard (#31); tweak arg parsing related stuff
  avoid js errors in case site or job is not configured to keep stats
  add travis-ci slack notification to internetarchive/brozzler channel
Added support for screenshot: and thumbnail: urls.
I installed brozzler via pip and launched it with brozzler-easy in a Debian Jessie VM and was able to scrape a site. (Brozzler 1.1b8, pywb 0.33.0, python3.4)
However, the default page links in the dashboard on the job detail page were pointing to http://localhost:8091/brozzler/ and, as far as I can tell, nothing was started by default that listens on port 8091.
After some investigation I found there was something listening on port 8880 that looked like a wayback process, so I tried launching brozzler-easy like this:
WAYBACK_BASEURL=http://192.168.122.152:8880/brozzler brozzler-easy -d warc/ --dashboard-address 0.0.0.0
(The IP addresses are there so I could use my regular browser, instead of the VM's browser, to view the site.)
Doing that allowed the wayback links to work, but the thumbnail & screenshot urls are still 404ing.
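In case it helps anyone reproduce this, the port check described above can be sketched in Python. This is only a minimal sketch: the `is_listening` helper is a made-up name for illustration, not part of brozzler or pywb, and `localhost` is an assumption (swap in the VM's address if checking from outside).

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8091 is the port the dashboard links pointed at; 8880 is where a
    # wayback-looking process turned out to be listening in this report.
    # The host name here is an assumption.
    for port in (8091, 8880):
        state = "listening" if is_listening("localhost", port) else "nothing listening"
        print(f"localhost:{port}: {state}")
```

This only tells you whether something accepts connections on the port, not whether it is actually a wayback process.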