Failing to record (or replay) http://datagenetics.com/blog/july12019/index.html #195

Closed
Binarus opened this issue Nov 8, 2023 · 2 comments


Binarus commented Nov 8, 2023

First of all, thank you very much for archiveweb.page and replayweb.page! They are the only tools I have found so far that allow accurate archiving and viewing of nearly every web page I have encountered.

However, I'd like to report that I couldn't record (archive) the following page: http://datagenetics.com/blog/july12019/index.html

Actually, I don't know whether the page is recorded incorrectly or whether the problem occurs when replaying it. Since I am not very familiar with web technologies, I may be doing something wrong.

I am using the Webrecorder ArchiveWeb.page 0.11.3 extension in Chrome on Windows 10 Enterprise x64 and have tried recording with and without autopilot. I have enabled and disabled the animations on the page and have clicked around everywhere while recording. However, when replaying the page, some of the graphics and buttons are missing.

It would be very nice if this could be fixed. I am willing to assist as far as I am able to.

Best regards,

Binarus

ikreymer added a commit to webrecorder/wombat that referenced this issue Jun 17, 2024
- use Reflect.set / Reflect.get for doc proxy overrides (fixes document.head assign error on http://datagenetics.com/blog/july12019/index.html, webrecorder/archiveweb.page#195)
- better override for window.frames - resolves to window, so override numeric property on 'window' to fetch frames and add wombat override
- document.querySelector() override, rewrite 'src^=https?', 'href^=https?' query matches to instead be 'src*=' / 'href*=' to match rewritten URLs (fixes shorthand.com sites, eg: https://www.cambridge.org/news-and-insights/five-books-on-climate-action-for-cop28, webrecorder/replayweb.page#272)
- uses wombat.URL instead of window.URL to avoid infinite loop when window.URL is replaced with custom function (fixes webrecorder/replayweb.page#330)
- for document.write override, still assign even in SW to avoid synchronous code failing when expecting written HTML to be there, then still load from blob URL (fixes webrecorder/replayweb.page#332)
- don't intercept assign to '.search' / '.port', as those aren't being rewritten
- check if 'WB_wombat_' is in string before doing regex rewrite, speed up rewrite of very large string
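
To illustrate the first item in the list above, here is a minimal sketch of a proxy whose traps go through `Reflect.get` / `Reflect.set`. It is not wombat's actual code: `FakeDocument` and `makeDocProxy` are hypothetical names, and the getter-only `head` property only imitates how `document.head` behaves. The point it tries to show is that `Reflect.set` reports a failed assignment by returning `false`, whereas a plain `obj[prop] = value` inside strict-mode code would throw a `TypeError` for a property that has a getter but no setter.

```typescript
// Hypothetical stand-in for a document-like object:
// `head` has a getter but no setter, `title` is an ordinary writable property.
class FakeDocument {
  private _head = "<head>";
  title = "original";

  get head(): string {
    return this._head;
  }
}

// Hypothetical helper: wraps a target in a proxy that forwards reads and
// writes to the real object via Reflect.
function makeDocProxy<T extends object>(target: T): T {
  return new Proxy(target, {
    get(obj, prop) {
      // Pass the real object as receiver so accessors run against it,
      // not against the proxy.
      const value = Reflect.get(obj, prop, obj);
      // Bind methods to the real object for the same reason.
      return typeof value === "function" ? value.bind(obj) : value;
    },
    set(obj, prop, value) {
      // Returns true on success and false on failure (for example, a
      // getter-only property) instead of throwing inside the trap.
      return Reflect.set(obj, prop, value, obj);
    },
  });
}

const doc = makeDocProxy(new FakeDocument());
const titleAssigned = Reflect.set(doc, "title", "changed"); // writable property
const headAssigned = Reflect.set(doc, "head", "replaced");  // getter-only property
```

In this sketch the failed write to `head` is reported as a boolean and the original value stays in place. Whether the caller then sees an exception depends on whether the calling code runs in strict mode.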
ikreymer added a commit to webrecorder/wombat that referenced this issue Jun 17, 2024
…nt.head assign error on http://datagenetics.com/blog/july12019/index.html, webrecorder/archiveweb.page#195)

- better override for window.frames using Proxy object - resolves to window, so override numeric property on 'window' to fetch frames and add wombat override, remove overrideFramesAccess() function
- check if 'WB_wombat_' is in string before doing regex rewrite, speed up rewrite of very large string (fix potentially slow replay on https://www.berlingske.dk/)
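
The last item above describes a cheap substring check before running a regex over a potentially very large string. A rough sketch of that pattern follows. The function name `stripMarker` and the choice to remove the marker are assumptions made for illustration; only the `WB_wombat_` marker string comes from the commit message.

```typescript
const MARKER = "WB_wombat_";
const MARKER_RX = /WB_wombat_/g;

// Hypothetical helper: removes every occurrence of the marker from `text`.
function stripMarker(text: string): string {
  // A plain substring scan is cheap; when the marker is absent, return the
  // input untouched and skip the regex replacement entirely.
  if (text.indexOf(MARKER) === -1) {
    return text;
  }
  return text.replace(MARKER_RX, "");
}

const untouched = stripMarker("location.href");
const rewritten = stripMarker("a.WB_wombat_top.WB_wombat_location");
```

For strings that do not contain the marker, which is likely the common case for large scripts, the function does no regex work at all.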
ikreymer added a commit to webrecorder/wombat that referenced this issue Jun 17, 2024
…nt.head assign error on http://datagenetics.com/blog/july12019/index.html, webrecorder/archiveweb.page#195) (#148)

- better override for window.frames using Proxy object - resolves to window, so override numeric property on 'window' to fetch frames and add wombat override, remove overrideFramesAccess() function
- check if 'WB_wombat_' is in string before doing regex rewrite, speed up rewrite of very large string (fix potentially slow replay on https://www.berlingske.dk/)
ikreymer (Member) commented

Replay of this page is fixed in replayweb.page 2.0.2, and the fix will be included in the next update of archiveweb.page.

ikreymer (Member) commented Jul 8, 2024

This should now work!

ikreymer closed this as completed Jul 8, 2024