Add a feature to scrape from archive sites. Using that flag would detect archive.today (there are a few backup domains people use, so don't hardcode the domain) and, if it finds it, edit the HTML and remove the divs that contain the archive's scraper markup, leaving behind just the site contents. I did this manually and I'm sure it could be automated. For archive.org, you can parse out an HTML field on the page that contains a link to the original webpage, just as it was before archive.org wrapped it.
Also, add another flag to disable converting links on the page when this archive option is on. Converting links could work by looking for a second https:// or http:// after the start of the link.
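To illustrate the "second https:// or http://" idea, here is a minimal sketch in Python (not monolith's own code). The function name `unwrap_archived_link` is hypothetical, and the sample URL shapes in the comments are only assumptions about how archive links tend to look.

```python
import re

# Matches the scheme that starts a URL.
SCHEME = re.compile(r"https?://")


def unwrap_archived_link(href: str) -> str:
    """Return the original URL embedded in an archive link.

    If no second scheme is found, the link is returned unchanged.
    """
    # Skip the link's own leading scheme, if it has one.
    first = SCHEME.match(href)
    start = first.end() if first else 0

    # Look for another scheme later in the link.
    second = SCHEME.search(href, start)
    if second is None:
        return href  # not an archive-wrapped link

    # Everything from the second scheme onward is the original URL.
    return href[second.start():]


# Assumed example shape: <archive host>/<snapshot id>/<original URL>
print(unwrap_archived_link(
    "https://web.archive.org/web/20200101000000/https://example.com/page"
))
```

One caveat with this approach: an ordinary link that carries a URL in its query string (a redirect link, for example) would also be unwrapped, so it probably only makes sense to apply it when the archive flag is on and the page was detected as an archive page.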
You could support other archive sites with this feature, but I only know of these two. I did this manually with a site I archived using monolith, and I haven't seen any tool for converting archive.org or archive.today pages back into their original format.