Compatibility Web archives (WaybackMachines) #372
Comments
Nice.
The BNF's archives work a bit differently and would require different changes. Identified so far:
boogheta added 3 commits that referenced this issue on May 20, 2021
boogheta added a commit that referenced this issue on May 26, 2021: …in html + rewrite all links found + store archive date as meta + skip internal archive links (WIP #372)
boogheta added 4 commits that referenced this issue on May 26, 2021
Complementary ideas or todos:
boogheta added 3 commits that referenced this issue on Jun 1, 2021
boogheta added 3 commits that referenced this issue on Jun 7, 2021
boogheta pushed a commit that referenced this issue on Jun 10, 2021
boogheta added a commit that referenced this issue on Jun 10, 2021
boogheta pushed 3 commits that referenced this issue on Jun 17, 2021
boogheta added a commit that referenced this issue on Jun 18, 2021: …imestamp and trash it if outside desired range (#372)
boogheta added 2 commits that referenced this issue on Jun 22, 2021
boogheta pushed 4 commits that referenced this issue on Jun 25, 2021
boogheta added a commit that referenced this issue on Jun 25, 2021
boogheta added 4 commits that referenced this issue on Jun 28, 2021
boogheta added 3 commits that referenced this issue on Jun 30, 2021
boogheta added a commit that referenced this issue on Jul 5, 2021
boogheta added a commit that referenced this issue on Jul 6, 2021
boogheta added a commit that referenced this issue on Jul 6, 2021: …r single archives crawls in corpus not set for archives (#372)
boogheta added a commit that referenced this issue on Jul 6, 2021
boogheta added a commit that referenced this issue on Jul 6, 2021: … from front are valid and respect it (#372)
boogheta pushed a commit that referenced this issue on Jul 7, 2021
boogheta added 3 commits that referenced this issue on Jul 7, 2021
boogheta added a commit that referenced this issue on Jul 9, 2021
boogheta added a commit that referenced this issue on Jul 9, 2021: …afternoon of first and last day! (#372)
boogheta added a commit that referenced this issue on Sep 7, 2021
boogheta added a commit that referenced this issue on Sep 8, 2021
boogheta added 2 commits that referenced this issue on Sep 10, 2021
Ideas left aside:
To allow crawling the past through a web archive built on OpenWayback (such as web.archive.org), only a few changes should be required:
- archive_host_prefix (e.g. https://web.archive.org/web/) and an archive_timestamp (such as 20190319191212) around which pages should be crawled
- archive_host_prefix/archive_timestamp/
- archive_host_prefix/\d{14}/
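
For illustration, here is a minimal Python sketch of how the two URL patterns above could be applied. The helper names `to_archive_url` and `from_archive_url` are hypothetical, not taken from the project, and the sketch assumes that archived links always start with the prefix followed by exactly 14 digits and a slash, as in the pattern quoted above.

```python
import re

# Example values quoted in the issue text.
ARCHIVE_HOST_PREFIX = "https://web.archive.org/web/"
ARCHIVE_TIMESTAMP = "20190319191212"

# Assumption: an archived link starts with archive_host_prefix/\d{14}/.
ARCHIVE_LINK_RE = re.compile(
    r"^" + re.escape(ARCHIVE_HOST_PREFIX.rstrip("/")) + r"/\d{14}/"
)


def to_archive_url(url, prefix=ARCHIVE_HOST_PREFIX, timestamp=ARCHIVE_TIMESTAMP):
    # Hypothetical helper: build archive_host_prefix/archive_timestamp/<url>.
    return "%s/%s/%s" % (prefix.rstrip("/"), timestamp, url)


def from_archive_url(url):
    # Hypothetical helper: drop a leading archive prefix and 14-digit
    # timestamp; URLs that do not match are returned unchanged.
    return ARCHIVE_LINK_RE.sub("", url, count=1)


if __name__ == "__main__":
    archived = to_archive_url("http://example.com/page")
    print(archived)
    print(from_archive_url(archived))
```

Archives may serve a snapshot whose timestamp differs from the requested one, which is why the stripping pattern accepts any 14-digit timestamp rather than only the configured value.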