internet-archiving

Here are 11 public repositories matching this topic...

ArchiveBox / ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Updated Jun 21, 2024
Python

Own-Data-Privateer / pwebarc

A suite of tools for mirroring and hoarding web pages you visit for later offline viewing, i.e. your own personal Wayback Machine. It can also archive HTTP POST requests and responses, as well as most other HTTP-level data, and follows an "archive everything now, figure out what to do with it later" philosophy.

backups internet self-hosted archive web-archiving wayback-machine internet-archiving

Updated Jun 18, 2024
Python

TheLovinator1 / FeedVault.se

FeedVault is an open-source web application that allows users to archive and search their favorite web feeds.

rss backup archive internet-archive atom-feed rss-aggregator wayback-machine internet-archiving archivebox rss-archive feed-archive

Updated Jun 5, 2024
Python

ArchiveBox / debian-archivebox

Home of the official apt/deb package for Ubuntu/Debian-based systems.

package debian apt ubuntu web-archiving aptitude digipres internet-archiving archivebox stdeb

Updated May 20, 2024
Python

akamhy / waybackpy

Wayback Machine API interface & a command-line tool

osint internet-archive web-archiving wayback-machine webarchiving cdx-api internet-archiving savepagenow archive-webpage archive-webpages wayback-machine-api wayback-machine-python

Updated Feb 26, 2024
Python

ArchiveBox / archivebox-proxy

Official ArchiveBox MITM proxy: saves URLs of all requests passing through to an ArchiveBox server for archival.

proxy https-proxy web-archiving web-proxy digital-preservation mitmproxy digipres internet-archiving archivebox

Updated Jan 23, 2024
Python

Fooftilly / RSS_archiver

Download and archive RSS feeds to the Wayback Machine. Save a list of archived feeds in a local db.

rss archive internet-archive rss-feed archiver wayback-machine webarchive link-archiver internet-archiving rss-archive link-archive

Updated Oct 19, 2023
Python

mikwielgus / forum-dl

Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC.

python scraper forum discourse phpbb warc data-fetching simplemachines internet-archiving

Updated Sep 19, 2023
Python

itsliamdowd / WaybackBrowserWindows

Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻

Updated Jun 14, 2022
Python

httpreserve / conventoarchiver

Repository for collecting scripts to help capture MyConvento newsroom press-releases from the MyConvento PR management suite. The README provides an analysis of the MyConvento URL architecture for users hoping to develop a solution for themselves.

internet-archive web-archiving digipres webarchives internet-archiving press-releases myconvento pr-newsroom my-convento

Updated Jan 5, 2022
Python

Quoorex / archive-file-urls

Submit URLs listed inside a file to website archival services

archiving internet-archive internet-archiving

Updated Aug 26, 2021
Python
