WWW: Archive old aboutcode.org websites

Before we publish www.aboutcode.org as a "production" website (on Dreamhost) we need to archive a copy of the current WordPress-based website. 
The objective is to download the content as a set of HTML (or similar) files that we can keep for future reference, most likely stored as an archive in this repo. We will not need to operate the website after archiving; we only need to be able to view the page content.

My Gemini-Google search surfaced two primary FOSS options to create this archive:

**Wget**
Best For: Advanced users who need a powerful command-line tool for precise mirroring.
Features: With the `--mirror` option, Wget can create a complete local copy of a site’s directory structure. It is versatile, supporting the HTTP, HTTPS, and FTP protocols.

Command Example: `wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.com`

Some relevant links are:
- https://dheinemann.com/archiving-a-website-with-wget/
- https://superuser.com/questions/1596117/how-do-you-download-an-entire-website-for-offline-viewing-with-wget
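
As a starting point, the command above could be wrapped in a small shell function. This is only a sketch under some assumptions: the function name `mirror_site`, the output directory argument, the `--wait=1` politeness delay, and the `WGET` override variable (used for dry runs) are illustrative additions, not something decided in this issue.

```shell
#!/bin/sh
# Sketch: mirror a site with Wget for offline viewing.
# mirror_site, its arguments, and the WGET override are hypothetical
# names chosen for this example.

mirror_site() {
  url="$1"      # start URL of the site to archive
  out_dir="$2"  # local directory that will receive the mirror

  mkdir -p "$out_dir"

  # Same options as the command example, plus:
  #   --wait=1            pause one second between requests (assumed courtesy setting)
  #   --directory-prefix  write all downloaded files under out_dir
  # Setting WGET=echo prints the arguments instead of downloading.
  ${WGET:-wget} --mirror --convert-links --adjust-extension \
    --page-requisites --no-parent --wait=1 \
    --directory-prefix="$out_dir" "$url"
}
```

For example, `WGET=echo mirror_site http://example.com ./site-archive` would print the Wget arguments without touching the network, and dropping `WGET=echo` would run the actual mirror. The resulting directory can then be opened in a browser to check that pages and links work offline.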

**HTTrack**
Best For: Creating a functional, offline mirror of a website with its original link structure intact.
Features: It crawls a site recursively, downloading pages, images, and scripts, and converts absolute links to relative ones for offline browsing. It is available for Windows (WinHTTrack), Linux, and Android.
Note: It is highly effective for static sites but may struggle with modern, JavaScript-heavy dynamic content.
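
If Wget turns out not to be enough, an HTTrack run might look roughly like the sketch below. The function name `mirror_with_httrack`, its arguments, and the `HTTRACK` override variable are hypothetical; the `-O` (output path), `+pattern` (URL filter), and `-v` (verbose) arguments follow the basic usage shown in HTTrack's own documentation.

```shell
#!/bin/sh
# Sketch: mirror a site with the HTTrack command-line client.
# mirror_with_httrack, its arguments, and the HTTRACK override are
# hypothetical names chosen for this example.

mirror_with_httrack() {
  url="$1"      # start URL of the site to archive
  out_dir="$2"  # local directory that will receive the mirror
  filter="$3"   # URL filter limiting the crawl, e.g. "+*.example.com/*"

  mkdir -p "$out_dir"

  # Setting HTTRACK=echo prints the arguments instead of crawling.
  ${HTTRACK:-httrack} "$url" -O "$out_dir" "$filter" -v
}
```

A dry run such as `HTTRACK=echo mirror_with_httrack http://example.com/ ./httrack-archive "+*.example.com/*"` shows the arguments that would be passed before committing to a full crawl.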

Please start with Wget and see what we can get. :-)

WWW: Archive old aboutcode.org websites #147

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

WWW: Archive old aboutcode.org websites #147

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions