Closed
Labels
topic-sitespecific: Issues that are specific to archiving certain domains, which may not generalize to other domains.
type-task: Non-coding task
Description
Some large sites like YRE and KC can require the download of 2+ TB of content. That can be troublesome when my effective bandwidth cap per month is about 500 GB (0.5 TB). For sites like these, it may make sense to download them from a datacenter location rather than my usual location.
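To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes 1 TB = 1000 GB and that the entire monthly cap could be spent on the download, which is optimistic. The function name is made up for illustration.

```python
import math


def months_to_download(site_tb: float, cap_gb_per_month: float) -> int:
    """Whole months needed to pull a site of `site_tb` terabytes
    through a monthly cap of `cap_gb_per_month` gigabytes.

    Assumes 1 TB = 1000 GB and that nothing else uses the cap.
    """
    return math.ceil(site_tb * 1000 / cap_gb_per_month)


# A 2 TB site at a 500 GB/month cap
print(months_to_download(2, 500))
```

Since "2+ TB" is a lower bound, the result for 2 TB is a lower bound as well.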
Sketch of how to use Crystal in a datacenter location:
- Create EC2 instance
- Create detachable EBS volume with hopefully enough space reserved to download the target site
  - This EBS volume can be increased in size later if needed, with some time cost
- Install Crystal on EC2 instance. Set up X11 forwarding to my laptop so that I can see Crystal's UI locally.
- Launch Crystal. Start downloading site to EBS volume.
- If a pause is needed, stop the EC2 instance, retaining the EBS volume
- To view downloaded pages:
  - Ensure I can manually connect to the HTTP server hosted by Crystal on the remote EC2 instance, opening firewall ports as needed.
    - Crystal will need to run its server on 0.0.0.0 rather than 127.0.0.1
      - May need a preferences option to enable this behavior
  - Ensure I can easily view a downloaded page using the usual View button:
    - Crystal will need to generate URLs pointing to the correct remote domain
      - May need a preferences option to configure what the remote domain is
    - Crystal should not try to open a web browser on the remote server
      - May need the View button to display a clickable blue link instead of opening a web browser directly
- Initiate upload of fully downloaded site to Glacier Deep Archive, using the usual s3cmd
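
The "To view downloaded pages" items above could be sketched roughly as follows. This is only an illustration of the three possible preferences (bind address, remote domain, link instead of browser). `ServerPrefs`, `view_url`, `view_action`, the port `8000`, and the host `ec2.example.com` are all hypothetical and are not taken from Crystal's code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlunsplit


@dataclass
class ServerPrefs:
    """Hypothetical preferences; none of these names come from Crystal."""
    bind_host: str = "127.0.0.1"       # "0.0.0.0" to accept remote connections
    port: int = 8000                   # placeholder port
    public_host: Optional[str] = None  # remote domain to put in generated URLs
    open_browser: bool = True          # False when running on a remote server


def view_url(prefs: ServerPrefs, path: str) -> str:
    """Build the URL that the View button should present for an archived path."""
    # Fall back to loopback when no remote domain is configured.
    host = prefs.public_host or "127.0.0.1"
    if not path.startswith("/"):
        path = "/" + path
    return urlunsplit(("http", f"{host}:{prefs.port}", path, "", ""))


def view_action(prefs: ServerPrefs, path: str) -> Tuple[str, str]:
    """Decide what the View button does: open a browser or show a link."""
    action = "open_browser" if prefs.open_browser else "show_link"
    return action, view_url(prefs, path)


# Remote (EC2) setup: listen on all interfaces, show a clickable link
remote = ServerPrefs(
    bind_host="0.0.0.0",
    public_host="ec2.example.com",
    open_browser=False,
)
print(view_action(remote, "index.html"))
```

With the defaults, behavior stays local: the URL points at 127.0.0.1 and the action is to open a browser. Keeping the bind address separate from the public host matters because 0.0.0.0 is a listen address, not something a browser on the laptop can usefully navigate to.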