Readme of climate-mirror/datasets needs updating #278
Comments
Feel free to submit a PR. 👍
A torrent RSS feed managed by the maintainers would be nice. It would allow automatic seeding, which would help spread the data across more hosts.
What wget commands are you all using?
@baobrien, here is an example for HTTP crawls:
wget -N -m /*
Some notes on wget options:
-r/--recursive: Turns on recursive retrieval. The maximum number of levels defaults to 5!
-l/--level: Specifies how deep the recursion should go. You can specify "inf" for infinite recursion.
-N/--timestamping: For files previously downloaded, downloads them again if the remote file timestamp has changed.
-m/--mirror: Equivalent to -r -N -l inf --no-remove-listing. (The last option keeps .listing files, which contain the raw directory listings from the FTP server.)
-c/--continue: Treats each previously downloaded file as possibly incomplete and requests any data past the end of the local file (if the server supports it). This is good for resuming the download of a single large file where the download was interrupted, as long as the file has not changed. For files that have changed, unless the changes were only appended to the end, this option can produce a corrupted file by combining the first part of a file from a previous download with a remainder that doesn't match it.
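To make the notes above concrete, here is a rough dry-run sketch that only prints the command lines rather than running them. The example.org URLs, the file name, and the function names are placeholders I made up for illustration (the thread doesn't give the real URLs), so swap in whatever you are actually mirroring.

```shell
#!/bin/sh
# Dry-run sketch: prints wget command lines instead of executing them.
# All URLs and function names here are hypothetical placeholders.

# Mirror a directory tree. -m is shorthand for -r -N -l inf --no-remove-listing,
# so adding -N alongside -m is redundant.
mirror_cmd() {
    printf 'wget -m %s\n' "$1"
}

# The same crawl with the options spelled out, per the notes above.
mirror_cmd_expanded() {
    printf 'wget -r -N -l inf --no-remove-listing %s\n' "$1"
}

# Resume one interrupted large file. Only safe if the remote file is unchanged.
resume_cmd() {
    printf 'wget -c %s\n' "$1"
}

mirror_cmd 'https://example.org/data/'
mirror_cmd_expanded 'https://example.org/data/'
resume_cmd 'https://example.org/data/big-file.nc'
```

Going by the option notes, the first two printed commands should behave the same. I'd avoid mixing -c into a mirror run, since timestamping and resuming can disagree about files that changed on the server.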
Agreed with mxplusb: we'll definitely act on a pull request that improves any of the documentation, including one on specific commands to run and tools to use. It'd probably be best to make a new markdown file and reference it in the main readme as a table of contents, but whatever you submit that improves it is welcome.
https://github.com/climate-mirror/datasets is missing information about the wget commands we are using and needs updating in general.