Error when attempting to download anything #190

Open
Fahrenheit opened this issue May 30, 2021 · 4 comments · May be fixed by #191

Comments

Fahrenheit commented May 30, 2021

The script was working fine for months. I get exactly this output when it tries to download anything from a website I need to archive part of. I have no idea how to fix it. I've tried reinstalling macOS, reinstalling Ruby, updating Ruby, and removing and reinstalling wayback-machine-downloader. No luck, so any help would be extremely appreciated. I need this tool working for a project.

This is the command I used: wayback_machine_downloader -sa 'igui.ru' --only "/\.(pxl|deb|ipa|rar|zip|7z|dmg|exe)$/i". It works fine until I start attempting to download the file types specified in the regex. This exact command worked perfectly for weeks on other sites.

#<Thread:0x00007fa69c9ea7d8@/Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.2.1/lib/wayback_machine_downloader.rb:209 run> terminated with exception (report_on_exception is true):
Traceback (most recent call last):
	1: from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.2.1/lib/wayback_machine_downloader.rb:212:in `block (2 levels) in download_files'
/Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.2.1/lib/wayback_machine_downloader.rb:251:in `download_file': undefined method `split' for nil:NilClass (NoMethodError)
Traceback (most recent call last):
	1: from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.2.1/lib/wayback_machine_downloader.rb:212:in `block (2 levels) in download_files'
/Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.2.1/lib/wayback_machine_downloader.rb:251:in `download_file': undefined method `split' for nil:NilClass (NoMethodError)
Contributor

pabs3 commented Jun 7, 2021

The workaround for this appears to be to change if file_id.nil? to if file_id_and_timestamp.nil? in the get_file_list_all_timestamps function in lib/wayback_machine_downloader.rb. I'll submit a pull request for this.
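
To illustrate why the guard has to test file_id_and_timestamp, here is a minimal sketch of that kind of loop. It is not the gem's actual code: sanitize_name and curate are hypothetical names, and the assumption is that the name-cleaning step can yield nil for a name that is not valid UTF-8 even though file_id itself is not nil.

```ruby
# Hypothetical stand-in for the gem's name-cleaning step: returns nil when the
# name is not valid in its tagged encoding. The real helper may behave differently.
def sanitize_name(name)
  name.valid_encoding? ? name : nil
end

# Simplified sketch of the curation loop (not the gem's actual code).
# `snapshots` is an array of [timestamp, file_id] pairs.
def curate(snapshots)
  file_list_curated = {}
  snapshots.each do |file_timestamp, file_id|
    file_id_and_timestamp = sanitize_name("#{file_timestamp}/#{file_id}")
    # Check the value that is used as the hash key, not file_id.
    if file_id_and_timestamp.nil?
      warn "Malformed file name, ignoring: #{file_id.inspect}"
    else
      file_list_curated[file_id_and_timestamp] = { file_id: file_id, timestamp: file_timestamp }
    end
  end
  file_list_curated
end

snapshots = [
  ["20210530000000", "files/app.zip"],
  ["20210530000000", "files/bad\xFFname.zip"] # contains an invalid UTF-8 byte
]
curated = curate(snapshots)
```

With a file_id.nil? check instead, the second entry would be stored under a nil key in this sketch, and any later code that calls a string method on that key would fail in the same way as the reported NoMethodError.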

A more correct solution would be to use bytes instead of UTF-8 for filenames, since at least on Linux, filenames are bytes not UTF-8.

On Windows you might need to detect the file name encoding and then convert file names to UTF-16 instead. The CharlockHolmes and rchardet projects can detect the encoding, and calling .encode("UTF-16", encoding) converts a string from that encoding to UTF-16.
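
A rough sketch of both ideas, using only core Ruby. The detection step is not shown: the detected value is an assumption standing in for whatever a detector such as CharlockHolmes or rchardet would report, and the file name is made up. UTF-16LE is used here on the assumption that the target is the Windows wide-character API; plain "UTF-16" also works with encode but prepends a byte order mark.

```ruby
# Bytes as they might arrive from a server, tagged as binary (no text encoding).
raw_name = "caf\xE9.zip".b

# Assumption: an encoding detector reported this for raw_name.
detected = "ISO-8859-1"

# Linux-style handling: keep the name as raw bytes and never transcode it.
byte_name = raw_name

# Windows-style handling: transcode from the detected encoding to UTF-16LE.
utf16_name = raw_name.encode("UTF-16LE", detected)

# Converting back to UTF-8 is only needed for display or logging.
display_name = utf16_name.encode("UTF-8")
```

Keeping the bytes untouched avoids guessing at all, which is why it is the safer default wherever the file system accepts arbitrary byte strings.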

pabs3 added a commit to pabs3/wayback-machine-downloader that referenced this issue Jun 7, 2021
…stamps

file_list_curated array uses file_id_and_timestamp as the index not file_id.

Fixes: hartator#190
Contributor

pabs3 commented Jun 9, 2021

@Fahrenheit This issue isn't fixed yet so it shouldn't have been closed.

Author

Fahrenheit commented Jun 13, 2021

Sorry, I assumed you were going to fix it afterwards. My bad!

Fahrenheit reopened this Jun 13, 2021
Contributor

pabs3 commented Jun 13, 2021 via email

pabs3 added a commit to pabs3/wayback-machine-downloader that referenced this issue Sep 4, 2021
pabs3 added a commit to pabs3/wayback-machine-downloader that referenced this issue Oct 31, 2022