[BUG] Downloader appears to hang with tiktok url #724

Closed
3 tasks done
fergie4000 opened this issue Dec 19, 2022 · 5 comments · Fixed by #728

@fergie4000

  • I am reporting a bug.
  • I am running the latest version of BDfR.
  • I have read the "Opening an issue" guide.

Description

The downloader hangs consistently with this submission ID (NSFW). It ran for 3+ hours the first time before I noticed, and it hangs every time I run it.

Command

python3 -m bdfr clone /mnt/r/reddit/ --folder-scheme {REDDITOR}/{SUBREDDIT} --link wttmgs -v -v

Environment (please complete the following information)

  • OS: Ubuntu 20.04.5 LTS
  • Python version: Python 3.10.9

Logs

[2022-12-19 21:10:42,425 - bdfr.connector - DEBUG] - Setting maximum download wait time to 120 seconds
[2022-12-19 21:10:42,425 - bdfr.connector - DEBUG] - Setting datetime format string to ISO
[2022-12-19 21:10:42,427 - bdfr.connector - DEBUG] - Disabling the following modules:
[2022-12-19 21:10:42,427 - bdfr.connector - Level 9] - Created download filter
[2022-12-19 21:10:42,427 - bdfr.connector - Level 9] - Created time filter
[2022-12-19 21:10:42,427 - bdfr.connector - Level 9] - Created sort filter
[2022-12-19 21:10:42,427 - bdfr.connector - Level 9] - Create file name formatter
[2022-12-19 21:10:42,427 - bdfr.connector - DEBUG] - Using unauthenticated Reddit instance
[2022-12-19 21:10:42,429 - bdfr.connector - Level 9] - Created site authenticator
[2022-12-19 21:10:42,429 - bdfr.connector - Level 9] - Retrieved subreddits
[2022-12-19 21:10:42,429 - bdfr.connector - Level 9] - Retrieved multireddits
[2022-12-19 21:10:42,429 - bdfr.connector - Level 9] - Retrieved user data
[2022-12-19 21:10:42,430 - bdfr.connector - Level 9] - Retrieved submissions for given links
[2022-12-19 21:10:43,067 - bdfr.downloader - DEBUG] - Attempting to download submission wttmgs
[2022-12-19 21:10:43,068 - bdfr.downloader - DEBUG] - Using Direct with url https://www.tiktok.com/@keriberry.420?_t=8V0q4wrW0Gw&_r=1
@fergie4000 fergie4000 added the bug Something isn't working label Dec 19, 2022
@Serene-Arc Serene-Arc self-assigned this Dec 19, 2022
@OMEGARAZER
Contributor

OMEGARAZER commented Dec 20, 2022

Just as an FYI, that looks like a link to a full profile rather than a specific video, so I'm unsure how best to handle something like that.

But the main issue is that it seems to get picked up by the direct downloader because of the .420 in the username. I think I know what will fix it, but I will need to test it.
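To illustrate how a profile URL can be mistaken for a direct file link, here is a minimal sketch. The pattern and the function name `looks_like_direct_file` are assumptions made for illustration and are not copied from BDfR; the sketch only assumes that "direct" detection means "the path ends in a dot followed by 3 to 4 word characters".

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern (an assumption, not BDfR's actual regex):
# a dot followed by 3-4 word characters at the end of the path
# is treated as a file extension.
LOOSE_EXTENSION = re.compile(r".*/.*\.\w{3,4}$")


def looks_like_direct_file(url: str) -> bool:
    """Return True if the URL path appears to end in a file extension."""
    path = urlsplit(url).path  # drops the query string (?_t=...&_r=1)
    return LOOSE_EXTENSION.match(path) is not None


profile_url = "https://www.tiktok.com/@keriberry.420?_t=8V0q4wrW0Gw&_r=1"

# ".420" consists of three word characters, so it passes as an "extension".
print(looks_like_direct_file(profile_url))                         # True
print(looks_like_direct_file("https://example.com/video.mp4"))     # True
print(looks_like_direct_file("https://www.tiktok.com/@someuser"))  # False
```

Because `\w` also matches digits, a username ending in `.420` is indistinguishable from a real extension under this kind of check.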

@Serene-Arc
Owner

Yeah, there's no way to download that, but it definitely shouldn't hang.

OMEGARAZER referenced this issue in OMEGARAZER/bulk-downloader-for-reddit-x Dec 20, 2022
Attempt to fix #724

Narrows down the characters allowed in extensions in the regex. Apart from the digits 3 and 4, the only extensions I can think of that this doesn't match are bz2 and 7z (which wasn't caught before).
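One possible reading of this commit message, as a sketch: the extension may contain only ASCII letters plus the digits 3 and 4. The character class below is an assumption based on that reading, not the committed regex, and `has_file_extension` is a hypothetical helper name.

```python
import re

# Assumed character class for illustration only: ASCII letters plus the
# digits 3 and 4, with an extension length of 3 to 4 characters.
NARROW_EXTENSION = re.compile(r".*/.*\.[a-zA-Z34]{3,4}$")


def has_file_extension(path: str) -> bool:
    """Return True if the URL path ends in something that looks like an extension."""
    return NARROW_EXTENSION.match(path) is not None


paths = [
    "/@keriberry.420",  # username, contains the digits 2 and 0
    "/clip.mp4",
    "/song.mp3",
    "/image.jpeg",
    "/archive.bz2",     # contains the digit 2
    "/archive.7z",      # only two characters, and contains the digit 7
]
for path in paths:
    print(f"{path}: {has_file_extension(path)}")
```

Under this assumed pattern the `.420` username no longer counts as an extension, while `bz2` and `7z` are the kind of real extensions that fall outside it, as the message notes.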
Serene-Arc added a commit that referenced this issue Dec 20, 2022
@fergie4000
Author

Doesn't appear to be fixed on my end.

(venv) user@DESKTOP:~$python3 -m pip install git+https://github.com/aliparlakci/bulk-downloader-for-reddit.git@development
Collecting git+https://github.com/aliparlakci/bulk-downloader-for-reddit.git@development
  Cloning https://github.com/aliparlakci/bulk-downloader-for-reddit.git (to revision development) to /tmp/pip-req-build-mrh_7dq2
  Running command git clone --filter=blob:none --quiet https://github.com/aliparlakci/bulk-downloader-for-reddit.git /tmp/pip-req-build-mrh_7dq2
  Resolved https://github.com/aliparlakci/bulk-downloader-for-reddit.git to commit c63a8842d9ab5fd474645c62593c4460837a7f15
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: yt-dlp>=2022.11.11 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (2022.11.11)
Requirement already satisfied: pyyaml>=5.4.1 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (6.0)
Requirement already satisfied: appdirs>=1.4.4 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (1.4.4)
Requirement already satisfied: requests>=2.25.1 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (2.27.1)
Requirement already satisfied: dict2xml>=1.7.0 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (1.7.0)
Requirement already satisfied: praw>=7.2.0 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (7.6.1)
Requirement already satisfied: beautifulsoup4>=4.10.0 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (4.10.0)
Requirement already satisfied: click>=8.0.0 in ./python/reddit/venv/lib/python3.10/site-packages (from bdfr==2.6.2) (8.0.3)
Requirement already satisfied: soupsieve>1.2 in ./python/reddit/venv/lib/python3.10/site-packages (from beautifulsoup4>=4.10.0->bdfr==2.6.2) (2.3.1)
Requirement already satisfied: prawcore<3,>=2.1 in ./python/reddit/venv/lib/python3.10/site-packages (from praw>=7.2.0->bdfr==2.6.2) (2.3.0)
Requirement already satisfied: websocket-client>=0.54.0 in ./python/reddit/venv/lib/python3.10/site-packages (from praw>=7.2.0->bdfr==2.6.2) (1.2.3)
Requirement already satisfied: update-checker>=0.18 in ./python/reddit/venv/lib/python3.10/site-packages (from praw>=7.2.0->bdfr==2.6.2) (0.18.0)
Requirement already satisfied: idna<4,>=2.5 in ./python/reddit/venv/lib/python3.10/site-packages (from requests>=2.25.1->bdfr==2.6.2) (3.3)
Requirement already satisfied: certifi>=2017.4.17 in ./python/reddit/venv/lib/python3.10/site-packages (from requests>=2.25.1->bdfr==2.6.2) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in ./python/reddit/venv/lib/python3.10/site-packages (from requests>=2.25.1->bdfr==2.6.2) (2.0.10)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./python/reddit/venv/lib/python3.10/site-packages (from requests>=2.25.1->bdfr==2.6.2) (1.26.8)
Requirement already satisfied: mutagen in ./python/reddit/venv/lib/python3.10/site-packages (from yt-dlp>=2022.11.11->bdfr==2.6.2) (1.45.1)
Requirement already satisfied: pycryptodomex in ./python/reddit/venv/lib/python3.10/site-packages (from yt-dlp>=2022.11.11->bdfr==2.6.2) (3.13.0)
Requirement already satisfied: brotli in ./python/reddit/venv/lib/python3.10/site-packages (from yt-dlp>=2022.11.11->bdfr==2.6.2) (1.0.9)
Requirement already satisfied: websockets in ./python/reddit/venv/lib/python3.10/site-packages (from yt-dlp>=2022.11.11->bdfr==2.6.2) (10.1)
(venv) user@DESKTOP:~$
(venv) user@DESKTOP:~$python3 -m bdfr clone /mnt/r/reddit/ --folder-scheme {REDDITOR}/{SUBREDDIT} --link wttmgs -v -v
[2022-12-22 21:06:49,258 - bdfr.connector - DEBUG] - Loading configuration from /home/user/.config/bdfr/default_config.cfg
[2022-12-22 21:06:49,260 - bdfr.connector - DEBUG] - Setting maximum download wait time to 120 seconds
[2022-12-22 21:06:49,260 - bdfr.connector - DEBUG] - Setting datetime format string to ISO
[2022-12-22 21:06:49,263 - bdfr.connector - DEBUG] - Disabling the following modules:
[2022-12-22 21:06:49,263 - bdfr.connector - Level 9] - Created download filter
[2022-12-22 21:06:49,263 - bdfr.connector - Level 9] - Created time filter
[2022-12-22 21:06:49,264 - bdfr.connector - Level 9] - Created sort filter
[2022-12-22 21:06:49,264 - bdfr.connector - Level 9] - Create file name formatter
[2022-12-22 21:06:49,265 - bdfr.connector - DEBUG] - Using unauthenticated Reddit instance
[2022-12-22 21:06:49,266 - bdfr.connector - Level 9] - Created site authenticator
[2022-12-22 21:06:49,266 - bdfr.connector - Level 9] - Retrieved subreddits
[2022-12-22 21:06:49,266 - bdfr.connector - Level 9] - Retrieved multireddits
[2022-12-22 21:06:49,267 - bdfr.connector - Level 9] - Retrieved user data
[2022-12-22 21:06:49,267 - bdfr.connector - Level 9] - Retrieved submissions for given links
[2022-12-22 21:06:50,330 - bdfr.downloader - DEBUG] - Attempting to download submission wttmgs
[2022-12-22 21:06:50,331 - bdfr.downloader - DEBUG] - Using Direct with url https://www.tiktok.com/@keriberry.420?_t=8V0q4wrW0Gw&_r=1
^C
Aborted!

@Serene-Arc Serene-Arc reopened this Dec 22, 2022
@OMEGARAZER
Contributor

$  bdfr clone test/ --folder-scheme {REDDITOR}/{SUBREDDIT} --link wttmgs -vvv
[2022-12-22 20:48:16,941 - bdfr.connector - DEBUG] - Loading configuration from /home/omegarazer/.config/bdfr/config.cfg
[2022-12-22 20:48:16,941 - bdfr.connector - DEBUG] - Setting maximum download wait time to 120 seconds
[2022-12-22 20:48:16,941 - bdfr.connector - DEBUG] - Setting datetime format string to ISO
[2022-12-22 20:48:16,942 - bdfr.connector - DEBUG] - Disabling the following modules:
[2022-12-22 20:48:16,942 - bdfr.connector - Level 9] - Created download filter
[2022-12-22 20:48:16,942 - bdfr.connector - Level 9] - Created time filter
[2022-12-22 20:48:16,942 - bdfr.connector - Level 9] - Created sort filter
[2022-12-22 20:48:16,942 - bdfr.connector - Level 9] - Create file name formatter
[2022-12-22 20:48:16,944 - bdfr.connector - DEBUG] - Using unauthenticated Reddit instance
[2022-12-22 20:48:16,945 - bdfr.connector - Level 9] - Created site authenticator
[2022-12-22 20:48:16,945 - bdfr.connector - Level 9] - Retrieved subreddits
[2022-12-22 20:48:16,945 - bdfr.connector - Level 9] - Retrieved multireddits
[2022-12-22 20:48:16,945 - bdfr.connector - Level 9] - Retrieved user data
[2022-12-22 20:48:16,945 - bdfr.connector - Level 9] - Retrieved submissions for given links
[2022-12-22 20:48:17,319 - bdfr.downloader - DEBUG] - Attempting to download submission wttmgs
[2022-12-22 20:48:38,051 - bdfr.downloader - ERROR] - Could not download submission wttmgs: No downloader module exists for url https://www.tiktok.com/@keriberry.420?_t=8V0q4wrW0Gw&_r=1
[2022-12-22 20:48:38,051 - bdfr.archive_entry.submission_archive_entry - DEBUG] - Retrieving full comment tree for submission wttmgs
[2022-12-22 20:48:38,132 - bdfr.archiver - DEBUG] - Writing entry wttmgs to file in JSON format at /home/omegarazer/Reddit/keriberry_420/keriberry_420/keriberry_420_At 1k I can go live 🖤_wttmgs.json
[2022-12-22 20:48:38,133 - bdfr.archiver - INFO] - Record for entry item wttmgs written to disk
[2022-12-22 20:48:38,133 - root - INFO] - Program complete

Can you double check that the download_factory you have has the updates? It appears to be working as expected for me. Maybe a bytecompiled cache of the old version?

@fergie4000
Author

Maybe a bytecompiled cache of the old version?

That seems to be what it was. download_factory had the updates. Cleared __pycache__ and ran it again and it works fine. Sorry about that.
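For anyone hitting the same stale-cache symptom, clearing `__pycache__` can be scripted roughly as follows. This is a minimal sketch: `clear_pycache` is a hypothetical helper, and the directory you pass in (for example the installed `bdfr` package inside your virtual environment) depends on your own setup.

```python
import shutil
from pathlib import Path


def clear_pycache(root: Path) -> list[Path]:
    """Delete every __pycache__ directory under root and return the removed paths."""
    removed = []
    # Collect the matches first so nothing is deleted while the tree is being walked.
    for cache_dir in sorted(root.rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed.append(cache_dir)
    return removed


# Example call (the path is an assumption; adjust it to your environment):
# clear_pycache(Path("venv/lib/python3.10/site-packages/bdfr"))
```

Python regenerates the byte-compiled files on the next import, so removing these directories is safe; the first run afterwards is only slightly slower.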
