[extractor/tiktok] Fix TikTokUserIE #4996

Open, wants to merge 1 commit into master

redraskal
Contributor

@redraskal redraskal commented Sep 22, 2022

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

Fixes #3776
Resolves #1923

Explanation

This PR fixes TikTok user extraction. It fetches the user embed page, which is served without a captcha and contains basic user info plus a few videos, and takes the latest video id from it. The details of that video are then fetched to obtain a secuid (TikTok user identifier), and the video listing API is queried with that secuid. I run this API request through a headless browser (see below).
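
The flow described above can be sketched roughly as follows. This is only an illustration: `fetch_embed_video_id`, `fetch_sec_uid` and `list_video_ids` are hypothetical callables standing in for the three requests, not functions from this PR or from yt-dlp.

```python
from typing import Callable, Iterable, List


def collect_user_video_ids(
    username: str,
    fetch_embed_video_id: Callable[[str], str],
    fetch_sec_uid: Callable[[str], str],
    list_video_ids: Callable[[str], Iterable[str]],
) -> List[str]:
    """Chain the three steps: embed page -> video details -> video listing."""
    # Step 1: the embed page yields the id of the user's latest video.
    latest_video_id = fetch_embed_video_id(username)
    # Step 2: the details of that video expose the user's secuid.
    sec_uid = fetch_sec_uid(latest_video_id)
    # Step 3: the listing API is queried with the secuid.
    return list(list_video_ids(sec_uid))
```

Passing the requests in as callables keeps the sketch independent of how each request is actually made (plain HTTP or a headless browser).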

Headless browser

This method requires a headless browser. Neither PhantomJS nor Deno works properly here. I use playwright (previously pyppeteer; playwright is better supported). TikTok signature headers do not seem to affect the response and can be ignored. My guess is that TikTok fingerprints browsers based on TLS hello packets. Playwright prompts you to install browser binaries when the tiktok extractor runs; this can be done in advance by running playwright install.
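
A minimal sketch of sending a request through a headless browser with Playwright's synchronous API, assuming the playwright package and its browser binaries are installed. The function name and the choice to read the body from a plain page navigation are assumptions for illustration; the PR's actual code may work differently.

```python
def fetch_body_with_browser(url: str) -> str:
    """Load `url` in headless Chromium and return the response body as text."""
    # Imported lazily so this module still loads when playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            response = page.goto(url)
            # goto() can return None, so guard before reading the body.
            return response.text() if response is not None else ""
        finally:
            # Always close the browser so no Chromium process is left behind.
            browser.close()
```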

Fingerprinting?

It looks like the web API uses fingerprinting to block automation such as yt-dlp: web API URLs that return JSON in a browser return only whitespace when requested from Python, even though the requests are identical apart from the TLS implementation. Someone suggested JA3 fingerprinting is the cause, meaning we would have to send custom hello packets to TikTok for mitigation.

(Non-issue while using a headless browser)
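
Since the blocked response is whitespace rather than an error status, a caller has to check the body itself. A small sketch of that check, with a hypothetical helper name that is not part of yt-dlp:

```python
import json
from typing import Any, Optional


def parse_api_body(body: str) -> Optional[Any]:
    """Return the decoded JSON, or None when the body is empty or whitespace."""
    if not body.strip():
        # Whitespace-only body: treat it as a blocked request.
        return None
    return json.loads(body)
```

Returning None lets the caller decide whether to retry through a browser or give up.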

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

@redraskal
Contributor Author

Will look into tests again tomorrow, just wanted to put this out there

@redraskal
Contributor Author

redraskal commented Sep 22, 2022

I switched the first test to therock because some videos cannot be extracted even though they are available. This is a video extraction problem, unrelated to the user extraction logic.

@redraskal
Contributor Author

redraskal commented Sep 22, 2022

Some regressions with this new method:

@redraskal redraskal marked this pull request as ready for review September 22, 2022 20:22

@Jadoo4QFan Jadoo4QFan left a comment

“Heart count” should be changed to the like count.

@bahamas10

Hello! I'm interested in this change and am not having any luck running this fork on either macOS or Linux. It's entirely possible that I'm doing something wrong and just need to be pointed in the right direction. My goal is to pull/archive all of my own videos (I currently use yt-dlp to do this for my YouTube and Twitch accounts and wanted to add TikTok to the mix).

Install steps (both systems):

git clone git@github.com:redraskal/yt-dlp.git
cd yt-dlp
git checkout fix/tiktok-user
python3 -m pip install -U pyinstaller -r requirements.txt
python3 devscripts/make_lazy_extractors.py
python3 pyinst.py

Linux:

dave - void linux ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ ../yt-dlp.sh https://www.tiktok.com/@bahamas10_
[tiktok:user] bahamas10_: Downloading user embed
[tiktok:user] 7146825795997093166: Downloading video feed
[tiktok:user] Downloading signature function
[INFO] Starting Chromium download.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 109M/109M [00:01<00:00, 72.9Mb/s]
[INFO] Beginning extraction
[INFO] Chromium extracted to: /home/dave/.local/share/pyppeteer/local-chromium/588429
Executing <Task finished name='Task-1' coro=<TikTokUserIE._video_entries_api() done, defined at /home/dave/dev/yt-dlp/yt_dlp/extractor/tiktok.py:616> exception=BrowserError('Browser closed unexpectedly:\n') created at /usr/lib/python3.10/asyncio/tasks.py:636> took 33.709 seconds
ERROR: Browser closed unexpectedly:

Exception ignored in atexit callback: <function Launcher.launch.<locals>._close_process at 0x7f9cf99af540>
Traceback (most recent call last):
  File "/home/dave/.local/lib/python3.10/site-packages/pyppeteer/launcher.py", line 153, in _close_process
    self._loop.run_until_complete(self.killChrome())
  File "/usr/lib/python3.10/asyncio/base_events.py", line 621, in run_until_complete
    self._check_closed()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Exception ignored in: <coroutine object Launcher.killChrome at 0x7f9cf9008a70>
Traceback (most recent call last):
  File "/usr/lib/python3.10/warnings.py", line 506, in _warn_unawaited_coroutine
    warn(msg, category=RuntimeWarning, stacklevel=2, source=coro)
RuntimeWarning: coroutine 'Launcher.killChrome' was never awaited
Exception ignored in: <function Popen.__del__ at 0x7f9cff1b5440>
Traceback (most recent call last):
  File "/usr/lib/python3.10/subprocess.py", line 1070, in __del__
    _warn("subprocess %s is still running" % self.pid,
ResourceWarning: subprocess 28397 is still running
(1) dave - void linux ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ pgrep Chromium
(1) dave - void linux ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ ps -ef|grep -i chrom
(1) dave - void linux ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ uname -a
Linux void.rapture.com 5.18.19_1 #1 SMP PREEMPT_DYNAMIC Thu Aug 25 14:36:55 UTC 2022 x86_64 GNU/Linux

macOS:

dave - m1book darwin ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ ../yt-dlp.sh https://www.tiktok.com/@bahamas10_
[tiktok:user] bahamas10_: Downloading user embed
[tiktok:user] 7146825795997093166: Downloading video feed
[tiktok:user] Downloading signature function
[INFO] Starting Chromium download.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 86.8M/86.8M [00:02<00:00, 39.8Mb/s]
[INFO] Beginning extraction
[INFO] Chromium extracted to: /Users/dave/Library/Application Support/pyppeteer/local-chromium/588429
Executing <Task finished name='Task-1' coro=<TikTokUserIE._video_entries_api() done, defined at /Users/dave/dev/yt-dlp/yt_dlp/extractor/tiktok.py:616> exception=DeprecationWarning('remove loop argument') created at /opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/tasks.py:636> took 13.155 seconds
ERROR: remove loop argument
Exception ignored in atexit callback: <function Launcher.launch.<locals>._close_process at 0x106acf070>
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/pyppeteer/launcher.py", line 153, in _close_process
    self._loop.run_until_complete(self.killChrome())
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 621, in run_until_complete
    self._check_closed()
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Exception ignored in: <coroutine object Launcher.killChrome at 0x1075d2720>
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/warnings.py", line 506, in _warn_unawaited_coroutine
    warn(msg, category=RuntimeWarning, stacklevel=2, source=coro)
RuntimeWarning: coroutine 'Launcher.killChrome' was never awaited
Exception ignored in: <function Popen.__del__ at 0x10599a0a0>
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 1070, in __del__
    _warn("subprocess %s is still running" % self.pid,
ResourceWarning: subprocess 43993 is still running
(1) dave - m1book darwin ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ pgrep Chromium
43993
43997
43998
dave - m1book darwin ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ ps -ef|grep -i chrom
  501 43993     1   0 11:17AM ttys000    0:00.18 /Users/dave/Library/Application Support/pyppeteer/local-chromium/588429/chrome-mac/Chromium.app/Contents/MacOS/Chromium --disable-background-networking --disable-background-timer-throttling --disable-breakpad --disable-browser-side-navigation --disable-client-side-phishing-detection --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=site-per-process --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-translate --metrics-recording-only --no-first-run --safebrowsing-disable-auto-update --enable-automation --password-store=basic --use-mock-keychain --headless --hide-scrollbars --mute-audio about:blank --remote-debugging-port=57685 --user-data-dir=/Users/dave/Library/Application Support/pyppeteer/.dev_profile/tmp0cxntbnl
  501 43997 43993   0 11:18AM ttys000    0:00.16 /Users/dave/Library/Application Support/pyppeteer/local-chromium/588429/chrome-mac/Chromium.app/Contents/Versions/71.0.3542.0/Chromium Helper.app/Contents/MacOS/Chromium Helper --type=renderer --disable-background-timer-throttling --disable-breakpad --enable-automation --file-url-path-alias=/gen=/Users/dave/Library/Application Support/pyppeteer/local-chromium/588429/chrome-mac/gen --use-gl=swiftshader-webgl --disable-features=site-per-process --disable-gpu-compositing --service-pipe-token=4544885454737805126 --lang=en-US --headless --num-raster-threads=4 --enable-zero-copy --enable-gpu-memory-buffer-compositor-resources --enable-main-frame-before-activation --service-request-channel-token=4544885454737805126 --renderer-client-id=2 --seatbelt-client=33
  501 43998 43993   0 11:18AM ttys000    0:00.19 /Users/dave/Library/Application Support/pyppeteer/local-chromium/588429/chrome-mac/Chromium.app/Contents/Versions/71.0.3542.0/Chromium Helper.app/Contents/MacOS/Chromium Helper --type=gpu-process --disable-features=site-per-process --disable-breakpad --headless --headless --gpu-preferences=KAAAAAAAAACAAAAAAQAAAAAAAAAAAGAAAAAAAAAAAAAIAAAAAAAAADgBAAAmAAAAMAEAAAAAAAA4AQAAAAAAAEABAAAAAAAASAEAAAAAAABQAQAAAAAAAFgBAAAAAAAAYAEAAAAAAABoAQAAAAAAAHABAAAAAAAAeAEAAAAAAACAAQAAAAAAAIgBAAAAAAAAkAEAAAAAAACYAQAAAAAAAKABAAAAAAAAqAEAAAAAAACwAQAAAAAAALgBAAAAAAAAwAEAAAAAAADIAQAAAAAAANABAAAAAAAA2AEAAAAAAADgAQAAAAAAAOgBAAAAAAAA8AEAAAAAAAD4AQAAAAAAAAACAAAAAAAACAIAAAAAAAAQAgAAAAAAABgCAAAAAAAAIAIAAAAAAAAoAgAAAAAAADACAAAAAAAAOAIAAAAAAABAAgAAAAAAAEgCAAAAAAAAUAIAAAAAAABYAgAAAAAAABAAAAAAAAAAAAAAAAUAAAAQAAAAAAAAAAAAAAALAAAAEAAAAAAAAAAAAAAADAAAABAAAAAAAAAAAAAAAA0AAAAQAAAAAAAAAAAAAAAPAAAAEAAAAAAAAAAAAAAAEAAAABAAAAAAAAAAAAAAABIAAAAQAAAAAAAAAAAAAAATAAAAEAAAAAAAAAABAAAABQAAABAAAAAAAAAAAQAAAAsAAAAQAAAAAAAAAAEAAAAMAAAAEAAAAAAAAAABAAAADQAAABAAAAAAAAAAAQAAAA8AAAAQAAAAAAAAAAEAAAAQAAAAEAAAAAAAAAABAAAAEgAAABAAAAAAAAAAAQAAABMAAAAQAAAAAAAAAAQAAAAFAAAAEAAAAAAAAAAEAAAACwAAABAAAAAAAAAABAAAAAwAAAAQAAAAAAAAAAQAAAANAAAAEAAAAAAAAAAEAAAADwAAABAAAAAAAAAABAAAABAAAAAQAAAAAAAAAAQAAAASAAAAEAAAAAAAAAAEAAAAEwAAABAAAAAAAAAABgAAAAUAAAAQAAAAAAAAAAYAAAALAAAAEAAAAAAAAAAGAAAADQAAABAAAAAAAAAABgAAAA8AAAAQAAAAAAAAAAYAAAAQAAAAEAAAAAAAAAAGAAAAEgAAABAAAAAAAAAABgAAABMAAAAQAAAAAAAAAAcAAAAFAAAAEAAAAAAAAAAHAAAACwAAABAAAAAAAAAABwAAAA0AAAAQAAAAAAAAAAcAAAAPAAAAEAAAAAAAAAAHAAAAEAAAABAAAAAAAAAABwAAABIAAAAQAAAAAAAAAAcAAAATAAAA --use-gl=swiftshader-webgl --headless --service-request-channel-token=11592880234632633554
  501 44015 43178   0 11:18AM ttys000    0:00.00 grep -i chrom
dave - m1book darwin ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ pkill Chromium
dave - m1book darwin ~/dev/yt-dlp/foo (git:fix/tiktok-user) $ uname -a
Darwin m1book.local 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:19:52 PDT 2022; root:xnu-8020.140.49~2/RELEASE_ARM64_T6000 arm64

On macOS the chromium process seems to hang around in the background until it is killed manually, whereas on Linux it's dead before I can pgrep for it.

@redraskal
Contributor Author

@bahamas10 hmm I will take a look on macOS

@CorentinB

CorentinB commented Oct 7, 2022

I got the exact same errors as @bahamas10 described on both Linux & macOS. :(

@fleescree

I also had the same error occur; it appears I only needed to install the Chromium dependencies.

Error trace
Retriving user id
[INFO] Starting Chromium download.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 109M/109M [00:00<00:00, 239Mb/s]
[INFO] Beginning extraction
[INFO] Chromium extracted to: /home/linux-user/.config/local/share/pyppeteer/local-chromium/588429
ERROR: Browser closed unexpectedly:

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/linux-user/.local/lib/python3.8/site-packages/pyppeteer/launcher.py", line 153, in _close_process
    self._loop.run_until_complete(self.killChrome())
  File "/usr/lib/python3.8/asyncio/base_events.py", line 591, in run_until_complete
    self._check_closed()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
sys:1: RuntimeWarning: coroutine 'Launcher.killChrome' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

[tiktok:user] tiktok: Downloading user embed
[tiktok:user] 7151807169850051883: Downloading video feed
[tiktok:user] Downloading signature function
ERROR: Browser closed unexpectedly:

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/linux-user/.local/lib/python3.8/site-packages/pyppeteer/launcher.py", line 153, in _close_process
    self._loop.run_until_complete(self.killChrome())
  File "/usr/lib/python3.8/asyncio/base_events.py", line 591, in run_until_complete
    self._check_closed()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
sys:1: RuntimeWarning: coroutine 'Launcher.killChrome' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

To fix it I just followed this answer
https://stackoverflow.com/a/71935536/13737199

The only difference was the chromium path, which depends on your home dir and chromium version:

-ldd ~/.local/share/pyppeteer/local-chromium/588429/chrome-linux/chrome | grep 'not found'
+ldd ~/.config/local/share/pyppeteer/local-chromium/588429/chrome-linux/chrome | grep 'not found'

Then I just followed the instructions to install google chrome and the extractor worked!

note: I am using a vps, so that's probably why I didn't have the required dependencies.
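
The `ldd ... | grep 'not found'` check above can also be scripted. The sketch below assumes a Linux system with `ldd` on the PATH; both function names are hypothetical and the chromium path has to be supplied by the caller.

```python
import subprocess
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the library names that `ldd` reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "not found" in line:
            # Such lines have the form "<library name> => not found".
            missing.append(line.split("=>")[0].strip())
    return missing


def check_chromium(chrome_path: str) -> List[str]:
    """Run `ldd` on the given chrome binary and list unresolved libraries."""
    result = subprocess.run(
        ["ldd", chrome_path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)
```

An empty list from `check_chromium` would mean `ldd` resolved every shared library for that binary.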

@zulc22

zulc22 commented Oct 11, 2022

I'm getting "ERROR: remove loop argument"?

$ ~/yt-dlp-ttf/yt-dlp.sh https://tiktok.com/@sonicdude8__ --recode-video mp4
[tiktok:user] sonicdude8__: Downloading user embed
[tiktok:user] 7153079948612324650: Downloading video feed
[tiktok:user] Downloading signature function
Executing <Task finished name='Task-1' coro=<TikTokUserIE._video_entries_api() done, defined at /home/zulc22/yt-dlp-ttf/yt_dlp/extractor/tiktok.py:616> exception=DeprecationWarning('remove loop argument') created at /usr/lib/python3.10/asyncio/tasks.py:636> took 0.428 seconds
ERROR: remove loop argument
Exception ignored in atexit callback: <function Launcher.launch.<locals>._close_process at 0x7f83759ac260>
Traceback (most recent call last):
  File "/home/zulc22/.local/lib/python3.10/site-packages/pyppeteer/launcher.py", line 153, in _close_process
    self._loop.run_until_complete(self.killChrome())
  File "/usr/lib/python3.10/asyncio/base_events.py", line 621, in run_until_complete
    self._check_closed()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Exception ignored in: <coroutine object Launcher.killChrome at 0x7f837312b890>
Traceback (most recent call last):
  File "/usr/lib/python3.10/warnings.py", line 506, in _warn_unawaited_coroutine
    warn(msg, category=RuntimeWarning, stacklevel=2, source=coro)
RuntimeWarning: coroutine 'Launcher.killChrome' was never awaited
Exception ignored in: <function Popen.__del__ at 0x7f8378d5a780>
Traceback (most recent call last):
  File "/usr/lib/python3.10/subprocess.py", line 1070, in __del__
    _warn("subprocess %s is still running" % self.pid,
ResourceWarning: subprocess 95170 is still running

@julian45

This works pretty well, thanks so much for putting this together! My only request at this point would be to adjust the loop logic so that if one video is unavailable for whatever reason, it notes the failure (if desired) but continues to download the other videos from the user's profile. This may be just me, but I'd rather the executable try every video in the profile first and then exit (whether with exit code 0 or otherwise), rather than stop partway through the profile because it encountered a single inaccessible video. For reference, I'm working from macOS 12.6, using a freshly compiled version of the relevant branch for this PR.

Strangely enough, I'm not encountering either of the errors that @bahamas10 or @zulc22 are. However:

  • for @bahamas10's issue, the download simply stopped at 28 videos, similarly to the above, although there are clearly more than that on the user's profile upon visual inspection.
  • for @zulc22, the --recode-video flag was not necessarily required, since TikTok videos have always already been provided as mp4 in my experience, and the download stopped at 29 videos like the two instances above.

Extract from, for example, running the same command as @bahamas10:

[tiktok:user] 7134728030202875182: Downloading video feed
ERROR: 7134728030202875182: Unable to find video in feed; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
Traceback (most recent call last):
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/YoutubeDL.py", line 1477, in wrapper
    return func(self, *args, **kwargs)
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/utils.py", line 2999, in <lambda>
    return type(self.ydl)._handle_extraction_exceptions(lambda _, i: self._entries[i])(self.ydl, i)
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/utils.py", line 2769, in __getitem__
    self._cache.extend(itertools.islice(self._iterable, n))
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/extractor/tiktok.py", line 675, in _entries_api
    **self._extract_aweme_app(video['id']),
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/extractor/tiktok.py", line 534, in _extract_aweme_app
    raise ExtractorError('Unable to find video in feed', video_id=aweme_id)
yt_dlp.utils.ExtractorError: 7134728030202875182: Unable to find video in feed; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U

[tiktok:user] Playlist bahamas10_: Downloading 28 videos of 27

@zulc22

zulc22 commented Oct 20, 2022

@julian45 are you using python 3.10?

@julian45

@zulc22 Yes, I'm using version 3.10.8.

@cunlem

cunlem commented Nov 12, 2022

So does anyone have any idea what causes the

ERROR: remove loop argument

error described above by @bahamas10 and @zulc22? I'm also experiencing it.

@redraskal
Contributor Author

redraskal commented Nov 13, 2022

Replaced pyppeteer with playwright since pyppeteer will not be receiving fixes for the problems we encountered.

The API requirements changed, but it looks like a headless browser is still required. The response is blank if I don't send requests from a browser.

The downside is that you will need to run playwright install before running the tiktok extractor.
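
One way to fail early with a readable message when playwright is absent is sketched below. Both function names are hypothetical, and the check only covers the Python package; it cannot tell whether the browser binaries from playwright install are present.

```python
import importlib.util


def playwright_available() -> bool:
    """True when the `playwright` Python package can be imported."""
    return importlib.util.find_spec("playwright") is not None


def require_playwright() -> None:
    """Raise a readable error instead of failing deep inside extraction."""
    if not playwright_available():
        raise RuntimeError(
            "playwright is not installed; install it and run `playwright install`"
        )
```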

@julian45

Thanks! Unfortunately, I'm encountering the same issues as I did on Oct. 19; although it begins to download en masse, it still stops upon encountering any video with missing sound. For example, with @bahamas10's issue above, I am able to download 37 videos (which makes sense, as 10 have been posted since Oct. 19 and I was able to [actually] download 27 then), but no more:

sh-3.2$ ~/rr-ytdlp/yt-dlp https://www.tiktok.com/@bahamas10_
[tiktok:user] bahamas10_: Downloading user embed
[tiktok:user] 7164841989530373422: Downloading video feed
[tiktok:user] Downloading page 1
[tiktok:user] Downloading page 2
[tiktok:user] Downloading page 3
[tiktok:user] Downloading page 4
[tiktok:user] Downloading page 5
[tiktok:user] Downloading page 6
[tiktok:user] Downloading page 7
[tiktok:user] Downloading page 8
[tiktok:user] Downloading page 9
[tiktok:user] Downloading page 10
[tiktok:user] Downloading page 11
[tiktok:user] Downloading page 12
[tiktok:user] Downloading page 13
[download] Downloading playlist: bahamas10_
[tiktok:user] 7164841989530373422: Downloading video feed
[tiktok:user] 7164821593967889710: Downloading video feed
[tiktok:user] 7164209740883430699: Downloading video feed
[tiktok:user] 7162680518335548715: Downloading video feed
[tiktok:user] 7162234814542728491: Downloading video feed
[tiktok:user] 7158503684312157482: Downloading video feed
[tiktok:user] 7158241982349987115: Downloading video feed
[tiktok:user] 7157457925466754347: Downloading video feed
[tiktok:user] 7156762277989895466: Downloading video feed
[tiktok:user] 7156734153147305262: Downloading video feed
[tiktok:user] 7154941390693109035: Downloading video feed
[tiktok:user] 7154172716684152106: Downloading video feed
[tiktok:user] 7152944176110423342: Downloading video feed
[tiktok:user] 7152547553580453163: Downloading video feed
[tiktok:user] 7151936193779748142: Downloading video feed
[tiktok:user] 7151925545683422507: Downloading video feed
[tiktok:user] 7151205459649645870: Downloading video feed
[tiktok:user] 7150785587887295786: Downloading video feed
[tiktok:user] 7150022740764806446: Downloading video feed
[tiktok:user] 7146825795997093166: Downloading video feed
[tiktok:user] 7146267469533990190: Downloading video feed
[tiktok:user] 7145281423388331307: Downloading video feed
[tiktok:user] 7144474645620722986: Downloading video feed
[tiktok:user] 7143719905806994734: Downloading video feed
[tiktok:user] 7143411265371704622: Downloading video feed
[tiktok:user] 7143001307996114219: Downloading video feed
[tiktok:user] 7142661476354886955: Downloading video feed
[tiktok:user] 7142275191962586414: Downloading video feed
[tiktok:user] 7142244394500902186: Downloading video feed
[tiktok:user] 7141013832662699310: Downloading video feed
[tiktok:user] 7140135300482862382: Downloading video feed
[tiktok:user] 7139705568700140846: Downloading video feed
[tiktok:user] 7139585213042150699: Downloading video feed
[tiktok:user] 7139548951212002603: Downloading video feed
[tiktok:user] 7138852612786654510: Downloading video feed
[tiktok:user] 7138148822366096682: Downloading video feed
[tiktok:user] 7137743845999234350: Downloading video feed
[tiktok:user] 7134728030202875182: Downloading video feed
ERROR: 7134728030202875182: Unable to find video in feed; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
[tiktok:user] Playlist bahamas10_: Downloading 38 items of 37

I can confirm, when browsing the profile, that the 38th video from the top as of the time of writing has no sound.

@redraskal
Contributor Author

redraskal commented Nov 13, 2022

This extractor only passes video ids along to another extractor; that is an issue with the individual video extractor.

FYI each "page" downloaded contains 30 video ids now.

At least, I'm fairly sure it has nothing to do with the user extractor lol

@redraskal
Contributor Author

@julian45 TikTokIE throws an ExtractorError when a video is not found. It says unavailable when I try and open it on tiktok.com. Maybe that is not the proper way of handling this? Is this not a problem with other extractors?

@julian45

I'm not sure, to be honest. In the mobile apps, one can still view the video and all its associated data and metadata, just no sound. I don't know why they'd make that possible in one context and throw 404s in another. As for other extractors, I haven't used enough of them to give a good answer; I mostly use this one, YouTube, and occasionally Twitter.

@redraskal
Contributor Author

I wrapped the TikTokIE call so that the error is downgraded to a warning. I'm not sure this is the best way of going about it, but it fixed the issue on my end. Try it now.
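A minimal sketch of the idea, not the actual change in this PR: each per-video extraction is wrapped so that a failure becomes a warning and the loop carries on. `ExtractorError` here is a local stand-in for yt-dlp's exception, and `extract_all` and `fake_extract` are hypothetical names used only for illustration.

```python
import warnings


class ExtractorError(Exception):
    """Local stand-in for yt-dlp's ExtractorError (assumption for this sketch)."""


def extract_all(video_ids, extract_one):
    """Yield one result per video id, warning instead of aborting on failure."""
    for video_id in video_ids:
        try:
            yield extract_one(video_id)
        except ExtractorError as err:
            # Downgrade the error so one unavailable video does not stop the playlist.
            warnings.warn(f"{video_id}: skipped ({err})")


def fake_extract(video_id):
    # Hypothetical per-video extractor: one id behaves as "unavailable".
    if video_id == "7134728030202875182":
        raise ExtractorError("Unable to find video in feed")
    return {"id": video_id}
```

The trade-off is that genuinely broken extraction also shows up only as warnings, so the warning text should keep the video id visible.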

@julian45

That seems to have done the trick — I don't have enough time tonight to let bahamas10_'s profile download run in full, but it definitely isn't stopping at soundless videos.

Unfortunately, this new way of downloading seems to preclude a nice side-feature of the days of yore (i.e., before TikTok changed their API a bunch) where once you reached a video in the feed that you'd already downloaded, it would simply log an informative message letting you know as much and then continue to the next video; this was a nice tell for knowing when to stop the process once all new videos were downloaded. In this system, it looks like it basically gets the metadata for every single video in the profile, then tries downloading them all en masse; if that's what works, that's what works, but alas.

My Python is a bit rusty, but I'd like to test if it'd be possible to move the video download calls into the same loop as the one that grabs video IDs. Your solution is definitely workable, and I mean no ill will to you whatsoever re: the situation described above — I'd simply like to experiment a bit myself to try to get one particular bit of functionality, and if that doesn't work out, there'll still be a well-functioning download solution to use.

Thanks for your hard work in figuring all of this out!!!

@redraskal
Contributor Author

Probably a good idea. I have also noticed that skipping already-downloaded videos is more delayed.

@Taion1

Taion1 commented Nov 13, 2022

I can report that I had the same issue as julian45 and that it is now fixed. Great work!

I'm also echoing a similar thought about the old days; it was possible to use --break-on-existing or --playlist-end NUMBER or -I :NUMBER to download only the first 'page' of videos, like you can do with the regular extractors. I personally use this method to get new videos with minimal overhead.

Thanks again for your work so far.
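One way to get that behaviour back, sketched under assumptions rather than taken from the PR: yield video ids lazily, page by page, so that a consumer that stops early (as `--playlist-end` or `--break-on-existing` would) never triggers the later page requests. `iter_entries` and `make_fake_fetcher` are hypothetical names, and the fake fetcher only simulates a paged API with 30 ids per page.

```python
from itertools import islice


def iter_entries(fetch_page):
    """Lazily yield video ids; fetch_page(cursor) returns (ids, next_cursor or None)."""
    cursor = 0
    while cursor is not None:
        ids, cursor = fetch_page(cursor)
        yield from ids


def make_fake_fetcher(total, page_size=30, calls=None):
    """Hypothetical paged API holding `total` ids; records each requested cursor."""
    def fetch_page(cursor):
        if calls is not None:
            calls.append(cursor)
        ids = list(range(cursor, min(cursor + page_size, total)))
        next_cursor = cursor + page_size if cursor + page_size < total else None
        return ids, next_cursor
    return fetch_page


# Taking only the first 10 entries requests a single page.
calls = []
first_ten = list(islice(iter_entries(make_fake_fetcher(95, calls=calls)), 10))
```

Because the generator only asks for the next page when the current one is exhausted, stopping after the newest videos costs one request instead of a full profile crawl.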

@redraskal
Contributor Author

redraskal commented Nov 21, 2022

Made some changes:

  • Fix user video listing api request (looks like they check sec-fetch headers now, so I now evaluate JS instead of navigating to the endpoint)
  • Remove async usage
  • Add check for --playlist-end
  • Fallback for unavailable embed pages (accounts can choose to disable embeds for whatever reason, supply secuid with --video-password for now until I find a more sane way lol)
  • Profile thumbnails should be larger
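For readers curious what "evaluate JS instead of navigating" can look like, here is a rough sketch, not the code from this PR. It assumes a Playwright-style `page` object whose `evaluate(expression, arg)` method runs a JavaScript function inside the already-loaded page; `FETCH_JS` and `fetch_json_in_page` are hypothetical names.

```python
import json

# JavaScript run inside the page: the request is issued by the browser itself,
# so it carries the browser's own sec-fetch headers and TLS fingerprint.
FETCH_JS = "([url, headers]) => fetch(url, {headers}).then(r => r.text())"


def fetch_json_in_page(page, url, headers=None):
    """Fetch `url` from inside `page` and decode the body as JSON."""
    body = page.evaluate(FETCH_JS, [url, headers or {}])
    if not body.strip():
        # Blocked requests have been seen to return only whitespace.
        raise ValueError("empty response body (request was probably blocked)")
    return json.loads(body)
```

Returning the raw text and parsing it in Python keeps an empty or whitespace-only reply from surfacing as an opaque `JSON.parse` error inside the browser.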

@pukkandan added the enhancement (New feature or request) and site-request (Request to support a new website) labels on Dec 8, 2022
@bashonly mentioned this pull request on Dec 13, 2022
@tomperchtold

I am using this fix (or, more exactly, https://github.com/TheMrRandomDude/yt-dlp-tiktok-scraper-fixed, which is basically the same) to download the TikTok videos of some users. Most users work (celebrities with 1M+ followers as well as ordinary accounts with only a few hundred), but I found three users for whom I am unable to download anything:
https://www.tiktok.com/@avamajuryyy
https://www.tiktok.com/@chelsea.m07
https://www.tiktok.com/@lillybketchman
I don't see anything special with these users, they load normally in the browsers. Yet they throw a 400 Bad Request error with yt-dlp ...
Can you please take a look at it? Thanks!

@redraskal
Contributor Author

redraskal commented Feb 5, 2023

@tomperchtold The TikTok HTML embeds are throwing 400 errors (ex. https://tiktok.com/embed/@username). This page is required to fetch a user identifier. In this case, you may manually supply the user id.

To find the user id for one of these accounts, you first need to open the browser developer tools (on the profile page like you linked). Then, swap to the "Network" tab and filter the contents to XHR. Refresh the page and find a request for us.tiktok.com/api/user/detail?aid=1988... and select that.

Click on Response and then navigate through the JSON structure to userInfo -> user -> secUid.

Copy the value of secUid.

Then, substitute it into this additional argument (example):

yt-dlp --extractor-args "tiktok:secuid=PASTE_HERE" https://tiktok.com/@example

See https://github.com/redraskal/yt-dlp/tree/fix/tiktok-user#extractor-arguments for the guide on extractor arguments.
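If you save the response body from the Network tab, the lookup described above can be scripted. This is only a sketch: the `userInfo -> user -> secUid` path comes from the steps above, while the function names and the sample payload are made up for illustration.

```python
import json


def sec_uid_from_user_detail(body: str) -> str:
    """Pull userInfo -> user -> secUid out of a saved user/detail response."""
    data = json.loads(body)
    sec_uid = data.get("userInfo", {}).get("user", {}).get("secUid")
    if not sec_uid:
        raise ValueError("secUid not found in response")
    return sec_uid


def build_command(sec_uid: str, profile_url: str) -> list:
    """Build the yt-dlp invocation with this fork's secuid extractor argument."""
    return ["yt-dlp", "--extractor-args", f"tiktok:secuid={sec_uid}", profile_url]


# Made-up payload with a placeholder secUid, for illustration only.
sample = '{"userInfo": {"user": {"uniqueId": "example", "secUid": "PASTE_HERE"}}}'
command = build_command(sec_uid_from_user_detail(sample), "https://tiktok.com/@example")
```

Passing the command as a list (for example to `subprocess.run`) avoids shell quoting problems with the argument.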

@tomperchtold

@redraskal
Thanks for your answer! I did not know about the secuid so I checked that right away.
I took the chelsea.m07 user and got the id MS4wLjABAAAA40PUfcpIYmV6z6R8zfneyFoseqzb_Jen5DeKDuSpkhmoKaXNjDikpuK3cPzuD4wv, which I know is right because https://www.tiktok.com/@MS4wLjABAAAA40PUfcpIYmV6z6R8zfneyFoseqzb_Jen5DeKDuSpkhmoKaXNjDikpuK3cPzuD4wv takes me to the page.

I then tried to integrate the value to the --extractor-args "tiktok:secuid=MS4..."
I was not sure how these work within a bash script so I used all kinds of variants to no avail ...

++ ./../../yt-dlp.sh --extractor-args tiktok:secuid=MS4wLjABAAAA40PUfcpIYmV6z6R8zfneyFoseqzb_Jen5DeKDuSpkhmoKaXNjDikpuK3cPzuD4wv https://www.tiktok.com/@chelsea.m07 --write-thumbnail --no-overwrites --print-traffic --write-pages -vU [debug] Command-line config: ['--extractor-args', 'tiktok:secuid=MS4wLjABAAAA40PUfcpIYmV6z6R8zfneyFoseqzb_Jen5DeKDuSpkhmoKaXNjDikpuK3cPzuD4wv', 'https://www.tiktok.com/@chelsea.m07', '--write-thumbnail', '--no-overwrites', '--print-traffic', '--write-pages', '-vU'] [debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8 [debug] yt-dlp version 2022.11.11 [8b644025b] (source) [debug] Plugins: ['SamplePluginIE', 'SamplePluginPP'] [debug] Git HEAD: b7a664e05 [debug] Python 3.10.4 (CPython x86_64 64bit) - Linux-5.16.18-200.fc35.x86_64-x86_64-with-glibc2.34 (OpenSSL 1.1.1n FIPS 15 Mar 2022, glibc 2.34) [debug] exe versions: phantomjs 3.0.0 [debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.4 [debug] Proxy map: {} [debug] Loaded 1731 extractors [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest ... skipping failed update procedure ... 
ERROR: You cannot update when running from source code; Use git to pull the latest changes [tiktok:user] Extracting URL: https://www.tiktok.com/@chelsea.m07 [tiktok:user] chelsea.m07: Downloading user embed send: b'GET /embed/@chelsea.m07 HTTP/1.1\r\nHost: www.tiktok.com\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.74 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nSec-Fetch-Mode: navigate\r\nAccept-Encoding: gzip, deflate, br\r\nConnection: close\r\n\r\n' reply: 'HTTP/1.1 400 Bad Request\r\n' header: Server: nginx header: Content-Type: text/html; charset=utf-8 header: Content-Length: 58056 header: X-Tt-Logid: 20230205151216022966AAD1B093729240 header: Strict-Transport-Security: max-age=31536000 header: x-tt-trace-host: 01f9bb77ba67103b2658e8fc2fbdcfcaa45f168cdb5620b0fa49d88c17b053fdcde410f18f7b03cf483bc2297a14e99f9e4b98ca34cecd83d6dee1e439de171c0b4c27332de8533b4737f6cbfee399f820ac86130006ea02e39349abcb83bef7d74a5949f98a5cba73e3d84d1ca3073d5c header: X-Origin-Response-Time: 116,23.209.100.78 header: X-Akamai-Request-ID: 662e518c.43729b50 header: Expires: Sun, 05 Feb 2023 15:12:17 GMT header: Cache-Control: max-age=0, no-cache, no-store header: Pragma: no-cache header: Date: Sun, 05 Feb 2023 15:12:17 GMT header: X-Cache: TCP_MISS from a2-19-82-201.deploy.akamaitechnologies.com (AkamaiGHost/10.10.3-45298580) (-) header: Connection: close header: Set-Cookie: ttwid=1%7CcUyyJS4wD3b_ZARdyUXnGXDtSnJ5Xs3AcRekTy8c5yo%7C1675609937%7C2a06bda898e63f78a51cdaeb146338d50b96d77cff6a1eeceb326336e7357907; Domain=.tiktok.com; Path=/; Expires=Wed, 31 Jan 2024 15:12:17 GMT; HttpOnly; SameSite=None; Secure header: X-Cache-Remote: TCP_MISS from a23-209-100-78.deploy.akamaitechnologies.com (AkamaiGHost/10.10.3-45298580) (-) header: x-tt-trace-tag: id=16;cdn-cache=miss;type=dyn header: Server-Timing: cdn-cache; desc=MISS, edge; dur=93, origin; 
dur=116 header: Server-Timing: inner; dur=115 header: X-Parent-Response-Time: 209,2.19.82.201 header: Set-Cookie: _abck=A1DD5F235B616CD29B36F64744B43637~-1~YAAQyVITAvTfA8qFAQAAkZUiIgkKXKku7UMsTeJRzST8SG5P5Ol5vhK5R+g5LbG+PgP6yiGe9JPew9OAys30t2Ww66l7NbQZGIt/VER1HA9+nIVWx0m+Bnv3upp8iN5ZV5G7L7v6I82FvvsS7CqN6mGnQR0a+wL98GglkBO08BiodcqlkGVtirtiuZ7m2378nwNd5V+EvIhboGTNsc/lHmJJvwVh+07vvIAI/SnDziPvk2MLlsM7HQbypAXxiwyv28Jz7thdsuIDIec77xwycokOQKpvnRQsM+sK1Bwy16wz2zRv9zK2WgvnSsxZsGLLbZ1FkAsOho+fuoqpuaXviHUJjfgVzcmbu3CfrmMuzeIb5WLiRVLSuLVmUeo=~-1~-1~-1; Domain=.tiktok.com; Path=/; Expires=Mon, 05 Feb 2024 15:12:17 GMT; Max-Age=31536000; Secure header: Set-Cookie: bm_sz=736C042F27DE5BE16121A60944A92791~YAAQyVITAvXfA8qFAQAAkZUiIhIM1IuxgLPYL/a4M940quxzhcsBiibqHjXLf6bqcb8XEi8LY5empBT3AV3JiDOlpnYWoL5aXWdktkl6cXXWovR8gD+BSlBcehdZwkwx+YCj77F1uK9T6Cwd4ZOQuuPfzRg1KoCWldgcMe8s7T+3Z7YZ059OkngaKROqm26qsh93sOtvHm4K0sOioLBLEwvTp87uVq7sOtIfXDxXY3pDa9JwlXQPh4bIVC9hLFDtiE9gYm08Qo60RHw+3IIucsJtFrAH7S+W4HcleWyh1W7I/gU=~3293508~3360070; Domain=.tiktok.com; Path=/; Expires=Sun, 05 Feb 2023 19:12:17 GMT; Max-Age=14400 WARNING: [tiktok:user] Unable to download webpage: HTTP Error 400: Bad Request (caused by <HTTPError 400: 'Bad Request'>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. 
Confirm you are on the latest version using yt-dlp -U; secUid supplied, trying anyway [debug] [tiktok:user] Launching headless browser [tiktok:user] Downloading page 1 [debug] [tiktok:user] x-tt-params: 4mOJhGnMOdadxrLEy2bkmGSR2R38w8nZC8MQKREioTAU76aXIbW+KkRzj5O7qqbOI65rqSqkFAXNltiJ1p2YyvalYQ0VbxXkcXRyJfQSSRT/hNxrfVoguXF5EoUWv0lX2zNU4wIvElO6a9RAz9QTfldaJ7RIN/DcdaDawmdeoTsMRPjntqqS3+U5rwYyzoeQ1M9wog/i844XuEGw7d/iLpJ8hnqtX9SVLCPLyqU9IoJEFPpXWWu0yCYoU0SlbAzPmuuvr8aFV4VkaE8glxo7Djld7KZDWOodERJAfxTz11sRrwIr79sFyQTB2aZafrLsRU0UU0fr1uDY670tJbGO6P+cMgtC0Ce2SJOuhg4WPpyyFtrWgHs/HquPVWLxru2+lr8IEfPF6rz2ZPdM8FH9TRWsZYs0Xb88kVjTkFOJV0KKBMO8qTDMtrQb0cO3AWuYWJxSKGIdNMo9PBG+HYdiJ/pj06Z0w+fmkmZwKpRCv82Lp1uBTjsMQQtl6dPu8MP1WnA0UBW7gszBu+w711dOJ5yPwpBXkZ20WNlLKDh/HTgSFLpbolUmZ5Qzz2MzjlKiumvfN+BrKIYIvpO0Ud2LXHi1Gq8a1QV1qY7vxIlWdLlQ8rwOvPNH+r6kaUG/EL+TWiwDWl0sbQ50cv90nNkjuxSbHbHnMa13FaAZIguVbUl2GMnDm4eEY0h+MY+f6e+5Jy8QDqBiaM4PDAkY+HFor427xOV86cPojYGO4Zbjfo8vMCPlBvISTaON7QRRlH/YQBjpUJ2FPpSVE8DSqdbngqza36/AhNITjROAJKIt8YXMjyEMLtIFg15RpExxkfxBQbsJJZZHWVxFFdrS6MpITKXbw/WHN7zKFBIhMNujENM= Executing <Handle SyncBase._sync.<locals>.<lambda>(<Task finishe...nc_base.py:96>) at /usr/local/lib/python3.10/site-packages/playwright/_impl/_sync_base.py:100 created at /usr/lib64/python3.10/asyncio/events.py:80> took 2.008 seconds ERROR: 'list' object has no attribute 'get' Traceback (most recent call last): File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/YoutubeDL.py", line 1495, in wrapper return func(self, *args, **kwargs) File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/YoutubeDL.py", line 1571, in __extract_info ie_result = ie.extract(url) File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/extractor/common.py", line 680, in extract ie_result = self._real_extract(url) File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/extractor/tiktok.py", line 772, in _real_extract if author.get('uniqueId', '') == user_name: AttributeError: 'list' object has no attribute 'get'

++ ./../../yt-dlp.sh --extractor-args tiktok:secuid=MS4wLjABAAAA40PUfcpIYmV6z6R8zfneyFoseqzb_Jen5DeKDuSpkhmoKaXNjDikpuK3cPzuD4wv https://www.tiktok.com/@chelsea.m07 --write-thumbnail --no-overwrites --print-traffic --write-pages -vU [debug] Command-line config: ['--extractor-args', 'tiktok:secuid=MS4wLjABAAAA40PUfcpIYmV6z6R8zfneyFoseqzb_Jen5DeKDuSpkhmoKaXNjDikpuK3cPzuD4wv', 'https://www.tiktok.com/@chelsea.m07', '--write-thumbnail', '--no-overwrites', '--print-traffic', '--write-pages', '-vU'] [debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8 [debug] yt-dlp version 2022.11.11 [8b644025b] (source) [debug] Plugins: ['SamplePluginIE', 'SamplePluginPP'] [debug] Git HEAD: b7a664e05 [debug] Python 3.10.4 (CPython x86_64 64bit) - Linux-5.16.18-200.fc35.x86_64-x86_64-with-glibc2.34 (OpenSSL 1.1.1n FIPS 15 Mar 2022, glibc 2.34) [debug] exe versions: phantomjs 3.0.0 [debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.4 [debug] Proxy map: {} [debug] Loaded 1731 extractors [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest ... skipping failed update procedure ... 
ERROR: You cannot update when running from source code; Use git to pull the latest changes [tiktok:user] Extracting URL: https://www.tiktok.com/@chelsea.m07 [tiktok:user] chelsea.m07: Downloading user embed send: b'GET /embed/@chelsea.m07 HTTP/1.1\r\nHost: www.tiktok.com\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nSec-Fetch-Mode: navigate\r\nAccept-Encoding: gzip, deflate, br\r\nConnection: close\r\n\r\n' reply: 'HTTP/1.1 400 Bad Request\r\n' header: Server: nginx header: Content-Type: text/html; charset=utf-8 header: Content-Length: 58056 header: X-Tt-Logid: 202302051512297DB549F5E74D6873853B header: Strict-Transport-Security: max-age=31536000 header: x-tt-trace-host: 01f9bb77ba67103b2658e8fc2fbdcfcaa4cbe96b1de7bff2e658c550124c4cc17d610484d3e5bcdfe7d2642e2063ed0207e03461f4021a6fec4b706b4707f15d7228220d7a0b5d9cc69d1464b8d2bc6832b1fb23b4fed04ae6f93dcbf81918c866b7298e4e24a3154f3872f42b5aea522a header: X-Origin-Response-Time: 122,23.15.9.30 header: X-Akamai-Request-ID: cbadca49.425e6d8f header: Expires: Sun, 05 Feb 2023 15:12:31 GMT header: Cache-Control: max-age=0, no-cache, no-store header: Pragma: no-cache header: Date: Sun, 05 Feb 2023 15:12:31 GMT header: X-Cache: TCP_MISS from a2-19-82-204.deploy.akamaitechnologies.com (AkamaiGHost/10.10.3-45298580) (-) header: Connection: close header: Set-Cookie: ttwid=1%7Cej9iKebfQKWL6Fa2iu3sQ_ehfDPlimVlHLvWttIUnfw%7C1675609950%7C846c86dc0363c8c8b5ea9d626b897165a1a8571cf650d97ca6d839e9a2c7072b; Domain=.tiktok.com; Path=/; Expires=Wed, 31 Jan 2024 15:12:30 GMT; HttpOnly; SameSite=None; Secure header: X-Cache-Remote: TCP_MISS from a23-15-9-30.deploy.akamaitechnologies.com (AkamaiGHost/10.10.3-45298580) (-) header: x-tt-trace-tag: id=16;cdn-cache=miss;type=dyn header: Server-Timing: cdn-cache; desc=MISS, edge; dur=97, origin; dur=122 
header: Server-Timing: inner; dur=109 header: X-Parent-Response-Time: 219,2.19.82.204 header: Set-Cookie: _abck=65733AD9FF922767AFE1E093D046D998~-1~YAAQzFITArosEc2FAQAAGMsiIgkGNxMiCix1sZvG6zUsA1xGiJqXsqBiMSr1A5IG3/qcBqJbA6YCOIkeabaTnnOq3AOR5gkoeLlcnbVMpZd47lPdhehHUbNnFS9bkpVC7jpdfPQqO9HI+L8AGKCaObntDYQQhFT9Dhbh5npKz1GN4LM3N1gczRyTVKAwDudSzdUegTRwg/DYBc/0SXWXYOjTFZhrO6gcyxT9TcXEJsMW1wf4gfDBWtZTRFK2IW0EoEn+BgL49hnc/BN4FAE/co9WfcWDfmXt6uz0TFeJ1mOQdBT46+iTXV6HHKqz8qqOy+isuGtwIbvBsP/HPdj0xgQ2L4CXpgajS4dRPUNlC3YfLKtVTnreCsbSwpI=~-1~-1~-1; Domain=.tiktok.com; Path=/; Expires=Mon, 05 Feb 2024 15:12:30 GMT; Max-Age=31535999; Secure header: Set-Cookie: bm_sz=60D8F5B8546597E2E0F3F31915D2EA34~YAAQzFITArssEc2FAQAAGMsiIhKHv6Ixp6KkP3Ej0My7G6w6uVpH7CTZpxUDFBtag7wUWQqQPi05DemLmA++iKAMM1GvQLqWBTvtudcoQyZYMTlMQa7a1sAGI5K5/0QcuUN2fj0s3l6nq5gEUFcPlXSZl3BS5spMxvy0RIc8buRG+1vnLNskQkiTFOkuk5n28yWOUbeuwpuNLnPjK6Ey3WhAr4Jf/QAAKzVwZODAJisXHziHxW9coyBphmsoGzrJdX8svLHwPy8/38vaT6ra0f6LMAZP2CxtyogGZ/p3IyHvtLY=~4538950~3294261; Domain=.tiktok.com; Path=/; Expires=Sun, 05 Feb 2023 19:12:30 GMT; Max-Age=14399 WARNING: [tiktok:user] Unable to download webpage: HTTP Error 400: Bad Request (caused by <HTTPError 400: 'Bad Request'>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. 
Confirm you are on the latest version using yt-dlp -U; secUid supplied, trying anyway [debug] [tiktok:user] Launching headless browser [tiktok:user] Downloading page 1 [debug] [tiktok:user] x-tt-params: 4mOJhGnMOdadxrLEy2bkmGSR2R38w8nZC8MQKREioTAU76aXIbW+KkRzj5O7qqbOI65rqSqkFAXNltiJ1p2YyvalYQ0VbxXkcXRyJfQSSRSQbvnGZQeQ1+VU5wIsbTnU2RjOAE1hPYgGq1NvWwsMw5/KQMmJ+ZWfvugf+Z55lGRSOAHYrEH7lAQOoJ+U1IJFYlGbmYXx6EMklckYSIWxyp77zBc2pZpzg27WlfpOt+9PJrXe1O4aw8wDcmZ9WNs6VDNUKyd90G/oxM4jkVJKYz4RfVI7j13dCgWFDGbSiFbkYWX6CQ6t5s54hKE0+lNLr6cLfTOED61dl8QQvFCwGiwyyn/lu+JVJ66xGjbv2aYEAHM/DEtKg/Wj6MViLXTc0rfwVBgYJmzVCEIz3BUWL/iM5Uc2aZAFtEMPR2/OuSr2m7TE2V+2ZRuEFXTSMKfkIdl2EHgsGuS9Pz0sKxz1J+eScxonnK/BHc5ncUsmlhv2qLnlOBMDHsQ5VTVYYEXgJxAFd+WZRWKyjW8OOwqughDkrGEt35s0CnjBQO/V0rTtFCRpDuONsGB3BhTECHOZsd+k7Ei+ya93T8ca2CsDtafXgkUcfgp3HDekt4KxmB41jZE4vleanJmDj+evBirmbhPko3eCNjz8Z9zC1zYgTxrqU45UA5QGwfT4GuNJeWSAB8jXAbvIWNlqoiW0x82UuwZSIDWoYOUajw+9FwZ0VuLwzwH1q540qBz6rR3NqUU72TOLHISLgMxDaCaaMMFRiQ06KsRUdUcVjWtAl/kJViQsEUzY+JRhJ9BchBd/aIlgty1QZC0Ii7mvcJlqVSPAWdiZQs8Aotr5k5egXvj+jsm789o8DVjpB4akhA8ffzs= Executing <Handle SyncBase._sync.<locals>.<lambda>(<Task finishe...nc_base.py:96>) at /usr/local/lib/python3.10/site-packages/playwright/_impl/_sync_base.py:100 created at /usr/lib64/python3.10/asyncio/events.py:80> took 2.007 seconds ERROR: 'list' object has no attribute 'get' Traceback (most recent call last): File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/YoutubeDL.py", line 1495, in wrapper return func(self, *args, **kwargs) File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/YoutubeDL.py", line 1571, in __extract_info ie_result = ie.extract(url) File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/extractor/common.py", line 680, in extract ie_result = self._real_extract(url) File "/home/user/yt-dlp-tiktok-scraper-fixed/yt_dlp/extractor/tiktok.py", line 772, in _real_extract if author.get('uniqueId', '') == user_name: AttributeError: 'list' object has no attribute 'get'

It never even made a .dump file ...

@redraskal
Contributor Author

redraskal commented Feb 5, 2023

@tomperchtold I found the problem. The argument was automatically being converted to lowercase, and TikTok apparently only checks the case on accounts without embeds enabled 🤷‍♂️

Example of working syntax:

yt-dlp --extractor-args "tiktok:secuid=MS4wLjABAAAA40PUfcpIYmV6z6R8zfneyFoseqzb_Jen5DeKDuSpkhmoKaXNjDikpuK3cPzuD4wv" https://tiktok.com/@chelsea.m07
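To illustrate why lowercasing breaks this, here is a simplified stand-in for extractor-argument parsing; it is not yt-dlp's real parser, and `parse_extractor_args` is a hypothetical name. As far as I know, yt-dlp's own `_configuration_arg` helper has a `casesense` keyword for the same purpose.

```python
def parse_extractor_args(raw, casesense=False):
    """Parse 'ie:key=v1,v2;key2=v3'; values are lowercased unless casesense is set."""
    ie_key, _, rest = raw.partition(":")
    args = {}
    for pair in rest.split(";"):
        key, _, value = pair.partition("=")
        values = value.split(",")
        args[key] = values if casesense else [v.lower() for v in values]
    return ie_key, args


SEC_UID = "MS4wLjABAAAA40PU"  # shortened, illustrative value only

_, lowered = parse_extractor_args(f"tiktok:secuid={SEC_UID}")
_, preserved = parse_extractor_args(f"tiktok:secuid={SEC_UID}", casesense=True)
```

The secUid mixes upper- and lowercase characters, so the lowercased form is a different identifier and the server can reject it.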

@redraskal redraskal reopened this Feb 5, 2023
@redraskal
Contributor Author

I screwed up the branch somehow and I think it's fixed now 😅

@chavinlo

chavinlo commented Feb 6, 2023

It doesn't work for me; I'm stuck on playwright install.
I've compiled it, and it keeps asking me to run playwright install even though I have already done so 5+ times. I even changed from Firefox to Chromium.

(env) root@ac34d37c5636:/workspace/yt-dlp-fix-tiktok-user# /workspace/yt-dlp-fix-tiktok-user/dist/yt-dlp_linux https://www.tiktok.com/@dancetutorials.tv
WARNING: Assuming --restrict-filenames since file system encoding cannot encode all characters. Set the LC_ALL environment variable to fix this.
[tiktok:user] Extracting URL: https://www.tiktok.com/@dancetutorials.tv
[tiktok:user] dancetutorials.tv: Downloading user embed
[tiktok:user] 7184245369289608494: Downloading video feed
ERROR: Executable doesn't exist at /tmp/_MEI18665q/playwright/driver/package/.local-browsers/chromium-1000/chrome-linux/chrome

 Looks like Playwright Test or Playwright was just installed or updated. 
 Please run the following command to download new browsers:              
                                                                         
     playwright install                                                  
                                                                         
 <3 Playwright Team                                                      

@chavinlo

chavinlo commented Feb 6, 2023

Turns out playwright saves browsers to /root/.cache/ms-playwright/
Setting the following env var: PLAYWRIGHT_BROWSERS_PATH=/root/.cache/ms-playwright/
and running it as follows:
PLAYWRIGHT_BROWSERS_PATH=/root/.cache/ms-playwright/ /workspace/yt-dlp-fix-tiktok-user/dist/yt-dlp_linux https://www.tiktok.com/@dancetutorials.tv
Works.
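When yt-dlp is launched from a script, the same workaround can be applied by passing a modified environment. This is a sketch: `env_with_browsers_path` is a hypothetical helper, and the paths are the ones from this comment, so adjust them to your setup.

```python
import os
import subprocess


def env_with_browsers_path(browsers_dir, base_env=None):
    """Return a copy of the environment with PLAYWRIGHT_BROWSERS_PATH set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PLAYWRIGHT_BROWSERS_PATH"] = str(browsers_dir)
    return env


def run_yt_dlp(binary, url, browsers_dir):
    """Run the given yt-dlp binary with the browser path override applied."""
    return subprocess.run(
        [binary, url],
        env=env_with_browsers_path(browsers_dir),
        check=False,
    )


# Example (not executed here):
# run_yt_dlp("/workspace/yt-dlp-fix-tiktok-user/dist/yt-dlp_linux",
#            "https://www.tiktok.com/@dancetutorials.tv",
#            "/root/.cache/ms-playwright/")
```

Copying the environment rather than mutating `os.environ` keeps the override local to the child process.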

However, I now get a different error, this time deeper in the stack:

(env) root@ac34d37c5636:/workspace/yt-dlp-fix-tiktok-user# PLAYWRIGHT_BROWSERS_PATH=/root/.cache/ms-playwright/ /workspace/yt-dlp-fix-tiktok-user/dist/yt-dlp_linux https://www.tiktok.com/@dancetutorials.tv
WARNING: Assuming --restrict-filenames since file system encoding cannot encode all characters. Set the LC_ALL environment variable to fix this.
[tiktok:user] Extracting URL: https://www.tiktok.com/@dancetutorials.tv
[tiktok:user] dancetutorials.tv: Downloading user embed
[tiktok:user] 7184245369289608494: Downloading video feed
[tiktok:user] Downloading page 1
ERROR: TypeError: Failed to fetch
    at fetch (https://s20.tiktokcdn.com/tiktok/common/init.js?cache:1:2112)
    at https://sf16-short-va.bytedapm.com/slardar/fe/sdk_lite/browser-nocookie.lite.1.2.4.maliva.js:1:1744
    at window.fetch (https://lf16-tiktok-web.tiktokcdn-us.com/obj/static-tx/secsdk/secsdk-lastest.umd.js:5:50199)
    at _0x171e0b (https://sf16-website-login.neutral.ttwstatic.com/obj/tiktok_web_login_static/webmssdk/1.0.0.12/webmssdk.js:1:573406)
    at _0x46e89c (https://sf16-website-login.neutral.ttwstatic.com/obj/tiktok_web_login_static/webmssdk/1.0.0.12/webmssdk.js:1:571546)
    at https://lf16-tiktok-web.tiktokcdn-us.com/obj/tiktok-web-tx/webmssdk_ex/2.0.0.38/webmssdk_ex.js:1:50651
    at new Promise (<anonymous>)
    at https://lf16-tiktok-web.tiktokcdn-us.com/obj/tiktok-web-tx/webmssdk_ex/2.0.0.38/webmssdk_ex.js:1:50508
    at _0x735c04 (https://lf16-tiktok-web.tiktokcdn-us.com/obj/tiktok-web-tx/webmssdk_ex/2.0.0.38/webmssdk_ex.js:1:8632)
    at eval (eval at evaluate (:197:30), <anonymous>:1:13)

@chavinlo

chavinlo commented Feb 6, 2023

wget works, so I doubt it's IP blocking:

(env) root@ac34d37c5636:/workspace/yt-dlp-fix-tiktok-user# wget https://s20.tiktokcdn.com/tiktok/common/init.js?cache:1:2112
--2023-02-06 04:04:24--  https://s20.tiktokcdn.com/tiktok/common/init.js?cache:1:2112
Resolving s20.tiktokcdn.com (s20.tiktokcdn.com)... 104.129.67.171, 104.129.67.169
Connecting to s20.tiktokcdn.com (s20.tiktokcdn.com)|104.129.67.171|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7193 (7.0K) [application/javascript]
Saving to: 'init.js?cache:1:2112.1'

init.js?cache:1:2112.1   100%[==================================>]   7.02K  --.-KB/s    in 0s      

2023-02-06 04:04:24 (1.18 GB/s) - 'init.js?cache:1:2112.1' saved [7193/7193]

@redraskal
Contributor Author

@chavinlo I really doubt it's IP blocking. I can download the profile on Windows, so I'm guessing this has to do with your environment or the x-tt-params header. The x-tt-params header contains operating system details, so TikTok might detect that the OS, or possibly the timezone, in the header is incorrect. Can you run yt-dlp with --verbose and send the output?

@tomperchtold

@redraskal
Thank you so much for your work! I can finally download most of my accounts again. I did notice one small detail, however: some accounts use special characters, so when I search for them I only get the secUid URL, e.g. https://www.tiktok.com/@MS4wLjABAAAAqr9sFjn_sxBpBIXokOmQuOkDYAHd2H1vqA5WcNmW8AEpENMmU3YQcbEXLgqocQv1 for the user ħσиєуу (and not https://www.tiktok.com/@ħσиєуу), probably because browsers cannot handle that name.

When I use
yt-dlp --extractor-args "tiktok:secuid=MS4wLjABAAAAqr9sFjn_sxBpBIXokOmQuOkDYAHd2H1vqA5WcNmW8AEpENMmU3YQcbEXLgqocQv1" https://tiktok.com/@MS4wLjABAAAAqr9sFjn_sxBpBIXokOmQuOkDYAHd2H1vqA5WcNmW8AEpENMmU3YQcbEXLgqocQv1
I get the same 400 error and no files; the same happens without the extractor args, of course.
If it isn't too much trouble, could you please take a look at that? It is not that important, though, since only a small portion of accounts use special characters.

@redraskal
Contributor Author

redraskal commented Feb 11, 2023

@tomperchtold It may have to do with the account not having a lot of videos. The videos come with the profile page online without calling the API. TikTok might block API usage if they know it could be pre-rendered with captcha. Do you notice this with other accounts that have only a couple of videos?

@tomperchtold

@redraskal
Hi! Thanks for the answer! I must admit I only have a few users with special characters in their names, and they all have only a few videos.

@pukkandan mentioned this pull request on Mar 3, 2023
@zone559

zone559 commented Mar 31, 2023

Are there any plans to have Playwright obtain the secuid automatically?
Also, are there plans to have it download videos without a watermark, like the original fork?

@redraskal
Contributor Author

redraskal commented Mar 31, 2023

Many videos can be downloaded without watermark on here & this fork does automatically fetch the secuid. Sometimes, the request fails. The extractor argument is to ensure it succeeds even if tiktok blocks the request.

@zone559

zone559 commented Apr 1, 2023

I graduated from Bing Chat University, but I can't code for sh*t. However, I was able to ask Bing Chat to write this code for me, which can scrape video URLs from a user's profile without requiring a secuid. I'm not sure if it will be helpful since the original code seems to rely on sending API requests and other technical aspects that I don't understand, since I can't code. Lol.

It was able to scrape 50000+ URLs from the addisonre account.
Here is photo proof: https://i.imgur.com/4DeJjCq.png

I figure you might be able to find a way to integrate the code to your code since I don't know how to do it.

addisonre.txt
testdownload.zip

@redraskal
Contributor Author

redraskal commented Apr 1, 2023

There is no way to retrieve a list of videos without a secuid.

The TikTok web API does not internally index accounts by username. They use secuid, which you have to retrieve with various methods.

My fork uses a method to automatically retrieve the secuid. TikTok will occasionally block this method, which is why I added an argument for fallback.

Behind the scenes, the browser in your example runs the same logic from my implementation.

Now, it is slower because the browser renders all of the videos while scrolling and uses significantly more resources.

My fork uses a minimal amount of browser interaction to achieve the same results.

The other problem with the scrolling approach is you risk captcha screening that will break the scraping.

In the browser, TikTok supplies the secuid. This page can be blocked by captcha, which is why I designed a new approach to retrieve the secuid without risking captcha. Unfortunately, this solution still risks random failures from TikTok's security measures. But, it's more reliable.

@zone559

zone559 commented Apr 2, 2023

I was able to fix the secuid issue, I think; it is no longer needed to download videos. If you have the time to double-check it, that would be great.

Also, please fix the issue with batch file downloading. I tried a batch file download and it didn't work.

Do you think this method would work without having to deal with a captcha?

Sorry if I'm bothering you. I'm just trying to fix the issue with the secUid and HTTP 400 errors and make it easy for everyone to use without having to deal with these problems.

https://www.tiktok.com/@MS4wLjABAAAAqr9sFjn_sxBpBIXokOmQuOkDYAHd2H1vqA5WcNmW8AEpENMmU3YQcbEXLgqocQv1
https://www.tiktok.com/@lillybketchman
https://www.tiktok.com/@chelsea.m07
https://www.tiktok.com/@avamajuryyy
https://www.tiktok.com/@addisonre
https://www.tiktok.com/@rizz4emma
https://www.tiktok.com/@samandjessofficial
https://www.tiktok.com/@karolg
https://www.tiktok.com/@jasonderulo
https://www.tiktok.com/@samsmith
https://www.tiktok.com/@mileycyrus
https://www.tiktok.com/@edsheeran
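
A list like this mixes ordinary handles with one entry that has a secuid in the handle position. A minimal sketch of how the two could be told apart before processing a batch, using only the standard library. `profile_handle` and `looks_like_secuid` are hypothetical helpers, and the `MS4wLjABAAAA` prefix check is an assumption based solely on the secuid values visible in this thread, not a documented format.

```python
from urllib.parse import urlparse

# Assumption: both secuid values shown in this thread start with this prefix.
SECUID_PREFIX = 'MS4wLjABAAAA'


def profile_handle(url: str) -> str:
    """Return the part after '@' in a profile URL, ignoring any query string."""
    path = urlparse(url.strip()).path  # e.g. '/@addisonre'
    first_segment = path.lstrip('/').split('/', 1)[0]
    if not first_segment.startswith('@') or len(first_segment) == 1:
        raise ValueError(f'not a profile URL: {url!r}')
    return first_segment[1:]


def looks_like_secuid(handle: str) -> bool:
    """Heuristic only: True if the handle starts with the assumed prefix."""
    return handle.startswith(SECUID_PREFIX)


if __name__ == '__main__':
    for line in ('https://www.tiktok.com/@addisonre',
                 'https://www.tiktok.com/@badbunny?lang=en'):
        handle = profile_handle(line)
        kind = 'secuid' if looks_like_secuid(handle) else 'username'
        print(f'{handle}: {kind}')
```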

tiktok.zip

zone559 commented Apr 21, 2023

The code is broken again:

>yt-dlp -vw -f best -o "%(uploader)s/ %(title)s.%(ext)s" https://www.tiktok.com/@badbunny?lang=en
[debug] Command-line config: ['-vw', '-f', 'best', '-o', '%(uploader)s/ %(title)s.%(ext)s', 'https://www.tiktok.com/@badbunny?lang=en']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.03.04 [392389b7d] (win_exe)
[debug] Lazy loading extractors is disabled
[debug] Python 3.11.3 (CPython AMD64 64bit) - Windows-10-10.0.19045-SP0 (OpenSSL 1.1.1t 7 Feb 2023)
[debug] exe versions: ffmpeg 2023-04-06-git-b564ad8eac-full_build-www.gyan.dev (setts), ffprobe 2023-04-06-git-b564ad8eac-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.09.24, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
WARNING: "-f best" selects the best pre-merged format which is often not the best option.
         To let yt-dlp download and merge the best available formats, simply do not pass any format selection.
         If you know what you are doing and want only the best pre-merged format, use "-f b" instead to suppress this warning
[debug] Loaded 1786 extractors
[tiktok:user] Extracting URL: https://www.tiktok.com/@badbunny?lang=en
[tiktok:user] badbunny: Downloading user embed
[tiktok:user] 7222743017118240046: Downloading video feed
[debug] [tiktok:user] Launching headless browser
[tiktok:user] Downloading page 1
[debug] [tiktok:user] x-tt-params: 4mOJhGnMOdadxrLEy2bkmGSR2R38w8nZC8MQKREioTAU76aXIbW+KkRzj5O7qqbOI65rqSqkFAXNltiJ1p2YyvalYQ0VbxXkcXRyJfQSSRTrfXJEJCBQAo4pw9y2uxmp+rZh4iq0meHipIelfrq1iam21uXzUd1axVIx41CrUZVZGWlljASYrKA6EeLj8tkbp3JBHYdiu2jrZr2Mom4hPjajDbU2tBMonl3e765JMg/Hfrn7LJAZDYdFvxutJQIr+QChno/OgWNz7/jIIWdbQezaYdZXc1UEftjNySIiAFWJwGInqqt2ILW6ccB5cLBR9k3jzKduhfUn2eDGv9sMw26CaX2XfLZwslL2i91IZW0zlpLD/9h7wlpmg/6buOEXfKoGP2yaVGw3qepWWajKXhcxEnqEesvbJ9eA5AacqyxEuCWQSTt/cyqIMiYLvpmkl30c7jeNp784QQ+bEc8qg3s61+j8S0tWmjaOPae8/Gjmrk6+T365EbYSQhdpSuDUUNopCpSvSn5vu9t6LRx8PgWibM5e/t3dEJLK0PMsghJAcRdwloYiUNC9Vp7JaaYQ/C+GoCxkdRXkAhHXMGibgZ6BlUvAWkuortBEhheq/r+gGgGLLcz6wVy1WEDX+dPjxCXK6e8+4TcY80/V0PQH/fXhJDWftlD8Gk6sFJWOnU095NbEhd9hdB2aKm5cALM/wvc4eiYB3RH8Ocxed71XtzEX5JmZYnOxJxMxKR8VISh0jI5Bw+Ypldsjv/KOAYDMheBQqrKh3d7VZDIq7d/nk73XD8HnIklbceU90M2lYdSBknI8FQ/tYqxmJ7QaDAJP7q3pvLWjySkcRe/HWFcskT5F8t4GfpYFJsl+rES20cu0gjpA+pt12SZ42tSVw/dl
ERROR: JSON.parse: unexpected end of data at line 1 column 1 of the JSON data
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1518, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1594, in __extract_info
  File "yt_dlp\extractor\common.py", line 694, in extract
  File "yt_dlp\extractor\tiktok.py", line 794, in _real_extract
  File "yt_dlp\extractor\tiktok.py", line 704, in _video_entries_api
  File "playwright\sync_api\_generated.py", line 8622, in evaluate
  File "playwright\_impl\_sync_base.py", line 104, in _sync
  File "playwright\_impl\_page.py", line 411, in evaluate
  File "playwright\_impl\_frame.py", line 277, in evaluate
  File "playwright\_impl\_connection.py", line 61, in send
  File "playwright\_impl\_connection.py", line 461, in wrap_api_call
  File "playwright\_impl\_connection.py", line 96, in inner_send
playwright._impl._api_types.Error: JSON.parse: unexpected end of data at line 1 column 1 of the JSON data

pukkandan force-pushed the master branch 2 times, most recently from ee280c7 to 7aeda6c on May 24, 2023 18:09

julian45 commented Jun 4, 2023

I'm not sure how helpful this information is to @redraskal (thanks so much for all of your time and effort on this thus far!!) or anyone else here, but I tried using this branch a few different ways. No luck on any of them, unfortunately. However, I did put the --verbose flag on the attempt most likely to work, and the output of that is below. This is on an M2 MacBook Air; the fork was freshly cloned, compiled, and installed with all dependencies (including playwright) immediately before execution, and the secuid was obtained directly from the account via the method described here. For what it's worth, playwright install exited quickly and silently whenever I ran it, though I recall it taking some time to download a copy of its browser (and tell me it was doing so) before exiting in past instances.

sh-3.2$ ~/rr-ytdlp/yt-dlp --extractor-args "tiktok:secuid=MS4wLjABAAAAS9ciBdtH3kzTIz09dQVBybwPFnbjO-Q5FqkXdQ34LLBTkIi6hLYYwtYLDwjh0PaY" "https://www.tiktok.com/@lilnasx" --verbose
[debug] Command-line config: ['--extractor-args', 'tiktok:secuid=MS4wLjABAAAAS9ciBdtH3kzTIz09dQVBybwPFnbjO-Q5FqkXdQ34LLBTkIi6hLYYwtYLDwjh0PaY', 'https://www.tiktok.com/@lilnasx', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [6becd2508] (zip)
[debug] Python 3.11.3 (CPython arm64 64bit) - macOS-13.2-arm64-arm-64bit (OpenSSL 1.1.1u  30 May 2023)
[debug] exe versions: ffmpeg 6.0 (setts), ffprobe 6.0, phantomjs 2.1.1, rtmpdump 2.4
[debug] Optional libraries: Cryptodome-3.18.0, brotli-1.0.9, certifi-2023.05.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-11.0.3
[debug] Proxy map: {}
[debug] Loaded 1764 extractors
[tiktok:user] Extracting URL: https://www.tiktok.com/@lilnasx
[tiktok:user] lilnasx: Downloading user embed
[tiktok:user] 7199859560654392618: Downloading video feed
[debug] [tiktok:user] Launching headless browser
[tiktok:user] Downloading page 1
[debug] [tiktok:user] x-tt-params: 4mOJhGnMOdadxrLEy2bkmGSR2R38w8nZC8MQKREioTAU76aXIbW+KkRzj5O7qqbOI65rqSqkFAXNltiJ1p2YyvalYQ0VbxXkcXRyJfQSSRQ83IfoaB0645+3LXugf8JQKn8jWpUFzNZreXESRI5rztJ7vOX8JYMkIVKyI1rTA6yPuTGSoSmKZ/gX5MStf2/3pXzQvgBY9TSheAteuSuPFLILGUL46IAs8Md9iwKTgtK5Xhnmn69v/86g8vtmIlWNTwdiqxVEDbKeLXWVNwEnnv+tGIB/LR4OfE7JyWcC4PbqSGBtWADvYn/3fRAt3KVaDwhudcXqxcrQaieLJyGQrbRN7PLMyVZvKJb0LZzCwJO0zbGnJ37A7isxgb0Pws2TmvsnEwCaIpVm8bzVzdrjBYiWEtUYrcaBgdqDVV0hj2y432s2hBxCR/YhCGVNC9BNgjKzx0gNtOhUqIg8zBSRyyP81SvzrLzMTEpKe+9LCbEfh4St+76Bt7J4YyetihgRC3aWkmFsSdmeP8vYCtkY48CxyFHiDw7ntqwa3nTw+xarWRUXn3Ewhq9gHjXOXG9qsjSXDyTAhEzZRv4Zde9So5DJDBZBNB4CZTTZjssPvl1e2VMQvQ0w9RIStCdxQsrfuonULp2O+amwao9yjXaE2kuZ7dFdDT+C+ZScWvQCMWL9jscQ02XgfEBLhX3UUpklgetMQCey5FErGI5EuLu9fw+mUewfN9lLE0MJIJgaXpOoPZA1/Nm53ong9pxVrCLGREXfEiyznaTpT1RjrE7ElXP3sG0N46ycjFzZmDeBSCOZZcjzF/sBgae3Ji3RYDM8Wy3xIuqgvGSvwki9Ivl+FRmcwb8z61QRApLO8L+IDsfMHasw2eFRlPZPggDyDgu2
ERROR: JSON.parse: unexpected end of data at line 1 column 1 of the JSON data
Traceback (most recent call last):
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/YoutubeDL.py", line 1502, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/YoutubeDL.py", line 1578, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/extractor/common.py", line 682, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/extractor/tiktok.py", line 794, in _real_extract
    author, response = self._video_entries_api(user_name, secUid)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/julian/rr-ytdlp/yt-dlp/yt_dlp/extractor/tiktok.py", line 704, in _video_entries_api
    data_json = page.evaluate('([x, d]) => fetch(`https://us.tiktok.com/api/post/item_list/?aid=1988&app_language=en&app_name=tiktok_web&browser_language=en-US&browser_name=Mozilla&browser_online=true&browser_platform=Win32&browser_version=5.0%20%28Windows%29&channel=tiktok_web&cookie_enabled=true&device_id=${d}&device_platform=web_pc&focus_state=true&from_page=user&history_len=2&is_fullscreen=false&is_page_visible=true&os=windows&priority_region=&referer=&region=US&screen_height=1080&screen_width=1920`, { headers: { "x-tt-params": x } }).then(res => res.json())', [x_tt_params, device_id])
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/playwright/sync_api/_generated.py", line 8622, in evaluate
    self._sync(
  File "/opt/homebrew/lib/python3.11/site-packages/playwright/_impl/_sync_base.py", line 104, in _sync
    return task.result()
           ^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/playwright/_impl/_page.py", line 411, in evaluate
    return await self._main_frame.evaluate(expression, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/playwright/_impl/_frame.py", line 277, in evaluate
    await self._channel.send(
  File "/opt/homebrew/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 61, in send
    return await self._connection.wrap_api_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 461, in wrap_api_call
    return await cb()
           ^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 96, in inner_send
    result = next(iter(done)).result()
             ^^^^^^^^^^^^^^^^^^^^^^^^^
playwright._impl._api_types.Error: JSON.parse: unexpected end of data at line 1 column 1 of the JSON data
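
Both tracebacks in this thread end in the browser-side `res.json()` call, and "unexpected end of data at line 1 column 1" is consistent with the API returning an empty body, which matches the whitespace-only responses mentioned in the PR description. A minimal sketch of how the failure could be reported more clearly, assuming the page handed back the raw response text instead of parsing it in the browser. `parse_api_body` and `EmptyApiResponse` are hypothetical names and are not part of yt-dlp or the fork.

```python
import json
from typing import Any


class EmptyApiResponse(Exception):
    """Hypothetical error: the API answered with no usable body."""


def parse_api_body(body: str) -> Any:
    """Parse a response body as JSON, with a clear error for empty bodies."""
    if not body.strip():
        # An empty or whitespace-only body is not a JSON syntax problem;
        # it more likely means the request was rejected upstream.
        raise EmptyApiResponse(
            'API returned an empty body; the request was probably blocked')
    return json.loads(body)
```

With this split, an empty response surfaces as its own error instead of a generic JSON parse failure, which makes reports like the ones above easier to triage.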

Labels
enhancement: New feature or request
site-request: Request to support a new website
Projects
Status: tiktok
Development

Successfully merging this pull request may close these issues.

[tiktok:user] Failed to parse JSON
[TikTok] Add more data to profile json