[extractor/youtube] Fix parsing comment_count with unit suffix #6523

Merged
merged 2 commits into from Mar 14, 2023


nick-cd
Contributor

@nick-cd nick-cd commented Mar 12, 2023

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

YouTube's extractor can now correctly parse comment count values containing unit suffixes (e.g. K, M).

Fixes #5849

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

The `comment_count` extraction logic now uses the internal
`self._get_count()` method to reliably extract the count.

Closes yt-dlp#5849
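
As a rough illustration of the kind of parsing this fix concerns, here is a minimal standalone sketch that turns strings such as "3.2K" into integers. It is not yt-dlp's implementation of `self._get_count()`; the function name `parse_count_with_suffix`, the suffix table, and the regular expression are all assumptions made for this example.

```python
import re

# Hypothetical suffix table for illustration only; the real extractor
# may recognise a different or larger set of units.
SUFFIX_MULTIPLIERS = {'': 1, 'K': 1_000, 'M': 1_000_000, 'B': 1_000_000_000}


def parse_count_with_suffix(text):
    """Return an int for strings like '3.2K' or '1,234', or None if unparseable."""
    if not isinstance(text, str):
        return None
    # A number (optionally with thousands separators and a decimal part),
    # followed by an optional single-letter unit suffix.
    match = re.match(r'\s*(\d[\d,]*(?:\.\d+)?)\s*([KMB])?\b', text, re.IGNORECASE)
    if not match:
        return None
    number = float(match.group(1).replace(',', ''))
    suffix = (match.group(2) or '').upper()
    # round() guards against floating-point noise before converting to int.
    return int(round(number * SUFFIX_MULTIPLIERS[suffix]))
```

With this sketch, `parse_count_with_suffix('3.2K')` yields `3200`, while a plain `int()` conversion of the same string would raise a `ValueError`, which is one way a comment count can end up missing.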
@pukkandan pukkandan added the site-bug Issue with a specific website label Mar 12, 2023
@pukkandan
Member

Pls make sure to manually test both traversal paths. You can test the second path by changing one of the keys in the first to intentionally break it.
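
A minimal sketch of that kind of check, assuming a nested-dictionary response and a simple first-match lookup. The helper `first_match`, the sample data, and every key except `twoColumnWatchNextResults` are hypothetical and do not reflect the extractor's real paths or yt-dlp's own traversal utilities.

```python
def first_match(data, *paths):
    """Return the value at the first path that resolves fully, else None."""
    for path in paths:
        node = data
        for key in path:
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                node = None  # this path is broken; try the next one
                break
        if node is not None:
            return node
    return None


# Hypothetical response shape and paths, for illustration only.
response = {
    'contents': {'twoColumnWatchNextResults': {'commentCount': '3.2K'}},
    'engagementPanels': {'commentCount': '3.2K'},
}
PRIMARY = ('contents', 'twoColumnWatchNextResults', 'commentCount')
FALLBACK = ('engagementPanels', 'commentCount')

# Deliberately misspell one key so the first path fails and the fallback is used.
BROKEN_PRIMARY = ('contents', 'twoColumnWatchNextResults_BROKEN', 'commentCount')

via_primary = first_match(response, PRIMARY, FALLBACK)
via_fallback = first_match(response, BROKEN_PRIMARY, FALLBACK)
```

If both lookups return the same value, the fallback path agrees with the primary one for this sample input.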

@nick-cd
Contributor Author

nick-cd commented Mar 14, 2023

Sure thing :)

I conducted a bit of smoke testing on the same URL as the OP in #5849 (https://www.youtube.com/watch?v=TPKX6K2u8Tc). My results are below. (Note: at the time of writing, the comment counter is around 3.2K.)

First Path Test

The following GIF shows the resultant info dictionary after fetching the comment_count key, following the first expected path:

[GIF: fetching-count-with-expected-path]

Second Path Test

In this test, I eliminated twoColumnWatchNextResults from the first path, breaking it. The application now relies on the fallback path to retrieve the comment_count key.

[GIF: fetching-count-with-fallback-path]

According to these trivial tests, both paths provide the same count information.

Please let me know if this evidence is enough to prove that my patch works.

@pukkandan
Member

Evidence wasn't necessary 😄. I was just reminding you to test the fallback path.

@nick-cd
Contributor Author

nick-cd commented Mar 14, 2023

Oh, my bad 😄.

@pukkandan pukkandan merged commit 071670c into yt-dlp:master Mar 14, 2023
11 checks passed
@nick-cd nick-cd deleted the fix/comment-count-parsing branch March 14, 2023 23:34
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Labels
site-bug Issue with a specific website
Development

Successfully merging this pull request may close these issues.

Youtube comment count sometimes empty
4 participants