BiliBili - cannot parse <h1> title when it has `<` or `>` characters inside #15389

Closed
XVilka opened this issue Jan 22, 2018 · 1 comment

@XVilka XVilka commented Jan 22, 2018

  • I've verified and I assure that I'm running youtube-dl 2018.01.21
  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

In the line

 title = self._html_search_regex('<h1[^>]*>([^<]+)</h1>', webpage, 'title')

of https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/bilibili.py the title is extracted with a regex search. However, it is common for video titles to contain `>` and `<` characters, and the capture group `([^<]+)` stops at the first `<`, so such a title cannot be matched.
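
A minimal sketch of the reported behaviour, using only the standard `re` module rather than youtube-dl's own helper. The sample HTML strings, the `find_title` function and the `TOLERANT` pattern are hypothetical and only illustrate one possible direction; they are not the extractor's actual code or the fix that was adopted.

```python
import re

# Pattern quoted in the report: the capture group cannot cross a '<'.
ORIGINAL = r'<h1[^>]*>([^<]+)</h1>'

# Hypothetical alternative (an assumption, not the project's fix):
# lazily capture everything up to the first closing </h1> tag.
TOLERANT = r'(?s)<h1[^>]*>(.+?)</h1>'


def find_title(pattern, webpage):
    """Return the first captured group, or None when nothing matches."""
    match = re.search(pattern, webpage)
    return match.group(1) if match else None


# Made-up page fragments for illustration only.
plain = '<h1 title="x">Plain title</h1>'
angled = '<h1 title="x">A <B> C</h1>'

plain_original = find_title(ORIGINAL, plain)    # matches
angled_original = find_title(ORIGINAL, angled)  # no match: '<' ends the group
angled_tolerant = find_title(TOLERANT, angled)  # matches across '<' and '>'
```

The lazy pattern also captures any markup nested inside the heading, so a real extractor would probably still need to clean the result afterwards.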

Collaborator

@dstftw dstftw commented Jan 22, 2018

Carefully read the new issue template and provide all requested information.

@dstftw dstftw closed this Jan 22, 2018
@dstftw dstftw added the incomplete label Jan 22, 2018