Fix line endings handling for downloaded web pages #10268

Closed
lkho opened this issue Aug 9, 2016 · 3 comments

Comments

@lkho
Contributor

@lkho lkho commented Aug 9, 2016

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2016.08.07. If it's not, read this FAQ entry and update. Issues with outdated versions will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2016.08.07

Before submitting an issue make sure you have:

  • At least skimmed through README and most notably FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

Description of your issue, suggested solution and other information

This error occurs when downloading a subtitle from http://d2anahhhmp1ffz.cloudfront.net/1828211116/0de60692c14fea8784203ca9f696a6be874beb52 (format: SRT, encoding: UTF-8, line endings: CRLF). The first 70 bytes of the response are:
b'\xef\xbb\xbf1\r\n00:00:20,173 --> 00:00:21,303\r\n(Episode 1)\r\n\r\n2\r\n00:00:50,559 --'

However, the written subtitle file has every '\r\n' replaced with '\r\r\n', which breaks the SRT format and causes some players to render the subtitle lines incorrectly.
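The doubling described above can be reproduced without touching the network. The sketch below assumes the cause is the usual one on Windows: the decoded text still contains '\r\n', and a text-mode write then expands each '\n' to '\r\n'. The issue does not confirm this, and the helper name write_text_windows_style is hypothetical. Passing newline='\r\n' to io.TextIOWrapper simulates Windows text-mode output on any platform.

```python
import io


def write_text_windows_style(text):
    # Hypothetical helper: encode text the way a Windows text-mode file
    # would, i.e. every LF in the text is written out as CRLF.
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding='utf-8', newline='\r\n')
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # keep the wrapper from closing the buffer
    return data


# Shortened sample modelled on the response bytes quoted in the issue
# (BOM omitted for brevity).
response = b'1\r\n00:00:20,173 --> 00:00:21,303\r\n(Episode 1)\r\n\r\n'

# bytes.decode() keeps the CRLF pairs exactly as received.
decoded = response.decode('utf-8')

# Writing that text in Windows text mode turns each '\n' into '\r\n',
# so the already-present '\r' ends up doubled.
written = write_text_windows_style(decoded)
print(repr(written))
```

Under that assumption, each CRLF in the response comes out as '\r\r\n' in the written bytes, which matches the corruption reported here.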

Suggested solution

Fix the decoding of downloaded webpage content by using io streams instead of bytes.decode(), so that Python handles the normalization of line endings, in
C:\Users\Me\git\youtube-dl\youtube_dl\extractor\common.py: def _webpage_read_content()
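A minimal sketch of the suggested approach, for illustration only: it is not the change merged into youtube-dl, and the function name decode_normalizing_newlines is hypothetical. It wraps the downloaded bytes in io.TextIOWrapper with newline=None, which enables universal newlines on read, so '\r\n' and lone '\r' both become '\n'. The 'utf-8-sig' codec is an extra assumption used here to drop the BOM seen in the sample response.

```python
import io


def decode_normalizing_newlines(content_bytes, encoding='utf-8'):
    # Hypothetical helper, not part of youtube-dl.
    # newline=None turns on universal newlines for reading: CRLF and
    # lone CR are both translated to LF in the returned text.
    stream = io.TextIOWrapper(
        io.BytesIO(content_bytes), encoding=encoding, newline=None)
    return stream.read()


# Shortened sample modelled on the response bytes quoted in the issue.
sample = b'\xef\xbb\xbf1\r\n00:00:20,173 --> 00:00:21,303\r\n(Episode 1)\r\n\r\n'

text = decode_normalizing_newlines(sample, 'utf-8-sig')
print(repr(text))
```

With the text normalized to '\n', a later text-mode write on Windows would produce a single '\r\n' per line rather than '\r\r\n'. One trade-off to keep in mind: the decoded text no longer matches the downloaded bytes exactly, which may matter for content where line endings are significant.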

@yan12125
Collaborator

@yan12125 yan12125 commented Aug 9, 2016

Which URL are you trying to download?

@lkho
Contributor Author

@lkho lkho commented Aug 9, 2016

I was downloading https://www.viu.com/ott/hk/zh-hk/vod/16061/ with a custom extractor (#8131) that is not included in the current release.

yan12125 added a commit that referenced this issue Aug 11, 2016
…kho-pr/#10268
yan12125 added a commit that referenced this issue Aug 11, 2016
@yan12125
Collaborator

@yan12125 yan12125 commented Aug 11, 2016

Fixed now that #10269 is merged. Thanks!

@yan12125 yan12125 closed this Aug 11, 2016