Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deezer extractor #4935

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Fix deezer extractor #4935

wants to merge 7 commits into from

Conversation

LucBerge
Copy link
Contributor

@LucBerge LucBerge commented Sep 15, 2022

Description of the pull request and other information

This pull request fix the Deezer extractor with full song support and closes #1591.

  • Add blowfish encipher
  • Add deezer downloader
  • Add full song support
  • Add artist extractor
  • Add track extractor
  • Add episode extractor
  • Add show extractor

Before submitting this pull request I have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of the DeezerExtractor code and I am willing to release it under Unlicense. See youtube-dl.
  • I am not the original author of the Blowfish key computation and cipher code but it is in public domain or released under Unlicense (provide reliable evidence). See youtube-dl.

What is the purpose of my pull request?

- Add blowfish encipher
- Add deezer downloader
- Add full song support
- Add artist extractor
- Add track extractor
- Add episode extractor
- Add show extractor
@LucBerge
Copy link
Contributor Author

What about the failed core tests ?

@pukkandan
Copy link
Member

Dont import legacy compat variables

@LucBerge
Copy link
Contributor Author

LucBerge commented Sep 15, 2022

Dont import legacy compat variables

So I copy the function code in the file ?

yt_dlp/extractor/deezer.py Outdated Show resolved Hide resolved
yt_dlp/extractor/deezer.py Outdated Show resolved Hide resolved
@Grub4K
Copy link
Member

Grub4K commented Sep 15, 2022

So I copy the function code in the file ?

No, you use the functions provided by Python, for example the chr builtin instead of compat_chr. Compat functions were workarounds which are no longer needed since support for old Python versions is not required.

The decryption process could potentially be done more cleanly if the calculation of the key happens inside the downloader and is not dependant on the key property being present.

Also Cryptodome provides an internal Blowfish implementation (Cryptodome.Cipher.Blowfish). It might be worth using that one and fallback only if Cryptodome not available, like doing already for AES inside dependencies.py

- Remove utf-8 encoding sys reconfiguration
- Do not use compat functions anymore
- Blowfish key calculation in the extractor
- Minor fixes
@LucBerge
Copy link
Contributor Author

No, you use the functions provided by Python, for example the chr builtin instead of compat_chr. Compat functions were workarounds which are no longer needed since support for old Python versions is not required.

Done.

The decryption process could potentially be done more cleanly if the calculation of the key happens inside the downloader and is not dependant on the key property being present.

If by "inside the downloader" you mean the DeezerBaseInfoExtractor, then it's done.

Also Cryptodome provides an internal Blowfish implementation (Cryptodome.Cipher.Blowfish). It might be worth using that one and fallback only if Cryptodome not available, like doing already for AES inside dependencies.py

I'll do it later.


GW_LIGHT_URL = "https://www.deezer.com/ajax/gw-light.php"
GET_URL = "https://media.deezer.com/v1/get_url"
BLOWFISH_KEY = b"g4el58wc0zvf9na1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this key obtained?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving the key in the repository may not be the best idea. Deezer has sent DMCAs to many repos containing it in the past. It specifically mentions "To do so, these repositories are sharing the private encryption keys retrieved maliciously to bypass Deezer's security measures to unlawfully download its music catalogue, directly from Deezer’s servers in total violation of our rights and those of our music licensors (phonographic producers, performing artists, songwriters and composers)."

It might be a better idea to dynamically generate the key or just make users provide it themselves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ryan5453 I agree, this is a good idea. Note that this key is not directly used to decrypt the song. It is used to compute a new key that will decrypt the song.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ryan5453 I have no idea what this code returns. I can't run it on my windows. Do you know how to generate this key in python directly ?

yt_dlp/blowfish.py Outdated Show resolved Hide resolved
@gamer191
Copy link
Collaborator

gamer191 commented Sep 18, 2022

Related PRs: ytdl-org/youtube-dl#13194 ytdl-org/youtube-dl#22074

EDIT: the 2nd PR I linked doesn't support full length songs, according to ytdl-org/youtube-dl#22074 (comment):
@Jan-NiklasB , note that the full length song support is available on the branch [deezer uncipher](https://github.com/LucBerge/youtube-dl/tree/deezer_uncipher).
Unfortunately the linked repo was DMCAed when youtube-dl was DMCAed, but it might be available on the wayback machine (which seems to be down right now)

@LucBerge
Copy link
Contributor Author

Unfortunately the linked repo was DMCAed when youtube-dl was DMCAed, but it might be available on the wayback machine (which seems to be down right now)

Yes it is : https://web.archive.org/web/20200917071235/https://github.com/LucBerge/youtube-dl but the deezer_uncipher branch is not.

@LucBerge
Copy link
Contributor Author

LucBerge commented Oct 3, 2022

Regarding this comment (and a discussion I had with someone 😅) authentication is not used in my code. It explains why the MD5_ORIGIN key is not present and why MP3_64, MP3_128 and MP3_MISC are the only format I have access to even though all of them exists. Authentication code must be added in the futur for better song quality.

@gamer191 GitHub don't want to unlock my repo, they proposed to remove it so I can fork it again. My commits are still on my local computer, I can push it to a new github repo but it can be DMCA again for the same reason youtube-dl was. So maybe it's not the best idea.

- Remove useless impot
- Fix tracks number for extractors
- Fix comment
@LucBerge LucBerge requested review from pukkandan, gamer191 and Ryan5453 and removed request for pukkandan and gamer191 March 8, 2023 12:39
Copy link

@Ryan5453 Ryan5453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't actually tested this code and therefore can not confirm if it actually works or not, but the new key extractor (in theory) is definitely a much better implementation than hardcoding it for legal reasons.

@LucBerge
Copy link
Contributor Author

LucBerge commented Mar 8, 2023

I haven't actually tested this code and therefore can not confirm if it actually works or not, but the new key extractor (in theory) is definitely a much better implementation than hardcoding it for legal reasons.

You mean generating the key dynamically based on the code you provided?

@Ryan5453
Copy link

Ryan5453 commented Mar 8, 2023

I haven't actually tested this code and therefore can not confirm if it actually works or not, but the new key extractor (in theory) is definitely a much better implementation than hardcoding it for legal reasons.

You mean generating the key dynamically based on the code you provided?

Yes. I'm saying you did a good job updating that.

@LucBerge
Copy link
Contributor Author

LucBerge commented Mar 8, 2023

job updating tha

Thanks. What can I do for core tests failing?
I have not even touch the files it fails on.

@@ -0,0 +1,309 @@
from .compat import compat_struct_unpack, compat_struct_pack
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I should not import compat functions

@LucBerge LucBerge requested a review from pukkandan March 8, 2023 13:05
@pukkandan
Copy link
Member

pukkandan commented Mar 8, 2023

Due to reasons we have discussed above (especially #4935 (comment)), I don't think it is a good idea to merge this code. In theory, we would have some protection by not hard-coding the key, but even so, it is a very risky move for us1.

Since our plugin system has now matured, I recommend distributing this yourself as a plugin instead. This way, any DMCA will only affect your plugin, and not the entire project. While there are no downloader plugins yet, you can hack it in due to the flexibility of Python

from yt_dlp.downloader import PROTOCOL_MAP
PROTOCOL_MAP['deezer'] = DeezerFD

PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs

Footnotes

  1. I have neither the time, the will, or the economic ability to fight it in court

@LucBerge
Copy link
Contributor Author

LucBerge commented Mar 8, 2023

Due to reasons we have discussed above (especially #4935 (comment)), I don't think it is a good idea to merge this code. In theory, we would have some protection by not hard-coding the key, but even so, it is a very risky move for us.

It might be a better idea to dynamically generate the key or just make users provide it themselves.

The key is now dynamically generated as suggested

Since our plugin system has now matured, I recommend distributing this yourself as a plugin instead. This way, any DMCA will only affect your plugin, and not the entire project. While there are no downloader plugins yet, you can hack it in due to the flexibility of Python

from yt_dlp.downloader import PROTOCOL_MAP
PROTOCOL_MAP['deezer'] = DeezerFD

My goal in this PR is to share the extractor with others so people will be able to download too. Nobody will use my extractor as a plugin...

PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs

Any link to the issue/discussion ?

@pukkandan
Copy link
Member

Nobody will use my extractor as a plugin...

People who need it will find it. It can be added to the wiki and I could change even the current extractor warning to point to plugins if desired.

PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs

Any link to the issue/discussion ?

A lot of the discussion happens on discord b/w maintainers. #3048 is one possible implementation that came out of it.

@coletdjnz
Copy link
Member

Plugin repos can be tagged with yt-dlp-plugins to be discoverable. With the new plugin system we are aiming to make it a viable alternative to having extractor support in core. It's pretty new so not too many have heard of it yet nor is there many public plugins listed yet, so will take some time.

Also, after some investigation a while back, the origins of some of this code are potentially from DMCA'd sources and likely not under unlicense (it goes further back than the PR on youtube-dl), so we likely cannot include it anyway in it's current form.

@LucBerge
Copy link
Contributor Author

LucBerge commented Mar 8, 2023

So we likely cannot include it anyway in it's current form.

Ok. So in which form can you accept it ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deezer support
6 participants