New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Download music with length closest to one on Spotify #111
Conversation
- added .python-version This file is used in pyenv to select good python version. Changes in core/misc.py: - added function get_sec to convert HH:mm:ss to seconds Changes in spotdl.py: - in function generate_songname change function to receive generate_metadata, this is optymalization, becouse in oldest version metadata from spotify api is downlaoded 2 times - in function generate_youtube_url song variable use changed function generate_songname function now looking for songs in while, and save data to dict. Dictionary is used in manual and auto mode. In dictionary keep is youtube link, title, videotime (in format HH:mm:ss) and videotime converted to seconds. For now in automatic downloading is selected video with least difference betwen youtube video time and time from spotify. This is important, becouse in youtube a lot of movies has scenes before/after without musics.
This sounds good in theory but I'll do some everyday testing with your changes to see how it performs before we finally decide whether to merge this or not. It might take some though. Thank you the PR! |
@WMP No offense, the overall idea is great. I'm just complaining about the comment you've changed, click on the commit > changed files and scroll down to see the specific lines I mean 😉 |
@linusg Ahhh, ok, now i understand ;) Thanks |
I agree, I think this would be a great additional feature! If it works pretty well perhaps it could be added as an option. It isn't fun realizing later that the song is over a minute longer than it should have been! |
@lkgarrison Agreed, as an optional feature, this would be great! We could like have an command line option for that... |
@lkgarrison @linusg : Why we want to add this alghorytm as option? What arguments we have for old solution? |
and
Instead of just promoting your new version, could you please give one or two examples where this is indeed the case? So the song names or Spotify links where the cases you've described apply. |
Yes, of course: Official version:
wmp@LUBN366:/skrypty/pobrane/spotify-downloader$ for i in *.mp3; do echo $"$i"; ffmpeg -af "volumedetect" -f null /dev/null -i $i 2>&1 | grep Parsed_volumedetect; done And my version:
Feeding 9 tracks to samochod.txt
wmp@LUBN366:/mnt/d/Users/wmp/Praca/skrypty/spotify-downloader$ cd Music/ |
Wow, thank you for the effort! 😄 |
https://github.com/ritiek/spotify-downloader/issues/31#issuecomment-314580638 🤔 |
This PR seems to be working good. I tested on some songs and also on a few ones that are known to cause issues like below: http://open.spotify.com/track/3ZffCQKLFLUvYM59XKLbVm http://open.spotify.com/track/5EauMVTXVoQOkax03XXVaV I haven't come across any issues yet. So far great. |
I think it will be nice to have this PR merged. @linusg what do you think? |
I did some small tests now and it works fine! I see no problem in merging, as I can definitively reproduce the results. Maybe it will also be a good improvement for some of the songs I downloaded, I'll see. My only concerns though are the tests failing: https://travis-ci.org/ritiek/spotify-downloader/jobs/253803168. Basically the YouTube URLs are different now and there's some type issue (indexing string but not passing an int to the brackets – don't know if this was introduced here). So basically yes, but the tests should be fixed before. |
Okay, I'll fix the tests tomorrow (kinda late here). @lkgarrison I personally don't think this should be added as an "additional feature" though, it works great as is in defaults. The main purpose of the automatic downloading is to download the best matching song and this PR just improves it. |
To be fair, I had the idea of making this optional too, but I see it the same way as @ritiek – making this default would be no bad at all 😄 Ah, and If you want to I can look into the tests – but a pointer what is wrong with the indices would be great. If you guys have no idea, I'll spend my time debugging. |
@linusg That seems to happening because Replacing
with
in every test function in Also as There are some more changes that could be made to tests to make them run faster but lets just focus on the tests passing any how at the moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think this is ready to merge. Thanks @WMP! 😄
Thanks @ritiek ! :) |
@ritiek I think it's great as a default feature too. I only suggested it as an optional feature in case it wasn't working quite as well as expected as it would give people the ability to "use at their own risk". As it stands I definitely agree - it certainly seems to improve the automatic downloading. Thanks for everyone's work on this! |
How does the PR select the right video with the same time as the Spotify song? In my testing I've seen a few downloads that didn't sound right, such as My concern is that the new code is looking for a song that is the exact same time length as the Spotify song, and if the true best youtube match is even 1 second different and there is another youtube video that is the exact same length but is far worse quality and is distorted (like the above examples) then it should not be using the video that is the exact length but not good quality. I think ideally it should be a the Youtube song with the highest number of views that is also +/- 5 seconds in length from the Spotify song. This isn't to say it never works, but I have about 1 or 2 in 10 songs definitely don't sound like the spotify track. Let me know what you guys think |
Both ways of selecting the YouTube song have their own ups and downs. I'll spend some more time trying to optimize it and hopefully fix these issues. |
This line count difference betwen spotify duration and youtube duration and sort by smallest difference. @lkgarrison if you want download single song, better is use manual option. I dokt know why, but i have other version that you:
|
@ritiek, I'm considering grabbing the view count off of the html just like the video length is grabbed. Is it an OK assumption that the structure of the html won't be changing any time soon? I suppose if it did, it would break much more than just the view count? |
@lkgarrison There is also another easy way to get the view count (and length too), you should use:
pafy uses the YouTube API, so it should be stable enough for the time being. Some one (or me) will need to make the same changes for getting the length too for stability purposes. |
@ritiek awesome! I'll see what I can do with that! |
@lkgarrison I've got another idea! We could use the YouTube filter to sort the videos results by view count by replacing the URL here with https://www.youtube.com/results?q={0}&sp=CAMSAhABUBQ%253D and then look for the video with the closest duration to spotify. I think this approach will be better than both the previous attempts. What say? |
Btw, I experimented a little but it seems like it won't be a good idea to use pafy to get information about all of the videos on the search results as pafy takes quite a time to parse a video. Each page contains about 20 videos, it would take a minute or so, parsing them with pafy. I think we are better off with extracting info from the HTML, even though it isn't a good idea in the long run. |
I see no problem in directly parsing the HTML at all. YouTube knows that by changibg the overall structure they'll probably break tons of 3rd party software not using their API. However, before doing so they'll announce it for sure so people can react. |
@ritiek I agree, I found that using pafy on all 20 results was quite slow. I also experimented with extracting the viewcount from the html, but after some testing, I found I was still getting more incorrect versions than expected. What I eventually found was best was to simply filter out the video results that are too much longer/shorter than the spotify song but keep the results in the same order that Youtube returned them (noting that Youtube is generally better at picking the most relevant results than I am i.e. trying to sort by highest viewcount only)). While I am still refining the implementation, the core logic of the idea is to use a similar filter function:
(this code would go in spotdl.py near line 100 in the else corresponding to "if args.manual") This is the basic idea, but it still needs refined such as potentially retrying up to a duration_tolerance of 20s in case a duration_tolerance of 6 results in no matching videos. In testing, this downloaded the expected song 79/80 times |
@ritiek @linusg I've made the above changes that I mentioned and tested on a playlist of 220 songs. Only 5 songs were downloaded with the wrong version. Most of these instances were because the official music video had extra noises in the first few seconds and I wanted the exact same audio as Spotify, so I wanted a different version. Thus, after extensive testing, the code I added has yielded the best youtube video selection thus far with the fewest mp3's needing re-done manually. Let me know if you'd like to get this merged and anything further that I can do to assist. |
Thanks for the effort! Would you mind preparing a PR? So we can test it both manually and using the automated tests. I have almost no concerns, sounds pretty neat 😄 |
@lkgarrison Yeah, make a PR and we'll go further from there. Thank you for your effort! |
First, sorry for using old version :(
This file is used in pyenv to select good python version.
Changes in core/misc.py:
Changes in spotdl.py:
change function to receive generate_metadata, this is optymalization, becouse in oldest version metadata from spotify api is downlaoded 2 times
song variable use changed function generate_songname
function now looking for songs in while, and save data to dict. Dictionary is used in manual and auto mode. In dictionary keep is youtube link, title, videotime (in format HH:mm:ss) and videotime converted to seconds.
For now in automatic downloading is selected video with least difference betwen youtube video time and time from spotify. This is important, becouse in youtube a lot of movies has scenes before/after without musics.