Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Searching for music library tracks, finding all artists' albums and artist's album's tracks. #246

Merged
merged 12 commits into from
Dec 1, 2014

Conversation

dajobe
Copy link
Contributor

@dajobe dajobe commented Oct 12, 2014

Adds searching for music library tracks, finding all artists' albums and artist's album's tracks.

This adds a set of search methods for the music library (not music services) by the means of updating get_music_library_information to support fuzzy searches, category searches and getting all matches by handling paging through results.

The new methods (that all call get_music_library_information with the new args) are:

  • search_tracks: to find everything for an artist, artist+album, artist+album+track combinations
  • get_albums_for_artist to get all albums for an artist
  • get_tracks_for_album to get all tracks on an artist's album

The code works well and I've used this to search for a bunch of music library tracks automatically. It took a while to get the Unicode and URL encoding working especially from py2.6 to 3.x. I also added a new compatibility call quote_path from urllib or urllib.parse (py3) that is used by new utility function url_escape_path.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.88%) when pulling 1bde2ed on dajobe:search-tracks into 6417a99 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.32%) when pulling e73e9a0 on dajobe:search-tracks into 6417a99 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.32%) when pulling fe18ef5 on dajobe:search-tracks into 6417a99 on SoCo:master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage increased (+0.32%) when pulling fe18ef5 on dajobe:search-tracks into 6417a99 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.32%) when pulling 1f307e1 on dajobe:search-tracks into 6417a99 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.32%) when pulling cf2a6c2 on dajobe:search-tracks into 6417a99 on SoCo:master.

@dajobe dajobe changed the title Search for albums and tracks Search for albums and tracks by name Oct 13, 2014
@coveralls
Copy link

Coverage Status

Coverage increased (+0.36%) when pulling bd26b39 on dajobe:search-tracks into 6417a99 on SoCo:master.

@DPH
Copy link
Member

DPH commented Oct 14, 2014

Hi @dajobe ,
all works well and I like the simple clear interface.

Observations / thoughts:

  1. Not found: if the Artist / album is not found then you get a 701 upnp error. I can trap for this with try: / except soco.SoCoException: but this means wrapping all request with try / except blocks (unless I misunderstand). Would it make sense to return None ?

  2. Observation: I don't think to do with this change: but when you say max_items=4 you only get back 3. Always one less on get_albums_for_artist. Not tested on others but guess a common issue with < should be <=. Could confuse as is.

2a) Observation: UPNP returns items found and total matches. The later lets you know if you need to loop and call again. I don't see total matches returned here, but I may have missed it. Would be useful, otherwise have to keep looping until null return

  1. Addition: it would be nice to add a partial search option. Either as an option here or as a separate search call. This can be done by changing the ObjectID from A:ALBUMARTIST/Beatles to A:ALBUMARTIST:Beat (colon instead of slash). Maybe add a parameter on your functions fuzzy=False / True (default to False)

Hope this help,
Cheers David

@@ -133,3 +133,9 @@ def decorated(*args, **kwargs):
decorated.__doc__ = ''
decorated.__doc__ += docs
return decorated


def url_escape_path(path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend to add a doctest here, to illustrate what the method does.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.36%) when pulling 212c3c2 on dajobe:search-tracks into 6417a99 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.36%) when pulling 7619c52 on dajobe:search-tracks into 6417a99 on SoCo:master.

@KennethNielsen
Copy link
Member

Hallo @dajobe

This looks really great and great work on the escaping. I tried with some Danish vowels 'æøå' and it worked great.

About the whole unicode encoding problem, it is something I don't believe we have discussed before. I think a lot of the problems we have with methods that take text as an argument, is that per default, if you fire up python <3 writing search_term = 'String with non-ascii æøå' will result in a byte array, whereas we really use (and expect?) unicode everywhere internally. So the questions is, should we keep jumping through hoops by calling really_unicode on any text input as long as we advertise support <3, or would be better to explicitly write in all the affected doc strings, that the text input must be unicode, and require that the python <3 users handle it them selves?

Regarding the actual PR, I think it looks great. One thing that I had been thinking about when we previously talked about searching, is that it would be great if it was possible to do the fuzzy searching that @DPH talks about in all the music library functions.

I also had the idea, that maybe those specific searches you do, could be done easier by creating a dummy music library item and levering the browse method. Based on you work I have made an alternative implementation in #249 as a basis for discussion. I think it makes some of the things a bit simpler. (although not everything in implemented 1to1 with your functionality). I also specifically does not handle the unicode stuff mentioned above, but that's really just to add the really_unicode on all text input before anything else is done with it.

Let me know what you think.

Regards Kenneth

@dajobe
Copy link
Contributor Author

dajobe commented Oct 21, 2014

@KennethNielsen thanks for the feedback.

On Unicode, I realise it's a mess but I was happy it seemed to just work. Looks like it's the big py2/py3 issue.

On the APIs themselves I'm not sure whether to return a SearchResult (that allows paging or partial results) or a list of actual values. I've got different APIs here. Maybe if a raw search, you want paging but for the 'get albums of artist' mode, you want everything, no paging.

I'd like to add the fuzzy / wildcard searching, as @DPH suggested too.

I'll take a look at your new PR #249

Thanks

@coveralls
Copy link

Coverage Status

Coverage increased (+0.36%) when pulling 1559bf7 on dajobe:search-tracks into 6417a99 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.36%) when pulling 1ad3f80 on dajobe:search-tracks into 6417a99 on SoCo:master.

@KennethNielsen
Copy link
Member

Hi @dajobe

Regarding unicode, I guess it is not as much a py2/py3 issue, since if we were porting it there would be no issue, the problem is creating a py2/py3 compatible library that takes text input. That being said, lets try to de-mystify it a little. In the following, I have just tested with one version <3 and one version >=3, so there might be subtle differences in there, that would mess with the conclusions, but I don't know of any.

In python 3 all string are created per default as unicode objects, so the text input we get and the strings created are all unicode.

In core.py and in utils.py we have the from __future__ import unicode_literals import, which means that inside of this module, things will behave like python 3 i.e. all the strings that we create will be unicode and add and inplace add and format does not change that, so in python2:

Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import unicode_literals
>>> 'Hallo World!'
u'Hallo World!'
>>> a = 'bøf' # Danish for steak
>>> a
u'b\xf8f'
>>> b = ' med løg'  # With onions, in Danish
>>> b
u' med l\xf8g'
>>> a + b
u'b\xf8f med l\xf8g'
>>> a += b
>>> a
u'b\xf8f med l\xf8g'
>>> '{} med {}'.format('bøf', 'løg')
u'b\xf8f med l\xf8g'

Of course, if we happen to have gotten a bytestring and try to combine, we get errors, if the bytestring has characters outside of ASCII. So better to avoid combining.

This means, that there are only two cases where things can go wrong:

  1. When calling 3rd party library functions, that might return different string types
  2. When we receive text input from a python2 library, which might be a python2 str i.e. a byte array (they might not be importing unicode_literals or for some other reason reason have this type)

With regards to 1. The only 3rd party library function in use for this, is the quote_url function (which is imported from urllib via the compat module) used in url_escape_path. We see that in python2, it wants byte-arrays and returns byte arrays, but because we have imported unicode literals and because a replace makes a new string, this is turned into a unicode object after the replace:

Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import unicode_literals
>>> from soco.compat import quote_url
>>> quote_url('bøf')  # Is unicode due to the import
/usr/lib/python2.7/urllib.py:1288: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 1288, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xf8'
>>> quote_url('bøf'.encode('utf-8'))
'b%C3%B8f'
>>> quote_url('bøf'.encode('utf-8')).replace('/', '%2F')
u'b%C3%B8f'

In python3, things are simpler, qoute_url takes unicode objects and returns unicode objects.

Python 3.4.0 (default, Apr 11 2014, 13:05:11) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import unicode_literals
>>> from soco.compat import quote_url
>>> quote_url('bøf')
'b%C3%B8f'
>>> # That is unicode, in python 3 unicode is the standard
>>> quote_url('bøf').replace('', '')
'b%C3%B8f'

This means, that to produce a consistent result, the only thing we need, is to ensure that the input to url_escape_path is unicode.

With regards to 2. Unless we want to make consistent input the users responsibility, there is really nothing we can do about this, we might get both unicode objects and byte arrays when asking for text.

On the other hand, it is simple to fix, if we are aware of it. The only thing that we need to do, to make sure that we use unicode everywhere internally (which is also what we expected in 1.) is to call really_unicode whenever we use a user input text string (or to convert it as the very first thing in the methods, but in these methods, they are not used more than once which makes that a bit much)

In conclusion (and I might be wrong here, character encodings and strings are not my strong suit) I would argue, that if you just call really_unicode on any text input to the methods, that will take care of both 1. and 2. and make it just work (see #249).

On a side note, there are of course other solutions one might consider:

  • Drop support below 3 (just kidding I don't think I will become popular with that one yet)
  • In these methods, it seems that the text input is always used in conjunction with url_escape_path, which means that we could of course put the really_unicode call in there, but I really don't like that solution. The reason is that it conflicts with the generally accepted and clean design for text handling for libraries like ours that say: 1. Convert to unicode early 2. Use unicode everywhere internally 3. Convert from unicode on output, if necessary.
  • One might also consider a decorator approach, to make sure text input is unicode, that might actually be sort of elegant, but it is an topic for another time.

@KennethNielsen
Copy link
Member

@dajobe With regards to paging I agree with you, that these specialized searches could (and should) return the complete result. But I would still use a data structure for it (surprise surprise), just create a new new one say CompleteSearchResult, that as the metadata has only the number of items. But that will require a bit of mangling with the inheritance, so we can do that later. But as a I said, I agree of returning a complete result (no paging).

@coveralls
Copy link

Coverage Status

Coverage increased (+1.3%) when pulling ada4e0d on dajobe:search-tracks into 6417a99 on SoCo:master.

@dajobe
Copy link
Contributor Author

dajobe commented Oct 30, 2014

@KennethNielsen I updated this branch with your code from PR #249 so lets focus here.

I had to disable some of the tests since the asserts weren't working - see commit ada4e0d

@coveralls
Copy link

Coverage Status

Coverage increased (+1.3%) when pulling 1c6f76e on dajobe:search-tracks into 6417a99 on SoCo:master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage increased (+1.3%) when pulling 1c6f76e on dajobe:search-tracks into 6417a99 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.49%) when pulling 5bef509 on dajobe:search-tracks into 6417a99 on SoCo:master.

@dajobe
Copy link
Contributor Author

dajobe commented Nov 2, 2014

I'm pretty happy now with the state of this PR. It still works (firstly) and the code is a lot neater, there are a few more tests that cover the Browse() responses.

@dajobe dajobe changed the title Search for albums and tracks by name Searching for music library tracks, finding all artists' albums and artist's album's tracks. Nov 2, 2014
@KennethNielsen
Copy link
Member

I will look it over as soon as I find a little time.

get_music_library_items('artists', start=0, max_items=100)
get_music_library_items('artists', start=100, max_items=100)

will get the first and next 100 items, respectively. It is also
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's missing the "possible" here, "It is also POSSIBLE to ask .."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@coveralls
Copy link

Coverage Status

Coverage increased (+1.29%) when pulling b11b37b on dajobe:search-tracks into 57343c6 on SoCo:master.

@dajobe
Copy link
Contributor Author

dajobe commented Nov 23, 2014

Travis CI seems broken now, so don't believe the CI build failure.

@coveralls
Copy link

Coverage Status

Coverage decreased (-50.27%) when pulling 21f4bfe on dajobe:search-tracks into 21faab9 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.27%) when pulling d9d093b on dajobe:search-tracks into 21faab9 on SoCo:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.27%) when pulling e91188c on dajobe:search-tracks into 21faab9 on SoCo:master.

@dajobe
Copy link
Contributor Author

dajobe commented Nov 30, 2014

Hey @KennethNielsen @stefankoegl @lawrenceakka I rebased this patch again (which is getting rather tedious) and we're lucky that Travis CI is working today. Is there any chance we can get this patch merged since it's been cooking for a while?

@KennethNielsen
Copy link
Member

I'm cannot test it right now, but provided you have done some quick testing on it, to confirm that nothing was broken in the rebasing, then +1 from me.

Since we already have a lot of positive feedback, I'll merge by tomorrow unless anyone objects.

@dajobe
Copy link
Contributor Author

dajobe commented Dec 1, 2014

@KennethNielsen Quick testing works with my Unicode album and tracks names which says it's good for me!

stefankoegl added a commit that referenced this pull request Dec 1, 2014
Searching for music library tracks, finding all artists' albums and artist's album's tracks.
@stefankoegl stefankoegl merged commit 05ba11f into SoCo:master Dec 1, 2014
@stefankoegl
Copy link
Member

Merged, thanks for the great work :)

@stefankoegl
Copy link
Member

Can you please add a short summary for the release notes (#261).

@dajobe dajobe mentioned this pull request Dec 1, 2014
@dajobe
Copy link
Contributor Author

dajobe commented Dec 1, 2014

Thanks a lot, added the release notes.

@dajobe dajobe deleted the search-tracks branch December 1, 2014 20:45
@KennethNielsen
Copy link
Member

@dajobe Great work. Worry, I forgot about merging this, but luckily @stefankoegl toke care of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants