Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bytes concatination error in Client.search() #94

Merged
merged 8 commits into from
May 19, 2022
Merged

Conversation

JOJ0
Copy link
Contributor

@JOJ0 JOJ0 commented Apr 17, 2022

Todo:

  • Find out whether query is always tuple. -> Yes, since *args ensures that.
  • Write/fix tests.

JOJ0 added 2 commits May 5, 2022 07:22
Decide "join method" according to the type of "query".
A version of test_search, passing a bytes string to Client.search().
if anything else than tuple is passed.
Since *args is passed we'll always get a tuple which needs to be looked into
for bytes first.
@AnssiAhola
Copy link
Contributor

AnssiAhola commented May 8, 2022

So I did some testing and came up with this

if query:
     items = []
     for item in query:
         if type(item) == bytes:
             items.append(item.decode())
         else:
             items.append(item)
     fields['q'] = ' '.join(items)
     ...

Edit: More pythonic way

if query:
    items = [item.decode() if type(item) == bytes else item for item in query]
    fields['q'] = ' '.join(items)

This allows for mixed types in *query

client.search("trash", b"80")
client.search("trash", "80")
client.search("trash"  "80")
client.search(b"trash80")
client.search("trash80")

All are valid.

Also to pass the tests with multiple queries
Add a res file in tests/res/database named search_q=trash%2080&per_page=50&page=1.json

I'm sure you can come up with better variable names etc. 😄
could maybe also be more pythonic?

This also needs to be tested with the real API but i'm sure it works fine
I tested few queries resulted from these changes in https://www.discogs.com/search/?q=<query>
and those worked fine

JOJ0 added 3 commits May 17, 2022 07:49
unicode and mixed combinations are accepted. Add a new fixture in
tests/res/database responding to the multivalue search.
Add tests and corresponding fixtures to proof that passing keyword arguments
(**fields) is working.
@JOJ0
Copy link
Contributor Author

JOJ0 commented May 17, 2022

Hi @AnssiAhola, sorry for not getting back to this earlier...other stuff....

Thanks a lot, that's a great solution!

Edit: More pythonic way

if query:
    items = [item.decode() if type(item) == bytes else item for item in query]
    fields['q'] = ' '.join(items)

This allows for mixed types in *query

Exactly what we should aim for here. Nice! I slightly reformatted the list comprehension for readability/linelength, have a look. What do you think?

I'm sure you can come up with better variable names etc. smile

Funnily enough: Not this time! No better ideas than "item" and "items" ;-) And it fits within 79 chars, which I like (even though we don't currently enforce that style in the project). Maybe I do have an idea: "part" and "parts"? Better?

This also needs to be tested with the real API but i'm sure it works fine I tested few queries resulted from these changes in https://www.discogs.com/search/?q=<query> and those worked fine

I'll see if I can do some real tests and also ask users having the problem with Discogs searches in beets if they want to try it.

I also added some tests to check if passing keyword arguments is actually working. We didn't have that yet, right? It's quite a pile of new tests now, please tell me which you think are not necessary or could be combined!

Am I correct that we now support mixed types when passing *query but not when passing **fields. Should we support that too?

raise TypeError("{} method is expecting tuple, got {}.".format(
__name__, type(query)
))

Copy link
Contributor Author

@JOJ0 JOJ0 May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you agree this check is redundant and should be kicked out? As far as I understand *args always passes tuple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this can be left out.

@JOJ0 JOJ0 marked this pull request as ready for review May 17, 2022 09:30
@AnssiAhola
Copy link
Contributor

Exactly what we should aim for here. Nice! I slightly reformatted the list comprehension for readability/linelength, have a look. What do you think?

Looks good to me!

Maybe I do have an idea: "part" and "parts"? Better?

Yeah. I like part/parts better. I think those fit better in this context.

I also added some tests to check if passing keyword arguments is actually working. We didn't have that yet, right? It's quite a pile of new tests now, please tell me which you think are not necessary or could be combined!

Didn't see any at least. And can't have too many tests imo! 😄

Am I correct that we now support mixed types when passing *query but not when passing **fields. Should we support that too?

It seems that **fields are passed straight to update_qs which casts all values to string, which might cause some issues if bytes are passed, but I we should look into that if/when it becomes an issue or at least make an separate PR for it.
Although docs for str() says that anything can be passed into it 🤔
What do you think?

def update_qs(url, params):
"""A not-very-intelligent function to glom parameters onto a query string."""
joined_qs = '&'.join('='.join((str(k), quote(str(v))))
for k, v in params.items())
separator = '&' if '?' in url else '?'
return url + separator + joined_qs

if query:
fields['q'] = ' '.join(query)
items = [
item.decode() if type(item) == bytes else item for item in query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this can be simplified with str() ?

items = [str(item) for item in query]

Haven't tested if that would work but given the docs say bytes can be passed into it, i don't see why not

Copy link
Contributor Author

@JOJ0 JOJ0 May 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that when I initially started to work on this issue. str() doesn't change the string type, bytes stay bytes and unicode stays unicode, thus it doesn't avoid different types concat errors. Not 100% sure anymore but I'll doublecheck later today or tomorrow.

Copy link
Contributor Author

@JOJ0 JOJ0 May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I did not recall entirely correct: It seems the join does work but its not usable on another end:

            items = [
                str(item) for item in query
            ]
            for item in items:
                print(item)
            fields['q'] = ' '.join(items)
            print(fields)
$ python -m unittest discogs_client.tests.test_models.ModelsTestCase.test_multiterm_mixed_search
trash
b'80'
{'q': "trash b'80'"}
E
======================================================================
ERROR: test_multiterm_mixed_search (discogs_client.tests.test_models.ModelsTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "/home/jojo/git/discogs_client/discogs_client/tests/test_models.py", line 64, in test_multiterm_mixed_search
   self.assertEqual(len(results), 13)
 File "/home/jojo/git/discogs_client/discogs_client/models.py", line 363, in __len__
   return self.count
 File "/home/jojo/git/discogs_client/discogs_client/models.py", line 334, in count
   self._load_pagination_info()
 File "/home/jojo/git/discogs_client/discogs_client/models.py", line 289, in _load_pagination_info
   data = self.client._get(self._url_for_page(1))
 File "/home/jojo/git/discogs_client/discogs_client/client.py", line 113, in _get
   return self._request('GET', url)
 File "/home/jojo/git/discogs_client/discogs_client/client.py", line 110, in _request
   raise HTTPError(body['message'], status_code)
discogs_client.exceptions.HTTPError: 404: Resource not found.

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)

But with the decode solution:

            items = [
                item.decode() if type(item) == bytes else item for item in query
            ]
            for item in items:
                print(item)
            fields['q'] = ' '.join(items)
            print(fields)
$ python -m unittest discogs_client.tests.test_models.ModelsTestCase.test_multiterm_mixed_search
trash
80
{'q': 'trash 80'}
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK

@JOJ0
Copy link
Contributor Author

JOJ0 commented May 19, 2022

I also added some tests to check if passing keyword arguments is actually working. We didn't have that yet, right? It's quite a pile of new tests now, please tell me which you think are not necessary or could be combined!

Didn't see any at least. And can't have too many tests imo! smile

:-)

Am I correct that we now support mixed types when passing *query but not when passing **fields. Should we support that too?

It seems that **fields are passed straight to update_qs which casts all values to string, which might cause some issues if bytes are passed, but I we should look into that if/when it becomes an issue or at least make an separate PR for it. Although docs for str() says that anything can be passed into it thinking What do you think?

def update_qs(url, params):
"""A not-very-intelligent function to glom parameters onto a query string."""
joined_qs = '&'.join('='.join((str(k), quote(str(v))))
for k, v in params.items())
separator = '&' if '?' in url else '?'
return url + separator + joined_qs

Wow! Now that's one hell of an ugly unreadable function! ;-)
No, seriously: It's not too complicated thus not too ugly to read actually but I would reformat it probably once I get the chance to ;-) And on the topic: Agreed that we shouldn't touch it without a cause. The same applies here: str() won't cast different string types and we are getting mixed types in one string. Not good actually.

Played around a little with the test of this method but I think what I did is a very unlikely case (different types for key and value)

+        base_url = 'http://example.com'
+        unicode_keyval = {'foo': 'bar'}
+        self.assertEqual(u(base_url, unicode_keyval), 'http://example.com?foo=bar')
+        bytes_keyval = 'http://example.com', {b'foo': b'bar'}
+        self.assertEqual(u(base_url, bytes_keyval), 'http://example.com?foo=bar')
+        mixed_keyval = 'http://example.com', {'foo': b'bar'}
+        self.assertEqual(u(base_url, mixed_keyval), 'http://example.com?foo=bar')

the bytes_keyval test:

======================================================================
FAIL: test_update_qs (discogs_client.tests.test_utils.UtilsTestCase)
update_qs helper works as intended
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jojo/git/discogs_client/discogs_client/tests/test_utils.py", line 22, in test_update_qs
    self.assertEqual(u(base_url, bytes_keyval), 'http://example.com?foo=bar')
AssertionError: "http://example.com?b'foo'=b%27bar%27" != 'http://example.com?foo=bar'
- http://example.com?b'foo'=b%27bar%27
?                    --   - ----   ---
+ http://example.com?foo=bar

the mixed_keyval test:

======================================================================
FAIL: test_update_qs (discogs_client.tests.test_utils.UtilsTestCase)
update_qs helper works as intended
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jojo/git/discogs_client/discogs_client/tests/test_utils.py", line 24, in test_update_qs
    self.assertEqual(u(base_url, mixed_keyval), 'http://example.com?foo=bar')
AssertionError: 'http://example.com?foo=b%27bar%27' != 'http://example.com?foo=bar'
- http://example.com?foo=b%27bar%27
?                         ----  ---
+ http://example.com?foo=bar


----------------------------------------------------------------------
Ran 7 tests in 0.007s

FAILED (failures=1)

Not sure if this is likely to ever happen, just playing around:

+        multi_mixed_keyval = {'foo': 'bar', b'faz': b'baz'}
+        self.assertEqual(u(base_url, multi_mixed_keyval), 'http://example.com?foo=bar&faz=baz')
======================================================================
FAIL: test_update_qs (discogs_client.tests.test_utils.UtilsTestCase)
update_qs helper works as intended
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jojo/git/discogs_client/discogs_client/tests/test_utils.py", line 20, in test_update_qs
    self.assertEqual(u(base_url, multi_mixed_keyval), 'http://example.com?foo=bar&faz=baz')
AssertionError: "http://example.com?foo=bar&b'faz'=b%27baz%27" != 'http://example.com?foo=bar&faz=baz'
- http://example.com?foo=bar&b'faz'=b%27baz%27
?                            --   - ---

I'll leave all this for now and let's move on with this PR's actual issue :-)

This reverts commit 6f68311.

Not required at all since *args ensures a tuple.
Copy link
Contributor

@AnssiAhola AnssiAhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

@JOJ0
Copy link
Contributor Author

JOJ0 commented May 19, 2022

Thanks for all the help and review @AnssiAhola 💯 ❤️

@JOJ0 JOJ0 merged commit 6209dc3 into master May 19, 2022
@JOJ0 JOJ0 deleted the fix_search_type_err branch May 19, 2022 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TypeError: sequence item 0: expected str instance, bytes found
2 participants