API endpoint for getting data count for multiple recordings (AB-294) #221

loujine · 2017-02-08T01:27:32Z

First working draft

I see as possible improvements:

refactoring some common code between get_many_lowlevel and get_many_count (core.py)
add a NoDataFoundException in db/data.py
more tests :) (with at least a count 2 and a count 0)

loujine · 2017-02-08T18:15:21Z

Also, the JSON response format I chose {"mbid1": count1, "mbid2": count2} is not really consistent with the format of the MBID/count endpoint {"count": count, "mbid": mbid}

loujine · 2017-02-08T23:33:19Z

Added tests
Waiting for feedback for the other points

alastair · 2017-03-02T14:02:02Z

db/data.py

+    of MBID."""
+    with db.engine.connect() as connection:
+        query = text("""
+            SELECT gid, COUNT(*)


Watch the formatting here - we right align SQL keywords

alastair

Looks good, thanks for the patch! Some small changes to be made

alastair · 2017-03-02T14:03:28Z

db/testing.py

@@ -77,3 +78,22 @@ def load_low_level_data(self, mbid):
        """
        with open(self.data_filename(mbid)) as json_file:
            db.data.submit_low_level_data(mbid, json.loads(json_file.read()), gid_types.GID_TYPE_MBID)
+
+    def load_fake_low_level_data(self, mbid):


This is a handy method, good idea.
We should probably name the method submit_fake_low_level_data instead of load_.
Can you add a small docstring explaining what it does?

I don't think it's important for average loudness to be a random variable. Why did you do it this way?

I changed the name and added a docstring

The random() lets me submit the same mbid several times, each as a new separate entry, to test counts > 1. I tried to make that clear in the docstring, but I'm not sure whether I managed.

Ah, I understand now. It would be more explicit to have a separate parameter to this function which you could manually change each time you want a different submission.

This might mean you need a bit more code when adding these items to the database (you won't be able to use a for loop unless you add some extra data), but I think it would make it clearer what the method does.
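A minimal sketch of the explicit-parameter idea suggested above. The function name `make_fake_low_level_data`, the `variant` parameter, and the payload shape are all hypothetical and only illustrate the approach; the real helper submits its data through db.data.submit_low_level_data.

```python
def make_fake_low_level_data(mbid, variant=0):
    """Build a minimal fake low-level document (hypothetical shape).

    The caller changes `variant` to make each submission for the same
    MBID distinct, which replaces the random value discussed above.
    """
    return {
        "metadata": {"tags": {"musicbrainz_recordingid": [mbid]}},
        "lowlevel": {"average_loudness": float(variant)},
    }


mbid = "00000000-0000-0000-0000-000000000001"  # placeholder MBID

# The loop index is the "extra data" that keeps each document different.
docs = [make_fake_low_level_data(mbid, variant=i) for i in range(3)]
distinct_values = {d["lowlevel"]["average_loudness"] for d in docs}
```

With an explicit parameter, a test that needs a count of 3 states the three distinct submissions itself instead of relying on random values happening to differ.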

alastair · 2017-03-02T14:08:15Z

webserver/views/api/v1/core.py

+            "More than 200 recordings not allowed per request")
+
+    mbids = set(r[0] for r in recordings)
+    recording_count = {}.fromkeys(mbids, 0)


This is neat, I didn't know of this method before. Is the idea here to make sure that all MBIDs have a value in the returned dictionary even if it's 0?

This ties in to #223 a little bit too - we have to decide what to return for missing data.
I would be just as happy to return no data for mbids which are not in the database. In this case it's up to the caller to check if the keys exist.

Yes, all mbids returned by _parse_bulk_params (so I guess all valid MusicBrainz mbids, whether they have low-level data associated or not) are initialized with count=0.
I guess invalid mbids (i.e. not existing in MusicBrainz) are not returned by _parse_bulk_params? I didn't check though.
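For reference, a small sketch of the fromkeys pattern discussed in this comment. The MBIDs and the query result are placeholder values, not real data.

```python
mbids = {"mbid-a", "mbid-b", "mbid-c"}  # placeholder identifiers
counts_from_db = {"mbid-a": 2}          # hypothetical query result

# Every requested MBID starts at 0 ...
recording_count = dict.fromkeys(mbids, 0)
# ... and MBIDs that have rows in the database overwrite the default.
recording_count.update(counts_from_db)
```

This works safely here because the default value 0 is immutable; with a mutable default such as a dict, fromkeys would make every key share the same object.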

alastair · 2017-03-02T14:08:43Z

webserver/views/api/v1/core.py

+
+    mbids = set(r[0] for r in recordings)
+    recording_count = {}.fromkeys(mbids, 0)
+    recording_count.update({str(mbid): int(count) for (mbid, count)


Can you make the db method return this dictionary, instead of formatting it in the view?

alastair · 2017-03-02T15:16:04Z

Regarding format, what do you think about

{"mbid": {"count": 1}, "mbid": {"count":2}}

too complex?

loujine · 2017-03-04T15:12:23Z

I changed the response format; this looks fine to me (JavaScript object = easy to parse keys when receiving the response).

I thought you might want to have the same format as the MBID/count endpoint:
[ {"count": count1, "mbid": mbid1}, {"count": count2, "mbid": mbid2}, ... ]
but that makes the response more annoying to parse for specific mbids on the client side.
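To illustrate the parsing difference, here is a short sketch comparing the two candidate formats from a client's point of view. The MBIDs and counts are placeholders.

```python
import json

# Object keyed by MBID (the format adopted in this PR).
by_mbid = json.loads('{"mbid-a": {"count": 1}, "mbid-b": {"count": 2}}')

# List of objects (the format of the single MBID/count endpoint, repeated).
as_list = json.loads(
    '[{"mbid": "mbid-a", "count": 1}, {"mbid": "mbid-b", "count": 2}]'
)

# Keyed format: direct lookup.
count_direct = by_mbid["mbid-b"]["count"]

# List format: the client has to scan for the entry it wants.
count_scanned = next(
    item["count"] for item in as_list if item["mbid"] == "mbid-b"
)
```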

alastair

much better 👍, just some small tweaks remaining

alastair · 2017-03-05T20:28:42Z

db/data.py

+                   IN :mbids
+             GROUP BY gid;""")
+        return {str(mbid): int(count) for mbid, count
+                in connection.execute(query, mbids=tuple(mbids))}


I missed this in the first review - we use a dict as the second argument to execute: {"mbids": tuple(mbids)}. In fact, I didn't know that this other syntax was valid, but we should stay consistent.
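A self-contained sketch of the same idea (bound parameters passed as a single dictionary, grouped counts returned as a dict). It uses the standard-library sqlite3 module as a stand-in so it can run without the project's database; the actual code uses SQLAlchemy's text() with a tuple bound to :mbids, which sqlite3 does not support, so the placeholders are expanded by hand here. The table layout is an assumption.

```python
import sqlite3


def count_many_lowlevel(connection, mbids):
    """Return {mbid: count} for the given MBIDs (sqlite3 stand-in)."""
    mbids = list(mbids)
    if not mbids:
        return {}
    # One named placeholder per MBID, e.g. :mbid0, :mbid1, ...
    names = ["mbid%d" % i for i in range(len(mbids))]
    placeholders = ", ".join(":" + name for name in names)
    query = (
        "SELECT gid, COUNT(*) FROM lowlevel "
        "WHERE gid IN (%s) GROUP BY gid" % placeholders
    )
    # Parameters are passed as a dictionary, not as keyword arguments.
    params = dict(zip(names, mbids))
    return {str(gid): int(count)
            for gid, count in connection.execute(query, params)}


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE lowlevel (gid TEXT)")
connection.executemany(
    "INSERT INTO lowlevel (gid) VALUES (:gid)",
    [{"gid": "mbid-a"}, {"gid": "mbid-a"}, {"gid": "mbid-b"}],
)
result = count_many_lowlevel(connection, ["mbid-a", "mbid-b", "mbid-c"])
```

Because of the GROUP BY, an MBID with no rows ("mbid-c" here) simply does not appear in the result.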

alastair · 2017-03-05T20:29:07Z

db/data.py

+                    , COUNT(*)
+                 FROM lowlevel
+                WHERE gid
+                   IN :mbids


For consistency, we write this as WHERE gid IN :mbids, all on a single line.

Of course. I introduced this mistake when rereading my code just before pushing... it was fine beforehand :)

alastair · 2017-03-05T20:35:52Z

webserver/views/api/v1/core.py

+
+    mbids = set(r[0] for r in recordings)
+    recording_count = {}.fromkeys(mbids, {'count': 0})
+    for (mbid, count) in six.iteritems(db.data.count_many_lowlevel(mbids)):


We decided in #223 to not list an mbid in the response if it doesn't exist in the database. I think we should do the same here, this means we can just return jsonify(db.data.count_many_lowlevel(mbids)).
The documentation should say this - that if a mbid doesn't exist in the database then it is omitted from the results.
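A sketch of what this means for a caller, assuming the nested {"count": n} response shape discussed earlier in the thread. The response body and MBIDs are placeholders.

```python
# Hypothetical response: "mbid-b" was requested but is not in the database,
# so it is omitted rather than reported with a count of 0.
response = {"mbid-a": {"count": 2}}
requested = ["mbid-a", "mbid-b"]

# The caller checks for missing keys and picks its own default.
counts = {
    mbid: response.get(mbid, {}).get("count", 0)
    for mbid in requested
}
```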

alastair · 2017-03-05T20:37:06Z

I like the format better now. It fits in better with the format of the bulk get method. You're right, returning a list would be more annoying for the client to consume.

alastair · 2017-03-07T15:13:02Z

🥇

loujine force-pushed the master branch from f2e03c8 to 11e6a3b Compare February 8, 2017 18:10

loujine added 3 commits February 8, 2017 23:07

API endpoint for getting data count for multiple recordings (AB-294)

e4f1f8e

Add helper to submit fake data in tests (related to AB-294)

0860fc2

Add test for api/v1/mbid/count

cbd5e19

loujine force-pushed the master branch from 11e6a3b to cbd5e19 Compare February 8, 2017 23:30

alastair reviewed Mar 2, 2017

alastair requested changes Mar 2, 2017

loujine added 5 commits March 4, 2017 14:15

Fix SQL queries formatting to follow project guidelines

61acab4

Rename and add a docstring to helper function to generate a test dataset

f286a40

Refactorize common dataset generation in count tests

92db6f2

Format low-level count query response in db method rather than in view

fc73118

Change response format for multiple count query

18f7e5d

alastair requested changes Mar 5, 2017

alastair reviewed Mar 5, 2017

alastair self-assigned this Mar 5, 2017

loujine added 3 commits March 6, 2017 00:37

Fix SQL queries formatting

aca8dcd

I introduced a mistake in the last fix commit

Fix 'connection.execute' call argument for consistency

2486956

Dictionary argument, not keyword argument

Do not return mbids absent from database (count=0) in a multiple count query

6b23f30

alastair approved these changes Mar 6, 2017

alastair merged commit 6b23f30 into metabrainz:master Mar 7, 2017
