New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BU-11: Add MusicBrainz database module to brainzutils #14

Merged
merged 14 commits into from Jun 4, 2018

Conversation

Projects
None yet
4 participants
@kartikeyaSh
Contributor

kartikeyaSh commented Mar 31, 2018

Summary

  • This is a…
    • Bug fix
    • Feature addition
    • Refactoring
    • Minor / simple change (like a typo)
    • Other
  • Describe this change in 1-2 sentences:

Problem

MusicBrainz DB is being used by more than one MB projects (CritiqueBrainz, AcousticBrainz) for useful information and MessyBrainz also needs this.

  • JIRA ticket (optional): BU-11

Solution

The code for accessing different entities can be shared by using brainzutils musicbrainz_db module.

Action

Created musicbrainz_db module.

kartikeyaSh added some commits Mar 30, 2018

Add musicbrainz_db module
The musicbrainz_db is currently being used by CritiqueBrainz. Now
MessyBrainz also requires this, so it's better to put this code in
brainzutils.
Add various utility files
Add serialize.py which converts the mbdata model objects data to
python dicts, includes.py which checks valid includes, helpers.py
to contain helper function to get relationships, and utils.py
which provides utility functions common to different entities.
Add tests for recording
Add tests to get single recording and multiple recordings by MBIDs.
@brainzbot

This comment has been minimized.

Member

brainzbot commented Mar 31, 2018

Can one of the admins verify this patch?

2 similar comments
@brainzbot

This comment has been minimized.

Member

brainzbot commented Mar 31, 2018

Can one of the admins verify this patch?

@brainzbot

This comment has been minimized.

Member

brainzbot commented Mar 31, 2018

Can one of the admins verify this patch?

@kartikeyaSh

This comment has been minimized.

Contributor

kartikeyaSh commented Mar 31, 2018

I've written this with python3 syntax. Does it have to work with both versions of python?

@paramsingh

This comment has been minimized.

Member

paramsingh commented Apr 1, 2018

@kartikeyaSh, brainzutils is used by AB which is currently python2, so we don't want to break python2 compatibility.

@paramsingh

This comment has been minimized.

Member

paramsingh commented Apr 1, 2018

@brainzbot add to whitelist.

@paramsingh

Looks pretty good to me.

I have a few points that should be addressed before this can be merged. The serialize entities stuff is not documented or tested really. Since your project only needs recordings right now, I think we should test and document that, while moving the other serialization functions to a different ticket and PR.

Another point that I think we should consider is moving all caching related stuff out of this module. I tried this locally and the first error I got was that cache wasn't initiated. We should let other projects handle caching while keeping this module exclusively for database access.

return data


def serialize_recording(recording, include={}):

This comment has been minimized.

@paramsingh

paramsingh May 18, 2018

Member

The includes={} can cause problems here: https://docs.quantifiedcode.com/python-anti-patterns/correctness/mutable_default_value_as_argument.html

We should do:

def serialize_recording(recording, include=None):
    if include is None:
        include = {}

This comment has been minimized.

@paramsingh

paramsingh May 18, 2018

Member

Another thing I notice is the use of include here and includes in other places. Should be fixed for consistency.

include (list): List of values to be included.
For list of possible values visit https://bitbucket.org/lalinsky/mbdata/wiki/API/v1/includes#!recording
Returns:
Dictionary containing the recordings information with MBIDs as keys.

This comment has been minimized.

@paramsingh

paramsingh May 18, 2018

Member

A format of the dictionary here would be useful.

recording = get_many_recordings_by_mbid([mbid], include)
return recording.get(mbid)

def get_many_recordings_by_mbid(mbids, include=[]):
data['comment'] = recording.comment

if recording.length:
data['length'] = recording.length / 1000.0

This comment has been minimized.

@paramsingh

paramsingh May 18, 2018

Member

A comment should explain why 1000 here.

@@ -6,3 +6,5 @@ certifi
redis==2.10.5
msgpack-python==0.4.8
requests==2.18.1
SQLAlchemy==1.0.8
mbdata==2017.6.2

This comment has been minimized.

@paramsingh

paramsingh May 18, 2018

Member

do these need to be added to requirements_dev.txt too?

This comment has been minimized.

@kartikeyaSh

kartikeyaSh May 24, 2018

Contributor

Once we settle on the version of SQLAlchemy I'll add that to requirements_dev.txt.

@@ -6,3 +6,5 @@ certifi
redis==2.10.5
msgpack-python==0.4.8
requests==2.18.1
SQLAlchemy==1.0.8

This comment has been minimized.

@paramsingh

paramsingh May 18, 2018

Member

Is there a reason you're using SQLAlchemy 1.0.8 here? CritiqueBrainz uses a newer version (1.1.3) and this breaks it.

This comment has been minimized.

@kartikeyaSh

kartikeyaSh May 23, 2018

Contributor

I used it just cz its the version being used by messybrainz. I don't have CB build, so I can't test it. Can you post some logs in which I can try to find out something? And we should try to use one version of libraries in all MB projects whenever possible. Is downgrading or upgrading the libraries a good thing in this case? In CB requirements there is nothing on why it was upgraded in the first place. To me, it looks like its just a case of "trying to use new version just cz it has bug fixes etc."

@paramsingh

This comment has been minimized.

Member

paramsingh commented May 18, 2018

Another option we have is documenting and testing the other serialize functions, but I think that this should be done in a different ticket anyways.

@paramsingh

This comment has been minimized.

Member

paramsingh commented May 19, 2018

@rsh7 has addressed some comments in a203ed1 if you wanna do a git cherry-pick or copypasta.

@rsh7

This comment has been minimized.

Contributor

rsh7 commented May 23, 2018

Added the above required changes here: a203ed1

kartikeyaSh added some commits May 23, 2018

Use includes at all places and remove caching.
Using includes in all places for consistency. Removing
use of brainzutils cache inside of musicbrainz_db module and
leaving that for the user to handle.
Remove functions that are not used and add docstrings.
The functions that are not used currently are removed.
And documentation for the ones being used is written.
Add tests for serialize.py
This includes tests for serialize_recording and
serialize_artist_credits.

@kartikeyaSh kartikeyaSh force-pushed the kartikeyaSh:mbd branch from 88ba340 to 137a608 May 24, 2018



engine = None
DEFAULT_CACHE_EXPIRATION = 12 * 60 * 60 # seconds (12 hours)

This comment has been minimized.

@paramsingh

paramsingh Jun 4, 2018

Member

this shouldn't be needed.

@paramsingh

Awesome! 🥇

@paramsingh paramsingh merged commit 9853d03 into metabrainz:master Jun 4, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment