Apple metadata cache #2546

vardansaini
Contributor

@vardansaini vardansaini commented Aug 14, 2023

Problem

The metadata content resolver stores metadata for the entire catalog held in an external service's database. Currently, ListenBrainz has such a resolver for Spotify but not for Apple Music.

Solution

As part of letting users link their Apple Music accounts with ListenBrainz, I added a similar content resolver for Apple Music.

The code in this pull request defines a Python class, AppleCrawlerHandler, that crawls and fetches Apple Music album and artist data. Here's a summary of its main functionality and methods:

  1. The class is a subclass of BaseHandler.

  2. In the constructor (__init__), the class is initialized with various attributes including the name, external service queue, schema name, and cache key prefix. The constructor also initializes instance variables for the app, discovered albums, discovered artists, and an instance of the Apple class.

  3. The method get_seed_albums retrieves Apple album IDs from new releases for all markets by making requests to the Apple Music API.

  4. The method get_items_from_listen extracts album IDs from the data of a given listen and creates a list of JobItem instances.

  5. The method get_items_from_seeder takes a message containing Apple album IDs and creates a list of JobItem instances.

  6. The method transform_album transforms raw album data by organizing tracks and artists into appropriate data structures and returns an Album object.

  7. The method fetch_albums fetches album data from the Apple Music API and transforms it using the transform_album method. It returns a list of transformed albums and new items to be processed.

  8. The method discover_albums looks up albums of a given artist to discover more albums for seeding the job queue. It keeps track of discovered artists and albums, and returns a list of new JobItem instances.
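As a rough illustration of the transform_album step, here is a minimal sketch. The Album, Track and Artist dataclasses are hypothetical stand-ins for the structures used in the PR, and the payload shape (attributes plus relationships) is an assumption made for the example, not a guaranteed description of the Apple Music API response.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Artist:
    id: str
    name: str


@dataclass
class Track:
    id: str
    name: str
    track_number: int
    artists: list[Artist] = field(default_factory=list)


@dataclass
class Album:
    id: str
    name: str
    release_date: Optional[str]
    artists: list[Artist]
    tracks: list[Track]


def _artists(resource: dict) -> list[Artist]:
    """Collect the artists attached to an album or track resource."""
    data = resource.get("relationships", {}).get("artists", {}).get("data", [])
    return [Artist(a["id"], a.get("attributes", {}).get("name", "")) for a in data]


def transform_album(raw: dict) -> Album:
    """Turn one raw album resource into the common Album structure."""
    attrs = raw.get("attributes", {})
    track_data = raw.get("relationships", {}).get("tracks", {}).get("data", [])
    tracks = [
        Track(
            id=t["id"],
            name=t.get("attributes", {}).get("name", ""),
            track_number=t.get("attributes", {}).get("trackNumber", 0),
            artists=_artists(t),
        )
        for t in track_data
    ]
    return Album(raw["id"], attrs.get("name", ""), attrs.get("releaseDate"), _artists(raw), tracks)


# Illustrative payload only; the field names are assumed for this sketch.
sample = {
    "id": "al1",
    "attributes": {"name": "Example Album", "releaseDate": "2023-08-14"},
    "relationships": {
        "artists": {"data": [{"id": "ar1", "attributes": {"name": "Example Artist"}}]},
        "tracks": {"data": [{"id": "t1", "attributes": {"name": "Intro", "trackNumber": 1}}]},
    },
}
album = transform_album(sample)
```

Missing keys fall back to empty values rather than raising, which seems a reasonable default for a crawler that should keep going when a single record is incomplete.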

Overall, the AppleCrawlerHandler class interacts with the Apple Music API to fetch and process album and artist data and generates job items for further processing. It is the Apple Music counterpart of the existing Spotify resolver and is responsible for crawling and caching Apple Music metadata.
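The way discover_albums keeps track of already-seen artists and albums could look roughly like the sketch below. JobItem here is a simplified, hypothetical stand-in for the real class, and lookup_artist_albums is an assumed callable that replaces the actual Apple Music API request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobItem:
    """Hypothetical stand-in for the crawler's queue entry."""
    priority: int
    item_id: str


class DiscoverySketch:
    def __init__(self, lookup_artist_albums):
        # lookup_artist_albums: callable(artist_id) -> iterable of album IDs (assumed).
        self.lookup_artist_albums = lookup_artist_albums
        self.discovered_artists = set()
        self.discovered_albums = set()

    def discover_albums(self, artist_id, priority=0):
        """Return JobItems for albums of this artist that were not seen before."""
        if artist_id in self.discovered_artists:
            return []  # this artist was already looked up, skip the API call
        self.discovered_artists.add(artist_id)

        items = []
        for album_id in self.lookup_artist_albums(artist_id):
            if album_id in self.discovered_albums:
                continue
            self.discovered_albums.add(album_id)
            items.append(JobItem(priority, album_id))
        return items


# Tiny in-memory catalog used in place of the real API.
catalog = {"ar1": ["al1", "al2"], "ar2": ["al2", "al3"]}
crawler = DiscoverySketch(lambda artist_id: catalog.get(artist_id, []))
first = crawler.discover_albums("ar1")
second = crawler.discover_albums("ar2")
repeat = crawler.discover_albums("ar1")
```

Here the second call only yields the album that the first call had not already queued, and repeating an artist yields nothing.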

Action

@amCap1712 amCap1712 changed the title from "Apple metadata cache " to "Apple metadata cache" Aug 14, 2023
@amCap1712
Member

Hi! Thanks for the PR. I have opened #2550 to make it easier to integrate the new Apple metadata cache. Once it is merged, please update the code to implement the BaseHandler class defined in it; that should eliminate a lot of redundant code. I will review the PR after that.

In the meantime, you can update the PR description with the workflow: which API call is made, and when.

@amCap1712
Member

That PR is merged now.

@vardansaini
Contributor Author

The base class has been updated as you said, but I am not sure how to integrate this handler and script with the rest of the listenbrainz container. I asked on IRC but no one responded.

@amCap1712 amCap1712 marked this pull request as ready for review August 20, 2023 10:15
amCap1712 added a commit that referenced this pull request Aug 20, 2023
Looking at #2546, I found more avenues to refactor the common code
between the Spotify and Apple metadata caches. Now the database insert
code is also shared between both crawlers. The handlers only need to
implement fetching albums from the API, discovering new album seeds and
converting those into a common format to store in the database.
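A minimal sketch of the split this commit message describes, with shared storage in a base class and only three service-specific hooks per handler. The class and method names below are hypothetical and chosen for illustration; this is not the actual BaseHandler from the refactor, and a plain list stands in for the shared database insert.

```python
from abc import ABC, abstractmethod


class CrawlerHandlerSketch(ABC):
    """Hypothetical outline of a shared crawler base class."""

    def __init__(self):
        self.stored = []  # stands in for the shared database insert

    @abstractmethod
    def fetch_albums(self, album_ids):
        """Return raw album payloads for the given IDs."""

    @abstractmethod
    def transform_album(self, raw):
        """Convert one raw payload into the common album format."""

    @abstractmethod
    def discover_albums(self, album):
        """Return IDs of further albums worth crawling."""

    def process(self, album_ids):
        """Shared pipeline: fetch, transform, store, then collect new seeds."""
        new_seeds = []
        for raw in self.fetch_albums(album_ids):
            album = self.transform_album(raw)
            self.stored.append(album)
            new_seeds.extend(self.discover_albums(album))
        return new_seeds


class FakeHandler(CrawlerHandlerSketch):
    """Toy handler that fabricates data instead of calling a real API."""

    def fetch_albums(self, album_ids):
        return [{"id": album_id} for album_id in album_ids]

    def transform_album(self, raw):
        return {"album_id": raw["id"]}

    def discover_albums(self, album):
        return [album["album_id"] + "-related"]


handler = FakeHandler()
seeds = handler.process(["a", "b"])
```

With this shape, a Spotify or Apple handler would only override the three abstract methods, while the fetch, store and seed loop lives in one place.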
@amCap1712
Member

Hi @vardansaini, thanks for the changes. I will ask you to refactor one more time once the second PR is merged. Sorry for the nuisance, but it should remove a lot of redundant code. As for the integration, I will help make the necessary changes once the second refactor is done.

@vardansaini
Contributor Author

vardansaini commented Aug 22, 2023

Hi @amCap1712, I have merged the branch you mentioned and refactored the code accordingly.

@amCap1712 amCap1712 changed the base branch from master to refactor-spotify-cache-2 August 23, 2023 10:17
@amCap1712 amCap1712 deleted the branch metabrainz:refactor-spotify-cache-2 September 4, 2023 15:52
amCap1712 added a commit that referenced this pull request Sep 4, 2023
@amCap1712 amCap1712 closed this Sep 4, 2023
This was referenced Sep 4, 2023