Skip to content

lastgenre: Global and artist-based genre blacklist #5744

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

JOJ0
Copy link
Member

@JOJ0 JOJ0 commented Apr 20, 2025

Description

Forbids a list of genres or regex patterns configurable for an artist or globally (*). It uses a simple custom file format that could look like this:

fracture:
     ^(heavy|black|power|death)?\s?(metal|rock)$|\w+-metal\d*$
    progressive metal
gilles peterson:
    samba
    bossa nova
*:
    electronic

This configuration gives a practical example within: Artist "Fracture" is a Drum And Bass producer but a also a Metal band exists with the same name. last.fm returns genres for both artists. Removing Metal variants (Power Metal, Heavy Metal, ...) from the global whitelist file would result in not being able to fetch proper genres for any other Metal band in the user's library anymore.

A mixture of regex and simple genre names is possible since regex patterns are precompiled and if that fails, escaped literal strings are added to the blacklist.

This PR also includes some cleanup in the setup() method which got rather longish. It moves all file loading code to separate helper methods.

Notes

After several experiments with YAML and INI as the blacklist file format I came to the conclusion that it's just too much PITA trying to fit regex patterns within these formats and I gave up looking for the perfect format that doesn't stand in the way of userfriendliness and allowing all kinds of regex patterns withouth special escaping and formatting tricks.....I'm back from that hell and decide that a very simple custom format suits this usecase best.

This feature idea also came up during this discussion: #5721 (comment) and is on my personal wish list for ages.

My personal roadmap for the lastgenre plugin: #5915

To Do

  • Documentation.
  • Changelog.
  • Tests.

Copy link

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@JOJ0 JOJ0 changed the title Lastgenre forbidden Lastgenre forbidden for artist list Apr 20, 2025
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from 256a8a1 to c8199cb Compare August 4, 2025 08:48
Copy link

codecov bot commented Aug 4, 2025

Codecov Report

❌ Patch coverage is 95.52239% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (master@e3574a7). Learn more about missing BASE report.

Files with missing lines Patch % Lines
beetsplug/lastgenre/__init__.py 95.52% 1 Missing and 2 partials ⚠️
🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch 8 times, most recently from 9cd4947 to a0e6054 Compare August 4, 2025 15:41
@JOJ0 JOJ0 changed the title Lastgenre forbidden for artist list Lastgenre global and artist-based genre _blacklist_ Aug 5, 2025
@JOJ0 JOJ0 changed the title Lastgenre global and artist-based genre _blacklist_ lastgenre: Global and artist-based genre blacklist Aug 5, 2025
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from a3daa01 to 4868326 Compare August 5, 2025 16:10
@JOJ0 JOJ0 marked this pull request as ready for review August 5, 2025 22:28
@Copilot Copilot AI review requested due to automatic review settings August 5, 2025 22:28
Copy link

github-actions bot commented Aug 5, 2025

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a blacklist feature for the lastgenre plugin to forbid specific genres globally or per-artist using regex patterns or plain text matching. It uses a custom file format to maximize compatibility with complex regex patterns and provides both global (*) and artist-specific blacklisting capabilities.

  • Adds blacklist functionality with custom file format supporting regex patterns and plain genre names
  • Refactors the setup() method by extracting file loading logic into separate helper methods
  • Updates genre filtering to check blacklist patterns before accepting genres

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
beetsplug/lastgenre/init.py Core implementation of blacklist functionality, file parsing, and genre filtering
test/plugins/test_lastgenre.py Comprehensive test coverage for blacklist patterns, file format parsing, and error handling
docs/plugins/lastgenre.rst Documentation for the new blacklist feature with examples and configuration options

Comment on lines +193 to +195
with open(bl_filename, "rb") as f:
for lineno, line in enumerate(f, 1):
line = line.decode("utf-8").lower()
Copy link
Preview

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file is opened in binary mode ('rb') but then decoded line by line. This could fail if the file contains multi-byte Unicode characters that span across buffer boundaries. Consider opening the file in text mode with explicit encoding or using a proper text decoder.

Suggested change
with open(bl_filename, "rb") as f:
for lineno, line in enumerate(f, 1):
line = line.decode("utf-8").lower()
with open(bl_filename, "r", encoding="utf-8") as f:
for lineno, line in enumerate(f, 1):
line = line.lower()

Copilot uses AI. Check for mistakes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I should follow this advice. I tried to orientate my code on how the simple text file whitelist.txt is loaded. Our guides only tell that paths to store in the database should be made sure to be bytes https://beets.readthedocs.io/en/latest/contributing.html#handling-paths

So actually I coud safely follow this advice. Some guidance please @wisp3rwind or @snejus

Thanks :-)

section = None
with open(bl_filename, "rb") as f:
for lineno, line in enumerate(f, 1):
line = line.decode("utf-8").lower()
Copy link
Preview

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling .lower() on the entire line before parsing could interfere with regex patterns that depend on case sensitivity. The case conversion should be done after separating the artist name from the patterns, and only applied to the artist name.

Suggested change
line = line.decode("utf-8").lower()
line = line.decode("utf-8")

Copilot uses AI. Check for mistakes.

if not genres:
return []
return [x for x in genres if self._is_valid(x)]
return [
Copy link
Preview

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _filter_valid_genres method calls _is_forbidden for every genre, which performs regex matching against potentially many patterns. Consider caching compiled regex patterns or implementing early termination strategies for better performance with large genre lists.

Copilot uses AI. Check for mistakes.

JOJ0 added 7 commits August 6, 2025 07:07
Uses a custom text file format since YAML, INI, TOML, ... all have their
flaws with parsing regex patterns.
and use it via _filter_valid_genres. For performance reasons only
consider it if the genre is valid in the first place.
Fall back to literal strings if a regex compile fails. This
automagically supports strings as well as regex patterns in the
blacklist.
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from 5b24d49 to 13a01b1 Compare August 6, 2025 05:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant