lastgenre: Global and artist-based genre blacklist #5744

JOJ0 · 2025-04-20T06:22:47Z

Description

Forbids a list of genres or regex patterns configurable for an artist or globally (*). It uses a simple custom file format that could look like this:

fracture:
     ^(heavy|black|power|death)?\s?(metal|rock)$|\w+-metal\d*$
    progressive metal
gilles peterson:
    samba
    bossa nova
*:
    electronic

This configuration gives a practical example within: Artist "Fracture" is a Drum And Bass producer but a also a Metal band exists with the same name. last.fm returns genres for both artists. Removing Metal variants (Power Metal, Heavy Metal, ...) from the global whitelist file would result in not being able to fetch proper genres for any other Metal band in the user's library anymore.

A mixture of regex and simple genre names is possible since regex patterns are precompiled and if that fails, escaped literal strings are added to the blacklist.

This PR also includes some cleanup in the setup() method which got rather longish. It moves all file loading code to separate helper methods.

Notes

After several experiments with YAML and INI as the blacklist file format I came to the conclusion that it's just too much PITA trying to fit regex patterns within these formats and I gave up looking for the perfect format that doesn't stand in the way of userfriendliness and allowing all kinds of regex patterns withouth special escaping and formatting tricks.....I'm back from that hell and decide that a very simple custom format suits this usecase best.

This feature idea also came up during this discussion: #5721 (comment) and is on my personal wish list for ages.

My personal roadmap for the lastgenre plugin: #5915

To Do

Documentation.
Changelog.
Tests.

github-actions · 2025-04-20T06:23:00Z

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

codecov · 2025-08-04T08:54:27Z

Codecov Report

❌ Patch coverage is 95.52239% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (master@e3574a7). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
beetsplug/lastgenre/__init__.py	95.52%	1 Missing and 2 partials ⚠️

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2025-08-05T22:29:11Z

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

Copilot

Pull Request Overview

This PR implements a blacklist feature for the lastgenre plugin to forbid specific genres globally or per-artist using regex patterns or plain text matching. It uses a custom file format to maximize compatibility with complex regex patterns and provides both global (*) and artist-specific blacklisting capabilities.

Adds blacklist functionality with custom file format supporting regex patterns and plain genre names
Refactors the setup() method by extracting file loading logic into separate helper methods
Updates genre filtering to check blacklist patterns before accepting genres

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
beetsplug/lastgenre/init.py	Core implementation of blacklist functionality, file parsing, and genre filtering
test/plugins/test_lastgenre.py	Comprehensive test coverage for blacklist patterns, file format parsing, and error handling
docs/plugins/lastgenre.rst	Documentation for the new blacklist feature with examples and configuration options

Copilot · 2025-08-05T22:29:52Z

beetsplug/lastgenre/__init__.py

+        with open(bl_filename, "rb") as f:
+            for lineno, line in enumerate(f, 1):
+                line = line.decode("utf-8").lower()


The file is opened in binary mode ('rb') but then decoded line by line. This could fail if the file contains multi-byte Unicode characters that span across buffer boundaries. Consider opening the file in text mode with explicit encoding or using a proper text decoder.

Suggested change

with open(bl_filename, "rb") as f:

for lineno, line in enumerate(f, 1):

line = line.decode("utf-8").lower()

with open(bl_filename, "r", encoding="utf-8") as f:

for lineno, line in enumerate(f, 1):

line = line.lower()

I'm not sure if I should follow this advice. I tried to orientate my code on how the simple text file whitelist.txt is loaded. Our guides only tell that paths to store in the database should be made sure to be bytes https://beets.readthedocs.io/en/latest/contributing.html#handling-paths

So actually I coud safely follow this advice. Some guidance please @wisp3rwind or @snejus

Thanks :-)

Copilot · 2025-08-05T22:29:52Z

beetsplug/lastgenre/__init__.py

+        section = None
+        with open(bl_filename, "rb") as f:
+            for lineno, line in enumerate(f, 1):
+                line = line.decode("utf-8").lower()


Calling .lower() on the entire line before parsing could interfere with regex patterns that depend on case sensitivity. The case conversion should be done after separating the artist name from the patterns, and only applied to the artist name.

Suggested change

line = line.decode("utf-8").lower()

line = line.decode("utf-8")

Copilot · 2025-08-05T22:29:53Z

beetsplug/lastgenre/__init__.py

        if not genres:
            return []
-        return [x for x in genres if self._is_valid(x)]
+        return [


The _filter_valid_genres method calls _is_forbidden for every genre, which performs regex matching against potentially many patterns. Consider caching compiled regex patterns or implementing early termination strategies for better performance with large genre lists.

and add return types.

Uses a custom text file format since YAML, INI, TOML, ... all have their flaws with parsing regex patterns.

and use it via _filter_valid_genres. For performance reasons only consider it if the genre is valid in the first place.

Fall back to literal strings if a regex compile fails. This automagically supports strings as well as regex patterns in the blacklist.

JOJ0 changed the title ~~Lastgenre forbidden~~ Lastgenre forbidden for artist list Apr 20, 2025

JOJ0 force-pushed the lastgenre_forbidden branch from 256a8a1 to c8199cb Compare August 4, 2025 08:48

JOJ0 force-pushed the lastgenre_forbidden branch 8 times, most recently from 9cd4947 to a0e6054 Compare August 4, 2025 15:41

JOJ0 changed the title ~~Lastgenre forbidden for artist list~~ Lastgenre global and artist-based genre _blacklist_ Aug 5, 2025

JOJ0 changed the title ~~Lastgenre global and artist-based genre _blacklist_~~ lastgenre: Global and artist-based genre blacklist Aug 5, 2025

JOJ0 force-pushed the lastgenre_forbidden branch from a3daa01 to 4868326 Compare August 5, 2025 16:10

JOJ0 marked this pull request as ready for review August 5, 2025 22:28

Copilot AI review requested due to automatic review settings August 5, 2025 22:28

JOJ0 requested review from jackwilsdon, snejus, wisp3rwind and semohr August 5, 2025 22:29

Copilot AI reviewed Aug 5, 2025

View reviewed changes

JOJ0 added 7 commits August 6, 2025 07:07

lastgenre: Move file loading to helpers

7df91f5

and add return types.

lastgenre: Load configured blacklist file

cde70c6

Uses a custom text file format since YAML, INI, TOML, ... all have their flaws with parsing regex patterns.

lastgenre: Add _is_forbidden method

e5e5f68

and use it via _filter_valid_genres. For performance reasons only consider it if the genre is valid in the first place.

lastgenre: Precompile blacklist regex patterns

838b593

lastgenre: Add test_blacklist_patterns

5d8c2fe

lastgenre: Handle blacklist regex compile errors

b7bacbe

Fall back to literal strings if a regex compile fails. This automagically supports strings as well as regex patterns in the blacklist.

lastgenre: Add blacklist file format tests

d008951

JOJ0 added 5 commits August 6, 2025 07:07

lastgenre: poe format

0e3c36d

lastgenre: Add complex regex patterns to test

78941a5

lastgenre: Test invalid regex that gets escaped

06b094f

lastgenre: Polish blacklist docstrings

9aab5c5

lastgenre: Docs for blacklist feature

13a01b1

JOJ0 force-pushed the lastgenre_forbidden branch from 5b24d49 to 13a01b1 Compare August 6, 2025 05:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

lastgenre: Global and artist-based genre blacklist #5744

lastgenre: Global and artist-based genre blacklist #5744

Uh oh!

JOJ0 commented Apr 20, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Apr 20, 2025

Uh oh!

codecov bot commented Aug 4, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Aug 5, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Aug 5, 2025

Uh oh!

JOJ0 Aug 7, 2025

Uh oh!

Copilot AI Aug 5, 2025

Uh oh!

Copilot AI Aug 5, 2025

Uh oh!

Uh oh!

	line = line.decode("utf-8").lower()
	line = line.decode("utf-8")

lastgenre: Global and artist-based genre blacklist #5744

Are you sure you want to change the base?

lastgenre: Global and artist-based genre blacklist #5744

Uh oh!

Conversation

JOJ0 commented Apr 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Notes

To Do

Uh oh!

github-actions bot commented Apr 20, 2025

Uh oh!

codecov bot commented Aug 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions bot commented Aug 5, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

Uh oh!

JOJ0 Aug 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

JOJ0 commented Apr 20, 2025 •

edited

Loading

codecov bot commented Aug 4, 2025 •

edited

Loading