fix(metadata): improve ComicInfo.xml detection and normalize fallback titles #2080
🚀 Pull Request
📝 Description
Bug report was submitted via Discord
This pull request improves the accuracy of comic book metadata extraction, particularly for ComicInfo.xml files embedded in different archive formats. It cleans up the filenames used as fallback titles, detects ComicInfo.xml files more flexibly, and adds tests that check parsing across a range of scenarios.
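To make the fallback-title clean-up concrete, here is a minimal Java sketch of the behaviour this PR describes for `processFilename`: underscores and hyphens in the archive's base name become spaces. The class name, the signature, and the final `trim()` are assumptions made for illustration; the actual method in `CbxMetadataExtractor` may differ.

```java
// Illustrative sketch only; not the code from this pull request.
final class FallbackTitleSketch {

    // Hypothetical stand-in for processFilename. The input is assumed to be
    // the archive's base name, i.e. the filename with its extension removed.
    static String processFilename(String baseName) {
        // Replace the two separator characters named in the PR with spaces.
        String spaced = baseName.replace('_', ' ').replace('-', ' ');
        // Trimming leading/trailing spaces is an assumption of this sketch.
        return spaced.trim();
    }

    public static void main(String[] args) {
        // Prints "Batman Year One"
        System.out.println(processFilename("Batman_Year-One"));
    }
}
```

A side effect of this approach is that hyphens that belong to the title (for example "Spider-Man") are also replaced, which may or may not be what the real implementation does.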
🛠️ Changes Implemented
Metadata extraction improvements:
- `processFilename` method in `CbxMetadataExtractor`: cleans up archive filenames by replacing underscores and hyphens with spaces before they are used as fallback titles. This gives more human-readable titles when metadata is missing.
- `CbxMetadataExtractor` now uses the processed filename instead of the raw base name for the book title.
ComicInfo.xml detection enhancements:
- `isComicInfoName`: detects ComicInfo.xml files regardless of case or subdirectory location within archives. All archive entry search logic for the CBZ, CBR, and CB7 formats now uses this method.
Testing improvements:
- `ComicInfoParsingIssuesTest`: covers a wide range of ComicInfo.xml extraction scenarios, including different cases, subdirectory locations, fallback logic, extended fields, and special characters. This checks that the extractor behaves correctly for real-world comic archives.
🧪 Testing Strategy
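The detection rule described for `isComicInfoName` (match regardless of case or subdirectory) can be sketched roughly as below. This is an assumption-laden illustration, not the PR's code: the class name, the null handling, and the treatment of backslash separators are all choices made for this sketch.

```java
// Illustrative sketch only; not the code from this pull request.
final class ComicInfoNameSketch {

    // Hypothetical stand-in for isComicInfoName: true when the archive entry's
    // last path segment is "ComicInfo.xml", compared case-insensitively.
    static boolean isComicInfoName(String entryName) {
        if (entryName == null) {
            return false; // assumption: a missing name never matches
        }
        // Assumption: some archives report '\' as the separator, so normalize.
        String normalized = entryName.replace('\\', '/');
        // Keep only the part after the last '/', i.e. ignore subdirectories.
        String leaf = normalized.substring(normalized.lastIndexOf('/') + 1);
        return "ComicInfo.xml".equalsIgnoreCase(leaf);
    }

    public static void main(String[] args) {
        System.out.println(isComicInfoName("ComicInfo.xml"));          // true
        System.out.println(isComicInfoName("extras/meta/COMICINFO.XML")); // true
        System.out.println(isComicInfoName("ComicInfo.xml.bak"));      // false
    }
}
```

Comparing only the last path segment keeps near-misses such as `MyComicInfo.xml` or `ComicInfo.xml.bak` from being picked up, which a plain case-insensitive `endsWith` or `contains` check would not guarantee.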
📸 Visual Changes (if applicable)
`develop` branch
(`./gradlew test` for backend)
💬 Additional Context (optional)