I can't get "import --from-scratch" to import "from scratch" #3706

@jaimet

Description

Problem

An (ID3 tag) frame contained inside the to-be-imported mp3 file remains in the file after it has been imported.

Here's the procedure:

Start with any mp3 file:
$ wget -O test.mp3 "https://ccrma.stanford.edu/~jos/mp3/pno-cs.mp3"

Delete any (all) ID3 tags from the file:
$ mid3v2 -D test.mp3

Check that the file contains no ID3 tags:

$ mid3v2 test.mp3
IDv2 tag info for test.mp3
No ID3 header found; skipping.

Add "COMM" (comment) ID3 frame with garbage content (this also adds the ID3v2.4.0 tag needed to contain the frame):
$ mid3v2 -c wibble test.mp3

Import the mp3 as a track:

$ beet -vv -l libraryTest.db import --from-scratch -C -t test.mp3
user configuration: /home/user/.config/beets/config.yaml
data directory: /home/user/.config/beets
plugin paths:
Sending event: pluginload
library database: /home/user/test/libraryTest.db
library directory: /home/user/Music
Sending event: library_opened
Sending event: import_begin
Sending event: import_task_created
Sending event: import_task_start
Looking up: /home/user/test/test.mp3
Tagging  -
No album ID found.
Search terms:  -
Album might be VA: True
Evaluating 0 candidates.

/home/user/test/test.mp3 (1 items)
Sending event: before_choose_candidate
No matching release found for 1 tracks.
For help, see: http://beets.readthedocs.org/en/latest/faq.html#nomatch
[S]kip, Use as-is, as Tracks, Group albums, Enter search, enter Id, aBort? T
Sending event: import_task_choice
Sending event: import_task_created
Sending event: import_task_start
Looking up: /home/user/test/test.mp3
Item search terms:  -
Found 0 candidates.

/home/user/test/test.mp3
Sending event: before_choose_candidate
No matching recordings found.
[S]kip, Use as-is, Enter search, enter Id, aBort? I
Enter recording ID: 07227795-b0b8-4c73-b010-c96f73990dc4
Searching for track ID: 07227795-b0b8-4c73-b010-c96f73990dc4
Sending event: trackinfo_received
Sending event: before_choose_candidate
Correcting track tags from:
     -
To:
    Gene Autry - Rudolph, the Red-Nosed Reindeer
URL:
    https://musicbrainz.org/recording/07227795-b0b8-4c73-b010-c96f73990dc4
(Similarity: 0.0%) (title, length)
Apply, More candidates, Skip, Use as-is, Enter search, enter Id, aBort? A
Sending event: import_task_choice
Sending event: import_task_apply
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: write
Sending event: after_write
Sending event: database_change
Sending event: import_task_files
Sending event: item_imported
Sending event: import
Sending event: cli_exit

But the garbage tag is still there:

$ mid3v2 test.mp3 | grep wibble
COMM==eng=wibble
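
For readers without mid3v2, the same on-disk check can be approximated with the Python standard library alone. This is only a minimal sketch: it assumes an ID3v2.3 or ID3v2.4 tag at the start of the file, with no extended header, no unsynchronisation and no compression, and `list_id3v2_frames` is a hypothetical helper name, not part of beets or mutagen.

```python
import struct
from typing import List


def _synchsafe(raw: bytes) -> int:
    """Decode a 4-byte synchsafe integer (7 significant bits per byte)."""
    return (raw[0] << 21) | (raw[1] << 14) | (raw[2] << 7) | raw[3]


def list_id3v2_frames(data: bytes) -> List[str]:
    """Return the frame IDs of a leading ID3v2.3/2.4 tag, or [] if none is found."""
    if len(data) < 10 or data[:3] != b"ID3":
        return []  # no ID3v2 header at the start of the data
    major = data[3]
    if major not in (3, 4):
        return []  # older versions use a different frame layout
    if data[5] & 0x40:
        # Assumption of this sketch: extended headers are not handled.
        raise ValueError("extended header not supported in this sketch")

    end = min(10 + _synchsafe(data[6:10]), len(data))
    pos = 10
    frames = []
    while pos + 10 <= end:
        if data[pos] == 0:
            break  # reached the padding that follows the last frame
        raw_size = data[pos + 4:pos + 8]
        # v2.4 stores frame sizes as synchsafe integers, v2.3 as plain big-endian.
        size = _synchsafe(raw_size) if major == 4 else struct.unpack(">I", raw_size)[0]
        frames.append(data[pos:pos + 4].decode("ascii", errors="replace"))
        pos += 10 + size
    return frames
```

Calling it on the bytes of test.mp3 (for example `list_id3v2_frames(open("test.mp3", "rb").read())`) would be expected to include "COMM" for as long as the comment frame is still in the file.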

Setup

  • OS: Debian GNU/Linux 10 ("Buster")
  • Python version: 3.7.3
  • beets version: 1.4.7
  • Turning off plugins made problem go away (yes/no): no (I don't have any plugins)
$ beet version
beets version 1.4.7
Python version 3.7.3
no plugins loaded
$ beet config
{}

Activity

needinfo label ("We need more details or follow-up from the filer before this can be tagged 'bug' or 'feature.'") added and removed on Jul 31, 2020
sampsyo (Member) commented on Aug 1, 2020

Hello! Maybe a good way to investigate this would be by looking at the song's metadata with beet info to see what happens with and without the option.

But because it seems like you're interested in how the actual on-disk tags get affected, I recommend you give the scrub plugin a try. It deletes all the tags from a file before writing any new ones.

jaimet (Author) commented on Aug 2, 2020

> Hello! Maybe a good way to investigate this would be by looking at the song's metadata with beet info to see what happens with and without the option.

$ beet config
plugins: info

Firstly, without --from-scratch:

$ mid3v2 -D test.mp3
$ mid3v2 -c wibble test.mp3
$ rm ./libraryTest.db
$ beet -vv -l libraryTest.db import -C -t test.mp3
user configuration: /home/user/.config/beets/config.yaml
data directory: /home/user/.config/beets
plugin paths:
Sending event: pluginload
library database: /home/user/test/libraryTest.db
library directory: /home/user/Music
Sending event: library_opened
Sending event: import_begin
Sending event: import_task_created
Sending event: import_task_start
Looking up: /home/user/test/test.mp3
Tagging  -
No album ID found.
Search terms:  -
Album might be VA: True
Evaluating 0 candidates.

/home/user/test/test.mp3 (1 items)
Sending event: before_choose_candidate
No matching release found for 1 tracks.
For help, see: http://beets.readthedocs.org/en/latest/faq.html#nomatch
[S]kip, Use as-is, as Tracks, Group albums, Enter search, enter Id, aBort? T
Sending event: import_task_choice
Sending event: import_task_created
Sending event: import_task_start
Looking up: /home/user/test/test.mp3
Item search terms:  -
Found 0 candidates.

/home/user/test/test.mp3
Sending event: before_choose_candidate
No matching recordings found.
[S]kip, Use as-is, Enter search, enter Id, aBort? I
Enter recording ID: 07227795-b0b8-4c73-b010-c96f73990dc4
Searching for track ID: 07227795-b0b8-4c73-b010-c96f73990dc4
Sending event: trackinfo_received
Sending event: before_choose_candidate
Correcting track tags from:
     -
To:
    Gene Autry - Rudolph, the Red-Nosed Reindeer
URL:
    https://musicbrainz.org/recording/07227795-b0b8-4c73-b010-c96f73990dc4
(Similarity: 0.0%) (title, length)
Apply, More candidates, Skip, Use as-is, Enter search, enter Id, aBort? A
Sending event: import_task_choice
Sending event: import_task_apply
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: write
Sending event: after_write
Sending event: database_change
Sending event: import_task_files
Sending event: item_imported
Sending event: import
Sending event: cli_exit

$ beet info ./test.mp3
/home/user/test/test.mp3
       arranger:
            art: False
         artist: Gene Autry
  artist_credit: Gene Autry
    artist_sort: Autry, Gene
       bitdepth: 0
        bitrate: 128000
            bpm: 0
       channels: 2
       comments: wibble
           comp: False
           disc: 0
      disctotal: 0
         format: MP3
         genres:
         length: 20.062625
         lyrics:
    mb_artistid: 675b7627-6b5d-4a46-a728-785cb24a299e
     mb_trackid: 07227795-b0b8-4c73-b010-c96f73990dc4
  original_year: 0
r128_album_gain: 0
r128_track_gain: 0
     samplerate: 48000
          title: Rudolph, the Red-Nosed Reindeer
          track: 0
     tracktotal: 0
           year: 0

$ beet info ./test.mp3 | md5sum
04d332add5bba85c0f932850573a3260  -

Second time round, with --from-scratch:

$ mid3v2 -D test.mp3
$ mid3v2 -c wibble test.mp3
$ rm ./libraryTest.db
$ beet -vv -l libraryTest.db import --from-scratch -C -t test.mp3
user configuration: /home/user/.config/beets/config.yaml
data directory: /home/user/.config/beets
plugin paths:
Sending event: pluginload
library database: /home/user/test/libraryTest.db
library directory: /home/user/Music
Sending event: library_opened
Sending event: import_begin
Sending event: import_task_created
Sending event: import_task_start
Looking up: /home/user/test/test.mp3
Tagging  -
No album ID found.
Search terms:  -
Album might be VA: True
Evaluating 0 candidates.

/home/user/test/test.mp3 (1 items)
Sending event: before_choose_candidate
No matching release found for 1 tracks.
For help, see: http://beets.readthedocs.org/en/latest/faq.html#nomatch
[S]kip, Use as-is, as Tracks, Group albums, Enter search, enter Id, aBort? T
Sending event: import_task_choice
Sending event: import_task_created
Sending event: import_task_start
Looking up: /home/user/test/test.mp3
Item search terms:  -
Found 0 candidates.

/home/user/test/test.mp3
Sending event: before_choose_candidate
No matching recordings found.
[S]kip, Use as-is, Enter search, enter Id, aBort? I
Enter recording ID: 07227795-b0b8-4c73-b010-c96f73990dc4
Searching for track ID: 07227795-b0b8-4c73-b010-c96f73990dc4
Sending event: trackinfo_received
Sending event: before_choose_candidate
Correcting track tags from:
     -
To:
    Gene Autry - Rudolph, the Red-Nosed Reindeer
URL:
    https://musicbrainz.org/recording/07227795-b0b8-4c73-b010-c96f73990dc4
(Similarity: 0.0%) (title, length)
Apply, More candidates, Skip, Use as-is, Enter search, enter Id, aBort? A
Sending event: import_task_choice
Sending event: import_task_apply
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: write
Sending event: after_write
Sending event: database_change
Sending event: import_task_files
Sending event: item_imported
Sending event: import
Sending event: cli_exit

$ beet info ./test.mp3
/home/user/test/test.mp3
       arranger:
            art: False
         artist: Gene Autry
  artist_credit: Gene Autry
    artist_sort: Autry, Gene
       bitdepth: 0
        bitrate: 128000
            bpm: 0
       channels: 2
       comments: wibble
           comp: False
           disc: 0
      disctotal: 0
         format: MP3
         genres:
         length: 20.062625
         lyrics:
    mb_artistid: 675b7627-6b5d-4a46-a728-785cb24a299e
     mb_trackid: 07227795-b0b8-4c73-b010-c96f73990dc4
  original_year: 0
r128_album_gain: 0
r128_track_gain: 0
     samplerate: 48000
          title: Rudolph, the Red-Nosed Reindeer
          track: 0
     tracktotal: 0
           year: 0

$ beet info ./test.mp3 | md5sum
04d332add5bba85c0f932850573a3260  -

(That's the same md5sum as without --from-scratch)
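
An identical md5sum shows that the two dumps match, but when they differ, a field-by-field comparison is easier to read than two hashes. The following is a minimal sketch, assuming each dump has the `key: value` layout shown above and that the leading path line contains no colon; `parse_info` and `diff_info` are hypothetical helper names, not beets commands.

```python
def parse_info(text: str) -> dict:
    """Turn 'key: value' lines into a dict, skipping lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # e.g. the file path on the first line, or a blank line
        fields[key.strip()] = value.strip()
    return fields


def diff_info(before: str, after: str) -> dict:
    """Map each differing field to its (before, after) pair of values."""
    a, b = parse_info(before), parse_info(after)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }
```

Fed the two dumps above, `diff_info` would return an empty dict, which is the same conclusion the matching checksums give.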

> But because it seems like you're interested in how the actual on-disk tags get affected, I recommend you give the scrub plugin a try. It deletes all the tags from a file before writing any new ones.

According to the documentation, that's what the --from-scratch option (or the from_scratch configuration option) is for:

  1. (From https://beets.readthedocs.io/en/v1.4.7/reference/cli.html#import)

When beets applies metadata to your music, it will retain the value of any existing tags that weren’t overwritten, and import them into the database. You may prefer to only use existing metadata for finding matches, and to erase it completely when new metadata is applied. You can enforce this behavior with the --from-scratch option, or the from_scratch configuration option.

  2. (From https://beets.readthedocs.io/en/v1.4.7/reference/config.html#from-scratch)

Either yes or no (default), controlling whether existing metadata is discarded when a match is applied. This corresponds to the --from_scratch flag to beet import.

  3. (From https://beets.readthedocs.io/en/v1.4.6/changelog.html#december-21-2017)

A new from_scratch configuration option makes the importer remove old metadata before applying new metadata. This new feature complements the zero and scrub plugins but is slightly different: beets clears out all the old tags it knows about and only keeps the new data it gets from the remote metadata source.

  4. (From beets issue 934)

--from-scratch: every field in the Item should be zeroed before applying the matched metadata (this zeroing should only happen if the user actually chooses apply.)

  5. (From beets issue 1173)

"from-scratch" "is about tagging the file from a completely blank slate, ie., removing all data from the file before writing new data".

Re the scrub plugin, I am preparing a bug report for that too, but that's a different bug report - I want to keep this bug report focused on the --from-scratch option (and the from_scratch configuration option) only.

Is this observed behaviour of the --from-scratch option "by design"?

jaimet (Author) commented on Aug 12, 2020

This issue currently displays a "needinfo" label. Does this issue need more details or a follow-up from me, or is this (the fact that this issue is showing a "needinfo" label) a bug?

sampsyo (Member) commented on Aug 12, 2020

Sorry for the silence. I admit I'm a little overwhelmed by all the data here—is it possible to distill what you're observing with the --from-scratch mode that contradicts your expectations? I think the important thing is to start by making sure you're observing differences in the beets library database first (as observed by beet ls or beet info -L or similar), as opposed to the on-disk tags (as observed by mid3v2). Then, as a separate question, we can investigate the association between the database data and the tags.

stale commented on Oct 11, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

jaimet (Author) commented on Oct 17, 2020

(Thank you, stale bot, for the prod! 👍)

Adrian, apologies for the delay in getting back to you.

I only use beets to tag my audio files - I don't actually use the beets library database at all.

I would like beets to remove all pre-existing tags from my audio files when it tags them, so that the only ID3 tags left in the files are the ones beets adds during processing. As I read the documentation, the --from-scratch option should do exactly that.

However, according to my testing, the --from-scratch option does not remove all pre-existing tags from my audio files. This is what I show above: there's an unwanted tag in my audio file before processing, and using the --from-scratch option does not remove it.

I think that there is a discrepancy between the documentation and the observed behaviour. I do not know whether the documentation is wrong or the observed behaviour is wrong - I just think that they do not match up.

I realise that you said (above) that I should look at the scrub plugin. This makes me think that the --from-scratch option is not supposed to remove all pre-existing tags from my audio files during processing. Is this correct?

I hope that this comment makes sense - if it doesn't make sense, then please let me know and I'll try to explain a different way.

sampsyo (Member) commented on Oct 18, 2020

I think the main thing to clarify here is that --from-scratch only applies to fields that beets actually supports as columns in its database. That's why it's a little hard to talk about this (and measure the effect) in a database-free setting. The scrub plugin is responsible for removing metadata that beets does not support, so using them together might be close to what you're after.
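
The distinction described here can be illustrated with a toy model. This is a sketch under stated assumptions, not beets' implementation: a file is modelled as a plain dict of tags, `KNOWN_FIELDS` is a made-up stand-in for the fields beets maps to database columns, and `apply_from_scratch` and `scrub_then_apply` are hypothetical names chosen for the example.

```python
# Illustrative subset only; the real set of beets fields is much larger.
KNOWN_FIELDS = {"title", "artist", "genre"}


def apply_from_scratch(file_tags: dict, new_metadata: dict) -> dict:
    """Clear every known field, keep unknown tags, then write the new values."""
    result = {k: v for k, v in file_tags.items() if k not in KNOWN_FIELDS}
    result.update(new_metadata)
    return result


def scrub_then_apply(file_tags: dict, new_metadata: dict) -> dict:
    """Drop all existing tags, known or not, then write the new values."""
    return dict(new_metadata)


old = {"title": "old title", "genre": "Noise", "X-RIPPER": "wibble"}
new = {"title": "Rudolph, the Red-Nosed Reindeer", "artist": "Gene Autry"}

print(apply_from_scratch(old, new))  # the unknown X-RIPPER tag survives
print(scrub_then_apply(old, new))    # only the new metadata remains
```

In this model the stale "genre" value disappears either way, while a tag outside the known set is only removed by the scrub-style function, which is why combining the two is suggested above.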

stale commented on Dec 17, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

jaimet (Author) commented on Dec 23, 2020

Hi stale bot. Unfortunately, I haven't yet had the time to continue working on this issue, but I want to keep it open as I don't yet think that it has been satisfactorily resolved (although I'm starting to wonder whether I consider this to be more a documentation issue than a code issue). I'm adding this comment so you don't close this issue tomorrow.


Labels: needinfo, stale
Participants: @sampsyo, @jaimet, @the-confessor, @rdy2go
Issue #3706 · beetbox/beets