test all with full-remote-verification shows "Extra" hashes from error in compact #4693

Open · ts678 opened this issue Mar 19, 2022 · 13 comments · May be fixed by #4982

Comments

@ts678
Collaborator

ts678 commented Mar 19, 2022

  • I have searched open and closed issues for duplicates.
  • I have searched the forum for related topics.

Environment info

  • Duplicati version: 2.0.6.100 Canary but problem is very old
  • Operating system: Windows 10 Professional Version 21H2
  • Backend: Local folder

Description

Running the test command over all files with the full-remote-verification option may find errors, displayed as Extra: <hash>.
This has been reported on the forum quite a few times; it has been assumed harmless, yet it has bothered some people a lot, and was never understood.

Tracking it down took a time series (move/copy) of the job databases, plus an rclone sync with the --backup-dir option to preserve destination files.
After each backup I sync, run test, and flag an error if the exit code is nonzero. I can then trace a hash back to a compact, and with luck go further back.
A profiling log is not enough, because even with profile-all-database-queries some details one might want are simply not visible.
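
For anyone who wants to do the same tracing: the reported hash can be looked up in a copy of the job database with an SQLite tool. A minimal sketch, using the A.txt hash from the steps below as the example, and assuming the current schema's table and column names (Block, DeletedBlock, Remotevolume):

-- Where does a reported "Extra" hash currently live, and in which remote file?
SELECT 'Block' AS FoundIn, b."VolumeID", rv."Name", rv."State"
  FROM "Block" b JOIN "Remotevolume" rv ON rv."ID" = b."VolumeID"
 WHERE b."Hash" = 'VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0='
UNION ALL
SELECT 'DeletedBlock', d."VolumeID", rv."Name", rv."State"
  FROM "DeletedBlock" d JOIN "Remotevolume" rv ON rv."ID" = d."VolumeID"
 WHERE d."Hash" = 'VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=';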

Steps to reproduce

  1. Set up an unencrypted backup of a source folder, with a retention of 1 backup and with the full-remote-verification option.
  2. Download Duplicati 2.0.6.3, 2.0.6.100, and 2.0.6.101 as .zip files. Keep .zip and SQLite tools handy for inspection.
  3. Create A.txt containing A. Without a line ending, the file's hash is VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=
  4. Create B.txt containing B.
  5. Create duplicati-2.0.6.3_beta_2021-06-17.zip
  6. Backup 1
  7. Delete A.txt
  8. Create C.txt
  9. Create duplicati-2.0.6.100_canary_2021-08-11.zip
  10. Backup 2, then consider looking in the DeletedBlock table for the A.txt hash, or look for it in the dblock or dindex volume file.
  11. Create A.txt
  12. Create duplicati-2.0.6.101_canary_2022-03-13.zip
  13. Backup 3, then consider looking in the Block table for the A.txt hash, or look for it in the dblock or dindex volume file.
  14. Delete duplicati-2.0.6.3_beta_2021-06-17.zip
  15. Delete duplicati-2.0.6.100_canary_2021-08-11.zip
  16. Backup 4, which compacts the files from backup 1 and backup 2 together, but leaves the files from backup 3.
  17. Run the test command on all files to see whether the backup 3 files and backup 4 files coexist, or whether both claim to hold A.txt.
  • Actual result:
    Extra: VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0= reported for the dindex and dblock files from the colliding backups
  • Expected result:
    No errors

The test sequence gives each backup a large file to create wasted space on demand, plus a small file to prevent full deletion.
The A.txt file is special because it comes, goes, and returns. When it goes and backup 2 runs, its block goes into DeletedBlock.
When it returns and backup 3 runs, the block is not recovered from DeletedBlock (should it be?), so it gets a new entry in Block.
Deleting the large files originally in backups 1 and 2 creates enough wasted space to compact backups 1 and 2 together after backup 4.
Compact checks which blocks are still used and moves those into the new volume. A.txt arrived in backup 1 and again in backup 3.
The confusion appears to be that the compact check looks in the Block table, sees the second A.txt block, and therefore also keeps the first.

// In the compact handler: for each block found in the downloaded dblock volume,
// keep it in the new volume if the database still considers it in use.
using (var f = new BlockVolumeReader(inst.CompressionModule, tmpfile, m_options))
{
    foreach (var e in f.Blocks)
    {
        if (q.UseBlock(e.Key, e.Value, transaction))

public BlockQuery(System.Data.IDbConnection con, System.Data.IDbTransaction transaction)
{
    m_command = con.CreateCommand();
    m_command.Transaction = transaction;
    m_command.Parameters.Clear();
    m_command.CommandText = @"SELECT ""VolumeID"" FROM ""Block"" WHERE ""Hash"" = ? AND ""Size"" = ? ";
    m_command.AddParameters(2);
}

// Matches on Hash and Size only; which volume currently owns the block is never checked.
public bool UseBlock(string hash, long size, System.Data.IDbTransaction transaction)
{
    m_command.Transaction = transaction;
    m_command.SetParameterValue(0, hash);
    m_command.SetParameterValue(1, size);
    var r = m_command.ExecuteScalar();
    return r != null && r != DBNull.Value;
}
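
Because UseBlock never asks which volume owns the block, a block that exists twice at the destination satisfies the check no matter which copy is being compacted. A sketch of a query that lists such collision candidates in the local database (same schema assumptions as in the query further up):

-- Blocks with a live entry in Block plus a leftover entry in DeletedBlock:
-- exactly the candidates for the "Extra" collision described above.
SELECT b."Hash", b."Size",
       b."VolumeID" AS LiveVolumeID,
       d."VolumeID" AS DeletedVolumeID
  FROM "Block" b
  JOIN "DeletedBlock" d
    ON d."Hash" = b."Hash" AND d."Size" = b."Size";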

I don't understand the design theory well enough to define a fix, but there seem to be two possible points to stop this.
The first would be not letting the A.txt block get duplicated when A.txt reappears. The other would be to fix compact.

Once the error gets into the destination files, it can survive a database recreate. In 2.0.2.1 it breaks backups, but test gives no detail.
Pointing a more recent Duplicati (e.g. 2.0.4.5) at the 2.0.2.1 destination for a DB recreate and then running the test command details the "Extra".

I'm still not sure whether any further damage comes from this problem, though. Now that I look for it, I see that I get these regularly.

Screenshots

Debug log

@ts678
Collaborator Author

ts678 commented Mar 20, 2022

Now that I look for it, I see that I get these regularly.

More precisely, I got one while I was writing up the report above, but it was late, so I decided to continue and amend later.
Having a steady flow of these gives me a chance, and a motivation, to figure out how to better spot and (maybe) repair the damage.

The trouble with a problem alarm for an unfixed problem is that the alarm goes off annoyingly often, which slows further findings.
This happened before when I was testing interrupted backups, which is IMO still a good place for an issue-hunter to go hunting.

Issue-fixers, unfortunately, are scarce at the moment, but if anyone wants a reproducible backup corruption, here is one case.
The Duplicati project only exists and improves when people give time or talent somehow, be it testing, fixing, features, whatever.

To illustrate how the time series of databases (one per backup) makes this problem more visible, here is my Extra: hash history.
This is a chronological survey through the database series, showing where the extra block was at each point: in which table and file.

Latest Remotevolume table:
394 duplicati-b2038103e5b674f238d68c6991baa3410.dblock.zip.aes Deleted
468 duplicati-b107368bda8404624865f7ed9dc5e6fbd.dblock.zip.aes Verified
474 duplicati-bfbc2c12432f24f5e8a5fabcbd6fe8bec.dblock.zip.aes Verified

Deletion of a version records some blocks as wasted space in DeletedBlock
The same block shows up again, so it goes (rightly or wrongly) in as a new Block entry
Compact runs, saving some blocks from DeletedBlock, but it keeps the duplicate
The database got adjusted for this work, so the owner of record is the new file

Using the database series:
Version Block   DeletedBlock
.9      394     N/A
.8      394     N/A
.7      N/A     394
.6      N/A     394
.5      N/A     394
.4      N/A     394
.3      N/A     394
.2      N/A     394
.1      468     394
Current 474     N/A

Error from test all seemingly blames the old file whose block got stolen

duplicati-i1f8d3bb8c2a245f1a84024fe837f070a.dindex.zip.aes: 1 errors
	Extra: UoAzUJ05Oobk7gOJx+wICyg4ATAw2IzKSZXSmRUmme8=

duplicati-b107368bda8404624865f7ed9dc5e6fbd.dblock.zip.aes: 1 errors
	Extra: UoAzUJ05Oobk7gOJx+wICyg4ATAw2IzKSZXSmRUmme8=

Seeing that a compact has just run is relatively simple. It gets a section in the job log summary, with relevant details inside.

[screenshot: the Compact phase section in the job log summary]

Searching through the job's About --> Show log --> Stored can be slow. Looking in the database can be easy, but needs tools.
In red below is the new owner of the block. One loser is shown in blue but is now deleted. The other loser isn't shown here, but it took the blame.

[screenshot: Remotevolume table rows, with the new block owner marked in red and a deleted former owner in blue]

@ts678 changed the title from 'test with full-remote-verification shows "Extra" hashes from error in compact' to 'test all with full-remote-verification shows "Extra" hashes from error in compact' on Mar 25, 2022
@duplicatibot

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/is-there-a-work-arround-for-google-drive-403/14148/8

@ts678
Collaborator Author

ts678 commented Mar 28, 2022

I'm leaving the errored backup alone (in case it needs further examination) and switching back to an older one which I think was used in earlier analysis. That one had 10 issues, so it hit the reporting limit. Trying to get at the rest ran a compact and cleared up the problem for now; we'll see if it comes back. I don't consider this issue super serious if it ends at test all, but I don't know what other problems may arise from the possible logic flaw. Nevertheless, I set the backup corruption label on this one, because that is what it is.

I think this is as good a test case as one could reasonably ask for. At least my intent was to make it easily reproducible by anyone. Anybody want to try? It doesn't have to be one of the "regulars"; Duplicati is always encouraging new people to step up to help. The lucky winner may then see a reproduced label on this. Maybe that will be enough to get an expert to help on an actual fix.

@ts678
Collaborator Author

ts678 commented Apr 3, 2022

The other backup got 22 errors each in a dindex and a dblock today. The ending looks just like the previous problems: the Extra: blocks were in both the DeletedBlock and the Block table, then compact ran on a dblock containing the block and propagated it into the new volume erroneously, because it was in the Block table. The problem is that the Block entry pointed at another dblock.

VolumeID        Name and RemoteOperation history

1205            duplicati-b75be110ff47647b491879826990a25f2.dblock.zip.aes
                put     March 31, 2022 7:52:14 AM
                get     April 2, 2022 7:57:15 AM
                delete  April 2, 2022 7:58:41 AM

compacted into

1339            duplicati-bc713e91fa7f648d9b852cd95a0305e1e.dblock.zip.aes
                put     April 2, 2022 7:57:15 AM

while block was also held by

1290            duplicati-b612ce90afa5c452e9c1a46722037196f.dblock.zip.aes
                put     April 1, 2022 12:52:48 PM

After compact "stole" blocks from 1290, 1290 and its dindex got blamed for Extra.

This one is more mysterious regarding how DeletedBlock got the block; the history ran short.
It's also not clear how the block got into 1290, because 1205 still existed then.
The profiling log looks like it happened at backup start, so it is not a post-backup special case.
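
For reference, the put/get/delete history shown above can be pulled per file from the RemoteOperation table; a sketch, assuming the Timestamp column is stored as Unix seconds:

SELECT datetime("Timestamp", 'unixepoch') AS OpTime, "Operation", "Path"
  FROM "RemoteOperation"
 WHERE "Path" = 'duplicati-b75be110ff47647b491879826990a25f2.dblock.zip.aes'
 ORDER BY "Timestamp";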

@ts678
Collaborator Author

ts678 commented Apr 20, 2022

Got another one (on a different backup of the same source to a different destination).

duplicati-b2b632cf6898b463cb949d797cc00d2fc.dblock.zip.aes: 1 errors
        Extra: WTx2xHkQY6m0Omhsp6yHFDCCccydeXaqX7X1YSL6630=

duplicati-i8d787009ef2c416985e1f179cdc7435a.dindex.zip.aes: 1 errors
        Extra: WTx2xHkQY6m0Omhsp6yHFDCCccydeXaqX7X1YSL6630=

duplicati-b472120b1edc042108e2316799e5342c4.dblock.zip.aes: 1 errors
        Extra: O4B1s7rjoIeG55lfg6vrPpD5HUE/Evzcr3P012TGxlA=

duplicati-i905611858c4a459abaed84069d9c6e09.dindex.zip.aes: 1 errors
        Extra: O4B1s7rjoIeG55lfg6vrPpD5HUE/Evzcr3P012TGxlA=

duplicati-b44b2bc9fe3f74c1e9e91f4979a6ab030.dblock.zip.aes: 4 errors
        Extra: +HdL9kdwXDt5KYyReuJQ8i6rs/dUPQYJZrV8WdJJbwg=
        Extra: 10BoqWrEBxJHgyLWg41D+GaFQsAqecxfEuV+0I4HVUg=
        Extra: 65MDtM8t6XJKggESuGrH40REVXzfbRoCGrLqh25F6ws=
        Extra: x0FGSdGXGBHo7kThSklH22CxuXMdOq+xicpkN4sklwE=

duplicati-i5db75839d4f247bfb21a08c568db715e.dindex.zip.aes: 4 errors
        Extra: +HdL9kdwXDt5KYyReuJQ8i6rs/dUPQYJZrV8WdJJbwg=
        Extra: 10BoqWrEBxJHgyLWg41D+GaFQsAqecxfEuV+0I4HVUg=
        Extra: 65MDtM8t6XJKggESuGrH40REVXzfbRoCGrLqh25F6ws=
        Extra: x0FGSdGXGBHo7kThSklH22CxuXMdOq+xicpkN4sklwE=

Here's a timeline running back in time from left to right (with a skip), showing what held the blocks.
To the left of the comma is the Block table. To the right of the comma is the DeletedBlock table.

My expanded database history let me trace the blocks back to a normal state this time.
Here "normal" means the block is only in the Block table; in the table below, that volume is on the left.

I'd consider a volume only on the right normal too. A volume on both sides is questionable.
Regardless, that is what sets up the compact to steal blocks from the prior owner.
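
Each column of the table below reflects one database snapshot; per hash, the Block/DeletedBlock pair can be read with a query along these lines (a sketch, names assumed as before):

-- One snapshot, one hash: which volume holds it in Block, and which in DeletedBlock?
SELECT
  (SELECT "VolumeID" FROM "Block"
    WHERE "Hash" = 'WTx2xHkQY6m0Omhsp6yHFDCCccydeXaqX7X1YSL6630=') AS BlockVolume,
  (SELECT "VolumeID" FROM "DeletedBlock"
    WHERE "Hash" = 'WTx2xHkQY6m0Omhsp6yHFDCCccydeXaqX7X1YSL6630=') AS DeletedBlockVolume;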

now     .1      .2      .3      .4      .5      .6      .7      .8      .9      .17     .18
WTx2xHkQY6m0Omhsp6yHFDCCccydeXaqX7X1YSL6630=
911,--- 881,773 881,773 881,773 881,773 881,773 ---,773 ---,773 ---,773 773,---

O4B1s7rjoIeG55lfg6vrPpD5HUE/Evzcr3P012TGxlA=
911,--- 882,773 882,773 882,773 882,773 882,773 ---,773 ---,773 ---,773 773,---

+HdL9kdwXDt5KYyReuJQ8i6rs/dUPQYJZrV8WdJJbwg=
911,--- 870,737 870,737 870,737 870,737 870,737 870,737 870,737 ---,737 ---,737 ---,737 737,---

10BoqWrEBxJHgyLWg41D+GaFQsAqecxfEuV+0I4HVUg=
911,--- 870,737 870,737 870,737 870,737 870,737 870,737 870,737 ---,737 ---,737 ---,737 737,---

65MDtM8t6XJKggESuGrH40REVXzfbRoCGrLqh25F6ws=
911,--- 870,737 870,737 870,737 870,737 870,737 870,737 870,737 ---,737 ---,737 ---,737 737,---

x0FGSdGXGBHo7kThSklH22CxuXMdOq+xicpkN4sklwE=
911,--- 870,737 870,737 870,737 870,737 870,737 870,737 870,737 ---,737 ---,737 ---,737 737,---

Files with Extra messages. They were block holders of record in .1 before compact:

881     Verified        duplicati-b2b632cf6898b463cb949d797cc00d2fc.dblock.zip.aes
885     Verified        duplicati-i8d787009ef2c416985e1f179cdc7435a.dindex.zip.aes

882     Verified        duplicati-b472120b1edc042108e2316799e5342c4.dblock.zip.aes
884     Verified        duplicati-i905611858c4a459abaed84069d9c6e09.dindex.zip.aes

870     Verified        duplicati-b44b2bc9fe3f74c1e9e91f4979a6ab030.dblock.zip.aes
872     Verified        duplicati-i5db75839d4f247bfb21a08c568db715e.dindex.zip.aes

These dblock files had DeletedBlock entries in .1 before the compact deleted them:

737     Deleted         duplicati-bb484aa3da36d4c9d9c7aa3e90e512bbb.dblock.zip.aes
773     Deleted         duplicati-b5be6ba26b08c42539933530ec0f42988.dblock.zip.aes

Here's the dblock file that came out of the compact which caused the Extra errors:

911     Verified        duplicati-bdb9eadcf722e41ca8c37f7bc42138b2c.dblock.zip.aes

RemoteOperation table. Files with Extra weren't in compact, but had blocks stolen:

Operation       Path
get             duplicati-bb484aa3da36d4c9d9c7aa3e90e512bbb.dblock.zip.aes
get             duplicati-b8c0aac6ef98d454facc3c28e3f607ca4.dblock.zip.aes
get             duplicati-b8b0c3322b5374d4880d45125a1758546.dblock.zip.aes
get             duplicati-b5be6ba26b08c42539933530ec0f42988.dblock.zip.aes
get             duplicati-b57f35d783e4b4113853a0d3f7cdd08aa.dblock.zip.aes
get             duplicati-b58fab0b45b5b448a9afabf7170e795b5.dblock.zip.aes
get             duplicati-b381349846f9b4f04bf0d772e7e502f61.dblock.zip.aes
get             duplicati-b8dcf2d3ccbb94954ba04ed3a51f765b4.dblock.zip.aes
get             duplicati-b6e0be33b4c4d4daeb816c1235c7083cb.dblock.zip.aes
get             duplicati-b948fefabfe684434984206dd2e1a6bd2.dblock.zip.aes
get             duplicati-be7f18a1d11c14e5dae1b2a7b7afdaea6.dblock.zip.aes
get             duplicati-b9fa337ecf861411894d67f9d6bf661ba.dblock.zip.aes
get             duplicati-b4f3a5c9a4e2a4ba5b560ca17125e144e.dblock.zip.aes
put             duplicati-bdb9eadcf722e41ca8c37f7bc42138b2c.dblock.zip.aes
put             duplicati-i88e1646eff7c4e3f8dd5fd701026fd23.dindex.zip.aes
delete          duplicati-bb484aa3da36d4c9d9c7aa3e90e512bbb.dblock.zip.aes
delete          duplicati-i61fba43b5b114944ad4673b3f76f481a.dindex.zip.aes
delete          duplicati-b8c0aac6ef98d454facc3c28e3f607ca4.dblock.zip.aes
delete          duplicati-i6eda025078f34d2a8fc637325defa3c3.dindex.zip.aes
delete          duplicati-b8b0c3322b5374d4880d45125a1758546.dblock.zip.aes
delete          duplicati-i6e7b4f8dc11940cc8c6e66769b945fd4.dindex.zip.aes
delete          duplicati-b5be6ba26b08c42539933530ec0f42988.dblock.zip.aes
delete          duplicati-i7de28e27ea4c4fbc92f2d918eb399716.dindex.zip.aes
delete          duplicati-b57f35d783e4b4113853a0d3f7cdd08aa.dblock.zip.aes
delete          duplicati-icd7bde8dd72e4dabad5e7b485aefd9ea.dindex.zip.aes
delete          duplicati-b58fab0b45b5b448a9afabf7170e795b5.dblock.zip.aes
delete          duplicati-i2245f82ca4794e58a18711305fde7f2f.dindex.zip.aes
delete          duplicati-b381349846f9b4f04bf0d772e7e502f61.dblock.zip.aes
delete          duplicati-iab1d460ec95446a68552d0d322f8bd3b.dindex.zip.aes
delete          duplicati-b8dcf2d3ccbb94954ba04ed3a51f765b4.dblock.zip.aes
delete          duplicati-ide091604ee1648a0a231a8076f0e54de.dindex.zip.aes
delete          duplicati-b6e0be33b4c4d4daeb816c1235c7083cb.dblock.zip.aes
delete          duplicati-i3f40d0f7688d4ab185546718e82b761c.dindex.zip.aes
delete          duplicati-b948fefabfe684434984206dd2e1a6bd2.dblock.zip.aes
delete          duplicati-i3980688b2e7d4ac28284a08a4c04fc92.dindex.zip.aes
delete          duplicati-be7f18a1d11c14e5dae1b2a7b7afdaea6.dblock.zip.aes
delete          duplicati-i38b6df1b376f4932ad0ec2334c30159a.dindex.zip.aes
delete          duplicati-b9fa337ecf861411894d67f9d6bf661ba.dblock.zip.aes
delete          duplicati-i5b3a910c3c4b4663b8213a71a36e0a25.dindex.zip.aes
delete          duplicati-b4f3a5c9a4e2a4ba5b560ca17125e144e.dblock.zip.aes
delete          duplicati-i89a5b99d830044abbee8b5ea1e7f0d91.dindex.zip.aes

I think this is enough investigation to demonstrate that the problem pattern holds.
The system is needed for other tests, but unfortunately test keeps giving this noise.

@duplicatibot

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/test-command-produces-extra-errors/13773/19

@duplicatibot

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/best-way-to-verify-backup-after-each-incremental-run/14925/2

@Jojo-1000
Contributor

Jojo-1000 commented Jun 26, 2023

I can reproduce this in the latest version. The error luckily does not break database recreate; it seems recreate will choose one of the volumes containing the block, and the other is put in the DuplicateBlock table instead.

From the comments on the schema, the DeletedBlock table should be used for wasted-space computations. In theory, it should be possible to move a block back from deleted to the normal table. FixMissingBlocklistHashes, called by repair, seems to try to find blocklist hashes from that table. However, the current database recreate process does not populate the DeletedBlock table. Instead, it looks like all blocks are put into the Block table, so after a database recreate following backup 2 the issue should not be reproducible.

That is good, but it also means any space computations on recreated databases are incorrect. This should be reproducible by getting a dblock file close to the compact threshold, recreating the database, and then deleting enough data to get over the threshold. From what I understand, the recreated database would not trigger a compact, but the original database would.
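
As a rough way to see what those computations are based on, here is a sketch that compares per-volume waste (summed from DeletedBlock) against the recorded volume size; the 'Blocks' type value and column names are assumptions, and the real compact query surely differs in detail:

-- Rough per-volume waste estimate: deleted bytes vs. recorded dblock size
SELECT rv."Name",
       rv."Size" AS VolumeSize,
       COALESCE(SUM(d."Size"), 0) AS DeletedBytes,
       ROUND(100.0 * COALESCE(SUM(d."Size"), 0) / rv."Size", 1) AS WastePercent
  FROM "Remotevolume" rv
  LEFT JOIN "DeletedBlock" d ON d."VolumeID" = rv."ID"
 WHERE rv."Type" = 'Blocks' AND rv."Size" > 0
 GROUP BY rv."ID", rv."Name", rv."Size"
 ORDER BY WastePercent DESC;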

So, to fix this I would suggest two changes:

  1. When adding blocks, run a second query on DeletedBlock to see if the block was previously deleted. If that is the case, use the queries and checks from FixMissingBlocklistHashes to move it into Block. We should probably be extra careful and also check that the corresponding volume exists and is verified; it would be pretty bad if the block were lost because the DeletedBlock entry was out of date.
    This would fix this issue. Maybe we would need to evaluate the performance of this (with auto compact and the default threshold, the table can in the worst case contain 25% of all blocks, but it should be a lot less in practice).
  2. When running a database recreate, after adding all blocks, check the Block table for blocks that do not appear in any file. These are not referenced and should go into the DeletedBlock table (I hope). After this, the space computations for compact should be correct again.

If 1 is implemented, there should be no situation where a duplicate block is added, so the DuplicateBlock table should be empty and the test command would not find extra blocks.
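
As a minimal SQL sketch of what suggestion 1 amounts to (the real change would live in the C# block-adding path, with the volume-existence check and transaction handling mentioned above; the named parameters are placeholders):

-- When a block being added already has a DeletedBlock entry,
-- promote that entry back into Block instead of writing a duplicate copy.
INSERT INTO "Block" ("Hash", "Size", "VolumeID")
SELECT "Hash", "Size", "VolumeID"
  FROM "DeletedBlock"
 WHERE "Hash" = :hash AND "Size" = :size
 LIMIT 1;

-- Remove the promoted waste record (a real fix would match the exact row
-- and first confirm the owning volume still exists and is verified).
DELETE FROM "DeletedBlock"
 WHERE "Hash" = :hash AND "Size" = :size
   AND "VolumeID" = (SELECT "VolumeID" FROM "Block"
                      WHERE "Hash" = :hash AND "Size" = :size);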

Edit:

Note that this is definitely not an issue in the compact algorithm. I was able to simplify the reproduction of the extra hashes on the test command to this:

  • Create local backup with --no-auto-compact, and --full-remote-verification
  • Add A.txt and B.txt, backup 1
  • Delete A.txt, add C.txt, backup 2
  • Add A.txt, backup 3
  • Test: No errors
  • Database recreate
  • Test: Errors as above

It seems that, if the deleted blocks are in the DeletedBlock table, test does not complain. Database recreate (and probably compact too) screws up the DeletedBlock table, which is why it complains about the extra block. Nevertheless, the fixes above are still valid; it just shows that compact is not to blame.

@ts678
Collaborator Author

ts678 commented Jun 27, 2023

it just shows that compact is not to blame.

It shows a method not needing a compact. The original steps were quite a bit different, with compact rather than recreate.
This seems like it could be viewed as a second path to the same end result, while not saying much about the original path.
Better proof of how compact fits in would be to split it out in the original steps, e.g. run a test before and after a manual compact.

When adding blocks, run a second query on DeletedBlock

I do find it an appealing idea to recycle, rather than have a one-way trip to waste (and a duplicated block at the destination).

Maybe we would need to evaluate the performance of this

The Block table has performance help that DeletedBlock now lacks, such as several indexes and the use-block-cache memory option to avoid hitting the DB.

@Jojo-1000
Contributor

The reason test does not always complain about extra blocks is that the extra blocks are in the DeletedBlock table. Both Block and DeletedBlock are combined to build the expected block list for any given volume.

After a bit more digging, you are correct that the compact error is related, but different:
Compact (DoCompact in CompactHandler.cs, line 160+) checks for every block in a volume whether it is still in use (that is, in the Block table). It does not check that the Block reference links to the same volume. That means that if a block is duplicated in a volume that is compacted, it is assumed to be the only place where that block exists. The block is then set to the new volume ID in the database via an UPDATE based on hash and size. Instead of moving from the compacted volume (from backup 1) to the new volume, it is moved from the kept volume (backup 3), where the block is duplicated. This leaves the kept volume with a block that is missing an entry.

In short, compact assumes there are no duplicate blocks and can break if there are. To prevent this, we could either trust that changes to AddBlock prevent duplicated blocks, or UseBlock could also check that the volume is correct. I think that would be pretty cheap, since it only needs one additional query per compacted volume to get its volume ID.
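
In SQL terms, the hash-and-size move described above is roughly the first statement below, and the suggested volume check is the second. Both are sketches of the idea, not the actual statements in the code:

-- Re-home the block by hash and size only: this can grab the copy
-- owned by an unrelated, still-kept volume (the bug described above).
UPDATE "Block" SET "VolumeID" = :newVolumeId
 WHERE "Hash" = :hash AND "Size" = :size;

-- With the extra check: only re-home the copy that belonged to the
-- volume being compacted, leaving any duplicate in another volume alone.
UPDATE "Block" SET "VolumeID" = :newVolumeId
 WHERE "Hash" = :hash AND "Size" = :size
   AND "VolumeID" = :compactedVolumeId;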

Block table has performance help DeletedBlock now lacks

The main reason the Block table performs so poorly is that it is so large. I don't think the DeletedBlock table would get that big (although maybe you could check that in one of your long-running backups with a retention policy; mine keep all versions). In addition, these checks would only be performed after the block memory cache or table are checked, so only for new blocks. To test the worst-case performance, I could imagine a scenario where large amounts of new files are added every backup, and 25% of those are deleted in every version (just less than the compact threshold). That should give a good idea of how much impact this has. I am going to make a draft PR until the performance impact is evaluated. If it is too bad, we could still add an index as well.
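
If it does turn out to be slow, the index mentioned above would be a one-liner along these lines (the index name is made up):

CREATE INDEX IF NOT EXISTS "DeletedBlockHashSize"
    ON "DeletedBlock" ("Hash", "Size");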

and use-block-cache memory to avoid DB

It seems like that option exists, but it is never used anywhere in the code. There is even a Dictionary for the cache, but it is always set to null and never used.

@ts678
Collaborator Author

ts678 commented Jun 27, 2023

It seems like that option exists, but it is never used anywhere in the code.

Handling the maybe-easier strange finding first: it's possible this is an incomplete removal in Remove unsed method and call #3584, where --disable-filepath-cache (line 514) was deprecated but --disable-block-cache was not. Maybe just an oversight then? You can see from either the CLI help or the GUI how deprecation is presented. Ideally the manual gets updated, but people can forget.

The main reason why the block table performs so poorly is because it is so large.

We have some way-over-recommended-block-count backups in the user community, and the thought of doing linear SQL scans through 25% of the blocks (the default compact threshold) on every new block addition sounds slow. You've seemingly looked at the waste computations a little. Anyway, testing is in order, and I agree an index could help if it's needed, so this is on the right track, and we'll see.

it is assumed to be the only place where that block is

Fix database issue where block was not recognized as being used #3801 comes to mind, although I'd have to study both a while longer. It's just a feeling so far about the perils of block tracking spread across different tables, and the assumptions one makes.

BTW I appreciate you looking at this and making good progress. I did have another test case partly written. Not sure it's needed.

@Jojo-1000
Contributor

I did have another test case partly written. Not sure it's needed.

I would love to have more test cases, also to catch old bugs that might reappear. The problem is that they already run so slowly (1 h), because it is all I/O bound.

Jojo-1000 added a commit to Jojo-1000/duplicati that referenced this issue Jun 27, 2023
This prevents duplicated blocks after a block was deleted and re-added (duplicati#4693).
Also fix RemoveMissingBlocks in LocalListBrokenFilesDatabase, which did not clear the DeletedBlock table.
Jojo-1000 added a commit to Jojo-1000/duplicati that referenced this issue Jun 27, 2023
Check that blocks which are moved are recorded for the volume to be deleted. If duplicate blocks exist and one is in the DeletedBlock table, this can erase a block entry on an unrelated volume (duplicati#4693).
@Jojo-1000 linked a pull request on Jun 27, 2023 that will close this issue
@duplicatibot

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/database-recreation-not-really-starting/16948/87
