
Progress reporting makes large jumps, doesn't make it to 100%, on large multi-volume archives #67

Closed
skito opened this issue Dec 5, 2017 · 12 comments
skito commented Dec 5, 2017

Hi there,

There is an issue when observing progress on multi-volume archives. For example:
override func observeValue(forKeyPath keyPath: String?, of object: Any?, change: [NSKeyValueChangeKey: Any]?, context: UnsafeMutableRawPointer?) {
    print("Extracting: \(progress.fractionCompleted)")
}
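
The observer itself is attached roughly like this (a simplified sketch, not my exact project code; it relies on the implicit NSProgress parent/child mechanism and UnrarKit's extractFilesTo:overwrite:error: call, and destinationPath is a placeholder):

// Simplified setup sketch; `self` is the object containing the
// observeValue(forKeyPath:...) override shown above.
let archive = try URKArchive(url: archiveURL)

// While this parent Progress is "current", the library's own progress
// attaches to it as a child, so fractionCompleted can be observed via KVO.
let progress = Progress(totalUnitCount: 1)
progress.addObserver(self, forKeyPath: "fractionCompleted", options: [.new], context: nil)

progress.becomeCurrent(withPendingUnitCount: 1)
try archive.extractFiles(to: destinationPath, overwrite: true)
progress.resignCurrent()

progress.removeObserver(self, forKeyPath: "fractionCompleted")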

On a single-volume archive of around 50 MB, this produces:

Extracting: 8.08055588766906
Extracting: 15.2171524119645
Extracting: 25.2250117802612
Extracting: 32.4267074345756
Extracting: 40.5386480479931
Extracting: 49.5652731965651
Extracting: 57.9420499061096
Extracting: 65.1048882809258
Extracting: 75.3979794214661
Extracting: 82.5477628053197
Extracting: 90.9352648273049
Extracting: 100.0

which is perfect, but on a 2.4 GB multi-volume archive (3 volumes × 800 MB each) the result is only:

Extracting: 26.7179072845146
Extracting: 37.7852841529133

which causes a long progress freeze followed by an abrupt finish.

Could you check it, please?

P.S. I'm on v2.9-beta8.

Thanks,
Dimitar

@abbeycode (Owner)

@skito Hmm, interesting. Are you able to share a Dropbox link (or equivalent) to the archive for me to test with? I don't have any archives like that. You can DM me with it on Twitter (@DovFrankel).

abbeycode self-assigned this Dec 7, 2017
abbeycode added the bug label Dec 7, 2017
skito commented Dec 8, 2017

@abbeycode Sure. The data in the archive above is confidential, but here is a similar archive that behaves the same way:
https://my.pcloud.com/publink/show?code=kZJOyJ7Z1xpp9zAH5tkGuTQTITqLhfKDF727

My output:
Start extraction
Extracting: 7.02339954843241e-05
Extracting: 7.09789539958747e-05
Extracting: 0.0770643468305657
Extracting: 0.0771007285253158 - immediately after start
Extracting: 25.0103128331547
Extracting: 25.0182603104251 - 2 minutes after start (approximately)
Extracting: 50.0425241028979 - 4 minutes after start (approximately)
Extraction finished - 8-10 minutes after start (approximately)

Let me know if you have difficulties replicating the issue.

skito commented Dec 15, 2017

Also, the .uncompressedSize property of URKArchive doesn't seem to be correct on multi-volume archives.

A multi-volume archive whose uncompressed contents total 3.18 GB is reported as 7.7 GB when accessing .uncompressedSize before starting the extraction.

@abbeycode (Owner)

@skito The uncompressedSize is another interesting one. Would you mind reporting it as a separate issue?

abbeycode added a commit that referenced this issue Dec 24, 2017
@abbeycode (Owner)

Alright, so I was able to reproduce with the archive you linked to. I added some additional diagnostic logging. These are the two things I notice:

  1. The jumps in progress you mentioned
  2. Progress ends at 50% or so, rather than 100%

The first item is expected behavior. When you extract files, the progress increments as each file completes. The archive you sent has a couple of large files, and the rest are tiny. That's why you see those jumps. If it's important to you to see more granular updates, you can extract the files yourself by listing them, extracting each to a buffer, and writing to disk in your own code, so you'll see progress as each block completes.

The second one isn't expected, but as far as I can tell, the archive header is reporting inaccurate metadata (a total uncompressed archive size of 5.38 GB instead of the 2.89 GB it actually is). That's the same issue you mentioned above, but I'm not yet sure what's causing it. I'll have to see which header(s) the total uncompressed size comes from, and check whether the archive you sent has correct data stored. If not, then I'll have to fix it on my end.

abbeycode added a commit that referenced this issue Dec 26, 2017
…archives, which resulted in incorrect total uncompressed size being reported. It looks like the file that spans multiple archive parts would get reported for each of the parts that contained it (Issue #67)
@abbeycode (Owner)

Looking into it, there was definitely a bug that resulted in an incorrect total uncompressed size being reported: files that spanned multiple volumes would get listed twice by -listFileInfo:. I added uniquing of the paths reported by -listFileInfo:, which resolves the second issue above, so no need to report it separately. As for the sparse progress updates, I don't think there's anything for me to do there, as explained above. I'll include this update in the next beta.
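
In the meantime, if you need an accurate total before the next beta ships, you can approximate the same uniquing on the calling side. Here's a rough, untested sketch (it assumes the filename and uncompressedSize properties on the URKFileInfo objects returned by listFileInfo()):

// Client-side approximation of the fix: de-duplicate the listing by path
// before summing, since a file spanning several volumes gets reported once
// for each volume that contains it.
let fileInfos = (try? archive.listFileInfo()) ?? []

var seenPaths = Set<String>()
var totalUncompressed: Int64 = 0

for info in fileInfos where seenPaths.insert(info.filename).inserted {
    totalUncompressed += info.uncompressedSize
}

print("Total uncompressed size: \(totalUncompressed) bytes")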

skito commented Dec 26, 2017

Thanks for the diagnostics!

  1. About the progress - is there some sort of callback function that I can pass from Swift to handle the progress manually, as you suggested? If so, could you provide a very basic example of it?

  2. About the uncompressedSize property - it's possible that the headers are wrong, but I noticed this issue with several multi-volume archives compressed with different software on both Mac and Windows. All of them reported about twice their actual size. Maybe it's due to double counting somewhere while reading the headers? Do you still want me to report it as a separate issue?

skito commented Dec 26, 2017

Oh, alright then. Could you please provide a basic example of handling the progress blocks?

abbeycode added a commit that referenced this issue Dec 26, 2017
skito commented Dec 26, 2017

I just saw that you already provided the example in the README:

BOOL success = [archive extractBufferedDataFromFile:@"a file in the archive.jpg"
                                              error:&error
                                             action:^(NSData *dataChunk, CGFloat percentDecompressed) {
    NSLog(@"Decompressed: %f%%", percentDecompressed);
    // Do something with the NSData chunk
}];

Sorry for the silly request :)

@abbeycode (Owner)

No problem :) This is what I had in mind. It compiles, but I haven't tested it:

let archiveURL: URL = // URL of the archive to extract
let outputDirURL: URL = // directory to extract into

guard let archive = try? URKArchive(url: archiveURL) else {
    return
}

guard let fileInfos = try? archive.listFileInfo() else {
    return
}

let totalArchiveSize = archive.uncompressedSize!.int64Value
var totalExtracted = Int64(0)

for fileInfo in fileInfos {
    let fileURL = outputDirURL.appendingPathComponent(fileInfo.filename)

    // FileHandle(forWritingTo:) requires the file to exist, so create it first
    guard FileManager.default.createFile(atPath: fileURL.path, contents: nil),
          let fileHandle = try? FileHandle(forWritingTo: fileURL) else {
        continue
    }
    
    defer {
        fileHandle.closeFile()
    }
    
    do {
        // Extract this file in chunks, writing each chunk to disk as it arrives
        try archive.extractBufferedData(fromFile: fileInfo.filename) { (data, progress) in
            fileHandle.write(data)
            totalExtracted += Int64(data.count)
            
            NSLog("%f%% done with \(fileInfo.filename)", progress)
            NSLog("%f%% done with archive", 100 * Double(totalExtracted) / Double(totalArchiveSize))
        }
    } catch let extractError {
        NSLog("Error extracting \(fileInfo.filename): \(extractError)")
        continue
    }
}

abbeycode changed the title from "Strange NSProgress observe behaviour on multivolume archives" to "Progress reporting makes large jumps, doesn't make it to 100%, on large multi-volume archives" Dec 26, 2017
skito commented Dec 26, 2017

Awesome. Thanks!

@abbeycode (Owner)

I'm closing this, since the fix was merged in for the v2.9 release. Look for it in the next beta!
