fmt/468 definition not matching valid files #82
Comments
Thanks for raising this Ross, and apologies that it has taken some time to come back on this. The ISOs which are not identified as fmt/468 have an additional occurrence of the character sequence CD001 after the first occurrence at 32769 and before the ÿCD001 (in all the real-world examples I've looked at, this additional occurrence is at offset 34817). Overwriting this second occurrence with alternate bytes will result in the ISOs being identified in DROID. Conversely, editing an ISO which is identified correctly to add a further CD001 after the first one and before the ÿCD001 will result in the file no longer being identified as fmt/468 in DROID. As a short-term fix, changing the signature's maximum sequence offset (i.e. signature ID 730, SubSeqMaxOffset="34817") will result in the offending ISOs being identified. However, it would of course also allow malformed ISOs to be identified as false positives, since it would not enforce the required occurrence at 32769, so it is not a long-term solution. Rewriting the signature more thoroughly is another option. However, the way DROID processes left (and right) fragments in relation to sequence offsets comes into play here and may be relevant in other edge cases we've seen; investigation continues.
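For anyone wanting to check their own ISOs for the extra CD001 described above, here is a minimal sketch in Python. It is not part of DROID; the function name `find_cd001_offsets` and the synthetic test image are assumptions made purely for illustration. It lists every CD001 that appears before the first ÿCD001 (byte 0xFF followed by CD001).

```python
def find_cd001_offsets(data: bytes,
                       marker: bytes = b"CD001",
                       terminator: bytes = b"\xffCD001"):
    """Return (offsets, terminator_offset).

    offsets: start offsets of every `marker` found before the first
             `terminator` (or in the whole buffer if there is none).
    terminator_offset: offset of the first `terminator`, or None.
    Hypothetical helper for illustration only, not DROID code.
    """
    term = data.find(terminator)
    end = term if term != -1 else len(data)

    offsets = []
    pos = data.find(marker, 0, end)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1, end)
    return offsets, (term if term != -1 else None)


# Synthetic stand-in for an affected ISO (an assumption, not a real file):
# CD001 at 32769 and 34817, with the ÿCD001 placed after both.
image = bytearray(40960)
image[32769:32774] = b"CD001"
image[34817:34822] = b"CD001"
image[36864:36870] = b"\xffCD001"

offsets, terminator_offset = find_cd001_offsets(bytes(image))
print(offsets, terminator_offset)
```

On a real file, read the leading bytes with `open(path, "rb").read(...)` and pass them in. More than one offset in the returned list would suggest the file is of the kind that is currently not identified.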
This appears to be fixed in the build at:
Hi Brian, I have issues building from source because of the various unit test failures previously reported, so if you have a way of providing a test release of this version of DROID, I could do some user testing if you feel it might be valuable. Cheers, Ross
Hi Ross
Thanks for sending the JAR Brian. Just letting you know it was blocked by my work's mail server. If you have my Gmail address (Alex or David might) then please feel free to send it there. What's your deadline for testing/release on this? Also, does the fix take care of the right fragment issue reported for some signature definitions such as x-fmt/263?
Hi Ross, Having issues sending it directly to your Gmail too, so have sent an invite to my G-Drive to both your work and personal addresses. Let me know of any problems accessing it. David
Cheers David, have managed to access it, but will also download it at work in the morning. Tried to download a new copy of DROID on a recent system to test it tonight but experienced unusual download speeds... Something on your side? (30mbs in 20 mins) Thanks for the help! Hope to be able to feed back to you next week.
Seems okay here right now (~30 secs, both internal and external, plus mobile connections). Is the slowness persisting? I'll flag with our Web Engineers...
After all that, I've found this fix doesn't work in cases (unlike the present one) where there is more than one left fragment - so unfortunately the revised code will need some further reworking (and could be made more efficient anyway). We don't seem to have quite the same issue with right fragments, possibly a variation on a theme in such cases, needs further investigation...
Thanks both! - I'll hold off testing for now. I guess it was a good exercise nonetheless!! (I hope not too bothersome :) ) David - download from my work was just over a minute so will check what's going on at home. Thank you nonetheless.
As I suspected, this is indeed related to an issue we've found where intermediate left/right fragments with variable offsets occur more than once within their valid offset range - leading to subsequent fragments not matching as expected.
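To illustrate the kind of failure described here, the following is a toy sketch in Python. It is not DROID's matching code; the function names, the `gap` parameter and the sample bytes are all assumptions for illustration. It contrasts a matcher that commits to the first occurrence of a variable-offset fragment with one that tries every occurrence inside the allowed range.

```python
def occurrences(data: bytes, pattern: bytes, lo: int, hi: int):
    """Every start offset of `pattern` with lo <= offset <= hi."""
    found = []
    pos = data.find(pattern, lo)
    while pos != -1 and pos <= hi:
        found.append(pos)
        pos = data.find(pattern, pos + 1)
    return found


def match_first_only(data, frag, lo, hi, next_frag, gap):
    """Toy matcher: commit to the FIRST occurrence of `frag` in range,
    then require `next_frag` exactly `gap` bytes after it."""
    hits = occurrences(data, frag, lo, hi)
    if not hits:
        return False
    return data.startswith(next_frag, hits[0] + len(frag) + gap)


def match_any_occurrence(data, frag, lo, hi, next_frag, gap):
    """Toy matcher: accept if ANY in-range occurrence of `frag`
    is followed by `next_frag` at the required distance."""
    return any(
        data.startswith(next_frag, hit + len(frag) + gap)
        for hit in occurrences(data, frag, lo, hi)
    )


# Hypothetical data: the fragment b"AB" occurs twice in its valid range,
# but only the second occurrence is followed by b"YZ".
sample = b"xxABxxABYZ"
print(match_first_only(sample, b"AB", 0, 8, b"YZ", 0))      # False
print(match_any_occurrence(sample, b"AB", 0, 8, b"YZ", 0))  # True
```

In this toy model the first-occurrence matcher rejects data that a matcher considering every candidate position accepts, which is the same shape of problem as a repeated fragment causing later fragments not to match.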
Fixed in 6.3
DROID: 6.1.5
Sig: v84
Container: 20160112
I have a set of ISOs created using different tools that aren't being matched by the current PRONOM definition.
The internal structure suggests that they should be matching the signature definition. The output from Siegfried also shows that this should work:
The distance between both signatures should be 4095 in each of these instances which would fit into the PRONOM definition:
Attached is a set of skeleton files created from the ISOs that I am looking at.
Also attached is my standard skeleton file, which unfortunately does not suffer from the same issue. Please note that the attached files are okay to be included in any future test suite you create, depending on the resolution of this issue.
isos.zip