This repository has been archived by the owner on Apr 25, 2022. It is now read-only.

Improve format autodetection #17

Closed
rr- opened this issue Apr 4, 2015 · 0 comments

rr- commented Apr 4, 2015

Right now the program just loops through all possible format readers and tries to unpack with each one of them. Exception happened? Try another reader.

This causes ugly anomalies. For example:

  • I see an XP3 file, and I know for sure it's XP3 thanks to the XP3\r\nWhatever magic number inside. But --plugin wasn't supplied, so I throw an exception, and the file is mistakenly passed on to the next reader.
  • I see an ANM archive, and everything looks fine: it "correctly" unpacks 0 files. But in reality it was an NWA sound file all along; I just didn't know, because ANM came earlier on my checklist.

This should be fixed. I could add a new exception, say RecognitionError, and catch only that, but this would do nothing to detect false positives (the second example above).

What should be done instead:

  • Add a new method, is_recognized, to every Transformer (see Add common interface for archive and converter #14)
  • Before calling unpack, loop through all Transformers
  • See how many of them recognize the file
    • If it's exactly 1, great! Call unpack on it, and call it a day.
      • In the XP3 example, the unpacker still throws an exception urging you to provide the --plugin parameter. That is fine, because the file is no longer passed to the next transformer.
    • If it's more than 1, this is good, too. I can tell the user and advise them to select the --fmt manually.
    • If it's 0, tough luck. The file is not recognized.
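
The selection logic above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: only is_recognized, unpack, --plugin, --fmt and the XP3 magic number come from this issue, while MagicTransformer, detect_and_unpack and the two exception classes are hypothetical names made up for the example.

```python
class UnrecognizedFileError(Exception):
    """Hypothetical: no transformer recognized the file."""


class AmbiguousFormatError(Exception):
    """Hypothetical: more than one transformer recognized the file."""


class MagicTransformer:
    """Toy transformer that recognizes files by a leading magic number."""

    def __init__(self, name, magic, needs_plugin=False):
        self.name = name
        self.magic = magic
        self.needs_plugin = needs_plugin

    def is_recognized(self, data):
        # Recognition must not depend on options such as --plugin.
        return data.startswith(self.magic)

    def unpack(self, data, plugin=None):
        if self.needs_plugin and plugin is None:
            raise ValueError(f"{self.name}: please provide --plugin")
        return f"unpacked as {self.name}"


def detect_and_unpack(data, transformers, fmt=None, plugin=None):
    """Pick exactly one transformer first, and only then call unpack."""
    if fmt is not None:
        # The user chose the format manually (--fmt), so skip detection.
        candidates = [t for t in transformers if t.name == fmt]
    else:
        candidates = [t for t in transformers if t.is_recognized(data)]

    if not candidates:
        raise UnrecognizedFileError("file is not recognized")
    if len(candidates) > 1:
        names = ", ".join(t.name for t in candidates)
        raise AmbiguousFormatError(f"matches {names}; select one with --fmt")

    # Errors raised by unpack now reach the user directly instead of
    # silently sending the file on to the next transformer.
    return candidates[0].unpack(data, plugin=plugin)
```

With this shape, an XP3 file without --plugin fails loudly in its own unpacker, and a file matched by two transformers is reported as ambiguous rather than handed to whichever happens to come first.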

Cons of this solution are limited and, in my opinion, negligible:

  • It doesn't account for situations where is_recognized needs --plugin in order to recognize the file reliably. In these cases we can fall back to looking at file extensions, file names, etc.
  • I need to implement is_recognized for every supported format. This is going to be tough!
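
As a rough idea of the file-extension fallback mentioned in the first point, the candidate list could be narrowed by extension when content-based recognition cannot decide. This is a sketch under assumptions: Candidate, its extensions attribute and narrow_by_extension are hypothetical, and the "anm"/"nwa" extensions in the usage note are assumed for illustration.

```python
from pathlib import Path


class Candidate:
    """Hypothetical stand-in for a Transformer with known file extensions."""

    def __init__(self, name, extensions):
        self.name = name
        self.extensions = {ext.lower() for ext in extensions}


def narrow_by_extension(path, candidates):
    """Keep only candidates whose extensions match the file name.

    If no candidate matches, the original list is returned unchanged,
    so the extension acts as a hint rather than a hard filter.
    """
    suffix = Path(path).suffix.lower().lstrip(".")
    matching = [c for c in candidates if suffix in c.extensions]
    return matching or list(candidates)
```

For instance, given candidates for "anm" and "nwa", a path such as voice/BGM01.NWA would narrow to the NWA candidate only, while data.bin would leave both in place.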
rr- pushed a commit that referenced this issue Apr 4, 2015
rr- pushed a commit that referenced this issue Apr 6, 2015
rr- pushed a commit that referenced this issue Apr 12, 2015
rr- closed this as completed in ed841aa Apr 12, 2015
rr- pushed a commit that referenced this issue Apr 12, 2015
rr- added the general label Jul 10, 2016
rr- self-assigned this Jul 10, 2016