Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend BMP format support #233

Merged
merged 8 commits into from
Jan 8, 2020
Merged

Conversation

generalmimon
Copy link
Member

Current BMP spec in formats repository is somewhat rudimentary, it is not very much usable. I tried to improve that.

Every BMP file contains a DIB header (bitmap information header) part. There are several different variants. Every of them has a header_size field in the beginning, which is used for determining which header is present in the image.

The existing implementation advised to switch over this field and select appropriate type representing the particular header. Although it seems to make sense, it complicates the access to common properties (e.g. width and height). To access width in the header, you would have to do this:

instances:
  image_width:
    value: >-
      len_dib_header == header_type::bitmap_core_header.to_i ?
        header.dib_header.as<bitmap_core_header>.width :
        len_dib_header == header_type::bitmap_info_header.to_i ?
          header.dib_header.as<bitmap_info_header>.width :
          len_dib_header == header_type::bitmap_v4_header.to_i ?
            header.dib_header.as<bitmap_v4_header>.width :
            ... ?
              ... :
              0

The headers don't contain completely distinct fields, they are extending each other. The new implementation takes advantage of this: instead of rebuilding each header from nothing, it consists of several extensions that form together the required header. For example, if the present header is of type BITMAPV4HEADER, it will contain the fields from BITMAPCOREHEADER (these are inlined in bitmap_header type), bitmap_info_extension and bitmap_v4_extension.

Here is a diagram which shows supported DIB headers and their KSY type counterpart. There is a dashed line from BITMAPCOREHEADER to BITMAPINFOHEADER, because BITMAPCOREHEADER uses 16-bit integers for image width and height, while BITMAPINFOHEADER and all headers extending it use 32-bit integers (that's where the "+ 4" came from).
The DIB headers diagram

@generalmimon generalmimon requested a review from a team December 15, 2019 21:49
@generalmimon
Copy link
Member Author

I wanted to try the updated BMP spec in some real application, so I created BMP Tool, which is a debugging tool for BMP images. Currently it can only convert BMP to PNG (with some additional dumps to the screen), but I intend it to also throw some warnings and repair ill-formed files.

I made a rigorous testing against all images from BMP Suite, which is an excellent collection of samples which test reading support of BMP files quite in depth. Happy to announce that there are only 4 files whose parsing failed (all from the folder marked as "clearly invalid"), all good and questionable samples are passing. You can actually run the tests yourself, I made it quite straightforward, just install BMP Tool and follow the steps in Testing section of README.

I did my best to produce a spec to the format gallery that will be easy to use as much as possible, ensuring high format compliance with minimal effort. I think it's hard to find a BMP parser implementation that will excel more.

What do you think?

image/bmp.ksy Show resolved Hide resolved
image/bmp.ksy Show resolved Hide resolved
image/bmp.ksy Outdated Show resolved Hide resolved
image/bmp.ksy Outdated Show resolved Hide resolved
- OS21XBITMAPHEADER
doc-ref:
- https://docs.microsoft.com/cs-cz/windows/win32/api/wingdi/ns-wingdi-bitmapcoreheader
- https://www.fileformat.info/format/os2bmp/egff.htm#OS2BMP-DMYID.3.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array in doc-ref is KS 0.9+ feature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I haven't realized it. I thought about throwing the lists in doc-ref out to make it compatible with 0.8, but I anyway have to use .as<f4> (which is a 0.9 feature I suppose, doesn't work even in devel Web IDE) in fixed_point_* decimal types and I unfortunately haven't figured out how to do it in 0.8. So I updated the ks-version to 0.9.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is another great point: we currently don't have a solid way to check these PRs before merging them. Implementing CI-style checks that will clearly show that this is not ok due to (following error messages) would be great...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and I completely agree, there's little point in clinging to 0.8. 0.9 is so much more advanced...

image/bmp.ksy Outdated Show resolved Hide resolved
image/bmp.ksy Outdated Show resolved Hide resolved
@GreyCat
Copy link
Member

GreyCat commented Jan 3, 2020

Thanks for this tremendous effort and, again, sorry for taking so long to review it. I've added some minor comments — please take a look? Otherwise, I think it's ready to be merged in.

@@ -16,7 +16,7 @@ meta:
wikidata: Q192869
endian: le
license: CC0-1.0
ks-version: 0.8
ks-version: 0.9
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this change makes it slightly more complicated: if we'll merge it as is, it will fail compilation on https://formats.kaitai.io/ at all, as it currently only fetches "latest stable" (v0.8) to use.

Ideal solution would be (finally) releasing 0.9 to make it "officially stable", and continue using that — otherwise, we'll need to have something clever like bringing multiple ksc versions into build...

@GreyCat
Copy link
Member

GreyCat commented Jan 8, 2020

Ok, so it turned out that we have this problem with 0.9 on major scale already, so one more format with version set to 0.9 doesn't make a difference — we'll need to solve it on a higher level.

Merging in, thanks for this massive effort again!

@GreyCat GreyCat merged commit 02be57c into kaitai-io:master Jan 8, 2020
@generalmimon generalmimon deleted the bmp-format branch January 29, 2020 16:48
generalmimon added a commit to generalmimon/bmptool that referenced this pull request Apr 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants