Vutshi commented Apr 25, 2025
- Fix imagesize and filesize in BMP header
- Implement 4bpp BMP file support for drawing and saving
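For readers following along, here is a minimal sketch of the size arithmetic and nibble packing that a 4bpp BMP involves. It is not the code from this PR: the function and macro names are hypothetical, and it assumes the standard 14-byte file header, the 40-byte info header, and a full palette of 4 bytes per entry.

```c
#include <stdint.h>

/* Hypothetical names for illustration; sizes are those of the standard
 * BITMAPFILEHEADER (14 bytes) and BITMAPINFOHEADER (40 bytes). */
#define BMP_FILE_HDR 14
#define BMP_INFO_HDR 40

/* Bytes per row, padded up to a multiple of 4 as the BMP format requires. */
static uint32_t bmp_row_bytes(uint16_t width, uint16_t bpp)
{
    return (((uint32_t)width * bpp + 31) / 32) * 4;
}

/* Image size field: padded row size times the number of rows. */
static uint32_t bmp_image_size(uint16_t width, uint16_t height, uint16_t bpp)
{
    return bmp_row_bytes(width, bpp) * height;
}

/* File size field: headers + palette + pixel data.
 * Assumes a full palette of 2^bpp entries, 4 bytes each (bpp <= 8). */
static uint32_t bmp_file_size(uint16_t width, uint16_t height, uint16_t bpp)
{
    uint32_t palette = (uint32_t)4 << bpp;
    return BMP_FILE_HDR + BMP_INFO_HDR + palette
         + bmp_image_size(width, height, bpp);
}

/* Store a 4-bit palette index for pixel x in a 4bpp row:
 * even x goes in the high nibble, odd x in the low nibble. */
static void bmp4_put_pixel(uint8_t *row, uint16_t x, uint8_t color)
{
    uint8_t *p = &row[x >> 1];
    if (x & 1)
        *p = (uint8_t)((*p & 0xF0) | (color & 0x0F));
    else
        *p = (uint8_t)((*p & 0x0F) | ((color & 0x0F) << 4));
}
```

As an illustration only (the actual canvas size is not stated in this thread), a 640x480 image comes to 153718 bytes at 4bpp versus 308278 bytes at 8bpp under these assumptions, which is roughly the halving one would expect.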
This is great @Vutshi, thanks! Are you now seeing half the size for export.bmp using 4BPP, in the ~115k range or so? BTW, there are a couple cases where your use of Thank you!
Yes, I did it because 8BPP didn’t fit into my fdd1440. RLE is next.
Yes, please. I’m doing things mostly blindly, iterating until the compiler is happy and it works :)
Hello @Vutshi,
Well, after looking more closely, it seems both the IA16 and OWC compilers optimized away my concerns. However, the C86 compiler doesn't, for what that's worth. The couple of issues I noticed were:

In this case, there's no need to explicitly cast the three operands, since there's no chance of overflow using 16-bit addition here before the result is stored into long hdrsize. While both IA16 and OWC recognized this and didn't generate long add code, C86 doesn't, and lots of extra code is generated. There isn't an issue with loading a 16-bit result into a long (hdrsize).

In this case, the IA16 compiler was smart enough to realize that hdrsize itself doesn't need to be declared long, and doesn't allocate space for it. It doesn't need to be long just because bmpf.fBoffBits is DWORD. (The previous draw_bmp routine declared hdrsize DWORD (unsigned long) because it executes an

The other case was:

Here, there is the possibility of 16-bit multiply overflow, so one of the operands does need to be cast (long). In C, when one operand has a higher bit-width than the other, the other operand is automatically converted up to the higher bit-width. There are sometimes cases where the compiler can generate better code multiplying 16x32 instead of 32x32 if the second (long) were removed, but this isn't one of them.

Both IA16 and OWC generate right shifts for the /32 and left shifts for the * 4 here, which is fast. I haven't yet checked what C86 does when optimizing constant DIVs and MULs. (I can't yet easily check C86 output since I have not yet completed our disassembler for the .o object format output by the C86 assembler AS86 - coming.) So sometimes it is better to use left/right shifts instead of multiply/divide. (This gets complicated, as we found recently that IA16 was converting two left shifts back to MUL * 80 with -Os specified, which is why we changed here to -O2.)
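To make the two cast cases above concrete, here is a small sketch. It is not the code from the PR: the function names are hypothetical, and because `int` is 32 bits on most hosts, the 16-bit wraparound of an 8086 target is emulated by truncating to `uint16_t` explicitly.

```c
#include <stdint.h>

/* 16-bit and 32-bit types, standing in for 'unsigned int' and
 * 'unsigned long' on a 16-bit target. */
typedef uint16_t u16;
typedef uint32_t u32;

/* Header size: three small values whose sum stays far below 65536, so a
 * 16-bit add is enough and the result widens safely on assignment. */
static u32 header_size(u16 file_hdr, u16 info_hdr, u16 palette)
{
    u32 hdrsize = (u16)(file_hdr + info_hdr + palette); /* no long casts */
    return hdrsize;
}

/* Row bytes WITHOUT a widening cast: the product is truncated to 16 bits
 * to emulate what a 16-bit multiply would leave behind on overflow. */
static u32 row_bytes_no_cast(u16 width, u16 bpp)
{
    u16 bits = (u16)((u32)width * bpp);   /* wraps modulo 65536 */
    return ((u32)bits + 31) / 32 * 4;
}

/* Row bytes WITH one operand cast: the other operand is converted up
 * automatically, so the multiply is done in 32 bits and cannot wrap here. */
static u32 row_bytes_cast(u16 width, u16 bpp)
{
    return ((u32)width * bpp + 31) / 32 * 4;
}
```

With a hypothetical 4096-pixel-wide, 24bpp row, the uncast version yields 4096 bytes while the cast version yields the correct 12288, whereas the header sum of 14 + 40 + 64 is 118 either way.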
If a right shift and left shift were coded here, it might be possible to change that combo to a single right shift, with an appropriate comment, although it is possible the IA16 and OWC (but not C86) compilers are already optimizing that. Phew! :)

Anyways, your code looks great. In general, my recommendation is to use casts only when needed, as explicit casts can cause problems later or with other compilers. The compilers can figure out what to do with multiple operand sizes and/or assignments, but don't always detect overflow. In general, it looks like our IA16 and OWC compilers are pretty smart, while C86 is not as much. I just happen to be compiling Paint with C86 to keep an eye on C86 code generation, so am being a bit more watchful. Thank you!
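On the shift combination mentioned above, a quick sketch of what a single-shift form could look like. The helper names are hypothetical and this is only one way to write it: since /32 followed by * 4 discards the low bits before scaling back up, a single right shift by 3 needs a mask on the low two bits to give the same result.

```c
#include <stdint.h>

/* Two shifts: the direct translation of (bits + 31) / 32 * 4. */
static uint32_t pad_two_shifts(uint32_t bits)
{
    return ((bits + 31) >> 5) << 2;
}

/* One shift plus a mask: shifting right by 3 divides by 8, and clearing
 * the low two bits rounds that down to a multiple of 4, which matches
 * the two-shift form above. */
static uint32_t pad_one_shift(uint32_t bits)
{
    return ((bits + 31) >> 3) & ~(uint32_t)3;
}
```

For example, a 5-pixel row at 4bpp is 20 bits; both functions return 4 bytes, whereas a bare `(20 + 31) >> 3` without the mask would give 6.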