Paint cursors update
|
@ghaerr, we switch XOR on and off on every call of |
|
Thanks for the cursors @Vutshi! I can see we might need to start a cursor collection though? IMO, the XOR cursors look pretty slick, but for outlined (normal) cursors, the small cursor seems pretty small. The large cursor looks good, but it's got a pretty short tail. Did you design that one in order to be faster than the original 16x16 cursor for display on slow 8088 systems?
Actually, I don't recommend that. We'd then be assuming before every draw routine that XOR is ON, which could get problematic. The nice thing about the current design is that the lower-level routines make no assumption about the OP_SET/OP_XOR state, which speeds them up. Microwindows, for example, has to set the OP mode before every routine, which effectively slows things down, although the reason there is that its drivers have to work with any application-level graphics context. The other reason is that changing the OP mode when moving the cursor is very fast (only 4 machine instructions!) compared with actually drawing the pixels: the drawing is ~50x slower, and the XOR state only has to change once per mouse event rather than once per pixel.
Another reason is that calls to draw_bmp would have to be modified, even for the latest "load bmp file" request, since draw_bmp would then need to know that the upper level had decided that XOR was standard. If XOR were the standard drawing mode for the low-level drawing routines themselves, it would make more sense IMO.
|
On the subject of 8088 optimization and elapsed time for, say, XOR mode on/off versus drawing a pixel, I use the following table of oft-used or very slow instructions to get an idea of where the time that really matters is going:

Note that the 4-instruction set_op macro (see vgalib.h) uses ~(4+2+2+12)=20 cycles total, whereas a PUSH/POP pair is 27, and a function call with standard overhead is 72+ cycles before doing anything. That means that just calling the drawpixel routine costs 72+ cycles, even though the drawpixel routine itself is written in ASM. I haven't yet bothered to count the drawpixel ASM cycles, but it's a lot more than 20 cycles, and it gets called ~30 times for the XOR small cursor. So we're probably talking 4500+ cycles just for the drawpixels, versus 40 to turn XOR on and then off again. The show cursor routine itself is ~320 instructions long; even at a (very low) average of 4 cycles/instruction that's 1280 cycles, plus the 4500+ cycles for the pixels.
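To make the arithmetic above easy to replay, here is a rough sketch in C. The constant and function names are made up for illustration and are not from the paint sources. The cost of the drawpixel body was not counted in the post, so it is left as a parameter; a guess of about 78 cycles is simply the value that reproduces the 4500 figure.

```c
#include <assert.h>

/* Back-of-envelope 8088 cycle budget, using only figures quoted in the post. */
enum {
    SET_OP_CYCLES       = 4 + 2 + 2 + 12, /* 4-instruction set_op macro, ~20 cycles */
    CALL_OVERHEAD       = 72,             /* function call before any work is done */
    CURSOR_PIXELS       = 30,             /* approx. drawpixel calls, small XOR cursor */
    SHOW_INSTRUCTIONS   = 320,            /* approx. length of the show cursor routine */
    AVG_CYCLES_PER_INSN = 4               /* deliberately low average */
};

/* Cost of turning XOR on and then off again around one cursor draw. */
long xor_toggle_cycles(void)
{
    return 2L * SET_OP_CYCLES;
}

/* Cost of all drawpixel calls for one cursor. body_cycles is a GUESS for the
 * uncounted ASM body of drawpixel; it is not a measured number. */
long drawpixel_cycles(long body_cycles)
{
    assert(body_cycles >= 0);
    return (long)CURSOR_PIXELS * (CALL_OVERHEAD + body_cycles);
}

/* Cost of the show cursor routine itself, excluding the drawpixel calls. */
long show_cursor_cycles(void)
{
    return (long)SHOW_INSTRUCTIONS * AVG_CYCLES_PER_INSN;
}
```

Even with a zero-cost body, the call overhead alone comes to 30 x 72 = 2160 cycles, about 54 times the 40 cycles spent toggling XOR.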
This is potentially a very good idea, although it'd only work for XOR cursors on EGA/VGA (that is, not portable!). The tricky part will be getting the rotation of the cursor bits right so that they match the X byte alignment of the cursor and display, and then doing that quickly for each Y line. But looking at the drawpixel ASM, it spends a third of the routine just figuring out the memory address of the X,Y pixel before doing anything. So it seems that a fast "mini-blit", where an aligned monochrome bitmask (in other words, an XOR cursor) is transferred to memory in a single function call, would really speed things up.

Another idea, as I'm looking at the source, would be a drawpixel routine that takes a memory address rather than an X,Y location. That would allow the show/hidecursor routines to calculate an address and mask once, then very quickly use a macro to draw the pixel, or a group of pixels within the single address being passed. Overall, a lot more could be done for speed, especially by using a few well-thought-out macros. I'll think more about it.
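For discussion, a minimal sketch of what such a mini-blit might look like in C. Everything here is an assumption: the names are hypothetical, the layout assumes 640x480 with 80 bytes per line and bit 7 as the leftmost pixel, and `plane` is a plain byte array standing in for a single bit plane. Real EGA/VGA memory would also need its register setup, which this leaves out.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_LINE 80   /* assumed: 640 pixels, 8 pixels per byte */
#define SCREEN_LINES   480  /* assumed: 640x480 planar mode */

/* Byte offset of pixel (x, y) within one bit plane. */
static inline unsigned pixel_offset(int x, int y)
{
    return (unsigned)y * BYTES_PER_LINE + ((unsigned)x >> 3);
}

/* Bit mask of pixel x within its byte; bit 7 is the leftmost pixel. */
static inline uint8_t pixel_mask(int x)
{
    return (uint8_t)(0x80u >> (x & 7));
}

/* XOR an 8-pixel-wide monochrome cursor into the plane at (x, y).
 * The address and bit shift are computed once; each row then costs at
 * most two byte-wide XORs instead of eight drawpixel calls. */
void xor_cursor8(uint8_t *plane, int x, int y, const uint8_t *rows, int height)
{
    assert(x >= 0 && x + 8 <= BYTES_PER_LINE * 8);
    assert(y >= 0 && height >= 0 && y + height <= SCREEN_LINES);

    uint8_t *p = plane + pixel_offset(x, y);  /* computed once per cursor */
    unsigned shift = (unsigned)x & 7;         /* misalignment within a byte */

    for (int i = 0; i < height; i++, p += BYTES_PER_LINE) {
        p[0] ^= (uint8_t)(rows[i] >> shift);            /* left part of the row */
        if (shift)
            p[1] ^= (uint8_t)(rows[i] << (8 - shift));  /* spill into next byte */
    }
}
```

Since XOR is its own inverse, calling `xor_cursor8` a second time with the same arguments restores the original contents, so one routine could serve for both showing and hiding the cursor.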
|
Thanks for the very useful table. I am surprised that |
XOR:
