Journal

I plan to follow Maxim's tutorial. It's going to be a messy stream-of-consciousness type of writing, documenting my progress, pain points and victories alike.

WLA-DX

I downloaded the latest source code for WLA-DX. After reviewing it, I saw that it uses CMake. I downloaded the current Windows win64-x64 installer and installed CMake. In CMake GUI, I pointed to the wla-dx folder for source and wla-dx\binaries for build files. I generated a Visual Studio 14 (2015) x86 solution, which compiled fine. I dropped the release build of the executables in a folder that's listed in the PATH variable.

MEKA

MSVCR71.dll

I downloaded last official release: Meka 0.73 from 2010. Meka Windows build doesn't work, it is missing MSVCR71.dll. 71 is a reference to Visual C++ 7.1, which is part of Visual Studio .NET 2003. I hope MSVCR71.dll is part of .NET Framework 1.1 redistributable. Downloaded it, got dotnetfx.exe file (24,265,736 bytes). I don't think you can install this ancient software on Windows 10 x64. I ran it with question mark parameter: dotnetfx.exe /?. This popped a dialog, telling me I can use /Q for quiet mode, /T:full-path to use explicit temporary folder, /C to extract only and /C:cmd to override default install command. I ran dotnetfx.exe /T:%USERPROFILE%\Desktop\dotnet11 /C. This created 5 files in dotnet11 folder on my desktop: InstMsi.exe, InstMsiW.exe, install.exe, netfx.msi and netfx1.cab. Next, I ran admin install on msi file to extract compressed files within: msiexec /A netfx.msi TARGETDIR=%USERPROFILE%\Desktop\dotnet11-admin /Qb. I then found MSVCR71.dll in dotnet11-admin\Win\Microsoft.NET\Framework\URTInstallPath folder. After copying MSVCR71.dll to folder with mekaw.exe, it launched.

Initial setup

On first launch Meka prompted me for initial setup:

Meka for Windows initial setup screenshot

I clicked on checkbox to enable debugger and left other options as-is. Meka launched, and to my mild surprise, it turned out to be a full screen program.

Building Meka from source on Windows

There is a more recent version on Omar Cornut's Meka GitHub project. For whatever reason, it's enormous: ~70 MB (◔_◔). Cloning it in GitHub desktop client takes forever. Ok, cloning is done, and it contains huge history: ~50 MB in .git folder. It also contains a bunch of large and totally unnecessary binary files:

binary file size in bytes
allegro-msvc2013-x86-5.1.12.zip 11,590,153
allegro_deps-msvc2013-x86-1.2.0.zip 2,851,056
nasmw.exe 324,096
upx.exe 295,936
zip.exe 135,168

This could probably be fixed by carefully editing .git history. Maybe I should try it, just for fun.

I only have access to Visual Studio 2015 at this time. I opened meka\meka\srcs\projects\msvc\Meka.sln solution. Visual Studio did some conversion activity and opened solution successfully.

First attempt at compiling failed due to missing Allegro library files:

missing libs/allegro5/allegro.h

I unzipped meka\meka\libs\allegro-msvc2013-x86-5.1.12.zip and meka\meka\libs\allegro_deps-msvc2013-x86-1.2.0.zip. This fixed first build issue.

Second build issue is missing hq2x16.obj:

missing hq2x16.obj

I tried to just kludge through by running bundled meka\meka\libs\nasm\nasmw.exe meka\meka\srcs\hq2x16.asm. This didn't work, throwing tons of hq2x16.asm:NNN: error: binary output format does not support external references, where NNN are line numbers. Looking at the makefile rule for asm files shows that they are built with this command: $(ASM) $(DEF_OS) -f $(OTYPE) $< -o $@, where ASM = nasm, DEF_OS=-DARCH_WIN32 and OTYPE=win32. A brief web search revealed that $< stands for the source file and $@ for the generated file.

I ran the following command: meka\meka\libs\nasm\nasmw.exe -DARCH_WIN32 -f win32 meka\meka\srcs\hq2x16.asm -o meka\meka\srcs\hq2x16.obj. It worked, producing hq2x16.obj. I tried to plant hq2x16.obj in Release folder, but Visual Studio solution rebuild still failed the same. I think Visual Studio removes all files in build folders as part of build.

I searched in Visual Studio project files for hq2x16 and found a reference in meka\meka\srcs\projects\msvc\Meka.vcproj. Here is the custom build rule command-line: CommandLine="&quot;..\..\..\libs\nasm\nasmw.exe&quot; -O1 -DWIN32 -f win32 &quot;$(InputPath)&quot; -o &quot;$(OutDir)/$(InputName).obj&quot;&#x0D;&#x0A;"

I found hq2x16.asm file in Visual Studio Solution Explorer:

hq2x16.asm in Solution Explorer

I right-clicked and selected Properties, this opened custom build dialog:

hq2x16.asm custom build dialog

I modified custom build rule from: "..\..\..\libs\nasm\nasmw.exe" -O1 -DWIN32 -f win32 "%(FullPath)" -o "$(OutDir)%(Filename).obj" to notepad.exe %(FullPath), hoping that this would open notepad with the asm file in it. This didn't happen on solution rebuild, but it does happen if I right-click on hq2x16.asm in Solution Explorer and choose Compile. I replaced the nasm executable with my troubleshooting program printargs and got this on right-click Compile:

printargs output when it's called as nasmw.exe

I restored the proper nasm executable and ran right-click Compile, and it produced meka\meka\objs\meka\Release\hq2x16.obj. This means the custom build rule is fine as is, but for some reason the solution and project don't attempt to build hq2x16.asm.

Original project file in github master is from Visual Studio 2008. I wonder if conversion to 2015 breaks it somehow. A good test would be to install Visual Studio 2008 on my Windows XP laptop and try to build the solution.

I'll pause the build-from-source effort for now. I may attempt to create a fresh project and solution from scratch later. Even better would be to generate project and solution using CMake.

Documentation

Tutorial points to Charles MacDonald's web site that used to be at cgfm2.emuviews.com. At some point in early 2015 it moved to dreamjam.co.uk/emuviews/. I downloaded msvdp.txt, sc3000h.txt, smsmap.txt and smstech.txt. Sadly links to zip files with sample code are broken and I can't find any mirrors.

I also got richard.txt and z80cpu_um.pdf from Maxim's tutorial's page and WLA-DX manual, originally from Ville Helin's site.

Tools

In addition to WLA-DX, Maxim's tutorial recommends getting frhed hex file editor and Maxim's bmp2tile graphics conversion tool.

Setup

Ok, now I'm on the next page of Maxim's tutorial. He's talking about his setup with some text editor called ConTEXT. I don't really want to use yet another editor. My personal preference is Far Manager 3 built-in editor. I did take his Compile.bat file for reference, originally from his WLA-DX binary build.

First program

Maxim provides a "Hello, World!" program source code. I renamed "Hello World.asm" to ex1.asm. I then tried to use Compile.bat as-is, but ran into issues. The biggest is that wlalink.exe v5.8b doesn't understand combined options such as wlalink.exe -drvs linkfile.tmp ex1.sms, which is what Maxim's Compile.bat uses. It's fine if you specify options separately: wlalink.exe -d -r -v -s linkfile.tmp ex1.sms. I modified Compile.bat and renamed it to build.bat, because that is what I normally call simple build scripts on Windows. I added a call to mekaw.exe on successful build. The example worked fine, thank you Maxim!

"Hello, World!" program output in Meka

Meka is a very unusual Windows application, owing to its use of Allegro 5.1 game library. I discovered that ESC switches between full-screen and windowed mode. Pressing Alt-X quits Meka, just like in 1987-1992 Borland IDEs! To add a tool window, you have to choose it from top menu Tools. To hide a tool window, you have to choose it from the menu again! Clicking on little star in top-right corner doesn't remove the window, even though I expected it would. After some mucking around, I modified mekaw.cfg: changed start_in_gui = 1 and bios_logo = 1 to start_in_gui = 0 and bios_logo = 0. Also changed gui_video_mode = 640x480 to gui_video_mode = 1920x1200 (my display's native resolution). It made it a bit faster to load, which is great if you constantly launch it as part of build script.

There is one very annoying feature that I didn't find a way to fix easily: if I start in windowed mode and then press ESC to switch to full-screen mode, Meka does that several seconds long animation of menu items from right to left. I really wish there was an easy to use setting to disable that.

At this point, I assembled and ran an assembly program for Sega Master System! Now, onto the next page of Maxim's tutorial.

Comments

WLA-DX takes both traditional assembler ; single-line and C-style /* */ multi-line comments:

; this is a single-line comment
	ld a,0	; another single-line comment, this time after assembly instruction

/*
	Hello there!
	This is a C-style multi-line comment.
	Thank you for your time. Goodbye!
 */

Numbers

WLA-DX understands binary numbers, written with a % prefix. For example 21 is 16+4+1, so it can be written as %00010101. So ld a,21 and ld a,%00010101 are equivalent.

I have to check if it supports bitwise representation for 16-bit values. Can I write ld bc,%0000000111110100 instead of ld bc,500?

WLA-DX will obviously work with hex numbers. Canonical form is with $ prefix. For example ld bc,500 is equivalent to ld bc,$01f4. Supposedly WLA-DX will also take hexadecimal numbers in several other syntactic conventions: $01f4 is the same as 0x1f4 (C-style), &1f4 (huh?) and 1f4h (tasm/masm style). Surprisingly no octal support as in C, not that I've seen anyone using it.
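
The arithmetic behind these notations is easy to double-check outside the assembler. A minimal sketch in Python, where the 0b and 0x prefixes stand in for WLA-DX's % and $ (the helper name to_wla_binary is made up for illustration):

```python
# Python literals standing in for WLA-DX number syntax (illustration only).
binary_21 = 0b00010101            # WLA-DX: %00010101
binary_500 = 0b0000000111110100   # WLA-DX: %0000000111110100
hex_500 = 0x01F4                  # WLA-DX: $01f4

def to_wla_binary(value, bits=8):
    """Format a value as a %-prefixed binary string (hypothetical helper)."""
    return "%" + format(value, "0{}b".format(bits))

print(binary_21, binary_500, hex_500)
print(to_wla_binary(21), to_wla_binary(500, 16))
```

This only confirms that the 8-bit and 16-bit bit patterns mean the same values; whether WLA-DX accepts a 16-bit % literal still has to be tried in the assembler itself.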

Directives

Here are some of WLA-DX supported directives:

Memory layout

directive description
.org memory offset?
.bank same as offset, but at page border?
.memorymap no idea...
.rombankmap ditto...
.section still no idea...
.slot nope...
.ramsection something about ram organization and debug symbols

Data definition

directive description
.db unsigned byte, i.e. .db $ff
.dsb define storage bytes: fills a run of bytes with one value, e.g. .dsb 256 $10 writes 256 bytes of $10
.dw unsigned word, e.g. .dw $ffff
.dsw same as .dsb, but fills with words
.dbsin probably takes argument and stores its sin(arg)
.dbcos see above, but cos(arg)
.dbrnd define byte with different content on each build?
.dwrnd see above, but word
.incbin include content from binary file as-is
.struct tbd
.dstruct tbd
.enum tbd

SMS specific directive

.sdsctag directive generates valid SMS ROM header that will pass SMS BIOS check. This is also useful for ROM management tools that need identification data.

Conditional assembly directives

.if is apparently similar to C preprocessor #if/#ifdef.

Macros

WLA-DX supports macros, more on this later...

Now onto the next page of Maxim's tutorial!

Labels

A label is just a pretty name of a specific memory address. Syntax-wise it is a word, followed by colon:

  ld hl,trigger1	; load address where we set a byte with value 123 (16-bit address, so it needs a register pair)

trigger1:
  .db 123

One WLA-DX specific syntax is:

 ld a,:trigger1	; this will load bank number for the label instead of its address

Another WLA-DX specific label syntax:

-:	ld a,(hl)
	; some code here
	jp -

-: is an anonymous label for jumping backwards. There is also +: for jumping forward. Or you can use labels starting with two underscores and a letter to jump back and forth: __b:, __f:. The purpose of anonymous labels is that the programmer doesn't need to come up with a unique name for each. The assembler translating a jump to an anonymous label will look for the nearest match.

Labels are replaced by specific numeric memory addresses at linking time.

To the next page!

Opcodes

Ok, I'm tired. I'll stop here for today! (9 hours later) I'm back for the second day of this journey!

A short reference: z80oplist.txt, originally from z80.info.

ROM

Read-only memory, usually on game cartridge. This memory is exposed to CPU when it's switched in one of the memory pages, I think. We can use it to read data or execute code, but we can't modify data in it.

RAM

Random-access memory. Starts either zeroed or undefined. Our ROM code could fill it with code and/or data and then even jump execution to it. All bytes in RAM are modifiable, both code and data. Access time is typically faster than ROM. I do know that SRAM is not normal RAM. It's battery backed memory on some cartridges, used to store save data between play sessions. SEGA Master System has 8KB of RAM.

Mirroring

According to Maxim this is when you connect a smaller ROM to a memory slot. If chip size is 2K, then at offset 2K (one past the chip's last byte) you'll see the first ROM byte again, and so on.

SMS and GG memory is mirrored. There is 8KB of RAM in a 16KB memory slot.
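
Mirroring boils down to ignoring the upper address bits. A minimal sketch in Python, assuming the chip size is a power of two (the function name mirrored_offset is made up for illustration):

```python
def mirrored_offset(address, chip_size):
    """Map an address inside a slot to an offset inside a smaller chip.
    Assumes chip_size is a power of two, so wrapping is a simple bit mask."""
    return address & (chip_size - 1)

# 2KB chip: offset 2048 (the first byte past the chip) wraps back to offset 0.
print(mirrored_offset(2048, 2 * 1024))

# SMS RAM: 8KB at $c000-$dfff, seen again at $e000-$ffff.
print(hex(0xC000 + mirrored_offset(0xE123, 8 * 1024)))
```

So a read from $e123 would land on the same RAM byte as a read from $c123.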

SMS memory map

Z80 can address 64KB. On SMS, this memory is mapped as follows:

address content size
$0000-$03ff ROM (system rom?) 1KB
$0400-$3fff ROM slot 1 15KB
$4000-$7fff ROM slot 2 16KB
$8000-$bfff ROM slot 3 16KB
$c000-$dfff RAM 8KB
$e000-$ffff RAM mirror 8KB

Larger games could swap content of each slot to a different bank in ROM. Small games will typically load up in slot 1 or slots 1 and 2. To the next page!

Registers

Z80 has 8- and 16-bit general purpose and specialized registers.

register function
a accumulator, used for arithmetic operations
f flag bits for previous opcode
b, c, d, e, h, l general purpose 8-bit registers
bc, de, hl general purpose 16-bit registers, built as pairs
ix, iy index registers for memory access
pc program counter
sp stack pointer
i interrupt page address
r memory refresh
a'-f', h', l' Z80 has a second set of registers, swap all at once

Lower and higher parts of ix/iy can be used as individual 8-bit registers. Those are not officially defined, but are often called ixh/ixl and iyh/iyl.

Ports

Z80 provides 256 potential ports. SMS hardware supports only 8, but those are mirrored, i.e. the same port can be accessed at several different port numbers.

port input output
$3e memory control
$3f i/o port control i/o port control
$7e V counter
$7f H counter PSG
$be VDP data
$bf VDP control
$dc controller 1 controller 1
$dd controller 2 controller 2

Bold means often used. To the next page!

Simple program walkthrough

We will review ex1.asm, step-by-step. Next page.

Directives

Our program doesn't use much memory, so we won't need to swap slots at run-time. We do still need to do a basic memory setup.

;==============================================================
; WLA-DX banking setup
;==============================================================
.memorymap
defaultslot 0
slotsize $8000
slot 0 $0000
.endme

.rombankmap
bankstotal 1
banksize $8000
banks 1
.endro

It looks like we tell WLA-DX that we plan to use 32KB in slot 0 ($0000-$7fff).

I'm a bit confused about this. $8000 is 32KB. Does it mean we can use either 3 16KB slots or 1 32KB slot? And how does the setup look in actual Z80 opcodes? Is there a good Z80 disassembler that is SMS-aware?

SDSC tag

;==============================================================
; SDSC tag and SMS rom header
;==============================================================
.sdsctag 1.10,"Hello World!","SMS programming tutorial program (bugfixed)","Maxim"

So this is apparently a result of cooperation between Ville Helin, WLA-DX author and Maxim from SMS Power. It does generate a valid ROM header, but also embeds extra bits about author, program name, version and build date that is specific to homebrew. The only program that I am aware of that can read this extended identification is Maxim's Header Reader.

Program memory setup expectations

.bank 0 slot 0
.org $0000

This tells WLA-DX that the code that follows goes into bank 0 (the only one in our case) and is expected to be in the first slot (again, the only one in our case). Next page.

Boot

;==============================================================
; Boot section
;==============================================================
    di              ; disable interrupts
    im 1            ; Interrupt mode 1
    jp main         ; jump to main program

This is a standard SMS program boot process. di disables interrupts until we have interrupt handlers set up. im 1 means interrupt mode 1. Just roll with it, SMS programs never use im 2. jp main jumps to code that follows main label. Apparently SMS expects a gap between $0000 and normal code. Probably something to do with interrupt handlers.

Pause button handler

.org $0066
;==============================================================
; Pause button handler
;==============================================================
    ; Do nothing
    retn

Maxim explains how di can't turn NMI (non-maskable interrupts) off, and how pressing the pause button is the only NMI on SMS. This NMI will stop code execution and jump to $0066. Whatever code is at $0066 will execute, ending with retn, at which point normal program will resume. At minimum we should put retn at $0066, so pressing Pause won't do anything. That is exactly what we did in code above.

Initialization

;==============================================================
; Main program
;==============================================================
main:
    ld sp, $dff0

Set up the stack by pointing the sp register at $dff0, 16 bytes below the end of RAM. I think the stack grows down in memory. Also $dff1-$dfff is probably reserved for something else.

Set up VDP registers

    ;==============================================================
    ; Set up VDP registers
    ;==============================================================
    ld hl,VdpData
    ld b,VdpDataEnd-VdpData
    ld c,$bf
    otir
; VDP initialisation data
VdpData:
.db $04,$80,$00,$81,$ff,$82,$ff,$85,$ff,$86,$ff,$87,$00,$88,$00,$89,$ff,$8a
VdpDataEnd:

Here we use otir opcode to send some data (VdpDataEnd-VdpData bytes) starting at VdpData address to port $bf.
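
The data bytes look like pairs, and the screen-on code later in this walkthrough writes a value followed by $81 to set register 1. Assuming every pair follows that pattern (value first, then $80 ORed with the register number), a small Python sketch can decode what VdpData sets up. The function name decode_register_writes is made up for illustration:

```python
# The bytes from VdpData, copied from the listing above.
VDP_DATA = [0x04, 0x80, 0x00, 0x81, 0xFF, 0x82, 0xFF, 0x85, 0xFF, 0x86,
            0xFF, 0x87, 0x00, 0x88, 0x00, 0x89, 0xFF, 0x8A]

def decode_register_writes(data):
    """Split the byte stream into (register, value) pairs.
    Assumes each pair is: value byte, then $80 | register number."""
    pairs = []
    for value, command in zip(data[0::2], data[1::2]):
        pairs.append((command & 0x0F, value))
    return pairs

for register, value in decode_register_writes(VDP_DATA):
    print("register {:2d} = ${:02x}".format(register, value))
```

Under that assumption the 18 bytes are 9 register writes, with registers 3 and 4 left untouched.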

Clear VRAM

VRAM is attached to VDP and is accessible via VDP ports. At start it normally contains SEGA logo from system bios. Following code will clear it out:

    ;==============================================================
    ; Clear VRAM
    ;==============================================================
    ; 1. Set VRAM write address to 0 by outputting $4000 ORed with $0000
    ld a,$00
    out ($bf),a
    ld a,$40
    out ($bf),a
    ; 2. Output 16KB of zeroes
    ld bc, $4000    ; Counter for 16KB of VRAM
    ClearVRAMLoop:
        ld a,$00    ; Value to write
        out ($be),a ; Output to VRAM address, which is auto-incremented after each write
        dec bc
        ld a,b
        or c
        jp nz,ClearVRAMLoop

No idea why Maxim indents the loop body. Assembler is not C or Python. I'd format all opcodes in a single column.

We output $00 and then $40 to VDP control port $bf. VDP expects the low byte of a word first, so to send $4000 to VDP, we send $00 and then $40. I assume that tells VDP we're about to send it some data via the VDP data port. VDP should write it to its attached VRAM. I should read msvdp.txt document to understand how VDP works.
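
The two bytes sent to port $bf can be worked out for any VRAM address. A minimal Python sketch of that calculation, following the "address ORed with $4000, low byte first" rule from the code comments (the function name vram_write_command is made up for illustration):

```python
def vram_write_command(address):
    """Return the (first, second) bytes to send to port $bf before writing
    to a VRAM address: the address ORed with $4000, low byte first."""
    word = 0x4000 | (address & 0x3FFF)
    return word & 0xFF, word >> 8

print([hex(b) for b in vram_write_command(0x0000)])  # start of VRAM
print([hex(b) for b in vram_write_command(0x3800)])  # a higher VRAM address
```

For address $0000 this gives $00 then $40, matching the ld a / out pairs in the listing.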

We then use a loop to clear out 16KB of VRAM. I assume here that parts that won't show on screen or will contain our own data we'll load up later don't need to be zeroed out.

Another thought: instead of zeroing out all VRAM, we can fill one page with a starter screen and switch to that VRAM page to show it.

Loop is built using conditional jump instruction jp nz, label. It checks f for z flag, set by opcode executed just prior: or c. I think or c does bitwise a OR c, where a is loaded up with current value of b (ld a,b). or c will set the z flag only if both a and c are zero, so only if bc is zero; until then jp nz keeps jumping back. Yay! Our first loop! A venerable workhorse of any program!
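
The counting trick can be traced step by step. The sketch below mimics the four loop-control instructions in Python; the manual b OR c test is needed because, as far as I know, a 16-bit dec bc does not update the flags on the Z80. The function name count_loop_iterations is made up for illustration:

```python
def count_loop_iterations(bc):
    """Mimic: dec bc / ld a,b / or c / jp nz,loop. Returns the iteration count."""
    iterations = 0
    while True:
        iterations += 1
        bc = (bc - 1) & 0xFFFF      # dec bc (16-bit wraparound)
        a = (bc >> 8) & 0xFF        # ld a,b
        a |= bc & 0xFF              # or c
        zero_flag = (a == 0)        # z is set when the OR result is zero
        if zero_flag:               # jp nz falls through once z is set
            return iterations

print(count_loop_iterations(0x4000))
```

Starting from $4000 the body runs 16384 times, one out per byte of VRAM. Note that starting from 0 would wrap around and run 65536 times.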

Maxim also shows all f flags that can be used in a conditional jump opcode. Next page.

Setting up graphics

VDP supports palette (kept in CRAM, color RAM), tiles, tilemap and sprite table. Background is made using palette, tiles and tilemap. Sprites are made of palette, tiles and sprite table.

Palette

Our program only sets the background. Any changes to VRAM are done via VDP ports.

To set up palette, we have to write to CRAM. To write to byte N of CRAM, we have to OR it with $c000 and then output the value to VDP control port $bf. Low then high byte:

    ;==============================================================
    ; Load palette
    ;==============================================================
    ; 1. Set VRAM write address to CRAM (palette) address 0 (for palette index 0)
    ; by outputting $c000 ORed with $0000
    ld a,$00
    out ($bf),a
    ld a,$c0
    out ($bf),a
PaletteData:
.db $00,$3f ; Black, white
PaletteDataEnd:

We can use otir again to send bytes to CRAM:

    ; 2. Output colour data
    ld hl,PaletteData
    ld b,(PaletteDataEnd-PaletteData)
    ld c,$be
    otir

This only works because we send fewer than 256 values (otir counts with the 8-bit register b). If it was more, we'd have to use conditional jump loop and out opcode. There are 16 colors we can set up at once, we only set up 2 right now.

Maxim goes into more details in his palette sidebar page.

It's some riveting stuff! CRAM is 32 bytes long, 16 for background and 16 for sprites. Each byte holds 2 bits per color channel, bitwise it looks like: %xxbbggrr, where 2 top bits are ignored. That allows 2^6 = 64 colors. Palette index 0 is transparent color in sprites. In background it's drawn behind sprites, even if the rest of background is on top of sprites.

Game Gear has twice the CRAM, and each palette element is 2 bytes with 4 bits per color: %xxxxbbbbggggrrrr, which allows 2^12 = 4096 colors.
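
The two bit layouts can be turned into small packing helpers. A minimal Python sketch based only on the %xxbbggrr and %xxxxbbbbggggrrrr patterns above (the function names sms_colour and gg_colour are made up for illustration):

```python
def sms_colour(red, green, blue):
    """Pack 2-bit channels (0-3 each) into one SMS CRAM byte: %xxbbggrr."""
    return ((blue & 3) << 4) | ((green & 3) << 2) | (red & 3)

def gg_colour(red, green, blue):
    """Pack 4-bit channels (0-15 each) into one Game Gear CRAM word:
    %xxxxbbbbggggrrrr."""
    return ((blue & 15) << 8) | ((green & 15) << 4) | (red & 15)

print(hex(sms_colour(0, 0, 0)), hex(sms_colour(3, 3, 3)))  # black, white
print(hex(sms_colour(3, 0, 0)))                            # pure red
print(hex(gg_colour(15, 15, 15)))                          # Game Gear white
```

Black and white come out as $00 and $3f, the same two values used in PaletteData.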

Ok back to main tutorial!

Tiles

Tiles are loaded to VRAM at $0000. Each tile is 8x8 pixels. Each pixel takes up 4 bits. Value of those 4 bits is an index of color we want in CRAM.

    ;==============================================================
    ; Load tiles (font)
    ;==============================================================
    ; 1. Set VRAM write address to tile index 0
    ; by outputting $4000 ORed with $0000
    ld a,$00
    out ($bf),a
    ld a,$40
    out ($bf),a

Then we get a loop to copy data from ROM to VRAM:

    ; 2. Output tile data
    ld hl,FontData              ; Location of tile data
    ld bc,FontDataEnd-FontData  ; Counter for number of bytes to write
    WriteTilesLoop:
        ld a,(hl)        ; Get data byte
        out ($be),a      ; Output it
        inc hl           ; Add one to hl so it points to the next data byte
        dec bc           ; Decrement the counter and repeat until it's zero
        ld a,b
        or c
        jp nz,WriteTilesLoop

This loop introduces indirect addressing: ld a,(hl) will load a byte from memory at address stored in hl to a.

Maxim goes into more details in his tiles sidebar page.

I tried to read it, Maxim calls it planar. I think it's kind of like the old DOS VGA mode-x with separate bit planes. I can't understand it right now, but I'll experiment with it soon and hopefully get it then. Once I get it, I'll draw my own picture where each bit is labeled. Somehow first 4 bytes of a tile represent first 8 pixels.
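
One way to picture the planar layout is to encode a single row of 8 pixels by hand. The Python sketch below assumes the usual SMS arrangement as I understand it: byte 0 of a row holds bit 0 of every pixel's color index, byte 1 holds bit 1, and so on, with the leftmost pixel in the most significant bit. The function name encode_tile_row is made up for illustration:

```python
def encode_tile_row(pixels):
    """Turn 8 colour indices (0-15) into 4 bitplane bytes.
    Assumes byte 0 holds bit 0 of every pixel, byte 1 holds bit 1, etc.,
    with the leftmost pixel stored in the most significant bit."""
    assert len(pixels) == 8
    row = []
    for plane in range(4):
        byte = 0
        for pixel in pixels:
            # Shift in this pixel's bit for the current plane.
            byte = (byte << 1) | ((pixel >> plane) & 1)
        row.append(byte)
    return row

print(encode_tile_row([1, 1, 1, 1, 0, 0, 0, 0]))
print(encode_tile_row([15, 0, 0, 0, 0, 0, 0, 1]))
```

A full tile is 8 such rows, so 8 x 4 = 32 bytes.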

Maxim recommends using tools to convert graphics to tile data.

He mentions that up to 14K of 16K VRAM can be used for tiles. Each tile takes 32 bytes (64 pixels at 4 bits each), so that gives us up to 14 x 1024 / 32 = 448 tiles. Maxim recommends using compression when storing tile data in ROM. Game Gear tiles work the same way.

Ok back to main tutorial!

Tilemap

Tilemap tells VDP which tile goes where on video page. There are also some extra options for how each tile is displayed. Tilemap lives at $3800 in VRAM.

Here is how we prepare for tilemap setup loop:

    ;==============================================================
    ; Write text to name table
    ;==============================================================
    ; 1. Set VRAM write address to tilemap index 0
    ; by outputting $4000 ORed with $3800+0
    ld a,$00
    out ($bf),a
    ld a,$38|$40
    out ($bf),a

Copy tilemap data to VRAM:

    ; 2. Output tilemap data
    ld hl,Message
    ld bc,MessageEnd-Message  ; Counter for number of bytes to write
    WriteTextLoop:
        ld a,(hl)    ; Get data byte
        out ($be),a
        inc hl       ; Point to next letter
        dec bc
        ld a,b
        or c
        jp nz,WriteTextLoop

And data is defined like this (I assume each word corresponds to a letter in "Hello World!", since .dw emits two bytes per entry):

Message:
.dw $28,$45,$4c,$4c,$4f,$00,$37,$4f,$52,$4c,$44,$01
MessageEnd:
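
The values in Message look like ASCII codes shifted down by $20, which would make sense if the font's first tile is the space character. That offset is an inference from the data, not something stated in the listing. A minimal Python sketch to check it (the function names tiles_to_text and text_to_tiles are made up for illustration):

```python
# Tile numbers copied from the Message data above.
MESSAGE = [0x28, 0x45, 0x4C, 0x4C, 0x4F, 0x00, 0x37, 0x4F, 0x52, 0x4C, 0x44, 0x01]

def tiles_to_text(tiles, first_char=0x20):
    """Convert tile numbers back to text, assuming tile 0 is ASCII $20 (space)."""
    return "".join(chr(tile + first_char) for tile in tiles)

def text_to_tiles(text, first_char=0x20):
    """Convert text to tile numbers under the same assumption."""
    return [ord(ch) - first_char for ch in text]

print(tiles_to_text(MESSAGE))
print([hex(t) for t in text_to_tiles("Hi!")])
```

Decoding the 12 words this way gives back the greeting, so the assumption at least fits this data.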

Maxim goes into more details in his tilemap sidebar page.

Screen is composed from 32x28 tiles (896 in total). For each tile, tilemap uses two bytes: %xxxpcvhnnnnnnnnn:

  • x are unused bits;
  • p is a priority bit, I assume if it's set, tile is drawn on top of sprites;
  • c is for first or second palette;
  • v is a vertical flip bit;
  • h is a horizontal flip bit;
  • nnnnnnnnn is the index of tile: it can have value 0-511, but maximum tiles VRAM can hold is 448.
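
The bit layout above can be packed with a few shifts. A minimal Python sketch, reading %xxxpcvhnnnnnnnnn from right to left as bit 0 upward (the function names tilemap_entry and entry_bytes are made up for illustration):

```python
def tilemap_entry(tile, hflip=False, vflip=False, sprite_palette=False, priority=False):
    """Pack one tilemap word following %xxxpcvhnnnnnnnnn."""
    word = tile & 0x1FF              # n: 9-bit tile index
    word |= int(hflip) << 9          # h: horizontal flip
    word |= int(vflip) << 10         # v: vertical flip
    word |= int(sprite_palette) << 11  # c: second palette
    word |= int(priority) << 12      # p: priority
    return word

def entry_bytes(word):
    """Split the word into (low, high) bytes, the order used when writing words to the VDP."""
    return word & 0xFF, word >> 8

print(hex(tilemap_entry(0x28)))
print(hex(tilemap_entry(5, hflip=True, priority=True)))
print(entry_bytes(tilemap_entry(0x28)))
print(32 * 28 * 2)  # bytes needed for the whole tilemap
```

A plain tile index with no options set is just the index itself with a zero high byte, and the full 32x28 map takes 1792 bytes.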

Turn on the screen

Screen was off since program set up VDP registers. We should switch it on. To do this, we should set bit 6 of VDP register 1:

    ; Turn screen on
    ld a,%01000000
;          ||||||`- Zoomed sprites -> 16x16 pixels
;          |||||`-- Doubled sprites -> 2 tiles per sprite, 8x16
;          ||||`--- Mega Drive mode 5 enable
;          |||`---- 30 row/240 line mode
;          ||`----- 28 row/224 line mode
;          |`------ VBlank interrupts
;          `------- Enable display
    out ($bf),a
    ld a,$81
    out ($bf),a

We then halt the program by having it run in infinite loop:

    ; Infinite loop to stop program
    Loop:
         jp Loop

Next page.

Enhance program

Maxim proposes to use .define directive to give mnemonic names to various numbers. This is a lot like equ in classic tasm/masm.

He then suggests replacing repeated code with functions, leveraging call and ret. Those use stack. Maxim goes on a bit about how stack works, which I think I understand.

Maxim suggests we start using relative jump instruction jr. It's smaller, but needs address calculation. Maxim recommends using it for loops.
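
The address calculation is normally done by the assembler, but it is simple enough to sketch. As far as I know, jr is 2 bytes long (jp is 3) and stores a signed 8-bit offset counted from the instruction that follows it. A minimal Python sketch under that assumption (the function name jr_displacement is made up for illustration):

```python
def jr_displacement(jr_address, target_address):
    """Offset byte for a jr located at jr_address.
    Assumes the offset is relative to the next instruction (jr is 2 bytes)
    and must fit in a signed byte."""
    offset = target_address - (jr_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError("target too far for jr, use jp instead")
    return offset & 0xFF  # two's complement encoding of the signed offset

# An infinite loop "Loop: jr Loop" jumps back over its own 2 bytes.
print(hex(jr_displacement(0x0100, 0x0100)))
# Skipping forward over 3 bytes of code that follow the jr.
print(hex(jr_displacement(0x0100, 0x0105)))
```

The limited range is why jr suits short loops, while longer jumps still need jp.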

Taking a break here. Up next "Convert ASCII text to tile numbers".