# Mechlib / 3D models

As we saw in the overview, there's no need to worry about variations between the different versions. A quick examination also reveals the same file table as the sounds use.

In [1]:
from pathlib import Path
from struct import Struct, unpack_from

MECHLIB_FOOTER = Struct("<2I")
MECHLIB_RECORD = Struct("<2I64s76x")


def extract_table(data):
    offset = len(data) - MECHLIB_FOOTER.size
    _, count = MECHLIB_FOOTER.unpack_from(data, offset)

    for _ in range(count):
        # walk the table backwards
        offset -= MECHLIB_RECORD.size
        # not sure what the 76 trailing bytes (skipped via `76x`) are
        start, length, name = MECHLIB_RECORD.unpack_from(data, offset)
        name = name.rstrip(b"\x00").decode("ascii")
        yield name, data[start : start + length]


data = Path("install/v1.0-us-post/zbd/mechlib.zbd").read_bytes()

output_path = Path.cwd() / "models" / "mechlib"
output_path.mkdir(parents=True, exist_ok=True)

contents = {}
for name, model in extract_table(data):
    if name in contents:
        print("duplicate:", name)
    contents[name] = model
    (output_path / name).write_bytes(model)
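
To make the table layout easier to see, here is a minimal sketch that builds a toy archive in memory and reads it back with the same two structs. `build_archive` and `read_table` are hypothetical names for this illustration. Placing the payloads before the table and writing `0` into the first footer field are assumptions for the toy file only; the real file's first footer field is unknown.

```python
from struct import Struct

# Same layouts as MECHLIB_FOOTER / MECHLIB_RECORD above.
FOOTER = Struct("<2I")        # unknown field, record count
RECORD = Struct("<2I64s76x")  # start, length, NUL-padded name, 76 unknown bytes


def build_archive(files):
    """Pack (name, payload) pairs: payloads, then the table, then the footer."""
    blob = bytearray()
    table = bytearray()
    for name, payload in files:
        # start offsets are absolute positions within the whole file
        table += RECORD.pack(len(blob), len(payload), name.encode("ascii"))
        blob += payload
    return bytes(blob + table + FOOTER.pack(0, len(files)))


def read_table(data):
    """Walk the table backwards from the footer, like extract_table above."""
    offset = len(data) - FOOTER.size
    _, count = FOOTER.unpack_from(data, offset)
    for _ in range(count):
        offset -= RECORD.size
        start, length, name = RECORD.unpack_from(data, offset)
        yield name.rstrip(b"\x00").decode("ascii"), data[start : start + length]


archive = build_archive(
    [("format", b"\x01\x00\x00\x00"), ("version", b"\x1b\x00\x00\x00")]
)
entries = list(read_table(archive))
print(RECORD.size, entries)
```

Because the table is walked from the end, the entries come back in reverse order of how they were written.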

From the `.flt` file endings, I expected the other files to be OpenFlight scenes.

> OpenFlight format were rapidly adopted by the early commercial flight simulation industry in the later 80's and early 90's &mdash; https://en.wikipedia.org/wiki/OpenFlight

MechWarrior 3 was released in May 1999 with DirectX 6. At that time, the latest OpenFlight specification they might have had access to was version 15.6.0 (OpenFlight 15.5.1 [was published in July 1998](https://portal.presagis.com/support/solutions/articles/19000072973-openflight-15-5-1), OpenFlight 15.4.1 was July 1998, and OpenFlight 15.0 in October 1996). Thanks to this being an industry specification, you can [see the publication dates and still download all specifications](https://www.presagis.com/en/glossary/detail/openflight/)!

Except these files don't seem to follow the OpenFlight spec. To be fair, OpenFlight does say custom binary serialisations are allowed. The first hint is that OpenFlight is supposed to be big endian (the order in which bytes are written for multi-byte data types is known as [endianness](https://en.wikipedia.org/wiki/Endianness)), but most data in these files seems to be little endian. This becomes obvious when deserialising integers or floating point values both ways: the values only fall into a sensible range in little endian. It also makes sense, as x86 processors are little endian, and in a time with limited CPU power, you'd want to avoid having to convert endianness. But even allowing for that, it isn't OpenFlight.
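
As a small illustration of that range check, the sketch below reads the same four bytes in both byte orders. The sample bytes are copied from the start of the `materials` file examined further down (`BA 01 00 00` and `00 00 7F 43`); the variable names are made up for this example.

```python
from struct import unpack

raw_count = bytes.fromhex("BA010000")
raw_float = bytes.fromhex("00007F43")

count_le, = unpack("<I", raw_count)
count_be, = unpack(">I", raw_count)
float_le, = unpack("<f", raw_float)
float_be, = unpack(">f", raw_float)

print(count_le, count_be)  # 442 3120627712
print(float_le, float_be)  # 255.0 and a tiny denormal value
```

A count of 442 and a float of 255.0 are plausible; three billion entries and a denormal are not.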

Trying to cross-match other 3D file formats also yielded no results. It wasn't a DirectX `.x` model, a Wavefront `.obj`, a Filmbox `.fbx`, or an Autodesk `.3ds` file.
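
One cheap way to do this kind of cross-matching is to look at the leading bytes of a file. The following is only a rough sketch: `sniff_model_format` is a hypothetical helper, the signatures are the commonly documented ones for each format (text `.obj` has no magic number, so that check is a heuristic), and a match proves little on its own.

```python
def sniff_model_format(data):
    """Return a best-guess format name from the leading bytes, or None."""
    if data.startswith(b"xof "):
        return "DirectX .x"
    if data.startswith(b"Kaydara FBX Binary"):
        return "FBX (binary)"
    if data[:2] == b"MM":  # 3DS main chunk ID 0x4D4D
        return "Autodesk .3ds"
    if data[:2] == b"\x00\x01":  # big-endian opcode 1, the OpenFlight header record
        return "OpenFlight"
    # Heuristic only: OBJ is plain text, so look at the first keyword.
    first_line = data.split(b"\n", 1)[0].strip()
    keyword = first_line.split(b" ", 1)[0]
    if keyword in (b"#", b"v", b"vn", b"vt", b"f", b"o", b"g", b"mtllib"):
        return "Wavefront .obj"
    return None


print(sniff_model_format(b"xof 0303txt 0032"))
print(sniff_model_format(b"\x01\x00\x00\x00"))
```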

---

Really, the only way to do this then was to reverse engineer the model loading code from the executable, instead of reverse engineering the file format.

The bad news is it's C++ code (MFC). And the compiler (Visual Studio circa 1999) produces absolutely horrible assembly. Reordered instructions, inlined functions like `memcpy`/`strcpy` et al., and interesting optimisations around calling conventions.

The good news: while a lot of the debug functions are stubbed out (probably via `#ifndef DEBUG` or something), the debug strings are still present in early versions. There also seems to be dead/duplicate code for loading meshes/models, some using `ReadFile` (which seems to be used in production) and some using `fread` (maybe debug functionality, since the density of debug messages is higher).

With this, I was able to reliably extract most model data. But I don't really know how to show this process. I have annotated the code in the standalone library with the debug strings I found, so if you're interested, you should be able to follow along. Hopefully I can release the Ghidra project at some point, but I'm unsure if it makes sense without the executable, and I'd rather not distribute that (although without other files, it might be OK legally speaking).

Back to analysing `mechlib.zbd` a bit. The files `format`, `version`, and `materials` are the odd ones out. `format` and `version` are uninformative without context:

In [2]:
from helpers import hexdump

hexdump(contents["format"])
hexdump(contents["version"])

0|01 00 00 00            |....
0|1B 00 00 00            |....
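
Read as little-endian unsigned 32-bit integers (an assumption, but consistent with the rest of the file), the two values are small numbers:

```python
from struct import unpack

# Bytes copied from the two hexdumps above.
fmt, = unpack("<I", bytes.fromhex("01000000"))
version, = unpack("<I", bytes.fromhex("1B000000"))
print(fmt, version)  # 1 27
```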


`materials` is a binary file that seems to map textures to models and even 'mech variants:

```bash
$ strings "models/mechlib/materials" | sort | uniq
[...]
annihilator01
annihilator02
annihilator03
annihilator04
annihilator05
annihilator06
annihilator07
annihilator08
annihilator09
annihilator10
annihilator11
annihilator12
annihilator13
annihilator14
annihilator15
[...]
$ strings "models/mechlib/materials" | wc -l
     434
```

In [3]:
data = (output_path / "materials").read_bytes()

count, = unpack_from("<I", data, 0)
count

442

That isn't worlds apart from the line count (434). First and last 64 bytes:

In [4]:
from helpers import hexdump

hexdump(data[:64], width=16)
print("---")
hexdump(data[-64:], width=16)

00|BA 01 00 00 FF 11 FF 7F 00 00 7F 43 00 00 7F 43|...........C...C
16|00 00 7F 43 5C 81 61 00 00 00 00 00 00 00 00 3F|...C\.a........?
32|00 00 00 3F 00 00 00 00 00 00 00 00 09 00 00 00|...?............
48|6D 65 63 68 62 61 79 30 32 FF 11 FF 7F 00 00 7F|mechbay02.......
---
00|61 6E 69 5F 63 30 31 FF 11 FF 7F 00 00 7F 43 00|ani_c01.......C.
16|00 7F 43 00 00 7F 43 04 C5 61 00 00 00 00 00 00|..C...C..a......
32|00 00 3F 00 00 00 3F 00 00 00 00 00 00 00 00 0D|..?...?.........
48|00 00 00 6D 61 64 63 61 74 5F 63 70 69 74 30 35|...madcat_cpit05


In [5]:
(len(data) - 4) / count

52.86199095022624

The string length seems to be variable, which is why the record size isn't constant. The hexdump also suggests that the name comes last in each record.

In [6]:
unpack_from("<2I", data, 4 + 36)

(0, 9)

In [7]:
unpack_from("9s", data, 4 + 36 + 8)

(b'mechbay02',)

In [8]:
count, = unpack_from("<I", data, 0)
offset = 4
for i in range(count):
    *unk, length = unpack_from("<40BI", data, offset)
    offset += 40 + 4
    name = data[offset : offset + length]
    try:
        name = name.decode("ascii")
    except UnicodeDecodeError as e:
        print(e, offset, name[:128])
        break
    offset += length
    print(i, length, name)

0 9 mechbay02
1 9 mechbay01
'ascii' codec can't decode byte 0xac in position 12: ordinal not in range(128) 154 b'\x00\x00\x7fC\x00\x00\x7fC\x00\x00\x7fC\xac\x81a\x00\x00\x00\x00\x00\x00\x00\x00?\x00\x00\x00?\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00POLE\xff\x11\xff\x7f\x00\x00\x7fC\x00\x00\x7fC\x00\x00\x7fC\xd4\x81a\x00\x00\x00\x00\x00\x00\x00\x00?\x00\x00\x00?\x00\x00\x00\x00\x00\x00\x00\x00\x07\x00\x00\x00POLETOP\xff\x11\xff\x7f\x00\x00\x7fC\x00\x00\x7fC\x00\x00\x7fC\xfc\x81a\x00\x00\x00\x00\x00\x00\x00\x00?\x00\x00\x00?\x00'


Dang. Seems like we're a few bytes out for the next record - 40 to be exact, which is uncanny since that's the exact size of the unknown record data. I'll spare you the trial and error and get straight to a result:

In [9]:
MAT_HEADER_SIZE = 40
count, = unpack_from("<I", data, 0)
offset = 4
items = []
for i in range(count):
    record = data[offset : offset + MAT_HEADER_SIZE]
    offset += MAT_HEADER_SIZE
    length, = unpack_from("<I", data, offset)
    offset += 4
    name = data[offset : offset + length]
    try:
        name = name.decode("ascii")
    except UnicodeDecodeError:
        # not a name: assume this record has none and rewind past the "length"
        name = None
        offset -= 4
    else:
        offset += length
    items.append((name, record))

So, some materials have no name. I assume this means that materials are looked up by index rather than by name, and that some materials have no texture.

But trial and error isn't a good parsing strategy. There should be a flag that indicates whether a name follows.

In [10]:
values = [set() for _ in range(MAT_HEADER_SIZE)]

for name, record in items:
    for i, r in enumerate(record):
        values[i].add(r)

for i, val in enumerate(values):
    print(i, sorted(val))

0 [0, 255]
1 [16, 17]
2 [0, 255]
3 [0, 127]
4 [0]
5 [0]
6 [0, 69, 89, 126, 127, 238]
7 [0, 66, 67]
8 [0]
9 [0]
10 [0, 19, 31, 126, 127, 138, 238]
11 [0, 66, 67]
12 [0]
13 [0]
14 [0, 19, 69, 126, 127, 238]
15 [0, 66, 67]
16 [0, 4, 12, 20, 28, 36, 44, 52, 60, 68, 76, 84, 92, 100, 108, 116, 124, 132, 140, 148, 156, 164, 172, 180, 188, 196, 204, 212, 220, 228, 236, 244, 252]
17 [0, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197]
18 [0, 97]
19 [0]
20 [0]
21 [0]
22 [0]
23 [0]
24 [0]
25 [0]
26 [0]
27 [63]
28 [0]
29 [0]
30 [0]
31 [63]
32 [0]
33 [0]
34 [0]
35 [0]
36 [0]
37 [0]
38 [0]
39 [0]


Okay, we can make some guesses. The last 20 bytes seem to be 4 byte/32 bit integers. Bytes 16 and 17 are extremely varied, and could be little endian integers, or maybe colour (RGB555)? Bytes 4-15 are quite regular, although I have no clue what they mean.
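
Another way to look at the header is to decode it in 4-byte groups. The sketch below uses the 40 header bytes of the first record (`mechbay02`), copied from the hexdump above, and a guessed layout; `MAT_GUESS` and its field grouping are hypothetical and not confirmed against the game code. Under this guess, bytes 4-15 decode to three little-endian floats of 255.0 (which could be a colour, but that is speculation) and bytes 24-31 to two floats of 0.5.

```python
from struct import Struct

# Guessed layout: 4 flag-like bytes, 3 floats, 2 u32, 2 floats, 2 u32.
MAT_GUESS = Struct("<4B3f2I2f2I")

# Header of the first record, from the hexdump above (file offset 4..44).
header = bytes.fromhex(
    "FF 11 FF 7F  00 00 7F 43  00 00 7F 43  00 00 7F 43"
    "  5C 81 61 00  00 00 00 00  00 00 00 3F  00 00 00 3F"
    "  00 00 00 00  00 00 00 00"
)

fields = MAT_GUESS.unpack(header)
print("bytes 0-3:  ", fields[0:4])
print("bytes 4-15: ", fields[4:7])
print("bytes 16-23:", [hex(v) for v in fields[7:9]])
print("bytes 24-31:", fields[9:11])
print("bytes 32-39:", fields[11:13])
```

This would also fit the value table: byte offsets 7, 11, and 15 only ever hold 0, 66, or 67, which are the top bytes of floats between 32.0 and 512.0 (or of zero).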

One of the first four bytes is a good candidate for the name switch. It can't be the first, as some of the unnamed materials have the same values as named materials. It isn't the second (tried it), but I got lucky on the third:

In [11]:
count, = unpack_from("<I", data, 0)
offset = 4

mat_to_tex = []
for i in range(count):
    has_name = data[offset + 2] == 255
    offset += MAT_HEADER_SIZE
    if has_name:
        length, = unpack_from("<I", data, offset)
        offset += 4
        texture = data[offset : offset + length].decode("ascii")
        offset += length
    else:
        texture = None
    mat_to_tex.append(texture)

import json

with open("materials.json", "w", encoding="utf-8") as f:
    json.dump(mat_to_tex, f, indent=2)

Boom. It's a good start, good enough to load the textures into a 3D program. I wrote the library to export models to JSON, and then a script for [Blender](https://www.blender.org/) to create a 3D scene from the JSON. It largely works :)

For the 'mechs, the "head" mesh (texture index 13) is just a blank rectangle - it's one of the ones without a name. So a bit more understanding of the material attributes might be required.
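
One way to gain more confidence in the flag guess is to check that the parser consumes the file exactly, with no bytes left over. The sketch below wraps the same logic in a function and runs it on a small synthetic buffer; `parse_materials` and `fake_record` are hypothetical names for this illustration, and the synthetic headers are all zero apart from the flag byte.

```python
from struct import pack, unpack_from

HEADER_SIZE = 40  # same as MAT_HEADER_SIZE above


def parse_materials(data):
    """Parse names, treating byte 2 of each header as the 'has name' flag."""
    count, = unpack_from("<I", data, 0)
    offset = 4
    names = []
    for _ in range(count):
        has_name = data[offset + 2] == 0xFF
        offset += HEADER_SIZE
        if has_name:
            length, = unpack_from("<I", data, offset)
            offset += 4
            names.append(data[offset : offset + length].decode("ascii"))
            offset += length
        else:
            names.append(None)
    # If the flag guess is right, nothing should be left over.
    if offset != len(data):
        raise ValueError(f"{len(data) - offset} bytes not consumed")
    return names


def fake_record(name):
    """Build a synthetic record: 40 header bytes, plus length and name if named."""
    header = bytearray(HEADER_SIZE)
    if name is None:
        return bytes(header)
    header[2] = 0xFF
    encoded = name.encode("ascii")
    return bytes(header) + pack("<I", len(encoded)) + encoded


sample = (
    pack("<I", 3)
    + fake_record("mechbay02")
    + fake_record(None)
    + fake_record("POLE")
)
print(parse_materials(sample))  # ['mechbay02', None, 'POLE']
```

Running the same function over the real `materials` data would raise if the flag were wrong for any record.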

## Next up

[Motion/animations](08-motion.ipynb)

**WARNING**: Work in progress below

In [12]:
named_values = [set() for _ in range(MAT_HEADER_SIZE)]
unnamed_values = [set() for _ in range(MAT_HEADER_SIZE)]

for name, record in items:
    if name:
        for i, r in enumerate(record):
            named_values[i].add(r)
    else:
        for i, r in enumerate(record):
            unnamed_values[i].add(r)

for i, (named, unnamed) in enumerate(zip(named_values, unnamed_values)):
    print(i, sorted(named), sorted(unnamed), sep="\t")

0	[255]	[0, 255]
1	[17]	[16]
2	[255]	[0]
3	[127]	[0]
4	[0]	[0]
5	[0]	[0]
6	[127]	[0, 69, 89, 126, 127, 238]
7	[67]	[0, 66, 67]
8	[0]	[0]
9	[0]	[0]
10	[127]	[0, 19, 31, 126, 127, 138, 238]
11	[67]	[0, 66, 67]
12	[0]	[0]
13	[0]	[0]
14	[127]	[0, 19, 69, 126, 127, 238]
15	[67]	[0, 66, 67]
16	[4, 12, 20, 28, 36, 44, 52, 60, 68, 76, 84, 92, 100, 108, 116, 124, 132, 140, 148, 156, 164, 172, 180, 188, 196, 204, 212, 220, 228, 236, 244, 252]	[0]
17	[129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197]	[0]
18	[97]	[0]
19	[0]	[0]
20	[0]	[0]
21	[0]	[0]
22	[0]	[0]
23	[0]	[0]
24	[0]	[0]
25	[0]	[0]
26	[0]	[0]
27	[63]	[63]
28	[0]	[0]
29	[0]	[0]
30	[0]	[0]
31	[63]	[63]
32	[0]	[0]
33	[0]	[0]
34	[0]	[0]
35	[0]	[0]
36	[0]	[0]
37	[0]	[0]
38	[0]	[0]
39	[0]	[0]

So we can discard the last 20 bytes, since they don't vary.

In [13]:
HALF = MAT_HEADER_SIZE // 2
named_values = [set() for _ in range(HALF)]
unnamed_values = [set() for _ in range(HALF)]

for name, record in items:
    if name:
        for i, r in enumerate(record[:HALF]):
            named_values[i].add(r)
    else:
        for i, r in enumerate(record[:HALF]):
            unnamed_values[i].add(r)

for i, (named, unnamed) in enumerate(zip(named_values, unnamed_values)):
    print(i, sorted(named), sorted(unnamed), sep="\t")

0	[255]	[0, 255]
1	[17]	[16]
2	[255]	[0]
3	[127]	[0]
4	[0]	[0]
5	[0]	[0]
6	[127]	[0, 69, 89, 126, 127, 238]
7	[67]	[0, 66, 67]
8	[0]	[0]
9	[0]	[0]
10	[127]	[0, 19, 31, 126, 127, 138, 238]
11	[67]	[0, 66, 67]
12	[0]	[0]
13	[0]	[0]
14	[127]	[0, 19, 69, 126, 127, 238]
15	[67]	[0, 66, 67]
16	[4, 12, 20, 28, 36, 44, 52, 60, 68, 76, 84, 92, 100, 108, 116, 124, 132, 140, 148, 156, 164, 172, 180, 188, 196, 204, 212, 220, 228, 236, 244, 252]	[0]
17	[129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197]	[0]
18	[97]	[0]
19	[0]	[0]
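
Bytes 16-19 look more regular when read as one little-endian 32-bit value. The sketch below uses three consecutive named records (`POLE`, `POLETOP`, and the one after), with bytes copied from the failed decode output earlier. The values step by exactly 40, which happens to equal the header size. One possible reading is that this is a leftover in-memory pointer rather than a colour, but that is a guess.

```python
from struct import unpack

# Bytes 16-19 of three consecutive named records, from the decode error output.
raw_words = [
    bytes.fromhex("AC816100"),
    bytes.fromhex("D4816100"),
    bytes.fromhex("FC816100"),
]
words = [unpack("<I", raw)[0] for raw in raw_words]
strides = [b - a for a, b in zip(words, words[1:])]

print([hex(w) for w in words])  # ['0x6181ac', '0x6181d4', '0x6181fc']
print(strides)                  # [40, 40]
```

The first record in the file (`mechbay02`) has `0x61815C` in the same position, which is 80 below the value for `POLE`.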
