# Duplicator Info (DI) data decoder (FM encoding)
This notebook illustrates how to extract information from an FM sector as recorded in a G64 file.

You might want to replace the embedded data block with one from a different title and test the process.

## FM sector structure

Pre-index gap (starts right after the first edge of the index hole signal):
 - 40 bytes filled with 0xff or 0x00
 - 6 bytes filled with 0x00

Index mark:
 - 1 byte index mark 0xfc = %11111100 (written with a clock sequence that, grouped on its own, results in a byte value of 0xd7 = %11010111)

Post-index gap:
 - 26 bytes filled with 0xff or 0x00
 - 6 bytes filled with 0x00

ID record (This starts the sector):
 - 1 byte ID address mark 0xfe = %11111110 (written with a clock sequence that, grouped on its own, results in a byte value of 0xc7 = %11000111)
 - 1 byte Cylinder number (0-based)
 - 1 byte Side number (0-based)
 - 1 byte Sector number (1-based)
 - 1 byte Sector length (log2(length of user data field)−7)
 - 2 bytes CRC
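
The sector length byte stores the size of the user data field as a power of two, so the size is 128 shifted left by that value. Below is a minimal sketch of that conversion and of splitting an ID record into fields. The helper names `sector_size` and `parse_id_record` are made up for illustration, the CRC bytes are carried along but not verified, and the sample bytes are the ones that appear as `fe 23 00 01 00 60 4e` near the end of the decoded output later in this notebook.

```python
def sector_size(length_code):
    """Invert length_code = log2(size) - 7, i.e. size = 128 * 2**length_code."""
    return 128 << length_code

def parse_id_record(record):
    """Split the 7 bytes of an ID record into named fields (CRC is not verified)."""
    mark, cylinder, side, sector, length_code = record[:5]
    if mark != 0xFE:
        raise ValueError("not an ID address mark: 0x%02x" % mark)
    return {
        "cylinder": cylinder,              # 0-based
        "side": side,                      # 0-based
        "sector": sector,                  # 1-based
        "size": sector_size(length_code),  # user data length in bytes
        "crc_bytes": list(record[5:7]),    # kept raw, byte order not interpreted
    }

id_fields = parse_id_record([0xFE, 0x23, 0x00, 0x01, 0x00, 0x60, 0x4E])
print("cylinder %d, sector %d, %d bytes" % (
    id_fields["cylinder"], id_fields["sector"], id_fields["size"]))
```

With these sample bytes the cylinder comes out as 35, which is consistent with a 0-based number for track 36.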

ID gap:
 - 11 bytes filled with 0xff or 0x00
 - 6 bytes filled with 0x00

Data record:
 - 1 byte data address mark 0xfb = %11111011 (written with a clock sequence that, grouped on its own, results in a byte value of 0xc7 = %11000111)
 - User data
 - 2 bytes CRC

Data gap:
 - 27 bytes filled with 0xff
 - 6 bytes filled with 0x00

Reference: "The floppy user guide" by Michael Haardt, Alain Knaff, David C. Niemi.

## Initialization

In [1]:
# Check versions
import sys

print("Python is: " + sys.version)

# Import required libraries
import numpy as np

Python is: 2.7.15 | packaged by conda-forge | (default, Mar  5 2020, 14:56:06) 
[GCC 7.3.0]


## G64 data
We shall be working with partial data from track 36 of "California Games", taken from a G64 file. Picking only a block of the data speeds up the decoding process, without loss of generality.

In [2]:
y = np.array([
0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0xEA,0xFD,0x5D,0x5F,
0x55,0x55,0x55,0x57,0x55,0x55,0x7D,0x55,0x75,0xFD,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0x55,
0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0xEA,0xDF,0x55,0x57,0x55,
0x5D,0x55,0x5F,0x55,0x75,0x55,0x77,0x55,0x55,0x55,0x75,0x55,0x55,0xD5,0x55,0x5D,
0x55,0x55,0xD5,0x55,0x5D,0x55,0x55,0x57,0x55,0xD5,0x7F,0x55,0xD5,0x5D,0x77,0x57,
0x55,0x5D,0xD5,0x55,0x57,0x55,0x55,0x55,0x57,0x55,0x55,0x5F,0x77,0x5F,0x55,0x5F,
0xD5,0x5D,0xF7,0x5F,0x55,0x5F,0x75,0x5F,0x55,0x75,0x57,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x75,0x5F,0x5F,0x7D,0x5F,
0x75,0x5D,0x55,0x77,0x7F,0x75,0xFF,0x75,0xFD,0x75,0x75,0x75,0x77,0x77,0x5D,0x75,
0xF7,0x75,0x57,0x77,0x75,0x5D,0x55,0x75,0xFD,0x75,0xFF,0x77,0x5D,0x75,0xF7,0x75,
0x57,0x75,0xF5,0x55,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x55,0x55,0x55,0x55,0x55,
0x55,0x55,0x55,0x5F,0x57,0x5F,0x5D,0x5F,0xD5,0x5D,0x55,0x5D,0xD5,0x5F,0x5F,0x5F,
0x55,0x5F,0x55,0x7F,0x7F,0x55,0x55,0x55,0x75,0x55,0x55,0x7F,0x7F,0x55,0x55,0xD7,
0x5D,0x55,0x57,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,
0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,
0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,0x55,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,
0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x5D,0x55,0x55,
0x55,0x55,0x55,0x55,0x55,0x55,0x55,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
], dtype=np.uint8)

## Convert to bits
We convert the data to a bit stream.

In [3]:
bitstream = np.unpackbits(y)
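
For readers who want to see what the conversion does, here is a small pure-Python sketch that produces the same bit order as `np.unpackbits` with its default settings (most significant bit first). The name `unpack_bits` is hypothetical and not part of the notebook.

```python
def unpack_bits(data):
    """Return the bits of each byte, most significant bit first."""
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits

sample_bits = unpack_bits([0x55, 0xEA])
print("".join(str(bit) for bit in sample_bits))  # 0101010111101010
```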

In [4]:
"".join(map(lambda b: str(b), bitstream))

'111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101011110101011111101010111010101111101010101010101010101010101010111010101010101010101111101010101010111010111111101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101111010101101111

## Find Data record positions
For FM, we use the last two bytes of the ID gap (0x00 0x00) followed by the first byte of the Data record (the data address mark) as the search pattern. Interleaving clock and data bits, the pattern we are looking for is:

```
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1  = clock sequence 0xff 0xff 0xc7
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 1 1 = data sequence  0x00 0x00 0xfb
101010101010101010101010101010101111010101101111 = clock and data combined 0xaa 0xaa 0xaa 0xaa 0xf5 0x6f
```

We can use this pattern in order to realign the stream.
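
The interleaving shown above can be reproduced with a few lines of code. This is a sketch only: `fm_encode` is a hypothetical helper that places each clock bit before its data bit, which is the ordering the table above uses.

```python
def fm_encode(data_byte, clock_byte=0xFF):
    """Interleave clock and data bits (clock first) into a 16-bit cell pattern."""
    cells = 0
    for shift in range(7, -1, -1):
        clock_bit = (clock_byte >> shift) & 1
        data_bit = (data_byte >> shift) & 1
        cells = (cells << 2) | (clock_bit << 1) | data_bit
    return cells

gap = fm_encode(0x00)              # 0xaaaa: gap byte with the normal 0xff clock
data_mark = fm_encode(0xFB, 0xC7)  # 0xf56f: data address mark
id_mark = fm_encode(0xFE, 0xC7)    # 0xf57e: ID address mark

# Rebuild the 48-bit search pattern: two gap bytes followed by the data address mark.
pattern = (gap << 32) | (gap << 16) | data_mark
search_bits = [(pattern >> shift) & 1 for shift in range(47, -1, -1)]
print("%012x" % pattern)  # aaaaaaaaf56f
```

Generating the pattern this way makes it easy to search for the ID address mark instead, by swapping `data_mark` for `id_mark`.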

In [5]:
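# 32 bits of gap (0xaa 0xaa 0xaa 0xaa) followed by the data address mark (0xf5 0x6f)
searchseq = [1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,1,1,1,0,1,0,1,0,1,1,0,1,1,1,1]
N = len(searchseq)
# Candidate positions: every bit that equals the first bit of the pattern
possibles = np.where(bitstream == searchseq[0])[0]

syncpositions = []
for p in possibles:
    check = bitstream[p:p+N]
    # Skip truncated windows at the end of the stream
    if len(check) == N and np.all(check == searchseq):
        syncpositions.append(p)

print(syncpositions)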

[951]
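
The same search can be sketched without NumPy by treating the bits as a string and using `str.find`. This is an alternative illustration, not the method used in the cell above; `find_all` is a hypothetical helper, and the sample is a short excerpt (five 0x55 bytes followed by 0xEA 0xDF) of the kind found in the data block.

```python
def find_all(bits, pattern):
    """Return every start index at which pattern occurs in bits (overlaps allowed)."""
    haystack = "".join(str(b) for b in bits)
    needle = "".join(str(b) for b in pattern)
    positions = []
    pos = haystack.find(needle)
    while pos != -1:
        positions.append(pos)
        pos = haystack.find(needle, pos + 1)
    return positions

# Five bytes of 0x55 followed by 0xEA 0xDF, unpacked MSB first
sample = [0x55] * 5 + [0xEA, 0xDF]
sample_bits = [(byte >> shift) & 1 for byte in sample for shift in range(7, -1, -1)]
needle = [1, 0] * 16 + [1,1,1,1,0,1,0,1, 0,1,1,0,1,1,1,1]
print(find_all(sample_bits, needle))  # [7]
```

The match starts at an odd bit offset, which shows why the stream has to be realigned before it can be read as bytes.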


  


## Shift array to the data record
We now rotate the bitstream so that it starts at the sync position. The stream is then byte-aligned, which makes the ID gap and data record easier to recognize.

In [6]:
b = np.roll(bitstream, -syncpositions[0])
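
Note that `np.roll` rotates rather than truncates: the bits that preceded the sync position wrap around to the end of the array, which is presumably why an ID address mark (`f57e`) is visible near the tail of the output below. A minimal pure-Python sketch of the same rotation, using a hypothetical helper `rotate_left`:

```python
def rotate_left(bits, count):
    """Rotate a list left by count positions, like np.roll(bits, -count)."""
    count %= len(bits)
    return bits[count:] + bits[:count]

print(rotate_left([0, 1, 1, 0, 1], 2))  # [1, 0, 1, 0, 1]
```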

In [7]:
"".join(map(lambda b: format(b, "02x"), np.packbits(b)))

'aaaaaaaaf56faaabaaaeaaafaabaaabbaaaaaabaaaaaeaaaaeaaaaeaaaaeaaaaabaaeabfaaeaaebbabaaaeeaaaabaaaaaaabaaaaafbbafaaafeaaefbafaaafbaafaabaabaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaabaafafbeafbaaeaabbbfbaffbafebabababbbbaebafbbaabbbbaaeaabafebaffbbaebafbbaabbafaaaaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaaaaaaaaaaaaaaaaafabafaeafeaaeaaaeeaafafafaaafaabfbfaaaaaabaaaaabfbfaaaaebaeaaabaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaeaaaaaaaaaaaaaaaaaaffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffaaaaaaaaaaaaaaaaaaaaaaaaf57eaeafaaaaaaabaaaabeaabafefffffffffffffffffffffffffffffffff

## Remove clock bits and decode data

Since every other bit is a clock bit, and the aligned stream starts with a clock bit, we keep only the odd-indexed bits. What is left is the data record.

In [8]:
c = b[1::2]
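
To illustrate the split on a known value, the sketch below separates the clock and data bits of the on-disk data address mark (0xf5 0x6f) and packs each half back into a byte. `split_fm` and `pack_bits` are hypothetical helpers. Unlike `np.packbits`, which pads an incomplete final byte with zeros, this `pack_bits` drops it.

```python
def split_fm(cells):
    """Split an aligned FM bit list into (clock_bits, data_bits)."""
    return cells[0::2], cells[1::2]

def pack_bits(bits):
    """Pack bits into byte values, MSB first; a trailing partial byte is dropped."""
    out = []
    for start in range(0, len(bits) - 7, 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        out.append(value)
    return out

# 0xf5 0x6f, the data address mark as it appears on disk
mark_cells = [1,1,1,1,0,1,0,1, 0,1,1,0,1,1,1,1]
clock_bits, data_bits = split_fm(mark_cells)
print("clock 0x%02x data 0x%02x" % (pack_bits(clock_bits)[0], pack_bits(data_bits)[0]))
# clock 0xc7 data 0xfb
```

Keeping the clock bits around, rather than discarding them, makes it possible to tell an address mark (clock 0xc7) from ordinary data that happens to have the same value (clock 0xff).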

In [9]:
"".join(map(lambda b: format(b, "02x"), np.packbits(c)))

'0000fb01020304050004008020080200108708251028010001003530382d30343041202020202020202043363420574f4e4445524d4154204e4f524d414c0020202020202020202020202020202020202020202020202020202000000000313238202833303077000400770092010000000000000000000000000000000000000000000000202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202000000000fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff000000000000fe23000100604efffffffffffffffffffff00000000'

Let's now decode the bytes as ASCII. The string "C64 WONDERMAT NORMAL" stands out clearly.

In [10]:
"".join(map(lambda b: chr(b), np.packbits(c)))

'\x00\x00\xfb\x01\x02\x03\x04\x05\x00\x04\x00\x80 \x08\x02\x00\x10\x87\x08%\x10(\x01\x00\x01\x00508-040A        C64 WONDERMAT NORMAL\x00                           \x00\x00\x00\x00128 (300w\x00\x04\x00w\x00\x92\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00                                               \x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xf0\x00\x00\x00\x00\x00\x0f\xe20\x00\x10\x06\x04\xef\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x00\x00'
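
The escape sequences make the raw output hard to scan. One possible way to tidy it up, sketched here with a hypothetical helper `printable`, is to replace every byte outside the printable ASCII range with a placeholder. The excerpt is copied from the hex dump above.

```python
import binascii

def printable(data, placeholder="."):
    """Render bytes as text, replacing anything outside 0x20..0x7e."""
    return "".join(
        chr(b) if 0x20 <= b <= 0x7E else placeholder for b in bytearray(data)
    )

excerpt = binascii.unhexlify(
    "3530382d30343041202020202020202043363420574f4e4445524d4154204e4f524d414c00"
)
print(printable(excerpt))
```

On this excerpt the result is `508-040A`, eight spaces, `C64 WONDERMAT NORMAL`, and a single dot for the terminating zero byte.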

---