# Parsing Yacht Devices' RAW Format

This notebook contains notes and explanatory Python code for parsing Yacht Devices' RAW format.

### About Yacht Devices

Yacht Devices, https://www.yachtd.com/, produces and sells a variety of electronic devices for boats. Notably for this context, it has several devices for connecting a personal computer to a NMEA 2000 network, for example, its YDNU-02 NMEA 2000 USB Gateway.

The YDNU-02 supports multiple modes of data interchange between the NMEA 2000 network and the USB side of the bridge. However, Yacht Devices recommends RAW mode--a readable text format--for developers, "because it is the easiest option."

### About this Notebook

Yacht Devices provides some guidance on parsing the RAW format but seems to assume an understanding of NMEA 2000 and access to the NMEA 2000 specification. I lacked the former when I started this notebook; the latter must be purchased from NMEA and, once purchased, may not be disclosed.

This notebook was my attempt at explaining Yacht Devices' RAW format and NMEA 2000 more generally to myself. It is based on publicly available information from various sources, both primary and secondary, most notably the CANboat project (https://github.com/canboat/canboat), which has built a series of tools for working with NMEA 2000 data and maintains a collection of reverse-engineered NMEA 2000 PGNs (Parameter Group Numbers).

While this notebook is directed specifically at Yacht Devices' RAW format--it is the device I have at this time--I believe that the format is sufficiently close to generic NMEA 2000 that this notebook may be directionally informative for users of devices from other manufacturers, e.g., Actisense.

## Overview of RAW Format
As described by Yacht Devices, the RAW format messages have the following form:

`hh:mm:ss.ddd D msgid b0 b1 b2 b3 b4 b5 b6 b7<CR><LF>`

where:
  * hh:mm:ss.ddd — time of message transmission or reception, ddd are milliseconds;
  * D — direction of the message (‘R’ — from NMEA 2000 to PC, ‘T’ — from PC to NMEA 2000);
  * msgid — 29-bit message identifier in hexadecimal format (contains NMEA 2000 PGN and other fields);
  * b0..b7 — message data bytes (from 1 to 8) in hexadecimal format;
  * \<CR>\<LF> — end-of-line symbols (carriage return and line feed, decimal 13 and 10).

https://www.yachtd.com/downloads/ydnu02.pdf (Last accessed Dec 28, 2021).
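
As a rough illustration of the layout above, a RAW line can be split on whitespace into its four parts. The helper name `split_raw_line` and the sample line are assumptions made up for this sketch (the sample reuses values discussed later in this notebook); they do not come from Yacht Devices' documentation.

```python
def split_raw_line(line):
    """Split one RAW line into (time, direction, msgid, data bytes).

    Hypothetical helper for illustration; it does no validation beyond
    checking the direction flag and the number of data bytes.
    """
    time_str, direction, msgid_str, *byte_strs = line.strip().split()
    if direction not in ('R', 'T'):
        raise ValueError(f'unexpected direction: {direction!r}')
    if not 1 <= len(byte_strs) <= 8:
        raise ValueError('expected between 1 and 8 data bytes')
    msgid = int(msgid_str, 16)                        # 29-bit identifier
    data = bytes(int(b, 16) for b in byte_strs)       # 1 to 8 data bytes
    return time_str, direction, msgid, data


sample = '17:33:21.107 R 09F50323 FF 9A 01 FF FF 00 FF FF\r\n'
time_str, direction, msgid, data = split_raw_line(sample)
print(time_str, direction, hex(msgid), data.hex())
```

The remaining sections look at each of these parts in turn.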

## Parsing Raw Format Strings

### 1. Time Stamp

The time stamp is more-or-less self-explanatory. However, there are a few facets worth mentioning.

  * __hh are 24 hours__ - self-explanatory
  * __ddd is milliseconds__ - as indicated above, the ddd portion of the time stamp is in milliseconds, not tenths, hundredths, and thousandths as the decimal might suggest.
  * __Time is UTC or from device start__ - if the device, e.g., YDNU-02, received the time from the NMEA 2000 network the time stamp is UTC; otherwise, the time stamp is the time from the device's start.
  * __Zero padding__ - All values are zero padded, e.g., 5 minutes would be written as 05.
  
Python's strptime function can be used to parse this timestamp.

In [1]:
from datetime import datetime

timestamp_str = '17:33:21.107'
timestamp_obj = datetime.strptime(timestamp_str, '%H:%M:%S.%f').time()
timestamp_obj

datetime.time(17, 33, 21, 107000)
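
To double-check the millisecond reading of `ddd`, the short sketch below parses the same sample and converts the fractional part back to milliseconds. It relies on `%f` accepting one to six digits and padding on the right, so `107` is read as 107000 microseconds. The variable names are only for illustration.

```python
from datetime import datetime

parsed = datetime.strptime('17:33:21.107', '%H:%M:%S.%f').time()

# microsecond holds 107000, i.e. 107 milliseconds
milliseconds = parsed.microsecond // 1000
print(parsed.isoformat(timespec='milliseconds'), milliseconds)
```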

### 2. Direction

The direction is also largely self-explanatory. 'R' indicates the message is from the NMEA 2000 network to the PC; 'T' indicates the other direction.

In [2]:
'R' # From NMEA 2000 to PC

'R'

In [3]:
'T' # From PC to NMEA 2000

'T'

### 3. Message Identifier

As disclosed by Yacht Devices, the message identifier is 29 bits in hexadecimal format, and appears to correspond to the CAN identifier field as described in the 1999 article from NMEA, titled "NMEA 2000 Explained - The Latest Word." https://www.nmea.org/Assets/2000-explained-white-paper.pdf (Last accessed Dec 28, 2021).

The below table was reproduced from the 1999 NMEA article.

| Bits    | Field          | Description                                                                 |
|:--------|:---------------|:----------------------------------------------------------------------------|
| 26 - 28 | Priority       | These bits have the most impact during network access arbitration           |
| 24 - 25 | Reserved       | Reserved for future use                                                     |
| 16 - 23 | Data ID Byte A | High-order byte of the parameter group number of the data being transmitted |
| 08 - 15 | Data ID Byte B | • Low-order byte of the parameter group number for global addresses, or <br> • the destination address for non-global data groups    |
| 00 - 07 | Source Address | Address of the transmitter                                                  |

However, according to the CANboat project the two reserved bits are now part of the PGN. See, https://github.com/canboat/canboat/issues/248.

The 1999 NMEA article also states that "NMEA 2000 adopts the J1939 / ISO 11783 use of the identification field." A more recent document describing J1939, shows bits 24 and 25 as the "Data Page" and "Extended Data Page" respectively, and states that they are part of the PGN. https://assets.vector.com/cms/content/know-how/_application-notes/AN-ION-1-3100_Introduction_to_J1939.pdf (Last accessed Dec 28, 2021).

The below table was reproduced from the above linked introduction to J1939.

| Priority | Extended Data Page | Data Page |PDU Format | PDU Specific | Source Address |
|:---------|:-------------------|:----------|:----------|:-------------|:---------------|
| 3 bit    | 1 bit              | 1 bit     | 8 bit     | 8 bit        | 8 bit          |

It should also be noted that bits 8-15 (PDU Specific) are part of the parameter group number (PGN) only if the message is sent to all addresses (broadcast); otherwise, they hold the destination address. Which case applies is determined by the PDU Format in bits 16-23: values of 239 or less mean bits 8-15 are a specific destination address, while values of 240 or greater imply a broadcast. See the above-referenced J1939 introduction.
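
A minimal sketch of this addressing rule, assuming (per the J1939 introduction) that it is the PDU Format byte in bits 16-23 that decides how bits 8-15 are read. The function name `destination_of` is hypothetical, and the second identifier, `0x18EA2300`, is a made-up value chosen only because its PDU Format (0xEA = 234) is below 240.

```python
GLOBAL_ADDRESS = 0xFF  # conventional "all nodes" address


def destination_of(msgid):
    """Return the destination address implied by a 29-bit identifier."""
    pdu_format = (msgid >> 16) & 0xFF
    pdu_specific = (msgid >> 8) & 0xFF
    if pdu_format < 240:
        # Addressed message: bits 8-15 carry the destination address
        return pdu_specific
    # Broadcast message: bits 8-15 belong to the PGN
    return GLOBAL_ADDRESS


print(destination_of(0x09F50323))  # PDU Format 0xF5 (245)
print(destination_of(0x18EA2300))  # PDU Format 0xEA (234), made-up example
```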

Synthesizing the above, the message identifier can be described by the below visual.

<center><img src="https://i.ibb.co/35xFDzd/can-msg-id.png" width="800"></center>

#### 3.1 Extracting Fields from Message Identifier

This section shows the use of bitwise operators to isolate the various components of the message identifier.

The message identifier is in hexadecimal format; however, it is easier to understand its parsing in binary format.

In [4]:
msgid = 0x9F50323
format(msgid, '#031b')  # Show the value as binary with leading 0s as needed

'0b01001111101010000001100100011'

##### 3.1.1 Priority
Shift the message identifier right 26 bits to get the left most three bits.

Note that lower numbers have higher priority, and if two messages have the same priority bits, the remaining message identifier bits (read from left to right) are used to determine priority.

In [5]:
priority = 0b01001111101010000001100100011 >> 26
print(format(priority, '#05b'))  # Show the value as binary with leading 0s as needed

0b010
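
Because the priority bits are the left-most bits of the identifier, comparing two identifiers as plain integers gives the same ordering as comparing them bit by bit from the left. The sketch below illustrates that with the sample identifier and a second, made-up identifier that differs only in its priority bits.

```python
sample = 0x09F50323          # priority 2
lower_priority = 0x19F50323  # same fields, but priority 6 (hypothetical)

# The numerically smaller identifier has the higher priority.
winner = min(sample, lower_priority)
print(hex(winner), 'priority', winner >> 26)
```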


##### 3.1.2 Extended Data Page
Shift the message identifier right 25 bits to get the left most four bits.

In [6]:
shifted = 0b01001111101010000001100100011 >> 25
print(format(shifted, '#06b'))

0b0100


Apply a bitwise-and to the shifted value and `0b1` to isolate the right most bit from the shifted value.

In [7]:
extended_data_page = shifted & 0b1
del shifted
print(format(extended_data_page, '#03b'))

0b0


##### 3.1.3 Data Page
Shift the message identifier right 24 bits to get the five left most bits. Then apply a bitwise-and to the shifted value and `0b1` to isolate the right most bit from the shifted value.

In [8]:
data_page = (0b01001111101010000001100100011 >> 24) & 0b1
print(format(data_page, '#03b'))

0b1


##### 3.1.4 PDU Format

Shift the message identifier right 16 bits to get the left most 13 bits. Then apply a bitwise-and to the shifted value and `0b11111111` to isolate the right most eight bits from the shifted value.

In [9]:
pdu_format = (0b01001111101010000001100100011 >> 16) & 0b11111111
print(format(pdu_format, '#010b'))

0b11110101


##### 3.1.5 PDU Specific
Shift the message identifier right 8 bits to get the left most 21 bits. Then apply a bitwise-and to the shifted value and `0b11111111` to isolate the right most eight bits from the shifted value.

In [10]:
pdu_specific = (0b01001111101010000001100100011 >> 8) & 0b11111111
print(format(pdu_specific, '#010b'))

0b00000011


##### 3.1.6 Source
Apply a bitwise-and to the message identifier and `0b11111111` to isolate the right most eight bits from the message identifier. No shift is needed because the source address already occupies the right most bits.

In [11]:
source = 0b01001111101010000001100100011 & 0b11111111
print(format(source, '#010b'))

0b00100011
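
The shift-then-mask pattern used in each of the steps above can be folded into one small helper. This is only a sketch; the name `extract_bits` and the `fields` dictionary are my own and not part of any library.

```python
def extract_bits(value, offset, width):
    """Return `width` bits of `value`, starting `offset` bits from the right."""
    mask = (1 << width) - 1  # e.g. width=8 gives 0b11111111
    return (value >> offset) & mask


msgid = 0x9F50323
fields = {
    'priority': extract_bits(msgid, 26, 3),
    'extended_data_page': extract_bits(msgid, 25, 1),
    'data_page': extract_bits(msgid, 24, 1),
    'pdu_format': extract_bits(msgid, 16, 8),
    'pdu_specific': extract_bits(msgid, 8, 8),
    'source': extract_bits(msgid, 0, 8),
}
print(fields)
```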


#### 3.2 Assembling the PGN

As described above, the PGN is either:
  * the combination of the Extended Data Page, Data Page, and PDU Format, with the PDU Specific bits left as zeros, if the message is directed to a specific address--known as __PDU1__ format; or,
  * those same fields plus the PDU Specific if the message is broadcast to all addresses--known as __PDU2__ format.

Shift the extended data page 17 bits to the left, which leaves room (zeros) for the other fields to be added.

_Note: as far as I can tell, the extended data page is always 0 (as of this writing) so this move is somewhat insignificant._

In [12]:
extended_data_page_shifted = extended_data_page << 17
print(format(extended_data_page_shifted, '#020b'))

0b000000000000000000


Similarly, shift the data page by 16 bits and the PDU format by 8 bits.

In [13]:
data_page_shifted = data_page << 16
print(format(data_page_shifted, '#019b'))

0b10000000000000000


In [14]:
pdu_format_shifted = pdu_format << 8
print(format(pdu_format_shifted, '#018b'))

0b1111010100000000


##### 3.2.1 PDU1
Use PDU1 where the PDU Format is 239 or less.

_In this case the PDU Format is 245, so this cell prints nothing._

In [15]:
if pdu_format <= 239:
    pgn = extended_data_page_shifted + data_page_shifted + pdu_format_shifted
    print(format(pgn, '#020b'))

##### 3.2.2 PDU2
Use PDU2 where the PDU Format is 240 or more.

In [16]:
if pdu_format >= 240:
    pgn = extended_data_page_shifted + data_page_shifted + pdu_format_shifted + pdu_specific
    print(format(pgn, '#020b'))

0b011111010100000011
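
The same PGN can also be reached without shifting each field separately, since bits 8-25 of the identifier are already laid out as Extended Data Page, Data Page, PDU Format, and PDU Specific. A sketch of that shortcut, under the hypothetical name `pgn_from_msgid`, is below. For identifiers with a PDU Format below 240 it clears the low byte, since that byte is a destination address rather than part of the PGN. The second identifier is made up for illustration.

```python
def pgn_from_msgid(msgid):
    """Return the PGN encoded in a 29-bit message identifier."""
    pgn = (msgid >> 8) & 0x3FFFF   # 18 bits: EDP, DP, PDU Format, PDU Specific
    pdu_format = (pgn >> 8) & 0xFF
    if pdu_format < 240:
        pgn &= 0x3FF00             # PDU1: drop the destination byte
    return pgn


print(pgn_from_msgid(0x09F50323))  # sample identifier used above
print(pgn_from_msgid(0x18EA2300))  # made-up PDU1 identifier
```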


#### 3.3 A Complete Parsing Function
Combining the above, a function to parse the message id can be written as follows.

In [17]:
def parse_msgid(msgid):
    """Parses msgid into its component parts.

    Parameters
    ----------
    msgid: str
        msgid as hex encoded string.

    Returns: dict
        {
            priority,
            source,
            destination,
            pgn
        }

    """

    msgid = int(msgid, base=16)

    priority = (msgid >> 26) & 0b111
    extended_data_page = (msgid >> 25) & 0b1
    data_page = (msgid >> 24) & 0b1
    pdu_format = (msgid >> 16) & 0b11111111
    pdu_specific = (msgid >> 8) & 0b11111111
    source = msgid & 0b11111111

    # Assemble PGN
    extended_data_page_shifted = extended_data_page << 17
    data_page_shifted = data_page << 16
    pdu_format_shifted = pdu_format << 8

    if pdu_format < 240:
        # PDU1 format: PDU Specific is the destination, not part of the PGN
        pgn = extended_data_page_shifted + data_page_shifted + pdu_format_shifted
        destination = pdu_specific
    else:
        # PDU2 format: PDU Specific is part of the PGN
        pgn = extended_data_page_shifted + data_page_shifted + pdu_format_shifted + pdu_specific
        destination = 0b11111111  # Implies global destination

    return {
        'priority': priority,
        'source': source,
        'destination': destination,
        'pgn': pgn
    }


parse_msgid('9F50323')

{'priority': 2, 'source': 35, 'destination': 255, 'pgn': 128259}
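
Going the other way can be useful when composing messages to send with direction 'T'. The sketch below is an assumed inverse of the parsing steps above, under the hypothetical name `build_msgid`. Returning the identifier as eight hex digits is my assumption about the expected width, and the second call uses made-up values to exercise the addressed (PDU1) case.

```python
def build_msgid(priority, pgn, source, destination=0xFF):
    """Pack the fields into a 29-bit identifier, returned as a hex string."""
    pdu_format = (pgn >> 8) & 0xFF
    if pdu_format < 240:
        # PDU1: the low byte of the PGN slot carries the destination address
        pgn = (pgn & 0x3FF00) | (destination & 0xFF)
    msgid = ((priority & 0b111) << 26) | ((pgn & 0x3FFFF) << 8) | (source & 0xFF)
    return format(msgid, '08X')


print(build_msgid(priority=2, pgn=128259, source=35))
print(build_msgid(priority=6, pgn=0xEA00, source=0, destination=0x23))
```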

#### 3.4 Deciphering PGNs

NMEA has a publicly accessible PGN search, available at https://www.nmea.org/content/STANDARDS/nmea_2000_pgn__search.

Searching for the above extracted PGN, 128259, shows that the PGN is for "Speed" and further describes it as, _"The purpose of this PGN is to provide a single transmission that describes the motion of a vessel."_

### 4. Message Data Bytes
The values themselves, e.g., the speed, are stored in the eight message data bytes that follow the message id.

A full and canonical understanding of the message data bytes is only possible with access to the NMEA 2000 specification, which, as noted in the introduction, must be purchased and may not be disclosed afterward. That said, a fair amount of information has been publicly disclosed by NMEA or reverse engineered by others, notably by the CANboat project, also mentioned in the introduction.

One of the most useful documents publicly available on NMEA's website is a 2015 listing of PGNs along with their fields, marked v2.101. https://www.nmea.org/Assets/20151026%20nmea%202000%20pgn_website_description_list.pdf (Last accessed Dec. 28, 2021)

The NMEA PGN document enumerates six fields for PGN 128259 (Speed); the below table was reproduced from that document.


| Field # | Field Description           |
|:-------:|:----------------------------|
|       1 | Sequence ID                 |
|       2 | Speed Water Referenced      |
|       3 | Speed Ground Referenced     |
|       4 | Speed Water Referenced Type |
|       5 | Speed Direction             |
|       6 | NMEA Reserved               |

#### 4.1 Speed Water Referenced
The above linked PGN descriptions do not specify how many bytes a given field takes. However, the above table describes six fields while the message data below (PGN 128259 from an Airmar DST810) has eight bytes, so the fields cannot all be exactly one byte.

Looking at some sample data, it appears that only two of the bytes are changing. Presumably these must be associated with field two, Speed Water Referenced—it’s the only value that I believe should be changing.

In [18]:
msg_data = [
    'FF 00 00 FF FF 00 FF FF',
    'FF 17 00 FF FF 00 FF FF',
    'FF 25 00 FF FF 00 FF FF',
    'FF 4A 00 FF FF 00 FF FF',
    'FF 01 01 FF FF 00 FF FF',
    'FF 9A 01 FF FF 00 FF FF',
    'FF 15 02 FF FF 00 FF FF',
    'FF C2 02 FF FF 00 FF FF']

To understand the speed that the message data bytes are encoding, I sent the messages to a test NMEA 2000 network consisting of a Yacht Devices YDNU-02 NMEA 2000 USB Gateway and a B&G Triton display.  The speed values shown by the Triton display for each message are below.

_Note: Be careful that the display is not damping the output._

In [19]:
msg_data_w_values = {
    'FF 00 00 FF FF 00 FF FF': 0,
    'FF 17 00 FF FF 00 FF FF': 0.4,
    'FF 25 00 FF FF 00 FF FF': 0.7,
    'FF 4A 00 FF FF 00 FF FF': 1.4,
    'FF 01 01 FF FF 00 FF FF': 5.0,
    'FF 9A 01 FF FF 00 FF FF': 8.0,
    'FF 15 02 FF FF 00 FF FF': 10.4,
    'FF C2 02 FF FF 00 FF FF': 13.7}

The speed values shown on the Triton display are in nautical miles per hour (knots).  However, the message data bytes appear to use meters per second (m/s), as discussed in this issue on the canboatjs project, https://github.com/canboat/canboatjs/issues/172.

A nautical mile is 1,852 meters and an hour is 3,600 seconds.  As such, 1 m/s is equal to 3,600 / 1,852, or roughly 1.944 knots.

In [20]:
def ms_to_kn(ms):
    """Converts meters per second to knots"""
    METERS_IN_NM = 1852
    SECONDS_IN_HOUR = 3600
    kn_per_ms = (1 / METERS_IN_NM) * SECONDS_IN_HOUR  # knots in one m/s
    return ms * kn_per_ms


# A simple test to ensure ms_to_kn is returning the expected result
from math import isclose
assert isclose(ms_to_kn(1), 1.944, abs_tol=1e-3)

As described above, the speed water referenced appears to be encoded in bytes 1 and 2. Further experimentation appears to indicate that b1 is the least significant byte, i.e., the two bytes are read as the little-endian value `0x019A` in the example below.

| b0 |   b1   |   b2   | b3 | b4 | b5 | b6 | b7 |
|----|--------|--------|----|----|----|----|----|
| FF | __9A__ | __01__ | FF | FF | 00 | FF | FF |

_Note: the division by one hundred was added after seeing that the order of magnitude was off, which suggests the raw value is in hundredths of a meter per second._

In [21]:
ms_to_kn(0x019A) / 100

7.96976241900648
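
As a cross-check of the interpretation above (b1 as the least significant byte, raw value divided by one hundred), the sketch below decodes every sample and prints it next to the speed the Triton display showed. The payloads and displayed values are copied from the cells above; the names `samples` and `KNOTS_PER_MS` are just for this illustration.

```python
samples = {
    'FF 00 00 FF FF 00 FF FF': 0,
    'FF 17 00 FF FF 00 FF FF': 0.4,
    'FF 25 00 FF FF 00 FF FF': 0.7,
    'FF 4A 00 FF FF 00 FF FF': 1.4,
    'FF 01 01 FF FF 00 FF FF': 5.0,
    'FF 9A 01 FF FF 00 FF FF': 8.0,
    'FF 15 02 FF FF 00 FF FF': 10.4,
    'FF C2 02 FF FF 00 FF FF': 13.7,
}

KNOTS_PER_MS = 3600 / 1852  # knots in one meter per second

for payload, displayed in samples.items():
    data = bytes.fromhex(payload)
    raw = int.from_bytes(data[1:3], byteorder='little')  # b1 is the low byte
    knots = raw / 100 * KNOTS_PER_MS
    print(f'{payload}  raw={raw:4d}  {knots:5.2f} kn  (display: {displayed})')
```

Each decoded value should agree with the display to within its one-decimal rounding.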

#### 4.2 Other Fields
The other fields are seemingly of less interest, and I have not made the effort to independently reverse engineer them.  That said, the CANboat project shows that a 0 in byte five indicates a "Paddle wheel," which would be correct in the case of the DST810 sensor. https://github.com/canboat/canboat/blob/master/analyzer/pgns.json
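
For completeness, here is a sketch that unpacks all eight bytes at once. The byte layout is an assumption based on my reading of CANboat's `pgns.json` (one byte of sequence ID, two 16-bit little-endian speeds, one byte for the type, then the direction and reserved bits). The convention that an all-ones value means "not available" is also an assumption. Neither has been verified against the NMEA 2000 specification, and the function name `decode_speed_payload` is hypothetical.

```python
import struct


def decode_speed_payload(payload_hex):
    """Decode a PGN 128259 payload under the assumed layout described above."""
    sid, water, ground, water_type, tail = struct.unpack(
        '<BHHBH', bytes.fromhex(payload_hex))

    def speed(raw):
        # Assumption: 0xFFFF marks "not available"; otherwise 0.01 m/s units
        return None if raw == 0xFFFF else raw / 100

    return {
        'sid': sid,
        'speed_water_ms': speed(water),
        'speed_ground_ms': speed(ground),
        'speed_water_type': water_type,   # 0 is "Paddle wheel" per CANboat
        'speed_direction': tail & 0x0F,   # assumed to be the low 4 bits of b6
    }


print(decode_speed_payload('FF 9A 01 FF FF 00 FF FF'))
```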