DSU Protocol Implementation

hifihedgehog edited this page Mar 19, 2026 · 8 revisions

PadForge implements the cemuhook DSU (DualShock UDP) protocol to stream controller motion data (gyroscope and accelerometer) to emulators. The server is compatible with Cemu, Dolphin, Yuzu, Ryujinx, and any client supporting the cemuhook protocol.

File: PadForge.App/Services/DsuMotionServer.cs
Namespace: PadForge.Services
Protocol Spec: https://github.com/v1993/cemuhook-protocol


MotionSnapshot

public struct MotionSnapshot
{
    public float AccelX, AccelY, AccelZ;
    public float GyroPitch, GyroYaw, GyroRoll;
    public long TimestampUs;
    public bool HasMotion;
}

Snapshot of a single slot's motion data, ready for DSU transmission. Units are already converted to DSU conventions:

  • Accel: g-force (1g = 9.80665 m/s^2)
  • Gyro: degrees per second

SDL-to-DSU Axis Mapping

SDL uses a right-handed coordinate system. The DS4/DSU protocol expects specific sign conventions. The mapping was derived from the Switch Pro Controller's known-working BetterJoy-to-DSU mapping, translated through SDL standard coordinates, and verified with DualSense. All axes match the DS4/DSU convention across both DualSense and Switch 2 Pro Controller.

DSU Field  SDL Source  Sign               Derivation
AccelX     ax          Inverted (-ax)     DS4 X-accel is opposite to SDL X
AccelY     ay          Inverted (-ay)     DS4 Y-accel is opposite to SDL Y
AccelZ     az          Inverted (-az)     DS4 Z-accel is opposite to SDL Z
GyroPitch  gx          Inverted (-gx)     DS4 pitch is opposite to SDL gyro X
GyroYaw    gy          Not inverted (gy)  Same sign in both coordinate systems
GyroRoll   gz          Inverted (-gz)     DS4 roll is opposite to SDL gyro Z

Key insight: Accel and gyro must be in the same coordinate frame. AccelX and GyroPitch must reference the same physical axis. Five of six axes are inverted; only GyroYaw preserves sign.
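The sign table can be expressed as a small conversion routine. This is a minimal sketch, not the PadForge implementation: the class name SdlToDsuSketch and the method Convert are hypothetical, and it assumes SDL delivers accelerometer values in m/s^2 and gyroscope values in rad/s, which are then scaled to the DSU units listed above.

```csharp
using System;

public struct MotionSnapshot
{
    public float AccelX, AccelY, AccelZ;
    public float GyroPitch, GyroYaw, GyroRoll;
    public long TimestampUs;
    public bool HasMotion;
}

// Hypothetical helper, for illustration only.
public static class SdlToDsuSketch
{
    private const float StandardGravity = 9.80665f;           // m/s^2 per 1 g
    private const float RadToDeg = (float)(180.0 / Math.PI);  // rad/s -> deg/s

    // Assumption: ax/ay/az are in m/s^2 and gx/gy/gz are in rad/s.
    public static MotionSnapshot Convert(float ax, float ay, float az,
                                         float gx, float gy, float gz,
                                         long timestampUs)
    {
        return new MotionSnapshot
        {
            AccelX = -ax / StandardGravity,   // inverted
            AccelY = -ay / StandardGravity,   // inverted
            AccelZ = -az / StandardGravity,   // inverted
            GyroPitch = -gx * RadToDeg,       // inverted
            GyroYaw = gy * RadToDeg,          // sign preserved
            GyroRoll = -gz * RadToDeg,        // inverted
            TimestampUs = timestampUs,
            HasMotion = true
        };
    }
}
```

With these assumptions, an SDL reading of +9.80665 m/s^2 on Y becomes AccelY = -1 g, and a yaw rate of pi rad/s becomes GyroYaw of roughly 180 deg/s.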


DsuMotionServer

public sealed class DsuMotionServer : IDisposable

Constants

Constant               Value       Description
MaxSlots               4           DSU protocol slot limit (PadForge slots 4-15 skip DSU broadcast)
ProtocolVersion        1001        cemuhook protocol version
HeaderSize             16          Packet header size in bytes
MsgTypeVersion         0x100000    Server -> client: version response
MsgTypeControllerInfo  0x100001    Server -> client: controller info
MsgTypePadData         0x100002    Server -> client: pad data with motion
ClientTimeoutMs        5000        Client subscription expiry (5 seconds)
SIO_UDP_CONNRESET      0x9800000C  IOControl to suppress ICMP port-unreachable resets

Instance State

Field                  Type                               Description
_socket                Socket                             UDP socket bound to IPAddress.Loopback
_receiveThread         Thread                             Background receive loop thread
_running               volatile bool                      Server running flag
_serverId              uint                               Unique server ID (from Environment.TickCount)
_port                  int                                Listening port
_packetCounters        uint[4]                            Per-slot packet counter (incremented per broadcast)
_subscriptions         Dictionary<(EndPoint, int), long>  Per-slot client subscriptions with Stopwatch.GetTimestamp()
_allSlotSubscriptions  Dictionary<EndPoint, long>         All-slot client subscriptions with timestamp
_slotConnected         bool[4]                            Per-slot connection state reported to clients
_slotHasMotion         bool[4]                            Per-slot motion capability (device has gyro/accel sensors)
_disposed              bool                               Dispose guard

Events

public event EventHandler<string> StatusChanged;

Raised when server status changes. Localized values from Strings.Instance:

  • "Listening on :{port}" -- successful bind
  • "Port {port} in use" -- SocketError.AddressAlreadyInUse
  • "Failed to start" -- other exceptions
  • "Stopped" -- after Stop()

Request/Response Flow

sequenceDiagram
    participant Client as Emulator (Client)
    participant Server as PadForge (Server)
    participant Poll as Polling Thread

    Note over Client,Server: All packets: DSUC (client) / DSUS (server) magic + CRC32

    Client->>Server: Version Request (0x100000)
    Server->>Client: Version Response (version=1001)

    Client->>Server: Controller Info Request (0x100001)<br/>slots=[0,1,2,3]
    Server->>Client: Controller Info (slot 0, connected, full gyro)
    Server->>Client: Controller Info (slot 1, not connected)
    Server->>Client: Controller Info (slot 2, not connected)
    Server->>Client: Controller Info (slot 3, not connected)

    Client->>Server: Pad Data Request (0x100002)<br/>flags=0x01, slot=0<br/>(subscribes to slot 0)

    loop Every ~1ms (1000Hz)
        Poll->>Server: BroadcastMotion(slot=0, snapshot, connected=true)
        Server->>Server: GetSubscribers(0) -> [Client]
        Server->>Client: Pad Data (slot 0, motion data)
    end

    Note over Client,Server: Subscription expires after 5 seconds<br/>Client must re-subscribe

Lifecycle

Start()

public bool Start(int port = 26760)
  1. Returns true immediately if already running (_running == true).
  2. Sets _port and generates _serverId from Environment.TickCount.
  3. Creates Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp).
  4. Applies the SIO_UDP_CONNRESET IOControl (_socket.IOControl(0x9800000C, new byte[4], null)) so that an ICMP port-unreachable reply to an earlier send does not raise a SocketException on the next ReceiveFrom. Exceptions from this call are caught and ignored (non-Windows or older OS).
  5. Binds to new IPEndPoint(IPAddress.Loopback, port).
  6. Sets _running = true.
  7. Creates and starts background receive thread:
    • Name: "PadForge.DsuServer"
    • IsBackground = true (does not prevent process exit)
  8. Raises StatusChanged with listening message.
  9. Returns true on success.

Error handling:

  • SocketException with AddressAlreadyInUse: disposes socket, raises StatusChanged with port-in-use message, returns false.
  • Any other exception: disposes socket, raises StatusChanged with failure message, returns false.
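The socket setup and error paths above can be sketched as follows. This is an illustrative outline rather than the actual Start() body: DsuSocketSketch and TryBind are hypothetical names, the status strings are the unlocalized examples from the Events section, and the IOControl code is cast with unchecked because the IOControl(int, byte[], byte[]) overload takes a signed int.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper, for illustration only.
public static class DsuSocketSketch
{
    // Same bit pattern as SIO_UDP_CONNRESET (0x9800000C), as a signed int.
    private const int SioUdpConnReset = unchecked((int)0x9800000C);

    // Returns a bound socket, or null with a status message on failure.
    public static Socket TryBind(int port, out string status)
    {
        Socket socket = null;
        try
        {
            socket = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);

            // Four zero bytes = FALSE: do not report ICMP port-unreachable as a reset.
            try { socket.IOControl(SioUdpConnReset, new byte[4], null); }
            catch { /* unsupported on this platform; safe to ignore */ }

            socket.Bind(new IPEndPoint(IPAddress.Loopback, port));
            status = $"Listening on :{port}";
            return socket;
        }
        catch (SocketException ex) when (ex.SocketErrorCode == SocketError.AddressAlreadyInUse)
        {
            socket?.Dispose();
            status = $"Port {port} in use";
            return null;
        }
        catch (Exception)
        {
            socket?.Dispose();
            status = "Failed to start";
            return null;
        }
    }
}
```

Binding before setting any running flag mirrors the ordering in the steps above, so a failed bind never leaves the server marked as running.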

Stop()

public void Stop()
  1. Returns immediately if !_running.
  2. Sets _running = false.
  3. Closes socket (_socket?.Close(), exception caught).
  4. Joins receive thread with 2-second timeout (_receiveThread?.Join(2000)).
  5. Sets _receiveThread = null, _socket = null.
  6. Under lock(_subscriptions): clears both _subscriptions and _allSlotSubscriptions.
  7. Zeroes all _packetCounters.
  8. Raises StatusChanged with stopped message.

Dispose()

public void Dispose()

Calls Stop() if not already disposed. Sets _disposed = true. Not thread-safe (no lock on _disposed).


Public API

BroadcastMotion()

public void BroadcastMotion(int slot, MotionSnapshot snapshot, bool connected)

Called from the InputManager polling thread at ~1000Hz. This is the primary data path.

  1. Guard: returns immediately if !_running, _socket == null, or slot is out of range [0, MaxSlots).
  2. Updates _slotConnected[slot] and _slotHasMotion[slot] from parameters.
  3. Calls GetSubscribers(slot) to get active subscribers.
  4. Returns immediately if no subscribers (no packet allocation).
  5. Calls BuildPadDataPacket(slot, snapshot, connected) to construct the packet.
  6. Sends packet to each subscriber via _socket.SendTo(packet, ep). Exceptions are caught silently (client gone, will timeout).

Performance optimization: Packet allocation and serialization only happen when there are active subscribers. At 1000Hz with no subscribers, the method returns right after the GetSubscribers() call.


Packet Format

Header (16 bytes, all messages)

All packets (client and server) share this header format:

Byte offset  Size  Field              Notes
[0..3]       4     Magic              "DSUS" (server->client) or "DSUC" (client->server)
[4..5]       2     Protocol version   Little-endian uint16, value 1001
[6..7]       2     Payload length     Little-endian uint16, excludes header (16 bytes)
[8..11]      4     CRC32              Little-endian uint32, zeroed before computation
[12..15]     4     ID                 Server ID (server->client) or Client ID (client->server)

private void WriteHeader(byte[] packet, int payloadLength, uint msgType)

Writes the 16-byte header with "DSUS" magic, protocol version, payload length, and server ID. The CRC32 field is left zeroed (filled later by FinalizeCrc). Also writes the message type as the first 4 bytes of the payload (at offset HeaderSize).
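A header writer along these lines might look like the sketch below. It is not the PadForge source: DsuHeaderSketch is a hypothetical static class, and the server ID is passed as a parameter (the real method reads it from instance state).

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, for illustration only.
public static class DsuHeaderSketch
{
    public const int HeaderSize = 16;
    public const ushort ProtocolVersion = 1001;

    // serverId is a parameter here; this is an assumption made to keep the sketch static.
    public static void WriteHeader(byte[] packet, int payloadLength, uint msgType, uint serverId)
    {
        // [0..3] magic "DSUS"
        packet[0] = (byte)'D';
        packet[1] = (byte)'S';
        packet[2] = (byte)'U';
        packet[3] = (byte)'S';

        // [4..5] protocol version, [6..7] payload length (header excluded)
        BinaryPrimitives.WriteUInt16LittleEndian(packet.AsSpan(4, 2), ProtocolVersion);
        BinaryPrimitives.WriteUInt16LittleEndian(packet.AsSpan(6, 2), (ushort)payloadLength);

        // [8..11] CRC32 stays zero until the packet is finalized
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(8, 4), 0);

        // [12..15] server ID
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(12, 4), serverId);

        // First 4 payload bytes: message type
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(HeaderSize, 4), msgType);
    }
}
```

For example, WriteHeader(new byte[24], 8, 0x100000, id) fills everything except the CRC and the bytes that follow the message type.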

Message Types

Version Request (Client -> Server, type 0x100000)

No additional payload beyond the message type.

Byte offset  Size  Field
[0..15]      16    Header (magic="DSUC", CRC, clientId)
[16..19]     4     Message type: 0x100000

Total packet: 20 bytes. Payload length: 4 bytes.

Version Response (Server -> Client, type 0x100000)

Byte offset  Size  Field
[0..15]      16    Header (magic="DSUS", CRC, serverId)
[16..19]     4     Message type: 0x100000
[20..21]     2     Protocol version: 1001
[22..23]     2     Padding (zero)

Total packet: 24 bytes. Payload length: 8 bytes.

Controller Info Request (Client -> Server, type 0x100001)

Byte offset  Size  Field
[0..15]      16    Header
[16..19]     4     Message type: 0x100001
[20..23]     4     Number of ports requested (int32 LE)
[24..23+N]   N     Slot indices (one byte each, N = number of ports requested)

Validated: numPorts must be in [0, MaxSlots] and packet must contain enough bytes for all slot indices. Each valid slot triggers a SendControllerInfo() response.
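The validation described above could be sketched like this. ControllerInfoRequestSketch and ParseSlots are hypothetical names, and skipping slot indices outside 0-3 is an assumption about what "each valid slot" means; the real handler sends a response per slot rather than returning a list.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class ControllerInfoRequestSketch
{
    public const int HeaderSize = 16;
    public const int MaxSlots = 4;

    // Returns the requested slot indices, or an empty list if validation fails.
    public static List<int> ParseSlots(byte[] data, int length)
    {
        var slots = new List<int>();
        const int countOffset = HeaderSize + 4;   // 20: port count (int32 LE)
        const int slotsOffset = countOffset + 4;  // 24: first slot index byte

        if (length < slotsOffset) return slots;

        int numPorts = BinaryPrimitives.ReadInt32LittleEndian(data.AsSpan(countOffset, 4));
        if (numPorts < 0 || numPorts > MaxSlots) return slots;

        // The packet must actually contain one byte per requested port.
        if (length < slotsOffset + numPorts) return slots;

        for (int i = 0; i < numPorts; i++)
        {
            int slot = data[slotsOffset + i];
            if (slot < MaxSlots) slots.Add(slot);  // assumption: out-of-range indices are skipped
        }
        return slots;
    }
}
```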

Controller Info Response (Server -> Client, type 0x100001)

One response sent per requested slot.

Byte offset  Size  Field              Value
[0..15]      16    Header             magic="DSUS"
[16..19]     4     Message type       0x100001
[20]         1     Slot number        0-3
[21]         1     Slot state         0=not connected, 2=connected
[22]         1     Device model       0=N/A, 2=full gyro
[23]         1     Connection type    0=N/A
[24..29]     6     MAC address        00:00:00:00:00:{slot}
[30]         1     Battery status     0x05 (charged)
[31]         1     Padding            0x00

Total packet: 32 bytes. Payload length: 16 bytes.

MAC address: Fake but unique per slot. The last byte is the slot number (0-3), all others are 0x00.

Device model: Set to 2 (full gyro) when _slotHasMotion[slot] is true, 0 otherwise.
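As a rough illustration of the layout, the 12 data bytes at [20..31] could be filled as below. ControllerInfoBodySketch and Build are hypothetical names; the 16-byte header, the message type, and the CRC are deliberately left zeroed here and would be written by the header and CRC routines.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ControllerInfoBodySketch
{
    public const int HeaderSize = 16;
    public const int PacketSize = 32;

    // Fills only bytes [20..31]; header, message type and CRC are not written here.
    public static byte[] Build(int slot, bool connected, bool hasMotion)
    {
        var packet = new byte[PacketSize];
        int o = HeaderSize + 4;                        // 20: first byte after the message type

        packet[o + 0] = (byte)slot;                    // slot number 0-3
        packet[o + 1] = (byte)(connected ? 2 : 0);     // slot state
        packet[o + 2] = (byte)(hasMotion ? 2 : 0);     // device model: 2 = full gyro
        packet[o + 3] = 0;                             // connection type: N/A
        // MAC 00:00:00:00:00:{slot}: first five bytes stay zero
        packet[o + 9] = (byte)slot;                    // last MAC byte at absolute offset 29
        packet[o + 10] = 0x05;                         // battery status: charged
        packet[o + 11] = 0x00;                         // padding
        return packet;
    }
}
```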

Pad Data Request / Subscription (Client -> Server, type 0x100002)

Byte offset  Size  Field
[0..15]      16    Header
[16..19]     4     Message type: 0x100002
[20]         1     Flags (subscription mode)
[21]         1     Slot number
[22..27]     6     MAC address

Validated: packet must be at least HeaderSize + 12 (28) bytes.

Subscription flags:

Flag Value  Behavior
0x00        Subscribe to ALL pads (stored in _allSlotSubscriptions)
0x01        Subscribe to specific slot by ID (stored in _subscriptions[(endpoint, slot)])
0x02        Subscribe by MAC (treated as all-slot subscription)
0x03        Both 0x01 and 0x02 (subscribe to specific slot AND all-slot)
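The flag handling can be sketched as a pair of bit tests. SubscriptionTableSketch and Register are hypothetical names, the count properties exist only so the behavior can be observed, and the slot range check is an assumption not stated above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;

// Hypothetical helper, for illustration only.
public sealed class SubscriptionTableSketch
{
    private const int MaxSlots = 4;
    private readonly Dictionary<(EndPoint, int), long> _subscriptions = new();
    private readonly Dictionary<EndPoint, long> _allSlotSubscriptions = new();

    public int PerSlotCount { get { lock (_subscriptions) return _subscriptions.Count; } }
    public int AllSlotCount { get { lock (_subscriptions) return _allSlotSubscriptions.Count; } }

    public void Register(EndPoint sender, byte flags, int slot)
    {
        long now = Stopwatch.GetTimestamp();
        lock (_subscriptions)
        {
            // 0x00 (all pads) and 0x02 (by MAC) both become all-slot subscriptions.
            if (flags == 0 || (flags & 0x02) != 0)
                _allSlotSubscriptions[sender] = now;

            // 0x01: per-slot subscription. Range check is an assumption of this sketch.
            if ((flags & 0x01) != 0 && slot >= 0 && slot < MaxSlots)
                _subscriptions[(sender, slot)] = now;
        }
    }
}
```

Re-registering the same endpoint overwrites the stored timestamp, which is how a periodic re-send from the client refreshes its subscription.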

Pad Data Response (Server -> Client, type 0x100002)

The largest message type. Contains controller info header, button state, analog inputs, touch data, and motion sensor data. PadForge is a motion-only server: button bitmasks, D-pad, analog buttons, and touch data are all zeroed, and both analog sticks are reported centered at 128.

Byte offset  Size  Field                    Value / Notes
─────────── ───── ──────────────────────── ─────────────────────────────────
[0..15]      16    Header                   magic="DSUS"
[16..19]     4     Message type             0x100002
[20]         1     Slot number              0-3
[21]         1     Slot state               0=disconnected, 2=connected
[22]         1     Device model             0=N/A, 2=full gyro
[23]         1     Connection type          0=N/A
[24..29]     6     MAC address              00:00:00:00:00:{slot}
[30]         1     Battery status           0x05 (charged)
[31]         1     Connected flag           1=connected, 0=disconnected
[32..35]     4     Packet counter           uint32 LE, incremented per packet
[36]         1     Buttons bitmask 1        0x00 (zeroed, motion-only)
[37]         1     Buttons bitmask 2        0x00 (zeroed)
[38]         1     Home button              0x00 (zeroed)
[39]         1     Touch button             0x00 (zeroed)
[40]         1     Left stick X             128 (centered)
[41]         1     Left stick Y             128 (centered)
[42]         1     Right stick X            128 (centered)
[43]         1     Right stick Y            128 (centered)
[44..47]     4     Analog D-Pad             0x00 (L, D, R, U)
[48..55]     8     Analog buttons           0x00 (8 bytes)
[56..61]     6     Touch 1 data             0x00 (active, id, x16, y16)
[62..67]     6     Touch 2 data             0x00
[68..75]     8     Motion timestamp         int64 LE, microseconds
[76..79]     4     Accel X                  float LE (g-force)
[80..83]     4     Accel Y                  float LE
[84..87]     4     Accel Z                  float LE
[88..91]     4     Gyro Pitch               float LE (deg/s)
[92..95]     4     Gyro Yaw                 float LE
[96..99]     4     Gyro Roll                float LE

Total packet: 100 bytes (16 header + 84 payload). Payload: 4 bytes message type + 80 bytes data = 84 bytes.

Code offset mapping: In BuildPadDataPacket, the variable o = HeaderSize + 4 (= 20) is the base offset for data after the message type. The comment-level offsets in the code ([+0] slot, [+48] timestamp, [+56] accelX) are relative to o. To get the absolute byte offset in the packet: o + relative = 20 + relative. To get the payload offset: 4 + relative.

Field             Code relative (o+N)  Absolute byte  Payload offset
Slot              o + 0                20             +4
Packet counter    o + 12               32             +16
Left stick X      o + 20               40             +24
Motion timestamp  o + 48               68             +52
Accel X           o + 56               76             +60
Accel Y           o + 60               80             +64
Accel Z           o + 64               84             +68
Gyro Pitch        o + 68               88             +72
Gyro Yaw          o + 72               92             +76
Gyro Roll         o + 76               96             +80

Motion timestamp: Written as Int64 (8 bytes) at o + 48, which spans absolute bytes [68..75]. The cemuhook protocol spec defines this field as an unsigned 64-bit microsecond timestamp; for non-negative values the Int64 and UInt64 encodings are byte-identical.

Float encoding: All accelerometer and gyroscope values are IEEE 754 single-precision floats written in little-endian byte order via BinaryPrimitives.WriteSingleLittleEndian.
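To make the offset arithmetic concrete, the sketch below writes only the stick, timestamp, and motion fields of a 100-byte pad data packet using the o-relative offsets. PadDataMotionSketch and BuildMotionOnlyBody are hypothetical names; the header, slot info, packet counter, and CRC are omitted and remain zero.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, for illustration only.
public static class PadDataMotionSketch
{
    public const int HeaderSize = 16;
    public const int PacketSize = 100;

    // Writes sticks, timestamp and motion floats; everything else is left zeroed.
    public static byte[] BuildMotionOnlyBody(long timestampUs,
                                             float accelX, float accelY, float accelZ,
                                             float gyroPitch, float gyroYaw, float gyroRoll)
    {
        var packet = new byte[PacketSize];
        int o = HeaderSize + 4;                 // 20: base offset after the message type

        // Sticks centered at 128 (absolute bytes 40..43)
        for (int i = 0; i < 4; i++)
            packet[o + 20 + i] = 128;

        // Timestamp: 8 bytes at absolute 68..75
        BinaryPrimitives.WriteInt64LittleEndian(packet.AsSpan(o + 48, 8), timestampUs);

        // Six little-endian floats at absolute 76..99
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 56, 4), accelX);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 60, 4), accelY);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 64, 4), accelZ);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 68, 4), gyroPitch);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 72, 4), gyroYaw);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 76, 4), gyroRoll);
        return packet;
    }
}
```

The last float ends at o + 79 = 99, which matches the 100-byte total.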


Subscription Management

private List<EndPoint> GetSubscribers(int slot)

Returns a list of endpoints subscribed to the given slot. Called from BroadcastMotion() at ~1000Hz per active slot.

Data Structures

Two subscription dictionaries, both protected by lock(_subscriptions):

Dictionary             Key                    Value                     Populated By
_subscriptions         (EndPoint, slotIndex)  Stopwatch.GetTimestamp()  Pad data request with flags & 0x01
_allSlotSubscriptions  EndPoint               Stopwatch.GetTimestamp()  Pad data request with flags == 0 or flags & 0x02

Subscriber Resolution Algorithm

  1. Acquire lock(_subscriptions).
  2. Compute timeoutTicks = Stopwatch.Frequency * ClientTimeoutMs / 1000 (5-second timeout in high-resolution ticks).
  3. Per-slot subscribers: Iterate _subscriptions for entries matching the requested slot:
    • If now - timestamp > timeoutTicks: add to expired list for later removal.
    • Otherwise: add endpoint to result list and seen HashSet.
  4. All-slot subscribers: Iterate _allSlotSubscriptions:
    • If expired: add to expiredAll list.
    • Otherwise: add endpoint to result if not already in seen (prevents duplicates when a client has both per-slot and all-slot subscriptions).
  5. Prune expired: Remove all entries from expired and expiredAll lists from their respective dictionaries.
  6. Release lock and return result list.
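The six steps above can be sketched as follows. This is an approximation for illustration: SubscriberResolverSketch, AddPerSlot, AddAllSlot, and TotalEntries are hypothetical, and the current time is passed in as a parameter (instead of calling Stopwatch.GetTimestamp() inside) so that expiry can be exercised deterministically.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;

// Hypothetical helper, for illustration only.
public sealed class SubscriberResolverSketch
{
    private const int ClientTimeoutMs = 5000;
    private readonly Dictionary<(EndPoint, int), long> _subscriptions = new();
    private readonly Dictionary<EndPoint, long> _allSlotSubscriptions = new();

    public int TotalEntries
    {
        get { lock (_subscriptions) return _subscriptions.Count + _allSlotSubscriptions.Count; }
    }

    public void AddPerSlot(EndPoint ep, int slot, long timestamp)
    {
        lock (_subscriptions) _subscriptions[(ep, slot)] = timestamp;
    }

    public void AddAllSlot(EndPoint ep, long timestamp)
    {
        lock (_subscriptions) _allSlotSubscriptions[ep] = timestamp;
    }

    // 'now' is a parameter in this sketch only.
    public List<EndPoint> GetSubscribers(int slot, long now)
    {
        long timeoutTicks = Stopwatch.Frequency * ClientTimeoutMs / 1000;
        var result = new List<EndPoint>();
        var seen = new HashSet<EndPoint>();
        var expired = new List<(EndPoint, int)>();
        var expiredAll = new List<EndPoint>();

        lock (_subscriptions)
        {
            // Per-slot subscribers for the requested slot
            foreach (var kv in _subscriptions)
            {
                if (kv.Key.Item2 != slot) continue;
                if (now - kv.Value > timeoutTicks) expired.Add(kv.Key);
                else if (seen.Add(kv.Key.Item1)) result.Add(kv.Key.Item1);
            }

            // All-slot subscribers, skipping endpoints already added
            foreach (var kv in _allSlotSubscriptions)
            {
                if (now - kv.Value > timeoutTicks) expiredAll.Add(kv.Key);
                else if (seen.Add(kv.Key)) result.Add(kv.Key);
            }

            // Prune after iteration; removing during enumeration is not safe.
            foreach (var key in expired) _subscriptions.Remove(key);
            foreach (var ep in expiredAll) _allSlotSubscriptions.Remove(ep);
        }
        return result;
    }
}
```

Collecting expired keys first and removing them afterwards avoids modifying a dictionary while it is being enumerated.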

Expiration

Subscriptions expire after ClientTimeoutMs (5000 ms). Clients must periodically re-send pad data requests to stay subscribed. Expired entries are pruned lazily during GetSubscribers() iteration -- there is no background cleanup thread.

Timestamps use Stopwatch.GetTimestamp() (high-resolution performance counter) rather than DateTime.Now for sub-millisecond precision.


CRC32

Implementation

private static readonly uint[] Crc32Table = GenerateCrc32Table();

private static uint[] GenerateCrc32Table()
{
    var table = new uint[256];
    for (uint i = 0; i < 256; i++)
    {
        uint entry = i;
        for (int j = 0; j < 8; j++)
            entry = (entry & 1) != 0 ? (entry >> 1) ^ 0xEDB88320 : entry >> 1;
        table[i] = entry;
    }
    return table;
}

Standard CRC32 with reflected polynomial 0xEDB88320 (bit-reversed form of 0x04C11DB7). The 256-entry lookup table is generated once at static initialization.

ComputeCrc32()

private static uint ComputeCrc32(byte[] data, int length)
{
    uint crc = 0xFFFFFFFF;
    for (int i = 0; i < length; i++)
        crc = (crc >> 8) ^ Crc32Table[(crc ^ data[i]) & 0xFF];
    return crc ^ 0xFFFFFFFF;
}

Standard table-driven CRC32: initial value 0xFFFFFFFF, final XOR with 0xFFFFFFFF. Processes only the first length bytes, not the full array.

FinalizeCrc()

private static void FinalizeCrc(byte[] packet)

To compute outgoing CRC:

  1. Zero the CRC field at packet[8..11].
  2. Compute CRC over the entire packet (packet.Length).
  3. Write the CRC back to packet[8..11] via BinaryPrimitives.WriteUInt32LittleEndian.

To verify incoming CRC (in ProcessPacket):

  1. Read CRC from data[8..11].
  2. Zero data[8..11].
  3. Compute CRC over HeaderSize + payloadLength bytes.
  4. Compare computed CRC against received CRC. Reject if mismatch.
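Both procedures can be sketched together as below. DsuCrcSketch and VerifyCrc are hypothetical names (the source performs verification inline in ProcessPacket), and the table and Compute routine restate the CRC32 code shown earlier so the block stands alone.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, for illustration only.
public static class DsuCrcSketch
{
    private static readonly uint[] Table = BuildTable();

    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint entry = i;
            for (int j = 0; j < 8; j++)
                entry = (entry & 1) != 0 ? (entry >> 1) ^ 0xEDB88320 : entry >> 1;
            table[i] = entry;
        }
        return table;
    }

    public static uint Compute(byte[] data, int length)
    {
        uint crc = 0xFFFFFFFF;
        for (int i = 0; i < length; i++)
            crc = (crc >> 8) ^ Table[(crc ^ data[i]) & 0xFF];
        return crc ^ 0xFFFFFFFF;
    }

    // Outgoing: zero the field, hash the whole packet, store the result.
    public static void FinalizeCrc(byte[] packet)
    {
        packet.AsSpan(8, 4).Clear();
        uint crc = Compute(packet, packet.Length);
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(8, 4), crc);
    }

    // Incoming: totalLength is HeaderSize + payloadLength.
    // Note: this zeroes the CRC field in the caller's buffer as a side effect.
    public static bool VerifyCrc(byte[] data, int totalLength)
    {
        uint received = BinaryPrimitives.ReadUInt32LittleEndian(data.AsSpan(8, 4));
        data.AsSpan(8, 4).Clear();
        return Compute(data, totalLength) == received;
    }
}
```

As a sanity check, this CRC32 variant yields 0xCBF43926 for the ASCII string "123456789", the standard check value for the algorithm.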

Receive Loop

private void ReceiveLoop()

Background thread with name "PadForge.DsuServer" and IsBackground = true. Loops ReceiveFrom() with a 1024-byte buffer until _running is false.

Packet Validation Pipeline

Each received packet passes through 5 validation stages in ProcessPacket():

Stage  Check             Reject Condition
1      Minimum size      received < HeaderSize + 4 (20 bytes)
2      Magic bytes       Not "DSUC" (bytes D, S, U, C)
3      Protocol version  version > ProtocolVersion (1001)
4      Payload length    HeaderSize + payloadLength > received
5      CRC32             Computed CRC does not match received CRC
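Stages 1 through 4 are simple bounds and field checks and can be sketched as a single function; stage 5 (CRC32) is left out to keep the example short. PacketValidationSketch, Validate, and the RejectReason enum are hypothetical names, since the source only states that failing packets are rejected.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical result type, for illustration only.
public enum RejectReason { None, TooShort, BadMagic, UnsupportedVersion, TruncatedPayload }

// Hypothetical helper, for illustration only.
public static class PacketValidationSketch
{
    public const int HeaderSize = 16;
    public const ushort ProtocolVersion = 1001;

    public static RejectReason Validate(byte[] data, int received)
    {
        // Stage 1: header plus 4-byte message type
        if (received < HeaderSize + 4) return RejectReason.TooShort;

        // Stage 2: client magic "DSUC"
        if (data[0] != (byte)'D' || data[1] != (byte)'S' ||
            data[2] != (byte)'U' || data[3] != (byte)'C')
            return RejectReason.BadMagic;

        // Stage 3: reject versions newer than the server's
        ushort version = BinaryPrimitives.ReadUInt16LittleEndian(data.AsSpan(4, 2));
        if (version > ProtocolVersion) return RejectReason.UnsupportedVersion;

        // Stage 4: declared payload must fit in what was received
        ushort payloadLength = BinaryPrimitives.ReadUInt16LittleEndian(data.AsSpan(6, 2));
        if (HeaderSize + payloadLength > received) return RejectReason.TruncatedPayload;

        // Stage 5 (CRC32) would run here.
        return RejectReason.None;
    }
}
```

Ordering the cheap checks first means a malformed packet is discarded before any CRC work is done.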

After validation, dispatches based on message type:

Message Type  Handler
0x100000      HandleVersionRequest(sender)
0x100001      HandleControllerInfoRequest(data, length, sender)
0x100002      HandlePadDataRequest(data, length, sender)

Exception Handling

Exception                       When                               Action
SocketException when !_running  Socket closed during shutdown      break (exit loop)
ObjectDisposedException         Socket object disposed             break (exit loop)
Any other exception             Malformed packet, transient error  Silently caught, continue loop

Threading Model

graph LR
    subgraph "Receive Thread"
        RT["ReceiveLoop()<br/>(background thread)"]
        RT --> PV["ProcessPacket<br/>Validate + dispatch"]
        PV --> HV["HandleVersionRequest"]
        PV --> HC["HandleControllerInfoRequest"]
        PV --> HP["HandlePadDataRequest<br/>(updates subscriptions)"]
    end

    subgraph "Polling Thread (InputManager)"
        PT["~1000Hz polling loop"]
        PT --> BM["BroadcastMotion()"]
        BM --> GS["GetSubscribers()"]
        GS --> BP["BuildPadDataPacket()"]
        BP --> ST["socket.SendTo()"]
    end

    HP -.->|"lock(_subscriptions)"| SUB["_subscriptions<br/>_allSlotSubscriptions"]
    GS -.->|"lock(_subscriptions)"| SUB

Aspect                                 Mechanism
_subscriptions, _allSlotSubscriptions  Protected by lock(_subscriptions) -- both read (GetSubscribers) and write (HandlePadDataRequest) acquire the same lock
_running flag                          volatile bool -- no lock needed; volatile reads have acquire semantics and writes have release semantics, so the receive thread observes the write made by Stop()
_slotConnected, _slotHasMotion         Written by polling thread (BroadcastMotion), read by receive thread (SendControllerInfo). No lock -- benign race (stale value at worst)
_packetCounters                        Incremented only from BroadcastMotion (single caller per slot); Stop() also zeroes them. No lock is taken.

Slot Limits

The DSU protocol supports a maximum of 4 slots (0-3). PadForge supports up to 16 virtual controller slots, but only slots 0-3 participate in DSU broadcasts. Slots 4-15 skip DSU entirely -- BroadcastMotion returns immediately when slot >= MaxSlots.


Polling Optimization

BroadcastMotion() is called at ~1000Hz for every active slot. The method includes several optimizations to minimize overhead when there are no subscribers:

  1. Guard clause: Returns immediately on !_running, null socket, or out-of-range slot (no lock acquisition).
  2. Subscriber check before packet build: GetSubscribers(slot) is called first. If the returned list is empty, BuildPadDataPacket() is never called (no 100-byte allocation).
  3. Lazy expiration: Expired subscriptions are pruned during GetSubscribers() iteration rather than via a dedicated timer, avoiding extra thread synchronization.

At 1000Hz with 4 active slots and no subscribers, the total overhead is ~4000 GetSubscribers() calls per second (each within the lock), with no packet allocation.


DsuDiag Tool

Location: tools/DsuDiag/

Standalone DSU client diagnostic tool that connects to the DSU server and displays received motion data per-slot in real-time. Used for debugging axis mapping and verifying protocol compliance.

Usage:

  1. Start PadForge with DSU server enabled (Settings page, default port 26760).
  2. Run DsuDiag.exe -- connects to localhost:26760.
  3. DsuDiag subscribes to all 4 slots and displays accelerometer/gyroscope values in the console.
