DSU Protocol Implementation
PadForge implements the cemuhook DSU (DualShock UDP) protocol to stream controller motion data (gyroscope and accelerometer) to emulators. The server is compatible with Cemu, Dolphin, Yuzu, Ryujinx, and any client supporting the cemuhook protocol.
File: PadForge.App/Services/DsuMotionServer.cs
Namespace: PadForge.Services
Protocol Spec: https://github.com/v1993/cemuhook-protocol
```csharp
public struct MotionSnapshot
{
    public float AccelX, AccelY, AccelZ;
    public float GyroPitch, GyroYaw, GyroRoll;
    public long TimestampUs;
    public bool HasMotion;
}
```

Snapshot of a single slot's motion data, ready for DSU transmission. Units are already converted to DSU conventions:
- Accel: g-force (1g = 9.80665 m/s^2)
- Gyro: degrees per second
SDL uses a right-handed coordinate system. The DS4/DSU protocol expects specific sign conventions. The mapping was derived from Switch Pro Controller's known-working BetterJoy-to-DSU mapping, translated through SDL standard coordinates, and verified with DualSense (all axes match DS4/DSU convention across both DualSense and Switch 2 Pro Controller).
| DSU Field | SDL Source | Sign | Derivation |
|---|---|---|---|
| AccelX | ax | Inverted (-ax) | DS4 X-accel is opposite to SDL X |
| AccelY | ay | Inverted (-ay) | DS4 Y-accel is opposite to SDL Y |
| AccelZ | az | Inverted (-az) | DS4 Z-accel is opposite to SDL Z |
| GyroPitch | gx | Inverted (-gx) | DS4 pitch is opposite to SDL gyro X |
| GyroYaw | gy | Not inverted (gy) | Same sign in both coordinate systems |
| GyroRoll | gz | Inverted (-gz) | DS4 roll is opposite to SDL gyro Z |
Key insight: Accel and gyro must be in the same coordinate frame: AccelX and GyroPitch must reference the same physical axis. Five of the six axes are inverted; only GyroYaw preserves its sign.
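The sign table and unit conventions above can be sketched as a single conversion function. This is a minimal illustration, not PadForge's actual code: the class `DsuAxisMapping` and method `FromSdl` are hypothetical names, and the sketch assumes SDL reports accelerometer values in m/s^2 and gyroscope values in rad/s.

```csharp
using System;

// Hypothetical helper (not part of PadForge): applies the sign table above
// and converts SDL units to DSU units.
public static class DsuAxisMapping
{
    private const float StandardGravity = 9.80665f;            // m/s^2 per 1 g
    private const float RadToDeg = (float)(180.0 / Math.PI);   // rad/s -> deg/s

    // Assumption: ax/ay/az are in m/s^2 and gx/gy/gz are in rad/s.
    public static (float AccelX, float AccelY, float AccelZ,
                   float GyroPitch, float GyroYaw, float GyroRoll)
        FromSdl(float ax, float ay, float az, float gx, float gy, float gz)
    {
        return (
            -ax / StandardGravity,   // AccelX: inverted
            -ay / StandardGravity,   // AccelY: inverted
            -az / StandardGravity,   // AccelZ: inverted
            -gx * RadToDeg,          // GyroPitch: inverted
             gy * RadToDeg,          // GyroYaw: sign preserved
            -gz * RadToDeg);         // GyroRoll: inverted
    }
}
```

For example, an SDL reading of ay = 9.80665 m/s^2 maps to AccelY = -1 g under this mapping.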
```csharp
public sealed class DsuMotionServer : IDisposable
```

| Constant | Value | Description |
|---|---|---|
| MaxSlots | 4 | DSU protocol slot limit (PadForge slots 4-15 skip DSU broadcast) |
| ProtocolVersion | 1001 | cemuhook protocol version |
| HeaderSize | 16 | Packet header size in bytes |
| MsgTypeVersion | 0x100000 | Server -> client: version response |
| MsgTypeControllerInfo | 0x100001 | Server -> client: controller info |
| MsgTypePadData | 0x100002 | Server -> client: pad data with motion |
| ClientTimeoutMs | 5000 | Client subscription expiry (5 seconds) |
| SIO_UDP_CONNRESET | 0x9800000C | IOControl to suppress ICMP port-unreachable resets |

| Field | Type | Description |
|---|---|---|
| _socket | Socket | UDP socket bound to IPAddress.Loopback |
| _receiveThread | Thread | Background receive loop thread |
| _running | volatile bool | Server running flag |
| _serverId | uint | Unique server ID (from Environment.TickCount) |
| _port | int | Listening port |
| _packetCounters | uint[4] | Per-slot packet counter (incremented per broadcast) |
| _subscriptions | Dictionary<(EndPoint, int), long> | Per-slot client subscriptions with Stopwatch.GetTimestamp() |
| _allSlotSubscriptions | Dictionary<EndPoint, long> | All-slot client subscriptions with timestamp |
| _slotConnected | bool[4] | Per-slot connection state reported to clients |
| _slotHasMotion | bool[4] | Per-slot motion capability (device has gyro/accel sensors) |
| _disposed | bool | Dispose guard |

```csharp
public event EventHandler<string> StatusChanged;
```

Raised when server status changes. Localized values from Strings.Instance:
- "Listening on :{port}" -- successful bind
- "Port {port} in use" -- SocketError.AddressAlreadyInUse
- "Failed to start" -- other exceptions
- "Stopped" -- after Stop()

```mermaid
sequenceDiagram
    participant Client as Emulator (Client)
    participant Server as PadForge (Server)
    participant Poll as Polling Thread
    Note over Client,Server: All packets: DSUC (client) / DSUS (server) magic + CRC32
    Client->>Server: Version Request (0x100000)
    Server->>Client: Version Response (version=1001)
    Client->>Server: Controller Info Request (0x100001)<br/>slots=[0,1,2,3]
    Server->>Client: Controller Info (slot 0, connected, full gyro)
    Server->>Client: Controller Info (slot 1, not connected)
    Server->>Client: Controller Info (slot 2, not connected)
    Server->>Client: Controller Info (slot 3, not connected)
    Client->>Server: Pad Data Request (0x100002)<br/>flags=0x01, slot=0<br/>(subscribes to slot 0)
    loop Every ~1ms (1000Hz)
        Poll->>Server: BroadcastMotion(slot=0, snapshot, connected=true)
        Server->>Server: GetSubscribers(0) -> [Client]
        Server->>Client: Pad Data (slot 0, motion data)
    end
    Note over Client,Server: Subscription expires after 5 seconds<br/>Client must re-subscribe
```

```csharp
public bool Start(int port = 26760)
```

- Returns true immediately if already running (_running == true).
- Sets _port and generates _serverId from Environment.TickCount.
- Creates Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp).
- Applies the SIO_UDP_CONNRESET IOControl (_socket.IOControl(0x9800000C, new byte[4], null)) so that an ICMP port-unreachable reply does not cause a SocketException on the next ReceiveFrom. Exceptions from this call are caught (non-Windows or older OS).
- Binds to new IPEndPoint(IPAddress.Loopback, port).
- Sets _running = true.
- Creates and starts background receive thread:
  - Name: "PadForge.DsuServer"
  - IsBackground = true (does not prevent process exit)
- Raises StatusChanged with listening message.
- Returns true on success.

Error handling:
- SocketException with AddressAlreadyInUse: disposes socket, raises StatusChanged with port-in-use message, returns false.
- Any other exception: disposes socket, raises StatusChanged with failure message, returns false.

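The socket setup described for Start() can be sketched roughly as follows. This is an illustration under assumptions, not the PadForge source: `DsuSocketSketch` and `CreateLoopbackUdpSocket` are hypothetical names, and status events and the receive thread are left out.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper (not part of PadForge): creates the UDP socket the way
// the Start() steps describe, without events or threading.
public static class DsuSocketSketch
{
    // 0x9800000C does not fit in a signed int literal, hence the unchecked cast.
    private const int SioUdpConnReset = unchecked((int)0x9800000C);

    public static Socket CreateLoopbackUdpSocket(int port)
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
        try
        {
            // Windows-only control code; ignored where unsupported.
            socket.IOControl(SioUdpConnReset, new byte[4], null);
        }
        catch (Exception)
        {
            // Non-Windows or older OS: continue without the option.
        }

        try
        {
            socket.Bind(new IPEndPoint(IPAddress.Loopback, port));
        }
        catch
        {
            // Mirror the documented error path: dispose before reporting failure.
            socket.Dispose();
            throw;
        }
        return socket;
    }
}
```

A caller could catch SocketException and inspect SocketErrorCode for SocketError.AddressAlreadyInUse to distinguish the port-in-use case from other failures.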
```csharp
public void Stop()
```

- Returns immediately if !_running.
- Sets _running = false.
- Closes socket (_socket?.Close(), exception caught).
- Joins receive thread with 2-second timeout (_receiveThread?.Join(2000)).
- Sets _receiveThread = null, _socket = null.
- Under lock(_subscriptions): clears both _subscriptions and _allSlotSubscriptions.
- Zeroes all _packetCounters.
- Raises StatusChanged with stopped message.

```csharp
public void Dispose()
```

Calls Stop() if not already disposed. Sets _disposed = true. Not thread-safe (no lock on _disposed).

```csharp
public void BroadcastMotion(int slot, MotionSnapshot snapshot, bool connected)
```

Called from the InputManager polling thread at ~1000Hz. This is the primary data path.
- Guard: returns immediately if !_running, _socket == null, or slot is out of range [0, MaxSlots).
- Updates _slotConnected[slot] and _slotHasMotion[slot] from parameters.
- Calls GetSubscribers(slot) to get active subscribers.
- Returns immediately if no subscribers (no packet allocation).
- Calls BuildPadDataPacket(slot, snapshot, connected) to construct the packet.
- Sends packet to each subscriber via _socket.SendTo(packet, ep). Exceptions are caught silently (the client is gone and its subscription will time out).

Performance optimization: Packet allocation and serialization only happen when there are active subscribers. At 1000Hz with no subscribers, the method returns after a dictionary lookup.
All packets (client and server) share this header format:
```
Byte offset  Size  Field             Notes
[0..3]       4     Magic             "DSUS" (server->client) or "DSUC" (client->server)
[4..5]       2     Protocol version  Little-endian uint16, value 1001
[6..7]       2     Payload length    Little-endian uint16, excludes header (16 bytes)
[8..11]      4     CRC32             Little-endian uint32, zeroed before computation
[12..15]     4     ID                Server ID (server->client) or Client ID (client->server)
```

```csharp
private void WriteHeader(byte[] packet, int payloadLength, uint msgType)
```

Writes the 16-byte header with "DSUS" magic, protocol version, payload length, and server ID. The CRC32 field is left zeroed (filled later by FinalizeCrc). Also writes the message type as the first 4 bytes of the payload (at offset HeaderSize).

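A header writer along these lines might look like the sketch below. It is a standalone approximation rather than the real method: the class name `DsuHeaderSketch` is hypothetical, and the server ID is passed as a parameter here instead of being read from the _serverId field.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical standalone variant of WriteHeader (not the PadForge source).
public static class DsuHeaderSketch
{
    public const int HeaderSize = 16;
    public const ushort ProtocolVersion = 1001;

    public static void WriteHeader(byte[] packet, int payloadLength, uint msgType, uint serverId)
    {
        // [0..3] magic "DSUS" for server -> client packets
        packet[0] = (byte)'D';
        packet[1] = (byte)'S';
        packet[2] = (byte)'U';
        packet[3] = (byte)'S';

        // [4..5] protocol version, [6..7] payload length (excludes the header)
        BinaryPrimitives.WriteUInt16LittleEndian(packet.AsSpan(4, 2), ProtocolVersion);
        BinaryPrimitives.WriteUInt16LittleEndian(packet.AsSpan(6, 2), (ushort)payloadLength);

        // [8..11] CRC32 stays zero until the packet is complete
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(8, 4), 0);

        // [12..15] server ID
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(12, 4), serverId);

        // First 4 payload bytes: message type
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(HeaderSize, 4), msgType);
    }
}
```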
No additional payload beyond the message type.
```
Byte offset  Size  Field
[0..15]      16    Header (magic="DSUC", CRC, clientId)
[16..19]     4     Message type: 0x100000
```

Total packet: 20 bytes. Payload length: 4 bytes.
```
Byte offset  Size  Field
[0..15]      16    Header (magic="DSUS", CRC, serverId)
[16..19]     4     Message type: 0x100000
[20..21]     2     Protocol version: 1001
[22..23]     2     Padding (zero)
```

Total packet: 24 bytes. Payload length: 8 bytes.
```
Byte offset  Size  Field
[0..15]      16    Header
[16..19]     4     Message type: 0x100001
[20..23]     4     Number of ports requested (int32 LE)
[24..N]      N     Slot indices (one byte each)
```

Validated: numPorts must be in [0, MaxSlots] and packet must contain enough bytes for all slot indices. Each valid slot triggers a SendControllerInfo() response.
One response sent per requested slot.
```
Byte offset  Size  Field            Value
[0..15]      16    Header           magic="DSUS"
[16..19]     4     Message type     0x100001
[20]         1     Slot number      0-3
[21]         1     Slot state       0=not connected, 2=connected
[22]         1     Device model     0=N/A, 2=full gyro
[23]         1     Connection type  0=N/A
[24..29]     6     MAC address      00:00:00:00:00:{slot}
[30]         1     Battery status   0x05 (charged)
[31]         1     Padding          0x00
```

Total packet: 32 bytes. Payload length: 16 bytes.
MAC address: Fake but unique per slot. The last byte is the slot number (0-3), all others are 0x00.
Device model: Set to 2 (full gyro) when _slotHasMotion[slot] is true, 0 otherwise.
```
Byte offset  Size  Field
[0..15]      16    Header
[16..19]     4     Message type: 0x100002
[20]         1     Flags (subscription mode)
[21]         1     Slot number
[22..27]     6     MAC address
```

Validated: packet must be at least HeaderSize + 12 (28) bytes.
Subscription flags:
| Flag Value | Behavior |
|---|---|
| 0x00 | Subscribe to ALL pads (stored in _allSlotSubscriptions) |
| 0x01 | Subscribe to specific slot by ID (stored in _subscriptions[(endpoint, slot)]) |
| 0x02 | Subscribe by MAC (treated as all-slot subscription) |
| 0x03 | Both 0x01 and 0x02 (subscribe to specific slot AND all-slot) |

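The flag handling in the table above reduces to two independent bit tests. The following sketch shows one way to express that; `DsuSubscriptionFlags` and `Classify` are hypothetical names used only for illustration.

```csharp
// Hypothetical helper (not part of PadForge): decodes the subscription flags
// byte of a pad data request into the two subscription kinds.
public static class DsuSubscriptionFlags
{
    public static (bool PerSlot, bool AllSlots) Classify(byte flags)
    {
        // Bit 0: subscribe to the slot number given in the request.
        bool perSlot = (flags & 0x01) != 0;

        // Zero flags, or bit 1 (by MAC), are both treated as all-slot subscriptions.
        bool allSlots = flags == 0 || (flags & 0x02) != 0;

        return (perSlot, allSlots);
    }
}
```

With flags = 0x03 both values are true, so the client would be recorded in both dictionaries.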
The largest message type. Contains controller info header, button state, analog inputs, touch data, and motion sensor data. PadForge is a motion-only server: button bitmasks, analog sticks, D-pad, analog buttons, and touch data are all zeroed (sticks centered at 128).
```
Byte offset  Size  Field              Value / Notes
───────────  ────  ─────────────────  ─────────────────────────────────
[0..15]      16    Header             magic="DSUS"
[16..19]     4     Message type       0x100002
[20]         1     Slot number        0-3
[21]         1     Slot state         0=disconnected, 2=connected
[22]         1     Device model       0=N/A, 2=full gyro
[23]         1     Connection type    0=N/A
[24..29]     6     MAC address        00:00:00:00:00:{slot}
[30]         1     Battery status     0x05 (charged)
[31]         1     Connected flag     1=connected, 0=disconnected
[32..35]     4     Packet counter     uint32 LE, incremented per packet
[36]         1     Buttons bitmask 1  0x00 (zeroed, motion-only)
[37]         1     Buttons bitmask 2  0x00 (zeroed)
[38]         1     Home button        0x00 (zeroed)
[39]         1     Touch button       0x00 (zeroed)
[40]         1     Left stick X       128 (centered)
[41]         1     Left stick Y       128 (centered)
[42]         1     Right stick X      128 (centered)
[43]         1     Right stick Y      128 (centered)
[44..47]     4     Analog D-Pad       0x00 (L, D, R, U)
[48..55]     8     Analog buttons     0x00 (8 bytes)
[56..61]     6     Touch 1 data       0x00 (active, id, x16, y16)
[62..67]     6     Touch 2 data       0x00
[68..75]     8     Motion timestamp   int64 LE, microseconds
[76..79]     4     Accel X            float LE (g-force)
[80..83]     4     Accel Y            float LE
[84..87]     4     Accel Z            float LE
[88..91]     4     Gyro Pitch         float LE (deg/s)
[92..95]     4     Gyro Yaw           float LE
[96..99]     4     Gyro Roll          float LE
```

Total packet: 100 bytes (16 header + 84 payload). Payload: 4 bytes message type + 80 bytes data = 84 bytes.
Code offset mapping: In BuildPadDataPacket, the variable o = HeaderSize + 4 (= 20) is the base offset for data after the message type. The comment-level offsets in the code ([+0] slot, [+48] timestamp, [+56] accelX) are relative to o. To get the absolute byte offset in the packet: o + relative = 20 + relative. To get the payload offset: 4 + relative.
| Field | Code relative (o+N) | Absolute byte | Payload offset |
|---|---|---|---|
| Slot | o + 0 | 20 | +4 |
| Packet counter | o + 12 | 32 | +16 |
| Left stick X | o + 20 | 40 | +24 |
| Motion timestamp | o + 48 | 68 | +52 |
| Accel X | o + 56 | 76 | +60 |
| Accel Y | o + 60 | 80 | +64 |
| Accel Z | o + 64 | 84 | +68 |
| Gyro Pitch | o + 68 | 88 | +72 |
| Gyro Yaw | o + 72 | 92 | +76 |
| Gyro Roll | o + 76 | 96 | +80 |

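Motion timestamp: Written as Int64 (8 bytes) at o + 48, which spans absolute bytes [68..75]. The cemuhook protocol spec defines this field as an unsigned 64-bit microsecond timestamp; for non-negative values the signed and unsigned little-endian encodings are byte-identical.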
Float encoding: All accelerometer and gyroscope values are IEEE 754 single-precision floats written in little-endian byte order via BinaryPrimitives.WriteSingleLittleEndian.
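As a rough illustration of the offsets and encodings above, the motion block of a pad data packet could be written like this. The class `DsuMotionBlockSketch` and method `WriteMotion` are hypothetical; only the offsets and the BinaryPrimitives calls come from the description.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper (not the PadForge source): writes only the timestamp
// and the six motion floats into an already allocated 100-byte packet.
public static class DsuMotionBlockSketch
{
    private const int HeaderSize = 16;

    public static void WriteMotion(byte[] packet, long timestampUs,
        float accelX, float accelY, float accelZ,
        float gyroPitch, float gyroYaw, float gyroRoll)
    {
        int o = HeaderSize + 4; // base offset after the message type (= 20)

        // Timestamp: 8 bytes at absolute [68..75]
        BinaryPrimitives.WriteInt64LittleEndian(packet.AsSpan(o + 48, 8), timestampUs);

        // Accelerometer in g: absolute [76..87]
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 56, 4), accelX);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 60, 4), accelY);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 64, 4), accelZ);

        // Gyroscope in deg/s: absolute [88..99]
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 68, 4), gyroPitch);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 72, 4), gyroYaw);
        BinaryPrimitives.WriteSingleLittleEndian(packet.AsSpan(o + 76, 4), gyroRoll);
    }
}
```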
```csharp
private List<EndPoint> GetSubscribers(int slot)
```

Returns a list of endpoints subscribed to the given slot. Called from BroadcastMotion() at ~1000Hz per active slot.

Two subscription dictionaries, both protected by lock(_subscriptions):
| Dictionary | Key | Value | Populated By |
|---|---|---|---|
| _subscriptions | (EndPoint, slotIndex) | Stopwatch.GetTimestamp() | Pad data request with flags & 0x01 |
| _allSlotSubscriptions | EndPoint | Stopwatch.GetTimestamp() | Pad data request with flags == 0 or flags & 0x02 |

- Acquire lock(_subscriptions).
- Compute timeoutTicks = Stopwatch.Frequency * ClientTimeoutMs / 1000 (5-second timeout in high-resolution ticks).
- Per-slot subscribers: Iterate _subscriptions for entries matching the requested slot:
  - If now - timestamp > timeoutTicks: add to expired list for later removal.
  - Otherwise: add endpoint to result list and seen HashSet.
- All-slot subscribers: Iterate _allSlotSubscriptions:
  - If expired: add to expiredAll list.
  - Otherwise: add endpoint to result if not already in seen (prevents duplicates when a client has both per-slot and all-slot subscriptions).
- Prune expired: Remove all entries in the expired and expiredAll lists from their respective dictionaries.
- Release lock and return result list.

Subscriptions expire after ClientTimeoutMs (5000 ms). Clients must periodically re-send pad data requests to stay subscribed. Expired entries are pruned lazily during GetSubscribers() iteration -- there is no background cleanup thread.
Timestamps use Stopwatch.GetTimestamp() (high-resolution performance counter) rather than DateTime.Now for sub-millisecond precision.
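The lookup and lazy pruning described above can be sketched as a small standalone class. This is an approximation for illustration: `SubscriptionTableSketch`, `SubscribeSlot`, and `SubscribeAll` are hypothetical names, and the current time is passed in as a parameter (where the text uses Stopwatch.GetTimestamp()) so the expiry logic can be exercised deterministically.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;

// Hypothetical model of the two subscription dictionaries (not the PadForge source).
public sealed class SubscriptionTableSketch
{
    private const int ClientTimeoutMs = 5000;
    private readonly Dictionary<(EndPoint, int), long> _subscriptions =
        new Dictionary<(EndPoint, int), long>();
    private readonly Dictionary<EndPoint, long> _allSlotSubscriptions =
        new Dictionary<EndPoint, long>();

    public void SubscribeSlot(EndPoint ep, int slot, long now)
    {
        lock (_subscriptions) { _subscriptions[(ep, slot)] = now; }
    }

    public void SubscribeAll(EndPoint ep, long now)
    {
        lock (_subscriptions) { _allSlotSubscriptions[ep] = now; }
    }

    public List<EndPoint> GetSubscribers(int slot, long now)
    {
        long timeoutTicks = Stopwatch.Frequency * ClientTimeoutMs / 1000;
        var result = new List<EndPoint>();
        var seen = new HashSet<EndPoint>();

        lock (_subscriptions)
        {
            // Per-slot subscribers for the requested slot.
            var expired = new List<(EndPoint, int)>();
            foreach (var kv in _subscriptions)
            {
                if (kv.Key.Item2 != slot) continue;
                if (now - kv.Value > timeoutTicks) expired.Add(kv.Key);
                else if (seen.Add(kv.Key.Item1)) result.Add(kv.Key.Item1);
            }

            // All-slot subscribers, skipping endpoints already collected.
            var expiredAll = new List<EndPoint>();
            foreach (var kv in _allSlotSubscriptions)
            {
                if (now - kv.Value > timeoutTicks) expiredAll.Add(kv.Key);
                else if (seen.Add(kv.Key)) result.Add(kv.Key);
            }

            // Lazy pruning: removal happens after iteration to avoid
            // modifying a dictionary while enumerating it.
            foreach (var key in expired) _subscriptions.Remove(key);
            foreach (var ep in expiredAll) _allSlotSubscriptions.Remove(ep);
        }
        return result;
    }
}
```

A client that holds both a per-slot and an all-slot subscription appears once in the result, which is the role of the seen set.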
```csharp
private static readonly uint[] Crc32Table = GenerateCrc32Table();

private static uint[] GenerateCrc32Table()
{
    var table = new uint[256];
    for (uint i = 0; i < 256; i++)
    {
        uint entry = i;
        for (int j = 0; j < 8; j++)
            entry = (entry & 1) != 0 ? (entry >> 1) ^ 0xEDB88320 : entry >> 1;
        table[i] = entry;
    }
    return table;
}
```

Standard CRC32 with reflected polynomial 0xEDB88320 (bit-reversed form of 0x04C11DB7). The 256-entry lookup table is generated once at static initialization.

```csharp
private static uint ComputeCrc32(byte[] data, int length)
{
    uint crc = 0xFFFFFFFF;
    for (int i = 0; i < length; i++)
        crc = (crc >> 8) ^ Crc32Table[(crc ^ data[i]) & 0xFF];
    return crc ^ 0xFFFFFFFF;
}
```

Standard table-driven CRC32: initialize to 0xFFFFFFFF, XOR with final 0xFFFFFFFF. Processes length bytes (not the full array length).

```csharp
private static void FinalizeCrc(byte[] packet)
```

To compute outgoing CRC:
- Zero the CRC field at packet[8..11].
- Compute CRC over the entire packet (packet.Length).
- Write the CRC back to packet[8..11] via BinaryPrimitives.WriteUInt32LittleEndian.

To verify incoming CRC (in ProcessPacket):
- Read CRC from data[8..11].
- Zero data[8..11].
- Compute CRC over HeaderSize + payloadLength bytes.
- Compare computed CRC against received CRC. Reject if mismatch.

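The outgoing and incoming CRC steps can be sketched together as below. This is an illustrative variant, not the PadForge code: `DsuCrcSketch` is a hypothetical name, the CRC is computed bit by bit instead of with the lookup table (same polynomial, same result), and VerifyCrc restores the CRC field afterwards, which is a choice of this sketch.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper (not the PadForge source): CRC32 stamping and checking
// for DSU packets, using a bitwise CRC32 instead of the lookup table.
public static class DsuCrcSketch
{
    private const int HeaderSize = 16;

    public static uint Crc32(ReadOnlySpan<byte> data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int j = 0; j < 8; j++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
        }
        return crc ^ 0xFFFFFFFF;
    }

    // Outgoing: zero the CRC field, hash the whole packet, write the result back.
    public static void FinalizeCrc(byte[] packet)
    {
        Span<byte> crcField = packet.AsSpan(8, 4);
        crcField.Clear();
        BinaryPrimitives.WriteUInt32LittleEndian(crcField, Crc32(packet));
    }

    // Incoming: read the CRC, zero the field, hash header + payload, compare.
    public static bool VerifyCrc(byte[] data, int payloadLength)
    {
        Span<byte> crcField = data.AsSpan(8, 4);
        uint received = BinaryPrimitives.ReadUInt32LittleEndian(crcField);
        crcField.Clear();
        uint computed = Crc32(data.AsSpan(0, HeaderSize + payloadLength));
        // Restore the field so the caller's buffer is unchanged (sketch-only choice).
        BinaryPrimitives.WriteUInt32LittleEndian(crcField, received);
        return computed == received;
    }
}
```

The standard check value for this CRC32 variant is 0xCBF43926 for the ASCII string "123456789", which is a quick way to confirm an implementation matches.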
```csharp
private void ReceiveLoop()
```

Background thread with name "PadForge.DsuServer" and IsBackground = true. Loops ReceiveFrom() with a 1024-byte buffer until _running is false.

Each received packet passes through 5 validation stages in ProcessPacket():
| Stage | Check | Reject Condition |
|---|---|---|
| 1 | Minimum size | received < HeaderSize + 4 (20 bytes) |
| 2 | Magic bytes | Not "DSUC" (bytes D, S, U, C) |
| 3 | Protocol version | version > ProtocolVersion (1001) |
| 4 | Payload length | HeaderSize + payloadLength > received |
| 5 | CRC32 | Computed CRC does not match received CRC |

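The first four stages of the table can be sketched as a pure function over the receive buffer. The names `DsuReject` and `DsuValidationSketch` are hypothetical, and the CRC stage is left out to keep the example short; it would follow the payload length check.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical result type for the sketch below.
public enum DsuReject { None, TooShort, BadMagic, BadVersion, BadLength }

// Hypothetical helper (not the PadForge source): stages 1-4 of packet validation.
public static class DsuValidationSketch
{
    private const int HeaderSize = 16;
    private const ushort ProtocolVersion = 1001;

    // Assumes data.Length >= received, as with a fixed-size receive buffer.
    public static DsuReject Validate(byte[] data, int received)
    {
        // Stage 1: header plus the 4-byte message type must be present.
        if (received < HeaderSize + 4) return DsuReject.TooShort;

        // Stage 2: client packets start with "DSUC".
        if (data[0] != 'D' || data[1] != 'S' || data[2] != 'U' || data[3] != 'C')
            return DsuReject.BadMagic;

        // Stage 3: reject versions newer than the one the server speaks.
        ushort version = BinaryPrimitives.ReadUInt16LittleEndian(data.AsSpan(4, 2));
        if (version > ProtocolVersion) return DsuReject.BadVersion;

        // Stage 4: the declared payload must fit inside what was received.
        ushort payloadLength = BinaryPrimitives.ReadUInt16LittleEndian(data.AsSpan(6, 2));
        if (HeaderSize + payloadLength > received) return DsuReject.BadLength;

        return DsuReject.None;
    }
}
```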
After validation, dispatches based on message type:
| Message Type | Handler |
|---|---|
| 0x100000 | HandleVersionRequest(sender) |
| 0x100001 | HandleControllerInfoRequest(data, length, sender) |
| 0x100002 | HandlePadDataRequest(data, length, sender) |

| Exception | When | Action |
|---|---|---|
| SocketException when !_running | Socket closed during shutdown | break (exit loop) |
| ObjectDisposedException | Socket object disposed | break (exit loop) |
| Any other exception | Malformed packet, transient error | Silently caught, continue loop |

```mermaid
graph LR
    subgraph "Receive Thread"
        RT["ReceiveLoop()<br/>(background thread)"]
        RT --> PV["ProcessPacket<br/>Validate + dispatch"]
        PV --> HV["HandleVersionRequest"]
        PV --> HC["HandleControllerInfoRequest"]
        PV --> HP["HandlePadDataRequest<br/>(updates subscriptions)"]
    end
    subgraph "Polling Thread (InputManager)"
        PT["~1000Hz polling loop"]
        PT --> BM["BroadcastMotion()"]
        BM --> GS["GetSubscribers()"]
        GS --> BP["BuildPadDataPacket()"]
        BP --> ST["socket.SendTo()"]
    end
    HP -.->|"lock(_subscriptions)"| SUB["_subscriptions<br/>_allSlotSubscriptions"]
    GS -.->|"lock(_subscriptions)"| SUB
```

| Aspect | Mechanism |
|---|---|
| _subscriptions dictionary | Protected by lock(_subscriptions) -- both read (GetSubscribers) and write (HandlePadDataRequest) acquire the same lock |
| _running flag | volatile bool -- no lock needed, provides happens-before ordering |
| _slotConnected, _slotHasMotion | Written by polling thread (BroadcastMotion), read by receive thread (SendControllerInfo). No lock -- benign race (stale value at worst) |
| _packetCounters | Accessed only from BroadcastMotion (single caller per slot). No lock needed. |

The DSU protocol supports a maximum of 4 slots (0-3). PadForge supports up to 16 virtual controller slots, but only slots 0-3 participate in DSU broadcasts. Slots 4-15 skip DSU entirely -- BroadcastMotion returns immediately when slot >= MaxSlots.
BroadcastMotion() is called at ~1000Hz for every active slot. The method includes several optimizations to minimize overhead when there are no subscribers:
- Guard clause: Returns immediately on !_running, null socket, or out-of-range slot (no lock acquisition).
- Subscriber check before packet build: GetSubscribers(slot) is called first. If the returned list is empty, BuildPadDataPacket() is never called (no 100-byte allocation).
- Lazy expiration: Expired subscriptions are pruned during GetSubscribers() iteration rather than via a dedicated timer, avoiding extra thread synchronization.

At 1000Hz with 4 active slots and no subscribers, the total overhead is ~4000 dictionary lookups/second (within lock), with no memory allocation.
Location: tools/DsuDiag/
Standalone DSU client diagnostic tool that connects to the DSU server and displays received motion data per-slot in real-time. Used for debugging axis mapping and verifying protocol compliance.
Usage:
- Start PadForge with DSU server enabled (Settings page, default port 26760).
- Run DsuDiag.exe -- connects to localhost:26760.
- Subscribes to all 4 slots and displays accelerometer/gyroscope values in the console.
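
For readers writing their own test client, a pad data request can be assembled from the packet layouts documented above. The sketch below is not taken from DsuDiag; `DsuClientSketch` and `BuildPadDataRequest` are hypothetical names, and the MAC field is simply left zeroed.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical client-side helper (not DsuDiag's code): builds a 28-byte
// pad data request with "DSUC" magic and a valid CRC32.
public static class DsuClientSketch
{
    private const int HeaderSize = 16;

    private static uint Crc32(ReadOnlySpan<byte> data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int j = 0; j < 8; j++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
        }
        return crc ^ 0xFFFFFFFF;
    }

    public static byte[] BuildPadDataRequest(uint clientId, byte flags, byte slot)
    {
        // Payload: message type (4) + flags (1) + slot (1) + MAC (6) = 12 bytes.
        var packet = new byte[HeaderSize + 12];
        packet[0] = (byte)'D';
        packet[1] = (byte)'S';
        packet[2] = (byte)'U';
        packet[3] = (byte)'C';
        BinaryPrimitives.WriteUInt16LittleEndian(packet.AsSpan(4, 2), 1001);
        BinaryPrimitives.WriteUInt16LittleEndian(packet.AsSpan(6, 2), 12);
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(12, 4), clientId);
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(16, 4), 0x100002);
        packet[20] = flags; // 0x01 = subscribe to the slot below
        packet[21] = slot;
        // [22..27] MAC address stays zero in this sketch.

        // The CRC field is still zero here, so the CRC covers the final packet contents.
        BinaryPrimitives.WriteUInt32LittleEndian(packet.AsSpan(8, 4), Crc32(packet));
        return packet;
    }
}
```

The resulting array could be sent with UdpClient.Send(packet, packet.Length, "127.0.0.1", 26760) and would need to be re-sent within the 5-second subscription timeout to keep receiving pad data.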