\documentclass[a4paper,11pt]{article}
\usepackage{fullpage}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[normalem]{ulem}
\usepackage[english]{babel}
\usepackage{listings}
\lstset{breaklines=true,basicstyle=\ttfamily}
\usepackage{graphicx}
\usepackage{moreverb}
\usepackage{float}
\usepackage{url}
\usepackage{tabularx}
\title{FastMemoryLink (FML) bus specifications}
\author{S\'ebastien Bourdeauducq}
\date{December 2009}
\begin{document}
\setlength{\parindent}{0pt}
\setlength{\parskip}{5pt}
\maketitle{}
\section{Introduction}
The FastMemoryLink bus is designed to provide a high-performance interface between a DRAM controller and peripherals that need to access large amounts of data.
FML buses are referred to as \textit{bxw FML}, which means that the bus operates with a burst length of b and that the width (in bits) of each unidirectional data bus (\verb!dw! and \verb!dr!) is w. For example, Milkymist uses a 4x64 FML bus, which means that each transfer with the DRAM controller is made up of four 64-bit chunks.
Its main features are the following:
\begin{itemize}
\item Synchronism. The bus is meant to be used in FPGA-based devices, whose architectures are designed for synchronous (clock-driven) systems.
\item Burst oriented. Each cycle begins with an address phase, which is then followed by several data chunks which are transferred on consecutive clock edges (the data phase). The length of the burst is fixed.
\item Pipelined transfers. During the data phase of a cycle, the control lines are free and can be used to initiate the address phase of the next cycle.
\end{itemize}
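
As a worked example of the naming convention above, and assuming that every chunk of a burst carries useful data, the payload of one burst on a $b \times w$ bus is
\[ b \cdot w \ \mathrm{bits} = \frac{b \cdot w}{8} \ \mathrm{bytes}. \]
For the 4x64 bus used by Milkymist, this gives $4 \cdot 64 = 256$ bits, i.e. 32 bytes per burst.
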
\section{Specifications}
\subsection{Signals}
A FastMemoryLink interface is made up of the following signals:
\begin{tabularx}{\textwidth}{|l|l|l|X|}
\hline
\textbf{Signal} & \textbf{Width} & \textbf{Direction} & \textbf{Description} \\
\hline
a & User-spec. & Master to slave & Address signals. They are used to specify the location in DRAM to be accessed. \\
\hline
stb & 1 & Master to slave & Strobe signal. This signal qualifies a cycle. Once it has been asserted, it cannot be deasserted until the cycle has been acknowledged by the slave. \\
\hline
we & 1 & Master to slave & Write enable signal. \\
\hline
ack & 1 & Slave to master & Acknowledgement signal. This signal is asserted for one cycle by the slave when it is ready to begin the data phase. \\
\hline
dw & User-spec. (w) & Master to slave & Write data. \\
\hline
dr & User-spec. (w) & Slave to master & Read data. \\
\hline
\end{tabularx}
\subsection{Single read cycle}
The master initiates a read cycle by presenting a valid address, deasserting \verb!we!, and asserting \verb!stb!.
At least one clock cycle later, the slave presents valid data on \verb!dr! and asserts \verb!ack! to mark the beginning of the data phase. During the b-1 subsequent cycles, the slave keeps sending nearby data using the \verb!dr! lines only (see the ``Burst ordering'' section below).
Here is an example single read timing diagram for a bus using a burst length of 4.
\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
a & X & A & A & X & X & X & X \\
\hline
stb & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
\hline
we & X & 0 & 0 & X & X & X & X \\
\hline
ack & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\hline
dw & X & X & X & X & X & X & X \\
\hline
dr & X & X & A+0 & A+1 & A+2 & A+3 & X \\
\hline
\end{tabular}
X = don't care
\subsection{Single write cycle}
The master initiates a write cycle by presenting valid address and data, asserting \verb!we!, and asserting \verb!stb!.
At least one clock cycle later, the slave asserts \verb!ack! to mark the beginning of the data phase. During the b-1 subsequent cycles, the master keeps sending nearby data using the \verb!dw! lines only (see the ``Burst ordering'' section below).
Here is an example single write timing diagram for a bus using a burst length of 4.
\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
a & X & A & A & X & X & X & X \\
\hline
stb & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
\hline
we & X & 1 & 1 & X & X & X & X \\
\hline
ack & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\hline
dw & X & A+0 & A+0 & A+1 & A+2 & A+3 & X \\
\hline
dr & X & X & X & X & X & X & X \\
\hline
\end{tabular}
\subsection{General restrictions}
While waiting for the \verb!ack! signal to become active, the master must continue to assert \verb!stb! and keep \verb!a! and \verb!we! constant.
On the cycle following the assertion of the \verb!ack! signal, the master must deassert \verb!stb! unless it wants to start a new (pipelined) cycle (see below).
The slave must not assert \verb!ack! unless \verb!stb! has already been asserted for at least one cycle.
\subsection{Pipelined read cycles}
To maximize bus utilisation and reduce latency, the master is allowed to start the address phase of the next cycle during the data phase of the current cycle.
The slave must not acknowledge the new cycle until all data from the current cycle have been transferred.
Here is an example timing diagram.
\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|}
\hline
a & X & A & A & B & B & B & B & X & X & X & X\\
\hline
stb & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
\hline
we & X & 0 & 0 & 0 & 0 & 0 & 0 & X & X & X & X \\
\hline
ack & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\hline
dw & X & X & X & X & X & X & X & X & X & X & X \\
\hline
dr & X & X & A+0 & A+1 & A+2 & A+3 & B+0 & B+1 & B+2 & B+3 & X \\
\hline
\end{tabular}
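
As a rough cycle count for the diagram above, assume (as an illustrative best case, not a guarantee of this specification) that the slave acknowledges the first cycle exactly one clock after \verb!stb! is asserted, and acknowledges each following cycle as soon as the previous data phase ends. Then $N$ back-to-back pipelined bursts of length $b$ occupy
\[ T(N) = 1 + N \cdot b \]
clock cycles, counted from the first assertion of \verb!stb! to the last data chunk, and the fraction of those cycles that carry data is
\[ U(N) = \frac{N \cdot b}{1 + N \cdot b}. \]
With $N = 2$ and $b = 4$, as in the diagram, $T = 9$ cycles and $U = 8/9$.
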
\subsection{Pipelined write cycles}
Writes can also be pipelined. However, the master cannot present its data immediately on the write lines since they are already busy. Instead, it must present its data as soon as possible; that is, immediately after the last word of the current cycle has been transferred.
Here is an example timing diagram.
\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|}
\hline
a & X & A & A & B & B & B & B & X & X & X & X\\
\hline
stb & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
\hline
we & X & 1 & 1 & 1 & 1 & 1 & 1 & X & X & X & X \\
\hline
ack & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\hline
dw & X & A+0 & A+0 & A+1 & A+2 & A+3 & B+0 & B+1 & B+2 & B+3 & X \\
\hline
dr & X & X & X & X & X & X & X & X & X & X & X \\
\hline
\end{tabular}
\subsection{Overlapping reads and writes}
The FML bus allows overlapping read and write cycles. This typically requires a relatively complex DRAM controller which implements a write queue.
After acknowledging a bus cycle, the slave must wait at least two clock cycles before asserting \verb!ack! again to acknowledge the next one.
The following timing diagram shows a read cycle which is overlapped by a write cycle.
\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
a & X & A & A & B & B & X & X & X\\
\hline
stb & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\
\hline
we & X & 0 & 0 & 1 & 1 & X & X & X \\
\hline
ack & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
\hline
dw & X & X & X & X & B+0 & B+1 & B+2 & B+3 \\
\hline
dr & X & X & A+0 & A+1 & A+2 & A+3 & X & X \\
\hline
\end{tabular}
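
One way to summarise the acknowledgement spacing rules, under the assumption (not stated explicitly in this document) that the two-cycle rule applies when the new cycle uses the opposite data direction and the pipelining rule applies when it uses the same one: if $t_k$ denotes the clock cycle in which the slave asserts \verb!ack! for the $k$-th bus cycle, then
\[ t_{k+1} - t_k \geq b \quad \mbox{(same direction)}, \qquad t_{k+1} - t_k \geq 2 \quad \mbox{(opposite direction)}. \]
In the diagram above, \verb!ack! is asserted in columns 3 and 5, so $t_{k+1} - t_k = 2$, and the read data A+2 and A+3 are transferred on \verb!dr! while the write data B+0 and B+1 are already on \verb!dw!.
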
\subsection{Burst ordering}
The address modulo b selects the chunk at which the burst starts, and therefore specifies the burst ordering.
It is strongly recommended that b be a power of 2, so that the modulo and quotient can be computed by simply slicing the address bit vector.
The bus uses a linear wrapping burst ordering. For example, on a bus with a burst length of 4, an access starting at address 128 yields the addresses \verb!128; 129; 130; 131!, and an access starting at address 129 yields the addresses \verb!129; 130; 131; 128!.
Burst reordering can be used to implement critical-word-first in caches.
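
The linear wrapping order described above can be written compactly. For a burst starting at address $A$ on a bus with burst length $b$, the address of the $i$-th chunk ($i = 0, \ldots, b-1$; the symbol $a_i$ is introduced here for illustration only) is
\[ a_i = b \left\lfloor \frac{A}{b} \right\rfloor + \bigl( (A \bmod b) + i \bigr) \bmod b. \]
For $A = 129$ and $b = 4$, $b \lfloor A/b \rfloor = 128$ and $A \bmod b = 1$, so $a_0, \ldots, a_3 = 129, 130, 131, 128$, which matches the example. When $b = 2^n$, $A \bmod b$ is simply the $n$ least significant bits of $A$, and $\lfloor A/b \rfloor$ is given by the remaining upper bits.
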
\section*{Copyright notice}
Copyright \copyright{} 2007--2009 S\'ebastien Bourdeauducq. \\
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the LICENSE.FDL file at the root of the Milkymist source distribution.
\end{document}