doc/bus: update

commit 0cef98373ffa3e9bef02f37bb45fbb5a956001ea 1 parent 411e6ec
Sébastien Bourdeauducq authored July 20, 2013
BIN  doc/asmi_topology.dia
Binary file not shown
BIN  doc/asmi_topology.png
98  doc/bus.rst
@@ -5,7 +5,7 @@ Migen Bus contains classes providing a common structure for master and slave int
5 5
 
6 6
 * Wishbone [wishbone]_, the general purpose bus recommended by Opencores.
7 7
 * CSR-2 (see :ref:`csr2`), a low-bandwidth, resource-sensitive bus designed for accessing the configuration and status registers of cores from software.
8  
-* ASMIbus (see :ref:`asmi`), a split-transaction bus optimized for use with a high-performance, out-of-order SDRAM controller.
  8
+* LASMIbus (see :ref:`lasmi`), a bus optimized for use with a high-performance frequency-ratio SDRAM controller.
9 9
 * DFI [dfi]_ (partial), a standard interface protocol between memory controller logic and PHY interfaces.
10 10
 
11 11
 .. [wishbone] http://cdn.opencores.org/downloads/wbspec_b4.pdf
@@ -13,13 +13,13 @@ Migen Bus contains classes providing a common structure for master and slave int
13 13
 
14 14
 It also provides interconnect components for these buses, such as arbiters and address decoders. The strength of the Migen procedurally generated logic can be illustrated by the following example: ::
15 15
 
16  
-  wbcon = wishbone.InterconnectShared(
  16
+  self.submodules.wbcon = wishbone.InterconnectShared(
17 17
       [cpu.ibus, cpu.dbus, ethernet.dma, audio.dma],
18 18
       [(lambda a: a[27:] == 0, norflash.bus),
19  
-       (lambda a: a[27:] == 1, wishbone2asmi.wishbone),
  19
+       (lambda a: a[27:] == 1, wishbone2lasmi.wishbone),
20 20
        (lambda a: a[27:] == 3, wishbone2csr.wishbone)])
21 21
 
22  
-In this example, the interconnect component generates a 4-way round-robin arbiter, multiplexes the master bus signals into a shared bus, and connects all slave interfaces to the shared bus, inserting the address decoder logic in the bus cycle qualification signals and multiplexing the data return path. It can recognize the signals in each core's bus interface thanks to the common structure mandated by Migen Bus. All this happens automatically, using only that much user code. The resulting interconnect logic can be retrieved using ``wbcon.get_fragment()``, and combined with the fragments from the rest of the system.
  22
+In this example, the interconnect component generates a 4-way round-robin arbiter, multiplexes the master bus signals into a shared bus, and connects all slave interfaces to the shared bus, inserting the address decoder logic in the bus cycle qualification signals and multiplexing the data return path. It can recognize the signals in each core's bus interface thanks to the common structure mandated by Migen Bus. All this happens automatically, using only that much user code.
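The decoder lambdas in the example select a slave from the upper address bits. As a rough plain-Python model (not Migen code), and assuming that the slice ``a[27:]`` denotes bits 27 and above of the bus address, the selection can be sketched as follows. The function ``select_slave`` and the string labels are hypothetical and only mirror the slave names used in the example.

```python
def select_slave(address):
    """Return the label of the slave whose decoder matches `address`.

    Plain-Python model of the three decoder lambdas: the region is taken
    from bits 27 and above (assumed meaning of ``a[27:]``).
    """
    region = address >> 27
    slaves = {
        0: "norflash",
        1: "wishbone2lasmi",
        3: "wishbone2csr",
    }
    # Region 2 is not mapped in the example, so no slave is selected.
    return slaves.get(region)


print(select_slave(0x0000_0010))       # falls in region 0
print(select_slave((3 << 27) | 0x20))  # falls in region 3
```

In the real interconnect these comparisons become combinatorial logic that qualifies the bus cycle signals of each slave; the dictionary lookup here only illustrates the mapping.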
23 23
 
24 24
 
25 25
 Configuration and Status Registers
@@ -46,20 +46,21 @@ Migen Bank is a system comparable to wishbone-gen [wbgen]_, which automates the
46 46
 
47 47
 Bank takes a description made up of a list of registers and generates logic implementing it with a slave interface compatible with Migen Bus.
48 48
 
49  
-A register can be "raw", which means that the core has direct access to it. It also means that the register width must be less or equal to the bus word width. In that case, the register object provides the following signals:
  49
+The lowest-level description of a register is provided by the ``CSR`` class, which maps to the value at a single address on the target bus. The width of the register must be less than or equal to the bus word width. All accesses are atomic. It exposes the following signals as its interface to the user design:
50 50
 
51 51
 * ``r``, which contains the data written from the bus interface.
52 52
 * ``re``, which is the strobe signal for ``r``. It is active for one cycle, after or during a write from the bus. ``r`` is only valid when ``re`` is high.
53 53
 * ``w``, which must provide at all times the value to be read from the bus.
54 54
 
55  
-Registers that are not raw are managed by Bank and contain fields. If the sum of the widths of all fields attached to a register exceeds the bus word width, the register will automatically be sliced into words of the maximum size and implemented at consecutive bus addresses, MSB first. Field objects have two parameters, ``access_bus`` and ``access_dev``, determining respectively the access policies for the bus and core sides. They can take the values ``READ_ONLY``, ``WRITE_ONLY`` and ``READ_WRITE``.
56  
-If the device can read, the field object provides the r signal, which contains at all times the current value of the field (kept by the logic generated by Bank).
57  
-If the device can write, the field object provides the following signals:
  55
+Compound CSRs (which are transformed into ``CSR`` plus additional logic for implementation) provide additional features optimized for common applications.
58 56
 
59  
-* ``w``, which provides the value to be written into the field.
60  
-* ``we``, which strobes the value into the field.
  57
+The ``CSRStatus`` class is meant to be used as a status register that is read-only from the CPU. The user design is expected to drive its ``status`` signal. The advantage of using ``CSRStatus`` instead of using ``CSR`` and driving ``w`` is that the width of ``CSRStatus`` can be arbitrary. Status registers larger than the bus word width are automatically broken down into several ``CSR`` registers to span several addresses. Note that the atomicity of reads is not guaranteed.
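To make the splitting of a wide status register concrete, here is a minimal plain-Python sketch (not Migen code). It assumes an 8-bit bus word and assumes that the most significant word comes first; the text above does not specify the ordering, so treat both as illustrative choices. The helper ``split_status`` is hypothetical.

```python
def split_status(value, width, busword=8):
    """Split a `width`-bit value into bus words.

    Assumption: words are listed most significant first.
    """
    nwords = (width + busword - 1) // busword  # round up
    mask = (1 << busword) - 1
    words = []
    for i in reversed(range(nwords)):
        words.append((value >> (i * busword)) & mask)
    return words


# A 20-bit status value needs three 8-bit words.
print([hex(w) for w in split_status(0x12345, 20)])
```

Because the CPU reads these words one address at a time, the underlying value may change between two reads, which is why the reads are not atomic.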
61 58
 
62  
-As a special exception, fields that are read-only from the bus and write-only for the device do not use the ``we`` signal. Instead, the device must permanently drive valid data on the ``w`` signal.
  59
+The ``CSRStorage`` class provides a memory location that can be read and written by the CPU, and read and optionally written by the design. It can also span several CSR addresses. An optional mechanism for atomic CPU writes is provided; when enabled, writes to the first CSR addresses go to a back-buffer whose contents are atomically copied to the main buffer when the last address is written. When ``CSRStorage`` can be written to by the design, the atomicity of reads by the CPU is not guaranteed.
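The atomic write mechanism can be modelled in plain Python as below. This is only a behavioural sketch under stated assumptions: an 8-bit bus word, and the first address holding the most significant word. The class ``AtomicStorageModel`` and its method names are hypothetical and are not part of Migen.

```python
class AtomicStorageModel:
    """Behavioural model of a multi-word storage with atomic CPU writes."""

    def __init__(self, nwords, busword=8):
        self.nwords = nwords
        self.busword = busword
        # Back-buffer for all words except the last one written.
        self.backbuffer = [0] * (nwords - 1)
        # Main buffer, as seen by the user design.
        self.storage = 0

    def write(self, index, word):
        word &= (1 << self.busword) - 1
        if index < self.nwords - 1:
            # Writes to the first addresses only reach the back-buffer.
            self.backbuffer[index] = word
        else:
            # Writing the last address commits everything at once.
            value = 0
            for w in self.backbuffer:
                value = (value << self.busword) | w
            self.storage = (value << self.busword) | word


model = AtomicStorageModel(nwords=3)
model.write(0, 0x12)
model.write(1, 0x34)
print(hex(model.storage))  # main buffer is still unchanged here
model.write(2, 0x56)
print(hex(model.storage))  # all three words appear together
```

The design never observes a partially updated value, since the main buffer changes only when the last address is written.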
  60
+
  61
+A module can provide bus-independent CSRs by implementing a ``get_csrs`` method that returns a list of instances of the classes described above. Similarly, bus-independent memories can be returned as a list by a ``get_memories`` method.
  62
+
  63
+To avoid listing those manually, a module can inherit from the ``AutoCSR`` class, which provides ``get_csrs`` and ``get_memories`` methods that scan for CSR and memory attributes and return their list. If the module has child objects that implement ``get_csrs`` or ``get_memories``, they will be called by the ``AutoCSR`` methods and their CSRs and memories added to the lists returned, with the child objects' names as prefixes.
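The attribute scan described above can be illustrated with a small plain-Python model. It does not use Migen: ``CSRModel``, ``AutoCSRModel``, ``Uart`` and ``Soc`` are hypothetical stand-ins, and the underscore used as the prefix separator is an assumption made for this sketch.

```python
class CSRModel:
    """Stand-in for a CSR object; only carries a name."""

    def __init__(self, name):
        self.name = name


class AutoCSRModel:
    """Collects CSR attributes, recursing into children that have get_csrs."""

    def get_csrs(self):
        csrs = []
        for attr_name, attr in sorted(vars(self).items()):
            if isinstance(attr, CSRModel):
                csrs.append(attr)
            elif hasattr(attr, "get_csrs"):
                # Child CSRs are reported with the child's name as prefix.
                for csr in attr.get_csrs():
                    csrs.append(CSRModel(attr_name + "_" + csr.name))
        return csrs


class Uart(AutoCSRModel):
    def __init__(self):
        self.rxtx = CSRModel("rxtx")


class Soc(AutoCSRModel):
    def __init__(self):
        self.uart = Uart()
        self.id = CSRModel("id")


print([csr.name for csr in Soc().get_csrs()])
```

The same pattern would apply to ``get_memories``; it is omitted here to keep the sketch short.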
63 64
 
64 65
 Generating interrupt controllers
65 66
 ================================
@@ -68,20 +69,21 @@ The event manager provides a systematic way to generate standard interrupt contr
68 69
 Its constructor takes as parameters one or several *event sources*. An event source is an instance of either:
69 70
 
70 71
 * ``EventSourcePulse``, which contains a signal ``trigger`` that generates an event when high. The event stays asserted after the ``trigger`` signal goes low, and until software acknowledges it. An example use is to pulse ``trigger`` high for 1 cycle after the reception of a character in a UART.
71  
-* ``EventSourceLevel``, which contains a signal ``trigger`` that generates an event on its falling edge. The purpose of this event source is to monitor the status of processes and generate an interrupt on their completion. The signal ``trigger`` can be connected to the ``busy`` signal of a dataflow actor, for example.
  72
+* ``EventSourceProcess``, which contains a signal ``trigger`` that generates an event on its falling edge. The purpose of this event source is to monitor the status of processes and generate an interrupt on their completion. The signal ``trigger`` can be connected to the ``busy`` signal of a dataflow actor, for example.
  73
+* ``EventSourceLevel``, whose ``trigger`` contains the instantaneous state of the event. It must be set and released by the user design. For example, a DMA controller with several slots can use this event source to signal that one or more slots require CPU attention.
72 74
 
73 75
 The ``EventManager`` provides a signal ``irq`` which is driven high whenever there is a pending and unmasked event. It is typically connected to an interrupt line of a CPU.
74 76
 
75  
-The ``EventManager`` provides a method ``get_registers``, that returns a list of registers to be used with Migen Bank. Each event source is assigned one bit in each of those registers. They are:
  77
+The ``EventManager`` provides a method ``get_csrs`` that returns a bus-independent list of CSRs to be used with Migen Bank as explained above. Each event source is assigned one bit in each of those registers. They are:
76 78
 
77  
-* ``status``: contains the current level of the trigger line of ``EventSourceLevel`` sources. It is 0 for ``EventSourcePulse``. This register is read-only.
  79
+* ``status``: contains the current level of the trigger line of ``EventSourceProcess`` and ``EventSourceLevel`` sources. It is 0 for ``EventSourcePulse``. This register is read-only.
78 80
 * ``pending``: contains the currently asserted events. Writing 1 to the bit assigned to an event clears it.
79 81
 * ``enable``: defines which asserted events will cause the ``irq`` line to be asserted. This register is read-write.
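The interaction between ``pending``, ``enable`` and ``irq`` can be sketched with a small plain-Python model. It only covers pulse-style sources (an event stays pending until software clears it) and is not Migen code; the class ``EventManagerModel`` and its method names are hypothetical.

```python
class EventManagerModel:
    """Behavioural model of the pending/enable registers and the irq line."""

    def __init__(self, n_sources):
        self.n_sources = n_sources
        self.pending = 0
        self.enable = 0

    def raise_event(self, source):
        # A pulse on the trigger of `source` sets its pending bit.
        self.pending |= 1 << source

    def write_pending(self, value):
        # Writing 1 to a bit clears the corresponding event.
        self.pending &= ~value

    def write_enable(self, value):
        self.enable = value & ((1 << self.n_sources) - 1)

    @property
    def irq(self):
        # High whenever there is a pending and unmasked event.
        return (self.pending & self.enable) != 0


em = EventManagerModel(n_sources=2)
em.raise_event(0)
print(em.irq)          # event is pending but masked
em.write_enable(0b01)
print(em.irq)          # event is pending and enabled
em.write_pending(0b01)
print(em.irq)          # event has been acknowledged
```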
80 82
 
81  
-.. _asmi:
  83
+.. _lasmi:
82 84
 
83  
-Advanced System Memory Infrastructure
84  
-*************************************
  85
+Lightweight Advanced System Memory Infrastructure
  86
+*************************************************
85 87
 
86 88
 Rationale
87 89
 =========
@@ -103,58 +105,38 @@ The first two techniques are explained with more details in [drreorder]_.
103 105
 
104 106
 .. [drreorder] http://www.xilinx.com/txpatches/pub/documentation/misc/improving%20ddr%20sdram%20efficiency.pdf
105 107
 
106  
-To enable the efficient implementation of these mechanisms, a new communication protocol with the memory controller must be devised. Migen and Milkymist SoC (-NG) implement their own bus, called ASMIbus, based on the split-transaction principle.
107  
-
108  
-Topology
109  
-========
110  
-The ASMI consists of a memory controller (e.g. ASMIcon) containing a hub that connects the multiple masters, handles transaction tags, and presents a view of the pending requests to the rest of the memory controller.
111  
-
112  
-Each master has a number of dedicated transaction slots allocated inside the hub. Each slot is assigned a tag, that is later used in the data transfer to identify the slot the data belongs to.
113  
-
114  
-It is suggested that memory controllers use an interface to a PHY compatible with DFI [dfi]_. The DFI clock can be the same as the ASMIbus clock, with optional serialization and deserialization taking place across the PHY, as specified in the DFI standard.
115  
-
116  
-.. figure:: asmi_topology.png
117  
-   :scale: 85 %
  108
+Migen and milkymist-ng implement their own bus, called LASMIbus, that features the last two techniques. Grouping by row had been previously explored with ASMI, but difficulties in achieving timing closure at reasonable latencies in FPGAs, combined with an uncertain performance pay-off for some applications, discouraged work in that direction.
118 109
 
119  
-   ASMI topology.
120  
-
121  
-Signals
122  
-=======
123  
-The ASMIbus consists of two parts: the control signals, and the data signals.
124  
-
125  
-The control signals are used to issue requests.
126  
-
127  
-* Master-to-Hub:
128  
-
129  
-  * ``adr`` communicates the memory address to be accessed. The unit is the word width of the particular implementation of ASMIbus.
130  
-  * ``we`` is the write enable signal.
131  
-  * ``stb`` qualifies the transaction request, and should be asserted until ``ack`` goes high.
  110
+Topology and transactions
  111
+=========================
  112
+The LASMI consists of one or several memory controllers (e.g. LASMIcon from milkymist-ng), multiple masters, and a crossbar interconnect.
132 113
 
133  
-* Hub-to-Master
  114
+Each memory controller can expose several bank machines to the crossbar. This way, requests to different SDRAM banks can be processed in parallel.
134 115
 
135  
-  * ``tag_issue`` is an integer representing the transaction ("tag") attributed by the hub. The width of this signal is determined by the maximum number of in-flight transactions that the hub port can handle.
136  
-  * ``ack`` is asserted when ``tag_issue`` is valid and the transaction has been registered by the hub. A hub may assert ``ack`` even when ``stb`` is low, which means it is ready to accept any new transaction and will do as soon as ``stb`` goes high.
  116
+Transactions on LASMI work as follows:
137 117
 
138  
-The data signals are used to complete requests.
  118
+1. The master presents a valid address and write enable signals, and asserts its strobe signal.
  119
+2. The crossbar decodes the bank address and, in a multi-controller configuration, the controller address, and connects the master to the appropriate bank machine.
  120
+3. The bank machine acknowledges the request from the master. The master can immediately issue a new request to the same bank machine, without waiting for data.
  121
+4. The bank machine sends data acknowledgements to the master, in the same order as it issued requests. After receiving a data acknowledgement, the master must either:
139 122
 
140  
-* Hub-to-Master
  123
+  * present valid data after a fixed number of cycles (for writes). Masters must hold their data lines at 0 at all other times so that they can be simply ORed for each controller to produce the final SDRAM write data.
  124
+  * sample the data bus after a fixed number of cycles (for reads).
141 125
 
142  
-  * ``tag_call`` is used to identify the transaction for which the data is "called". It takes the tag value that has been previously attributed by the hub to that transaction during the issue phase.
143  
-  * ``call`` qualifies ``tag_call``.
144  
-  * ``data_r`` returns data from the DRAM in the case of a read transaction. It is valid for one cycle after CALL has been asserted and ``tag_call`` has identified the transaction. The value of this signal is undefined for the cycle after a write transaction data have been called.
  126
+5. In a multi-controller configuration, the crossbar multiplexes write and data signals to route data to and from the appropriate controller.
145 127
 
146  
-* Master-to-Hub
  128
+When there are queued requests (i.e. more request acknowledgements than data acknowledgements), the bank machine asserts its ``lock`` signal which freezes the crossbar connection between the master and the bank machine. This simplifies two problems:
147 129
 
148  
-  * ``data_w`` must supply data to the controller from the appropriate write transaction, on the cycle after they have been called using ``call`` and ``tag_call``.
149  
-  * ``data_wm`` are the byte-granular write data masks. They are used in combination with ``data_w`` to identify the bytes that should be modified in the memory. The ``data_wm`` bit should be low for its corresponding ``data_w`` byte to be written.
  130
+#. Determining to which master a data acknowledgement from a bank machine should be sent.
  131
+#. Having to deal with a master queuing requests into multiple different bank machines which may collectively complete them in a different order than the master issued them.
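The condition under which ``lock`` is asserted can be expressed as a simple counter. The following plain-Python sketch models only that bookkeeping (more request acknowledgements than data acknowledgements); it is not Migen code, and the class ``BankMachineLockModel`` with its method names is hypothetical.

```python
class BankMachineLockModel:
    """Tracks queued requests and derives the lock signal from them."""

    def __init__(self):
        self.outstanding = 0  # request acks minus data acks

    def request_ack(self):
        # The bank machine has accepted a request from the master.
        self.outstanding += 1

    def data_ack(self):
        # The bank machine has signalled data for the oldest request.
        if self.outstanding == 0:
            raise RuntimeError("data acknowledgement without a queued request")
        self.outstanding -= 1

    @property
    def lock(self):
        # Asserted while requests are queued, freezing the crossbar path.
        return self.outstanding > 0


bm = BankMachineLockModel()
bm.request_ack()
bm.request_ack()
print(bm.lock)  # two requests queued
bm.data_ack()
bm.data_ack()
print(bm.lock)  # queue drained
```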
150 132
 
151  
-In order to avoid duplicating the tag matching and tracking logic, the master-to-hub data signals must be driven low when they are not in use, so that they can be simply ORed together inside the memory controller. This way, only masters have to track (their own) transactions for arbitrating the data lines.
  133
+For each master, transactions are completed in-order by the memory system. Reordering may only occur between masters, e.g. a master issuing a request that hits a page may have it completed sooner than a master that earlier requested a precharge/activate cycle of another bank.
152 134
 
153  
-Tags represent in-flight transactions. The hub can reissue a tag as soon as the cycle when it appears on ``tag_call``.
  135
+It is suggested that memory controllers use an interface to a PHY compatible with DFI [dfi]_. The DFI clock can be the same as the LASMIbus clock, with optional serialization and deserialization taking place across the PHY, as specified in the DFI standard.
154 136
 
155 137
 SDRAM burst length and clock ratios
156 138
 ===================================
157  
-A system using ASMI must set the SDRAM burst length B, the ASMIbus word width W and the ratio between the ASMIbus clock frequency Fa and the SDRAM I/O frequency Fi so that all data transfers last for exactly one ASMIbus cycle.
  139
+A system using LASMI must set the SDRAM burst length B, the LASMIbus word width W and the ratio between the LASMIbus clock frequency Fa and the SDRAM I/O frequency Fi so that all data transfers last for exactly one LASMIbus cycle.
158 140
 
159 141
 More explicitly, these relations must be verified:
160 142
 
@@ -163,7 +145,3 @@ B = Fi/Fa
163 145
 W = B*[number of SDRAM I/O pins]
164 146
 
165 147
 For DDR memories, the I/O frequency is twice the logic frequency.
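As a worked example of the relations B = Fi/Fa and W = B*[number of SDRAM I/O pins], the following plain-Python sketch computes both values. The helper ``lasmi_parameters`` and the numbers used (a 16-pin DDR device with a 100 MHz logic clock, hence a 200 MHz I/O frequency, and a 100 MHz LASMIbus clock) are illustrative assumptions, not values taken from a particular design.

```python
def lasmi_parameters(fi_hz, fa_hz, io_pins):
    """Return (burst length B, bus word width W) for the given frequencies.

    fi_hz: SDRAM I/O frequency, fa_hz: LASMIbus clock frequency.
    """
    if fi_hz % fa_hz != 0:
        raise ValueError("Fi must be an integer multiple of Fa")
    burst_length = fi_hz // fa_hz       # B = Fi/Fa
    word_width = burst_length * io_pins  # W = B * number of I/O pins
    return burst_length, word_width


# DDR: the I/O frequency is twice the 100 MHz logic frequency.
b, w = lasmi_parameters(fi_hz=200_000_000, fa_hz=100_000_000, io_pins=16)
print(b, w)
```

Halving the LASMIbus clock with the same memory doubles both the burst length and the bus word width, so that each transfer still lasts exactly one LASMIbus cycle.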
166  
-
167  
-Using ASMI with Migen
168  
-=====================
169  
-TODO: please document me!
