UltraPlus I2C and SPI #174

widgetlords · 2018-07-18T23:04:23Z

Has anyone tried the hard I2C and SPI modules in the UltraPlus FPGAs and got them working? I have spent the past few days tinkering away using the information in these documents:

http://www.latticesemi.com/-/media/LatticeSemi/Documents/ApplicationNotes/AD/AdvancediCE40SPII2CHardenedIPUsageGuide.ashx?document_id=50117
http://www.latticesemi.com/~/media/LatticeSemi/Documents/TechnicalBriefs/SBTICETechnologyLibrary201504.pdf

So far the SPI module works perfectly right up until you try to transmit a byte. Once you do, the module just clocks out the same byte repeatedly until you disable the module or transmit a different byte, at which point it clocks out the new byte ad infinitum.

On the I2C side of things the module never seems to assert the SBACKO line when writing to its registers and so anything that waits on it will hang.

I have not yet ruled out my design / software as the cause of this but I'm not sure where to go from here as I cannot find a single working example anywhere, even from Lattice.

Also, if anyone has access to the proprietary tools, does the module generator produce readable HDL code we can study?


daveshah1 · 2018-07-19T06:21:09Z

Unfortunately the I2C and SPI cores are somewhat annoying to use. Although I added them to icestorm for completeness, I would advise you to avoid them if possible and use soft cores instead.

The I2C module is hanging because of a major typo in the datasheet. The correct register set for the UltraPlus is not the one titled "iCE40 UltraLite and iCE40 UltraPlus", but "iCE40LM and iCE40 Ultra". Hopefully once you start following that it will at least acknowledge you.

The only working example of I2C/SPI is @mmicko's work here: https://github.com/mmicko/mikrobus-upduino/blob/master/src/picosoc/firmware.c and https://github.com/mmicko/mikrobus-upduino/blob/master/src/picosoc/ip_wrapper.v

widgetlords · 2018-07-20T19:09:09Z

Thank you for the information, that was exactly what I needed. I have the SPI module working correctly now and I will attempt I2C next week.

widgetlords · 2018-07-26T17:02:37Z

Well, I have got the I2C module to acknowledge register updates, and I have pored over both the Lattice documentation and the C code in that example, but I can't get the I2C to actually do anything on the output pins. I load the address byte into I2CTXDR, write 0x94 to the command register, and wait for the TRRDY bit in I2CSR. It never gets set; apparently the only bit set in the register is RARC, or "received acknowledge".
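For readers following along, the 0x94 value can be decoded bit by bit. The sketch below assumes the I2CCMDR bit layout as it appears in the Lattice usage guide (STA in bit 7, STO in bit 6, RD in bit 5, WR in bit 4, ACK in bit 3, CKSDIS in bit 2, RBUFDIS in bit 1). Treat these positions as an assumption and check them against the register table for your device.

```python
# Assumed I2CCMDR bit masks (from the Lattice usage guide; verify for your part).
CMDR_BITS = {
    "STA": 0x80,
    "STO": 0x40,
    "RD": 0x20,
    "WR": 0x10,
    "ACK": 0x08,
    "CKSDIS": 0x04,
    "RBUFDIS": 0x02,
}


def decode_cmdr(value):
    """Return the names of the command bits set in an 8-bit I2CCMDR value."""
    return [name for name, mask in CMDR_BITS.items() if value & mask]


print(decode_cmdr(0x94))  # prints ['STA', 'WR', 'CKSDIS']
```

Under that assumed layout, 0x94 is "start + write, clock stretching disabled", which matches the first step of a master write.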

widgetlords · 2018-07-27T19:21:05Z

I figured it out: getting the bus timing exactly right is apparently critical.

mithro · 2018-07-27T20:17:30Z

@widgetlords Would be awesome if you write up some details about this!

widgetlords · 2018-07-31T21:13:35Z

@mithro The current application I'm working on is proprietary but we are considering documenting something about this and releasing it. I will post a link to it here if and when it happens.

mithro · 2018-07-31T22:06:37Z

@widgetlords I was actually talking about the bus timing details and using the I2C hard block (although I would love to know about the project too).

Ideally we should collect enough details that we can create an accurate simulation model which checks that you have satisfied the requirements needed by real hardware.

mkvenkit · 2019-05-23T01:19:21Z

> The I2C module is hanging because of a major typo in the datasheet. The correct register set for the UltraPlus is not the one titled "iCE40 UltraLite and iCE40 UltraPlus", but "iCE40LM and iCE40 Ultra". Hopefully once you start following that it will at least acknowledge you.

Hi @daveshah1

Which datasheet are you referring to above? If you could point to the exact reference or post a screenshot it would be immensely helpful.

Regards

Mahesh

daveshah1 · 2019-05-25T05:59:35Z

I was referring to http://www.latticesemi.com/-/media/LatticeSemi/Documents/ApplicationNotes/AD/AdvancediCE40SPII2CHardenedIPUsageGuide.ashx?document_id=50117

mkvenkit · 2019-05-25T06:30:42Z

Thanks, @daveshah1

bunnie · 2020-02-19T06:47:44Z

I had the misfortune of having to use the SB_I2C hardened IP block, in order to free up some gates, and came across this thread while looking for docs, and figured I'd add my 2c here.

The good news:

Going from pure RTL (https://github.com/betrusted-io/betrusted-ec/blob/e0f21858cd2cbb6448173f63467a93c8458c6798/rtl/rtl_i2c.py#L11) to SB_I2C (https://github.com/betrusted-io/betrusted-ec/blob/e0f21858cd2cbb6448173f63467a93c8458c6798/rtl/hard_i2c.py#L12) saves about 180 LC on an ICE40 UP5K (3.5%). I think quite a few more LC could be shaved off by letting the top 24 bits of the data bus go undefined.

The bad news:

- The docs referred to in @daveshah1's comment above are only semi-accurate.
- There are some significant limitations in using the SB_I2C block (discussed below).

The TL;DR is that I would recommend using the SB_I2C block only if you are really out of gates and this is the only way to optimize a few LC out of the design. Previously we used an OpenCores Verilog implementation glued into Python that was very well tested and well behaved, and had a "sane" driver interface; it basically came up without a hitch and never caused us heartburn.

OK, so from here on out -- these are notes for people who are thinking about using the I2C block themselves:

The SB_I2C block uses a wishbone-oid interface. The link above to hard_i2c.py will show you how to integrate. They only provide a signal called "STB" which actually needs to be mapped to "CYC", not "STB", because they lack a "CYC" signal. The block also does not pay attention to CTI, etc. You must make sure that your wishbone interface is configured to be non-caching for the region.

The Lattice docs say that bits 7:4 depend upon the location of the block (upper right or upper left) but looking through Clifford's notes it seems maybe it's actually set by a parameter p_BUS_ADDR74. I didn't resolve this but just in case I put a hard BEL constraint on it so it doesn't move around.

I took the strategy of just mapping the address and data bits straight over to wishbone, so that the 8-bit registers are actually strided over words, and the upper 24 bits are wasted. Thus the address table given in the docs needs to be multiplied by 4 to get the actual offsets. The code for the driver is here: https://github.com/betrusted-io/betrusted-ec/blob/e0f21858cd2cbb6448173f63467a93c8458c6798/sw/betrusted-hal/src/hal_hardi2c.rs#L1 (sorry, we only wrote a Rust version).
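To make the "multiply by 4" rule concrete, here is a minimal sketch. The byte offsets in the table are an illustrative subset that I believe matches the "iCE40LM and iCE40 Ultra" register set, but they are an assumption here; take the real values from the register table in the usage guide. The block-select bits 7:4 discussed above are left out.

```python
# Byte offsets as they would appear in the register table
# (illustrative subset, assumed values; verify against the docs).
BYTE_OFFSETS = {
    "I2CCR1": 0x8,
    "I2CCMDR": 0x9,
    "I2CSR": 0xC,
    "I2CTXDR": 0xD,
    "I2CRXDR": 0xE,
}

WORD_STRIDE = 4  # each 8-bit register occupies one 32-bit wishbone word


def cpu_offset(name):
    """Offset of a register as seen by the CPU when registers are word-strided."""
    return BYTE_OFFSETS[name] * WORD_STRIDE


for name in BYTE_OFFSETS:
    print(f"{name}: 0x{cpu_offset(name):02x}")
```

With these assumed values, a status register listed at 0xC in the docs would be read at offset 0x30 from the block's base address.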

The flow chart and timing diagrams on pages 24-30 of the PDF file linked above are pretty handy, except they are wrong. Or rather, I'm guessing the core behavior was slightly tweaked between different versions of the FPGA and they just didn't bother documenting all the differences. The biggest trick in driving the block is that the block requires the host to intervene in real-time to guide the I2C transaction. In other words:

1. The current command in the command register is sticky. Once you have set a command in the register, that command is repeatedly issued until it is disabled or changed.
2. It takes some time for new commands and parameters to be accepted by the block. In other words, it is not a valid strategy to simply jam a "write" command into the block for one cycle and then clear it to work around (1). The "write" command needs to stay in place for longer than one I2C bus cycle, and also needs to be cleared within about one I2C bus cycle of the command's completion to avoid it issuing again.

This means you have race conditions on the "fast" side (making sure the commands are around long enough to be accepted by the I2C block) and on the "slow" side (making sure the commands are cleared soon enough to avoid issuing a second command by accident).

The upshot is that if you're driving this with a Vexriscv block running at 12MHz, you don't actually have a lot of margin to play with. Interrupts may not be serviced in time to meet the "slow" constraint; but at 12MHz, it's "fast" enough that you can't simply fire and forget.
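A rough way to see the margin is to count CPU cycles per SCL period. The 12MHz figure is from the paragraph above; the 100 kHz and 400 kHz bus rates are assumptions (the common standard-mode and fast-mode rates), and reading "one I2C bus cycle" as one SCL period is also an assumption.

```python
def cpu_cycles_per_scl(cpu_hz, scl_hz):
    """Number of CPU clock cycles that fit in one SCL period."""
    return cpu_hz / scl_hz


CPU_HZ = 12_000_000  # CPU clock from the text

for scl_hz in (100_000, 400_000):  # assumed bus rates
    cycles = cpu_cycles_per_scl(CPU_HZ, scl_hz)
    print(f"SCL {scl_hz} Hz: {cycles:.0f} CPU cycles per SCL period")
```

That works out to 120 CPU cycles per SCL period at 100 kHz and 30 at 400 kHz, which is little room for an interrupt entry, a cache miss, or a few bus accesses.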

Thus the driver in essence requires a lot of polling to be done to make sure everything occurs in exact lock-step.

The documentation in the PDF would hint that TRRDY is the one bit to watch to synchronize things. The TRRDY bit indicates that the Rx or Tx register (depending on the mode) has been copied into the I2C hard IP block, and the host is now safe to update the contents (or read it for Rx).

If you implement the flowchart on page 24 exactly as shown, what you end up with is just the very first cycle of either a read or write being produced, and all subsequent cycles are skipped.

For writes, not only do you have to wait until TRRDY is asserted, you need to wait until the initial slave address transaction (concurrent with the "STA" bit) indicates completion (by monitoring the TIP bit). If you simply load the next value into Txd and issue a write command upon TRRDY, the system will ignore it.

However, once you have completed that, you can now monitor TRRDY and issue WR commands to issue successive writes.

In other words, the flow chart needs to be modified to say "Wait for TIP to clear" after the initial "TXDR/CMDR" box in order to be correct.
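The modified write flow can be sketched roughly as follows. This is not the Rust driver linked above, only an illustration of the ordering: `bus` is a hypothetical object exposing `write_txdr()`, `write_cmdr()` and `read_sr()`, the bit masks are assumed from the Lattice usage guide, and none of the real-time constraints described earlier (sticky commands, minimum command hold time) are modeled.

```python
# Assumed bit masks; verify against the register tables for your device.
CMD_STA, CMD_STO, CMD_WR, CMD_CKSDIS = 0x80, 0x40, 0x10, 0x04
SR_TIP, SR_TRRDY = 0x80, 0x04


def wait_until(read_status, predicate, max_polls=10_000):
    """Poll the status register until predicate(status) holds; False on time-out."""
    for _ in range(max_polls):
        if predicate(read_status()):
            return True
    return False


def i2c_write(bus, addr7, data):
    """Write `data` to the 7-bit address `addr7`; returns False on any time-out."""
    # Initial "TXDR/CMDR" box: address byte with R/W = 0, then start + write.
    bus.write_txdr((addr7 << 1) | 0)
    bus.write_cmdr(CMD_STA | CMD_WR | CMD_CKSDIS)

    # Extra step missing from the flow chart: wait for TIP to clear.
    if not wait_until(bus.read_sr, lambda s: not (s & SR_TIP)):
        return False
    # Then wait for TRRDY, as the flow chart already says.
    if not wait_until(bus.read_sr, lambda s: s & SR_TRRDY):
        return False

    # Successive data bytes: load TXDR, issue WR, wait for TRRDY.
    for byte in data:
        bus.write_txdr(byte)
        bus.write_cmdr(CMD_WR | CMD_CKSDIS)
        if not wait_until(bus.read_sr, lambda s: s & SR_TRRDY):
            return False

    # Finish with a stop condition.
    bus.write_cmdr(CMD_STO | CMD_CKSDIS)
    return True
```

Bounding every poll with a time-out keeps a hung block from hanging the caller, which is what makes it possible to retry a failed transfer.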

I have found that "repeated-start" commands also don't work. A "repeated-start" causes (iirc) the read data cycles following the "read start" command to disappear. Thus, the work-around is to always conclude every write phase with a "full stop" before moving on to the read phase. Minor performance loss, no biggie.

For the read side, the document does accurately specify to wait for "SRW" instead of "TRRDY" after the initial "STA+R" command. The read side flow chart is otherwise mostly correct except that I couldn't get the "1 byte read" condition to work. Because of the double-buffering they put on the I2C read side, there is an even stricter race condition imposed, where you must issue the "RD+NACK+STOP" command within a 2tSCL to 7tSCL window, or else things blow up (usually ends up with a weird runt cycle or the I2C block gets hung clocking SCL forever).

I was unable to find a way to reliably hit the 2tSCL to 7tSCL window. I had increased the hardware timer precision and tried various combinations off of that, but it always seemed I was either too fast or too slow. If your system uses caching, does XIP from SPI, or has interrupts, that would also cause this to blow up.

Thus, my final driver implementation works around this by simply not allowing single-byte reads. I have an "assert" in the code to catch that, but another valid way to deal with it might be to simply issue two reads even if a single read is requested. In many cases it is harmless to read an extra byte from the I2C device; the main impact would be, e.g., if you were relying on the read address pointer in the target device incrementing by only one byte. Fortunately for my application, this is not the case, so I didn't have to solve this last detail.
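The "read two, keep one" alternative could look something like this. It is a sketch of the idea only, not what the driver above does (that one asserts instead); `raw_read(n)` is a hypothetical low-level call that returns `n` bytes and is only reliable for `n >= 2`.

```python
def read_bytes(raw_read, count):
    """Read `count` bytes while never asking the hardware for fewer than two."""
    if count <= 0:
        return b""
    data = raw_read(max(count, 2))  # pad a 1-byte request up to 2 bytes
    return data[:count]             # drop the padding byte when count == 1
```

Note that the target still sees two bytes being read, so its internal address pointer advances by two even though the caller only gets one byte back.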

I will note that there is a "RBUFDIS" function that is not well documented that might solve the above problem. In the flow chart examples, they always set CKSDIS but don't explain why; I just do it in my code because that's what they recommended. I imagine that if you set the RBUFDIS signal, you would no longer have that weird race condition on the RD+STO+NACK cycle, but instead you'd have another race condition timing when to read the data out of the Rxd register. I didn't want to find out which was worse, but this footnote is here for anyone who decides they absolutely must have the ability to read a single byte from a slave device using this hard IP block.

Finally, I put some diagnostics in my code to check how often we hit time-outs at places I wouldn't expect them, and I also explicitly wait for things like TRRDY to go "not ready" even though the flow chart doesn't call for it to ensure proper interlocking. Despite these measures, a small fraction of the I2C operations still trigger time-outs and fail. Therefore, all the calls to the I2C API in my implementation now check the return code and retry the operation if there is a failure. I did not go into why I had the rare time-outs, or what causes them, because my targets all support stateless read/write, e.g., I can afford to just keep on retrying the read or write until it works, but not all I2C targets are like this.
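The check-and-retry pattern is simple enough to sketch. Here `operation` is a hypothetical zero-argument callable that returns True on success and False on a time-out; the attempt limit is an arbitrary illustrative choice.

```python
def with_retries(operation, attempts=3):
    """Call operation() until it reports success or the attempts run out."""
    for _ in range(attempts):
        if operation():
            return True
    return False
```

As noted above, this is only safe when the target tolerates a repeated read or write; a device with side effects on access would need something smarter.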

There is a mysterious "SDA delay" parameter, and apparently there is some mention elsewhere of a "glitch filter", a hard IP block that seems like it was meant to be used with the SB_I2C and may be instantiated by the proprietary tool. Perhaps adding these or tuning these parameters would solve the reliability problems, but the docs are sparse on this.

Good luck! I agree with a note I saw from Clifford elsewhere -- this block is pretty horrible and fussy, and you should use an RTL IP block if you can. But if you're out of gates -- you're out of gates, and I hope these notes can help you.
