serial data corruption at 460800 #45
Comments
My bad, the sequence should have been sent as $9a 36 80 |
I'm seeing something similar - sometimes a line like '400 .' won't return 400 |
Interesting, the 1st sequence pushed the incorrect data, the 2nd corrected sequence, with decimal 80 instead of hex, produced the correct data .... is there a parity/8-bit issue? EDIT: Nope, it's just random .... it happened again with both lots of data .. |
More updates on this issue: I have noticed that when I program the board including some Forth which loads and runs on boot, it outputs the correct data while power is on, but if I unplug it and plug it back in a few times, it will eventually start outputting the wrong data. Once in that state, it never sends correct data any more. If I re-flash the board, that does not help. However, if I re-build the complete image including the Forth, I can then get it going again .... and the cycle is repeated .... This is a big worry for me, as I am totally relying on correct serial data for my project .... this is a killer ... |
I was looking through how the uart.v module is used and noticed it is never reset, is this the issue? instead of
Should it not be:
|
It shouldn't matter... Read up on resets in SRAM based FPGA's: there's a (Which can be done in the j1a/j4a with '4 $800 io!', which triggers the I'm currently looking at this issue because '5000 dup .x .' is For building deployed apps, I've taken to using verilator (make It never seems to refuse textual input, so I wonder if it isn't a timing Do what you can to collect data, particularly the specifically broken |
Well it "will" matter because it means that there is a chunk of code in What's the point of having code to respond to a reset in the module if it doesn't get called ..
|
Update: Adding the resetq to the top level of the uart module instance did not help in this case. However, there is an issue with the uart.v code. I dumped it out and replaced it with a hacked version of an opencore verilog uart module (I had to hack the interface to suit ..) .. That is working great, I have had my serial test code running for 30 mins and unplugged/plugged in at least 20 times at the moment without failure (with the old uart.v, it would fail after cycling the power only 5 times). I will keep testing ... The opencore module has better timing, as it over-samples by 16 .. which at 460,800 is probably important .. EDIT: Still working 6 hrs later, looks like the issue is fixed. If anyone wants I can post the code. |
Reasons to reset are things like glitches or soft errors caused by ionising It's possible to define initial values in verilog besides using a reset, uart.v, it turns out, does use initialisation values as well as All of the above is a bit beside the point here. If the system clock isn't 12 MHz and the baud rate isn't 115200, then both A useful patch would be to make these two into parameters, so they can Look at stack2.v and how the DEPTH parameter is used when it's instantiated |
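To make the clock/baud dependency concrete, here is a minimal sketch of the arithmetic behind an integer baud divider. It assumes a plain divide-by-N tick generator; the function name `uart_divider` and the `ticks_per_bit` parameter are hypothetical and are not taken from uart.v, whose exact counting scheme is not shown in this thread.

```python
def uart_divider(clk_hz: int, baud: int, ticks_per_bit: int = 1):
    """Return (divider, actual_baud, error_percent) for an integer clock divider.

    ticks_per_bit is 1 for a plain bit-rate tick, or e.g. 16 for a
    16x-oversampling receiver. This models only the rounding error of the
    divider, not the behaviour of any particular UART implementation.
    """
    # Nearest integer divider; never go below 1.
    divider = max(1, round(clk_hz / (baud * ticks_per_bit)))
    actual_baud = clk_hz / (divider * ticks_per_bit)
    error_percent = 100.0 * (actual_baud - baud) / baud
    return divider, actual_baud, error_percent


if __name__ == "__main__":
    for clk in (12_000_000, 48_000_000):
        for baud in (115200, 460800, 921600):
            d, actual, err = uart_divider(clk, baud)
            print(f"clk={clk} baud={baud} divider={d} "
                  f"actual={actual:.0f} error={err:+.2f}%")
```

Passing the clock and baud rate in as arguments mirrors the suggestion of turning the two constants into parameters, so a different board clock only changes the call, not the code.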
Remy, you are missing the point. Sure, the constants at the top of uart.v need to be changed for different baud rates; I am not entirely without a brain cell. The point is that there is a timing error in uart.v and/or an initializing error. The opencore uart.v works correctly, the one in swapforth does not ..... end of story. Either the uart.v in swapforth needs to be debugged and fixed, or the "better" uart.v from opencore included in its place, as I have successfully done. As I have stated, I have modified the opencore uart for swapforth so it is a direct replacement without having to change the top level... you seem to be not even interested in trying it to see if your errors go away ... which is surprising .... |
Thanks @bmentink for your efforts -- please can you supply the better UART as a patch? Please restrict the subject matter to the project itself. Criticizing the uart is great. Criticizing people, not great. Thanks. |
I can supply it as a patch, but I won't have git "push" permissions will I ?? |
@bmentink GitHub is a little odd in that submitting patches is somewhat complicated, at least to set up the very first time for newcomers -- but this is at least partially git's 'fault'. You don't need git push permissions to James's repo. In apology, I'll detail the steps here:
I hope that helps. Yes, it's a bit long winded. Later you repeat just steps 3 to 5 for each new suggested patch. I'd be grateful just to see the new openuart.v attached here in a zipfile. |
Hi Remy & James, I also apologize if I caused any offence ... was a bit frustrated :) Attached is the zip file containing a single file uart2.v ... Sorry, haven't got the time at present to do all the git stuff ... will get Cheers,
|
Ok Bernie - I can't see the attachment though, neither through my email copy of this thread, nor at #45 |
Oh, I attached it in an email reply .... nevermind, you can download it from here: |
How did you get on? Fix your issue @RGD2 ? |
Hmm... Very odd. But with the j4a @ 48MHz on the ice40hx8k breakout board, it drops characters extremely often, i.e. within 5 characters. I've added it as a branch hanging off the end of my current work - https://github.com/RGD2/swapforth/tree/uart2test |
( At some point, I will go through and rebase/squash/separate/clean a lot of the commits on my j4a-pmod branch... it's gotten to be a bit of a fork, and not everything ought to go back into master. ) I'll make a branch rebased off master to pull in with uart2.v later, if someone else doesn't beat me to it. |
I have been testing with the j1a8k on an ice40hx8k breakout board ... all good, no character drops. With the j4a, doesn't each task run at an effective 12 MHz? Won't that be an issue? |
Yes, and no. The core actually runs to the same clock internally, there's My issue could be due to the IO subsystem timing though - it could be the If that's it then I can avoid it by pipelining IO reads - possible with Thanks - I'll chase that up when I have a moment. |
I noticed an interesting/annoying aspect of serial with the hx8k breakout board. I have my drum trigger project running well now and it sends MIDI commands over serial at the 460k rate just fine. However, if I turn my laptop off and restart it, as opposed to just suspending it, I lose communication with the board. I either get no response from the board or rubbish characters .. The only way I can restore operation is to do a complete build again including the Forth code. If I just try to flash the bin file, that does not restore the board .. I am wondering if it is an issue with the reset line? I am clearing DTR in the software that I use to receive the serial MIDI commands .. do I need to do anything with RTS? Looking at the schematic for the board, I can't even see where those lines are used for that aux serial port .. Any ideas? |
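No, because you don't need either DTR or RTS, or any other auxiliary control line: just RX and TX. Perhaps our issues are to do with the FTDI chip?? The 'production' j4a I've been using still suffers from the issue, but it seems only to affect the board's reception of some numbers. I.e., if I need to put 5500 on the stack, I'll append 'dup .' to see if it got there ok, and if not (as quite often happens) I'll hit the up arrow, and change the line to 'drop 5500 dup .' and keep repeating that until 5500 comes back. It's not transmission to the PC that seems to be the problem in my case: because once the wrong (or right) value is echoed, I never get a different answer just repeating 'dup .'. And the issue doesn't seem to bother text - otherwise it would have spat the dummy at words it should know. Although... Forth is case insensitive, so if the bit that encodes case in ASCII is flipped, it wouldn't necessarily complain. But I'm still not sure how it gets some of the "wrong" values I've been seeing. This workaround is good enough to get me by: all my recorded data streams through a different board which uses an FX2 USB chip. And I've collected more than 2.5 TB through that over the last three months without much trouble. The only data corruption there seems to be due to noise in the SPI lines into the other FPGA, and I was able to change the wiring to eliminate it. But FTDI have gotten in trouble recently for doing nasty things if their driver thinks it's talking to a forged chip, maybe it's that? |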
I've just been reading back over this issue this morning (and cleaning out the copied comments the email responses left). Between then and now, I'd been reading up on the ice40. The behaviour you get -- about it not working until after a recompile - IS INDEED consistent with some part of the ice40's configuration ram going un-set. The ice40 documentation reveals that, at least for the BRAM, and most likely for all configuration cells, which are SRAM: If not specifically set during configuration, then the previous data stays put. I wonder if you left it powered off for at least a couple minutes before turning it on again, when you found that unplugging it didn't clear the problem? If you cut power for a few seconds only, the un-configured sram state causing the issue could well have kept its value, if the bitfile being loaded from the eeprom on boot never specifically set it. This is also why reconfiguration with the same bitfile didn't help - copying it from the PC to the on-board eeprom chip changed nothing. But recompiling and reconfiguring did help: Every time you recompile, the whole design ends up placed in essentially a different random way, so even if the same cells were left alone, a different set was taking over. At least until the design stuffed itself up somehow. If so, this does appear to be an issue not necessarily with swapforth, but possibly with the icestorm / arachne-pnr / yosys toolchain. First thing to do then would be to fully update all of those to their latest github revisions, and see if this bug still bites. We have had similar bugs 'go away' with updates to the toolchain, it is still young. |
Hi James,
> No, because you don't need either DTR or RTS, or any other auxiliary control line: just RX and TX.
> Perhaps our issues are to do with the FTDI chip??
Ok, then why does shell.py set & clear these lines then? That is confusing.
Regarding FTDI driver, I will upgrade/downgrade it to see if that is the
issue ..
Seems strange it only happens on cold boot of my Laptop .. and .. that I
have to re-program the FPGA to clear the issue .. how can that be a driver
issue? ..
Cheers,
Bernie
…On Thu, Dec 15, 2016 at 10:10 AM, RGD2 ***@***.***> wrote:
No, because you don't need either DTR or RTS, or any other auxiliary
control line: just RX and TX.
Perhaps our issues are to do with the FTDI chip??
The 'production' j4a I've been using still suffers from the issue, but it
seems only to affect the boards reception of some numbers. Ie, if I need to
put 5500 on the stack, I'll append dup . to see if it got there ok, and
if not (as quite often happens) I'll hit the up arrow, and change the line
to drop 5500 dup . and keep repeating that until 5500 comes back.
It's not transmission to the PC that seems to be the problem in my case:
because once the wrong (or right) value is echoed, I never get a different
answer just repeating dup .
And the issue doesn't seem to bother text - otherwise it would have spat
the dummy at words it should know. Although... Forth is case insensitive,
so if the bit that encodes case in ascii is flipped, it wouldn't
necessarily complain.
But I'm still not sure how it gets some of the "wrong" values I've been
seeing.
This workaround is good enough to get me by: all my recorded data streams
through a different board which uses an FX2 USB chip. And I've collected
more than 2.5 TB through that over the last three months without much
trouble. The only data corruption there seems to be due to noise in the SPI
lines into the other FPGA, and I was able to change the wiring to eliminate
it.
But FTDI have gotten in trouble recently for doing nasty things if their
driver thinks it's talking to a forged chip, maybe it's that?
|
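On the remark above that a flipped ASCII case bit might go unnoticed in a case-insensitive Forth: a small sketch of what that bit does. In ASCII, upper- and lower-case letters differ only in bit 5 (0x20). The helper name `flip_case_bit` is hypothetical and only illustrates the encoding; it says nothing about what the hardware actually does.

```python
CASE_BIT = 0x20  # bit 5 distinguishes 'a'..'z' from 'A'..'Z' in ASCII


def flip_case_bit(text: str) -> str:
    """Flip bit 5 of every ASCII letter, leaving other characters alone."""
    return "".join(
        chr(ord(c) ^ CASE_BIT) if c.isascii() and c.isalpha() else c
        for c in text
    )


if __name__ == "__main__":
    # A word still matches in a case-insensitive dictionary lookup.
    print(flip_case_bit("dup"))
    # The same bit flipped in a digit does not give another digit:
    # '5' is 0x35, and 0x35 ^ 0x20 is 0x15, a control character.
    print(hex(ord("5") ^ CASE_BIT))
```

For what it's worth, this suggests a flipped case bit alone would turn a digit into a non-digit rather than into a different number, so it may not account for the wrong numeric values by itself.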
But recompiling and reconfiguring *did* help: Every time you recompile,
the whole design ends up placed in essentially a different random way, so
even if the same cells were left alone, a different set was taking over.
At least until the design stuffed itself up somehow.
Now that makes sense ... good thought ..
I will update the tools again ..
Thanks
|
DTR is (ab?)used to control the j1a/j4a reset signal. In that context, it's a single-bit 'gpio' type thing that just happens to be available for use that way on that serial interface - it's not required to send data through the serial port, and isn't really being used as one would with a modem. When you do a 'CTRL-C' into shell.py it sends a reset signal, so if you accidentally lock up your swapforth machine, you can recover without wiping out the memory - it just does a 'soft' reset of the core. This is also why the tricks involving the This lets you connect to a 'deployed' app on a j1a, interrupting it so you can still add/change the code. I have had multiple shell.py's connected to the same j4a at once - and it even works fine, so long as each burst of IO happens at different times. (It was accidental, I left screen running an instance, and then found it later...) So, at least for the j4a, (probably because of the breakout board) the reset isn't always sent at connection time? (Or else I left it set up to reset 'thread0' only, which is the other possibility.) But you can always force a soft reset with CTRL-C, and there's also a way to force a 'harder reset' which involves the actual FPGA resetting and reconfiguring itself like a cold boot (using the 'warmboot' functionality). This can even be used to swap between FPGA images - one can have more than one in the bitfile. I use this when developing with bigger programs from |
It is required in the sense that if reset is enabled by DTR, not much serial action is going to go on, is it? The j1a is in reset ... Which is why I made sure DTR was cleared by my program on the laptop that talks out the serial port, as it seemed to come up enabled by default (high). Cheers |
Here's an actual little 'conversation' I had recently with the deployed j4a, in the middle of an experimental run. I wanted '1000' on the stack, because I was about to use it to set a certain variable...
<sigh>...
... finally! |
I was just now able to reproduce the above on a different dev board -- using an application-specific image. And my issue isn't the serial port: It does seem to be a j4a bug. Which means I haven't been seeing your bug at all. (I don't think). Sorry! But on the other hand, if the updated version fixes it for you, you might close this bug out. |
I have updated both my OS (to get an updated FTDI lib) and the synth
tools. So far I haven't had the issue, but I will keep testing a while before I
close this issue ...
|
This may help with jamesbowman#45 serial data corruption, at least with the j1a
Hitting this one up again -- this time because I have an application which needs a secondary serial line to talk to some old lab equipment at 38,400. Reception at the PC end works fine, but reception at the ice40 end never works. This seems related to the fact that the FT2232H chip on the breakout board is run from the same 12 MHz clock, but the baud generator in that chip, and the one generated from the ice40 PLL (at 48 MHz, divided down), do not always get the same phase. This seems to get much worse when playing with alternate PLL settings (e.g. using generated verilog pll blocks from icepll with different ice40 derived clock rates), and seems to explain the differences in shell.py stability comparing the j1a and j4a. I'm going to switch over to bmentink's opencore rs232 uart, and see how that goes. |
…amesbowman#39 This contains an alternate buart() module for asynchronous RS232.
- will resynchronise itself at every start bit transition
- rejects bad characters
- does not clobber last valid data
- keeps the most recently received character, 'valid' flag always clears on read.
- works reliably up to 921600 baud on both ice40hx8k breakout and iceStick
- works with other 'logic level' rs232 devices offboard, or via max232 chips to true rs232 ports.
- easy to instantiate multiple ports with different baud rates
This also adds a "make pcon" option to remember the settings for connecting with picocom, which is useful for testing character-by-character operation, as well as manual control over the reset line. It sometimes works if the uart connection is marginal, and shell.py gets stuck on connect.
Ok. I am fairly confident I understand this issue now. James's original uart suffers from a circular dependency issue -> it's sampling at 2x baud when idle, when it should be sampling as fast as possible at clk rate, until sure it's found the middle of the start bit. This means it never really phases itself properly to an asynchronous serial signal, and only works part of the time with the onboard and incidentally synchronous ft2232h on the iceStick and hx8k breakout boards (and any other board feeding the same oscillator to both chips). This also seems to explain why j1a/j4a both seem to never work when clocked by the ice40 PLL running at anything other than 12 or 48 MHz, and the latter only with PLL settings which icepll says are actually invalid. What's probably happening is the buart is breaking communications in those cases, because the PLL output doesn't happen to start up with the coincidental yet suitable phase dependency compared to the FT2232's baud generator. Finally, I'd assert that it's a bug to retain only the first byte received (until collected by the io subsystem), rather than replacing the uncollected data with the freshest byte. If something isn't collecting the data, then when it does start taking it, it should be expecting not the first glitch long ago from start up, but the most recent, validly received byte. I did try for a while to get bmentink's opencores part to work, but was more or less stymied by the fairly horrible style its state machines are written in (blocking assignments in synchronous code, rather than non-blocking). It was having different behaviour depending on how it was connecting, so I eventually grew frustrated with it and wrote my own. It may also help with #39 and possibly fix #15 as well. It's been tested on both an ice40hx8k breakout board with the j4a as well as the j1a on the iCEstick. It appears to be reliable at 38400, 460800 and even 921600, although it could probably do with more testing.
I'm going to rebase it and prepare a pull request soon.... |
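The phasing argument above can be put into numbers with a very simplified model: a receiver that polls the idle line at some rate can notice the start-bit edge up to one polling period late, and every later sample point inherits that lag. The sketch below assumes exactly that and nothing more (no jitter, clock drift or synchroniser delay); the function names are hypothetical and it is not a model of either uart.v or the replacement buart().

```python
def start_edge_uncertainty(baud: float, idle_sample_hz: float) -> float:
    """Worst-case lag, in bit periods, between the start-bit edge and the
    first idle-line sample that can see it."""
    return baud / idle_sample_hz


def mid_bit_sample_window(baud: float, idle_sample_hz: float):
    """Range (in bit periods after the true edge) where a nominal
    'half a bit after detection' sample can actually land."""
    lag = start_edge_uncertainty(baud, idle_sample_hz)
    return 0.5, 0.5 + lag


if __name__ == "__main__":
    baud = 460800
    for label, rate in (("idle sampling at 2x baud", 2 * baud),
                        ("idle sampling at 48 MHz clk", 48_000_000)):
        lo, hi = mid_bit_sample_window(baud, rate)
        print(f"{label}: sample lands between {lo:.3f} and {hi:.3f} of a bit")
```

Under these assumptions, polling at 2x baud lets the sample point slide all the way to the bit boundary, while polling at the full clock rate keeps it within about one percent of the bit centre, which is consistent with the explanation given in the comment.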
Sounds great, good work, if you want some more testing please let me know what branch you have your current code checked into, and I will try and thrash it .. |
@bmentink The branch is called fix#45, on my swapforth repo. it's pull request #48. ''' ... Ought to get it out for you. |
Hi,
That works a treat, tried both 460800 and 921600 on the 8k board.
No issues so far at 48Mhz, will try higher clocks ...
Cheers,
B.
…On Wed, May 3, 2017 at 9:10 AM, RGD2 ***@***.***> wrote:
@bmentink <https://github.com/bmentink> The branch is called fix#45, on
my swapforth repo. it's pull request #48
<#48>.
'''
git remote add rgd2 https://github.com/RGD2/swapforth
git fetch rgd2 fix#45
git checkout fix#45
'''
... Ought to get it out for you.
|
@RGD2 would be good to give this new UART a try - but I can no longer see a pull request? Did I miss it? |
The original uart.v with mecrisp-ice (UP5k, external USB serial dongle) does not work with any PLL-based clk at any of the usual baud rates, nor at any baud rate with a 30 MHz external oscillator: RX corrupts the incoming data. With uart2.v it works at 115k2 with both the 30 MHz external oscillator and the PLL. Not tested with the original j1a. Frankly, the assign below, used in the original uart.v, is something which may cause the issues: assign ser_clk = (counter == limit); I would replace that with something which is registered, or change the counter design. I can see the RX behavior change with baud rate, even when you go down to 9k6, where the counter will be wider. It most probably infers a ripple counter, where the lsb->msb propagation delay depends on the counter width. The ser_clk pulse will be delayed by the counter's propagation delay, such that it comes at the wrong moment. It may also happen that the resulting pulse becomes shorter than allowed for the always @* logic outputs where ser_clk is used. It seems to work better at higher baud rates, where the counter is shorter (thus the counter's propagation delay is smaller and does not affect the position of the ser_clk high pulse much). |
I am sending the following sequence from Forth:
$9a emit 36 emit $80 emit
in a loop, and I am seeing:
directly out of port /dev/ttyUSB1, either by cat'ing to a file and dumping it, or some terminal program ..
However, sometimes it is ok ... and I get the right data.
Is there some problem with the timing at 460800 baud? It only seems a problem after sending an 8-bit value ..
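One way to quantify "sometimes it is ok" is to capture the port to a file and check it against the expected repeating three-byte pattern. The sketch below assumes the corrected sequence from this thread ($9a, decimal 36, $80) and aligns on the first $9a it finds; the function name `find_mismatches` is hypothetical, and the capture itself would come from something like cat'ing /dev/ttyUSB1 to a file.

```python
EXPECTED = bytes([0x9A, 36, 0x80])  # $9a emit 36 emit $80 emit


def find_mismatches(capture: bytes, pattern: bytes = EXPECTED):
    """Return the offsets in `capture` that deviate from the repeating
    `pattern`, aligning on the first occurrence of the pattern's first byte.

    If that first byte never appears, every offset is reported.
    """
    start = capture.find(pattern[:1])
    if start < 0:
        return list(range(len(capture)))
    return [
        i
        for i in range(start, len(capture))
        if capture[i] != pattern[(i - start) % len(pattern)]
    ]


if __name__ == "__main__":
    good = EXPECTED * 4
    corrupted = bytearray(good)
    corrupted[5] = 0x00  # simulate one damaged byte
    print(find_mismatches(good))
    print(find_mismatches(bytes(corrupted)))
```

A single dropped byte shifts the alignment and will flag everything after it, so a long run of mismatches starting at one offset points to a lost character rather than to many separate bit errors.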