Continuous binary logging data loss #62

Closed
greydet opened this Issue Feb 23, 2011 · 20 comments

6 participants

@greydet

I have my device sending a continuous stream of binary data to the OpenLog in new file mode. I see data loss once the UART speed is greater than 9600bps.
The OpenLog datasheet says that such data loss only occurs at 57600bps and higher.

Here are the loss rates I observe as a function of the UART speed:

  • 9600bps: 0%
  • 19200bps: 0.97%
  • 38400bps: 34.33%
  • 57600bps: 43.50%
  • 115200bps: 64.45%

These stats were captured from a continuous write to the UART lasting approximately 30s.

My OpenLog firmware version is 2.41.
The device that sends data to the OpenLog is a custom board based on a Microchip dsPIC33FJ128MC802. The OpenLog is powered at 5V.

@tz1

It might make 38,400 on a perfectly new or erased micro SD card, class 6. When it gets full and rewritten it will slow down and pause up to 250ms while writing. It can only buffer a limited amount when this happens. Panasonic has an SD reformatter that will (in most cases) actually erase the card.

@greydet

Thank you for the reply tz1.

I just tried formatting the SD card and ran my test again at 38400 bps. It still loses 2.93% of the sent bytes.

Is there a way to improve this somehow by buffering more data on the OpenLog or anything else? I really can't accept such data loss in my application.

@tz1

Not directly. Maybe a newer card. The only way I was able to solve it was to add a 32k SPI SRAM chip from Microchip and buffer data there when the SD was busy. I can also go a bit faster with my FAT32/SD-SPI library since I overlap busies (wait-unbusy, write-sector instead of write-sector, wait-unbusy). Hyperlog is my version, in my GitHub repository under sparkfun, and it tends to minimize the data loss. The repository also includes the SPI version, in a different directory.

@nseidle
SparkFun Electronics member

Hi greydet - thanks for posting!

tz1 is correct in theory. Worst case according to the SD spec is a pause of 250ms, however I rarely see this in practice.

OpenLog buffers the incoming data with a 512 byte RX buffer. This is currently the internal limitation of the ATmega328 + OpenLog firmware.
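As a rough sketch of what a 512 byte buffer means in time, the snippet below computes how long continuous input takes to fill an empty buffer at each baud rate. It assumes 8N1 framing (10 bits on the wire per byte), which is not stated in this thread, and the function name is purely illustrative.

```python
BITS_PER_BYTE = 10      # assumption: 8N1 framing (1 start + 8 data + 1 stop bit)
RX_BUFFER_BYTES = 512   # RX buffer size quoted above

def ms_until_full(baud, buffer_bytes=RX_BUFFER_BYTES):
    """Milliseconds of continuous input needed to fill an empty buffer."""
    bytes_per_second = baud / BITS_PER_BYTE
    return 1000.0 * buffer_bytes / bytes_per_second

for baud in (9600, 19200, 38400, 57600, 115200):
    print(f"{baud:>6} bps: buffer full after {ms_until_full(baud):6.1f} ms")
```

Under these assumptions the buffer lasts about 533 ms at 9600bps, longer than the 250ms worst-case pause, but only about 133 ms at 38400bps, so a single worst-case pause would overflow it at the higher rates.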

I've set up many test sessions to test 'continuous' serial logging.

This morning's results:
Used Arduino v0021, Duemilanove board, 5V supply to OpenLog, OpenLog firmware v2.5 (you have 2.41, which is 'engine similar' to v2.5), and 4 different SD cards (1GB noname, 1GB SanDisk, 8GB Transcend, 16GB SanDisk). The test writes 33,000 bytes of ASCII characters using the example sketch.

57600bps:
1GB with 50MB data, catches 100% of 33,000 characters
Fresh 1GB, 1st: 30740/33000, 2nd: 30228/33000
Fresh 8GB, catches 100%, 100%, 100%
Fresh 16GB, catches 100%, 100%, 100%
8GB with 4GB data, catches 100%, 100%, 100%
Larger datasets will increase the probability of the SD card needing more time to record.

At 9600bps, nothing was lost on any card.

@nseidle
SparkFun Electronics member

I have run very large dataset tests. 2GB over on this issue:
https://github.com/nseidle/OpenLog/issues/closed#issue/56

@nseidle
SparkFun Electronics member

I think my question is how continuous is continuous? I don't have a good dsPIC platform to replicate your test. Can we instead rely on a computer serial port test? Teraterm to USB/Serial to OpenLog? I'm worried the USB/Serial adapter may inject delays that would not be seen on your dsPIC platform.

@nseidle
SparkFun Electronics member

More testing:

19200bps:
1GB noname card: 31252/33000, 32276/33000
16GB card: 100%, 100%

38400bps:
1GB noname card: 31252/33000, 32276/33000
16GB card: 100%, 100%

So I'm beginning to lean towards different card specs (write speed). Realize the 1GB is writing in FAT16, 16GB card is writing in FAT32. I'll try to do some additional testing with the 1GB card formatted for FAT32.

@nseidle
SparkFun Electronics member

Same results at FAT32.

38400bps, 1GB noname card: 32276/33000, 32276/33000.

@OldFar-SeeingArt

I, too, am getting dropped chars writing in binary mode @ 57.6k: maybe under 1% but highly annoying when trying to parse the data later.

I'm using the 1gig card Sparkfun sells. Are these too slow? I'm sending 7 bytes per sample to the card and really wanted to be able to do 1000 samples/second but it takes 1.25ms approx at 57.6k for this. If I could run at 115.2k I would be a local hero. Is there any chance of 115.2k working with a top-o-the-line sd card?

Thanks for any insight!
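A quick check of the numbers in this comment, assuming 8N1 framing (10 bits per byte on the wire, an assumption not stated here). The helper names are illustrative only.

```python
BITS_PER_BYTE = 10  # assumption: 8N1 framing

def sample_time_ms(bytes_per_sample, baud):
    """Time on the wire for one sample, in milliseconds."""
    return 1000.0 * bytes_per_sample * BITS_PER_BYTE / baud

def max_samples_per_second(bytes_per_sample, baud):
    """Highest sample rate the link can carry with no idle time."""
    return baud / (bytes_per_sample * BITS_PER_BYTE)

for baud in (57600, 115200):
    print(baud,
          round(sample_time_ms(7, baud), 3), "ms/sample,",
          round(max_samples_per_second(7, baud)), "samples/s max")
```

With 7 bytes per sample, 57600bps needs roughly 1.2 ms per sample and tops out below 1000 samples/second, while 115200bps leaves headroom for 1000 samples/second.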

@tz1

I did a fairly extensive analysis of the problem with links to references:
https://github.com/nseidle/OpenLog/issues/12
You rarely see the long pauses on SD write, but all it takes is one to drop a block of characters. You can often see them on an oscilloscope - both shorter pauses and long ones.
The only way I could get 100% reliable fast baud rates was adding an external SPI SRAM (32k*8) to buffer incoming data during the pauses. The class10s are better but they are only standard size SD. This might be done more easily with an arduino pro and SD/uSD breakout board.
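To illustrate why the buffer size matters during a pause, here is a small simulation, not the OpenLog or sramlog firmware. It pushes the bytes that arrive during one SD pause into a ring buffer that is never drained and counts what gets dropped. The 8N1 framing, the class, and the function names are assumptions made for the example.

```python
class RingBuffer:
    """Fixed-size byte ring buffer that counts bytes dropped when full."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.capacity = capacity
        self.head = 0      # next write index
        self.tail = 0      # next read index
        self.count = 0     # bytes currently stored
        self.dropped = 0   # bytes rejected because the buffer was full

    def push(self, byte):
        if self.count == self.capacity:
            self.dropped += 1
            return False
        self.data[self.head] = byte
        self.head = (self.head + 1) % self.capacity
        self.count += 1
        return True

    def pop(self):
        if self.count == 0:
            return None
        byte = self.data[self.tail]
        self.tail = (self.tail + 1) % self.capacity
        self.count -= 1
        return byte


def dropped_during_pause(capacity, baud, pause_ms, bits_per_byte=10):
    """Bytes lost if nothing is drained for pause_ms (buffer starts empty)."""
    buf = RingBuffer(capacity)
    arriving = baud * pause_ms // (bits_per_byte * 1000)
    for i in range(arriving):
        buf.push(i & 0xFF)
    return buf.dropped


print(dropped_during_pause(512, 115200, 250))
print(dropped_during_pause(32 * 1024, 115200, 250))
```

In this model a 512 byte buffer drops most of the 2880 bytes that arrive during a 250 ms pause at 115200bps, while a 32k buffer absorbs all of them. Real firmware starts a pause with a partly filled buffer, so the small-buffer case is optimistic.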

@OldFar-SeeingArt
@tz1

The SPI SRAM can be attached using the pins used for programming across from the serial side (actually just MOSI/MISO/SCLK), the power lines, and one more pin as the SRAM slave select (PB1 in my proof of concept code: https://github.com/tz1/sparkfun/tree/master/sramlog). Here is one someone else built: https://picasaweb.google.com/lh/photo/ZTfGIwEVLLKl57CjOT19XA?feat=directlink

@OldFar-SeeingArt

Based on the idea of a wide latitude of quality that exists with current cards on the market (from reading the suggested items people mentioned above), I trotted over to Staples office store, bought a Sandisk 4gig card (formatted as FAT32 incidentally) and tried it. Up to now, I have been using the 1gig cards that Sparkfun sells. Out of 7 cards, 3 have spontaneously died...

a) results at 57600: no dropouts of data during a one minute test.

b) changed my data rate to 1000 samples/second and 115200 baud: again, looks perfect.

Obviously, more testing is required to see how it works as the card fills up but so far, it looks to be a card quality issue. As more data comes in, I'll get back here to report how it goes... meanwhile, thanks to everyone!

@tz1

Get the erase program from panasonic and see if that helps.

A higher class card (class 6 vs. class 4 vs. class 2) will have shorter pauses on a fresh card.

@OldFar-SeeingArt

Found the erase program, tried it, will run tests with it. Thanks!

@OldFar-SeeingArt

I tested the Sparkfun 1gig card formatted with the Panasonic formatting program. The data was six bytes per sample, sent 1000 times per second. The test was run at 115,200 baud.

During a 5 minute test, there were several hundred errors, all apparently due to missed data during card writes. The same test on the Sandisk 4 gig card had no errors. It is possible that for longer tests, the Sandisk might start to get behind and drop chars here and there. But the data files get so big that my tests are limited to around 5 minutes or less.

So it appears that for fast baud rates, you need top quality cards to keep up.

@petejan

I also find that OpenLog drops sectors, on both V1.61 and V2.4, at high baud rates. For me 1% data loss is still no good; surely nothing should be dropped. At 2 GB, 1% is still 20 MB!

Is SRAM the solution?

At 38400 bps, 250 ms is 960 chars; at 57600 it's 1440 chars. With only 2 K it's tight, but is there a way?

With a formatted card, is there still a problem when keeping the FAT/directory up to date? It's writing the same sector all the time, so when the card is close to full the card will have to erase-write instead of just swapping out a sector.

I used EFSL before and it has quite a good caching method, keeping the sectors in memory that are being used. It needs an update to write FAT32/SDHC though.

Also, instead of writing to a serial buffer and then copying this data into a second buffer to write to the card, would you save space by keeping 2 sectors in memory: one for the current sector being written, and the other holding the next sector to write? This would get away from the need to double buffer the data. The writing to the sector cache could be done directly in the serial interrupt. The cache could also be flushed on request, or before going to low power mode.
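One possible reading of this two-sector idea, sketched as a Python model rather than AVR code: received bytes go straight into the active sector, and when it fills, the roles swap so the full sector can be written while the other one fills. The class, its method names, and the `write_sector` callback are hypothetical and do not come from the OpenLog firmware.

```python
SECTOR_SIZE = 512

class SectorPingPong:
    """Two sector-sized buffers: one filling, one waiting to be written."""

    def __init__(self, write_sector):
        self.sectors = [bytearray(SECTOR_SIZE), bytearray(SECTOR_SIZE)]
        self.active = 0          # index of the sector being filled
        self.fill = 0            # bytes used in the active sector
        self.pending = None      # index of a full sector awaiting write
        self.overruns = 0        # bytes dropped because both sectors were full
        self.write_sector = write_sector  # hypothetical card-write callback

    def _swap(self):
        self.pending = self.active
        self.active ^= 1
        self.fill = 0

    def on_rx_byte(self, byte):
        """What the serial receive interrupt would do for each byte."""
        if self.fill == SECTOR_SIZE:
            self.overruns += 1   # both sectors are full, byte is lost
            return
        self.sectors[self.active][self.fill] = byte
        self.fill += 1
        if self.fill == SECTOR_SIZE and self.pending is None:
            self._swap()

    def service(self):
        """What the main loop would do: write out a full sector if any."""
        if self.pending is not None:
            self.write_sector(bytes(self.sectors[self.pending]))
            self.pending = None
            if self.fill == SECTOR_SIZE:
                self._swap()     # the other sector filled in the meantime

    def flush(self):
        """Write everything out, including a partly filled sector."""
        while self.pending is not None:
            self.service()
        if self.fill:
            self.write_sector(bytes(self.sectors[self.active][:self.fill]))
            self.fill = 0


written = []
cache = SectorPingPong(written.append)
for i in range(1300):
    cache.on_rx_byte(i & 0xFF)
    cache.service()
cache.flush()
print([len(s) for s in written], cache.overruns)  # [512, 512, 276] 0
```

This only saves the copy between buffers. It still holds no more than one sector of slack while a write is in progress, so it does not by itself cover a pause that lasts longer than one sector of incoming data.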

@nseidle
SparkFun Electronics member

I've started contemplating an SRAM backed OpenLog (I know tz, you've been pushing me for over a year for this). I'm going to close this issue and look at two OpenLog versions. One for regular people, and one for super-fast endless datarates.

@nseidle nseidle closed this Jun 1, 2011
@tz1
tz1 commented Jun 1, 2011

I actually had a working prototype that someone else verified. And it isn't that complex to add, just another, different slave-select pin for the SRAM. A 32k Microchip part should handle anything and it has a TSSOP version. Also, 12MHz gives a better multiple for 115.2k, but you might also want to try some multiple of 921600 so you can do the higher baud rates. 14.7456MHz would hit them exactly, though it would require a bootloader tweak, but you probably can move to optiboot at the same time.
(The only other thing would be to bring out the I2C pins to a connection so you could log those kinds of sensors, and/or the ICP pin to log edges/timings/pulse stuff.)
A Class 10 device might help, but I haven't tried it:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820161411
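The clock remarks above can be checked with the usual AVR asynchronous USART relation, baud = f_osc / (16 * (UBRR + 1)), or 8 in place of 16 in double-speed (U2X) mode. The sketch below picks the divider with the smaller error for each clock. The 16MHz entry is an assumption added for comparison, not a figure from this thread, and the function name is illustrative.

```python
def best_ubrr(f_cpu, baud):
    """Return (ubrr, u2x, actual_baud, error_percent) with the smallest error."""
    best = None
    for u2x, divisor in ((False, 16), (True, 8)):
        # Nearest integer divider, expressed as the register value UBRR.
        ubrr = max(0, round(f_cpu / (divisor * baud)) - 1)
        actual = f_cpu / (divisor * (ubrr + 1))
        error = 100.0 * (actual - baud) / baud
        if best is None or abs(error) < abs(best[3]):
            best = (ubrr, u2x, actual, error)
    return best

# 16 MHz is an assumed reference clock, included only for comparison.
for f_cpu in (16_000_000, 12_000_000, 14_745_600):
    for baud in (115200, 921600):
        ubrr, u2x, actual, error = best_ubrr(f_cpu, baud)
        print(f"{f_cpu / 1e6:8.4f} MHz {baud:>6} bps: "
              f"UBRR={ubrr} U2X={u2x} error={error:+.2f}%")
```

Under this formula 14.7456MHz divides exactly to both 115200 and 921600, and 12MHz comes within about 0.2% of 115200 in double-speed mode.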

@thomasylo

At 115200 baud, use an SD card with a write speed of 40 MByte per second for continuous logging. Noname brand cards barely work at 9600. They usually do not even specify the write rate. I found Samsung's 16GB card with the above speed good enough. They also have a 32GB card that writes at 80 MByte per second.
