Random time-out while scanning or enrolling fingerprints #18

Closed
edautz opened this issue Mar 5, 2019 · 25 comments

@edautz

edautz commented Mar 5, 2019

While testing an FPM10A/DY50-2V1 3V3 sensor with a built-in touch controller, running at 57600 bps 8N1 and connected to a hardware serial port on an ESP32 Devkit-1 board, time-outs sometimes occur.

Trace made with remote debug:

Fingerprint check for existing print in database:

(D t:23504ms) 05032019 13:31:03 .
(D t:23510ms) 05032019 13:31:03 .
(D t:23518ms) 05032019 13:31:03 .
(D t:23523ms) 05032019 13:31:03 .
(D t:23530ms) 05032019 13:31:03 .
(D t:23537ms) 05032019 13:31:03 .
(D t:23544ms) 05032019 13:31:03 .
(D t:23550ms) 05032019 13:31:03 .
(D t:24140ms) 05032019 13:31:03 Image taken
(D t:24532ms) 05032019 13:31:04 Image converted
(D t:24536ms) 05032019 13:31:04 Remove finger
(D t:26097ms) 05032019 13:31:05 Timeout!

Break off and start a new search again:

(D t:34893ms) 05032019 13:31:14 Touch the fingerprint reader to search for a print...
(D t:38517ms) 05032019 13:31:18 Waiting for valid finger
(D t:40520ms) 05032019 13:31:20 Timeout!
(D t:50317ms) 05032019 13:31:30 Timeout!
(D t:60113ms) 05032019 13:31:39 Timeout!
(D t:69909ms) 05032019 13:31:49 Timeout!
(D t:79705ms) 05032019 13:31:59 Timeout!
(D t:89501ms) 05032019 13:32:09 Timeout!
(D t:99297ms) 05032019 13:32:19 Timeout!

Had to reset complete module.
A software reset didn’t work.

@brianrho
Owner

brianrho commented Mar 5, 2019

Where's the code that produced this? Use pastebin.com

@edautz
Author

edautz commented Mar 5, 2019

Code is here: https://pastebin.com/UtVugGwL

This is only a snippet from a larger code base built on several libraries.
The code is a modified version of your examples, enriched with remote debugging, web server logging and syslog logging, and it uses the built-in touch controller.

@brianrho
Owner

brianrho commented Mar 5, 2019

There's a lot of code there that could potentially be the cause. I'd suggest you first disable the remoteDebug as well as the webSocket stuff and see if it still times out, since this didn't happen when you were running the examples. Also make sure you're calling finger.begin() before you call search_database().

@edautz
Author

edautz commented Mar 5, 2019

To be clear, the timeouts also occur randomly when using the example code. My guess is about 5% of the time when scanning a fingerprint, but they are hard to reproduce.

I have included my first test code on the ESP32, which only uses two serial ports to communicate. It is fully based on your examples, with slight modifications to create a menu and to use the touch pin and the controllable LED.

https://pastebin.com/ZYQJr7NW

@brianrho
Owner

brianrho commented Mar 5, 2019

Without being able to reproduce them with the examples, there's not a lot I can do. Maybe you can try running one of them a few times un-modified till something shows up.

In your regular application code, I'd suggest you also try switching to the regular getImage(), leave out the LED control commands and perhaps the touch detection and see if things get better. I haven't tested those thoroughly since I don't have a module with the function, so can't vouch for their behaviour.

@mohammadhasanzadeh

mohammadhasanzadeh commented Mar 6, 2019

Unfortunately, I have the same problem on ZFM60 and R308 sensors.

@edautz
Author

edautz commented Mar 6, 2019

I wonder whether the timeout problem is caused by a serial buffer problem.

I have seen that the timeout only occurs when data is being transferred, not when idle. I once had an ESP8266 connected to a P1 smart meter port with a default 128-byte buffer, where buffer overruns sometimes occurred. I solved that by increasing the buffer to 1024 bytes.

I see in
https://github.com/espressif/arduino-esp32/blob/master/cores/esp32/HardwareSerial.cpp that the default buffer size is 256 bytes:
Line 55: _uart = uartBegin(_uart_nr, baud ? baud : 9600, config, rxPin, txPin, 256, invert);

There is a way to increase it:
size_t HardwareSerial::setRxBufferSize(size_t new_size)
Line 91: return uartResizeRxBuffer(_uart, new_size);
Line 92: }

Only the maximum is not clear; I assume 1024 or 2048 bytes.

I am going to experiment with those values. If this doesn't help, I am going to start debugging in FPM or the serial port to check what's going on. I think data is transferred in 128-byte packets when scanning a fingerprint. If the ESP is not able to process the buffer at a given moment, an overrun could occur. I don't know whether the FPM sensor or the ESP is able to recover from this situation.
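
To illustrate the overrun concern, here is a minimal sketch of a fixed-capacity receive buffer that drops bytes when nothing reads it in time. RxBuffer and bytesLostInBurst are hypothetical names made up for this illustration; this is not the ESP32 core's UART buffer, and the burst sizes are arbitrary.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-capacity receive buffer (illustration only).
class RxBuffer {
public:
    explicit RxBuffer(std::size_t capacity) : data_(capacity) {}

    // Called for each byte arriving from the UART. Returns false on overrun.
    bool push(std::uint8_t byte) {
        if (count_ == data_.size()) {
            ++dropped_;  // buffer full: the incoming byte is lost
            return false;
        }
        data_[(head_ + count_) % data_.size()] = byte;
        ++count_;
        return true;
    }

    // Called by the application to consume one byte. Returns false if empty.
    bool pop(std::uint8_t& out) {
        if (count_ == 0) return false;
        out = data_[head_];
        head_ = (head_ + 1) % data_.size();
        --count_;
        return true;
    }

    std::size_t available() const { return count_; }
    std::size_t dropped() const { return dropped_; }

private:
    std::vector<std::uint8_t> data_;
    std::size_t head_ = 0;
    std::size_t count_ = 0;
    std::size_t dropped_ = 0;
};

// Feeds `burst` bytes into the buffer with no reads in between
// and reports how many bytes were lost.
std::size_t bytesLostInBurst(std::size_t capacity, std::size_t burst) {
    RxBuffer rx(capacity);
    for (std::size_t i = 0; i < burst; ++i) {
        rx.push(static_cast<std::uint8_t>(i));
    }
    return rx.dropped();
}
```

With these assumptions, a larger buffer only helps if the unread burst actually exceeds the smaller capacity, which is the thing to verify before resizing.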

@brianrho, can you provide a simple loop script that continuously transfers data to and from the FPS, to generate a heavy load and reproduce this problem?

@edautz
Author

edautz commented Mar 6, 2019

Did some testing. After 10 attempts a timeout occurred. I enabled debug output, but the data volume seems to be low: only commands.

For testing, I uncommented

#define FPM_R551_MODULE

in the FPM.H file.

I did dozens of tests and the timeout problem seems to have disappeared.

All other functions, such as led_on(), led_off(), deleting, enrollment and emptying the database, work.
I will do some more testing. Fingers crossed.

@mohammadhasanzadeh, can you also uncomment #define FPM_R551_MODULE and test your sensor?

@brianrho
Owner

brianrho commented Mar 6, 2019

Yeah the reason I was skeptical that it's a buffer issue is because from what I've seen so far, you're only sending commands (and not requesting data) and the < 32-byte responses are obtained immediately. The default 64/128-byte buffer should usually be enough for the command-response cycle.
The only changes you made are to uncomment the define? What example are you testing with?

@edautz
Author

edautz commented Mar 6, 2019

I tested with my full-blown setup, including the web server, remote debug, syslog etc.

Yes, the only change was uncommenting the define.

But it is too early to conclude that the problem is solved; I noticed the problem is not always present during testing. I will do more testing each day and keep you posted.

Also curious about the findings of @mohammadhasanzadeh.

@brianrho
Owner

brianrho commented Mar 6, 2019

If that is truly the only change you made and you're testing with your own code, then I think I know where the problem is already. The R551 does not use the high speed search command, since it doesn't support it. It uses the regular search command 0x04 I think, while the other modules use 0x1B. Could be the problem somehow.
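
As a rough sketch of what such a compile-time switch could look like: the names below are hypothetical and not taken from the FPM source, and the command codes are the ones recalled above ("0x04 I think"), so treat them as unverified.

```cpp
#include <cstdint>

// Uncomment to model a module without the high-speed search command.
// #define FPM_R551_MODULE

// Command codes as recalled in the thread; unverified against a datasheet.
constexpr std::uint8_t kRegularSearch   = 0x04;
constexpr std::uint8_t kHighSpeedSearch = 0x1B;

// Compile-time selection, driven by the define.
#if defined(FPM_R551_MODULE)
constexpr std::uint8_t kSearchCommand = kRegularSearch;
#else
constexpr std::uint8_t kSearchCommand = kHighSpeedSearch;
#endif

// Equivalent run-time selection, handy for comparing both commands on one module.
constexpr std::uint8_t searchCommandFor(bool hasHighSpeedSearch) {
    return hasHighSpeedSearch ? kHighSpeedSearch : kRegularSearch;
}
```

A run-time switch like searchCommandFor would allow an A/B test of both search commands without rebuilding the library.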

@brianrho
Owner

brianrho commented Mar 6, 2019

Okay then. You can also check further on your own end by commenting out the define again and see if the problem comes back.

@mohammadhasanzadeh

mohammadhasanzadeh commented Mar 7, 2019

> Did some testing. After 10 attempts a timeout occurred. I enabled debug output, but the data volume seems to be low: only commands.
> For testing, I uncommented
> #define FPM_R551_MODULE
> in the FPM.H file.
> I did dozens of tests and the timeout problem seems to have disappeared.
> All other functions, such as led_on(), led_off(), deleting, enrollment and emptying the database, work.
> I will do some more testing. Fingers crossed.
> @mohammadhasanzadeh, can you also uncomment #define FPM_R551_MODULE and test your sensor?

I do not have access to the sensors for the next two days, but I will certainly test it next Saturday.

@edautz
Author

edautz commented Mar 7, 2019

Bad luck: the time-outs reappeared, so uncommenting #define FPM_R551_MODULE is not the solution.
I will take another approach and reset the sensor after a time-out.
Some experimenting shows that when a time-out occurs, I can do the following:

Issue fserial.end();
Switch the sensor's power off and, after a short delay, on again.
Issue fserial.begin(.....).
Then the sensor is alive and kicking again.
I am going to use an AO3415 MOSFET with a maximum current of 4 A for this job. This MOSFET can be switched directly from a GPIO pin.

Before I go down this path:

@brianrho, is it worth trying to increase this?

#define FPM_DEFAULT_TIMEOUT 1000

@brianrho
Owner

brianrho commented Mar 8, 2019

I doubt it, the device should be able to respond within a few hundred ms, unless it's a search and there are a lot of saved prints. So you can try making it 3 seconds since the datasheet says search times can take up to 1 second. Current max should be under 200 mA I think, so long as you're using a strong external source (like one of those USB-UART adapters), you should be fine.

How many timeouts did you observe, and how often? What commands exactly yielded those timeouts? To be sure there's nothing else causing the timeouts (especially with the LED commands, which I haven't worked with extensively), use the vanilla code in the search_database example for tests, though you can integrate the touch pin. I hope the touch pin is pulled up externally; otherwise make the pin INPUT_PULLUP in the code.

@edautz
Author

edautz commented Mar 8, 2019

First tests indicated that the time-outs only occur during the FingerFastSearch function.

I built an AO3415 MOSFET recovery circuit for when a timeout occurs. After the timeout, fserial.end() is issued and an ESP32 output pin raises the MOSFET gate. This powers off the sensor; after a few seconds the gate is lowered to power it up again, and fserial.begin(....) is issued. This works like a charm and is a nice workaround to revive the sensor.

Will also try the #define FPM_DEFAULT_TIMEOUT 3000 option.

@brianrho
Owner

brianrho commented Mar 9, 2019

I do remember someone who did the same thing in their code a few years back, with the Adafruit library. They basically called begin() every time before calling the search function. I thought then that it was unnecessary but it could be they faced the same issue. Perhaps you can try the Adafruit library when you can, and see if it also times out or if this problem is limited to FPM.

@brianrho
Owner

brianrho commented Mar 9, 2019

I've made some changes. In particular, the handshake command may be useful to help you 'wake' the module so you don't have to restart it in hardware. There are now debug levels so you can see everything if you want or just the errors. 2 functions (fingerFastSearch and getModel) were renamed.

@edautz
Author

edautz commented Mar 10, 2019

Thanks.

The usage of handshake is not quite clear to me. Do you mean I can issue a handshake() to revive the sensor after a timeout?

Maybe I have discovered a cause of my time-outs. I see random, heavy RTC drift on my ESP32: days of drift within an hour. Such a drift could make the timeout expire very quickly.

@brianrho
Owner

You can use the handshake before calling getImage() in case the module goes to sleep or something. I've also increased the timeout to 2 secs just in case the module is just being slow to respond.
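
A minimal sketch of that idea, written against a stand-in interface rather than the real FPM class: handshake() and getImage() are assumed here to return true on success, which may not match the library's actual return types, and captureWithWake and SleepySensor are hypothetical names.

```cpp
// Tries to wake the module with a handshake before each capture attempt.
// Sensor is any type with handshake() and getImage() returning true on success
// (an assumption made for this sketch).
template <typename Sensor>
bool captureWithWake(Sensor& sensor, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (!sensor.handshake()) {
            continue;  // module did not answer; try the handshake again
        }
        if (sensor.getImage()) {
            return true;  // image captured
        }
    }
    return false;  // module never woke up or never produced an image
}

// Stand-in for a real module: ignores the first `asleepFor` handshakes.
struct SleepySensor {
    explicit SleepySensor(int n) : asleepFor(n), handshakes(0) {}
    bool handshake() { return ++handshakes > asleepFor; }
    bool getImage() { return true; }
    int asleepFor;
    int handshakes;
};
```

Bounding the number of attempts keeps a dead module from blocking the caller forever, so a hardware reset can still be used as a fallback.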

I thought your timeouts are from the FPM commands returning FPM_TIMEOUT. Don't see how it's related to the RTC.

@edautz
Author

edautz commented Mar 10, 2019

With the old library and a timeout of 3 seconds, I did more than a hundred readings without a timeout. In the 1 second timeout configuration I managed to trigger a timeout within 20 attempts.

I confirm the FPM commands returning FPM_TIMEOUT.

My suggestion about the RTC is that it might somehow be related to the way the timeout period is measured. If the RTC is drifting heavily (several times I measured a drift of more than 49 days within an hour, e.g. from 07032019 04:16 to 25042019 21:07), the default 1 second period could in practice be reduced by a factor of 1100, to less than 1 ms.
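
The arithmetic behind that estimate can be written out as a small sketch. It assumes, purely for illustration, that the timeout would be measured on the drifting clock; speedupFactor and effectiveTimeoutMs are hypothetical helper names.

```cpp
// Apparent clock speed-up if the clock advances `driftDays` days during one real hour.
constexpr double speedupFactor(double driftDays) {
    return driftDays * 24.0;  // 24 clock-hours per day, per real hour
}

// Real duration of a nominal timeout if it were measured on that drifting clock.
constexpr double effectiveTimeoutMs(double nominalMs, double driftDays) {
    return nominalMs / speedupFactor(driftDays);
}
```

For a drift of 49 days per hour the factor is 49 x 24 = 1176, so a nominal 1000 ms would shrink to roughly 0.85 ms, in line with the "factor 1100" and "less than 1 ms" figures above.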

The drift is totally random: sometimes the time is good for hours, sometimes there are those heavy drifts.

It took me days to find out that the time problem was a random heavy drift, because when I first noticed it I thought that kind of drift couldn't be possible.

By default the NTP refresh period is 1 hour; I reduced it to 5 minutes to keep the time in sync.

Testing is still going on.........

@brianrho
Owner

The library uses millis() for timing and last I checked, Arduino cores use a timer peripheral for millis() and not the hardware RTC so the drift should have no effect on the library.
Would be best if you could switch to the new library with the debug level set to 1 so we can see if any other errors (like a wrong checksum) precede the FPM_TIMEOUT.
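
For reference, a millis()-style timeout typically looks something like the sketch below. This is not the FPM library's code; hasTimedOut and waitForResponse are hypothetical names, and the clock is passed in as a function so the logic can run without Arduino hardware.

```cpp
#include <cstdint>
#include <functional>

// True once at least timeoutMs have elapsed since startMs.
// Unsigned subtraction keeps this correct across the 32-bit counter rollover.
bool hasTimedOut(std::uint32_t startMs, std::uint32_t nowMs, std::uint32_t timeoutMs) {
    return static_cast<std::uint32_t>(nowMs - startMs) >= timeoutMs;
}

// Polls responseReady until it returns true or timeoutMs elapse on clockMs.
// clockMs stands in for millis(); responseReady stands in for "a reply has arrived".
bool waitForResponse(const std::function<std::uint32_t()>& clockMs,
                     const std::function<bool()>& responseReady,
                     std::uint32_t timeoutMs) {
    const std::uint32_t start = clockMs();
    while (!hasTimedOut(start, clockMs(), timeoutMs)) {
        if (responseReady()) {
            return true;  // reply arrived in time
        }
    }
    return false;  // timed out
}
```

Because the elapsed time comes only from the counter passed in, a wall-clock jump would matter only if that counter were derived from the wall clock.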

@edautz
Author

edautz commented Mar 11, 2019

Moved to the new library and did some testing. I noticed that when the sensor has been powered for a long time and is warm, no timeouts occur, even with the timeout set to 1000 ms.

But after powering it off for half an hour, one of the first attempts timed out:

Enter searchDatabase: 31163
[+]Response timeout
FPM Check searchDatabase: timeout!
Millis: 32284
[+]Response timeout
[+]Response timeout

Enter searchDatabase: 10182
[+]Response timeout
FPM Check searchDatabase: timeout!
Millis: 11301
[+]Response timeout
[+]Response timeout
[+]Response timeout

After modifying the code a little to get more timing information, my ESP32 began to crash:

abort() was called at PC 0x40141adf on core 1

Backtrace: 0x4008c694:0x3ffb1e30 0x4008c8c5:0x3ffb1e50 0x40141adf:0x3ffb1e70 0x40141b26:0x3ffb1e90 0x4014143b:0x3ffb1eb0 0x4014152a:0x3ffb1ed0 0x401414e1:0x3ffb1ef0 0x400dcee3:0x3ffb1f10 0x400e028a:0x3ffb1f50 0x400d5e7a:0x3ffb1f70 0x400d28c8:0x3ffb1f90 0x400e78b5:0x3ffb1fb0 0x4008e915:0x3ffb1fd0

abort() was called at PC 0x40141adf on core 1

Backtrace: 0x4008c694:0x3ffb1e30 0x4008c8c5:0x3ffb1e50 0x40141adf:0x3ffb1e70 0x40141b26:0x3ffb1e90 0x4014143b:0x3ffb1eb0 0x4014152a:0x3ffb1ed0 0x401414e1:0x3ffb1ef0 0x400dcee3:0x3ffb1f10 0x400e028a:0x3ffb1f50 0x400d5e7a:0x3ffb1f70 0x400d28c8:0x3ffb1f90 0x400e78b5:0x3ffb1fb0 0x4008e915:0x3ffb1fd0

I had to switch to a new ESP32 board. It could be that I blew the board by flashing it too much.

The new board is running stable. With the new board I measured that a database search with 4 fingerprints takes around 150-200 ms. I raised the time-out to 5000 ms, will let the sensor cool down for a long time and test again.

@brianrho
Owner

Highly unlikely that flashing is the issue, lifetime is at least 100000 cycles. You should check the code you modified or better still, run only the minimal code needed to test this. If your other code is there with wifi, websockets etc, there's no way to isolate the problem. Also best to stick with the 2s timeout to rule out the chance that search is just being slow.

@edautz
Author

edautz commented Mar 15, 2019

Did more than a hundred tests with only one timeout. I programmed the handshake() call to test recovery from a timeout, but no timeout has happened since. Closing this issue now; when a timeout happens I will post the results of the handshake().

Thanks again for this great library, which helped me a lot to create a fingerprint key for my alarm system.
