Verification not working anymore, restart container helped #53

Closed
DarianAnjuhal opened this issue Dec 13, 2022 · 23 comments · Fixed by #70

@DarianAnjuhal

Hi,

After a couple of months without issues, the verification step stopped working.
The widget still looked correct.

Unfortunately I do not have the errors from the client console (I will try to get more info).
But in the mcaptcha service I have the following log (Broken pipe (os error 32)):

```
Defense { levels: [Level { visitor_threshold: 2000, difficulty_factor: 50000 }, Level { visitor_threshold: 10000, difficulty_factor: 3000000 }, Level { visitor_threshold: 20000, difficulty_factor: 5000000 }], current_visitor_threshold: 0 }
 ERROR mcaptcha::errors              > Broken pipe (os error 32)
 INFO  actix_web::middleware::logger > 10.42.0.226 "POST /api/v1/pow/config HTTP/1.1" 400 37 "https://*********/widget/?sitekey=*****" "Mozilla/5.0 (X11; Linux x86_64; rv:107.0) Gecko/20100101 Firefox/107.0" 0.081178
 INFO  sqlx::query                   > /* SQLx ping */; rows affected: 0, rows returned: 0, elapsed: 339.746µs
```

Is it possible to monitor the verification endpoint in some way? Next time I would prefer to get a mail from our monitoring rather than a customer's call. :-)

Are there additional logs I can have a look at?
Please let me know if I can help solve this issue. If I fix something myself, I will of course open a pull request (as always).

Bye and have a nice day
Darian

@realaravinth
Member

Hello,

Are you using mCaptcha/cache with this deployment? If yes, how was it doing at the time of this error? I've encountered broken pipe errors before when a dependent service has crashed.

Also, which version are you running? Build details can be obtained from /api/v1/meta/build.

> Is it possible to monitor the verification endpoint in some way?

There's a health endpoint, but it only checks whether the database and the cache are reachable.
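For illustration, an external probe for that endpoint could be as small as the following sketch. It assumes the health route lives at /api/v1/meta/health (verify against your instance), uses a placeholder base URL, and requires `reqwest = { version = "0.11", features = ["blocking"] }`:

```rust
use std::{process::exit, time::Duration};

// Hypothetical standalone probe: exits non-zero when the instance is
// unhealthy, so cron/systemd plus your alerting can mail you before a
// customer calls.
fn main() {
    let base = "https://captcha.example.org"; // placeholder base URL
    let url = format!("{base}/api/v1/meta/health"); // path assumed
    let client = reqwest::blocking::Client::new();
    match client.get(&url).timeout(Duration::from_secs(5)).send() {
        Ok(resp) if resp.status().is_success() => println!("OK: {url}"),
        Ok(resp) => {
            eprintln!("FAIL: {url} returned HTTP {}", resp.status());
            exit(1);
        }
        Err(e) => {
            eprintln!("FAIL: {url}: {e}");
            exit(1);
        }
    }
}
```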

@DarianAnjuhal
Author

Hi!

@realaravinth Thanks for your answer and your support.

Yes, I am using mCaptcha/cache:latest. Unfortunately I don't have logs from the cache right now. I have added monitoring for your health endpoint as well as for /pow/config. If the error occurs again, I can provide more logs.

Here is my version info:
{"version":"0.1.0","git_commit_hash":"c1f6ce3ae29321f0fdecf801ba789f60e4f89511"}

Bye and have a nice day
Darian

@wzrdtales
Contributor

This is apparently a big issue...

@realaravinth
Member

> This is apparently a big issue...

Did you face the issue as well?

@wzrdtales
Contributor

Permanently, yes. It's terrible, every few hours. Also, there seems to be no option to actually disable this cache module.

@realaravinth
Member

> Permanently, yes. It's terrible, every few hours.

Please provide logs.

> Also, there seems to be no option to actually disable this cache module.

Commenting out the redis section in the config file will make mCaptcha use an embedded cache instead. I just realized that this is not documented; I'll improve the docs to reflect it. But please note that the embedded cache doesn't persist data to disk.
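For reference, that change would look something like this in the config file (a sketch only; the exact key names may differ, so check the sample config.toml shipped with your release):

```toml
# With the [redis] section commented out, mCaptcha falls back to the
# embedded cache. Note: the embedded cache does not persist data to disk.
#[redis]
#url = "redis://mcaptcha-redis"
#pool = 4
```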

@wzrdtales
Contributor

I would love to provide logs, but there are none...

Is there an undocumented setting, a tracing mode, or something that actually generates logs meaningful to you?

@wzrdtales
Contributor

By "there are none" I mean that there are literally only Apache-like access logs; no errors whatsoever were in the logs.

@realaravinth
Member

> By "there are none" I mean that there are literally only Apache-like access logs; no errors whatsoever were in the logs.

Weird. Are you running this using Docker or just the binary?

@realaravinth
Member

I was able to reproduce the error; it happens when the Redis container crashes. mCaptcha/libmcaptcha#10 should fix that 🤞

```
mcaptcha-copy-mcaptcha-1           | Defense { levels: [Level { visitor_threshold: 100, difficulty_factor: 500 }, Level { visitor_threshold: 110, difficulty_factor: 5000 }, Level { visitor_threshold: 200, difficulty_factor: 50000 }, Level { visitor_threshold: 1000, difficulty_factor: 500000 }, Level { visitor_threshold: 1700, difficulty_factor: 1000000 }, Level { visitor_threshold: 2000, difficulty_factor: 2000000 }, Level { visitor_threshold: 2500, difficulty_factor: 5000000 }], current_visitor_threshold: 0 }
mcaptcha-copy-mcaptcha-1           |  ERROR mcaptcha::errors              > Broken pipe (os error 32)
```

cc: @wzrdtales @DarianAnjuhal

@realaravinth realaravinth self-assigned this Feb 21, 2023
@wzrdtales
Contributor

Actually, it does not need a Redis crash; our Redis never crashed. A connection dropped due to TCP connection lifecycles, or simply a Redis service that gets moved to another server for maintenance, is probably enough.

@wzrdtales
Contributor

> > By "there are none" I mean that there are literally only Apache-like access logs; no errors whatsoever were in the logs.
>
> Weird. Are you running this using Docker or just the binary?

Docker in Kubernetes.

@realaravinth
Member

> Actually, it does not need a Redis crash; our Redis never crashed. A connection dropped due to TCP connection lifecycles, or simply a Redis service that gets moved to another server for maintenance, is probably enough.

Right, all things that make Redis unavailable :D

> Docker in Kubernetes.

Interesting. Setting RUST_LOG=info should enable logging, but the program should do that automatically if the env var is unset 🤷
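For context, the usual env_logger idiom for defaulting the level when RUST_LOG is unset looks like this (a sketch of the common pattern, not necessarily mCaptcha's exact code; assumes `env_logger = "0.10"` in Cargo.toml):

```rust
use env_logger::Env;

fn init_logging() {
    // Honor RUST_LOG when it is set; otherwise fall back to info-level
    // logging for the whole binary.
    env_logger::Builder::from_env(Env::default().default_filter_or("info")).init();
}
```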

@wzrdtales
Contributor

That is the thing: we already set RUST_LOG to debug, with no effect. To be honest, all in all, everything from the Rust universe still feels quite immature.

@wzrdtales
Contributor

> Right, all things that make Redis unavailable :D

Handling a dropped connection should normally be a default capability of any library dealing with Redis, or of any library for a network-accessed protocol.
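The kind of handling being asked for here could look like the following sketch, using the `redis` crate (assumed at version 0.23). This is illustrative only, not libmcaptcha's actual fix; see mCaptcha/libmcaptcha#10 for that:

```rust
use redis::{Client, Connection, RedisResult};

/// Run `op`, reconnecting once if the cached connection turns out to be
/// stale (e.g. broken pipe after Redis was moved to another host).
fn with_retry<T>(
    client: &Client,
    conn: &mut Option<Connection>,
    mut op: impl FnMut(&mut Connection) -> RedisResult<T>,
) -> RedisResult<T> {
    for attempt in 0..2 {
        // (Re)establish the connection if we don't currently have one.
        if conn.is_none() {
            *conn = Some(client.get_connection()?);
        }
        match op(conn.as_mut().unwrap()) {
            Ok(v) => return Ok(v),
            // First failure on an I/O error: drop the dead connection
            // and retry once with a fresh one.
            Err(e) if e.is_io_error() && attempt == 0 => *conn = None,
            Err(e) => return Err(e),
        }
    }
    unreachable!("second attempt always returns")
}
```

A caller would then do something like `with_retry(&client, &mut conn, |c| redis::cmd("PING").query::<String>(c))`, so a single dropped TCP connection costs one retry instead of a permanent broken-pipe state.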

@realaravinth
Member

> That is the thing: we already set RUST_LOG to debug, with no effect.

I'm unfamiliar with Kubernetes. I used kompose to convert the docker-compose file shipped with this repository into Kubernetes deployment configurations for mCaptcha and deployed them. Logs appear to be working:

```
16:10 atm@lab tmp → kubectl logs -f deployment/mcaptcha
 INFO  mcaptcha > mcaptcha: mCaptcha - a PoW-based CAPTCHA system.
For more information, see: https://mcaptcha.org
Build info:
Version: 0.1.0 commit: f78669955c2150864d2ee8a9a5d90e134aaa52aa
 INFO  mcaptcha::settings > Loading config file from /etc/mcaptcha/config.toml
 INFO  mcaptcha::settings > Overriding [server].port with environment variable
 INFO  mcaptcha::settings > Overriding [database].url and [database].database_type with environment variable
 INFO  mcaptcha::data     > Initializing credential manager
 INFO  mcaptcha::data     > Initialized credential manager
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: DBError(Io(Custom { kind: Uncategorized, error: "failed to lookup address information: No address associated with hostname" }))', src/db.rs:36:53
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

> To be honest, all in all, everything from the Rust universe still feels quite immature.

Rust is a young language, and it takes time to build things. The majority of the work done within the Rust community, including this project, is volunteer-driven.

@wzrdtales
Contributor

A similar problem seems to exist with the DB connections as well; completely deactivating the Redis module results in the same issues after some time.

@realaravinth
Member

> A similar problem seems to exist with the DB

Unable to reproduce:

[screenshot]

@wzrdtales
Contributor

I can't really tell how to reproduce it, but we have completely disabled Redis in the meantime and still get lock-ups. About the environment: it uses a highly available Postgres DB behind PgBouncer; maybe that helps as background info. Something locks the application up completely, so this is hard to evaluate if you can't reproduce it. I can only suggest running it in k8s with the Zalando operator for a HA Postgres DB, as that is the setup where it currently locks up (roughly once every 24-48 hours the service stalls completely; you don't even get an answer from the config endpoint, and sometimes the widget doesn't load at all anymore).

@wzrdtales
Contributor

We actually have a locked-up one right now, this very second...

@wzrdtales
Contributor

wzrdtales commented Mar 31, 2023

This time:

[screenshot]

It is still serving the configs, but verification is no longer working.

And here is the complete lifetime log:
https://pastes.l00m32.wizardtales.net/?5017c6a279993f10#FK5y9kMaNYdvg3L6x9dWezcCxmNUkCB34xCDUYP57uTV

In particular:

> from conversin: pool timed out while waiting for an open connection

This was the reason I was saying there are issues with the DB connection as well.
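That message comes from the connection pool failing to hand out a connection before its acquire timeout. A sketch of the pool tuning that usually mitigates this with sqlx behind a proxy like PgBouncer (assumes `sqlx = "0.6"` with the `runtime-tokio-rustls` and `postgres` features plus tokio with `macros`; all values illustrative, not mCaptcha's defaults):

```rust
use std::time::Duration;

use sqlx::postgres::PgPoolOptions;

#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
    let pool = PgPoolOptions::new()
        .max_connections(10)
        // Fail fast with an error instead of stalling when the pool is empty.
        .acquire_timeout(Duration::from_secs(5))
        // Recycle connections periodically: long-lived idle connections can
        // be dropped silently by proxies such as PgBouncer.
        .max_lifetime(Duration::from_secs(30 * 60))
        .idle_timeout(Duration::from_secs(10 * 60))
        .connect("postgres://user:pass@localhost/mcaptcha") // placeholder URL
        .await?;

    // Sanity check that the pool hands out working connections.
    let (one,): (i32,) = sqlx::query_as("SELECT 1").fetch_one(&pool).await?;
    assert_eq!(one, 1);
    Ok(())
}
```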

@realaravinth
Member

realaravinth commented Mar 31, 2023

@wzrdtales: Please create a separate issue for the situation you describe. Also, please provide complete context when you report that something isn't working as it should.

k8s is unsupported for the time being. I don't have the bandwidth for it, and mCaptcha doesn't even have an alpha release yet, so supporting k8s seems unjustified.

The DB timeout usually happens when the DB library is unable to acquire a connection; it can be simulated by killing the database and then doing something in the app that triggers DB activity:

[screenshot]

I don't know why it is happening in your environment, especially when you say your database is HA.


Please feel free to reopen the ticket if the problem persists.

@wzrdtales
Contributor

You tested this within a few minutes. This only happens if you run it long term with sufficient traffic. No traffic, no problems; and usually no problems in the beginning, right after a start.

No problem though; we are currently taking care of it ourselves. I am just giving you feedback, not expecting anything from you.
