Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid session for client #99

Open
madtrick opened this issue Sep 16, 2023 · 2 comments
Open

Invalid session for client #99

madtrick opened this issue Sep 16, 2023 · 2 comments

Comments

@madtrick
Copy link

madtrick commented Sep 16, 2023

Hi,

A couple of weeks ago I opened an issue where I described the a problem I was having with the library: I was missing watch events.

Today I was digging a bit more onto this problem and started by looking at the watches that ZK reported as registered:

% echo wchc | nc localhost 2181
0x100002823d9000a
        /nodes
0x100002823d90003
        /nodes

0x100002823d9000a was watch from my Rust program. The other one was a watcher I attached from the ZK cli. After a while that watch was gone from the list. And the following line printed in the zookeeper logs:

rustkv-zookeeper-1      | 2023-09-16 21:01:07,320 [myid:] - INFO  [SessionTracker:o.a.z.s.ZooKeeperServer@730] - Expiring session 0x100002823d9000a, timeout of 15000ms exceeded
rustkv-zookeeper-1      | 2023-09-16 21:01:07,348 [myid:] - INFO  [NIOWorkerThread-6:o.a.z.s.ZooKeeperServer@1188] - Invalid session 0x100002823d9000a for client /172.19.0.1:55974, probably expired

Update, here's the code:

use std::sync::mpsc::channel;
use std::thread;
use std::time::Duration;
use zookeeper::AddWatchMode;
use zookeeper::{Acl, CreateMode, WatchedEvent, Watcher, ZooKeeper};

struct LoggingWatcher;
impl Watcher for LoggingWatcher {
    fn handle(&self, e: WatchedEvent) {
        // TODO: use the info! macro
        println!("{:?}", e)
    }
}

pub fn main() {
    let zk = ZooKeeper::connect("localhost:2181", Duration::from_secs(15), LoggingWatcher).unwrap();

    env_logger::init();

    let (send, recv) = channel();
    let closure = move |event: WatchedEvent| send.send(()).unwrap();
    zk.add_watch("/nodes", AddWatchMode::PersistentRecursive, closure)
        .unwrap();

    let node_create = zk.create(
        "/nodes",
        vec![],
        Acl::open_unsafe().clone(),
        CreateMode::Persistent,
    );

    match node_create {
        Err(zookeeper::ZkError::NodeExists) => println!("Node exists"),
        Err(_) => panic!("Unexpected error"),
        Ok(_) => (),
    }

    for x in recv.iter() {
        println!("Event {:?}, path {:?}", x, x);
    }
}
@madtrick
Copy link
Author

... A bunch of "[2023-09-16T21:36:03Z TRACE mio_extras::timer] ticking; curr=Token(18446744073709551615)" messages
[2023-09-16T21:35:53Z TRACE mio_extras::timer] triggering; token=Token(0)
[2023-09-16T21:35:53Z TRACE mio_extras::timer] unlinking timeout; slot=70; token=Token(0)
[2023-09-16T21:35:53Z TRACE zookeeper::io] handle ping timeout
[2023-09-16T21:35:53Z TRACE zookeeper::io] clear_timeout: Ping
[2023-09-16T21:35:53Z TRACE zookeeper::io] Pinging [::1]:2181
[2023-09-16T21:35:53Z TRACE mio_extras::timer] tick_to; target_tick=1606; current_tick=1606
[2023-09-16T21:35:53Z TRACE mio_extras::timer] ticking; curr=Token(18446744073709551615)
[2023-09-16T21:35:53Z TRACE mio_extras::timer] unsetting readiness
[2023-09-16T21:35:53Z TRACE zookeeper::io] event token=Token(3) ready=Readable
[2023-09-16T21:35:53Z TRACE zookeeper::io] ready_channel Ping
[2023-09-16T21:35:53Z TRACE mio::poll] registering with poller
[2023-09-16T21:35:53Z TRACE mio::sys::unix::kqueue] registering; token=Token(1); interests=Readable | Writable | Error | Hup
[2023-09-16T21:35:53Z TRACE mio::poll] registering with poller
[2023-09-16T21:35:53Z TRACE mio::poll] registering with poller
[2023-09-16T21:35:53Z TRACE zookeeper::io] event token=Token(1) ready=Writable
[2023-09-16T21:35:53Z TRACE zookeeper::io] Written 12 bytes
[2023-09-16T21:35:53Z TRACE mio::poll] registering with poller
[2023-09-16T21:35:53Z TRACE mio::sys::unix::kqueue] registering; token=Token(1); interests=Readable | Error | Hup
[2023-09-16T21:35:53Z TRACE zookeeper::io] event token=Token(1) ready=Readable
[2023-09-16T21:35:53Z TRACE zookeeper::io] Read 20 bytes
[2023-09-16T21:35:53Z TRACE zookeeper::io] Response chunk len = 16 buf len is 20
[2023-09-16T21:35:53Z TRACE zookeeper::io] handle_response in Connected state [16]
[2023-09-16T21:35:53Z TRACE zookeeper::io] Got ping response in 3.29925ms
[2023-09-16T21:35:53Z TRACE zookeeper::io] start_timeout: Ping
[2023-09-16T21:35:53Z TRACE mio_extras::timer] setting timeout; delay=170.595551209s; tick=1706; current-tick=1607
[2023-09-16T21:35:53Z TRACE mio_extras::timer] advancing the wakeup time; target=1706; curr=18446744073709551615
[2023-09-16T21:35:53Z TRACE mio_extras::timer] unparking wakeup thread
[2023-09-16T21:35:53Z TRACE mio_extras::timer] inserted timout; slot=170; token=Token(0)
[2023-09-16T21:35:53Z TRACE mio::poll] registering with poller
[2023-09-16T21:35:53Z TRACE mio::poll] registering with poller
[2023-09-16T21:35:53Z TRACE mio_extras::timer] wakeup thread: sleep_until_tick=1706; now_tick=1606
[2023-09-16T21:35:53Z TRACE mio::poll] registering with poller
[2023-09-16T21:35:53Z TRACE mio_extras::timer] sleeping; tick_ms=100; now_tick=1606; sleep_until_tick=1706; duration=10000
[2023-09-16T21:35:53Z TRACE mio::sys::unix::kqueue] registering; token=Token(1); interests=Readable | Error | Hup
[2023-09-16T21:36:02Z TRACE zookeeper::io] event token=Token(1) ready=Readable
[2023-09-16T21:36:02Z TRACE zookeeper::io] clear_timeout: Ping
[2023-09-16T21:36:02Z TRACE mio_extras::timer] unlinking timeout; slot=170; token=Token(0)
[2023-09-16T21:36:02Z TRACE zookeeper::io] Read 53 bytes
[2023-09-16T21:36:02Z TRACE zookeeper::io] Response chunk len = 49 buf len is 53
[2023-09-16T21:36:02Z TRACE zookeeper::io] handle_response in Connected state [49]
[2023-09-16T21:36:02Z TRACE zookeeper::io] handle_response Got a watch event!
[2023-09-16T21:36:02Z TRACE zookeeper::io] start_timeout: Ping
[2023-09-16T21:36:02Z TRACE mio_extras::timer] setting timeout; delay=179.669514459s; tick=1797; current-tick=1607
[2023-09-16T21:36:02Z TRACE mio_extras::timer] inserted timout; slot=5; token=Token(0)
[2023-09-16T21:36:02Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:02Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:02Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:02Z TRACE mio::sys::unix::kqueue] registering; token=Token(1); interests=Readable | Error | Hup
[2023-09-16T21:36:02Z INFO  zookeeper::watch] Event thread got response ReplyHeader { xid: -1, zxid: 219, err: 0 }
[2023-09-16T21:36:02Z DEBUG zookeeper::watch] WatchedEvent { event_type: NodeCreated, keeper_state: SyncConnected, path: Some("/nodes/node0000000047") }
Event (), path ()
[2023-09-16T21:36:02Z TRACE zookeeper::io] event token=Token(1) ready=Readable
[2023-09-16T21:36:02Z TRACE zookeeper::io] clear_timeout: Ping
[2023-09-16T21:36:02Z TRACE mio_extras::timer] unlinking timeout; slot=5; token=Token(0)
[2023-09-16T21:36:02Z TRACE zookeeper::io] Read 53 bytes
[2023-09-16T21:36:02Z TRACE zookeeper::io] Response chunk len = 49 buf len is 53
[2023-09-16T21:36:02Z TRACE zookeeper::io] handle_response in Connected state [49]
[2023-09-16T21:36:02Z TRACE zookeeper::io] handle_response Got a watch event!
[2023-09-16T21:36:02Z TRACE zookeeper::io] start_timeout: Ping
[2023-09-16T21:36:02Z TRACE mio_extras::timer] setting timeout; delay=179.68130675s; tick=1797; current-tick=1607
[2023-09-16T21:36:02Z TRACE mio_extras::timer] inserted timout; slot=5; token=Token(0)
[2023-09-16T21:36:02Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:02Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:02Z INFO  zookeeper::watch] Event thread got response ReplyHeader { xid: -1, zxid: 220, err: 0 }
[2023-09-16T21:36:02Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:02Z DEBUG zookeeper::watch] WatchedEvent { event_type: NodeDeleted, keeper_state: SyncConnected, path: Some("/nodes/node0000000047") }
[2023-09-16T21:36:02Z TRACE mio::sys::unix::kqueue] registering; token=Token(1); interests=Readable | Error | Hup
Event (), path ()
[2023-09-16T21:36:03Z TRACE mio_extras::timer] wakeup thread: sleep_until_tick=1706; now_tick=1706
[2023-09-16T21:36:03Z TRACE mio_extras::timer] setting readiness from wakeup thread
[2023-09-16T21:36:03Z TRACE mio_extras::timer] wakeup thread: sleep_until_tick=18446744073709551615; now_tick=1706
[2023-09-16T21:36:03Z TRACE mio_extras::timer] sleeping; tick_ms=100; now_tick=1706; blocking sleep
[2023-09-16T21:36:03Z TRACE zookeeper::io] event token=Token(2) ready=Readable
[2023-09-16T21:36:03Z TRACE zookeeper::io] ready_timer thread=ThreadId(3)
[2023-09-16T21:36:03Z TRACE mio_extras::timer] tick_to; target_tick=1706; current_tick=1607
A bunch of "[2023-09-16T21:36:03Z TRACE mio_extras::timer] ticking; curr=Token(18446744073709551615)" messages
...
[2023-09-16T21:36:03Z TRACE mio_extras::timer] unsetting readiness
[2023-09-16T21:36:03Z TRACE mio_extras::timer] advancing the wakeup time; target=1797; curr=18446744073709551615
[2023-09-16T21:36:03Z TRACE mio_extras::timer] unparking wakeup thread
[2023-09-16T21:36:03Z TRACE zookeeper::io] Spurious timer
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio_extras::timer] wakeup thread: sleep_until_tick=1797; now_tick=1706
[2023-09-16T21:36:03Z TRACE mio_extras::timer] sleeping; tick_ms=100; now_tick=1706; sleep_until_tick=1797; duration=9100
[2023-09-16T21:36:03Z TRACE zookeeper::io] event token=Token(1) ready=Readable
[2023-09-16T21:36:03Z TRACE zookeeper::io] clear_timeout: Ping
[2023-09-16T21:36:03Z TRACE mio_extras::timer] unlinking timeout; slot=5; token=Token(0)
[2023-09-16T21:36:03Z TRACE zookeeper::io] Read 53 bytes
[2023-09-16T21:36:03Z TRACE zookeeper::io] Response chunk len = 49 buf len is 53
[2023-09-16T21:36:03Z TRACE zookeeper::io] handle_response in Connected state [49]
[2023-09-16T21:36:03Z TRACE zookeeper::io] handle_response Got a watch event!
[2023-09-16T21:36:03Z TRACE zookeeper::io] start_timeout: Ping
[2023-09-16T21:36:03Z TRACE mio_extras::timer] setting timeout; delay=181.280991042s; tick=1813; current-tick=1707
[2023-09-16T21:36:03Z TRACE mio_extras::timer] inserted timout; slot=21; token=Token(0)
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio::sys::unix::kqueue] registering; token=Token(1); interests=Readable | Error | Hup
[2023-09-16T21:36:03Z INFO  zookeeper::watch] Event thread got response ReplyHeader { xid: -1, zxid: 222, err: 0 }
[2023-09-16T21:36:03Z DEBUG zookeeper::watch] WatchedEvent { event_type: NodeCreated, keeper_state: SyncConnected, path: Some("/nodes/node0000000048") }
Event (), path ()
[2023-09-16T21:36:03Z TRACE zookeeper::io] event token=Token(1) ready=Readable
[2023-09-16T21:36:03Z TRACE zookeeper::io] clear_timeout: Ping
[2023-09-16T21:36:03Z TRACE mio_extras::timer] unlinking timeout; slot=21; token=Token(0)
[2023-09-16T21:36:03Z TRACE zookeeper::io] Read 53 bytes
[2023-09-16T21:36:03Z TRACE zookeeper::io] Response chunk len = 49 buf len is 53
[2023-09-16T21:36:03Z TRACE zookeeper::io] handle_response in Connected state [49]
[2023-09-16T21:36:03Z TRACE zookeeper::io] handle_response Got a watch event!
[2023-09-16T21:36:03Z TRACE zookeeper::io] start_timeout: Ping
[2023-09-16T21:36:03Z TRACE mio_extras::timer] setting timeout; delay=181.2877915s; tick=1813; current-tick=1707
[2023-09-16T21:36:03Z TRACE mio_extras::timer] inserted timout; slot=21; token=Token(0)
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio::poll] registering with poller
[2023-09-16T21:36:03Z TRACE mio::sys::unix::kqueue] registering; token=Token(1); interests=Readable | Error | Hup
[2023-09-16T21:36:03Z INFO  zookeeper::watch] Event thread got response ReplyHeader { xid: -1, zxid: 223, err: 0 }
[2023-09-16T21:36:03Z DEBUG zookeeper::watch] WatchedEvent { event_type: NodeDeleted, keeper_state: SyncConnected, path: Some("/nodes/node0000000048") }
Event (), path ()
[2023-09-16T21:36:09Z TRACE zookeeper::io] event token=Token(1) ready=Readable
[2023-09-16T21:36:09Z TRACE zookeeper::io] clear_timeout: Ping
[2023-09-16T21:36:09Z TRACE mio_extras::timer] unlinking timeout; slot=21; token=Token(0)
[2023-09-16T21:36:09Z WARN  zookeeper::io] Connection closed: read
[2023-09-16T21:36:09Z TRACE zookeeper::io] reconnect
[2023-09-16T21:36:09Z INFO  zookeeper::io] Establishing Zk connection
[2023-09-16T21:36:09Z WARN  zookeeper::io] Zk session timeout, closing io event loop

Here are some of the last log lines from the library. And this is what ZK logged:

rustkv-zookeeper-1      | 2023-09-16 21:36:09,322 [myid:] - INFO  [SessionTracker:o.a.z.s.ZooKeeperServer@730] - Expiring session 0x100002823d90011, timeout of 15000ms exceeded```

@behos
Copy link
Contributor

behos commented Sep 25, 2023

It looks like the client session is timing out on the server side, maybe because it's not sending ping events while you are waiting for the event.

From your logs it seems like ping timeouts are reset on event receipts, but I'm not sure Zookeeper considers these events for keepalive.

The timeout is cleared here on any event, but maybe it needs to be retained for reads.
https://github.com/bonifaido/rust-zookeeper/blob/master/src/io.rs#L384

I don't think pinging has a big impact on performance so the client could probably send pings on every timeout without resetting on write.

@bonifaido I think removing the whole condition for resetting ping timeout in that loop would keep long-running-read connections alive, but I'm not sure if that would have any other effect (does the client need to be idle to do a ping?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants