Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When message fetch fails, show error instead of loading-animation #4186

Closed
gnprice opened this issue Jul 9, 2020 · 0 comments · Fixed by #4205
Closed

When message fetch fails, show error instead of loading-animation #4186

gnprice opened this issue Jul 9, 2020 · 0 comments · Fixed by #4205
Assignees
Labels
a-data-sync Zulip's event system, event queues, staleness/liveness a-message list P1 high-priority

Comments

@gnprice
Copy link
Member

gnprice commented Jul 9, 2020

When you go to read a conversation, we load the messages that are in it if we don't already have them. While we're doing that fetch, we show a "loading" animation of placeholder messages. But then if that fetch fails with an exception, we carry on showing that loading-animation forever.

That's bad because after the fetch has failed, we are not in fact still working on loading the messages, so the animation is effectively telling the user something that isn't true. It's also misleading when trying to debug, as it obscures the fact there was an error and not just something taking a long time.

Cases where we've seen this come up recently:

  • Message list loads forever when some messages have reactions #4156 turned out to be a server bug, where the server returned garbled data on fetching any messages that had emoji reactions.
    • We'd actually seen this bug before, when it was live on chat.zulip.org and caused the same symptom. It was fixed within a couple of days, but perhaps unsurprisingly there was a deployment that happened to have upgraded to master within that window of a couple of days, and stayed there.
  • Long, sometimes endless loading in message list #4033 may in part reflect another case of this -- at least when the loading is "endless" and not merely long. (Though because that one goes away on quitting and relaunching the app, it's definitely not the same server bug and probably is a purely client-side bug.)

Instead, when the fetch fails we should show a widget that's not animated and says there was an error. Preferably also with a button (low-emphasis, like a text button) to retry.

Some chat discussion of how to implement this starts here, and particularly here and after.

@gnprice gnprice added a-message list a-data-sync Zulip's event system, event queues, staleness/liveness P1 high-priority labels Jul 9, 2020
@chrisbobbe chrisbobbe self-assigned this Jul 23, 2020
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 24, 2020
We'll use this in an upcoming commit, for zulip#4186.
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 24, 2020
For more straightforward error handling, move the dispatch of
fetching messages to the UI component that we want to make
responsible for handling errors in the fetch (and that is already
responsible for showing the messages).

Now, we can wrap the fetch in a `try` in the UI component itself,
and easily set the local UI state in the `catch`. When we're in the
error state, we'll see an error view instead of the infinite loading
view, thus fixing zulip#4186.

We have to carefully remove the fetches in `doInitialFetch` and
`doNarrow` that this strategy replaces, and edit the tests
accordingly.

We can't only fetch in `componentDidMount` because we might need to
reregister an event queue (see DEAD_QUEUE) while the component is
just sitting there, having been mounted for some time. In
particular, this needs to happen on REALM_INIT. This was handled
before because `doInitialFetch` was called upon DEAD_QUEUE, and the
fetch was done after dispatching REALM_INIT. Now, we listen for
REALM_INIT as a state change in `state.session.eventQueueId` in
`componentDidUpdate`.

We also make sure that the fetch on REALM_INIT only happens for the
"topmost" or "last" narrows, using the same values that were used in
`doInitialFetch`. We optimize a bit further:

- Before, a separate fetch was done for the "topmost" and the "last"
  narrow, even if they were the same narrow. I haven't fully
  understood the differences between the two markers, but I suspect
  there's a lot of overlap. Now, a particular `ChatScreen` only
  fetches once, even if it's both "topmost" and "last".

- Before, for non-"topmost" and non-"last" narrows, we didn't bother
  to fetch them at all, on reregistering an event queue -- even when
  they later became "topmost" or "last"! So, set up a little bit of
  local UI state to track that.

Finally, adjust tests and comments according to the changes. See
also discussion [1], where Greg carefully explains the subtle ways
things have to be ordered, and we discuss some race conditions.

[1] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4156.20Message.20List.20placeholders/near/948514

Fixes: zulip#4186
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 24, 2020
We'll use this in an upcoming commit, for zulip#4186.
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 24, 2020
For more straightforward error handling, move the dispatch of
fetching messages to the UI component that we want to make
responsible for handling errors in the fetch (and that is already
responsible for showing the messages).

Now, we can wrap the fetch in a `try` in the UI component itself,
and easily set the local UI state in the `catch`. When we're in the
error state, we'll see an error view instead of the infinite loading
view, thus fixing zulip#4186.

We have to carefully remove the fetches in `doInitialFetch` and
`doNarrow` that this strategy replaces, and edit the tests
accordingly. So, do.

We can't only fetch in `componentDidMount` because we might need to
reregister an event queue (see DEAD_QUEUE) while the component is
just sitting there, having been mounted for some time. In
particular, this needs to happen on REALM_INIT. This was handled
before by `doInitialFetch` being called upon DEAD_QUEUE, and the
fetch being done after dispatching REALM_INIT. Now, we listen for
REALM_INIT as a state change in `state.session.eventQueueId` in
`componentDidUpdate`.

We also preserve that the fetch on REALM_INIT only happens for the
"topmost" or "last" narrows, using the same values that were used in
`doInitialFetch`. We optimize a bit further:

- Before, a separate fetch was done for the "topmost" and the "last"
  narrow, even if they were the same narrow. I haven't fully
  understood the differences between the two markers, but I suspect
  there's a lot of overlap. A particular `ChatScreen` doesn't fetch
  twice, even if it's both "topmost" and "last" [1].

- Before, for non-"topmost" and non-"last" narrows, we didn't bother
  to fetch them at all, on reregistering an event queue -- even when
  they later became "topmost" or "last"! So, set up a little bit of
  local UI state to track that.

Finally, adjust tests and comments according to the changes. See
also discussion [2], where Greg carefully explains the subtle ways
things have to be ordered, and we discuss some race conditions.

[1] N.B.: If narrow X is considered the "topmost" or "last" narrow,
    and multiple `ChatScreen`s are showing narrow X, multiple
    fetches will be done (one per `ChatScreen` showing X). Where the
    number of `ChatScreen`s showing X is large, this is a
    performance regression.
[2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4156.20Message.20List.20placeholders/near/948514

Fixes: zulip#4186
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 28, 2020
We'll use this in an upcoming commit, for zulip#4186.
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 28, 2020
For more straightforward error handling, move the dispatch of
fetching messages to the UI component that we want to make
responsible for handling errors in the fetch (and that is already
responsible for showing the messages).

Now, we can wrap the fetch in a `try` in the UI component itself,
and easily set the local UI state in the `catch`. When we're in the
error state, we'll see an error view instead of the infinite loading
view, thus fixing zulip#4186.

We have to carefully remove the fetches in `doInitialFetch` and
`doNarrow` that this strategy replaces, and edit the tests
accordingly. So, do.

We can't only fetch in `componentDidMount` because we might need to
reregister an event queue (see DEAD_QUEUE) while the component is
just sitting there, having been mounted for some time. In
particular, this needs to happen on REALM_INIT. This was handled
before by `doInitialFetch` being called upon DEAD_QUEUE, and the
fetch being done after dispatching REALM_INIT. Now, we listen for
REALM_INIT as a state change in `state.session.eventQueueId` in
`componentDidUpdate`.

We also preserve that the fetch on REALM_INIT only happens for the
"topmost" or "last" narrows, using the same values that were used in
`doInitialFetch`. We optimize a bit further:

- Before, a separate fetch was done for the "topmost" and the "last"
  narrow, even if they were the same narrow. I haven't fully
  understood the differences between the two markers, but I suspect
  there's a lot of overlap. A particular `ChatScreen` doesn't fetch
  twice, even if it's both "topmost" and "last" [1].

- Before, for non-"topmost" and non-"last" narrows, we didn't bother
  to fetch them at all, on reregistering an event queue -- even when
  they later became "topmost" or "last"! So, set up a little bit of
  local UI state to track that.

Finally, adjust tests and comments according to the changes. See
also discussion [2], where Greg carefully explains the subtle ways
things have to be ordered, and we discuss some race conditions.

[1] N.B.: If narrow X is considered the "topmost" or "last" narrow,
    and multiple `ChatScreen`s are showing narrow X, multiple
    fetches will be done (one per `ChatScreen` showing X). Where the
    number of `ChatScreen`s showing X is large, this is a
    performance regression.
[2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4156.20Message.20List.20placeholders/near/948514

Fixes: zulip#4186
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 29, 2020
We'll use this in an upcoming commit, for zulip#4186.
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 29, 2020
For more straightforward error handling, move the dispatch of
fetching messages to the UI component that we want to make
responsible for handling errors in the fetch (and that is already
responsible for showing the messages).

Now, we can wrap the fetch in a `try` in the UI component itself,
and easily set the local UI state in the `catch`. When we're in the
error state, we'll see an error view instead of the infinite loading
view, thus fixing zulip#4186.

We have to carefully remove the fetches in `doInitialFetch` and
`doNarrow` that this strategy replaces, and edit the tests
accordingly. So, do.

We fetch in `componentDidMount`, but that's not quite enough: we
might need to reregister an event queue (see DEAD_QUEUE) while the
component is just sitting there, having been mounted for some time.
In particular, this needs to happen on REALM_INIT. This was handled
before by `doInitialFetch` being called upon DEAD_QUEUE, and the
fetch being done after dispatching REALM_INIT. So, now, we listen
for REALM_INIT as a state change in `state.session.eventQueueId` in
`componentDidUpdate`.

We also preserve that the fetch on REALM_INIT only happens for the
"topmost" or "last" narrows, using the same values that were used in
`doInitialFetch`. We do a few things better:

- Before, a separate fetch was done for the "topmost" *and* the
  "last" narrow, even if they were the same narrow. I haven't fully
  understood the differences between the two markers, but I suspect
  there's a lot of overlap. Now, a particular `ChatScreen` doesn't
  fetch twice, even if it's both "topmost" and "last" [1].

- Before, for non-"topmost" and non-"last" narrows, we didn't bother
  to fetch them at all, on reregistering an event queue -- even when
  they later became "topmost" or "last". So, set up a little bit of
  local UI state to say that we want them to fetch as soon as they
  next gain focus.

Finally, adjust tests and comments according to the changes. See
also discussion [2], where Greg carefully explains the subtle ways
actions have to be ordered, and we discuss some race conditions.

[1] N.B.: If narrow X is considered the "topmost" or "last" narrow,
    and multiple `ChatScreen`s are showing narrow X, multiple
    fetches will be done (one per `ChatScreen` showing X). Where the
    number of `ChatScreen`s showing X is large, this is a
    performance regression.
[2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4156.20Message.20List.20placeholders/near/948514

Fixes: zulip#4186
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 29, 2020
We'll use this in an upcoming commit, for zulip#4186.
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Jul 29, 2020
For more straightforward error handling, move the dispatch of
fetching messages to the UI component that we want to make
responsible for handling errors in the fetch (and that is already
responsible for showing the messages).

Now, we can wrap the fetch in a `try` in the UI component itself,
and easily set the local UI state in the `catch`. When we're in the
error state, we'll see an error view instead of the infinite loading
view, thus fixing zulip#4186.

We have to carefully remove the fetches in `doInitialFetch` and
`doNarrow` that this strategy replaces, and edit the tests
accordingly. So, do.

We fetch in `componentDidMount`, but that's not quite enough: we
might need to reregister an event queue (see DEAD_QUEUE) while the
component is just sitting there, having been mounted for some time.
In particular, this needs to happen on REALM_INIT. This was handled
before by `doInitialFetch` being called upon DEAD_QUEUE, and the
fetch being done after dispatching REALM_INIT. So, now, we listen
for REALM_INIT as a state change in `state.session.eventQueueId` in
`componentDidUpdate`.

We also preserve that the fetch on REALM_INIT only happens for the
"topmost" or "last" narrows, using the same values that were used in
`doInitialFetch`. We do a few things better:

- Before, a separate fetch was done for the "topmost" *and* the
  "last" narrow, even if they were the same narrow. I haven't fully
  understood the differences between the two markers, but I suspect
  there's a lot of overlap. Now, a particular `ChatScreen` doesn't
  fetch twice, even if it's both "topmost" and "last" [1].

- Before, for non-"topmost" and non-"last" narrows, we didn't bother
  to fetch them at all, on reregistering an event queue -- even when
  they later became "topmost" or "last". So, set up a little bit of
  local UI state to say that we want them to fetch as soon as they
  next gain focus.

Finally, adjust tests and comments according to the changes. See
also discussion [2], where Greg carefully explains the subtle ways
actions have to be ordered, and we discuss some race conditions.

[1] N.B.: If narrow X is considered the "topmost" or "last" narrow,
    and multiple `ChatScreen`s are showing narrow X, multiple
    fetches will be done (one per `ChatScreen` showing X). Where the
    number of `ChatScreen`s showing X is large, this is a
    performance regression.
[2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4156.20Message.20List.20placeholders/near/948514

Fixes: zulip#4186
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this issue Aug 10, 2020
We'll use this in an upcoming commit, for zulip#4186.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a-data-sync Zulip's event system, event queues, staleness/liveness a-message list P1 high-priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants