Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Neighbourhood view not updated on shared activity stop #840

Open
quozl opened this issue Jun 12, 2019 · 25 comments
Open

Neighbourhood view not updated on shared activity stop #840

quozl opened this issue Jun 12, 2019 · 25 comments
Labels

Comments

@quozl
Copy link
Contributor

quozl commented Jun 12, 2019

When an invited user stops a shared activity, the neighbourhood view may be incorrectly updated.

Reproducer;

  • on instance one, start and share the CollabWrapper test activity,
  • on instance two, join the shared activity from the neighbourhood view, then stop it

Observed behaviour;

  • on instance one, two buddy icons are shown on neighbourhood view for instance two,
  • on instance two, the buddy icon for instance one may disappear and reappear,

Expected behaviour;

  • on instance one, one buddy icon should be shown for instance two,
  • on instance two, no change to neighbourhood view,

Instance one shell.log;

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/jarabe/desktop/meshbox.py", line 409, in _buddy_removed_cb
    self._remove_buddy(buddy_model)
  File "/usr/local/lib/python2.7/dist-packages/jarabe/desktop/meshbox.py", line 435, in _remove_buddy
    icon = self._buddies[buddy_model.props.key]
KeyError: dbus.ByteArray('AAAAB...Y85WlA==', variant_level=1)

Instance two shell.log;

1560310543.375053 ERROR root: set_active() failed: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying

org.freedesktop.DBus.Error.NoReply is caused by premature termination of the telepathy-salut process before it replies to a message.

On instance two at the time of the event, the process ID of the telepathy-salut process changes, and on Ubuntu 19.04 and an apport crash file is added to /var/crash on behalf of the process.

The problem is always reproducible and repeatable.

Theory; a message sent to Telepathy Salut causes it to crash. It should not crash. This is a bug in Telepathy Salut, possibly combined with a bug in Sugar usage of the API.

@quozl quozl transferred this issue from sugarlabs/sugar-toolkit-gtk3 Jun 12, 2019
@aperezbios
Copy link
Contributor

Was this against a released version of Sugar, top-of-trunk, or a branch?

@quozl
Copy link
Contributor Author

quozl commented Jun 12, 2019

Yes.

@aperezbios
Copy link
Contributor

All of the above?

@quozl
Copy link
Contributor Author

quozl commented Jun 12, 2019

Sure, that's another way of looking at it. At least released versions 0.112, 0.113, 0.114, master, and python3 branches are affected, as are the branches in development for TelepathyGLib porting.

@Aniket21mathur
Copy link
Contributor

My question might be off-topic.
I went through the dbus documentation and classes. We are using dbus.ByteArray class. But as per the documentation it takes only one str argument. variant_level is taken as an argument by dbus.Byte class.
I don't know whether my question is relevant, just had it in my mind so posted. 😅 .
Thanks!

@quozl
Copy link
Contributor Author

quozl commented Jun 13, 2019

variant_level is an argument for dbus.Byte, so it is logical by inheritance that it is available for use in a dbus.ByteArray.

% python
Python 2.7.15+ (default, Nov 27 2018, 23:36:35) 
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dbus
>>> x = dbus.ByteArray('test')
>>> x
dbus.ByteArray('test')
>>> x = dbus.ByteArray('test', variant_level=0)
>>> x
dbus.ByteArray('test')
>>> x = dbus.ByteArray('test', variant_level=2)
>>> x
dbus.ByteArray('test', variant_level=2)
>>> x = dbus.ByteArray('test', freddy=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'freddy' is an invalid keyword argument for this function
>>> █

dbus-python source package, file dbus_bindings/bytes.c is also useful.

@Aniket21mathur
Copy link
Contributor

Thanks for explaining. :-).

@quozl
Copy link
Contributor Author

quozl commented Feb 25, 2020

Summary: Debug a Telepathy crash

Sugar uses the Telepathy API for collaboration. Every time a collaboration stops, one of the Telepathy programs crashes with a core dump. It has been happening for a few years.

See if you can use gdb and the Telepathy source code to find out what the problem is! A task for experts. You may have to reach out to the Telepathy project as well.

@shaansubbaiah
Copy link
Member

@quozl I'm able to reproduce the error in the instance two example differently.

Reproduced by: Closing any activity.
(Tested on the fructose set, this happens on the Log activity as well, which is not a shared activity)

Error:

ERROR root: set_active() failed: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying

Tested on: Sugar 117-3, Debian 10.4 Buster

@quozl
Copy link
Contributor Author

quozl commented Jun 13, 2020

Thanks, but it makes no difference. Telepathy Salut is still crashing, and it should not. It is the Telepathy Salut process on instance two that crashes, hence the set_active message. It is, however, useful to know the same problem persists in 0.117 Debian package 3. Perhaps the same version of Telepathy is used.

@srevinsaju
Copy link
Member

I am not sure; I have logs related to telepathy-salut, and I guess @quozl might have seen some other logs of mine related to telepathy_Salut. In case it helps

1591852813.812322 WARNING root: Connection /org/freedesktop/Telepathy/Connection/salut/local_xmpp/e2782320 does not support OLPC buddy properties
/usr/lib/python3.8/site-packages/gi/overrides/Gtk.py:1644: Warning: g_value_type_compatible: assertion 'src_type' failed
  return _Gtk_main(*args, **kwargs)
/usr/lib/python3.8/site-packages/gi/overrides/Gtk.py:1644: Warning: unable to set property 'buddy' of type 'PyObject' from value of type '(null)'
  return _Gtk_main(*args, **kwargs)
1591852839.673118 ERROR root: set_active() failed: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying

(metacity:4459): metacity-WARNING **: 08:20:43.682: CurrentTime used to choose focus window; focus window may not be correct.

@quozl
Copy link
Contributor Author

quozl commented Jun 15, 2020

No, it doesn't help, it confuses. "does not support OLPC buddy properties" is a different problem; compare your version of Telepathy Salut, we're using 0.8.1 with Ubuntu 20.04. "NoReply" is the same underlying problem.

@aperezbios
Copy link
Contributor

aperezbios commented Jan 23, 2021

for future reference, since I just ran across this issue @srevinsaju ran across back in June, inside the SoaS Fedora nightly image:

In the latest Fedora Rawhide (will soon become 34) SoaS images, telepathy-salut is version 0.8.1, and python3-gwebsockets is version 0.7, and metacity is version 3.38.0

@srevinsaju
Copy link
Member

srevinsaju commented Jan 23, 2021

For the completion of my answer that time, I was using v0.8.1 using that time.

Tested again, have the same error. v0.8.1.r7, gwessocketson v0.7 and metacity on v3.38.0. Test system was Arch Linux, native install, and not a Debian Live Build.

@quozl
Copy link
Contributor Author

quozl commented Jan 24, 2021

Indeed. It needs diagnosis by debugging Telepathy, or reviewing Telepathy commits. My guess is the support for Sugar may have been removed from Telepathy because it was not maintained.

@aperezbios
Copy link
Contributor

aperezbios commented Feb 6, 2021

This still happens under Sugar 0.118 in the Fedora Rawhide nightly composes. Upon examining the Telepathy source repo for Sugar support, it does does not appear to have been removed from current releases of Telepathy, only telepathy-next.

Fedora Rawhide/34 includes telepathy-gabble 0.18.4, telepathy-salut 0.8.1, telepathy-glib 0.24.1, and python3-gwebsockets 0.7

@aperezbios aperezbios added the bug label Feb 6, 2021
@quozl
Copy link
Contributor Author

quozl commented Feb 7, 2021

Thanks. That eliminates my guess. It still needs diagnosis by debugging Telepathy, by testing an older version of Telepathy, or by reviewing Telepathy commits between the last working version and now. To be sure you have the same problem as described by this issue, the Telepathy Salut process id will change between when the shared activity is running, and when it is stopped. If process crash dumps are configured, then a core file will be made.

@aperezbios
Copy link
Contributor

aperezbios commented Feb 13, 2021

This also happens when a shared activity has NOT been stopped, so the bug description should be updated to reflect that, since the scope is significantly larger.

Here's the log output I get from Sugar 0.118 (SoaS Fedora 34 prerelease):

Traceback (most recent call last):
  File "/usr/lib64/python3.9/site-packages/dbus/connection.py", line 606, in msg_reply_handler
    reply_handler(*message.get_args_list(**get_args_opts))
  File "/usr/lib/python3.9/site-packages/jarabe/model/neighborhood.py", line 301, in __conn_get_interfaces_reply_cb
    self._connection[interface] = dbus.Interface(
TypeError: 'NoneType' object does not support item assignment
1613234385.855060 WARNING root: Connection /org/freedesktop/Telepathy/Connection/gabble/jabber/_37b17a0aa5efd8b3801af1778cd1134c45bf8571e_40jabber_2esugarlabs_2eorg_2fsugar does not support OLPC buddy properties
/usr/lib/python3.9/site-packages/gi/overrides/Gtk.py:1649: Warning: g_value_type_compatible: assertion 'src_type' failed
  return _Gtk_main(*args, **kwargs)
/usr/lib/python3.9/site-packages/gi/overrides/Gtk.py:1649: Warning: unable to set property 'buddy' of type 'PyObject' from value of type '(null)'
  return _Gtk_main(*args, **kwargs)

@chimosky
Copy link
Member

@aperezbios when a shared activity hasn't been stopped on one of the shared instances?

@aperezbios
Copy link
Contributor

@chimosky when simply looking at the neighborhood view (with jabber.sugarlabs.org configured as the collaboration server)

@aperezbios
Copy link
Contributor

Bizarrely, the simple act of setting G_MESSAGES_DEBUG="all" (and rebooting, or killing jarabe.py to induce sugar desktop re-start) in /etc/environment seems to "fix" the problem, although I've no idea why, as it makes no sense.

@chimosky
Copy link
Member

chimosky commented Feb 15, 2021

@chimosky when simply looking at the neighborhood view (with jabber.sugarlabs.org configured as the collaboration server)

The first part of the error could be when no interface was returned which could happen when a connection hasn't been made but that'll throw a Disconnected error instead. @quozl might have a better insight as to the possible causes of this.

I've seen the last line of the error and I can't remember the cause of that and I think there used to be an open issue for it.

Bizarrely, the simple act of setting G_MESSAGES_DEBUG="all" (and rebooting, or killing jarabe.py to induce sugar desktop re-start) in /etc/environment seems to "fix" the problem, although I've no idea why, as it makes no sense.

Are you saying when you do this the jabber server connection gets created successfully and those errors aren't shown in the logs anymore?

@quozl
Copy link
Contributor Author

quozl commented Feb 17, 2021

Oh, okay. No, I'm not using jabber.sugarlabs.org at all, I'm concentrating only on link-local peer to peer, aka Telepathy Salut. If you're using jabber.sugarlabs.org, then you're using Telepathy Gabble. Two very different underlying software components. I'm interested to know what causes Salut to unexpectedly terminate, but I've not even got as far as analysing the core dump.

@aperezbios
Copy link
Contributor

@quozl should we separate these issues, one for telepahthy-gabble, and one for telepathy-salut? We need to figure out why setting G_MESSAGES_DEBUG="all" in /etc/environment (and then restarting jarabe), results in Sugar (using) telepathy-gabble seeing neighbors/peers.

@quozl
Copy link
Contributor Author

quozl commented Feb 24, 2021

Thanks for asking. I don't mind either way. It's not like were not solving the problem because the issue is too hard to read. Speculating; we're not solving it because (a) it is a Telepathy problem rather than a Sugar problem (most likely), (b) Telepathy is written in C and we have attracted primarily Python developers, and (c) those of us with C skills haven't had the time to investigate.

If you do split the issues, create new ones that condense and summarise the observations in each scenario, with a link back to this issue, then we can close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants