Duplicated messages in mam_message #4128

Closed · DoxasticFox opened this issue Sep 21, 2023 · 4 comments

DoxasticFox commented Sep 21, 2023

I'm using the docker image mongooseim/mongooseim:6.1.0-5-gabdcd0b48. I have JOIN_CLUSTER=false set in the environment.

I'm not sure whether this is a server or client issue, or whether it's a bug or a misconfiguration, but I'm seeing lots of duplicated messages in MAM. The problem is much worse in the I (incoming) direction, but it happens in the O (outgoing) direction too. It only affects some messages.


select count(*), direction
from mam_message
group by from_jid, remote_bare_jid, direction, search_body
having count(*) > 1
order by direction, count desc;
 count | direction 
-------+-----------
   252 | I
   252 | I
   127 | I
   127 | I
   126 | I
   126 | I
   126 | I
   126 | I
   126 | I
   126 | I
    12 | I
     8 | I
     8 | I
     8 | I
     8 | I
     8 | I
     8 | I
     8 | I
     8 | I
     7 | I
     7 | I
     7 | I
     7 | I
     7 | I
     6 | I
     6 | I
     6 | I
     5 | I
     5 | I
     4 | I
     4 | I
     4 | I
     4 | I
...

     2 | I
     2 | I
     2 | I
     2 | I
     5 | O
     4 | O
     4 | O
     2 | O
     2 | O
     2 | O
     2 | O
     2 | O
     2 | O
     2 | O
     2 | O
     2 | O
     2 | O
(593 rows)

I find it curious that the numbers are typically close to powers of two. Whatever process is causing duplication seems to iteratively double the number of messages.


I realise that search_body might contain commonly used messages like "hi". But I've manually inspected the messages from the above query and can confirm that there are many long message bodies which were repeated.
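For anyone who wants to eyeball the duplicates themselves, here's a sketch of a query that lists the repeated bodies behind the counts above (it reuses the same columns as the other queries in this report):

-- List the duplicated bodies themselves, largest groups first.
select search_body, direction, count(*)
from mam_message
group by from_jid, remote_bare_jid, direction, search_body
having count(*) > 1
order by count desc;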

If I group by message instead of search_body, I only see duplicated messages in the I direction. I think this is because IDs are generated and added to the stored stanzas, so otherwise-identical copies no longer compare equal:

select count(*), direction
from mam_message
group by from_jid, remote_bare_jid, direction, message
having count(*) > 1
order by direction, count desc;
 count | direction 
-------+-----------
     3 | I
     3 | I
     3 | I
     3 | I
     3 | I
     2 | I
     2 | I
...
     2 | I
     2 | I
     2 | I
     2 | I
(223 rows)

When I query MAM, I see that the repeated messages have been given different Stanza IDs:

<iq type='set' id='${queryId}'>
  <query xmlns='urn:xmpp:mam:2' queryid='${queryId}'>
    <x xmlns='jabber:x:data' type='submit'>
      <field var='FORM_TYPE'>
        <value>urn:xmpp:mam:2</value>
      </field>
      <field var='with'>
        <value>${jid}</value>
      </field>
    </x>
    <set xmlns='http://jabber.org/protocol/rsm'>
      <max>50</max>
      <before/>
    </set>
  </query>
</iq>
...
<message from='2@example.com' to='2@example.com/GGe5TAmYyIqL955H' id='ab0611a2-fb84-4d7c-bff4-0ebe9c19599f' xmlns='jabber:client'><result queryid='tNwqoW64yn' xmlns='urn:xmpp:mam:2' id='C1ENON425MG1'><forwarded xmlns='urn:xmpp:forward:0'><delay xmlns='urn:xmpp:delay' stamp='2023-09-21T05:29:14.599130Z' from='502@example.com/5DT5g4ImiR77bdgS'></delay><message from='502@example.com/5DT5g4ImiR77bdgS' xmlns='jabber:client' to='2@example.com' type='chat' id='cPp52686afMUyN1mC4qT4PRLs2KbdOBNaS9pYwrP'><body>how are you</body><request xmlns='urn:xmpp:receipts'/><stanza-id by='2@example.com' id='C1EFH044AF01' xmlns='urn:xmpp:sid:0'/></message></forwarded></result></message>
<message from='2@example.com' to='2@example.com/GGe5TAmYyIqL955H' id='a6e53936-c3e7-4f7e-8a15-06fe86ce6818' xmlns='jabber:client'><result queryid='tNwqoW64yn' xmlns='urn:xmpp:mam:2' id='C1ENON425MO1'><forwarded xmlns='urn:xmpp:forward:0'><delay xmlns='urn:xmpp:delay' stamp='2023-09-21T05:29:14.599131Z' from='502@example.com/5DT5g4ImiR77bdgS'></delay><message from='502@example.com/5DT5g4ImiR77bdgS' xmlns='jabber:client' to='2@example.com' type='chat' id='F9CCtP3kLRnLC7hnzeHrHD2MhLPRwQFfQIh56cjx'><body>2) i just got here friend</body><request xmlns='urn:xmpp:receipts'/><stanza-id by='2@example.com' id='C1EFH044AF81' xmlns='urn:xmpp:sid:0'/></message></forwarded></result></message>
<message from='2@example.com' to='2@example.com/GGe5TAmYyIqL955H' id='e204db9a-d83e-4b50-a04e-1bc285d9a39e' xmlns='jabber:client'><result queryid='tNwqoW64yn' xmlns='urn:xmpp:mam:2' id='C1ENON425PG1'><forwarded xmlns='urn:xmpp:forward:0'><delay xmlns='urn:xmpp:delay' stamp='2023-09-21T05:29:14.599142Z' from='502@example.com/5DT5g4ImiR77bdgS'></delay><message from='502@example.com/5DT5g4ImiR77bdgS' xmlns='jabber:client' to='2@example.com' type='chat' id='cPp52686afMUyN1mC4qT4PRLs2KbdOBNaS9pYwrP'><body>how are you</body><request xmlns='urn:xmpp:receipts'/><stanza-id by='2@example.com' id='C1EFH044AI01' xmlns='urn:xmpp:sid:0'/></message></forwarded></result></message>
<message from='2@example.com' to='2@example.com/GGe5TAmYyIqL955H' id='42e6ee5a-94b7-410d-a1da-4c950ffd07b4' xmlns='jabber:client'><result queryid='tNwqoW64yn' xmlns='urn:xmpp:mam:2' id='C1ENON425PO1'><forwarded xmlns='urn:xmpp:forward:0'><delay xmlns='urn:xmpp:delay' stamp='2023-09-21T05:29:14.599143Z' from='502@example.com/5DT5g4ImiR77bdgS'></delay><message from='502@example.com/5DT5g4ImiR77bdgS' xmlns='jabber:client' to='2@example.com' type='chat' id='F9CCtP3kLRnLC7hnzeHrHD2MhLPRwQFfQIh56cjx'><body>2) i just got here friend</body><request xmlns='urn:xmpp:receipts'/><stanza-id by='2@example.com' id='C1EFH044AI81' xmlns='urn:xmpp:sid:0'/></message></forwarded></result></message>
...

The IDs cPp52686afMUyN1mC4qT4PRLs2KbdOBNaS9pYwrP and F9CCtP3kLRnLC7hnzeHrHD2MhLPRwQFfQIh56cjx are generated by my XMPP client. I'm using xmpp.js to write my own client, so I'm very open to the idea that this is a client-side issue. But I also don't understand how a client-side problem could cause duplication in both the I and O directions.


mongooseim.toml:

[general]
  loglevel = "warning"
  hosts = ["example.com"]
  host_types = []
  default_server_domain = "example.com"
  registration_timeout = "infinity"
  language = "en"
  max_fsm_queue = 1000
  rdbms_server_type = "pgsql"

[outgoing_pools.rdbms.default]
  scope = "global"
  workers = 5

  [outgoing_pools.rdbms.default.connection]
    driver = "pgsql"
    keepalive_interval = 30

    host     = "example.com"
    port     =  5432
    database = "chat_db"
    username = "my_username"
    password = "my_password"

[[listen.http]]
  ip_address = "0.0.0.0"
  port = 5442
  transport.num_acceptors = 10
  transport.max_connections = 1024

  [[listen.http.handlers.mod_websockets]]
    host = "_"
    path = "/"

[auth.external]
  program = "/usr/lib/mongooseim/etc/auth.sh"
  instances = 2

[services.service_admin_extra]

[services.service_mongoose_system_metrics]

[modules.mod_adhoc]

[modules.mod_privacy]
  backend = "rdbms"

[modules.mod_blocking]

[modules.mod_disco]
  users_can_see_hidden_services = false

[modules.mod_stream_management]
  buffer_max = 10000

[modules.mod_presence]

[modules.mod_last]
  backend = "rdbms"

[modules.mod_sic]

[modules.mod_carboncopy]

[modules.mod_inbox]
  backend = "rdbms_async"
  boxes = ["chats"]

[modules.mod_offline_stub]

[modules.mod_mam.pm]

[shaper.normal]
  max_rate = 1000

[shaper.fast]
  max_rate = 50_000

[shaper.mam_shaper]
  max_rate = 1

[shaper.mam_global_shaper]
  max_rate = 1000

[acl]
  local = [{}]

[access]
  max_user_sessions = [
    {acl = "all", value = 10}
  ]

  max_user_offline_messages = [
    {acl = "admin", value = 5000},
    {acl = "all", value = 100}
  ]

  local = [
    {acl = "local", value = "allow"}
  ]

  c2s = [
    {acl = "blocked", value = "deny"},
    {acl = "all", value = "allow"}
  ]

  c2s_shaper = [
    {acl = "admin", value = "none"},
    {acl = "all", value = "normal"}
  ]

  s2s_shaper = [
    {acl = "all", value = "fast"}
  ]

  muc_admin = [
    {acl = "admin", value = "allow"}
  ]

  muc_create = [
    {acl = "local", value = "allow"}
  ]

  muc = [
    {acl = "all", value = "allow"}
  ]

  register = [
    {acl = "all", value = "allow"}
  ]

  mam_set_prefs = [
    {acl = "all", value = "default"}
  ]

  mam_get_prefs = [
    {acl = "all", value = "default"}
  ]

  mam_lookup_messages = [
    {acl = "all", value = "default"}
  ]

  mam_set_prefs_shaper = [
    {acl = "all", value = "mam_shaper"}
  ]

  mam_get_prefs_shaper = [
    {acl = "all", value = "mam_shaper"}
  ]

  mam_lookup_messages_shaper = [
    {acl = "all", value = "mam_shaper"}
  ]

  mam_set_prefs_global_shaper = [
    {acl = "all", value = "mam_global_shaper"}
  ]

  mam_get_prefs_global_shaper = [
    {acl = "all", value = "mam_global_shaper"}
  ]

  mam_lookup_messages_global_shaper = [
    {acl = "all", value = "mam_global_shaper"}
  ]

arcusfelis (Contributor) commented

Hi,
Check if your library tries to send messages again on failure.
Check the MongooseIM logs - there might be something interesting there.
We store one copy for the outgoing message and one copy for the incoming message, each time we see a stanza routed.
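To illustrate that storage model, a hedged sketch (the JID is a placeholder): within one user's archive, every delivered PM in a conversation should appear exactly once, as an O row if the archive owner sent it or an I row if they received it.

-- Rows in one archive for one conversation partner; absent duplication,
-- each message of the conversation appears exactly once, as I or O.
select direction, count(*)
from mam_message
where remote_bare_jid = '502@example.com'
group by direction;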

Do you use groupchats?

DoxasticFox (Author) commented

Thanks for the quick response!

> Check if your library tries to send messages again on failure.

I can't say for sure, but I don't see anything about retries in the documentation. I also grepped the xmpp.js source code for retr and saw nothing about retries/retransmitted messages. But I do see stream management messages coming from the library when I inspect the websocket connection in my browser, so if xmpp.js were following XEP-0198 properly, it should ack received messages and re-transmit unacknowledged ones when necessary. I wonder if it could be an issue with their stream management implementation.
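For context, a minimal sketch of the XEP-0198 exchange in question (the ids and counters below are illustrative, not taken from my logs):

<!-- One side asks how many stanzas the peer has handled: -->
<r xmlns='urn:xmpp:sm:3'/>
<!-- The peer answers with its handled-stanza count: -->
<a xmlns='urn:xmpp:sm:3' h='12'/>
<!-- On resumption, each side re-sends whatever the peer has not acked: -->
<resume xmlns='urn:xmpp:sm:3' previd='some-stream-id' h='12'/>
<!-- A miscounted 'h' on either side would cause stanzas the server has
     already archived to be re-sent, and each re-send lands in MAM again. -->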

Then again, I recall this issue happening in a dockerised setup running locally, so I don't know how messages could be getting lost on the network.

> Check the MongooseIM logs - there might be something interesting there.

When this happened locally, I could never spot anything suspicious in MongooseIM's logs, unfortunately.

> Do you use groupchats?

Nope. Just PMs.


arcusfelis (Contributor) commented Sep 26, 2023 via email

DoxasticFox (Author) commented

It's different. I also disabled stream management on the server and the problem seems to be gone. Gonna close this issue. Thanks for the help!
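For anyone who finds this later: in my setup, disabling stream management just meant removing the module section from mongooseim.toml, since MongooseIM enables a module when its section is present. A sketch of the change:

# Delete (or comment out) this block in mongooseim.toml to turn off
# XEP-0198 stream management:
#
# [modules.mod_stream_management]
#   buffer_max = 10000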
