Startup is extremely slow (or hangs) #1842
sesse commented Nov 30, 2017
I see there are several near-duplicates, but I'm unsure if there's anything exactly like this, so I'm reporting at the risk of a duplicate :-)
After upgrading from the Chrome extension to the “standalone” client (which is still based on Chrome), startup has become unbearably slow. Signal shows “Loading…” and uses 100% CPU for ages. After maybe 5 minutes, it says “10 so far…”, then after a few more minutes “20 so far”, and so on. Recently, it seems to simply never load.
Startup in the Chrome extension was slow, but it was more like 30 seconds, and I haven't had a whole lot of new messages.
Platform info
Operating System: Debian GNU/Linux sid/unstable
Signal Version: 1.0.39
Link to debug log
Here's what happens on the console after I start Signal and allow it to run for a while:
At this point, it's used 100% CPU for more than ten minutes, and is still in “Loading…”.
Comments
sesse commented Nov 30, 2017
I let it run for half an hour or so. From looking at the developer tools, it appears that it's not actually doing anything; all the CPU is going into repainting the loading bar… which also explains why the entire system is sluggish even if I nice Signal to idle priority. (It's driving the compositor.)
scottnonnenberg added the Need Information label on Dec 1, 2017
scottnonnenberg (Member) commented Dec 1, 2017
It sounds like you did an export from the Chrome extension to Standalone. How big is your ~/.config/Signal directory? Also, can you provide your full log so far by going to View -> Debug Log in the main menu?
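For reference, on a default Linux install the size check is just (adjust the path if your profile lives elsewhere):

```sh
du -sh ~/.config/Signal
```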
mohonk commented Dec 1, 2017
I see similar behavior, and I did an export from the Chrome extension, which is 1 GB.
The ~/.config/Signal directory has 1.1 GB.
After I renamed the Chrome extension export and restarted Signal Desktop, startup went smoothly and I can actually use Signal Desktop now :)
sesse commented Dec 1, 2017
Yes, I did an export from Chrome. ~/.config/Signal is 2.2 GB, although I'm not sure why it would be so big. :-)
The debug log contained basically what I already pasted, but I started Signal anew and made it post one (URL below). As you will see, I ran it from 22:35 to 23:15 before I gave up (and the whole time, it was stuck in “Loading”), so my assumption is that it has really gotten stuck somewhere, not just slow.
However, interestingly, today it got further. After I posted a debug log, I let it sit for a few more minutes, and after four minutes it got to “Loading messages. 50 so far…” and then appeared to hang again. I gave it a few more minutes to make sure it didn't progress further, then posted another debug log. Here it is: https://gist.github.com/anonymous/199256d56064f6fc392797f694889883
sesse commented Dec 1, 2017
I started with LC_ALL=C signal-desktop. It booted up in maybe five seconds. Closed, booted again, still works. Restarted with my normal locale (nb_NO.UTF-8) and now it boots really fast again. So something got unstuck; I don't know if it had anything to do with the locale or not.
New debug log: https://gist.github.com/anonymous/c1c07b134d59dd802192682470ce36a2
scottnonnenberg (Member) commented Dec 1, 2017
@sesse Is it true that whenever it hangs like that it uses 100% CPU? Was it also using a lot of memory?
sesse commented Dec 1, 2017
Yes, it uses 100% CPU, although I wonder if maybe it's using all of it (and the GPU) to show the loading animation, not for useful work.
I don't think it was using a lot of memory—I would have noticed.
scottnonnenberg (Member) commented Dec 1, 2017
My suspicion is that it's some sort of database cleanup thing going on. Did you happen to attempt to import more than once?
sesse commented Dec 1, 2017
I honestly don't remember (it was a month ago now). It may have failed the first time?
scottnonnenberg (Member) commented Dec 1, 2017
It sounds like it has happened a couple of times now. Is there any pattern to it?
Can you also describe your general usage of the app? How many sent/received messages per day? How many images or other attachments?
sesse commented Dec 1, 2017
I don't know of any pattern, no. Usage is fairly light. Think 10–20 messages a day, maybe one or two images per day. There's the odd video or audio file, but those are rare. There are two groups that seem to be fairly large by Signal standards (~20 people).
Occasionally I get “Error handling incoming message” on incoming group messages. I doubt it's related.
scottnonnenberg (Member) commented Dec 1, 2017
Okay, so to describe my thinking on this a little bit more. The database we use doesn't delete immediately. It comes back later to clean up. With a large amount of data stored, this process could take some time.
I think that's what is happening in your case, because the actual JavaScript code being run was minimal - you can see it in your trace. I expect that our requests to the database just didn't return. We do have some two-minute timeouts in place in the app for just this kind of thing, so I'm disappointed that you were on the loading screen forever.
Anyway, you have two choices.
- Tough it out and continue to use that installation as-is. You can perhaps help us track this behavior down by poking and prodding the app over time.
- Start over from scratch. Close the app, erase ~/.config/Signal, then start up and re-link (see the sketch below).
I suppose there might be another option too: deleting some of the conversations you don't really care about, to see if that eventually reduces the size of your database. Up to you!
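If you go the start-over route, the steps look roughly like this (destructive - it wipes all local message history; the path assumes a standard Linux install):

```sh
killall signal-desktop   # make sure the app is fully closed
rm -rf ~/.config/Signal  # remove the profile, including the IndexedDB data
signal-desktop           # start up again and re-link with your phone
```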
sesse commented Dec 1, 2017
This is LevelDB, right? :-) I know fairly well how it works (as in, the basic ideas); I was involved in open-sourcing parts of it.
If you think it's spending its time in native code, I can run perf on it next time it's slow to start up? I guess the theory is that LevelDB needs more than two minutes to return, Signal times out the database call and doesn't manage to recover?
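Something like this, presumably (assuming perf is installed and the busy thread is identifiable in top):

```sh
top -H                                # find the PID of the busy thread
perf record -g -p <pid> -- sleep 30   # sample it, with call graphs, for 30s
perf report                           # interactive breakdown
```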
scottnonnenberg (Member) commented Dec 1, 2017
Yup, IndexedDB is built on top of LevelDB in Chrome. Any analysis you can do there would be really, really helpful! :0)
Looking at your log from 22:35 to 23:15 is interesting. Everything appears to be happening normally, except there's no 'empty' event fired, which would get rid of that loading screen. And we should see one, since there were no messages from the websocket, and only one message that we retry.
If you could continue providing logs, telling us which sessions started up properly and which didn't, that would help us ensure that it isn't something in our application startup. My best guess at this point is that we're doing something on startup which is really sensitive to high CPU usage, maybe a number of defers to the event loop due to all of our Promise-related operations, and the event loop is running very slowly against all that CPU usage spent on LevelDB cleanup.
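As a toy illustration of the kind of deferring I mean (not our actual code):

```js
// Each .then() defers the next step to the task queue. The work per step
// is trivial, but every step has to wait its turn - and if the process is
// starved for CPU, those turns come very slowly.
let p = Promise.resolve(0);
for (let i = 0; i < 100000; i += 1) {
  p = p.then((n) => n + 1);
}
p.then((n) => console.log('finished after', n, 'deferred steps'));
```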
sesse commented Jan 6, 2018
I had another relatively slow start; only 30 seconds or so, but enough that I got a few seconds of profile. It's indeed IndexedDB that's taking time:
17,29% IndexedDB [kernel.kallsyms] [k] copy_user_enhanced_fast_string
14,50% IndexedDB signal-desktop [.] 0x0000000001c6b163
7,02% IndexedDB [kernel.kallsyms] [k] clear_page_erms
2,91% IndexedDB signal-desktop [.] 0x0000000001c6b177
2,15% IndexedDB [kernel.kallsyms] [k] native_irq_return_iret
2,07% IndexedDB [unknown] [k] 0x00007f35e74a96bf
2,00% IndexedDB [kernel.kallsyms] [k] __handle_mm_fault
1,61% IndexedDB libc-2.26.so [.] __memmove_avx_unaligned_erms
1,47% IndexedDB [kernel.kallsyms] [k] _raw_spin_lock
1,29% IndexedDB [kernel.kallsyms] [k] unmap_page_range
1,16% IndexedDB [kernel.kallsyms] [k] find_get_entry
1,14% IndexedDB [kernel.kallsyms] [k] get_page_from_freelist
1,05% IndexedDB [kernel.kallsyms] [k] __rmqueue
0,99% IndexedDB [kernel.kallsyms] [k] release_pages
0,94% IndexedDB [kernel.kallsyms] [k] free_pcppages_bulk
0,66% IndexedDB [kernel.kallsyms] [k] __pagevec_lru_add_fn
0,57% IndexedDB [kernel.kallsyms] [k] __do_page_fault
0,56% IndexedDB [kernel.kallsyms] [k] error_entry
0,54% IndexedDB [kernel.kallsyms] [k] free_hot_cold_page
0,49% IndexedDB [kernel.kallsyms] [k] generic_file_read_iter
0,48% IndexedDB [kernel.kallsyms] [k] __radix_tree_lookup
0,47% IndexedDB [kernel.kallsyms] [k] mem_cgroup_try_charge
0,38% IndexedDB [kernel.kallsyms] [k] handle_mm_fault
0,35% IndexedDB [kernel.kallsyms] [k] __alloc_pages_nodemask
0,34% IndexedDB [kernel.kallsyms] [k] swapgs_restore_regs_and_return_to_usermode
0,32% IndexedDB [kernel.kallsyms] [k] __mod_zone_page_state
0,30% IndexedDB [kernel.kallsyms] [k] __mod_node_page_state
0,29% IndexedDB [kernel.kallsyms] [k] pagevec_lru_move_fn
0,28% IndexedDB [kernel.kallsyms] [k] alloc_pages_vma
0,27% IndexedDB [kernel.kallsyms] [k] free_pages_and_swap_cache
0,26% IndexedDB libc-2.26.so [.] __GI_wcrtomb
It's interesting that a significant fraction of the time is being spent in the kernel, not in userspace. (This is a KPTI-enabled kernel, so syscalls are a bit slower than they'd be a few weeks ago, but still, it shouldn't be that bad.)
MartinX3 commented Jan 6, 2018
10 seconds on my side with a KPTI-disabled kernel.
But I only have 4 conversations with a limit of 500 messages each, and I didn't use the Chrome plugin before.
I started fresh with the standalone app and a first sync that didn't work.
Signal 1.1.0
Linux Ubuntu Mate 17.10 x64
Kernel 4.13.0-21-generic
scottnonnenberg (Member) commented Jan 8, 2018
@MartinX3 We really need logs, especially in situations where it "doesn't work." And we need a lot more information about what "not working" really means. What exactly do you see?
MartinX3 commented Jan 8, 2018
@scottnonnenberg I would like to provide a log, like @sesse, but I don't know how to profile the IndexedDB process on Linux.
sesse commented Jan 8, 2018
After some stracing, I wonder if maybe the IndexedDB thread spends most of its CPU time talking to some other Signal process… i.e., the problem isn't the database itself, but the multi-process model causing huge overhead per query.
This is just a theory at this stage, though.
MartinX3 commented Jan 8, 2018
So the SignalDB isn't loaded by a single thread?
scottnonnenberg (Member) commented Jan 8, 2018
@sesse We do encryption via a service worker so we don't block the main thread. And it does go to the database for some things. You're saying that we might see some benefits by ensuring that our DB access is from one thread only? Have you seen this in other LevelDB implementations?
sesse commented Jan 8, 2018
When I say process, I really mean process—not thread. There are certainly two processes here (well, actually three), communicating by a UNIX domain socket. I don't know what they are talking about :-)
I don't know how service workers map to threads or processes. This is one of the problems of using JS for desktop applications; you're very far abstracted from the metal, so performance debugging can become very difficult.
scottnonnenberg (Member) commented Jan 8, 2018
@sesse The way Electron works, built on top of Chromium, is that there are usually three processes: the overall management process, then a renderer process, then usually what appears to be a Chromium helper process - I'm not sure what goes on in that one; maybe visual compositing or something?
Anyway, you're saying that IndexedDB performance is slowing down because the process interacting with IndexedDB is frequently being interrupted for inter-process communication?
sesse commented Jan 8, 2018
I'm not saying I have the answer by any means (I am definitely not an expert on Chrome internals!), but my theory is that IndexedDB lives in the helper process, so it sits behind (slow) inter-process communication.
janLo commented Jan 16, 2018
I have the same problem on all my computers. The ones with an SSD are a bit slower, but it generally blocks for between 2 and 10 minutes on startup.
scottnonnenberg (Member) commented Jan 16, 2018
@janLo Please provide a bit more detail - do you see a 'messages so far' number counting up? How fast? How many messages with attachments do you generally receive? (We download them on startup.)
This issue is generally about situations where the number doesn't count up at all, or does so extremely slowly. Either way, a log from your large download would be useful: View -> Debug Log.
janLo commented Jan 17, 2018
I see the "counting up" - depending on how many messages the machine had recieved in the past I've waited for numbers between 300 and 1500. I recieve messages with attachments occationally but not that often. maybe 1-5 per day. It also seems that these are all teh "old" messages, as the new ones that I received until the app was closed arrive when the UI is started.
The log is filled with
INFO 2018-01-16T22:58:15.334Z queueing envelope +[REDACTED]084.12 1515975123664
entonjackson commented Jan 18, 2018
Same here. Takes forever to load conversations. 3 different devices. 3 different internet qualities. Makes no difference.
scottnonnenberg (Member) commented Jan 18, 2018
@entonjackson Thanks for weighing in. Please provide logs including times you experience slow startup and slow conversation load, as well as details about your computer. It will really help us figure out what we can do to improve things.
janLo commented Jan 18, 2018
I'd like to mention that this has only happened since I started using the "standalone" version. It never happened with the Chrome app.
scottnonnenberg (Member) commented Jan 18, 2018
@janLo The Chrome app had the same 'counting up' screen. Are you saying that you never saw it, or are you saying that it never took this long? What OS are you on, by the way? Logs would be useful as well.
janLo commented Jan 18, 2018
It was never long enough for me to notice.
DracoLlasa commented Jan 19, 2018
I just wanted to chime in to see if I can get confirmation on whether this is a confirmed issue or not.
I easily send 100+ messages a day using the desktop app. As others have noted, after a couple hundred to a thousand or two messages, the app gets slow to start, specifically to load a contact window that has a lot of history. Worse, once I get to the point where I have many thousands of messages with a specific contact, the client actually takes a few seconds to send: I type a message, press enter, and it sits for a bit. During that time I could still change the message, which was nice sometimes, but I saw that might have been fixed.
If this is a confirmed issue and my info can be of any assistance, please let me know.
scottnonnenberg (Member) commented Jan 19, 2018
@DracoLlasa Your logs would be useful, since they will show initial startup time as well as conversation load time. Please don't hesitate to include them in the future. :0)
DracoLlasa commented Jan 19, 2018
@scottnonnenberg OK, no problem - this is the first time I have contributed here. Would the log you want be from the "View | Debug Log" menu option? If not, let me know where to find it.
If it is that debug log, should I use the 'Submit' button there, or just pull the relevant logs from a startup and post them here or in an attached txt file?
scottnonnenberg (Member) commented Jan 19, 2018
@DracoLlasa View -> Debug Log, then upload that log, then take the URL of that upload and paste it here.
arikroc referenced this issue on Jan 21, 2018: App extremely slow to "Load messages" on macOS desktop #1990 (Open)
sesse commented Jan 22, 2018
I had another fairly slow startup today. This time, I captured a perf with -g, which told me this:
- 13,70% 0,00% IndexedDB libpthread-2.26.so [.] __GI___libc_pread64
- __GI___libc_pread64
- 6,92% entry_SYSCALL_64_fastpath
- sys_pread64
- 6,90% vfs_read
- 6,90% __vfs_read
- 6,89% generic_file_read_iter
- 6,31% copy_page_to_iter
- copyout
- 6,26% copy_user_enhanced_fast_string
- 4,50% page_fault
- 4,44% __do_page_fault
- 4,27% handle_mm_fault
- 4,21% __handle_mm_fault
+ 2,68% alloc_pages_vma
0,65% _raw_spin_lock
5,15% copy_user_enhanced_fast_string
+ 0,75% return_from_SYSCALL_64
It seems that IndexedDB opens every single shard I have and wants to read a bit from it (probably the bloom filter or something similar). Since I have 647 shards, this takes a fair amount of time…
I think a relevant question for the other reporters in this thread is what this returns:
ls -l ~/.config/Signal/IndexedDB/file__0.indexeddb.leveldb/ | wc -l
It would be interesting to see if there's a correlation between number of shards and startup slowness.
There's also seemingly some VM pressure just moving data around (I don't know for what; the stack unwinding somehow doesn't extend far enough):
- 13,58% 4,10% IndexedDB libc-2.26.so [.] __memmove_avx_unaligned_erms
- 11,41% __memmove_avx_unaligned_erms
- 4,95% page_fault
- 4,90% __do_page_fault
- 4,64% handle_mm_fault
- 4,52% __handle_mm_fault
- 2,94% alloc_pages_vma
- 2,91% __alloc_pages_nodemask
- get_page_from_freelist
2,27% clear_page_erms
0,57% _raw_spin_lock
2,93% __memmove_avx_unaligned_erms
0,70% error_entry
MartinX3 commented Jan 22, 2018
Would a solution for this problem speed up the sync process with the app?
DracoLlasa commented Jan 24, 2018
@scottnonnenberg Thanks, I was checking it out yesterday and think I sorted out what to select to send. I was going to do it tonight after work, but I got the new version, and I can tell it's much quicker to load. I will keep an eye on it and not do anything to intentionally clean up old messages and such, like I did last time it got really bad.
janLo commented Jan 24, 2018
Not as slow as it used to be but still slower than any version of the chrome app: https://gist.github.com/e5b771b160d813dfb91a7bcbcdce17e7
scottnonnenberg (Member) commented Jan 24, 2018
@janLo Apologies if I've mentioned this before, but one reason the Chrome App was 'faster' is because it was always running - it was running anytime you had your Chrome browser running. This meant that it had to download large amounts of messages less often, and when it did, it started downloading before you even opened the window.
sesse commented Jan 27, 2018
I did an iostat during a slow load:
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
sda 6,78 4,32 250,61 144,32 0,36 5,80 5,02 57,34 1,24 61,79 0,28 36,96 33,45 4,18 4,63
sda 99,00 0,00 7044,00 0,00 0,00 0,00 0,00 0,00 0,36 0,00 0,04 71,15 0,00 0,28 2,80
sda 89,00 93,00 10444,00 620,00 0,00 11,00 0,00 10,58 0,40 0,00 0,04 117,35 6,67 0,20 3,60
sda 286,00 0,00 35160,00 0,00 0,00 0,00 0,00 0,00 0,43 0,00 0,12 122,94 0,00 0,39 11,20
sda 98,00 0,00 11864,00 0,00 0,00 0,00 0,00 0,00 0,45 0,00 0,04 121,06 0,00 0,33 3,20
sda 104,00 0,00 13092,00 0,00 0,00 0,00 0,00 0,00 0,35 0,00 0,04 125,88 0,00 0,31 3,20
sda 511,00 3,00 64692,00 76,00 0,00 17,00 0,00 85,00 0,47 8,00 0,26 126,60 25,33 0,45 23,20
sda 134,65 0,00 16760,40 0,00 0,00 0,00 0,00 0,00 0,50 0,00 0,06 124,47 0,00 0,41 5,54
sda 90,00 0,00 10832,00 0,00 0,00 0,00 0,00 0,00 0,40 0,00 0,04 120,36 0,00 0,36 3,20
sda 87,00 1,00 10348,00 1076,00 0,00 0,00 0,00 0,00 0,83 4,00 0,08 118,94 1076,00 0,68 6,00
How much stuff are you loading from the DB on startup, anyway? Either IndexedDB has a huge startup overhead, or your schema is designed so that you need to do too many queries.
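For anyone who wants to watch the same thing, the extended per-device stats above come from something like:

```sh
iostat -x 1   # extended device statistics, one sample per second
```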
scottnonnenberg (Member) commented Jan 29, 2018
@sesse I think the large amount of data saved and loaded has to do with our caching mechanism for not-yet-processed messages. Before we attempt to decrypt a message, we put it in the database. Then when we've successfully decrypted it, we update it in the database. And then finally when we've processed it fully, we create the real message in the database and delete the cached data.
This is a lot of creates, updates and deletes, especially when we're processing hundreds of messages because the queue has filled up since we were last running.
But it does ensure that we're resilient in the face of crashes, or even the user closing the app while we're loading messages.
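Sketched with the browser IndexedDB API (hypothetical store and field names, not our exact schema), the lifecycle looks something like this:

```js
// 1. Persist the raw envelope before attempting decryption, so a crash
//    mid-decrypt loses nothing.
function cacheEnvelope(db, envelope) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('unprocessed', 'readwrite');
    tx.objectStore('unprocessed').put({ id: envelope.id, raw: envelope.raw });
    tx.oncomplete = resolve;
    tx.onerror = () => reject(tx.error);
  });
}

// 2. After a successful decrypt, update the cached entry in place.
function markDecrypted(db, id, plaintext) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('unprocessed', 'readwrite');
    tx.objectStore('unprocessed').put({ id, decrypted: plaintext });
    tx.oncomplete = resolve;
    tx.onerror = () => reject(tx.error);
  });
}

// 3. Once fully processed, create the real message and drop the cache
//    entry - in total, three writes and a delete per incoming message.
function finalizeMessage(db, id, message) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(['messages', 'unprocessed'], 'readwrite');
    tx.objectStore('messages').add(message);
    tx.objectStore('unprocessed').delete(id);
    tx.oncomplete = resolve;
    tx.onerror = () => reject(tx.error);
  });
}
```

Multiply that by a few hundred queued messages and the write load adds up quickly.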
sesse commented Jan 29, 2018
FWIW, I doubt I ever start up Signal and there's hundreds of messages in the queue, but I could be mistaken.
sesse commented Jan 29, 2018
Also, hundreds of messages or not, you are reading hundreds of megabytes of data on startup, as far as I can tell. It can't all be about a few hundred kilobyte-sized messages being stored two or three times :-)
MartinX3 commented Jan 29, 2018
@sesse Maybe there is metadata for every single message, plus serialized photos and videos, in your chats?
sesse commented Jan 29, 2018
There are, but still, why should they need to be loaded on startup? I don't suppose Signal needs to open every single old message I have just to boot?
scottnonnenberg (Member) commented Jan 29, 2018
@sesse We do a few index-based queries on startup, like checking for expired disappearing messages, or looking for any cached messages which didn't finish processing. When we do database schema updates, we sometimes do loop through every single object from a certain store, but that's uncommon. When we process a message, we first determine whether it's a duplicate by checking our database, again using an index-based query.
As an expert in IndexedDB, perhaps you can talk about how index scans are implemented? Surely we don't read that much data from disk when we do an index-based query?
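For context, an index-based query like our expired-messages check is roughly this shape (hypothetical store and index names):

```js
// Walk a 'messages' store via its 'expires_at' index, collecting every
// record whose expiry timestamp is in the past.
function getExpiredMessages(db) {
  return new Promise((resolve, reject) => {
    const expired = [];
    const tx = db.transaction('messages', 'readonly');
    const index = tx.objectStore('messages').index('expires_at');
    const range = IDBKeyRange.upperBound(Date.now()); // keys <= now
    index.openCursor(range).onsuccess = (event) => {
      const cursor = event.target.result;
      if (cursor) {
        expired.push(cursor.value); // the engine fetches the full record
        cursor.continue();
      } else {
        resolve(expired);
      }
    };
    tx.onerror = () => reject(tx.error);
  });
}
```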
sesse commented Jan 29, 2018
I'm afraid I'm nowhere near an IndexedDB expert. :-) I was involved in open-sourcing parts of LevelDB (in particular, the parts relating to compression), and I've also used its internal predecessor (SSTable) a fair amount. But IndexedDB is on top of that again, and that's where any secondary indexing sits—and I've never touched it.
scottnonnenberg (Member) commented Jan 29, 2018
@sesse Do you have the log to go along with that 'reading hundreds of megabytes' slow load?
sesse commented Jan 29, 2018
I think this is the one: https://gist.github.com/anonymous/d54f911f5e5083bf24eca6678b172ede . It looks as if it may have scrolled out, though :-/
In general, it appears LevelDB is performing very poorly on my system. There shouldn't be hundreds of these really small files; e.g., this is from the LevelDB log (LOG in the directory):
2018/01/28-19:52:10.889 11030 Compacting 1@2 + 29@3 files
2018/01/28-19:52:10.917 11030 Generated table #18250@2: 8 keys, 711 bytes
2018/01/28-19:52:10.943 11030 Generated table #18251@2: 3 keys, 1092 bytes
2018/01/28-19:52:10.969 11030 Generated table #18252@2: 2 keys, 1096 bytes
2018/01/28-19:52:11.000 11030 Generated table #18253@2: 3 keys, 830423 bytes
2018/01/28-19:52:11.027 11030 Generated table #18254@2: 4 keys, 1627 bytes
2018/01/28-19:52:11.055 11030 Generated table #18255@2: 7 keys, 2812 bytes
2018/01/28-19:52:11.101 11030 Generated table #18256@2: 1 keys, 3740570 bytes
2018/01/28-19:52:11.127 11030 Generated table #18257@2: 5 keys, 1815 bytes
2018/01/28-19:52:11.154 11030 Generated table #18258@2: 2 keys, 1087 bytes
2018/01/28-19:52:11.180 11030 Generated table #18259@2: 2 keys, 1158 bytes
2018/01/28-19:52:11.206 11030 Generated table #18260@2: 2 keys, 949 bytes
2018/01/28-19:52:11.232 11030 Generated table #18261@2: 2 keys, 1102 bytes
2018/01/28-19:52:11.258 11030 Generated table #18262@2: 1 keys, 841 bytes
2018/01/28-19:52:11.285 11030 Generated table #18263@2: 2 keys, 916 bytes
2018/01/28-19:52:11.311 11030 Generated table #18264@2: 2 keys, 1308 bytes
2018/01/28-19:52:11.337 11030 Generated table #18265@2: 3 keys, 1331 bytes
2018/01/28-19:52:11.363 11030 Generated table #18266@2: 1 keys, 841 bytes
2018/01/28-19:52:11.390 11030 Generated table #18267@2: 1 keys, 657 bytes
2018/01/28-19:52:11.419 11030 Generated table #18268@2: 1 keys, 723 bytes
2018/01/28-19:52:11.459 11030 Generated table #18269@2: 2 keys, 2863460 bytes
2018/01/28-19:52:11.486 11030 Generated table #18270@2: 1 keys, 733 bytes
2018/01/28-19:52:11.512 11030 Generated table #18271@2: 2 keys, 1119 bytes
2018/01/28-19:52:11.540 11030 Generated table #18272@2: 1 keys, 733 bytes
2018/01/28-19:52:11.566 11030 Generated table #18273@2: 1 keys, 713 bytes
2018/01/28-19:52:11.592 11030 Generated table #18274@2: 1 keys, 783 bytes
2018/01/28-19:52:11.618 11030 Generated table #18275@2: 4 keys, 1906 bytes
2018/01/28-19:52:11.645 11030 Generated table #18276@2: 3 keys, 1314 bytes
2018/01/28-19:52:11.672 11030 Generated table #18277@2: 3 keys, 1655 bytes
2018/01/28-19:52:11.698 11030 Generated table #18278@2: 2 keys, 1339 bytes
2018/01/28-19:52:11.739 11030 Generated table #18279@2: 1 keys, 2804578 bytes
2018/01/28-19:52:11.739 11030 Compacted 1@2 + 29@3 files => 10269392 bytes
2018/01/28-19:52:11.766 11030 compacted to: files[ 0 0 24 261 400 0 0 ]
All these level-2 shards at a few kilobytes? That looks strange to me.
sesse commented Jan 29, 2018
> As an expert in IndexedDB, perhaps you can talk about how index scans are implemented? Surely we don't read that much data from disk when we do an index-based query?
I took a look at the IndexedDB documentation, and it seems simple enough: for each entry in the index, you have a LevelDB entry. That entry tells you the primary key, i.e., where the real record is.
So if you're doing a scan along an index, you'll have some nice and regular reads for the index itself, and then a bunch of really random lookups throughout the actual data. So index scans are certainly more costly than reading along the primary key (not surprising), although hundreds of megabytes sounds a bit excessive. How many records are you expecting to read in the typical case?
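You can see the two-step structure if you scan with a key cursor and dereference manually (made-up store and index names; index.openCursor does the same dereference for you internally):

```js
// Step 1: walk the index entries in order - cheap, mostly sequential reads.
// Step 2: fetch each record by its primary key - a random lookup into the
// main store, which is where the scattered disk reads come from.
function scanViaIndex(db, storeName, indexName, range) {
  return new Promise((resolve, reject) => {
    const results = [];
    const tx = db.transaction(storeName, 'readonly');
    const store = tx.objectStore(storeName);
    store.index(indexName).openKeyCursor(range).onsuccess = (event) => {
      const cursor = event.target.result;
      if (!cursor) return; // done; the transaction's oncomplete will fire
      store.get(cursor.primaryKey).onsuccess = (e) => {
        results.push(e.target.result);
      };
      cursor.continue();
    };
    tx.oncomplete = () => resolve(results);
    tx.onerror = () => reject(tx.error);
  });
}
```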
scottnonnenberg (Member) commented Jan 30, 2018
@sesse Clearly we're not in the right ballpark with all this guessing. We don't do massive reads. What I suspect is that there are some sort of cleanup/compaction operations going on because we've fragmented the database, or added enough, or deleted enough.
sesse commented Jan 30, 2018
I don't see massive compactions being done; if there were, there would be write activity, not just read activity.
If you can reproduce this, have a look at the LOG file LevelDB leaves in the directory. It mentions all new shards, compactions, etc. as they happen (but not reads). In my case, it looks like the compaction scheme has gone awry (way too many level-2 and level-3 files, and any level-0 file that's written pretty much immediately causes a compaction), and you'd probably need some real LevelDB expertise to figure out why.
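(On a default install, the log lives next to the shards:)

```sh
tail -n 50 ~/.config/Signal/IndexedDB/file__0.indexeddb.leveldb/LOG
```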
MartinX3 commented Feb 9, 2018
https://gist.github.com/1033d5b648075d17d830ec7224ec0060
Linux Ubuntu Mate x64 17.10
Here is a new log. I restarted my PC and I can hear my HDD: there are many HDD accesses slowing down the whole system while the app starts up.
Terenceskill commented Feb 9, 2018
Same problem here.
Using Linux Mint 18.3 x64 Mate.
All 4 cpu cores are at ~100%.
The HDD is under heavy load.
Here is my log:
https://gist.github.com/anonymous/315bc653b9243b06aeefa9920ba26b8f
konklone commented Feb 14, 2018
I'll chip in a set of logs, on a Dell XPS 13 running Debian Buster (testing):
https://gist.github.com/konklone/e67d649c95b32d1c1f9709e454c27dcd
I didn't record the CPU load or disk load, but it definitely felt like a 100% saturated set of CPUs, for at least 5 minutes, maybe 10. My .config/Signal directory reads 55M when I run du -sh . in it.
I've observed this behavior several times before: a long, CPU-consuming experience while the message count goes up (very slowly) by 10 at a time. It doesn't happen on every single app startup, but it's fairly frequent -- I imagine it relates to the unpredictable delays and densities of conversations I have in Signal between application opens.