[Feature request] Speeding up query performance of rdm #1007

Open
ngkaho1234 opened this Issue Jul 18, 2017 · 4 comments

ngkaho1234 commented Jul 18, 2017

Please mark appropriate

  • rtags (rdm/rc/rp)
  • Emacs Lisp
    • ac-rtags.el
    • company-rtags.el
    • helm-rtags.el
    • flycheck-rtags.el
    • ivy-rtags.el
    • rtags.el

Feature request

The query performance of RTags on large projects such as the Linux kernel is quite slow compared to some existing solutions like Eclipse. For example, the first symbol-name query (--find-symbols) right after the daemon is started takes about 5 minutes when the source code and the RTags data files are both located on a local filesystem on an HDD. I don't know whether there is any solution we could work on to improve RTags' query performance on spinning drives.

Andersbakken commented Jul 30, 2017 (Owner)

The database is generally too large to fit in memory, so a slow drive is likely to cause problems for most solutions. I've considered building some sort of index in memory when indexing is finished, but this could have adverse effects on projects where you make a lot of changes. I haven't done a lot of profiling on the source code that handles find-symbols, but 5 minutes is clearly not acceptable. I assume this is after the indexing has finished?
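
As a rough illustration of the trade-off described above, here is a minimal Python sketch of an in-memory symbol-name index that has to be patched every time a file is reindexed. It is not RTags code: the class `InMemorySymbolIndex` and its methods are hypothetical names invented for this example, and integer file IDs are an assumption.

```python
from collections import defaultdict


class InMemorySymbolIndex:
    """Hypothetical in-memory map: symbol name -> set of file IDs."""

    def __init__(self):
        self._files_by_name = defaultdict(set)
        # Reverse map, needed so a changed file's stale entries can be removed.
        self._names_by_file = {}

    def update_file(self, file_id, names):
        """Replace all entries for one file after it has been (re)indexed."""
        self.remove_file(file_id)
        names = set(names)
        self._names_by_file[file_id] = names
        for name in names:
            self._files_by_name[name].add(file_id)

    def remove_file(self, file_id):
        """Drop every entry that came from file_id."""
        for name in self._names_by_file.pop(file_id, ()):
            files = self._files_by_name[name]
            files.discard(file_id)
            if not files:
                del self._files_by_name[name]

    def find(self, name):
        """Return the sorted file IDs that define or mention `name`."""
        return sorted(self._files_by_name.get(name, ()))
```

Lookups are a single dictionary access, but every edit to a source file triggers an `update_file` call that touches one entry per symbol in that file, which is where the cost on frequently changing projects would come from.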

ngkaho1234 commented Jul 31, 2017

Yes. The query was issued after indexing had finished and without the Linux page cache being warmed up; it took 5 minutes to walk ~20000 data files.
I did an experiment that adds a persistent database mapping SYMBOL_NAMES -> File ID. Although the time required for the first query to finish dropped greatly, from 5 minutes to 2 seconds, the database entry format still needs to be redesigned for clarity and functionality (and regex support and the case-sensitivity toggle won't work under the current indexing scheme). Here is the link: ngkaho1234@b3e9f41
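
To make the idea of a persistent SYMBOL_NAMES -> File ID mapping concrete, here is a minimal Python sketch using SQLite. It does not reproduce the linked commit: the table name `symbol_files`, the function names, and the use of SQLite itself are assumptions made only for illustration.

```python
import re
import sqlite3


def build_index(conn, symbols_by_file):
    """Store (symbol name, file ID) pairs; the primary key doubles as the lookup index."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS symbol_files ("
        "name TEXT NOT NULL, file_id INTEGER NOT NULL, "
        "PRIMARY KEY (name, file_id))"
    )
    rows = [(name, fid) for fid, names in symbols_by_file.items() for name in names]
    conn.executemany("INSERT OR IGNORE INTO symbol_files VALUES (?, ?)", rows)
    conn.commit()


def files_for_symbol(conn, name):
    """Exact-match lookup: one index probe instead of walking every data file."""
    cur = conn.execute(
        "SELECT file_id FROM symbol_files WHERE name = ? ORDER BY file_id", (name,)
    )
    return [row[0] for row in cur]


def files_for_pattern(conn, pattern, ignore_case=False):
    """Regex or case-insensitive lookup: scans all stored names, not the data files."""
    rx = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    cur = conn.execute("SELECT name, file_id FROM symbol_files")
    return sorted({fid for name, fid in cur if rx.search(name)})
```

An exact-name query only needs the file IDs returned by `files_for_symbol`, so only those data files have to be opened. The `files_for_pattern` fallback shows one possible way to keep regex and case-insensitive queries working on top of such a mapping, at the cost of scanning every stored name.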

Also, here is the compile_commands.json for linux-4.11.9 (the compilation configuration file is taken from Arch Linux's core/linux): https://gist.github.com/1fb10ea5e08b72ebc578d55c0cc98834

sciamano commented Jul 31, 2017

What would you suggest for doing the profiling? I also see query performance issues on a large project: the indexing takes some time, but that is fine (Eclipse and friends are also slow there), but then the query never finishes (maybe I was not patient enough...). And in my setup there is no HDD involved: I have simply symlinked ~/.cache/rtags to /dev/shm/rtags, so the index is in fact in memory (and there are dozens of GB of RAM).
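
As a coarse first step before reaching for a real profiler, one could time the same query twice in a row to see whether the slowness is limited to the first run. The Python sketch below is only an illustration: `time_command` is a hypothetical helper, and the example command line assumes the `rc` client with the `--find-symbols` option mentioned earlier in this thread.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=2):
    """Run argv `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False
        )
        timings.append(time.perf_counter() - start)
    return timings


# Example use (assumes rdm is running and the project is already indexed):
#   first, second = time_command(["rc", "--find-symbols", "schedule"])
#   print(f"first run: {first:.2f}s, second run: {second:.2f}s", file=sys.stderr)
```

If the second run is much faster than the first, the cost is most likely in reading the data files; if both runs are equally slow with the index on /dev/shm, the time is probably being spent on CPU work inside rdm, and a sampling profiler attached to the daemon would be the next thing to try.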

MaskRay commented Aug 3, 2017 (Contributor)

The "head-of-line blocking" also bothers me. It easily takes me more than 6 seconds to do the first query.

% du -sh ~/.cache/rtags/_home_ray_Dev_Bin_radare2_/
264M    /home/ray/.cache/rtags/_home_ray_Dev_Bin_radare2_/