Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Porting memory I/O from Geany #863

Closed
techee opened this issue Apr 2, 2016 · 19 comments
Closed

Porting memory I/O from Geany #863

techee opened this issue Apr 2, 2016 · 19 comments

Comments

@techee
Copy link
Contributor

techee commented Apr 2, 2016

In addition to #862 the second big difference in Geany is we need to parse a file loaded into memory instead of parsing a file stored on disk. For this reason there's a small library called MIO:

https://github.com/geany/geany/tree/master/tagmanager/mio

which tries to hide where the file is stored - whether on disk or in memory. One creates a MIO file using either

MIO        *mio_new_file            (const gchar *filename,
                                     const gchar *mode);

to get a file-based backend or

MIO        *mio_new_memory          (guchar        *data,
                                     gsize          size,
                                     MIOReallocFunc realloc_func,
                                     GDestroyNotify free_func);

to get a memory-based backend. Afterwards, one can use POSIX-like functions like mio_read(), mio_write(), mio_getc(), mio_ungetc(), mio_setpos(), etc. without having to care of what nature the file really is.

The question now is - would it be acceptable for universal-ctags to use this library? There are two main benefits I can think of:

  1. If there's some plan to provide universal-ctags as a shared library (make ctags a library #63), something like this will be necessary because users of the library will need to parse files from memory.
  2. In Geany even for files which we want to parse and which are stored on disk we load them to memory first (if they are not too big) and use the memory back-end. This turned out to be about 30% faster (for the C parser) because the bottle-neck is the getc() call and it's cheaper to do real IO just once at the beginning and then just advance the pointer in the buffer when simulating getc() than calling the real getc() many times.

We can do some things to make the MIO library play more nicely with universal-ctags:

  1. It uses some glib types which can be easily converted to ordinary C types.
  2. If there's a preference for some other type names, MIO, etc. could be renamed to something else.
  3. The whole 4-file library can be easily converted to a single source + header implementation. So instead of library this would just be one additional source file.
  4. Also maybe the "virtual" method calls (calling either memory or file back-end) could be just replaced by a simple "if" to have an IO function at a single place instead of one "interface" and two "classes" implementing it. Would save a few LOCs.
  5. Make the coding style match the universal-ctags one.

What do you think?

cc @b4n

@masatake
Copy link
Member

masatake commented Apr 4, 2016

I'm very interested in MIO. I have wanted to share more code with geany. However, as far as I am maintaining ctags, I will not provide universal-ctags as a shared library because designing API is too difficult for me. The most of my use case command line interface is enough. However, I have been worked for providing sophisticated command line interface. This is my basic position.

I wrote a document about input facility of u-ctags: http://docs.ctags.io/en/latest/internal.html

The biggest question is that MIO works on Windows?
I myself don't use Windows, however, some of contributors have worked for making ctags be built and run on Windows. I would like to keep the result of the efforts.

@techee
Copy link
Contributor Author

techee commented Apr 4, 2016

I'm very interested in MIO.

Great to hear!

I have wanted to share more code with geany. However, as far as I am maintaining ctags, I will not provide universal-ctags as a shared library because designing API is too difficult for me. The most of my use case command line interface is enough. However, I have been worked for providing sophisticated command line interface. This is my basic position.

This wouldn't be that hard. The tm_ctags_wrappers.h/c here geany/geany#957 could serve as a base for such an API. Anyway, first it would be good to sync Geany with universal-ctags as much as possible to know exactly what we need from such an API.

I wrote a document about input facility of u-ctags: http://docs.ctags.io/en/latest/internal.html

Good, will have a look at that. But I think I know quite well what's going on in ctags and can do the necessary changes.

The biggest question is that MIO works on Windows?

There's no problem on Windows - Geany works on Windows just fine. There's nothing special about MIO - the file back-end basically just calls the normal IO functions like getc() etc. which are already called anyway and the memory back-end maintains a char array on top of which it simulates file operations and there are no OS-related functions at all.

Good, I'll try to prepare a patch and you can decide if you like it or not and what needs to be changed.

@masatake
Copy link
Member

masatake commented Apr 5, 2016

The biggest question is that MIO works on Windows?
There's no problem on Windows - Geany works on Windows just fine.

Fine.

IF POSSIBLE, I would like to get a pull request for introducing glib first.

universal-ctags uses libxml directly. The code should be replaced with memory-stream based.
#740 (comment)

@vhda
Copy link
Contributor

vhda commented Apr 5, 2016

@techee do you think this would help implementing something like what is requested in #80?

@techee
Copy link
Contributor Author

techee commented Apr 5, 2016

IF POSSIBLE, I would like to get a pull request for introducing glib first.

Hmm, do you want to introduce glib dependency? I think it would be better to remove the glib stuff from MIO (which I actually did yesterday already) and not to introduce more dependencies.

universal-ctags uses libxml directly. The code should be replaced with memory-stream based.
#740 (comment)

This is actually something MIO would help you a lot. As I said, it's more efficient to load every file into a buffer in memory first and then use the MIO memory back-end. And in this case you can get the underlying buffer directly from MIO so this should be trivial once MIO is in.

@masatake
Copy link
Member

masatake commented Apr 5, 2016

@techee, oh, I see. Please, forget what I wrote about glib.

@techee
Copy link
Contributor Author

techee commented Apr 5, 2016

@vhda I'm not completely familiar with what problems you are facing regarding #80 so I cannot tell - MIO will basically allow you to use file-like operations on a buffer in memory.

@pragmaware
Copy link
Contributor

I think that using something like MIO would be a good idea.

Recently I have looked at the performance of ctags and found out that the filesystem access is a major bottleneck. The function that is called more often and takes most processing time is getc(). That's why I have started looking into mmap()'d file access that would allow to load the files faster and access content via pointers instead of a function call. I have even tried to implement it. However I've found out that a naive #ifdef based implementation is difficult because there are really many places where the code relies on the stream-like file system API. Something like MIO is needed for this: replace all the plain posix calls with mio_* prefixed ones, implement memory mapping hidden inside mio_* and then try to improve performance by turning certain functions into macros or accessing the buffer directly only in certain places.

I also think that having ctags as a library would be cool. Though quite a lot of refactoring is needed.

@techee
Copy link
Contributor Author

techee commented Apr 6, 2016

Recently I have looked at the performance of ctags and found out that the filesystem access is a major bottleneck. The function that is called more often and takes most processing time is getc(). That's why I have started looking into mmap()'d file access that would allow to load the files faster and access content via pointers instead of a function call. I have even tried to implement it. However I've found out that a naive #ifdef based implementation is difficult because there are really many places where the code relies on the stream-like file system API. Something like MIO is needed for this: replace all the plain posix calls with mio_* prefixed ones, implement memory mapping hidden inside mio_* and then try to improve performance by turning certain functions into macros or accessing the buffer directly only in certain places.

I profiled it too (this way http://wiki.geany.org/howtos/profiling/gperftools) and the result after using MIO and loading files smaller than 1MB into memory first is this for the linux kernel sources:

https://www.dropbox.com/s/wf6rj7o430mrzzh/pprof14382.0.svg?dl=0

mio_getc() now takes less than 3% but you have to add the about 7% taken by openInputFile() which now loads the file into memory. You might save something from the 7% using mmap but I guess it won't be that much.

By the way, the new parser is about twice as slow as the old C parser. The reason why the old parser was so fast was that it wasn't generating any tokens when skipping the irrelevant code like function bodies - it was just counting braces, skipping comments and waiting for the right end brace without any token generation on the way. I haven't checked the sources but couldn't something like that be done for the new parser too? The cxxParserParseNextToken() is the most expensive function from the profile.

I also think that having ctags as a library would be cool. Though quite a lot of refactoring is needed.

I don't think that much refactoring is needed actually - there should just be one more header that exposes the public stuff. The Geany pull request geany/geany#957 tries to separate ctags from Geany as much as possible and introduces ctags_wrappers.h which tries to be such an interface.

@techee
Copy link
Contributor Author

techee commented Apr 6, 2016

By the way the profile svg doesn't display nicely on dropbox - better to download it locally and open it with web browser.

@pragmaware
Copy link
Contributor

mio_getc() now takes less than 3% but you have to add the about 7% taken by openInputFile() which
now loads the file into memory. You might save something from the 7% using mmap but I guess it
won't be that much.

More gains can probably be achieved by adapting certain functions to work directly on the memory buffer by using pointers and avoiding mio_getc(), mio_fgetpos() etc. The goal is to change getc() with *ptr++, ungetc() with ptr--, getpos with (ptr - buffer)...

By the way, the new parser is about twice as slow as the old C parser. The reason why the old
parser was so fast was that it wasn't generating any tokens when skipping the irrelevant code like
function bodies - it was just counting braces, skipping comments and waiting for the right end brace
without any token generation on the way. I haven't checked the sources but couldn't something like
that be done for the new parser too?

The new parser doesn't treat code inside functions as irrelevant. There are variable declarations, nested prototypes, lambda calls... this is the whole point of a new parser: extract the information that the old parser was skipping.

@techee
Copy link
Contributor Author

techee commented Apr 6, 2016

More gains can probably be achieved by adapting certain functions to work directly on the memory buffer by using pointers and avoiding mio_getc(), mio_fgetpos() etc. The goal is to change getc() with *ptr++, ungetc() with ptr--, getpos with (ptr - buffer)...

Possibly, but it will be hard - if you look at the path getcFromInputFile()->iFileGetLine()->iFileGetc()->mio_getc() then the losses in each of the functions are minimal (2-3%). And anyway, you'll get 13% in the best case.

And you need something like iFileGetLine() to get line counting automatically so you get 9% at most (realistically I'd say 6%).

The new parser doesn't treat code inside functions as irrelevant. There are variable declarations, nested prototypes, lambda calls... this is the whole point of a new parser: extract the information that the old parser was skipping.

I see, then there's not much to do in this respect. By the way is it possible somehow to switch to the old parser? Might be useful sometimes.

@masatake
Copy link
Member

masatake commented Apr 8, 2016

@techee, thank you again. Many queued issues will be solved with MIO.

I see, then there's not much to do in this respect. By the way is it possible somehow to switch to the old parser? Might be useful sometimes.

OldC and OldC++ are kept for testing purpose.

$ ./ctags --fields=+l -o - --languages=-C --languages=-C++ --languages=+OldC  --languages=+OldC++ /tmp/foo/input.c
main    /tmp/foo/input.c    /^main(void)$/;"    f   language:OldC

You cannot run make units with the old parrsers because the string "Old" is embedded into language: field. For running ctags must provides the way to misrepresent a parser name like.
The way to misrepresenting the parser name is also needed to run ripper_tags as xcmd based ruby parser.

@techee
Copy link
Contributor Author

techee commented Apr 10, 2016

@techee, thank you again. Many queued issues will be solved with MIO.

Great to hear.

Thanks for the info about how to use the old parsers, could be useful for testing.

@techee techee closed this as completed Apr 10, 2016
@dtikhonov
Copy link
Member

@techee writes (emphasis added):

If there's some plan to provide universal-ctags as a shared library (#63), something like this will be necessary because users of the library will need to parse files from memory.

I can't say that this jumps out at me as obvious.

@dtikhonov
Copy link
Member

@pragmaware writes:

Recently I have looked at the performance of ctags and found out that the filesystem access is a major bottleneck. The function that is called more often and takes most processing time is getc(). That's why I have started looking into mmap()'d file access that would allow to load the files faster and access content via pointers instead of a function call

I brought this up a couple of times before, for example in issue #423. There, I wrote:

If we want to make ctags faster, the simplest thing to do is to rewrite iFileGetLine to read the file in 4 KB chunks instead of character by character. (Each char may be tested three or four times!) We'll double or triple the speed.

@dtikhonov
Copy link
Member

@techee, @pragmaware: You can't get around disk access penalty using mmap(). Once a file is in the buffer cache, however, a sane line-by-line implementation based by a large buffer offers the same performance as that of mmap.

@b4n
Copy link
Member

b4n commented Apr 21, 2016

@techee writes (emphasis added):

If there's some plan to provide universal-ctags as a shared library (#63), something like this will be necessary because users of the library will need to parse files from memory.

I can't say that this jumps out at me as obvious.

It does for me, because if I am to use such a library I probably want to be able to parse i.e. a buffer the user is currently editing, and then writing to a file just so the library can read it seems highly suboptimal.

@techee
Copy link
Contributor Author

techee commented Apr 21, 2016

If we want to make ctags faster, the simplest thing to do is to rewrite iFileGetLine to read the file in 4 KB chunks instead of character by character. (Each char may be tested three or four times!) We'll double or triple the speed.

The patch loads the complete file into memory if it's smaller than 1MB (which most sources are). This is faster than reading it by 4KB blocks - in the original version of the patch I was reading the file into memory in 4KB blocks instead all at once and the difference was 4s for parsing all the linux kernel sources.

@techee, @pragmaware: You can't get around disk access penalty using mmap().

Sure, but with SSD disks there's almost none (with classical hard drives you are mostly limited by the random access time which is around 10ms so you will be able to parse only about 100 files per second). Anyway, mmap() shouldn't be needed now as the file is loaded into the memory by simply reading its contents.

Once a file is in the buffer cache, however, a sane line-by-line implementation based by a large buffer offers the same performance as that of mmap.

For good performance it's also important to reduce system calls as much as possible as the context switch is expensive. This is probably the main cause for the total 8s improvement after using GIO because all the IO and kernel interaction is done once at the beginning and during parsing no system calls happen any more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants