Commits on Jun 15, 2014
  1. @mikezackles
  2. Ctags (Vi) format

    * Add tests (figure out how to get vi to output filename & line number
      to stdout, for integration tests).
    * Sort the tags?
Commits on Jul 31, 2012
  1. Update version to '0.2'

  2. --suppress-qualifier-tags is now the default behaviour

    Use --extra-qualifier-tags to re-enable this behaviour. The option
    --suppress-qualifier-tags remains, but has no effect and will be removed
    in a future release.
    With the old default, the generated tag files were large enough (60MB+)
    to cause my editor (Emacs) performance troubles with fuzzy matching
    enabled; and for the user the sheer number of matching tags (in the
    tab-completion list presented by Emacs) was often overwhelming.
Commits on Jul 23, 2012
  1. Update version to '0.1'.

Commits on Jul 21, 2012
  1. Remove duplicate tags for the same source file

    The same source file can be processed multiple times; for example if the
    same file has several entries in the compile commands database (say,
    with different -D pre-processor definitions).
    Header files are very likely to be processed multiple times: several
    source files specified on the clang-ctags command line might include the
    same header file.
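
    A minimal sketch of the de-duplication idea, assuming each tag can be
    represented as a hashable (filename, line, tagname) tuple. The function
    name `unique_tags` and the tuple layout are hypothetical, not taken from
    the clang-ctags source:

```python
def unique_tags(tags):
    """Yield each (filename, line, tagname) tuple once, in first-seen order."""
    seen = set()
    for tag in tags:
        if tag not in seen:
            seen.add(tag)
            yield tag


# The same header reached through two source files yields the same tag twice.
tags = [
    ("a.h", 3, "ns::f()"),
    ("a.cpp", 10, "main()"),
    ("a.h", 3, "ns::f()"),
]
deduped = list(unique_tags(tags))
```

    Keeping first-seen order means the output stays stable no matter how
    many times a header is re-processed.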
  2. Performance optimization: Don't parse function bodies

    Enabling PARSE_SKIP_FUNCTION_BODIES cuts the total runtime down to 40%.
    It also starts reporting `cursor.is_definition()` as *False* for
    function definitions, so I have to explicitly check for those in my own
    `is_definition` function. I don't know if this change in behaviour is
    intentional or a bug.
    Unfortunately this means we're probably going to start generating tags
    for function declarations as well as definitions, but I'll pay that
    price for the performance increase.
  3. `--suppress-qualifier-tags` for each namespace qualifier

    For a source file containing:
        namespace ns { class cls { int member; }; }
    I generate several tags for `member`:
    * ::ns::cls::member
    * ns::cls::member
    * cls::member
    * member
    See `test_emacs_understands_nested_scopes` in test/ for a thorough
    explanation of the reasons.
    I now provide the option to suppress this behaviour, because it isn't
    necessary if your editor is smart enough to do a substring match on the
    tag name (perhaps even a substring match that knows about C++ namespace
    With Emacs I use "ido-ubiquitous"†, a package that provides the "ido"‡
    fuzzy matching everywhere in Emacs, including at the "find-tag" prompt.
  4. Debug logging with performance info

    The numbers clearly show that most of the time is spent in the parsing
    done by libclang itself.
    Also print the number of the current file being processed and the total
    number of files, because running clang-ctags on a large number of files
    takes a *long* time, so it is useful to get a rough progress indication.
  5. `--non-system-headers` to generate tags for every non-system header file

    I can't seem to tell from a cursor or cursor.location or
    cursor.location.file whether a file is a system header or not.
    System headers have absolute paths... but so do local headers when
    the file name is given as an absolute path.
    I can't find anything in the libclang C API either; the accessors of the
    python cursor.location.file (.name, .time) are clang_getFileName and
    clang_getFileTime in clang/tools/libclang/CIndex.cpp, and there is
    nothing else in there. Perhaps I could add it in a way similar to
    clang_isFileMultipleIncludeGuarded: Use the HeaderSearch object to
    determine if it is a system include. Unlikely to be possible without
    employing some heuristic though, so I might as well do that heuristic in
    my python code.
    Given the above, "non-system-headers" means files with a relative path,
    or under the directory where clang-ctags was invoked from.
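
    The heuristic described above could look roughly like this. The function
    name `is_non_system_header` and its parameters are hypothetical; this is
    a sketch of the stated rule, not the actual implementation:

```python
import os


def is_non_system_header(filename, invocation_dir):
    """Heuristic: relative paths, or paths under invocation_dir, are non-system."""
    if not os.path.isabs(filename):
        return True
    # Joining with "" appends a trailing separator, so that "/proj" does not
    # accidentally match "/project2".
    root = os.path.join(os.path.abspath(invocation_dir), "")
    return os.path.abspath(filename).startswith(root)
```

    For example, with an invocation directory of /home/me/proj, the path
    "include/foo.h" and the path "/home/me/proj/src/foo.h" both count as
    non-system, while "/usr/include/stdio.h" does not.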
  6. `--all-headers` to generate tags for every header file encountered

    This will generate tags for system headers too, so the tags file is
    going to be huge; only use it to create a tags file for a single source
    I've added this option because generating tags for header files is a bit
    awkward with clang-ctags. With GNU etags or exuberant ctags you'd just
    say `etags file.c file.h other-file.h` and it will index each file.
    clang-ctags needs a compilation command-line, and there is no such
    command line for header files (unless you manufacture one based on the
    command line for one of your source files). So I've added this option so
    that if you index a single source file, you'll be able to find any
    definitions in the included header files.
    Unfortunately the definitions of some things will be in other
    source files that haven't been indexed at all, so this is only somewhat
    It could turn out to be useful as a tool for delving into your system
    headers and learning more about your system!
  7. Compile commands database's `directory` is relative to database file

    Currently, the only producer of the compile_commands.json database is
    CMake, which (as of version 2.8.8, at least) specifies all file and
    directory entries with absolute paths, so this commit isn't strictly
    necessary; but it seems like the right thing to do.
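
    A small sketch of resolving an entry's `directory` against the location
    of the database file. The helper name `resolve_directory` is
    hypothetical:

```python
import os


def resolve_directory(database_path, directory):
    """Interpret an entry's `directory` relative to the database file's location."""
    base = os.path.dirname(os.path.abspath(database_path))
    # os.path.join discards `base` when `directory` is already absolute,
    # so the absolute paths written by CMake pass through unchanged.
    return os.path.normpath(os.path.join(base, directory))
```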
  8. Run compiler command line from directory specified in database

    For entries in the compile_commands.json database, paths specified in
    `command` are relative to `directory` (see ).
  9. Resolve symlinks when comparing absolute paths

    This won't work if the path stored in the compile commands database
    contains un-canonicalised (symlink) components. Furthermore, I have no
    control over that database (it is created by CMake or some other means
    beyond my control). But this is the best I can do for now, and it will
    make the upcoming `make distcheck` work on systems where $TMPDIR's path
    contains symlinks. The real solution will be to fix libclang's
    clang_CompilationDatabase_getCompileCommands (in the python bindings:
    CompilationDatabase.getCompileCommands) to resolve pathnames during its
    Note regarding the unit tests: By `cd`ing to the canonicalised directory
    in, the unit tests can freely use $PWD (see for example
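
    As an illustration of comparing paths with symlinks resolved, here is a
    sketch using `os.path.realpath`. The helper name `same_file_path` is
    hypothetical, and unlike the situation described above it canonicalises
    both sides itself, which is only possible when you control both paths:

```python
import os


def same_file_path(a, b):
    """Compare two paths after making them absolute and resolving symlinks."""
    return os.path.realpath(a) == os.path.realpath(b)
```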
  10. Use absolute paths to the source files

    when (a) searching for a file in the compile commands database, and
    (b) recording the source filename in the tags file.
    (a) is because the compile_commands.json file produced by CMake (as of
    version 2.8.8, at least) specifies files with absolute paths, and the
    libclang routine to retrieve the CompileCommands for a file seems to
    search on plain string matches: libclang doesn't try to match relative
    pathnames given to clang_CompilationDatabase_getCompileCommands against
    absolute pathnames in compile_commands.json.
    Currently CMake is the only producer of compile commands databases that
    I know of, but for compatibility with possible future changes to CMake
    and with other producers of the compile commands database, I should
    really try to support relative paths in the database's "file" field.
    The best way will be to fix libclang, rather than work around it here.
    (b) is to keep things simple (or you can call it laziness). Ideally the
    tags file would comply with the following specification from the
    etags(1) man page:
        Files specified with relative file names will be recorded in the tag
        table with file names relative to the directory where the tag table
        resides. If the tag table is in /dev or is the standard output,
        however, the file names are made relative to the working directory.
        Files specified with absolute file names will be recorded with
        absolute file names.
    The (future) implementation has a slight complication:
    * When --compile-commands isn't specified, we can use tu.spelling to
      determine whether the file was specified on the command line with an
      absolute path.
    * When --compile-commands *is* specified, we can't use tu.spelling,
      because that comes from the compile command in the database, not from
      the filename specified on the clang-ctags command line. So we'd have
      to pass the original filename all the way through to Etags.tag, but
      also give Etags.tag the absolute filename (though perhaps Etags.tag
      can determine the absolute filename from cwd + cursor.location.file;
      cursor.location.file includes the pathname as given on the compilation
      command line).
  11. Process multiple source files (including header files)

    specified on command line, when `--compile-commands` is given with the
    database for looking up each file's compilation command line.
    Note that I can now index header files specified on the command line, as
    long as I also specify (on the command line) a cpp file (with an entry
    in the compile commands database) that includes the header file.
    Up until this commit I had no way of choosing which header files to
    index; my choices were to generate tags for:
    1. all header files (including every header recursively included by
    other headers), or
    2. perhaps only header files in the same directory as the source
    file (or under the directory where clang-ctags was invoked), or
    3. no header files at all.
    Number 3 is what I was doing up until this commit, and is still what I
    do when invoked without `--compile-commands`: I only generate tags when == tu.spelling.
  12. Basic support for database of compilation commands

    Instead of passing the compiler command line as arguments to
    clang-ctags, you can have a json database with the compilation command
    line for each source file in your project. If your project uses the
    CMake build system, you can generate such a database with
    So instead of saying:
        clang-ctags -- -Isomewhere -Dsomething -c subdir/a.cpp
    you say:
        clang-ctags --compile-commands=path/to/compile_commands.json subdir/a.cpp
    Note that `getCompileCommands(source_file)` can return None (if the
    source_file isn't present in the database). The user of clang-ctags
    could pass in a list of files like "foo.cpp foo.h" where foo.cpp is
    present in the database, but foo.h isn't, so it is not an error if the
    source file isn't found in the database. In a future commit I will
    also generate tags for "foo.h" if it is specified on the command line.
    Note that there may be multiple compilation command lines for a single
    source file: The same source file could be re-used (say, with different
    -D pre-processor defines) to produce different object files. So I
    produce tags for *all* compilations of that source file; in a future
    commit I'll make the Etags and Ctags formatters smart enough to remove
    duplicate tags.
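
    The lookup behaviour described above (possibly no entry, possibly several
    entries per source file) can be sketched by reading the JSON database
    directly. This is a stand-in for illustration, not the libclang
    `getCompileCommands` call; the function name `compile_commands_for` is
    hypothetical, and it assumes the database stores absolute `file` paths as
    CMake does:

```python
import json
import os


def compile_commands_for(database_path, source_file):
    """Return every `command` recorded for source_file, or None if there is none."""
    with open(database_path) as f:
        entries = json.load(f)
    # Plain string comparison on the absolute path, as described above.
    wanted = os.path.abspath(source_file)
    commands = [e["command"] for e in entries if e["file"] == wanted]
    return commands or None
```

    A caller would skip files for which this returns None, and generate tags
    once per returned command otherwise.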
Commits on Jul 18, 2012
  1. Cleanup: Move clang processing from main into do_tags function

    Essentially all I've done is move the middle section of main() into its
    own function. There are no functional changes here. This is in
    preparation for the upcoming work on supporting a database of compile
    The changes to the Etags and Ctags formatter interface were necessary
    because now I instantiate the formatter before clang processing, i.e.
    before I know the filename (tu.spelling). This is also useful for the
    upcoming compile-commands work, where the formatter will be able to
    generate tags for multiple source files.
Commits on Jul 11, 2012
  1. Tag declarations inside an 'extern "C" { ... }' linkage specification

    I *don't* want the linkage spec itself to be counted when generating the
    parent scopes, i.e. "i" in test/linkage.cpp should be tagged as "::i",
    not "::<linkage spec>::i" (in reality the linkage spec has no
    "displayname" so the tag would have looked like "::::i").
    But I *do* want to tag the declarations inside the linkage spec. That is
    why I had to split is_named_scope() into two functions: is_named_scope()
    and should_tag_children().
    With clang 3.1 the linkage spec's kind is reported as UNEXPOSED_DECL
    instead of LINKAGE_SPEC; if this changes, the test should catch it
    (and I'll have to think about preserving compatibility between the
    different versions of libclang).
  2. Tag unions & enum members.

  3. Support command-line options -a, -e, -o|-f, -v.

    -a (--append), -o (--output, -f), -e, --verbose and --version are common
    to GNU etags and Exuberant Ctags.
    Otherwise the command-line interface of clang-ctags is *not* compatible
    with GNU etags, Exuberant Ctags, or other existing ctags
    implementations. This is because clang-ctags needs the full compilation
    command line to pass on to libclang, so it can only process one source
    file at a time.
    To tag multiple source files, use the "-a" option to append to an
    existing tags file. To obtain a source file's compilation command line,
    interpose CXX when invoking make, or use the compile_commands.json
    database created by CMake's -DCMAKE_EXPORT_COMPILE_COMMANDS=1 option
    (I will add more documentation on this matter over the next few days).
    Output in vi format is currently not supported, so you must use "-e"
    (Emacs etags format).
Commits on Jun 27, 2012
  1. Ignore compiler warnings.

    Now clang-ctags only stops on compilation *errors*.
  2. Nicer error message when unable to process command line.

    The test added in this commit doesn't actually test the specific output
    in this error condition, so the test would also pass before this commit.
    It is there to test that clang-ctags will exit with a failure status.
    When I add command-line flags for clang-ctags (as opposed to the flags
    in the compiler command line, passed on to clang) I will have a way of
    switching debug on and off -- for now I'll just always print it.
  3. Check for clang diagnostics.

    A C++ source file with obvious errors wasn't flagged by index.parse, and
    I could still iterate over all the cursors. The type of "booyah i;" was
    happily reported as TypeKind.INT.
    So now I fail to proceed if clang reported any diagnostics.
    Note that providing argv[0] (clang-ctags's own program name) to
    index.parse caused the following diagnostic on every invocation:
        ./clang-ctags: 'linker' input unused when '-fsyntax-only' is present
    In other words, "clang-ctags" was being treated as an additional
    argument to the compiler, and since it didn't end in a recognised
    extension like ".cpp", it wasn't treated as the source file to compile
    but as input to the linker.
    If there is a compiler name at argv[0] (instead of "clang-ctags") the
    diagnostic is the same; clearly clang_parseTranslationUnit does not
    expect to see the name of the compiler itself on the compilation command
    Changing argv to argv[1:] eliminates the diagnostic.
  4. Emacs requires a "pattern" as well as tagname + line number.

    Without the pattern Emacs jumps to the file, but at line 1 instead of
    the line containing the tag -- even though the tags file already
    contains the line number. Presumably Emacs doesn't rely on the line
    number at all for robustness to changes in the source file.
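
    For reference, a sketch of how one section of an Emacs TAGS file with
    pattern, explicit tag name, line number and byte offset might be
    formatted, following the layout in etc/ETAGS.EBNF. The function name
    `etags_section` and the tuple layout are hypothetical, not the
    clang-ctags formatter itself:

```python
def etags_section(filename, tags):
    """Format one TAGS-file section for `filename`.

    `tags` is a list of (pattern, tagname, line, byte_offset) tuples.
    """
    # Each entry: pattern, DEL (0x7f), tag name, SOH (0x01), "line,offset".
    body = "".join(
        "%s\x7f%s\x01%d,%d\n" % (pattern, tagname, line, offset)
        for pattern, tagname, line, offset in tags
    )
    # The header records the size in bytes of the entries that follow it.
    size = len(body.encode("utf-8"))
    return "\x0c\n%s,%d\n%s" % (filename, size, body)


section = etags_section("a.cpp", [("int i;", "n1::i", 1, 0)])
```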
  5. Output tags in Emacs etags format.

    Until now I've been outputting the tags in no particular format, just
    something human-readable for debugging.
  6. Tags for functions include full signature.

    Instead of searching for "foo" the user will have to search for "foo()"
    or "foo(int)".†  Emacs, at least, offers tab-completion on available
    tag names so I don't think I will provide separate "foo" and "foo()"
    † Unless the editor can match a partially-matching tag -- Emacs doesn't
    do this on the explicit tagname, only on the pattern field. (The tagname
    is the second field, the pattern field the first, which is often just
    the whole source line. See etc/ETAGS.EBNF in the Emacs source tree for
    an explanation of the TAGS file format.)
  7. Don't create tags for class access specifiers.

    Like "public" or "private". For a reason I don't understand,
    cursor.is_definition() returns True for access specifiers.
  8. Use cursor.semantic_parent instead of tracking parents myself.

    Tracking the parents myself only worked for a class's inline methods
    like "f" below, but not for methods like "g" that are defined outside of
    the lexical scope of the class definition:
        class A {
            void f() {}
            void g();
        };
        void A::g() {}
  9. Explicit tags for class members, qualified by the class name.

    Just like we do for namespaces and structs.
  10. Explicit tags for struct members, qualified by the struct's name.

    For source "struct s { int i; };" the user can search for "i" or for "s::i"
    so we have explicit tags for both spellings.
  11. Create explicit tags for each namespace qualifier.

    For source "namespace n1 { namespace n2 { int i; } }" the user can
    search for "i" or "n2::i" or "n1::n2::i" or even "::n1::n2::i" so we
    have explicit tags for each spelling.
    In future commits I'll do the same for class & struct members.
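
    A minimal sketch of generating one tag per qualifier spelling, assuming
    the enclosing scopes are available as a list from outermost to
    innermost. The function name `qualified_tag_names` is hypothetical:

```python
def qualified_tag_names(scopes, name):
    """Return every spelling of `name` within the nested `scopes`, shortest first."""
    names = [name]
    qualified = name
    # Prepend one enclosing scope at a time, innermost first.
    for scope in reversed(scopes):
        qualified = scope + "::" + qualified
        names.append(qualified)
    # Finally the spelling anchored at the global namespace.
    names.append("::" + qualified)
    return names


spellings = qualified_tag_names(["n1", "n2"], "i")
```

    For the source above, `spellings` holds the four tag names listed in the
    commit message: "i", "n2::i", "n1::n2::i" and "::n1::n2::i".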