Add .C as a cpp extension and make extensions case-sensitive #1025

Open
wants to merge 1 commit into
base: master

Commits on Aug 14, 2023

  1. Add .C as a cpp extension and make extensions case-sensitive

    /# Motivation
    It turns out that some projects use a capital .C extension for C++ files.
    
    One example is OpenFOAM, a widely used open-source package for
    physics simulations.
    
    Although I do not know of other projects that follow this convention,
    it is worth noting that both GitHub and `cloc` recognize the .C
    extension as C++. This is shown in more detail in this issue:
    
    XAMPPRocky#1024
    
    /# Implementation
    
    The file `languages.json` contains a list of extensions for each
    supported language, so adding ".C" there should have been enough for
    this feature. However, the code was converting extensions to lowercase,
    which makes .c and .C indistinguishable. For this reason the `.to_lower`
    call was also removed from the `get_extension` utility function.
    
    The "to_lower" was probably added for a purpose. To avoid breaking
    existing behavior I added a second check: if the case-sensitive lookup
    does not return a result, the lookup is repeated with the lowercased
    extension. This makes the code a tiny bit slower, but only when the
    case-sensitive check fails, and it only repeats a rather fast check.
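
The two-step lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `extension_table` is a hypothetical stand-in for the data in `languages.json`, `language_for` is a made-up name, and `get_extension` is modeled on the description above, not copied from the repository.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical stand-in for the extension table built from `languages.json`.
fn extension_table() -> HashMap<&'static str, &'static str> {
    HashMap::from([("c", "C"), ("C", "Cpp"), ("cpp", "Cpp"), ("h", "CHeader")])
}

/// Return the extension exactly as written in the path, without lowercasing.
fn get_extension(path: &Path) -> Option<String> {
    path.extension().map(|ext| ext.to_string_lossy().into_owned())
}

/// Try a case-sensitive lookup first; on a miss, retry with the
/// lowercased extension so names like `MAIN.CPP` still resolve.
fn language_for(
    path: &Path,
    table: &HashMap<&'static str, &'static str>,
) -> Option<&'static str> {
    let ext = get_extension(path)?;
    table
        .get(ext.as_str())
        .or_else(|| table.get(ext.to_lowercase().as_str()))
        .copied()
}
```

With this table, `solver.C` resolves to `Cpp` on the first lookup, `main.c` resolves to `C`, and `MAIN.CPP` only resolves through the lowercase fallback. The second lookup runs only when the first one misses, which matches the cost argument above.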
    
    /# Testing
    
    Tests are created by the `build.rs` script according to the files
    present in `tests/data`. In order to test my solution I copied the
    `tests/data/cpp.cpp` into a file called `tests/data/cpp_C.C`.
    
    Adding this file without modifying the code causes a failing test,
    which is what we want. I think this is mostly a coincidence: for some
    reason the C summary counts blank lines differently from the Cpp
    summary.
    
    I think a more appropriate test would require each of these files to
    also contain the name of its intended language in the top comment.
    This would need additional code changes, which I think are a bit
    outside the scope of this change. I intend to create a separate PR
    adding this to the tests, if that is ok.
    
    Observation: since this new file is a copy of an existing one, it does
    not occupy extra space in the Git object store, because Git stores
    files by content hash.
    
    /# Additional considerations
    
    On Windows the file system is case-insensitive but case-preserving.
    I believe this means that the current version of the code will
    correctly identify .C as cpp on Windows, but I do not have
    a Windows system to test this on.
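
One part of this reasoning can be checked without a Windows machine: in Rust, `Path::extension` works on the path string alone and does not consult the file system, so the case of the extension is whatever the path string carries. The sketch below uses a hypothetical helper, `raw_extension`, to show this. It does not replace a real test on Windows, since it says nothing about how file names are reported when a directory is walked there.

```rust
use std::path::Path;

/// Return the extension as stored in the path string, with its case untouched.
/// Hypothetical helper for illustration only.
fn raw_extension(name: &str) -> Option<String> {
    Path::new(name)
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
}
```

Calling `raw_extension("solver.C")` and `raw_extension("solver.c")` yields different strings on every platform, so a case-sensitive lookup can tell them apart as long as the file name reaches the code with its original case.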
    glazari committed Aug 14, 2023
    844e66b