-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiroot #3815
Multiroot #3815
Conversation
This shouldn't be merged until #3807 is merged first. |
I see you have proper descriptions in the commit messages, as well as a proper title -- could you copy paste them here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my remarks
otherwise raises an argparse.ArgumentTypeError which constitutes in a | ||
graceful error message automatically by argparse module. | ||
""" | ||
path = os.path.abspath(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we use pathlib? That lib has good multiplatform support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using os.path
module for handling paths. pathlib
is an alternative module. I can see the advantages of its usage, but I think we shouldn't mix them in the code base. For example this existing_abspath()
function is returning a path. Should it be a string or a pathlib.Path
object? Should this be adapted at the caller side transitively?
@@ -28,7 +28,6 @@ def __init__(self, environ): | |||
super(ClangSAConfigHandler, self).__init__() | |||
self.ctu_dir = '' | |||
self.ctu_on_demand = False | |||
self.log_file = '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this never used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This member was stored in the construct_config_handler()
function, but it is read nowhere. Since the store operation has been removed, this variable became completely unused.
"files which were created during the build. " | ||
"The analyzers will check only the files " | ||
"registered in these build databases.") | ||
parser.add_argument('input', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this change replicated in the check command? There is a logfile argument there too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeChecker check
command has a slightly different interface, for example it doesn't have this positional argument. It can be given the compilation database through -l
flag, or a build command can be given by -b
flag.
Traverse the parent directories of the given path and find the closest | ||
compile_commands.json. If no JSON file exists with this name up to the root | ||
then None returns. | ||
The path of the first compilation database is returned even is it doesn't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... even if it doesn't ..
|
||
while True: | ||
path = os.path.dirname(path) | ||
compile_commands_json = os.path.join(path, COMPILATION_DATABASE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if should we hard code "compile_commands.json" as the used compilation database name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In another language I'd say, this is a constant variable. The intention here was to express that "compile_commands.json" is a conventionally used name by other tools (e.g. CMake exports it with this name, Clangd and vim plugins are looking for this automatically, etc.). I just wanted to prevent potential typos at the places of usage. This variable is used at several places.
os.path.splitext(source_file_path)[1] in C_CPP_OBJC_OBJCPP_EXTS | ||
|
||
|
||
def build_actions_for_file(file_path: str) -> List[Dict]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find_build_actions_for_source_file
?
|
||
elif os.path.isdir(analysis_input): | ||
compilation_database_files = \ | ||
find_all_compilation_databases(analysis_input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely sure why the per file and the directory based method has different compDB discovery implementations.
Isn/t the single source file case just a specialized (restricted in a sense) case of the directiry based?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the original implementation, that you described. The problem was its extreme bad performance. I couldn't wait for the compilation database collection for analyzing LLVM source code. That's why the directory-based analysis is initiated from the compilation databases, not the source files.
"file": "inner.c" | ||
}] | ||
|
||
with open(cls.comp_db_outer, "w", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an inner function the create the compdb-s could be created.
Test if the correct set of build actions return to each test source | ||
file. | ||
|
||
WARNING! compilation_database.gather_compilation_database() function is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting consideration. I am thinking maybe we could somehow indicate in the assertion messages the path of the used compilation database? This way we could at least detect if something is wrong with the test environment.
databases. | ||
input The input of the analysis can be either a compilation | ||
database JSON file, a path to a source file or a path | ||
to a directory containing source files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Collected some more places where analyze input is mentioned:
It is at your discretion how much you edit.
- README.md
- 71, 116
- analyzer/user_guide.md
-859, 910 - usage.md
-125
The numbers are line numbers where the new mention of the new option is missing.
Currently the input of analysis is a compilation database JSON file. The purpose of this PR is to support the following analysis invocations: ``` # Analyze one source file. CodeChecker analyze main.c -o reports #analyze all source files under a directory. CodeChecker analyze my_project -o reports ``` This way the analysisa project with multiple root directories (i.e. separate compilation databases for submodules in subdirectories) is supported.
LGTM |
[cmd] Multiroot directory analysis support
Currently the input of analysis is a compilation database JSON file. The
purpose of this PR is to support the following analysis invocations:
This way the analysisa project with multiple root directories (i.e.
separate compilation databases for submodules in subdirectories)
is supported.