ShimaMichael/cpp-Static-code-analyser

Static Analyser for C++ Projects

A modular, rule‑based static analysis tool for C++ code. The analyser parses a project’s source files, tokenises and (optionally) builds an abstract syntax tree (AST), and applies a set of rules to detect style, complexity, and security issues. Results are reported as human‑readable summaries and per‑violation details.

This project was developed as part of a final‑year dissertation on static code analysis and software quality.

Features

  • Rule‑based analysis framework with a common Rule interface.

  • Project and source file management abstraction (Project, ProjectManager, SourceFile).

  • Tokenisation and lightweight AST support (Token, Tokeniser, ASTNode, ASTParser).

  • Violation representation and reporting (Violation, AnalysisReport, ReportManager).

  • Multi‑threaded analysis using a thread pool (ThreadPool, WorkerThread).

  • Command‑line interface to analyse a directory or single file.

  • Portable C++17 implementation with a Make‑based build system.

Current rule set (C++):

  • Line length rule: flags lines longer than 120 characters.

  • Unused variable rule: flags potentially unused local variables (simplified heuristic).

  • Function complexity rule: estimates cyclomatic complexity from control‑flow keywords.

  • Naming convention rule: flags identifiers that look like SCREAMING_SNAKE_CASE constants.

  • Security rule: flags calls to known unsafe C functions such as strcpy, gets, sprintf, scanf.

Project Structure

.
├── README.md
├── makefile
├── files.sh                
└── src
    ├── main.cpp
    ├── StaticAnalyser.cpp / .hpp
    ├── ProjectManager
    │   ├── Project.cpp / .hpp
    │   ├── ProjectManager.cpp / .hpp
    │   └── ReportManager.cpp
    ├── Report
    │   ├── AnalysisReport.cpp / .hpp
    │   └── ReportManager.hpp
    ├── Rule
    │   ├── Rule.cpp / .hpp
    │   ├── RuleManager.cpp / .hpp
    │   └── (individual rule headers if separated)
    ├── Tokens
    │   ├── Token.cpp / .hpp
    │   ├── Tokeniser.cpp / .hpp
    ├── AST
    │   ├── ASTNode.cpp / .hpp
    │   ├── ASTParser.cpp / .hpp
    ├── source
    │   ├── SourceFile.cpp / .hpp
    ├── Thread
    │   ├── ThreadPool.cpp / .hpp
    │   └── WorkerThread.cpp / .hpp
    ├── cache
    │   ├── CacheManager.cpp / .hpp
    ├── config
    │   ├── ConfigManager.cpp / .hpp
    └── violation
        ├── Violation.cpp / .hpp

Building

Prerequisites

  • C++17‑compatible compiler (e.g. g++, clang++).

  • Standard POSIX environment (tested on macOS / Linux).

  • make for building from the provided Makefile.

Build steps

From the project root:

make

This will:

Recursively find all .cpp sources under src/.

Compile them into object files under build/.

Link them into the static_analyser executable in the project root.

Typical build output:

$ make
g++ -Wall -Wextra -Wpedantic -std=c++17 -O2 -c src/main.cpp -o build/src/main.o
...
g++ -Wall -Wextra -Wpedantic -std=c++17 -O2 -pthread -o static_analyser build/src/main.o ...

To clean build artifacts:

make clean

Usage

Basic command

From the project root after building:

./static_analyser <project-path> [options]

Example:

./static_analyser ./example_project

The analyser will:

Discover source files under the given path (using ProjectManager).

Tokenise and, if configured, parse them into ASTs.

Apply all registered rules.

Produce a summary and per‑file violation details on stdout or via the reporting system.

Example workflow

Build your C++ project (optional but recommended so include paths and macros are correct).

Run the analyser on the project root or src directory.

Inspect the printed summary, or load generated reports if you output to file.

Architecture Overview

High‑level design

The analyser is organised into several layers:

Project and source management

ProjectManager is responsible for discovering and registering source files, creating a Project object that describes the codebase. SourceFile encapsulates file path, content, and any derived metadata.

Lexical and syntactic analysis

Tokeniser converts raw source text into a sequence of Token objects (keywords, identifiers, literals, operators, etc.). ASTParser builds an ASTNode tree for files where deeper structural analysis is required.
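As a rough illustration of the tokenisation step, the sketch below splits source text into identifier, number, and symbol tokens and records the line each one starts on. The names SketchToken, SketchTokenKind, and tokeniseSketch are hypothetical and are not taken from the project's Token or Tokeniser classes, which presumably handle more cases (string literals, comments, multi‑character operators).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; the project's Token class may define others.
enum class SketchTokenKind { Identifier, Number, Symbol };

struct SketchToken {
    SketchTokenKind kind;
    std::string text;
    std::size_t line;  // 1-based line number
};

// Splits `source` into tokens. Whitespace is skipped; any character that is
// not part of an identifier or a run of digits becomes a one-character Symbol.
std::vector<SketchToken> tokeniseSketch(const std::string& source) {
    std::vector<SketchToken> tokens;
    std::size_t line = 1;
    std::size_t i = 0;
    while (i < source.size()) {
        const unsigned char c = static_cast<unsigned char>(source[i]);
        if (c == '\n') { ++line; ++i; continue; }
        if (std::isspace(c)) { ++i; continue; }

        const std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Identifier or keyword: letters, digits, underscores.
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) ||
                    source[i] == '_')) {
                ++i;
            }
            tokens.push_back({SketchTokenKind::Identifier,
                              source.substr(start, i - start), line});
        } else if (std::isdigit(c)) {
            // Integer literal: a run of decimal digits.
            while (i < source.size() &&
                   std::isdigit(static_cast<unsigned char>(source[i]))) {
                ++i;
            }
            tokens.push_back({SketchTokenKind::Number,
                              source.substr(start, i - start), line});
        } else {
            tokens.push_back({SketchTokenKind::Symbol,
                              std::string(1, source[i]), line});
            ++i;
        }
    }
    return tokens;
}
```

Keywords such as int are returned as Identifier tokens here; a fuller tokeniser would classify them separately.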

Rule framework

The abstract base class Rule defines a common interface for all checks. Each concrete rule (e.g. LineLengthRule, UnusedVariableRule) implements an evaluate method that inspects the AST, Token list, and SourceFile to produce a vector of Violation objects.
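The shape of such an interface might look like the sketch below. The README does not show the real signature of evaluate, so this version assumes a simplified one that takes only a file; SketchRule, SketchSourceFile, SketchViolation, SketchLineLengthRule, and runRules are hypothetical stand‑ins, not the project's classes.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-ins; the project's SourceFile and Violation carry more data.
struct SketchSourceFile {
    std::string path;
    std::vector<std::string> lines;
};

struct SketchViolation {
    std::string ruleName;
    std::string path;
    std::size_t line;  // 1-based
    std::string message;
};

// Hypothetical rule interface: each rule inspects a file and returns findings.
class SketchRule {
public:
    virtual ~SketchRule() = default;
    virtual std::string name() const = 0;
    virtual std::vector<SketchViolation> evaluate(
        const SketchSourceFile& file) const = 0;
};

// Example concrete rule: flags lines longer than a maximum (120 by default).
class SketchLineLengthRule : public SketchRule {
public:
    explicit SketchLineLengthRule(std::size_t maxLength = 120)
        : maxLength_(maxLength) {}

    std::string name() const override { return "LineLength"; }

    std::vector<SketchViolation> evaluate(
        const SketchSourceFile& file) const override {
        std::vector<SketchViolation> found;
        for (std::size_t i = 0; i < file.lines.size(); ++i) {
            const std::size_t length = file.lines[i].size();
            if (length > maxLength_) {
                found.push_back({name(), file.path, i + 1,
                                 "line is " + std::to_string(length) +
                                     " characters long"});
            }
        }
        return found;
    }

private:
    std::size_t maxLength_;
};

// Runs every rule against one file and concatenates the results.
std::vector<SketchViolation> runRules(
    const std::vector<std::shared_ptr<SketchRule>>& rules,
    const SketchSourceFile& file) {
    std::vector<SketchViolation> all;
    for (const auto& rule : rules) {
        auto found = rule->evaluate(file);
        all.insert(all.end(), found.begin(), found.end());
    }
    return all;
}
```

Keeping rules behind one virtual method lets the engine treat them uniformly, which is what makes adding a new check a matter of registering one more object.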

Violation and reporting

A Violation captures:

A unique violation ID.

The Rule that triggered it.

The SourceFile.

Location (line and column).

A human‑readable message.

Severity level (INFO, WARNING, ERROR) and type (STYLE, COMPLEXITY, SECURITY, etc.).

AnalysisReport aggregates violations across the project and can produce summary strings or structured output. ReportManager coordinates formatting and output.
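To make the fields listed above concrete, here is a minimal sketch of a violation record together with a formatter and a one‑line summary. The struct layout, the output format, and the names SketchViolation, SketchSeverity, SketchType, formatViolation, and summarise are assumptions for illustration; the project's Violation and AnalysisReport classes may differ.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Severity levels and types named in the README; the enum names are assumed.
enum class SketchSeverity { INFO, WARNING, ERROR };
enum class SketchType { STYLE, COMPLEXITY, SECURITY };

struct SketchViolation {
    std::size_t id;
    std::string ruleName;
    std::string filePath;
    std::size_t line;
    std::size_t column;
    std::string message;
    SketchSeverity severity;
    SketchType type;
};

std::string severityName(SketchSeverity s) {
    switch (s) {
        case SketchSeverity::INFO: return "INFO";
        case SketchSeverity::WARNING: return "WARNING";
        case SketchSeverity::ERROR: return "ERROR";
    }
    return "UNKNOWN";
}

// Formats one violation as "path:line:column: SEVERITY [rule] message".
std::string formatViolation(const SketchViolation& v) {
    std::ostringstream out;
    out << v.filePath << ':' << v.line << ':' << v.column << ": "
        << severityName(v.severity) << " [" << v.ruleName << "] " << v.message;
    return out.str();
}

// Builds a one-line summary with a count per severity level.
std::string summarise(const std::vector<SketchViolation>& violations) {
    std::size_t counts[3] = {0, 0, 0};
    for (const auto& v : violations) {
        ++counts[static_cast<std::size_t>(v.severity)];
    }
    std::ostringstream out;
    out << violations.size() << " violation(s): " << counts[0] << " INFO, "
        << counts[1] << " WARNING, " << counts[2] << " ERROR";
    return out.str();
}
```

The "path:line:column" prefix is a convention borrowed from compiler diagnostics so that editors can jump to the location; it is a choice made for this sketch, not something the README specifies.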

Execution engine

StaticAnalyser orchestrates the analysis:

Holds the Project.

Manages the set of Rule instances.

Coordinates tokenisation, parsing, and rule evaluation.

Supports multi‑threaded execution using ThreadPool and WorkerThread.

Implemented Rules

The exact implementation details live in the corresponding .cpp files; this section describes only the high‑level behaviour of each rule.

  • LineLengthRule

    Checks each line of a SourceFile. Flags a violation when line length exceeds 120 characters. Reports the line number, column, and the measured length.

  • UnusedVariableRule

    Scans tokens for variable declarations (keywords such as int, float, double, char followed by an identifier). Uses a simplified heuristic to flag variables that are likely unused. Currently focuses on int declarations for demonstration, but can be extended.

  • FunctionComplexityRule

    Counts occurrences of control‑flow keywords such as if, for, and while. Computes an approximate cyclomatic complexity as 1 + ifCount + forCount + whileCount. Flags a violation when the complexity exceeds a configurable threshold (e.g. 10).

  • NamingConventionRule

    Inspects identifier tokens. Detects identifiers in SCREAMING_SNAKE_CASE (all caps, underscores, digits). Flags them as potential naming convention violations, for example when a name that looks like a constant is used for an ordinary variable.

  • SecurityRule

    Scans for identifiers corresponding to unsafe C library functions such as:

    • strcpy

    • gets

    • sprintf

    • scanf

    Emits a violation suggesting safer alternatives.
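Three of the heuristics described above are simple enough to sketch over a plain list of token strings. The functions below are hypothetical and written for illustration only (approximateComplexity, isScreamingSnakeCase, and findUnsafeCalls are not names from the project), and they inherit the stated limitations: keyword counting ignores else, case, and logical operators, and the unsafe‑function scan matches any identifier with that name.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Approximate cyclomatic complexity: 1 + ifCount + forCount + whileCount.
int approximateComplexity(const std::vector<std::string>& tokens) {
    int complexity = 1;
    for (const auto& token : tokens) {
        if (token == "if" || token == "for" || token == "while") {
            ++complexity;
        }
    }
    return complexity;
}

// True when `id` consists only of upper-case letters, digits, and
// underscores, and contains at least one upper-case letter.
bool isScreamingSnakeCase(const std::string& id) {
    bool hasUpper = false;
    for (unsigned char c : id) {
        if (std::isupper(c)) {
            hasUpper = true;
        } else if (!(std::isdigit(c) || c == '_')) {
            return false;
        }
    }
    return hasUpper;
}

// Returns the indices of tokens that name one of the unsafe C functions
// listed in the README.
std::vector<std::size_t> findUnsafeCalls(const std::vector<std::string>& tokens) {
    static const std::array<const char*, 4> unsafe = {
        "strcpy", "gets", "sprintf", "scanf"};
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        for (const char* name : unsafe) {
            if (tokens[i] == name) {
                hits.push_back(i);
                break;
            }
        }
    }
    return hits;
}
```

For the token sequence of `if (x) while (y) strcpy(a, b);` the complexity estimate is 3 and the scan reports the single strcpy token.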

Configuration and Extensibility

Adding a new rule

To add a new rule:

Create a new class derived from Rule (e.g. NullPointerRule) and implement the evaluate method.

Use the tokens, AST, and SourceFile parameters as needed.

Construct and return any Violation objects that represent findings.

Register the rule in the StaticAnalyser setup (typically in main.cpp or a RuleManager):

auto analyser = std::make_shared<StaticAnalyser>(project, /*verbose*/ true);
analyser->registerRule(std::make_shared<LineLengthRule>());
analyser->registerRule(std::make_shared<UnusedVariableRule>());
// ...
analyser->registerRule(std::make_shared<NullPointerRule>());

Configuration files

If you add configuration support (e.g. via ConfigManager), you can:

  • Control thresholds (max line length, complexity limit).

  • Enable/disable specific rules.

  • Specify include/exclude patterns for files and directories.
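One possible shape for such configuration is a small key=value file. The format, the key names (max_line_length, complexity_limit, disable), and the SketchConfig struct below are all assumptions made for this sketch; the README does not define a configuration format, and ConfigManager may work differently.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Hypothetical settings; defaults mirror the thresholds mentioned in the README.
struct SketchConfig {
    std::size_t maxLineLength = 120;
    int complexityLimit = 10;
    std::set<std::string> disabledRules;
};

// Parses lines of the form "key=value". Blank lines, lines starting with '#',
// lines without '=', and unknown keys are skipped. Non-numeric values for the
// numeric keys make std::stoul / std::stoi throw std::invalid_argument.
SketchConfig parseConfigSketch(const std::string& text) {
    SketchConfig config;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;

        const std::string key = line.substr(0, eq);
        const std::string value = line.substr(eq + 1);
        if (key == "max_line_length") {
            config.maxLineLength = std::stoul(value);
        } else if (key == "complexity_limit") {
            config.complexityLimit = std::stoi(value);
        } else if (key == "disable") {
            config.disabledRules.insert(value);  // one rule name per line
        }
    }
    return config;
}
```

A rule runner could then skip any rule whose name appears in disabledRules and pass the thresholds to the rule constructors.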

Threading and Performance

ThreadPool and WorkerThread are used to parallelise analysis across multiple files.

The number of threads can be configured via a command‑line flag or configuration (e.g. StaticAnalyser::setThreadCount(int)).

Each worker processes files or tasks independently and pushes results back into the shared AnalysisReport.
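The pattern of independent workers writing into a shared report can be sketched with the standard library alone. This is not the project's ThreadPool or WorkerThread: analyseInParallel is a hypothetical function that starts plain std::thread workers, hands out files through an atomic counter, and guards the shared result vector with a mutex. Like the project itself, it should be linked with -pthread.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Runs `analyse` on every file using `threadCount` worker threads and collects
// the returned messages into one vector. The order of messages from different
// files is not deterministic.
std::vector<std::string> analyseInParallel(
    const std::vector<std::string>& files,
    const std::function<std::vector<std::string>(const std::string&)>& analyse,
    unsigned threadCount) {
    std::vector<std::string> report;    // shared result, guarded by reportMutex
    std::mutex reportMutex;
    std::atomic<std::size_t> next{0};   // index of the next unclaimed file

    auto worker = [&]() {
        for (;;) {
            const std::size_t index = next.fetch_add(1);
            if (index >= files.size()) return;

            // Analyse without holding the lock so workers do not serialise.
            auto found = analyse(files[index]);

            std::lock_guard<std::mutex> lock(reportMutex);
            report.insert(report.end(), found.begin(), found.end());
        }
    };

    if (threadCount == 0) threadCount = 1;
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < threadCount; ++i) {
        workers.emplace_back(worker);
    }
    for (auto& thread : workers) {
        thread.join();
    }
    return report;
}
```

Holding the lock only while appending keeps contention low when analysis of a file takes much longer than merging its results; a real thread pool would additionally reuse threads across runs.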

Limitations and Future Work

Known limitations:

  • Parsing and analysis are tuned for typical C++ code, but do not yet cover all language features.

  • Some rules (e.g. unused variable detection) currently use relatively simple heuristics and may produce false positives or miss certain patterns.

Possible future improvements:

  • Richer AST‑based analysis with full control‑flow and data‑flow analysis.

  • Additional rules for resource management, exception safety, and concurrency issues.

  • Support for more languages or dialects.

  • IDE plugin or CI/CD integration (e.g. GitHub Actions, GitLab CI).

  • Machine‑readable output formats (JSON, SARIF) for integration with other tools.

Development

Prerequisites

C++17 toolchain (g++ or clang++).

make.

Optional: clang-format, clang-tidy for code style and extra checks.

Acknowledgements

Inspired by existing static analysis tools such as LDRA, Cppcheck, clang‑tidy, and IKOS.

Developed as part of an academic project on software quality and static analysis.
