- http://github.com/lefticus/presentations
- http://cppcast.com
- http://chaiscript.com - Open Session Thursday (tomorrow) morning at 8:00-8:45
- http://cppbestpractices.com
- Independent Contractor
https://en.wikipedia.org/wiki/Static_program_analysis
Static program analysis is the analysis of computer software that is performed without actually executing programs...
. . .
- We will be focusing on
- source code analysis (as opposed to object code)
- tools that are freely available
- issues I've seen in the wild
. . .
- Technically static analysis includes compiler warnings
- modern compiler warnings are very sophisticated, and include things that used to be considered 'analysis' (ex: analysis of printf formatting issues occurs on GCC with no extra warnings enabled)
- compiler warnings will only be brought up if they are unique to a particular compiler
- cppcheck
- clang's analyzer
- msvc++ analyze
- coverity's 'scan' (free for open source projects)
- metrix++
- copy-paste detectors
- There are dozens (hundreds?) of code analysis tools available.
- cppcheck has 320 checks
- clang has 56 checks
- MSVC has ~286 checks
#include <cassert>
bool do_something(int &i)
{
++i;
return i < 10;
}
int main()
{
int i = 0;
assert(do_something(i));
}
#include <cassert>
bool do_something(int &i)
{
++i;
return i < 10;
}
int main()
{
int i = 0;
// what happens in a release build?
assert(do_something(i));
// warning: i is initialized but unused
}
#include <cassert>
bool do_something(int &i)
{
++i;
return i < 10;
}
int main()
{
int i = 0;
// what happens in a release build?
assert(do_something(i));
if (i > 2) { /* ... */ } // no warning
}
- No tool tested complained as long as
i
was used. - clang and cppcheck say they are supposed to catch assert statements with side effects
bool test_value_1() { return true; }
bool do_something()
{
if (test_value_1()) {
return true;
} else {
return true;
}
}
int main()
{
do_something();
}
bool test_value_1() { return true; }
bool do_something()
{
if (test_value_1()) {
return true;
} else {
// pointless duplication
return true;
}
}
int main()
{
do_something();
}
bool test_value_1() { return true; }
bool test_value_2() { return true; }
bool do_something()
{
if (test_value_1()) {
if (test_value_2()) {
return true;
}
return true;
}
return false;
}
int main()
{
do_something();
}
bool test_value_1() { return true; }
bool test_value_2() { return true; }
bool do_something()
{
if (test_value_1()) {
// this test is useless
if (test_value_2()) {
return true;
}
return true;
}
return false;
}
int main()
{
do_something();
}
- Only cppcheck caught either case
- Surprisingly, this more complicated case was caught by Coverity Scan in ChaiScript
if (rt.bare_equal(boxed_type))
{
if (lt.bare_equal(boxed_pod_type))
{
return true;
}
return true;
}
It is unknown why our simplified case was not caught by coverity.
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
for (unsigned l = 0; l < v.size(); ++l)
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
// this is bad because it limits the loop to 2^32 elements
for (unsigned l = 0; l < v.size(); ++l)
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
for (long l = 0; l < v.size(); ++l)
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
// this is bad because it limits the loop to 2^63 elements
for (long l = 0; l < v.size(); ++l)
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
for (int l = 0; l < v.size(); ++l)
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
// this is bad because it limits the loop to 2^31 elements
for (int l = 0; l < v.size(); ++l)
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
// this is bad because it limits the loop to 2^31 elements
for (int l = 0; l < v.size(); ++l) // what else is odd?
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
// this is bad because it limits the loop to 2^31 elements
for (int l = 0; l < v.size(); ++l) // Empty vector
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
// this is bad because it limits the loop to 2^31 elements
for (int l = 0; l < v.size(); ++l) // Anything else odd?
{
std::cout << v[l] << '\n';
}
}
// Assume modern 64bit platform
#include <vector>
#include <iostream>
int main()
{
std::vector<int> v;
// this is bad because it limits the loop to 2^31 elements
for (int l = 0; l < v.size(); ++l) // We should be using a range-based loop
{
std::cout << v[l] << '\n';
}
}
- clang and MSVC warn about sign comparisons
- No tool catches both
Bonus
- cppcheck warns us we are iterating over an empty vector
- clang-tidy's modernize checks warn us the loop should be a range based loop
Adapted from http://googleresearch.blogspot.no/2006/06/extra-extra-read-all-about-it-nearly.html
#include <vector>
uint64_t binarySearch(const std::vector <int64_t> &v, int64_t key) {
int low = 0;
int high = v.size() - 1; // loss of precision
while (low <= high) {
int mid = (low + high) / 2;
int64_t midVal = v[mid]; // What happens with > 2B objects?
if (midVal < key) {
low = mid + 1;
} else if (midVal > key) {
high = mid - 1;
} else {
return mid; // key found
}
}
return -(low + 1); // key not found. What if negative?
}
int main()
{
return binarySearch(std::vector<int64_t>{1,2,3,4,5}, 2);
}
#include <system_error>
enum Enum
{
Value1,
Value2,
Value3
};
int go(Enum t_enum) {
switch (t_enum) {
case Enum::Value1:
return 0;
case Enum::Value2:
return 1;
default:
throw std::runtime_error("unknown value");
}
throw std::runtime_error("unknown value");
}
int main()
{
go(Enum::Value3);
}
#include <system_error>
enum Enum
{
Value1,
Value2,
Value3
};
int go(Enum t_enum) {
switch (t_enum) {
case Enum::Value1:
return 0;
case Enum::Value2:
return 1;
default:
throw std::runtime_error("unknown value");
}
throw std::runtime_error("unknown value");// this code is unreachable
}
int main()
{
go(Enum::Value3);
}
- MSVC warns about this in the IDE
- clang warns with
-Weverything
/ clang-tidy warns
Bonus
- coverity scan notes that the
throw
is unhandled inmain
void do_something()
{
int *i = nullptr;
*i = 5;
}
int main()
{
null_dereference_1();
}
void do_something()
{
int *i = nullptr;
// dereferencing obviously null value
*i = 5;
}
int main()
{
null_dereference_1();
}
int *get_i() {
return nullptr;
}
void do_something()
{
int *i = get_i();
*i = 5;
}
int main()
{
null_dereference_2();
}
int *get_i() {
return nullptr;
}
void do_something()
{
int *i = get_i();
// Indirect dereference
*i = 5;
}
int main()
{
do_something();
}
#include <memory>
void do_something()
{
std::shared_ptr<int> i;
// dereferencing obviously null value
*i = 5;
}
int main()
{
do_something();
}
- Everyone caught the obvious one
- Only cppcheck could catch the indirect nullptr dereference
- No tools could catch the smart pointer version
int main()
{
int a[5];
return a[5];
}
int main()
{
int a[5];
return a[5]; // indexing past end of array
}
#include <array>
int main()
{
std::array<int, 5> a;
return a[5];
}
#include <array>
int main()
{
std::array<int, 5> a;
return a[5]; // indexing past end of array
}
- The c-style array is caught by all tools
std::array
issue is only caught by cppcheck
Bonus
- MSVC notices that the c-style array is used uninitialized.
#include <functional>
#include <iostream>
std::function<void ()> do_something()
{
int some_value = 1;
return [&some_value](){ std::cout << some_value << '\n'; };
}
int main()
{
const auto f = do_something();
f();
}
#include <functional>
#include <iostream>
std::function<void ()> do_something()
{
int some_value = 1;
// some_value won't exist when this function is returned
return [&some_value](){ std::cout << some_value << '\n'; };
}
int main()
{
const auto f = do_something();
f();
}
#include <functional>
#include <iostream>
std::function<void ()> do_something()
{
int some_value;
return [some_value](){ std::cout << some_value << '\n'; };
}
int main()
{
const auto f = do_something();
f();
}
#include <functional>
#include <iostream>
std::function<void ()> do_something()
{
int some_value;
// some_value is used uninitialized
return [some_value](){ std::cout << some_value << '\n'; };
}
int main()
{
const auto f = do_something();
f();
}
#include <functional>
#include <iostream>
std::function<void ()> do_something()
{
int some_value;
return [&some_value](){ std::cout << some_value << '\n'; };
}
int main()
{
const auto f = do_something();
f();
}
#include <functional>
#include <iostream>
std::function<void ()> do_something()
{
int some_value;
// uninitialized and captured local by reference
return [&some_value](){ std::cout << some_value << '\n'; };
}
int main()
{
const auto f = do_something();
f();
}
- cppcheck, clang, and coverity got confused as to whether the variable was initialized if captured by reference
- msvc caught nothing
No tool warned about the actual capture local by reference and return
#include <utility>
struct Object {
void do_something() {}
};
void take(Object &&) { }
void do()
{
Object o;
take(std::move(o));
o.do_something();
}
int main()
{
do();
}
#include <utility>
struct Object {
void do_something() {}
};
void take(Object &&) { }
void do()
{
Object o;
take(std::move(o));
o.do_something(); // use of local after move
}
int main()
{
do();
}
- No tool commented on this problem at all.
Bonus
- cppcheck points out that technically
Object::do_something
can be static
#include <vector>
int main()
{
std::vector<int> v;
std::vector<int> v2(2,4);
std::vector<int> v3(3,4);
v.insert(v.begin(), v2.begin(), v3.end());
}
#include <vector>
int main()
{
std::vector<int> v;
std::vector<int> v2(2,4);
std::vector<int> v3(3,4);
v.insert(v.begin(), v2.begin(), v3.end()); // coverity catches this
}
#include <memory>
int main()
{
auto s = std::make_shared<int>(1);
int *p = s.get();
*p = 1;
s = std::make_shared<int>(2);
*p = 5;
}
#include <memory>
int main()
{
auto s = std::make_shared<int>(1);
int *p = s.get();
*p = 1;
s = std::make_shared<int>(2); // nobody catches this yet
*p = 5;
}
- analyzes code for various metrics
- can calculate cyclomatic complexity - a measure of code's complexity - and warn on overly complex sections of code
static Common_Types get_common_type(const Boxed_Value &t_bv)
{
const Type_Info &inp_ = t_bv.get_type_info();
if (inp_ == typeid(int)) {
return get_common_type(sizeof(int), true);
} else if (inp_ == typeid(double)) {
return Common_Types::t_double;
} else if (inp_ == typeid(long double)) {
return Common_Types::t_long_double;
} else if (inp_ == typeid(float)) {
return Common_Types::t_float;
} else if (inp_ == typeid(char)) {
return get_common_type(sizeof(char), std::is_signed<char>::value);
} else if (inp_ == typeid(unsigned char)) {
return get_common_type(sizeof(unsigned char), false);
} else if (inp_ == typeid(unsigned int)) {
return get_common_type(sizeof(unsigned int), false);
} else if (inp_ == typeid(long)) {
return get_common_type(sizeof(long), true);
} else if (inp_ == typeid(unsigned long)) {
return get_common_type(sizeof(unsigned long), false);
/* etc... */
- From the PMD project can detect duplicated code throughout your code base
} else {
if (*s == '\\') {
if (is_escaped) {
match.push_back('\\');
is_escaped = false;
} else {
is_escaped = true;
}
} else {
if (is_escaped) {
switch (*s) {
case ('b') : match.push_back('\b'); break;
case ('f') : match.push_back('\f'); break;
case ('n') : match.push_back('\n'); break;
case ('r') : match.push_back('\r'); break;
case ('t') : match.push_back('\t'); break;
case ('\'') : match.push_back('\''); break;
case ('\"') : match.push_back('\"'); break;
case ('$') : match.push_back('$'); break;
#include <vector>
template<typename T>
T add(T t, T u)
{
return t + u;
}
template<typename ... T>
std::vector<int> add_values(int value, T ... t)
{
return {add(t, value)...};
}
int main()
{
add_values(4);
}
#include <vector>
template<typename T>
T add(T t, T u)
{
return t + u;
}
template<typename ... T>
std::vector<int> add_values(int value, T ... t)
{
return {add(t, value)...}; // msvc complains `value` is unused in 0th case
}
int main()
{
add_values(4);
}
ChaiScript passed all of these analyzers (plus some free trials of commercials ones).
. . .
And then we discovered the sanitizers.
. . .
And a few more issues were found.
. . .
And then we discovered fuzzy testing
. . .
And MANY more issues were found.
bool Switch() {
bool retval = false;
size_t prev_stack_top = m_match_stack.size();
if (Keyword("switch")) {
retval = true;
if (!Char('(')) {
throw exception::eval_error("Incomplete 'switch' expression", File_Position(m_line, m_col), *m_filename);
/* snip */
while (Eol()) {}
if (Char('{')) {
retval = true;
/* snip */
}
build_match(AST_NodePtr(new eval::Switch_AST_Node()), prev_stack_top);
}
return retval;
}
bool Switch() {
size_t prev_stack_top = m_match_stack.size();
if (Keyword("switch")) {
if (!Char('(')) {
throw exception::eval_error("Incomplete 'switch' expression", File_Position(m_line, m_col), *m_filename);
/* snip */
while (Eol()) {}
if (Char('{')) {
/* snip */
}
build_match(AST_NodePtr(new eval::Switch_AST_Node()), prev_stack_top);
return true;
} else {
return false;
}
}
- C style issues are largely a "solved problem"
- But there are still many ways to abuse current best practices
- Modern C++ and C++ >= 11 analysis checking still has a long way to go
- You must use a combination of compilers / analyzers
- C++ Core Guidelines will almost certainly start steering the course of analysis soon
- Consider
-Werror -Weverything
(with selective disables) on clang- Consider
/W3 /WX /analyze
(with selective disables) on MSVC- Consider building your own analysis with libclang or adding to cppcheck
- Consider adding to https://github.com/lefticus/AnalysisTestSuite where these tests and others live
- http://github.com/lefticus
- http://chaiscript.com - Open Session Thursday (tomorrow) morning at 9:00-9:30
- http://cppcast.com
- http://cppbestpractices.com
- Independent Contractor - Always looking for new clients