CPP / C++ and General Programming Ref Card

Modern C++ Style

Modern C++ code style based on C++11 and C++ core guidelines:

Miscellaneous

  • Prefer using nullptr instead of 0 or NULL for null pointers.
  • Prefer enum classes (scoped enums) to the old C-style enums, which are not strongly typed and are prone to name clashes.
  • Prefer defining type aliases with the “using” keyword instead of “typedef”.
  • To avoid unnecessary copies, pass large objects by reference, const reference or pointer instead of by value.
  • Prefer using C++ string (std::string or std::wstring) to C-string const char*, char* and so on.
  • Use standard STL containers std::vector, std::deque, std::map, std::unordered_map instead of custom non-standard containers.
  • Instead of using heap-allocated arrays (A* pArray = new A[10];) when the array size is not known at compile-time, use std::vector (most likely), std::deque, std::list or any other STL container for avoiding accidental memory leaks and boilerplate memory management code. Note: std::vector already wraps a heap-allocated C-array.
// Avoid 
typedef double Speed;
typedef double (* MathFunction) (double);

// Better 
using Speed = double;
using MathFunction = double (*) (double);
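
The nullptr recommendation in the list above can be illustrated with a small sketch. The overload set describe() is hypothetical and exists only for this example; it shows that the literal 0 selects an int overload, while nullptr can only select a pointer overload.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set, used only for illustration.
std::string describe(int)         { return "int overload"; }
std::string describe(const char*) { return "pointer overload"; }

// Returns true when each call picks the overload noted in the comments.
bool nullptrSelectsPointerOverload()
{
    // The literal 0 is an int first, so the int overload wins.
    const bool zeroIsInt = describe(0) == "int overload";
    // nullptr has type std::nullptr_t, which converts only to pointer types.
    const bool nullIsPtr = describe(nullptr) == "pointer overload";
    return zeroIsInt && nullIsPtr;
}
```

Calling nullptrSelectsPointerOverload() is expected to return true.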

Prefer using enum class to enums

Many hard-to-detect runtime errors can be avoided by using C++11 enum classes instead of plain enums, which convert implicitly to integers. Enum classes prevent these implicit-conversion bugs by yielding a compile-time error wherever a conversion is not made explicit with static_cast. Another problem of the old enums is that they are not scoped, which can lead to name clashes in large codebases.

Avoid:

enum ErrorCode {
   ErrorCode_OK,
   ErrorCode_SYSTEM_FAILURE, 
   ErrorCode_LOW_VOLTAGE, 
  ... 
  ..
};

ErrorCode error = ::getOperationStatus();

if(error == ErrorCode_OK){
  std::cout << "Proceed to next step" << "\n";
}

int x;
// Implicit conversion bug: this compiles without any diagnostic
x = error; 

Better:

enum class ErrorCode {
   OK,
   SYSTEM_FAILURE, 
   LOW_VOLTAGE, 
  ... 
  ..
};

ErrorCode error = ::getOperationStatus();

if(error == ErrorCode::OK){
  std::cout << "Proceed to next step" << "\n";
}
 
int x; 
// Compile-time error !
//-------------------
 x = error; 

// Conversion only possible with static cast 
x = static_cast<int>(error);

Function Parameter Passing:

  • Prefer passing parameters by value, reference or const reference rather than by pointer, as is common in old C++ code that looks like C with classes.

Avoid:

double vector_norm(const Vector* vec)
{
  // ... compute Euclidean norm ... 
   return value;
}

Better:

double vector_norm(Vector const& vec)
{
  // ... compute Euclidean norm ... 
   return value;
}

Function Parameter Passing of Polymorphic Objects

  • For general use, pass polymorphic objects by pointer (T*) or reference (T& or const T&) rather than by smart pointer. Functions that accept a reference or pointer are more flexible than functions that accept smart pointers. (Core Guideline F.7)

Example: Class hierarchy.

 class Shape{
    public: 
      virtual double       Area() const = 0;
      virtual std::string  Name() const = 0;
      virtual ~Shape() = default;
 };

class Square: public Shape  {  ...   };
class Circle: public Shape  {  ...  };

std::unique_ptr<Shape> shapeFactory(std::string const& name)
{
   if(name == "square") return std::make_unique<Square>();
   if(name == "circle") return std::make_unique<Circle>();
   return nullptr;
}

Avoid:

// Avoid: 
void printShapeInfo(std::unique_ptr<Shape> const& shape)
{
  std::cout << "The shape name is " << shape->Name()  
            << " ; area is "        << shape->Area() << "\n" ;
}
 
// Or: 
void printShapeInfo(std::shared_ptr<Shape> const& shape)
{
  std::cout << "The shape name is " << shape->Name()  
            << " ; area is "        << shape->Area() << "\n" ;
}

Better:

  • The previous functions only work with smart pointers; the following functions, which take a reference or a pointer, work with both smart pointers and stack-allocated objects.
void printShapeInfoA(Shape const& shape)
{
  std::cout << "The shape name is " << shape.Name()  
            << " ; area is "        << shape.Area() << "\n" ;
}

// If the function can accept a no-shape parameter, better use pointer: 
void printShapeInfoB(Shape const* pShape)
{
  if(pShape == nullptr)
     return; // Do nothing.
  std::cout << "The shape name is " << pShape->Name()  
            << " ; area is " << pShape->Area() << "\n" ;
}

Square shapeStack;
std::unique_ptr<Shape> shapeHeap = shapeFactory("square");
printShapeInfoA(shapeStack);
printShapeInfoA(*shapeHeap);

printShapeInfoB(&shapeStack);
printShapeInfoB(shapeHeap.get());

Function Return Value

Much old C++ code avoided returning large objects by value due to the copy-constructor overhead in C++98. In such code, functions returned the result by writing to a parameter passed by pointer or reference.

Old C++: (Pre C++11 or C++98)

  • Code afraid of returning by value due to the copy overhead.
// Code afraid of returning by value: the result is returned through an output parameter. 
void sum(std::vector<double> const& xs, std::vector<double> const& ys, std::vector<double>& result)
{  
    // Pre-condition 
    assert(xs.size() == ys.size() && xs.size() == result.size());
    for(size_t i = 0; i < xs.size(); i++)
       result[i] = xs[i] + ys[i];
}

// Usage: 
std::vector<double> xs;
std::vector<double> ys;
xs.reserve(3); // reserve() only allocates; resize(3) would first add three zeros
xs.push_back(1); xs.push_back(4); xs.push_back(5);
ys.reserve(3);
ys.push_back(6); ys.push_back(8); ys.push_back(9);

std::vector<double> result(xs.size());
sum(xs, ys, result);
DisplayResult(result);

Modern C++: (>= C++11)

  • Returning by value is safe and efficient thanks to the compiler's RVO (Return Value Optimization, a form of copy elision) and to move semantics (move constructor and move assignment operator), which eliminate the copy overhead of temporary objects. Since C++11, all STL containers implement the move member functions, which makes returning them by value cheap.
std::vector<double> 
sum(std::vector<double> const& xs, std::vector<double> const& ys)
{  
    // Pre-condition 
    assert(xs.size() == ys.size());
    std::vector<double> result(xs.size());
    for(size_t i = 0; i < xs.size(); i++)
       result[i] = xs[i] + ys[i];

   // Copy may not happen due to move semantics (move member functions)
   // and/or Return-Value Optimization.
   return result;
}

// Usage: 
//----------------------------------//

// Uniform initialization with initializer list 
std::vector<double> xs {1, 4, 5};
std::vector<double> ys = {6, 8, 9};

std::vector<double> result = sum(xs, ys);
// Or:
auto result  = sum(xs, ys);
DisplayResult(result);    
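
One way to check the claim above is to compare the address of the heap buffer inside the function with the one seen by the caller. The sketch below is only illustrative: the names sumVectors, g_bufferInsideFunction and returnedWithoutCopy are assumptions made for this example, and the global variable exists solely to record the address.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Address of the buffer owned by the local vector inside sumVectors();
// recorded only so that the caller can compare it afterwards.
static const double* g_bufferInsideFunction = nullptr;

std::vector<double>
sumVectors(std::vector<double> const& xs, std::vector<double> const& ys)
{
    assert(xs.size() == ys.size());
    std::vector<double> result(xs.size());
    for (std::size_t i = 0; i < xs.size(); i++)
        result[i] = xs[i] + ys[i];
    g_bufferInsideFunction = result.data();
    return result; // NRVO or move: the heap buffer is not duplicated
}

// True when the caller's vector owns the very same heap buffer.
bool returnedWithoutCopy()
{
    std::vector<double> out = sumVectors({1, 4, 5}, {6, 8, 9});
    return out.data() == g_bufferInsideFunction && out[0] == 7.0;
}
```

Whether the compiler elides the copy or falls back to the move constructor, the element buffer is handed over rather than duplicated, so returnedWithoutCopy() is expected to return true.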

Memory Ownership

Raw pointers should not own memory or be responsible for releasing it, because manual ownership is prone to memory leaks: a missing call to delete; an exception thrown before the delete; functions with early returns or multiple return paths; and unclear shared ownership of the heap-allocated memory.

Summary:

  • Avoid calling new and delete directly; instead use std::make_unique (C++14) and std::make_shared (C++11) from the header <memory>.
  • Avoid using raw pointers for memory ownership; use smart pointers instead.
  • Smart pointers should only manage heap-allocated objects (objects allocated dynamically at runtime), never stack-allocated ones.
  • Rule of thumb for choosing between std::unique_ptr and std::shared_ptr:
    • If the heap-allocated object has a single owner, the best choice is std::unique_ptr.
    • If more than one object needs to refer to some heap-allocated object during its entire lifetime, the best choice is std::shared_ptr.

Avoid:

Shape* shapeFactory(std::string const& name)
{
   // WARNING: new operator can throw std::bad_alloc 
   if(name == "square") return new Square();
   if(name == "circle") return new Circle();
   return nullptr;
} 

void clientCode(Shape* sh){
   std::cout << "Name = " << sh->Name() << " ; Area = " << sh->Area() << "\n";
}

// Usage: 
//-------------------------------
Shape* shape = shapeFactory("square"); 
clientCode(shape); 

// If an exception is thrown before this point => memory leak!
// If the delete is forgotten => memory leak!
delete shape;

Better:

  • Note: A factory function, or any function returning a polymorphic object, should preferably return a unique_ptr instead of a shared_ptr, because unique_ptr has lower overhead and converting a unique_ptr to a shared_ptr is easy, whereas a shared_ptr cannot be converted back to a unique_ptr.
 std::unique_ptr<Shape> 
 shapeFactory(std::string const& name)
 {
    // NOTE: std::make_unique can still throw std::bad_alloc 
    if(name == "square") return std::make_unique<Square>(300 ,400);
    if(name == "circle") return std::make_unique<Circle>();
    return nullptr;
 }

 void clientCode(Shape const& sh){
    std::cout << "Name = " << sh.Name() << " ; Area = " << sh.Area() << "\n";
 } 

// Usage: 

// Releases the allocated memory automatically when it goes out of scope. 
std::unique_ptr<Shape> shape = shapeFactory("square"); 

// Or: 
auto shape = shapeFactory("square"); 
clientCode(*shape);
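
The conversion from unique_ptr to shared_ptr mentioned in the note above can be sketched as follows. The reduced Shape/Square hierarchy and the functions makeSquare and shareCount are stand-ins invented for this example, and std::make_unique assumes C++14 or later.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Reduced stand-in for the Shape hierarchy, for illustration only.
struct Shape {
    virtual std::string Name() const = 0;
    virtual ~Shape() = default;
};
struct Square : Shape {
    std::string Name() const override { return "square"; }
};

// Factory returning exclusive ownership.
std::unique_ptr<Shape> makeSquare()
{
    return std::make_unique<Square>();   // requires C++14
}

// Converts exclusive ownership into shared ownership and reports
// the resulting reference count.
long shareCount()
{
    std::unique_ptr<Shape> owner  = makeSquare();
    std::shared_ptr<Shape> first  = std::move(owner); // owner becomes null
    std::shared_ptr<Shape> second = first;            // second shared owner
    return owner == nullptr ? first.use_count() : -1;
}
```

The caller decides at the call site whether shared ownership is needed, which is why returning unique_ptr from the factory keeps both options open.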

Standard Library Reference Card

STL Components

  • Containers - standard collections or data structures. They are a fundamental building block of most programming languages; in C++ the additional benefit is that most of them abstract away memory allocation, as they can grow or shrink at runtime.
    • Sequential
      • vector
      • deque
      • array
      • list
      • forward list
      • valarray - numeric array from the header <valarray>, intended to provide Fortran-like element-wise arithmetic for linear algebra. It is not deprecated, but it is not a general-purpose container and is rarely used in practice.
    • Associative
      • Ordered Associative Container
        • map - key-value data structure, comparable to the dictionary or hash map of other languages. A map always has unique keys, which are kept sorted.
        • set - A set is a data structure which cannot have any repeated values.
        • multimap - A multimap can have repeated keys.
        • multiset
      • Unordered Associative Containers
        • unordered_map
        • unordered_set
  • Iterators
  • Algorithms
  • Adapters
    • Queue
    • Stack
  • Functors - Function objects, or objects that can be called like a function. Functors have several use cases in the STL: many STL containers and algorithms expect functors as arguments or optional arguments, and the STL also provides many standard functors in the header <functional>.
  • Allocators


STL Sequential Container Methods - Cheat Sheet

Use Cases

  • vector
    • Suited to operations that need constant-time random access to any element, particularly when the size is known in advance. Example of use case: linear algebra and numerical algorithms. Insertion at the end is efficient (amortized constant time), but insertion at the front is slow because every existing element must be shifted. The elements are stored in a contiguous heap-allocated C-array; whenever the capacity is exceeded, a larger array is allocated and all elements are moved to it. If the number of elements is known beforehand, calling reserve() avoids these reallocations.
    • Use cases:
      • General sequential container
      • Linear algebra and numerical algorithms
      • C++ replacement for C-arrays
      • C-arrays interoperability
  • deque
    • Suited to operations which require fast random access and fast insertion or deletion of elements at both ends. Unlike a vector, a deque is not stored internally as a single C-array, and inserting an element at either end never reallocates the existing elements, which means that deques can be more efficient than vectors when the size of the container is not known in advance.
    • Use Case:
      • General sequential container
      • Fast random access
      • The number of elements isn’t known in advance.
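
The difference between the two containers can be observed with a small sketch. The helper names countReallocations and buildDeque are made up for this example; the first counts how often a vector changes capacity while growing, the second shows insertion at both ends of a deque.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Counts how many times a vector moves to a new buffer while n elements
// are appended. With reserveFirst == true the count is expected to be zero.
std::size_t countReallocations(std::size_t n, bool reserveFirst)
{
    std::vector<int> v;
    if (reserveFirst) v.reserve(n);
    std::size_t reallocations = 0;
    std::size_t lastCapacity  = v.capacity();
    for (std::size_t i = 0; i < n; i++) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {
            reallocations++;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}

// A deque grows at both ends without relocating the existing elements.
std::deque<int> buildDeque()
{
    std::deque<int> d;
    d.push_back(2);
    d.push_front(1);   // not available on std::vector
    d.push_back(3);
    return d;          // holds 1, 2, 3
}
```

The exact number of reallocations without reserve() depends on the growth policy of the standard library implementation, so only the reserved case has a guaranteed count of zero.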

Member Functions / Methods reference table

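| Method of Container<T> | Return type | Description | vector | deque | list | array |
|---+---+---+---+---+---+---|
| Element Access | | | | | | |
| operator[](size_t n) | T& | Return nth element without bounds checking; doesn’t throw exception. | yes | yes | no | yes |
| at(size_t n) | T& | Return nth element; throws std::out_of_range if n is out of bounds. | yes | yes | no | yes |
| front() | T& | Return first element. | yes | yes | yes | yes |
| back() | T& | Return last element. | yes | yes | yes | yes |
| data() | T* | Return pointer to first element of container. | yes | no | no | yes |
| Capacity | | | | | | |
| size() | size_t | Return number of container elements. | yes | yes | yes | yes |
| max_size() | size_t | Return maximum container size. | yes | yes | yes | yes |
| empty() | bool | Return true if container is empty. | yes | yes | yes | yes |
| reserve(size_t n) | void | Reserve a minimum storage for vectors. | yes | no | no | no |
| resize(size_t n) | void | Resize container to n elements. | yes | yes | yes | no |
| Modifiers | | | | | | |
| push_back(T t) | void | Add element at the end of container. | yes | yes | yes | no |
| push_front(T t) | void | Add element at the beginning of container. | no | yes | yes | no |
| pop_back() | void | Delete element at the end of container. | yes | yes | yes | no |
| pop_front() | void | Delete element at beginning of container. | no | yes | yes | no |
| emplace_back(Args&&... args) | void (T& since C++17) | Construct and insert element at the end without copying. | yes | yes | yes | no |
| clear() | void | Remove all elements. | yes | yes | yes | no |
| fill(T t) | void | Fill all elements. | no | no | no | yes |
| Iterator | | | | | | |
| begin() | iterator | Return iterator to beginning. | yes | yes | yes | yes |
| end() | iterator | Return iterator to end (one past the last element). | yes | yes | yes | yes |
| rbegin() | reverse_iterator | Return reverse iterator to reverse beginning (last element). | yes | yes | yes | yes |
| rend() | reverse_iterator | Return reverse iterator to reverse end. | yes | yes | yes | yes |
| cbegin() | const_iterator | Return const iterator to beginning. | yes | yes | yes | yes |
| cend() | const_iterator | Return const iterator to end. | yes | yes | yes | yes |
| crbegin() | const_reverse_iterator | Return const reverse iterator to reverse beginning. | yes | yes | yes | yes |
| crend() | const_reverse_iterator | Return const reverse iterator to reverse end. | yes | yes | yes | yes |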
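
A few rows of the table can be exercised with the following sketch. The helper names (atThrowsOutOfRange, sumRaw, sumFilledArray) are hypothetical and only serve to show bounds-checked access with at(), raw access through data(), and fill() on std::array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true when at() rejects the index by throwing std::out_of_range.
bool atThrowsOutOfRange(const std::vector<int>& v, std::size_t index)
{
    try {
        (void) v.at(index);      // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// data() exposes the contiguous storage, e.g. for C APIs taking (ptr, size).
int sumRaw(const int* values, std::size_t n)
{
    int total = 0;
    for (std::size_t i = 0; i < n; i++) total += values[i];
    return total;
}

int sumFilledArray()
{
    std::array<int, 4> arr;
    arr.fill(5);                 // std::array only: set every element
    return sumRaw(arr.data(), arr.size());
}
```

Note that operator[] with an out-of-range index is undefined behavior rather than a reported error, which is why the sketch only probes at().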

Constructors

Vector constructors:

// Empty vector 
>> std::vector<double> xs1
(std::vector<double> &) {}

// Initialize vector with a given size and fill value
>> std::vector<double> xs2(5, 3.0)
(std::vector<double> &) { 3.0000000, 3.0000000, 3.0000000, 3.0000000, 3.0000000 }

// Constructor with uniform initialization 
>> std::vector<double> xs4 {1.0, -2.0, 1.0, 10 }
(std::vector<double> &) { 1.0000000, -2.0000000, 1.0000000, 10.000000 }

// =========== Constructors with C++11 auto keyword =============//

>> auto xs1 = vector<double>()
(std::vector<double, std::allocator<double> > &) {}
>> 
>> auto xs2 = vector<double>(5, 3.0)
(std::vector<double, std::allocator<double> > &) { 3.0000000, 3.0000000, 3.0000000, 3.0000000, 3.0000000 }
>> 
>> auto xs3 = vector<double>{1, -2, 1, 1}
(std::vector<double, std::allocator<double> > &) { 1.0000000, -2.0000000, 1.0000000, 1.0000000 }
>> 

Deque constructors:

>> std::deque<int> ds1
(std::deque<int> &) {}
>> 
>> std::deque<int> ds2(5, 2)
(std::deque<int> &) { 2, 2, 2, 2, 2 }
>> 
>> std::deque<int> ds3 {2, -10, 20, 100, 20}
(std::deque<int> &) { 2, -10, 20, 100, 20 }
>> 
// ======== Constructors with auto type inference ========== //
>> auto ds1 = std::deque<int>()
(std::deque<int, std::allocator<int> > &) {}
>> 
>> auto ds2 = std::deque<int>(5, 2)
(std::deque<int, std::allocator<int> > &) { 2, 2, 2, 2, 2 }
>> 
>> auto ds3 = std::deque<int>{2, -10, 20, 100, 20}
(std::deque<int, std::allocator<int> > &) { 2, -10, 20, 100, 20 }
>> 

Tips and tricks

Pass containers by reference or const reference

If the operation is not intended to modify the container, it is preferable to pass it by const reference in order to avoid the copying overhead.

For instance, the function:

double computeNorm(std::vector<double> xs)
{
 // The vector xs is copied here, if it has 1GB of memory.
 // It will use 2GB instead of 1GB!
  ... ... 
}

Should be written as:

double computeNorm(const std::vector<double>& xs)
{
  ... ... 
}
double computeNorm(const std::list<double>& xs)
{
  ... ... 
}
double computeNorm(const std::deque<double>& xs)
{
  ... ... 
}
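
The three overloads above differ only in the container type, so one possible way to collapse them is a function template. This is a sketch under the assumption that the elements are doubles and that the Euclidean norm is wanted; the body is not taken from the original text.

```cpp
#include <cassert>
#include <cmath>
#include <deque>
#include <list>
#include <vector>

// Works with any container of doubles that supports range-based for.
// The container is taken by const reference, so no copy is made.
template <typename Container>
double computeNorm(const Container& xs)
{
    double sumOfSquares = 0.0;
    for (double x : xs)
        sumOfSquares += x * x;
    return std::sqrt(sumOfSquares);
}
```

For instance, computeNorm(std::vector<double>{3.0, 4.0}) and computeNorm(std::list<double>{3.0, 4.0}) both evaluate to 5.0.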

Use the member function emplace_back to avoid unnecessary copies.

Example:

  • file: stl-emplace.cpp
#include <iostream>
#include <ostream>
#include <iomanip>
#include <string>
#include <vector>
#include <deque>

struct Product{
        std::string  name;	
        int          quantity;
        double       price;
        Product(){
                std::cerr << " [TRACE] - Empty constructor invoked\n";
        }
        Product(const std::string& name, int quantity, double price):
                name(name),
                quantity(quantity),
                price(price){
                std::cerr << " [TRACE] - Product created as " << *this << "\n" ;
        }
        // The compiler generates a copy constructor automatically,
        // but this one was written to instrument C++ value semantics
        // and check when copies happen.
        Product(const Product& p){
                this->name		= p.name;
                this->quantity	= p.quantity;
                this->price		= p.price;
                std::cerr << " [TRACE] Copy constructor invoked -> copied = " << *this << "\n";
        }
        // Copy assignment-operator (returns *this so assignments can be chained)
        Product& operator=(const Product& p){
                this->name      = p.name;
                this->quantity  = p.quantity;
                this->price     = p.price;
                std::cerr << " [TRACE] Copy assignment operator invoked = " << *this << "\n";
                return *this;
        }
        // Make class printable 
        friend std::ostream& operator<< (std::ostream& os, const Product& p)
        {
                int size1 = 10;
                int size2 = 2;
                return os << " Product{ "
                                  << std::setw(1) << " name = "       << p.name
                                  << std::setw(10) << "; quantity  = "  << std::setw(size2) << p.quantity
                                  << std::setw(size1) << "; price = "      << std::setw(size2) << p.price
                                  << " }";
        }
};


int main(){
        auto inventory = std::deque<Product>();

        // Using push_back
        std::cerr << "====== Experiment .push_back() ======\n";
        std::cerr << " [INFO] - Adding orange with .push_back\n";
        inventory.push_back(Product("Orange - 1kg", 10, 3.50));
        std::cerr << " [INFO] - Adding rice with .push_back \n";
        inventory.push_back({"Rice bag", 20, 0.80});

        // Using emplace_back
        std::cerr << "====== Experiment .emplace_back() ======\n";	
        std::cerr << " [INFO] - Adding apple with .emplace_back \n";
        inventory.emplace_back("Fresh tasty apple", 50, 30.25);
        std::cerr << " [INFO] - Adding soft drink with .emplace_back \n";
        inventory.emplace_back("Soft drink", 100, 2.50);

        std::cerr << " ====== Inventory =======\n";
        // Print inventory
        int nth = 0;
        for(const auto& p: inventory){
                std::cout << "product " << nth << " = " << p << "\n";
                nth++;
        }	
        return 0;
}

Running:

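  • The program output shows that .emplace_back doesn’t invoke the copy constructor, because it constructs the element in place from the given arguments. Here .push_back copies the temporary because Product declares a copy constructor and therefore has no implicit move constructor; with a move constructor available, .push_back would move the temporary instead.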
$ clang++ stl-emplace.cpp -o stl-emplace.bin -g -std=c++11 -Wall -Wextra && ./stl-emplace.bin

====== Experiment .push_back() ======
 [INFO] - Adding orange with .push_back
 [TRACE] - Product created as  Product{  name = Orange - 1kg; quantity  = 10; price = 3.5 }
 [TRACE] Copy constructor invoked -> copied =  Product{  name = Orange - 1kg; quantity  = 10; price = 3.5 }
 [INFO] - Adding rice with .push_back 
 [TRACE] - Product created as  Product{  name = Rice bag; quantity  = 20; price = 0.8 }
 [TRACE] Copy constructor invoked -> copied =  Product{  name = Rice bag; quantity  = 20; price = 0.8 }
====== Experiment .emplace_back() ======
 [INFO] - Adding apple with .emplace_back 
 [TRACE] - Product created as  Product{  name = Fresh tasty apple; quantity  = 50; price = 30.25 }
 [INFO] - Adding soft drink with .emplace_back 
 [TRACE] - Product created as  Product{  name = Soft drink; quantity  = 100; price = 2.5 }
 ====== Inventory =======
product 0 =  Product{  name = Orange - 1kg; quantity  = 10; price = 3.5 }
product 1 =  Product{  name = Rice bag; quantity  = 20; price = 0.8 }
product 2 =  Product{  name = Fresh tasty apple; quantity  = 50; price = 30.25 }
product 3 =  Product{  name = Soft drink; quantity  = 100; price = 2.5 }

Methods of C++ STL Vector<T>

| Vector Class Member | Description |
|---+---|
| Constructors | |
| vector<a>(int size) | Create a vector with size elements. |
| vector<a>(int size, a init) | Create a vector with size elements, all set to init. |
| vector<a>(a* first, a* last) | Initialize vector from a C-array through the iterator-range constructor. |
| Methods | |
| vector<a>[i] | Get the element i of a vector. i ranges from 0 to size - 1; no bounds checking. |
| size_t vector<a>::size() | Get vector size. |
| a& vector<a>::at(i) | Get the element i of a vector and check that the index is within bounds. |
| bool vector<a>::empty() | Returns true if vector is empty, false otherwise. |
| void vector<a>::resize(int N) | Resize vector to N elements. |
| void vector<a>::clear() | Remove all elements and set the vector size to 0. |
| void vector<a>::push_back(a elem) | Insert element at the end of the vector. |
| iterator vector<a>::begin() | Returns an iterator to the first element. |
| iterator vector<a>::end() | Returns an iterator to one past the last element. |
| void vector<a>::pop_back() | Remove last element of vector. |
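
The distinction between iterators and elements, and the construction of a vector from a C-array, can be sketched as below. The function names fromCArray and iteratorsVersusElements are assumptions made for this example.

```cpp
#include <cassert>
#include <vector>

// Builds a vector from a C-array through the iterator-range constructor.
std::vector<int> fromCArray()
{
    const int raw[] = {10, 20, 30};
    return std::vector<int>(raw, raw + 3);
}

// begin()/end() are iterators (positions); front()/back() are elements.
bool iteratorsVersusElements()
{
    std::vector<int> v = fromCArray();
    return *v.begin() == v.front()      // first element
        && *(v.end() - 1) == v.back()   // end() is one past the last element
        && v.end() - v.begin() == 3;    // distance equals size()
}
```

Dereferencing end() itself is undefined behavior, which is why the sketch steps back by one position before reading the last element.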

Associative Container - Map methods

A map is a key-value data structure similar to the dictionary or hash map of other languages. However, std::map is not implemented as a hash table: its entries are kept sorted by key, typically in a balanced binary search tree, which gives logarithmic-time lookup, insertion and removal. std::unordered_map, which is implemented as a true hash table, offers constant-time operations on average, so it is often the better choice when sorted keys are not needed.

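| Method of map<K, V> | Return type | Description |
|---+---+---|
| Capacity | | |
| empty() | bool | Return true if container is empty. |
| size() | size_t | Return number of elements. |
| max_size() | size_t | Return maximum number of elements. |
| Element Access | | |
| operator[](K k) | V& | Return value associated to key k. It doesn’t throw exception; a missing key is inserted with a default-constructed value. |
| at(K k) | V& | Return value associated to key k. Note: it throws std::out_of_range if the key is missing. |
| find(const K& k) | iterator | Search for an element; returns map::end if it doesn’t find the given key. |
| count(const K& k) | size_t | Count number of elements with a given key (0 or 1 for a map). |
| Modifiers | | |
| clear() | void | Remove all elements. |
| insert(std::pair<K, V> pair) | pair<iterator, bool> | Insert a new key-value pair; the bool is false if the key already existed. |
| emplace(Args&&... args) | pair<iterator, bool> | Construct a key-value pair in place from args; the bool is false if the key already existed. |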
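
The return values of insert and emplace, and the side effect of operator[], are easy to overlook. The following sketch, built around a hypothetical function mapInsertionDemo, checks each behavior listed in the table.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Walks through the modifiers and returns the final number of entries.
std::size_t mapInsertionDemo()
{
    std::map<std::string, int> m;

    auto first = m.insert({"x", 1});      // pair<iterator, bool>
    assert(first.second);                 // key was new: inserted

    auto second = m.emplace("x", 99);     // same key: nothing is inserted
    assert(!second.second && second.first->second == 1);

    // operator[] default-constructs a value for a missing key.
    int created = m["y"];
    assert(created == 0);

    // find() and count() never insert.
    assert(m.find("z") == m.end() && m.count("z") == 0);

    return m.size();                      // entries "x" and "y"
}
```

Because operator[] may insert, it cannot be used on a const map; at() or find() are the read-only alternatives.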

Map example:

  • File: map-container.cpp
#include<iostream>
#include<string>
#include<map>
#include <iomanip>

struct Point3D{
        double x;
        double y;
        double z;
        Point3D(): x(0), y(0), z(0){}
        Point3D(double x, double y, double z): x(x), y(y), z(z){}
        /* Copy constructor 
     * -> Implement redundant copy constructor for logging purposes and 
     * detect when copy happens. 
     */
        Point3D(const Point3D& p){		
                std::cerr << " I was copied" << std::endl;
                this->x = p.x;
                this->y = p.y;
                this->z = p.z;
        }
        ~Point3D() = default;
};

std::ostream& operator<< (std::ostream& os, const Point3D& p){
        os << std::setprecision(3) << std::fixed;
        return os << "Point3D{"
                          << "x = "  << p.x
                          << ",y = " << p.y
                          << ", z = "<< p.z
                          << "}";
}

int main(){	
        auto locations = std::map<std::string, Point3D>();
        locations["point1"] = Point3D(2.0, 3.0, 5.0);
        locations["pointX"] = Point3D(12.0, 5.0, -5.0);
        locations["pointM"] =  {121.0, 4.0, -15.0};
        locations["Origin"] = {}; // Point3D{} or Point3D()
	
        // Invokes copy constructor
        std::cerr << "  <== Before inserting" << "\n";
        locations.insert(std::pair<std::string, Point3D>("PointO1", Point3D(0.0, 0.0, 0.0)));
        std::cerr << "  <== After inserting" << "\n";
	
        // operator[] doesn't throw exception (a missing key would be inserted)
        std::cout << "point1 = " << locations["point1"] << "\n";
        std::cout << "pointX = " << locations.at("pointX") << "\n";
        std::cout << "pointM = " << locations.at("pointM") << "\n";

        // Safer and uses exception 
        try {
                std::cout << "pointY = " << locations.at("pointY") << "\n";
        } catch(const std::out_of_range& ex){
                std::cout << "Error - not found element pointY. MSG = " << ex.what() << "\n";
        }

        if(auto it = locations.find("pointX"); it != locations.end())
                std::cout << " [INFO]= => Location pointX found =  " << it->second << "\n";

        if(locations.find("pointMAS") == locations.end())
                std::cout << " [ERROR] ==> Location pointMAS  not found" << "\n";
	
        std::cout << "Key-Value pairs " << "\n";
        std::cout << "-------------------------" << "\n";
        for (const auto& x: locations)
                std::cout << x.first << " : " << x.second << "\n";
        std::cout << '\n';

        return 0;
}

Running:

$ clang++ map-container.cpp -o map-container.bin -std=c++1z -Wall -Wextra  && ./map-container.bin

  <== Before inserting
 I was copied
 I was copied
  <== After inserting
point1 = Point3D{x = 2.000,y = 3.000, z = 5.000}
pointX = Point3D{x = 12.000,y = 5.000, z = -5.000}
pointM = Point3D{x = 121.000,y = 4.000, z = -15.000}
pointY = Error - not found element pointY. MSG = map::at
 [INFO]= => Location pointX found =  Point3D{x = 12.000,y = 5.000, z = -5.000}
 [ERROR] ==> Location pointMAS  not found
Key-Value pairs 
-------------------------
Origin : Point3D{x = 0.000,y = 0.000, z = 0.000}
PointO1 : Point3D{x = 0.000,y = 0.000, z = 0.000}
point1 : Point3D{x = 2.000,y = 3.000, z = 5.000}
pointM : Point3D{x = 121.000,y = 4.000, z = -15.000}
pointX : Point3D{x = 12.000,y = 5.000, z = -5.000}

Associative Container - Unordered map

The unordered map, introduced in C++11, is generally faster for insertion, lookup and deletion of elements, since it is implemented as a true hash table, unlike std::map, which is implemented as a tree. The downside of this data structure is that its elements are not kept sorted.

Benefits:

  • True hash table.
  • Faster for insertion, retrieval and removal of elements than the map.

Downsides:

  • Elements are not kept sorted by key; the iteration order is unspecified.
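
The ordering trade-off can be made concrete with a short sketch. The helper keysInIterationOrder and the two wrapper functions are invented for this example; they only compare the iteration order of std::map with that of std::unordered_map.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collects the keys of any map-like container in iteration order.
template <typename Map>
std::vector<std::string> keysInIterationOrder(const Map& m)
{
    std::vector<std::string> keys;
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}

std::vector<std::string> sortedKeysOfMap()
{
    std::map<std::string, int> ordered {{"z", 500}, {"x", 200}, {"w", 10}};
    return keysInIterationOrder(ordered);   // always w, x, z
}

std::size_t unorderedKeyCount()
{
    std::unordered_map<std::string, int> hashed {{"z", 500}, {"x", 200}, {"w", 10}};
    // Same three keys, but their iteration order is unspecified.
    return keysInIterationOrder(hashed).size();
}
```

Code that needs a stable, sorted traversal should therefore keep using std::map or sort the keys explicitly.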

Example:

Constructors:

std::unordered_map<std::string, int> m1;

auto m2 = std::unordered_map<std::string, int>{};

// Uniform initialization 
//--------------------------
>> std::unordered_map<std::string, int> m3 {{"x", 200}, {"z", 500}, {"w", 10}, {"pxz", 70}}
 { "pxz" => 70, "w" => 10, "z" => 500, "x" => 200 }

//  More readable 
>> auto m4 = std::unordered_map<std::string, int> {{"x", 200}, {"z", 500}, {"w", 10}, {"pxz", 70}}
 { "pxz" => 70, "w" => 10, "z" => 500, "x" => 200 }

Insert Elements:

>> auto m = std::unordered_map<std::string, int>{}

>> m["x"] = 100
(int) 100
>> m["x"] = 100;
>> m["z"] = 5;
>> m["a"] = 6710;
>> m["hello"] = -90;
>> m["sword"] = 190;

>> m
{ "sword" => 190, "hello" => -90, "a" => 6710, "x" => 100, "z" => 5 }

Insert element using std::pair:

>> auto mm = std::unordered_map<std::string, int>{};

>> mm.insert(std::make_pair("x", 200));
>> mm.insert(std::make_pair("z", 500));
>> mm.insert(std::make_pair("w", 10));

>> mm["x"]
(int) 200
>> mm["w"]
(int) 10
>> 

Number of elements:

>> m.size()
(unsigned long) 5
>>

Retrieve elements:

>> m["x"]
(int) 100
>> m["sword"]
(int) 190
>>
// operator[] doesn't throw if the key is not found: it inserts the key
// with a default-constructed value (0 for int) and returns that value.
>> m["sword-error"]
(int) 0
>> 

// Throw exception if element is not found
>> m.at("x")
(int) 100
>> m.at("sword")
(int) 190
>> m.at("sword error")
Error in <TRint::HandleTermInput()>: std::out_of_range caught: _Map_base::at
>> 
>> 

Find element:

// -------- Test 1 -----------//
auto it = m.find("sword");
if(it != m.end()) {
        std::cout << "Found Ok. => {"
                  << "key = " << it->first
                  << " ; value = " << it->second
                  << " }"
                  << "\n";
	
} else {
        std::cout << "Error: key not found." << "\n";
}
// Output: 
Found Ok. => {key = sword ; value = 190 }
>> 

// -------- Test 2 -----------//

auto it = m.find("this key will not be found!");
if(it != m.end()) {
     std::cout << "Found Ok. => {"
               << "key = "      << it->first
               << " ; value = " << it->second
               << " }"
               << "\n";
} else {
    std::cout << "Error: key not found." << "\n";
}
// ----- Output: ----------//
Error: key not found.
>> 

Loop over container elements:

for(const auto& p: m) {
         std::cout << std::setw(5) << "key = " << std::setw(6) << p.first
                   << std::setw(8) << " value = " << std::setw(5) << p.second
                   << "\n";
}

// Output: 
key =  sword value =   190
key =  hello value =   -90
key =      a value =  6710
key =      x value =   100
key =      z value =     5

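Loop with iterators and the STL algorithm std::for_each:

std::for_each(m.begin(), m.end(),
               // The key type of the element is const; a parameter of type
               // std::pair<std::string, int> would compile but copy each element.
               [](const std::pair<const std::string, int>& p){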
                       std::cout << std::setw(5)  << p.first
                                 << std::setw(10) << p.second
                                 << "\n";									  
               });
// Output:
sword       190
hello       -90
    a      6710
    x       100
    z         5

Associative Container - Multimap

The container std::multimap is similar to map, however it allows repeated keys.

Header: <map>

Examples:

  • Initialize std::multimap
#include <iostream>
#include <string>
#include <map>

std::multimap<std::string, int> dict;

>> dict
(std::multimap<std::string, int> &) {}
>> 

// Insert pair object 
dict.insert(std::make_pair("x", 100));
dict.insert(std::make_pair("status", 30));
dict.insert(std::make_pair("HP", 250));
dict.insert(std::make_pair("stamina", 100));
dict.insert(std::make_pair("stamina", 600));
dict.insert(std::make_pair("x", 10));
dict.insert(std::make_pair("x", 20));

>> dict
{ "HP" => 250, "stamina" => 100, "stamina" => 600, "status" => 30, "x" => 100, "x" => 10, "x" => 20 }
>> 

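Find all pairs with a given key

// Find elements: equal_range returns the bounds [first, second) of the
// elements with the given key. Iterating from dict.find("x") to dict.end()
// would also visit any keys sorted after "x".
>> auto range = dict.equal_range("x"); // Pair of iterators
>> 
for(auto it = range.first; it != range.second; it++){ 
  std::printf(" ==> it->first = %s ; it->second = %d\n", it->first.c_str(), it->second); 
}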
/** Output: 
  ==> it->first = x ; it->second = 100
  ==> it->first = x ; it->second = 10
  ==> it->first = x ; it->second = 20
 */

Count all elements with a given key

>> dict.count("x")
(unsigned long) 3

>> dict.count("stamina")
(unsigned long) 2

>> dict.count("HP")
(unsigned long) 1

>> dict.count("")
(unsigned long) 0

>> dict.count("wrong")
(unsigned long) 0
>> 

Iterate over multimap:

 for(const auto& pair : dict){ 
   std::printf(" ==> key = %s ; value = %d\n", pair.first.c_str(), pair.second); 
 }
 /** Output: 
    ==> key = HP ; value = 250
    ==> key = stamina ; value = 100
    ==> key = stamina ; value = 600
    ==> key = status ; value = 30
    ==> key = x ; value = 100
    ==> key = x ; value = 10
    ==> key = x ; value = 20
*/

Clear multimap object:

>> auto dict2 = std::multimap<std::string, int> { {"x", 100}, {"y", 10}, {"x", 500}, {"z", 5}};
>> dict2
 { "x" => 100, "x" => 500, "y" => 10, "z" => 5 }

>> dict2.size()
(unsigned long) 4

>> dict2.clear();

>> dict2
{}

>> dict2.size()
(unsigned long) 0

Associative Container - Sets

Set std::set is an associative container implementing the mathematical concept of finite set. This container stores sorted unique values and any attempt to insert a repeated value will discard the value to be inserted.

  • Header: <set>
  • Implementation: usually a balanced binary search tree (red-black tree).
  • Note: as this collection keeps its elements sorted, its unordered version, std::unordered_set, is often faster for insertion and lookup.
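
Whether an insertion was discarded can be detected from the return value of insert(). The wrapper insertUnique below is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <set>

// Returns true when the value was new, false when it was already present.
bool insertUnique(std::set<int>& s, int value)
{
    // insert() returns pair<iterator, bool>; .second reports whether
    // an insertion actually took place.
    return s.insert(value).second;
}
```

Inserting 20 twice into an empty set returns true the first time and false the second time, and the set ends up holding a single element.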

Example: Set constructors

  • Instantiate a set object with a default constructor (constructor with empty parameters):
#include <iostream> 
#include <string>
#include <set>

std::set<int> s1;

>> s1.insert(10);
>> s1.insert(20);
>> s1.insert(20);
>> s1.insert(30);
>> s1.insert(40);
>> s1
(std::set<int> &) { 10, 20, 30, 40 }
>> s1.insert(40);
>> s1
(std::set<int> &) { 10, 20, 30, 40 }
  • Instantiate a set with initializer list constructor:
>> auto s2 = std::set<std::string>{ 
    "hello", "c++", "c++", "hello", "world", "world", 
     "c++11", "c++", "c++17", "c++17"
   };
>> s2
{ "c++", "c++11", "c++17", "hello", "world" }
>> 

// Any repeated element is discarded 
>> s2.insert("c++");
>> s2
{ "c++", "c++11", "c++17", "hello", "world" }
  • Instantiate a set with range constructor or iterator pair constructor:
>> std::vector<int> numbers {-100, 1, 2, 10, 2, 1, 3, 15, 3, 5, 4, 4, 3, 3, 2};

>> std::set<int> sa1(numbers.begin(), numbers.end());
>> sa1
(std::set<int> &) { -100, 1, 2, 3, 4, 5, 10, 15 }


>> auto sa2 = std::set<int>{numbers.begin() + 4, numbers.end() - 2};
>> sa2
{ 1, 2, 3, 4, 5, 15 }
  • Instantiate a set with copy constructor.
    • std::set<T>(const std::set<T>&)
>> std::set<int> xs{1, 1, 10, 1, 2, 5, 10, 4, 4, 5, 1};
>> xs
{ 1, 2, 4, 5, 10 }

>> std::set<int> copy1(xs);
>> copy1
(std::set<int> &) { 1, 2, 4, 5, 10 }

>> auto copy2 = xs;
>> copy2
{ 1, 2, 4, 5, 10 }

>> auto copy3 = std::set<int>{xs};
>> copy3
{ 1, 2, 4, 5, 10 }

>> if(&copy1 != &xs){ std::puts(" => Not the same"); }
 => Not the same

>> if(&copy2 != &xs){ std::puts(" => Not the same"); }
 => Not the same

>> if(&copy3 != &xs){ std::puts(" => Not the same"); }
 => Not the same
  • Instantiating a set with a move constructor.
    • std::set<T>(std::set<T>&&)
>> std::set<int> xs1{1, 1, 10, 1, 2, 5, 5, 6, 10, 4, 4, 5, 1, 6, 7, 7};

>> xs1
{ 1, 2, 4, 5, 6, 7, 10 }

// Move constructor:  
>> std::set<int> m1(std::move(xs1));
>> m1
(std::set<int> &) { 1, 2, 4, 5, 6, 7, 10 }
>> xs1
(std::set<int> &) {}
>>

>> std::set<int> xs2{1, 1, 10, 1, 2, 5, 5, 6, 10, 4, 4, 5, 1, 6, 7, 7};
>> xs2
(std::set<int> &) { 1, 2, 4, 5, 6, 7, 10 }

// ========  Move constructor ===================
>> auto m2 = std::move(xs2);
>> m2
{ 1, 2, 4, 5, 6, 7, 10 }
>> xs2
(std::set<int> &) {}
>> 
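Since a repeated value is silently discarded, it can be useful to know whether an insertion actually happened. A minimal sketch, where the helper name insertUnique is hypothetical:

```cpp
#include <set>
#include <string>

// Hypothetical helper: returns true only if 'value' was newly added.
bool insertUnique(std::set<std::string>& s, const std::string& value) {
    // insert() returns std::pair<iterator, bool>; .second is false
    // when an equal element was already present.
    return s.insert(value).second;
}
```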

Operations on sets:

Instantiating sample set:

>> auto aset = std::set<int> {1, 1, 10, 1, 2, 5, 5, 6, 10, 4, 4, 5, 1, 6, 7, 7};
>> aset
{ 1, 2, 4, 5, 6, 7, 10 }   

Count number of elements:

>> aset.size()
(unsigned long) 7
>> 

Clear set (remove all elements):

>>  auto asetb = std::set<int> {1, 1, 10, 1, 2, 5, 5, 6, 10, 4, 4, 5, 1, 6, 7, 7};
>> asetb
{ 1, 2, 4, 5, 6, 7, 10 }

>> asetb.clear();
>> asetb
{}

>> asetb.empty()
(bool) true

Check whether an element is in the set without iterator:

>> aset.count(10)
(unsigned long) 1
>> aset.count(100)
(unsigned long) 0
>> aset.count(1)
(unsigned long) 1
>> aset.count(-12)
(unsigned long) 0
>> 
>> if(aset.count(10) != 0 ) { std::puts("Element in the set."); }
Element in the set.
>> if(aset.count(10)) { std::puts("Element in the set."); }
Element in the set.
>> if(aset.count(25) != 0 ) { std::puts("Element in the set."); }
>> if(aset.count(25)) { std::puts("Element in the set."); }
>> 

Check if element is in the set with iterator:

>> aset
{ 1, 2, 4, 5, 6, 7, 10 }

>> aset.find(10)
(std::set<int, std::less<int>, std::allocator<int> >::iterator) @0x22f1ff0
>> 

std::set<int>::iterator it;
>> if((it = aset.find(10)) != aset.end()) std::printf(" ==> Found element = %d\n", *it)
 ==> Found element = 10

>> if((it = aset.find(2)) != aset.end()) std::printf(" ==> Found element = %d\n", *it) 
 ==> Found element = 2

>> if((it = aset.find(-100)) != aset.end()) std::printf(" ==> Found element = %d\n", *it)
>> 

// Or: ----------------------------------------------------
>> auto itr = aset.find(7);
>> if(itr == aset.end()) std::puts("Element not found");
>> if(itr != aset.end()) std::puts("Element  found");
Element  found
>> int element = *itr
(int) 7
>> 

Remove element from set:

>> aset
{ 1, 2, 4, 5, 6, 7, 10 }

>> auto itr2 = aset.find(10);
// Remove element using iterator.
>> aset.erase(itr2);

>> aset
{ 1, 2, 4, 5, 6, 7 }

// Undefined behavior (crashed here): find(-10) returns aset.end(), which must not be passed to erase().
>> aset.erase(aset.find(-10));
free(): invalid pointer
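The crash above can be avoided by checking the iterator before erasing, or by erasing by key. A minimal sketch; the helper names eraseIfPresent and eraseByKey are hypothetical:

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: erases through an iterator only when find() succeeded.
bool eraseIfPresent(std::set<int>& s, int value) {
    auto it = s.find(value);
    if (it == s.end()) return false;   // never pass end() to erase()
    s.erase(it);
    return true;
}

// Alternative: erase by key, which returns the number of removed elements
// (0 or 1 for std::set) and is safe for missing values.
std::size_t eraseByKey(std::set<int>& s, int value) {
    return s.erase(value);
}
```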

Iterate over a set:

int i = 0;
for(const auto& x: aset){  std::printf(" element[%d] = %d\n", ++i, x); }

// For-range based loop
>> for(const auto& x: aset){  std::printf(" element[%d] = %d\n", ++i, x); }
 element[1] = 1
 element[2] = 2
 element[3] = 4
 element[4] = 5
 element[5] = 6
 element[6] = 7
 element[7] = 10

// Iterator based loop 
int j = 0;
for(auto it = aset.begin(); it != aset.end(); it++){  std::printf(" element[%d] = %d\n", ++j, *it); }

>> for(auto it = aset.begin(); it != aset.end(); it++){  std::printf(" element[%d] = %d\n", ++j, *it); }
 element[1] = 1
 element[2] = 2
 element[3] = 4
 element[4] = 5
 element[5] = 6
 element[6] = 7
 element[7] = 10

Bitset Container

Class template for representing a sequence of N bits.

Default Constructor:

#include <bitset>

 >> #include <bitset>

 >> std::bitset<4> b;

 >> std::cout << " b = " << b << std::endl;
  b = 0000

Set bits:

// Set bit 0 
>> b.set(0)
(std::bitset<4UL> &) @0x7f92db9c7010

>> b
(std::bitset<4> &) @0x7f92db9c7010
>> std::cout << " b = " << b << std::endl;
 b = 0001

// Set bit 1 and 3 
>> b.set(1).set(3)
(std::bitset<4UL> &) @0x7f92db9c7010

>> std::cout << " b = " << b << std::endl;
 b = 1011

Test bits:

// Check whether bit 0 is set  (equal to 1)
>> b.test(0)
(bool) true

// Check whether bit 1 is set
>> b.test(1)
(bool) true

// Check whether bit 2 is set
>> b.test(2)
(bool) false

// Check whether bit 3 is set
>> b.test(3)
(bool) true
>> 

// Clear bit 0 
>> b.set(0, false);
>> b.test(0)
(bool) false

Create a bitset initialized with some integer value:

>> std::bitset<8> b1{0xAE};

>> std::cout << "b1 = " << b1 << std::endl;
b1 = 10101110

// Test bits 
>> b1.test(0)
(bool) false
>> b1.test(1)
(bool) true
>> b1.test(7)
(bool) true
>> b1.test(6)
(bool) false
>> 

// Number of bits 
>> b1.size()
(unsigned long) 8
>> 

Convert to numerical value:

// Convert to numerical value 
>> b1.to_ulong()
(unsigned long) 174

>> 0xAE
(int) 174

Flip bitset:

>> b1.flip()
(std::bitset<8UL> &) @0x7f92db9c7018

>> b1.to_ulong()
(unsigned long) 81

>> std::cout << "b1 flipped = " << b1 << std::endl;
b1 flipped = 01010001
>> 

Create bitset from binary string:

>> auto bb = std::bitset<8>("01010001");

>> bb
(std::bitset<8> &) @0x7f92db9c7020

>> std::cout << " bb = " << bb << "\n";
 bb = 01010001

>> bb.to_ulong()
(unsigned long) 81
>> 

>> bb.test(0)
(bool) true

>> bb.test(1)
(bool) false

Getting individual bits:

>> std::cout << "bit0 = " << bb[0] << " ; bit1 = " << bb[1] << " ; bit2 = " << bb[2] << "\n";
bit0 = 1 ; bit1 = 0 ; bit2 = 0
>> 

>> if(bb[1]){ std::puts("bit is set"); } else { std::puts("bit is cleared"); }
bit is cleared

>> if(bb[2]){ std::puts("bit is set"); } else { std::puts("bit is cleared"); }
bit is cleared

>> if(bb[3]){ std::puts("bit is set"); } else { std::puts("bit is cleared"); }
bit is cleared

>> if(bb[5]){ std::puts("bit is set"); } else { std::puts("bit is cleared"); }
bit is cleared

>> if(bb[6]){ std::puts("bit is set"); } else { std::puts("bit is cleared"); }
bit is set
>> 

Getting reference to individual bit:

>> auto gpio0 = bb[0]
(std::bitset<8>::reference &) @0x7f92db9c7038

>> (int) gpio0
(int) 1

>> gpio0 = true;
>> (int) gpio0
(int) 1

>> gpio0 = false;
>> (int) gpio0
(int) 0

>> (bool) gpio0
(bool) false

Bitset to string:

>> auto ba = std::bitset<8>("01010101");
>> ba
(std::bitset<8> &) @0x7f92db9c7058

>> std::string repr(ba.to_string('0', '1'));
>> repr
(std::string &) "01010101"
>> 
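The bitset operations shown in this section can be combined into ordinary functions. A minimal sketch; the helper names withBits and flipped are hypothetical:

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical helper: sets two bit positions in a copy of an 8-bit register.
std::bitset<8> withBits(std::bitset<8> reg, std::size_t a, std::size_t b) {
    return reg.set(a).set(b);   // set() returns a reference, so calls chain
}

// Hypothetical helper: flips every bit and returns the textual form,
// most significant bit first.
std::string flipped(std::bitset<8> reg) {
    return reg.flip().to_string();
}
```

Both helpers take the bitset by value, so the caller's object is left unchanged.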

General C++ Reference Card

Data Types and Data Models

C++ Types and Data Models

This table shows the size in bits of the numeric types for each data model, together with the architecture or ISA (Instruction Set Architecture) and the operating systems that use it. Note: *ptr is the pointer size in bits.

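| Data Model | Arch. / ISA | Operating System | *ptr, size_t | short | int | long | long long |
|---+---+---+---+---+---+---+---|
| 16 Bits Systems | | | | | | | |
| IP16 | PDP-11 | Unix (1973) | 16 | - | 16 | - | - |
| IP16L32 | PDP-11 | Unix (1977) | 16 | 16 | 16 | 32 | - |
| LP32 | x86 (16 bits) | Microsoft Win16 and Apple's MacOSX | 32 | 16 | 16 | 32 | - |
| 32 Bits Systems | | | | | | | |
| I16LP32 | MC68000, x86 (16 bits) | Macintosh (1982), Windows | | 16 | 16 | | |
| ILP32 | IBM-370 | Vax Unix | 32 | 16 | 32 | 32 | - |
| ILP32LL or ILP32LL64 | x86 or IA32 | Microsoft Win32 | 32 | 16 | 32 | 32 | 64 |
| 64 Bits Systems | | | | | | | |
| LLP64, IL32LLP64 or P64 | x86-x64 (IA64, AMD64) | Microsoft Win64 (x64 / x86) | 64 | 16 | 32 | 32 | 64 |
| LP64 or I32LP64 | IA64, AMD64 | Linux, Solaris, DEC OSF, HP UX | 64 | 16 | 32 | 64 | 64 |
| ILP64 | - | HAL | 64 | 16 | 64 | 64 | 64 |
| SILP64 | - | UNICOS | | | | | |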

Summary:

  • ILP32
    • int, long and pointer are all 32 bits
  • ILP32LL - Used by most compilers and OSes on 32 bits platforms. (De facto standard for 32 bits platforms)
    • int, long, and pointer are all 32 bits, but the type long long has 64 bits in size.
  • LP64 - Used by most 64 bit Unix-like OSes, including Linux, BSD and Apple’s Mac OSX (De facto standard for 64 bits Unix-like platforms)
    • long and pointer are 64 bits, while int remains 32 bits.
  • ILP64
    • int, long and pointer are all 64 bits.
  • LLP64 (Used by Windows 64 bits)
    • pointers and long long are 64 bits and the types int and long are 32 bits.

Note:

  • It is not safe to assume a particular size for the built-in numeric types. Where the exact size matters, such as serialization, embedded systems or low-level code that talks to hardware, it is better to use the fixed-width integer types from <cstdint>.
  • Signed integer overflow is undefined behavior and can lead to unpredictable results, whereas unsigned arithmetic wraps around modulo 2^N.
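The two notes above can be illustrated with a short sketch. The function names wrapIncrement and checkedAdd are hypothetical; the checks rely only on <cstdint>, <climits> and <limits>.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// These hold on any platform that provides the exact-width types.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64, "uint64_t is 64 bits");

// Unsigned arithmetic wraps around modulo 2^N, which is well defined.
std::uint8_t wrapIncrement(std::uint8_t v) {
    return static_cast<std::uint8_t>(v + 1u);
}

// Hypothetical helper: adds two ints only when the result fits,
// because signed overflow would be undefined behavior.
bool checkedAdd(int a, int b, int& out) {
    if ((b > 0 && a > std::numeric_limits<int>::max() - b) ||
        (b < 0 && a < std::numeric_limits<int>::min() - b))
        return false;
    out = a + b;
    return true;
}
```

The range test in checkedAdd is performed before the addition, so the overflowing expression is never evaluated.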

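Floating Point Numbers

| Type | Size (bits) | Size (bytes) | Description |
|---+---+---+---|
| Floating point types | | | |
| float | 32 | 4 | Single-precision IEEE754 floating point |
| double | 64 | 8 | Double-precision IEEE754 floating point |
| long double | 128 | 16 | Extended precision, implementation-defined: 80-bit x87 format padded to 16 bytes on x86-64 Linux, same as double on MSVC |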

Fixed-Width Numeric Types

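| Type | Size (bits) | Size (bytes) | Description | Maximum number of decimal digits |
|---+---+---+---+---|
| Fixed-width integer | | | | |
| int8_t | 8 | 1 | 8-bits signed int | 2 |
| uint8_t | 8 | 1 | 8-bits unsigned int (positive) | 2 |
| int16_t | 16 | 2 | 16-bits signed int | 4 |
| uint16_t | 16 | 2 | 16-bits unsigned int | 4 |
| int32_t | 32 | 4 | 32-bits signed int | 9 |
| uint32_t | 32 | 4 | 32-bits unsigned int | 9 |
| int64_t | 64 | 8 | 64-bits signed int | 18 |
| uint64_t | 64 | 8 | 64-bits unsigned int | 19 |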

Sample code for showing numeric limits:

File: numeric-limits.cpp

/*******************************************************************************************
 * File: numeric-limits.cpp 
 * Brief: Shows the numeric limits for all possible numerical types.  
 * Author: Caio Rodrigues
 *****************************************************************************************/

#include <iostream>
#include <limits>    // Numeric limits 
#include <iomanip>   // setw, and other IO manipulators 
#include <string>    // std::string 
#include <cstdint>   // uint8_t, int8_t, ...
#include <functional>
#include <typeinfo>  // typeid

struct RowPrinter{
        int m_left;  // Left alignment 
        int m_right; // Right alignment  
        RowPrinter(int left, int right): m_left(left), m_right(right){
                // Print bool as 'true' or 'false' instead of 0 or 1.
                std::cout << std::boolalpha;
        }
	
        template<class A>
        auto printRow(const std::string& label, const A& value) const -> void {
                std::cout << std::setw(m_left)  << label
                                  << std::setw(m_right) << value << "\n";
        }
};

#define SHOW_INTEGER_LIMITS(numtype) showNumericLimits<numtype>(#numtype)
#define SHOW_FLOAT_LIMITS(numtype)   showFloatPointLimits<numtype>(#numtype)

template <class T>
void showNumericLimits(const std::string& name){
        RowPrinter rp{30, 25};	
        std::cout << "Numeric limits for type: " << name << "\n";
        std::cout << std::string(60, '-') << "\n";
        rp.printRow("Type:",                    name);
        rp.printRow("Is integer:",              std::numeric_limits<T>::is_integer);
        rp.printRow("Is signed:",               std::numeric_limits<T>::is_signed);
        rp.printRow("Number of digits 10:",     std::numeric_limits<T>::digits10);
        rp.printRow("Max Number of digits 10:", std::numeric_limits<T>::max_digits10);

        // RTTI - Run-Time Type Information 
        if(typeid(T) == typeid(uint8_t)
           || typeid(T) == typeid(int8_t)
           || typeid(T) == typeid(bool)
           || typeid(T) == typeid(char)
           || typeid(T) == typeid(unsigned char)
                ){
                // Min Abs - smallest positive value for floating point types; for integer types it equals lowest()
                rp.printRow("Min Abs:",         static_cast<int>(std::numeric_limits<T>::min()));
                // Smallest value (can be negative)
                rp.printRow("Min:",             static_cast<int>(std::numeric_limits<T>::lowest()));
                // Largest value  
                rp.printRow("Max:",             static_cast<int>(std::numeric_limits<T>::max()));	
        } else {
                rp.printRow("Min Abs:",         std::numeric_limits<T>::min());
                rp.printRow("Min:",             std::numeric_limits<T>::lowest());
                rp.printRow("Max:",              std::numeric_limits<T>::max());
        }
        rp.printRow("Size in bytes:",       sizeof(T));
        rp.printRow("Size in bits:",        8 * sizeof(T));
        std::cout << "\n";
}

template<class T>
void showFloatPointLimits(const std::string& name){
        RowPrinter rp{30, 25};	
        showNumericLimits<T>(name);
        rp.printRow("Epsilon:",        std::numeric_limits<T>::epsilon());
        rp.printRow("Min exponent:",   std::numeric_limits<T>::min_exponent10);
        rp.printRow("Max exponent:",   std::numeric_limits<T>::max_exponent10);
}

int main(){
        SHOW_INTEGER_LIMITS(bool);
        SHOW_INTEGER_LIMITS(char);
        SHOW_INTEGER_LIMITS(unsigned char);
        SHOW_INTEGER_LIMITS(wchar_t);

        // Standard integers in <cstdint>
        SHOW_INTEGER_LIMITS(int8_t);
        SHOW_INTEGER_LIMITS(uint8_t);
        SHOW_INTEGER_LIMITS(int16_t);
        SHOW_INTEGER_LIMITS(uint16_t);
        SHOW_INTEGER_LIMITS(int32_t);
        SHOW_INTEGER_LIMITS(uint32_t);
        SHOW_INTEGER_LIMITS(int64_t);
        SHOW_INTEGER_LIMITS(uint64_t);

        SHOW_INTEGER_LIMITS(short);
        SHOW_INTEGER_LIMITS(unsigned short);
        SHOW_INTEGER_LIMITS(int);
        SHOW_INTEGER_LIMITS(unsigned int);
        SHOW_INTEGER_LIMITS(long);
        SHOW_INTEGER_LIMITS(unsigned long);
        SHOW_INTEGER_LIMITS(long long);
        SHOW_INTEGER_LIMITS(unsigned long long);
	
        SHOW_FLOAT_LIMITS(float);
        SHOW_FLOAT_LIMITS(double);
        SHOW_FLOAT_LIMITS(long double);
	
    return 0;
}

Output:

$ clang++ numeric-limits.cpp -o numeric-limits.bin -g -std=c++11 -Wall -Wextra && ./numeric-limits.bin

...   ...   ...   ...   ...   ...   ...   ...   ...   ... 

Numeric limits for type: short
------------------------------------------------------------
                         Type:                    short
                   Is integer:                     true
                    Is signed:                     true
          Number of digits 10:                        4
      Max Number of digits 10:                        0
                      Min Abs:                   -32768
                          Min:                   -32768
                          Max:                    32767
                Size in bytes:                        2
                 Size in bits:                       16

Numeric limits for type: unsigned short
------------------------------------------------------------
                         Type:           unsigned short
                   Is integer:                     true
                    Is signed:                    false
          Number of digits 10:                        4
      Max Number of digits 10:                        0
                      Min Abs:                        0
                          Min:                        0
                          Max:                    65535
                Size in bytes:                        2
                 Size in bits:                       16

Numeric limits for type: int
------------------------------------------------------------
                         Type:                      int
                   Is integer:                     true
                    Is signed:                     true
          Number of digits 10:                        9
      Max Number of digits 10:                        0
                      Min Abs:              -2147483648
                          Min:              -2147483648
                          Max:               2147483647
                Size in bytes:                        4
                 Size in bits:                       32

Numeric limits for type: unsigned int
------------------------------------------------------------
                         Type:             unsigned int
                   Is integer:                     true
                    Is signed:                    false
          Number of digits 10:                        9
      Max Number of digits 10:                        0
                      Min Abs:                        0
                          Min:                        0
                          Max:               4294967295
                Size in bytes:                        4
                 Size in bits:                       32

Numeric limits for type: long
------------------------------------------------------------
                         Type:                     long
                   Is integer:                     true
                    Is signed:                     true
          Number of digits 10:                       18
      Max Number of digits 10:                        0
                      Min Abs:     -9223372036854775808
                          Min:     -9223372036854775808
                          Max:      9223372036854775807
                Size in bytes:                        8
                 Size in bits:                       64

Numeric limits for type: unsigned long
------------------------------------------------------------
                         Type:            unsigned long
                   Is integer:                     true
                    Is signed:                    false
          Number of digits 10:                       19
      Max Number of digits 10:                        0
                      Min Abs:                        0
                          Min:                        0
                          Max:     18446744073709551615
                Size in bytes:                        8
                 Size in bits:                       64

Numeric limits for type: long long
------------------------------------------------------------
                         Type:                long long
                   Is integer:                     true
                    Is signed:                     true
          Number of digits 10:                       18
      Max Number of digits 10:                        0
                      Min Abs:     -9223372036854775808
                          Min:     -9223372036854775808
                          Max:      9223372036854775807
                Size in bytes:                        8
                 Size in bits:                       64

Numeric limits for type: unsigned long long
------------------------------------------------------------
                         Type:       unsigned long long
                   Is integer:                     true
                    Is signed:                    false
          Number of digits 10:                       19
      Max Number of digits 10:                        0
                      Min Abs:                        0
                          Min:                        0
                          Max:     18446744073709551615
                Size in bytes:                        8
                 Size in bits:                       64

... ....    ... ....    ... ....    ... ....    ... .... 

Numeric Literals

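| Literal | Suffix | Type | Description | Sizeof (bytes) |
|---+---+---+---+---|
| 2001 | - | int | signed integer | 4 |
| 20u | u or U | unsigned int | | 4 |
| 0xFFu | u or U | unsigned int | unsigned int literal in hexadecimal (0xff = 255) | 4 |
| 100l or 100L | l or L | long | | 8 |
| 100ul or 100UL | ul or UL | unsigned long | | 8 |
| 0xFAul or 0xFAUL | ul or UL | unsigned long | unsigned long literal in hexadecimal format (0xfa = 250) | 8 |
| 100.23f or 100.23F | f or F | float | 32 bits IEEE754 floating point number mostly used in games and computer graphics. | 4 |
| 20.12 | (default) | double | 64 bits IEEE754 floating point number commonly used in scientific computing. | 8 |

Note: the sizes of long and unsigned long shown here are for LP64 platforms; they are 4 bytes under LLP64 (64 bits Windows).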
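The type selected by each suffix can be checked at compile time with decltype. A minimal sketch; the helper name sizeOf is hypothetical:

```cpp
#include <cstddef>
#include <type_traits>

// decltype of a literal reveals the type selected by its suffix.
static_assert(std::is_same<decltype(2001),    int>::value,           "no suffix");
static_assert(std::is_same<decltype(20u),     unsigned int>::value,  "u suffix");
static_assert(std::is_same<decltype(0xFFu),   unsigned int>::value,  "hex + u");
static_assert(std::is_same<decltype(100L),    long>::value,          "L suffix");
static_assert(std::is_same<decltype(0xFAUL),  unsigned long>::value, "UL suffix");
static_assert(std::is_same<decltype(100.23f), float>::value,         "f suffix");
static_assert(std::is_same<decltype(20.12),   double>::value,        "default");

// Hypothetical helper: reports the size in bytes of a literal's type.
template <class T>
constexpr std::size_t sizeOf(T) { return sizeof(T); }
```

The value returned by sizeOf(100L) depends on the data model, which is why the table sizes should be read as platform-specific.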

Types of Parameter Passing

| Parameter Passing | Alternative | Parameter t passed by |
|---+---+---|
| Value | | |
| T t | | by value |
| Pointer | | |
| T* t | T *t | passed by pointer |
| const T* t | const T *t | by pointer to const, the pointee cannot be modified through t |
| T t [] | T* t | by pointer, this notation is used for C-array parameters |
| Reference | | |
| T& t | T &t | by reference or L-value reference |
| const T& t | const T &t | by const reference or const L-value reference. |
| T const& t | - | by const reference - alternative notation |
| T&& t | T &&t | by r-value reference |
| template<class T> function(T&& t) | - | Universal reference can become either L-value or R-value reference. |

Notes:

  • Function here means both member functions (class methods) and free functions (aka ordinary functions).
  • A parameter passed by value is a copy, so modifying it inside the function does not affect the caller's object. This applies to all C++ types, including instances of classes, which is different from most OO languages such as Java, C# and Python.
  • When an object is passed by value, its copy constructor (or its move constructor, for r-values) is invoked and a new object is created.
  • Prefer passing large objects, such as large matrices or arrays, by reference, or by const reference when the function is not supposed to modify the parameter, in order to avoid the overhead of copying.
  • It is better to manage objects instantiated on the heap (dynamic memory) with smart pointers (unique_ptr, shared_ptr) than with raw new, in order to avoid memory leaks.
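The passing modes from the table can be compared side by side. A minimal sketch; all function names below are hypothetical and chosen only to label each mode:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// By value: the function works on its own copy.
void byValue(std::vector<int> v) { v.push_back(1); }

// By reference: changes are visible to the caller.
void byReference(std::vector<int>& v) { v.push_back(2); }

// By pointer: the caller passes an address, which may be null.
void byPointer(std::vector<int>* v) { if (v != nullptr) v->push_back(3); }

// By const reference: no copy and no modification.
std::size_t byConstReference(const std::vector<int>& v) { return v.size(); }

// By r-value reference: the function may take over the argument's resources.
std::string byRvalueReference(std::string&& s) { return std::move(s); }
```

Calling byValue(xs) leaves xs untouched, while byReference(xs) and byPointer(&xs) each append one element to the caller's vector.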

Operators and operator overload

Summary Table

| Description | Operator | Class operator overload declaration |
|---+---+---|
| Equal to | a == b | |
| Logical not | !a, !false, !true | |
| Logical and | a && b | |
| Logical or | a \vert\vert b | |
| Pre increment (prefix) | ++i | |
| Post increment | i++ | |
| Pre decrement | --i | |
| Post decrement | i-- | |
| Addition assignment (+=) | a += b ; a <- a + b | |
| Subtraction assignment (-=) | a -= b ; a <- a - b | |
| Multiplication assignment (*=) | a *= b ; a <- a * b | |
| Division assignment (/=) | a /= b ; a <- a / b | |
| Subscript, array index | a[b] | A C::operator [](S index) |
| Indirection - dereference | *a | A C::operator *() |
| Address or reference | &a | A* C::operator &() |
| Structure dereference | a->memberFunction(x) | |
| Structure reference (.) | a.memberFunction(x) | - N/A |
| Function call (function-object declaration) | A(p0, p1, p2) | R C::operator()(P0 p0, P1 p1, P2 p2) |
| Ternary conditional - similar to if x = (if cond 10 20) | a ? b : c | - N/A |
| Scope resolution operator | Class::staticMethod(x) | - N/A |
| Sizeof - returns size of type at compile-time | sizeof(type) | - N/A |
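The increment and decrement rows of the table have no declaration listed. As a minimal sketch, the prefix and postfix forms are distinguished by a dummy int parameter. The Counter type below is hypothetical and exists only for illustration.

```cpp
// Hypothetical counter type used only for illustration.
struct Counter {
    int value = 0;

    // Prefix forms modify the object and return it by reference.
    Counter& operator++() { ++value; return *this; }   // ++c
    Counter& operator--() { --value; return *this; }   // --c

    // Postfix forms take an unused int and return the previous state by value.
    Counter operator++(int) { Counter old = *this; ++value; return old; }  // c++
    Counter operator--(int) { Counter old = *this; --value; return old; }  // c--

    // Compound assignment returns the modified object by reference.
    Counter& operator+=(int n) { value += n; return *this; }              // c += n

    bool operator==(const Counter& rhs) const { return value == rhs.value; }
};
```

Returning the old state by value is what makes the postfix forms slightly more expensive than the prefix ones for non-trivial types.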

Operator Overload Snippet 1

#include <iostream>
#include <string>

class SomeClass{
private:
    // ---->> Private data here <------
    double m_x = 0.0;
    double m_y = 0.0;
    int    m_counter = 0;
    void*  m_data = nullptr;   // Placeholder for some optional resource
    double array[3] = {};      // Storage accessed by operator[]
public:
    SomeClass(){}
    SomeClass(double x, double y){
        m_x = x;
        m_y = y;
    }
    // Copy assignment operator 
    SomeClass& operator=(const SomeClass& other){
        m_x = other.m_x;
        m_y = other.m_y;
        //  ...  ......
        return *this;
    }
    // Equality operator - check whether current object is equal to
    // the other.
    //-----------------------------------------------
    bool operator==(const SomeClass& p){
        return this->m_x == p.m_x && this->m_y == p.m_y;
    }
    // Not equal operator - checks whether current object is not equal to
    // the other.
    //-----------------------------------------------
    bool operator!=(const SomeClass& p){
        return this->m_x != p.m_x || this->m_y != p.m_y;
    }
    // Not logical operator (!) Exclamation mark.
    // if(!obj){ ... }
    // Here !obj is true when the object holds no data.
    //-----------------------------------------------
    bool operator! (){
        return this->m_data == nullptr;
    }
    // Operator ++obj
    //-----------------------------------------------
    SomeClass& operator++(){
        this->m_counter += 1;
        return *this;
    }

    // Operator (+)
    // SomeClass a, b;
    // SomeClass c = a + b;
    SomeClass operator+(SomeClass other){
        SomeClass res;
        res.m_x = m_x + other.m_x;
        res.m_y = m_y + other.m_y;
        return res;
    }
    // Operator (+)
    SomeClass operator+(double x){
        SomeClass res;
        res.m_x = m_x + x;
        res.m_y = m_y + x;
        return res;
    }
    // Operator (*)
    SomeClass operator*(double x){
        SomeClass res;
        res.m_x = m_x * x;
        res.m_y = m_y * x;
        return res;
    }

    // Operator (+=)
    // SomeClass cls;
    // cls += 10.0;
    SomeClass& operator +=(double x){
        m_x += x;
        m_y += x;
        return *this;
    }
    // Operator index -> obj[2]
    // SomeClass cls;
    // double z = cls[2];
    //-----------------------------------------------
    double operator[](int idx){
        return this->array[idx];
    }
    // Function application operator
    // SomeClass obj;
    // double x = obj();
    //-----------------------------------------------
    double operator()(){
        return m_counter * 10;
    }
    // Function application operator
    // SomeClass obj;
    // double x = obj(3.4, "hello world");
    //-----------------------------------------------
    double operator()(double x, std::string msg){
        std::cout << "x = " << x << " msg  = " << msg;
        return 3.5 * x;                                       
    }
    // Operator string insertion, allows printing the current object 
    // SomeClass obj;
    // std::cout << obj << std::endl;
    //-----------------------------------------------
    friend std::ostream& operator<<(std::ostream &os, const SomeClass& cls){
        // Print object internal data structure 
        os << cls.m_x << cls.m_y  ;
        return os;
    }
};

Operator Overload Snippet 2

File: SomeClass.hpp - Header file.

#include <iostream>
#include <string>

class SomeClass{
private:
    // ---->> Private data here <------
    double m_x = 0.0;
    double m_y = 0.0;
    int    m_counter = 0;
    void*  m_data = nullptr;   // Placeholder for some optional resource
    double array[3] = {};      // Storage accessed by operator[]
public:
    SomeClass();               // Defined in SomeClass.cpp
    SomeClass(double x, double y);
    bool operator==(const SomeClass& p);
    bool operator!=(const SomeClass& p);
    bool operator! ();
    SomeClass& operator++();
    SomeClass operator+(SomeClass other);
    SomeClass operator+(double x);
    SomeClass operator*(double x);
    SomeClass& operator +=(double x);
    double operator[](int idx);
    double operator()();
    double operator()(double x, std::string msg);
    friend std::ostream& operator<<(std::ostream &os, const SomeClass& cls);
};

File: SomeClass.cpp - implementation

#include "SomeClass.hpp"

SomeClass::SomeClass(){}

SomeClass::SomeClass(double x, double y){
    m_x = x;
    m_y = y;
}
    
// Equality operator - check whether current object is equal to
// the other.
//-----------------------------------------------
bool SomeClass::operator==(const SomeClass& p){
    return this->m_x == p.m_x && this->m_y == p.m_y;
}

// Not equal operator - checks whether current object is not equal to
// the other.
//-----------------------------------------------
bool SomeClass::operator!=(const SomeClass& p){
    return this->m_x != p.m_x || this->m_y != p.m_y;
}
    
// Not logical operator (!) Exclamation mark.
// if(!obj){ ... }
//-----------------------------------------------
bool SomeClass::operator! (){
    // !obj is true when the object holds no data.
    return this->m_data == nullptr;
}
    
// Operator ++obj
//-----------------------------------------------
SomeClass& SomeClass::operator++(){
    this->m_counter += 1;
    return *this;
}

// Operator (+)
// SomeClass a, b;
// SomeClass c = a + b;
SomeClass SomeClass::operator+(SomeClass other){
    SomeClass res;
    res.m_x = m_x + other.m_x;
    res.m_y = m_y + other.m_y;
    return res;
}
// Operator (+)
SomeClass SomeClass::operator+(double x){
    SomeClass res;
    res.m_x = m_x + x;
    res.m_y = m_y + x;
    return res;
}
// Operator (*)
SomeClass SomeClass::operator*(double x){
    SomeClass res;
    res.m_x = m_x * x;
    res.m_y = m_y * x;
    return res;
}

// Operator (+=)
// SomeClass cls;
// cls += 10.0;
SomeClass& SomeClass::operator +=(double x){
    m_x += x;
    m_y += x;
    return *this;
}
    
    
// Operator index -> obj[2]
// SomeClass cls;
// double z = cls[2];
//-----------------------------------------------
double SomeClass::operator[](int idx){
    return this->array[idx];
}
    
// Function application operator
// SomeClass obj;
// double x = obj();
//-----------------------------------------------
double SomeClass::operator()(){
    return m_counter * 10;
}

// Function application operator
// SomeClass obj;
// double x = obj(3.4, "hello world");
//-----------------------------------------------
double SomeClass::operator()(double x, std::string msg){
    std::cout << "x = " << x << " msg  = " << msg;
    return 3.5 * x;                                       
}

// Operator string insertion, allows printing the current object 
// SomeClass obj;
// std::cout << obj << std::endl;
// Note: a friend function is a free function, so it is defined without
// the 'friend' keyword and without the SomeClass:: qualifier.
//-----------------------------------------------
std::ostream& operator<<(std::ostream &os, const SomeClass& cls){
    // Print object internal data structure 
    os << cls.m_x << cls.m_y  ;
    return os;
}

Array index operator overload

This example shows how to overload the array index operator so that it can both return a value and be the target of an assignment.

File: array-index-overload.cpp

#include <iostream>
#include <vector>

class Container{
private:
        std::vector<double> xs =  { 1.0, 2.0, 4.0, 6.233, 2.443};
public:
    Container(){}
    double& operator[](int index){
        return xs[index];
    }
};

int main(){
    Container t;
    std::cout << "t[0] = " << t[0] << std::endl;
    std::cout << "t[1] = " << t[1] << std::endl;
    std::cout << "t[2] = " << t[2] << std::endl;
    std::cout << "\n--------\n";
    t[0] = 3.5;
    std::cout << "t[0] = " << t[0] << std::endl;
    t[2] = -15.684;
    std::cout << "t[2] = " << t[2] << std::endl;    
    return 0;
}

Running:

$ cl.exe array-index-overload.cpp /EHsc /Zi /nologo /Fe:out.exe && out.exe
t[0] = 1
t[1] = 2
t[2] = 4

--------
t[0] = 3.5
t[2] = -15.684
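The Container class above only works on non-const objects and does not check the index. A possible variation, sketched here under the hypothetical name CheckedContainer, adds a const overload and uses std::vector::at for bounds checking:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical variant of the Container class above.
class CheckedContainer {
    std::vector<double> xs = {1.0, 2.0, 4.0, 6.233, 2.443};
public:
    // Non-const overload: returns a reference, so t[i] = v compiles.
    // at() throws std::out_of_range for an invalid index.
    double& operator[](std::size_t index) { return xs.at(index); }

    // Const overload: selected for const objects, read-only access.
    const double& operator[](std::size_t index) const { return xs.at(index); }
};

// Hypothetical helper: reads through the const overload.
double first(const CheckedContainer& c) { return c[0]; }
```

Whether to pay for bounds checking in operator[] is a design choice; the standard containers leave operator[] unchecked and offer at() separately.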

Conversion Operators and user-defined type conversion

Conversion operators allow an instance of a class to be converted to another type, either implicitly or explicitly with a cast such as static_cast<T>.

Example:

  • ROOT Script File: conversion-operator.cpp
#include <iostream>
#include <string>

#define LOGFUNCTION(type)  std::cerr << "Convert to: [" << type << "] => Called: line " \
        << __LINE__ << "; fun = " << __PRETTY_FUNCTION__ << "\n"

// Or: struct Dummy { 
class Dummy{
public:
        bool flag = false;

        // Type conversion operator which converts an instance
        // of dummy to double.	
        explicit operator double() {
                LOGFUNCTION("double");
                return 10.232;
        }	
        #if 1
        // Implicit conversion to int is not allowed, it is only possible to convert
        // this object explicitly with static_cast. 	
        explicit operator int() const {
                LOGFUNCTION("int");
                return 209;
        }	
        explicit operator long() const {
                LOGFUNCTION("long");
                return 100L;
        }
        operator std::string() const {
                LOGFUNCTION("std::string");
                return "C++ string std::string";
        }
        explicit operator const char*() const {
                LOGFUNCTION("const char*");
                return "C string";
        }		
        operator bool() const {
                LOGFUNCTION("bool");
                std::cerr << " Called " << __FUNCTION__ << "\n";
                return flag;
        }
        #endif 
};
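The difference between implicit and explicit conversion operators can also be checked outside of ROOT with a smaller type. A minimal sketch; the Celsius type and the describe function are hypothetical:

```cpp
#include <string>
#include <type_traits>

// Hypothetical type with one implicit and two explicit conversions.
struct Celsius {
    double degrees;
    // Implicit: usable wherever a std::string is expected.
    operator std::string() const { return std::to_string(degrees) + " C"; }
    // Explicit: requires a cast, or a boolean context such as if(...).
    explicit operator bool() const { return degrees > -273.15; }
    // Explicit: requires static_cast<double> or a C-style cast.
    explicit operator double() const { return degrees; }
};

// Implicit conversion is possible to std::string but not to double.
static_assert(std::is_convertible<Celsius, std::string>::value, "implicit");
static_assert(!std::is_convertible<Celsius, double>::value, "explicit only");

// Hypothetical function taking a string; a Celsius argument converts implicitly.
std::string describe(const std::string& text) { return "T = " + text; }
```

An explicit operator bool still works directly in conditions such as if(c), because conditions perform a contextual conversion to bool.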

Testing:

  • C-style casting
>> .L conversion-operator.cpp 
>> Dummy d;  

>> (double) d
Convert to: [double] => Called: line 15; fun = double Dummy::operator double()
(double) 10.232000

>> (int) d
Convert to: [int] => Called: line 22; fun = int Dummy::operator int() const
(int) 209

>> (long) d
Convert to: [long] => Called: line 26; fun = long Dummy::operator long() const
(long) 100

>> (std::string) d
Convert to: [std::string] => Called: line 30; fun = std::string Dummy::operator basic_string() const
(std::string) "C++ string std::string"
>> 
  • C++ style casting:
>> static_cast<int>(d)
Convert to: [int] => Called: line 22; fun = int Dummy::operator int() const
(int) 209
>> 
>> static_cast<long>(d)
Convert to: [long] => Called: line 26; fun = long Dummy::operator long() const
(long) 100
>> 
>> static_cast<double>(d)
Convert to: [double] => Called: line 15; fun = double Dummy::operator double()
(double) 10.232000

>> static_cast<std::string>(d)
Convert to: [std::string] => Called: line 30; fun = std::string Dummy::operator basic_string() const
(std::string) "C++ string std::string"
>> 

>> static_cast<bool>(d)
Convert to: [bool] => Called: line 38; fun = bool Dummy::operator bool() const
 Called operator bool
(bool) false
>> 

>> d.flag = true
(bool) true

>> static_cast<bool>(d)
Convert to: [bool] => Called: line 38; fun = bool Dummy::operator bool() const
 Called operator bool
(bool) true
>> 
  • Simulating implicit conversion:
    • Note: implicit conversion by assignment is not allowed for conversion operators annotated with explicit, so the assignment const char* s = d does not compile.
// Implicit conversion 
>> std::string message = d
Convert to: [std::string] => Called: line 30; fun = std::string Dummy::operator basic_string() const
(std::string &) "C++ string std::string"
>> 
>> std::cout << "text = " << message << "\n";
text = C++ string std::string
>> 
>> 

>> const char* s = d
ROOT_prompt_16:1:13: error: no viable conversion from 'Dummy' to 'const char *'
const char* s = d
            ^   ~
// Conversion operators marked as explicit can only be invoked with a C-style cast
// or with static_cast<T>
>> const char* s = static_cast<const char*>(d)
Convert to: [const char*] => Called: line 34; fun = const char *Dummy::operator const char *() const
(const char *) "C string"

>> d ? "true" : "false";
Convert to: [bool] => Called: line 38; fun = bool Dummy::operator bool() const
 Called operator bool

>> d ? "true" : "false"
Convert to: [bool] => Called: line 38; fun = bool Dummy::operator bool() const
 Called operator bool
(const char *) "true"

>> d.flag = false;

>> d ? "true" : "false"
Convert to: [bool] => Called: line 38; fun = bool Dummy::operator bool() const
 Called operator bool
(const char *) "false"
>> 
  • Bool type conversion in conditional statements.
>> d.flag = true;

>> if(d) { std::cout << "Flag is true OK" << std::endl; }
Convert to: [bool] => Called: line 38; fun = bool Dummy::operator bool() const
 Called operator bool
Flag is true OK

>> d.flag = false;

>> if(!d) { std::cout << "Flag is false OK" << std::endl; }
Convert to: [bool] => Called: line 38; fun = bool Dummy::operator bool() const
 Called operator bool
Flag is false OK
>> 
>> 
  • Note: The predefined identifier __PRETTY_FUNCTION__ is only available in GCC and Clang; in MSVC use __FUNCSIG__

Further Reading:

General C++ Terminology and Concepts

Design Principles

  • Performance Oriented
    • Zero-cost abstractions.
    • Avoid runtime cost.
    • Value speed over safety.
    • Don’t pay for what you don’t use.
  • Backward compatibility - avoid breaking old code.
    • Backward compatibility with C
    • Backward compatibility with old versions of C++.
  • Explicit is better than implicit (a motto from the Zen of Python). For instance, explicit conversion with the C++-style casting operators static_cast or reinterpret_cast is better and safer than C-style implicit conversion.
  • Type-safety: Compile-time errors are always better than run-time errors, as compile-time errors are caught earlier and do not cause bad surprises when the program is deployed elsewhere.

Pointers

  • Pointer: A variable which holds the address of another variable. It is used for indirect access to variables, for accessing memory-mapped IO in embedded systems or low-level software, and for referencing heap-allocated objects. On mainstream platforms, all ordinary C++ pointers (not function pointers or pointers to member functions) have the same size and store the numeric address of some memory location; the only difference between pointers of different types is the type of the memory location that they reference.
  • Types of Pointers
    • Ordinary pointers: int*, const char*, Object* …
    • Pointer to function, aka function pointer
    • Pointer to member function (pointer to class method)
    • Pointer to member variable (pointer to class variable or field)
    • Smart “pointers”: (they are not pointers) Stack-allocated objects used for managing heap-allocated objects through RAII and pointer emulation.
  • Wild pointer
    • Uninitialized pointer.
  • Dangling pointer
    • A pointer which points to an object that was deleted or to an invalid memory address. Deleting a dangling pointer or invoking an object’s method through it is undefined behavior and commonly crashes with a segmentation fault.
  • Null pointer
  • Void pointer void*
    • A pointer without any specific type associated. A pointer to any object type can be converted to a void pointer, and a void pointer can be converted back to the original type. A void pointer cannot be dereferenced before being cast back.
    • Can point to:
      • Primitive types int, float, char and so on.
      • Class instances.
      • Functions, on most platforms. Casting a function pointer to void* is only conditionally supported by the C++ standard, but it works on POSIX systems and Windows.
    • Cannot point to:
      • Member functions (class methods). Pointers to member functions cannot be cast to void*.
      • Member variables (class fields). Pointers to member variables cannot be cast to void*.
    • Use cases:
      • Root class. C++ doesn’t have a root class from which all classes inherit, like Java’s Object class. A root class allows unrelated types to be stored in the same data structure or collection and allows type erasure. A void* pointer can work as a “pseudo” root class, as a pointer to any class can be converted to it.
      • Type erasure of pointers to primitive types, pointers to classes and pointers to functions.
      • Type erasure in C APIs, for instance malloc, which returns void*, and the POSIX function dlsym, which returns the address of a symbol exported by a shared library as void*. The Windows function GetProcAddress plays the same role, but returns the generic function pointer type FARPROC.
  • Owning X Non-owning pointers
    • An owning pointer is responsible for releasing the memory allocated for a heap-allocated object. In general, raw pointers should not be used as owning pointers, as they give no indication of whether they point to a heap-allocated object, a stack-allocated object or a heap-allocated array. Another problem is that every Type* ptr = new Type statement needs to be matched by a delete statement, and it is easy to miss one of the possible execution paths. Besides that, raw pointers aren’t exception safe, since the matching delete statement may not be executed if an exception occurs. In modern C++, only smart pointers should be used as owning pointers.
  • Opaque pointer, also called handle
  • Pointer “this”
    • Every non-static member function has access to a pointer this of type Class* (const Class* in const member functions) which points to the current object. It is similar to Java’s this keyword inside classes.
    • Use cases:
      • Return a reference or pointer to current object.
      • Ambiguity resolution, for instance, if a function has a parameter named count, and a class member has the same name, the ambiguity in assignment operation can be solved with this->count = count;
      • Make it explicit and indicate that a class method is being invoked, for instance, this->method(arg0, arg1, arg2) is more explicit than using method(arg0, arg1, arg2), which could be an external function instead of a class’ member function.
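
The pointer categories above can be sketched in a few lines. The type Sensor and all function names below are hypothetical and exist only for illustration; std::make_unique assumes C++14 or later.

```cpp
#include <memory>
#include <string>

// Hypothetical type used only for illustration.
struct Sensor {
    std::string name;
    int value;
    int read() const { return value; }
};

// Type erasure through void*: the caller must know the real type
// and cast back before using the pointer.
int read_through_void(void* erased) {
    auto* sensor = static_cast<Sensor*>(erased);
    return sensor->read();
}

// Owning pointer: std::unique_ptr releases the object automatically,
// even if an exception is thrown before the end of the scope.
std::unique_ptr<Sensor> make_sensor(int value) {
    return std::make_unique<Sensor>(Sensor{"demo", value});
}

// Non-owning pointer: the function only observes the object.
int observe(const Sensor* sensor) {
    return sensor != nullptr ? sensor->read() : -1;
}

// Pointer to member function and pointer to member variable.
int call_member(const Sensor& s) {
    int (Sensor::*method)() const = &Sensor::read;
    int Sensor::*field = &Sensor::value;
    return (s.*method)() + s.*field;
}
```

The raw pointer returned by get() on the smart pointer is a non-owning view: it can be passed to observe or read_through_void, but it must never be deleted by the callee.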

Classes

Member Functions

  • member function
    • C++ terminology for class method.
  • virtual member function (aka virtual function or virtual method)
    • For short: Method that can be overridden, in other words, derived classes can replace the base class implementation.
    • Any member function (aka method) which can be overridden by derived classes. Only methods annotated with virtual can be overridden.
  • pure virtual member function
    • For short: Abstract method. A derived class must provide an implementation.
    • A member function of the base class annotated as virtual and declared with = 0, usually without any implementation. It is the same as an abstract method that must be implemented by derived classes.
  • static member function
    • For short: static method.
    • A class method that can be called without any instance.
  • special member functions
    • Destructor
    • Constructors
      • Default constructor
      • Copy constructor
      • Move constructor
    • Copy assignment operator
    • Move assignment operator
  • Common constructors
    • Default / Empty constructor
      • Signature: CLASS()
      • Constructor without arguments used for default initialization. If no constructor at all is declared, the compiler generates this one by default. Without it, some STL container operations that need to create default elements, such as std::vector<T>::resize(n) or std::map<K, T>::operator[], are not available for the class.
    • Conversion Constructor
      • Signature: Class(T value)
      • Constructor with a single argument or callable with a single argument. This type of constructor instantiates an object by implicit conversion in an assignment-style initialization or when an instance of type T is passed to a function expecting an object of the underlying class. For instance, this constructor allows initialization as:
        • Class object = value; // Value has type T
        • Class object = 100; // Calls constructor Class(int x).
      • To forbid this implicit conversion use the keyword explicit.
        • explicit Class(T value)
    • List initializer constructor
      • Signature: CLASS(std::initializer_list<T>)
      • Constructor which takes an initializer list as argument. This constructor makes it possible to initialize an object with:
        • CLASS object {value0, value1, value2, value3 … };
        • auto object = CLASS {value0, value1, value2, value3 … };
    • Range constructor
      • Signature: CLASS(beginIterator, endIterator)
      • Constructor which takes an iterator pair as arguments. It allows objects to be instantiated from STL container iterators.
  • Types of polymorphism in C++
    • Dynamic - Resolution at runtime
      • AKA: subtyping polymorphism.
      • Inheritance and virtual functions.
    • Static - Resolution at compile-time
      • Function overload - multiple functions with different signatures sharing the same name.
      • Templates (Parametric polymorphism)
  • Polymorphism Binding
    • Early Binding
      • The class method (aka member function) to be called is resolved at compile-time.
    • Late Binding
      • The class method to be called is resolved at runtime, rather than at compile-time. Late binding is only possible with inheritance and member functions marked as virtual.
      • Drawbacks:
        • Performance cost of the indirect call.
        • Compilers generally cannot inline virtual calls, unless they can prove the dynamic type of the object at compile-time (devirtualization).
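
A minimal sketch of the constructor kinds and of late binding described above. The classes Celsius, Samples, Shape and Square are hypothetical names chosen for this example only.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

// Conversion constructor: allows the implicit form `Celsius c = 36.5;`.
struct Celsius {
    Celsius(double v) : value(v) {}
    double value;
};

// Hypothetical container-like class showing the common constructor kinds.
class Samples {
public:
    Samples() = default;                                       // default constructor
    explicit Samples(std::size_t n) : data_(n, 0.0) {}         // explicit: no implicit conversion
    Samples(std::initializer_list<double> xs) : data_(xs) {}   // list initializer constructor
    template <typename Iterator>
    Samples(Iterator first, Iterator last) : data_(first, last) {}  // range constructor
    std::size_t size() const { return data_.size(); }
private:
    std::vector<double> data_;
};

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }  // virtual: may be overridden
    virtual double area() const = 0;                      // pure virtual: must be overridden
    static int dimensions() { return 2; }                 // static: no instance needed
};

struct Square : Shape {
    explicit Square(double side) : side_(side) {}
    std::string name() const override { return "square"; }
    double area() const override { return side_ * side_; }
private:
    double side_;
};

// Late binding: the call is resolved at runtime through the base reference.
std::string describe(const Shape& s) { return s.name(); }
```

With these definitions, describe(Square(2.0)) yields "square" even though the parameter type is Shape, while Shape::dimensions() is bound at compile-time because static member functions are never virtual.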

Linkage

  • External Linkage (Default)
    • Variables and functions are accessible from all compilation units (source files) through the whole program. All global variable and function definitions without the static keyword and outside an anonymous namespace have external linkage.
    • Multiple symbols (variables or functions) cannot have the same name across the program.
  • Internal Linkage
    • Global variables or functions only accessible in the compilation unit (source file) where they are defined. Such variables and functions are annotated with the static keyword (C style) or defined inside an anonymous namespace (preferable in C++).
    • Multiple symbols can have the same name, as long as they are defined in different compilation units.
    • Symbols with default internal linkage:
      • Namespace-scope const objects, constexpr objects and objects annotated with the static keyword.
  • No linkage
    • Local variables in functions or member functions (stack-allocated variables). They are only accessible in the scope where they are defined.
  • References:
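
The three kinds of linkage can be seen side by side in a single translation unit. This is only a sketch; all identifiers are hypothetical.

```cpp
// Sketch of a single translation unit (source file).

// External linkage: visible to other translation units, which can
// declare it with `extern int g_counter;`.
int g_counter = 0;

// Internal linkage, C style: `static` at namespace scope.
static int s_local_counter = 10;

// Internal linkage, C++ style: unnamed (anonymous) namespace.
namespace {
    int helper(int x) { return x + s_local_counter; }
}

// A namespace-scope const object has internal linkage by default ...
const int kBufferSize = 256;
// ... unless it is explicitly declared extern.
extern const int kSharedLimit = 1024;

// No linkage: `total` is a local variable, visible only inside the function.
int accumulate(int x) {
    int total = helper(x) + kBufferSize;
    g_counter += 1;
    return total;
}
```

Another source file could define its own helper or s_local_counter without any clash, but a second definition of g_counter or accumulate would be rejected by the linker.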

Undefined Behavior X Unspecified Behavior

  • Undefined Behavior: The ISO C++ Standard provides no guarantees about the program behavior under a particular condition. It means that anything can happen, such as crashing at runtime, returning an invalid or random value and so on. Undefined behavior should be avoided in order to ensure that the program works with all possible compilers and platforms.
  • Unspecified Behavior
    • The ISO C++ standard allows two or more well-defined behaviors and the implementation chooses one of them, without being required to document the choice. An example is the order of evaluation of function arguments. It is closely related to “implementation-defined behavior”, where the compiler must additionally document the behavior that it chooses.
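
The order of evaluation of function arguments is a classic case where the standard leaves the choice to the compiler. The sketch below uses hypothetical helper functions that record the order in which they run; the program is valid either way, only the recorded order may differ between compilers.

```cpp
#include <string>

// Hypothetical helpers that record the order in which they are called.
std::string& call_log() { static std::string log; return log; }
int first()  { call_log() += 'a'; return 1; }
int second() { call_log() += 'b'; return 2; }
int add(int x, int y) { return x + y; }

// The result of the sum is well defined (3), but the order in which the
// two arguments are evaluated is not fixed by the standard: after the
// call, the log may read "ab" or "ba".
int evaluate() {
    call_log().clear();
    return add(first(), second());
}
```

By contrast, reading past the end of an array or overflowing a signed integer is undefined behavior: no outcome at all is guaranteed, so such code cannot be demonstrated safely.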

Compilation

  • Cross-compilation -> Compiling source code for a processor architecture or operating system different from the one where the compiler runs (the host). Cross-compilation is common for embedded systems, example: compiling a program/app or firmware on Windows / x64 for a 32-bit ARM processor.

ABI - Application Binary Interface and Binary Compatibility

The ABI - Application Binary Interface is a set of specifications about how source code is compiled to object code (machine code). As C++ does not have a standard and stable ABI, it is in general not possible to statically link object code generated by different compilers, or to reuse a shared library without a C interface that was built with a different compiler. Due to these ABI issues, binary reuse of C++ code is difficult; as a result, most C++ code is only reused as source.

The ABI is defined by the compiler and the operating system; it is not specified by the ISO C++ standard. Among other things, the ABI specifies:

  • Class layout: VTable Layout, padding, member function-pointer, RTTI and so on.
  • Exception implementation and exception handling
  • Linkage information
  • Name decoration schema (name mangling)
    • The schema or rules used to encode symbols in a unique way. In C, every symbol in an object code has the same name as the function that it refers to. As the object code must have a unique symbol for every function and C++ supports templates, classes or function overloading, the compiler must generate a unique name for every symbol. This process is called name mangling or name decoration. This name encoding is compiler dependent and one of the sources of ABI incompatibilities.
    • Note: the statement (extern “C”) disables name mangling specifying to the compiler that the function has C-linkage and the function symbol is the same as its name.
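
A small sketch of the difference between mangled C++ symbols and a C-linkage symbol. The function names are hypothetical.

```cpp
// Two overloads: the compiler must encode the parameter types in the
// symbol names so that the linker can tell them apart (name mangling).
int scale(int x) { return 2 * x; }
double scale(double x) { return 2.0 * x; }

// C linkage: the symbol is emitted under the plain name `scale_int`,
// so it can be looked up from C or through an FFI. Functions with
// C linkage cannot be overloaded.
extern "C" int scale_int(int x) { return scale(x); }
```

On compilers that follow the Itanium C++ ABI (GCC and Clang on Unix-like systems), inspecting the object file with a tool such as nm would typically show the two overloads as _Z5scalei and _Z5scaled, and the C-linkage function simply as scale_int. MSVC uses a different decoration scheme.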

Notes:

  • The ABI incompatibility can also happen even between different versions of the same compilers.
  • Due to the ABI problems, it is almost impossible to distribute pre-compiled C++ code as static or shared libraries. As a result, unlike C shared libraries, it is hard to find pre-compiled C++ libraries available as shared libraries.
  • The only way to build binary components with C++ which can be reused by other code in C, C++ or other programming languages via FFI (Foreign-Function Interface) is by defining a C-interface (extern “C”) for all C++ classes and functions.
  • GCC and Clang on Unix-like operating systems follow the Itanium C++ ABI, which mitigates the ABI problem; however, it is not guaranteed by the C++ standard.

References:

ABI - Fragile Base Class or Fragile Binary Interface

  • C++ has the fragile base class problem, which happens when changes in a base class break its ABI, requiring recompilation of all derived classes, client code or third-party code. This issue is especially important for large projects, SDKs (software development kits), libraries or plugin systems where third-party code is dynamically loaded at runtime.

What can keep or break a base class ABI compatibility:

  • DO Changes that do not break the base class ABI: (KDE Guide)
    • Append new non-virtual member functions.
    • Add Enumeration to class.
    • Change the implementation of virtual member functions (overridable methods) without changing their signature (interface).
    • Create new static member functions (static methods)
    • Add new classes
    • Append or remove friend functions
    • Rename class private member variables
  • DONT Changes that break the class ABI and disrupt binary compatibility: (KDE Guide)
    • Change the order of existing virtual member functions
    • Add virtual member function (method) to a class without any virtual member function or virtual base class.
    • Add or remove virtual member functions
    • Addition or removal of member variables
    • Change the order of member variables
    • Change the type of member variables

Techniques for keeping the ABI compatibility:

  • PIMPL - Use the PIMPL (Pointer to implementation) technique for encapsulating the member variables behind an opaque pointer whose implementation is not exposed in the header file. The opaque pointer becomes the only member variable exposed in the header file; as a result, any change to the encapsulated member variables no longer breaks the class ABI.
  • Interface Class - An interface class has only pure virtual member functions, a virtual destructor and no member variables.
  • Extend, but do not modify: do not change interfaces or base classes relied on by external code, libraries or client code. If new functionality is needed, it is better to create a new class extending the base class instead of modifying it, which would break external code relying on it.
  • Prefer composition to inheritance
  • C-interface (extern “C”) with opaque pointer - C-interface or C-wrapper with C-linkage functions and opaque pointers. The classes and functions are not exposed and the client code can only access the library using the C-API or functions with C-linkage. This is the only reliable way to share compiled code between different compilers.
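
The PIMPL and C-interface techniques can be combined, as in the following sketch. The class Counter, the handle type CounterHandle and the counter_* functions are hypothetical; in a real library the three parts marked in the comments would live in separate files.

```cpp
#include <memory>
#include <string>

// ---- counter.hpp (public header): no member variables are exposed ----
class Counter {
public:
    Counter();
    ~Counter();                 // defined where Impl is a complete type
    void increment();
    int value() const;
private:
    struct Impl;                // opaque, only forward-declared here
    std::unique_ptr<Impl> impl_;
};

// ---- counter.cpp (implementation): can change without breaking the ABI ----
struct Counter::Impl {
    int count = 0;
    std::string label = "default";   // fields can be added or removed later
};

Counter::Counter() : impl_(new Impl) {}
Counter::~Counter() = default;
void Counter::increment() { ++impl_->count; }
int Counter::value() const { return impl_->count; }

// ---- C interface: opaque handle plus functions with C linkage ----
extern "C" {
    typedef struct CounterHandle CounterHandle;   // never defined for clients

    CounterHandle* counter_create() {
        return reinterpret_cast<CounterHandle*>(new Counter);
    }
    void counter_increment(CounterHandle* h) {
        reinterpret_cast<Counter*>(h)->increment();
    }
    int counter_value(const CounterHandle* h) {
        return reinterpret_cast<const Counter*>(h)->value();
    }
    void counter_destroy(CounterHandle* h) {
        delete reinterpret_cast<Counter*>(h);
    }
}
```

Client code built with another compiler, or written in another language, would only see CounterHandle* and the four C functions. A production wrapper would also need to stop C++ exceptions from crossing the C boundary, which this sketch omits.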

References:

Processes and Operating Systems

Protection mode, Kernel and User Spaces

Real Mode

  • Old operating systems like Microsoft MS-DOS ran in real mode, which means that any program could access the physical memory (RAM), memory-mapped IO and hardware directly without any restriction. This could result in security and stability problems, as any process could take down the whole operating system. Summary: no separation between kernel and user spaces.

Protected Mode

  • Modern operating systems such as Windows, MacOSX and Linux run in protected mode, which has the kernel space and user space.
  • User Space - Programs running in user space run with less privilege: they are not allowed to execute some CPU machine instructions or to access hardware devices or physical memory directly. Applications in user space can only access a restricted portion of the physical memory assigned by the operating system, through their virtual memory. This protection is enforced both by the operating system and the processor.
  • Kernel Space - Only programs running in kernel space can access the whole physical memory, any process memory and execute all CPU instructions.

Processes and Virtual Memory

Process

  • A unique instance of a running program with its own PID (Process Identifier), address-space, virtual memory and threads. Any application, executable or program can have multiple processes running on the same machine with different states.

Process State (PCB - Process Control Block): every process has the following state.

  • CPU Registers (IP Instruction pointer and stack pointer). A CPU core only has a single IP Instruction pointer. However every process has its own IP pointer because the operating system switches between processes in a very fast way performing context switch, saving and restoring the CPU register for every process giving the illusion that multiple processes are running simultaneously.
  • => PID - Unique Process ID (Identifier) number.
  • => Command line arguments used to start the process.
  • => Current directory.
  • => Environment variables
  • => One or more threads
  • => File descriptors associated with the process.
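
Part of this per-process state can be queried from inside a program. The sketch below assumes a POSIX system (getpid and getcwd come from unistd.h); the struct ProcessInfo and the function query_process_info are hypothetical names.

```cpp
#include <cstdlib>      // std::getenv
#include <string>
#include <unistd.h>     // getpid, getcwd (POSIX)

// Snapshot of a few per-process attributes.
struct ProcessInfo {
    long pid;
    std::string current_dir;
    std::string path_env;
};

ProcessInfo query_process_info() {
    ProcessInfo info;
    info.pid = static_cast<long>(::getpid());

    // Current directory of the process.
    char buffer[4096];
    if (::getcwd(buffer, sizeof(buffer)) != nullptr) {
        info.current_dir = buffer;
    }

    // One environment variable; it may be unset, so check for nullptr.
    const char* path = std::getenv("PATH");
    info.path_env = (path != nullptr) ? path : "";
    return info;
}
```

On Windows the equivalent information is available through Win32 functions such as GetCurrentProcessId and GetCurrentDirectory.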

Virtual Memory

  • Portion of physical memory assigned to a process by the operating system’s kernel. In most operating systems, a process cannot access the physical memory directly; all the memory that it can see and reference is its virtual memory. For instance, the address of a pointer to some variable is not the address of the variable in the physical memory, instead it is the address of the variable in the current process virtual memory.
    • => C++ => Pointers to variables store the numerical value of a virtual memory address. (Note: only for programs that run on operating systems, not valid for firmware.)
    • => The C++ standard does not define whether pointer addresses refer to virtual or physical memory, this behavior is platform-dependent.
  • Physical Address
  • Virtual Address
  • Process Isolation: One of the purposes of the virtual memory is to not allow a user-space process to read the memory of another process.
    • Note: Operating systems provide APIs for reading and writing process memory, otherwise debuggers would not exist.
  • Virtual Memory Segments: Every process, no matter the programming language it was written, has the following memory segments in its virtual memory:
    • Stack segment => Stores stack frames, functions local variables and objects and return addresses.
    • Heap segment (aka free store) => Variables dynamically allocated with the C++ operator new or the C function malloc.
    • Data Segment => Stores initialized and non-initialized global variables.
    • Text Segment => Stores the program machine code that cannot be modified. (read-only)
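
A rough way to observe the segments is to compare the addresses of objects with different storage. The following is only a sketch with hypothetical names; the actual address ranges depend on the platform, and converting a function pointer to an integer is assumed to work as on mainstream compilers.

```cpp
#include <cstdint>
#include <memory>

int g_initialized = 42;          // data segment
int g_uninitialized;             // zero-initialized data (often called BSS)

int answer() { return 42; }      // its machine code lives in the text segment

struct Addresses {
    std::uintptr_t stack, heap, data, text;
};

Addresses sample_addresses() {
    int local = 0;                              // stack
    std::unique_ptr<int> dynamic(new int(7));   // heap (free store)
    Addresses a;
    a.stack = reinterpret_cast<std::uintptr_t>(&local);
    a.heap  = reinterpret_cast<std::uintptr_t>(dynamic.get());
    a.data  = reinterpret_cast<std::uintptr_t>(&g_initialized);
    a.text  = reinterpret_cast<std::uintptr_t>(&answer);
    return a;
}
```

Printing the four values in hexadecimal usually shows them clustered in clearly separated regions of the process address space.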

Other Virtual Memory Segments

  • Memory Mapped Files (Inter process communication)
    • Allows a disk file to be mapped into the virtual memory and accessed just like ordinary memory through pointer manipulation. This segment can be mapped into the virtual memory of many processes without incurring copying overhead.
  • Shared Memory - allows processes to share data without copying.
  • Dynamic Library Loading (DLLs)
  • Thread Stack
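
As an illustration of memory mapped files, the sketch below writes a string to a temporary file, maps the file with the POSIX function mmap and reads the content back through a plain pointer. It assumes a POSIX system with a writable /tmp directory; the function name and the file name pattern are hypothetical.

```cpp
#include <cstdlib>      // mkstemp (POSIX)
#include <string>
#include <sys/mman.h>   // mmap, munmap
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // write, close, unlink

// Writes `text` to a temporary file, maps the file into memory and
// reads it back through a pointer. Returns an empty string on error.
std::string roundtrip_through_mmap(const std::string& text) {
    if (text.empty()) return "";
    char path[] = "/tmp/mmap_demo_XXXXXX";    // hypothetical location
    int fd = ::mkstemp(path);
    if (fd < 0) return "";
    ::unlink(path);                           // the file disappears once closed

    std::string result;
    const ssize_t written = ::write(fd, text.data(), text.size());
    if (written == static_cast<ssize_t>(text.size())) {
        void* addr = ::mmap(nullptr, text.size(), PROT_READ, MAP_PRIVATE, fd, 0);
        if (addr != MAP_FAILED) {
            // The file content is now reachable as ordinary memory.
            result.assign(static_cast<const char*>(addr), text.size());
            ::munmap(addr, text.size());
        }
    }
    ::close(fd);
    return result;
}
```

Mapping the same file with MAP_SHARED in several processes is one way to implement shared memory between them.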

System Calls and Operating Systems C-APIs

Operating System APIs - Most operating systems are written in C and processor-specific assembly. Their APIs (Application Programming Interfaces) and services are exposed in C language, this API can be:

  • System Calls
    • => Documented on Linux, BSD, etc. Undocumented on Windows. Note: Linux has a fixed number for every system call, which is documented and stable. On Windows, the system call numbers may change on every release, so it is only safe to rely on the Win32 API encapsulating them.
  • Basic C APIs that encapsulate system calls. Some of those APIs are:
    • Win32 API - Windows API
    • POSIX API - Standardized UNIX API shared by most Unix-like operating systems, Linux, BSD, MacOSX and so on.
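
The difference between a raw system call and the C API that wraps it can be sketched as follows. This example is Linux-specific (it relies on syscall and SYS_getpid from the C library headers) and the function names are hypothetical.

```cpp
#include <sys/syscall.h>  // SYS_getpid (Linux)
#include <unistd.h>       // getpid, syscall

// C library wrapper (POSIX API): portable across Unix-like systems.
long pid_via_wrapper() { return static_cast<long>(::getpid()); }

// Raw system call by number: Linux-specific, shown only for comparison.
long pid_via_syscall() { return ::syscall(SYS_getpid); }
```

Both functions return the same process ID; application code should normally prefer the wrapper, which hides the system call number and the calling convention.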

References

Common Acronyms, abbreviations and technologies

  • Note: Technical standards aren’t laws, they are specifications, recommendations for standardization and set of good practices.
| Acronym, name or technology | Description |
|---+---|
| Organizations | |
| ANSI | American National Standards Institute |
| NIST | National Institute of Standards and Technology |
| ISO | International Organization for Standardization |
| IEEE | Institute of Electrical and Electronics Engineers |
| IEC | International Electrotechnical Commission |
| CERN | European Organization for Nuclear Research |
| MISRA | Motor Industry Software Reliability Association |
| Technical Standards | |
| ISO/IEC 14882 - C++ | C++ Programming Language Standard and Specification used by most compiler vendors. |
| ISO/IEC 14882:2003 | C++03 Standard |
| ISO/IEC 14882:2011 | C++11 Standard |
| ISO/IEC 14882:2014 | C++14 Standard |
| ISO/IEC 14882:2017 | C++17 Standard |
| ANSI X3.159-1989 | C-89 - C programming language standard |
| ISO/IEC 9899:1990 | C-90 standard |
| ISO/IEC 9899:1999 | C-99 standard |
| IEEE 754 | Floating Point technical standard |
| ISO 8601 | Date and time standard widely used on computers and internationalization. |
| Technical Standards for Embedded Systems | |
| IEC 61508 | Standard for functional safety of Electrical/Electronic/Programmable Safety-Related Systems |
| ISO 26262 | IEC 61508 applied to automotives up to 3.5 tons - comprises electronic/electrical safety (includes firmware) |
| IEC 62304 | International standard for medical device software life cycle |
| General - C++ | |
| CPP | C++ Programming Language |
| TMP | Template Meta Programming |
| STL | Standard Template Library |
| ODR | One Definition Rule |
| ADL | Argument Dependent Lookup |
| ASM | Assembly |
| GP | Generic Programming |
| CTOR | Constructor |
| DTOR | Destructor |
| RAII | Resource Acquisition Is Initialization |
| SFINAE | Substitution Failure Is Not An Error |
| RVO | Return Value Optimization |
| ET | Expression Template |
| CRTP | Curiously Recurring Template Pattern |
| PIMPL | Pointer to Implementation |
| RTTI | Runtime Type Identification |
| MSVC | Microsoft Visual C++ Compiler |
| VC++ | Microsoft Visual C++ Compiler |
| AST | Abstract Syntax Tree |
| RPC | Remote Procedure Call |
| rhs | right-hand side |
| lhs | left-hand side |
| Operating Systems Technologies | |
| IPC | Interprocess Communication |
| COM | Component Object Model - (Microsoft Technology) |
| OLE | Object Linking and Embedding (Windows/COM) |
| IDL | Interface Description Language |
| MIDL | Microsoft Interface Definition Language - used for creating COM components |
| DDE | Dynamic Data Exchange - Windows shared memory protocol |
| RTD | Real Time Data (Excel) |
| UNIX-like | Any operating system based on or similar to UNIX (The Open Group trademark) such as Linux, Android, BSD, MacOSX, iOS, QNX. |
| BLOB | Binary Large Object |
| GOF (Gang of Four) | Book: Design Patterns: Elements of Reusable Object-Oriented Software |
| POSIX | Portable Operating System Interface (POSIX) |
| Network Protocols | |
| RFC | Request for Comments - documents published by the IETF (Internet Engineering Task Force) |
| ARP | Address Resolution Protocol |
| DHCP | Dynamic Host Configuration Protocol |
| IP | Internet Protocol (Sockets) |
| TCP | Transmission Control Protocol (Sockets) |
| UDP | User Datagram Protocol |
| DNS (UDP Protocol) | Domain Name System |
| ICMP (ping) | Internet Control Message Protocol - Ping Protocol |
| HTTP | Hypertext Transfer Protocol |
| FTP | File Transfer Protocol |
| Modbus | Network protocol used by PLCs |
| CAN Bus (not TCP/IP) | Controller Area Network - distributed network used in cars and embedded systems. |
| Executable Binary Formats | |
| PE, PE32 and PE64 | Portable Executable format - Windows object code format. |
| ELF, ELF32 and ELF64 | Executable and Linkable Format - [U]-nix object code format. |
| Mach-O | Binary format for executables and shared libraries used by the operating systems iOS and OSX. |
| DLL | Dynamic-Link Library - Windows shared library format. |
| SO | Shared Object - [U]-nix, Linux, BSD, AIX, Solaris shared library format. |
| DSO | Dynamic Shared Object, [U]-nix shared library format. |
| Cryptography | |
| HMAC | Keyed-Hash Message Authentication Code |
| MAC | Message Authentication Code |
| AES | Advanced Encryption Standard |
| Crypto Hash Functions | |
| MD5 | |
| SHA1 | |
| SHA256 | |
| Processor Architectures | |
| CISC | Complex Instruction Set Computer |
| RISC | Reduced Instruction Set Computer |
| SIMD | Single Instruction, Multiple Data |
| Harvard Architecture | Used mostly in DSPs, Microcontrollers and embedded systems. |
| Von Neumann Architecture | Used mostly in conventional processors. |
| IBM-PC Architecture Components | |
| BIOS | Basic Input/Output System - Firmware used to initialize and load OS in IBM-PC arch. |
| UEFI | Unified Extensible Firmware Interface - BIOS replacement on new computers. |
| DMA | Direct Memory Access |
| MMU | Memory Management Unit - Hardware that translates virtual memory addresses to physical memory addresses. |
| PCI | Peripheral Component Interconnect - BUS used in IBM PCs |
| NIC | Network Interface Controller/Card |
| RAID (storage) | Redundant Array of Independent Disks |
| Hardware and processors | |
| CPU | Central Processing Unit |
| MPU | Micro Processor Unit |
| FPU | Floating Point Unit |
| DSP | Digital Signal Processor |
| MCU | Microcontroller Unit |
| SOC | System On Chip |
| GPU | Graphics Processing Unit |
| FPGA | Field Programmable Gate Array |
| ASIC | Application-Specific Integrated Circuit |
| ECU | Engine Control Unit or Electronic Control Unit - Car’s embedded computer. |
| Peripherals | |
| RAM | Random Access Memory |
| ROM | Read-Only Memory |
| EPROM | Erasable Programmable Read-only Memory |
| EEPROM | Electrically Erasable Programmable Read-Only Memory |
| GPIO | General Purpose IO |
| ADC | Analog to Digital Converter |
| DAC | Digital to Analog Converter |
| PWM | Pulse Width Modulation |
| Serial interface I2C | |
| Serial interface SPI | Serial Peripheral Interface |
| Serial interface UART | Serial communication similar to the old computer serial interface RS232 |
| Serial interface Ethernet | |
| CAN bus | Controller Area Network - Widely used BUS in the automotive industry. |
| DSI | Display Serial Interface |
| MEMS | Microelectromechanical Systems - mechanical sensors implemented in silicon chips. |

Bits, bytes, sizes, endianness and numerical ranges

Bits in a byte by position

 MSB                                LSB 
(Most significant bit)         (Least significant bit)
   |                                  |
 | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |         Bit Decimal    Bit shift    Multiplier
 +----|----+----+----+----+----+----+----+           Value         Operation   DEC   HEX
   |    |    |    |    |   |     |     |             ........     ........     ........
   |    |    |    |    |   |     |     \---------->> b0 x 2^0   =  b0 << 0     1    0x01
   |    |    |    |    |   |      \--------------->> b1 x 2^1   =  b1 << 1     2    0x02 
   |    |    |    |    |    \--------------------->> b2 x 2^2   =  b2 << 2     4    0x04
   |    |    |    |     \------------------------->> b3 x 2^3   =  b3 << 3     8    0x08 
   |    |    |    \------------------------------->> b4 x 2^4   =  b4 << 4    16    0x10 
   |    |    \------------------------------------>> b5 x 2^5   =  b5 << 5    32    0x20 
   |    \----------------------------------------->> b6 x 2^6   =  b6 << 6    64    0x40
   \---------------------------------------------->> b7 x 2^7   =  b7 << 7   128    0x80

Example:

Binary number: 0b10100111 = 0b1010.0111 = 167 = 0xA7
1010 => Upper nibble in the hex table is equal to 'A'
0111 => Lower nibble in the hex table is equal to '7'

| b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |  
+----+----+----+----+----+----+----+----+
| 1  |  0 | 1  | 0  | 0  | 1  | 1  |  1 |

Decimal Value of a Bit of Order N = 2^N

Decimal Value of 0xA7 = Σ b[i] x 2^i 
              = b0 x 2^0 + b1 x 2^1 + b2 x 2^2 + b3 x 2^3 + b4 x 2^4 + b5 x 2^5 + b6 x 2^6 + b7 x 2^7
              = 1  x 2^0 + 1  x 2^1 + 1  x 2^2 + 0  x 2^3 + 0  x 2^4 +  1 x 2^5 + 0  x 2^6 + 1  x 2^7
              = 1  x 1   + 1  x 2   + 1  x 4   + 0  x 8   + 0  x 16  +  1 x 32  + 0  x 64  + 1  x 128
              = 1 + 2 + 4 + 0 + 0 + 32 + 0 + 128
              = 167 OK

Decimal Value of 0xA7 = Σ H[i] x 16^i  where H[i] is a hexadecimal digit 
                      = 16^0 * 7  + A * 16^1  
                      = 1 * 7     + 10 * 16  
                      = 7         + 160 
                      = 167 OK
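
The same positional arithmetic can be written in code. This is a minimal sketch; the function names are hypothetical.

```cpp
#include <cstdint>

// bits[i] holds bit b_i (0 or 1), so bits[0] is the least significant bit.
unsigned value_from_bits(const int (&bits)[8]) {
    unsigned value = 0;
    for (int i = 0; i < 8; ++i) {
        value += static_cast<unsigned>(bits[i]) << i;   // b_i x 2^i
    }
    return value;
}

// Splits a byte into its upper and lower nibble (one hexadecimal digit each).
constexpr unsigned upper_nibble(std::uint8_t byte) { return byte >> 4; }
constexpr unsigned lower_nibble(std::uint8_t byte) { return byte & 0x0F; }

static_assert(0xA7 == 167, "hex and decimal literals denote the same value");
static_assert(upper_nibble(0xA7) == 0xA && lower_nibble(0xA7) == 0x7,
              "nibbles of 0xA7");
```

Calling value_from_bits with the bits {1, 1, 1, 0, 0, 1, 0, 1} (b0 first) reproduces the worked example and returns 167.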

Decimal - Hexadecimal and Binary Conversion Table

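| Decimal | Hexadecimal | Binary |
| Base 10 | Base 16 | Base 2 |
|---+---+---|
| 0 | 0 | 0000 |
| 1 | 1 | 0001 |
| 2 | 2 | 0010 |
| 3 | 3 | 0011 |
| 4 | 4 | 0100 |
| 5 | 5 | 0101 |
| 6 | 6 | 0110 |
| 7 | 7 | 0111 |
| 8 | 8 | 1000 |
| 9 | 9 | 1001 |
| 10 | A | 1010 |
| 11 | B | 1011 |
| 12 | C | 1100 |
| 13 | D | 1101 |
| 14 | E | 1110 |
| 15 | F | 1111 |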

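Bits and byte bitmask by position

| Bit N | Binary | Decimal | Hex |
|---+---+---+---|
| 0 | 0b0000.0001 | 1 | 0x01 |
| 1 | 0b0000.0010 | 2 | 0x02 |
| 2 | 0b0000.0100 | 4 | 0x04 |
| 3 | 0b0000.1000 | 8 | 0x08 |
| 4 | 0b0001.0000 | 16 | 0x10 |
| 5 | 0b0010.0000 | 32 | 0x20 |
| 6 | 0b0100.0000 | 64 | 0x40 |
| 7 | 0b1000.0000 | 128 | 0x80 |
| All bits set | 0b1111.1111 | 255 | 0xFF |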
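
Each mask in the table is the value 1 shifted left by the bit position. The following sketch shows how such masks are typically used to test, set and clear a single bit; the helper names are hypothetical.

```cpp
#include <cstdint>

// Mask with only bit n set (n in 0..7), for instance bit_mask(4) == 0x10.
constexpr std::uint8_t bit_mask(unsigned n) {
    return static_cast<std::uint8_t>(1u << n);
}

// True if bit n of value is 1.
constexpr bool is_set(std::uint8_t value, unsigned n) {
    return (value & bit_mask(n)) != 0;
}

// Returns value with bit n forced to 1.
constexpr std::uint8_t set_bit(std::uint8_t value, unsigned n) {
    return static_cast<std::uint8_t>(value | bit_mask(n));
}

// Returns value with bit n forced to 0.
constexpr std::uint8_t clear_bit(std::uint8_t value, unsigned n) {
    return static_cast<std::uint8_t>(value & ~bit_mask(n));
}

static_assert(bit_mask(0) == 0x01 && bit_mask(7) == 0x80, "masks match the table");
```

The explicit casts back to std::uint8_t are needed because the bitwise operators promote their operands to int.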

Binary - Octal Conversion

| Octal | Binary |
| Base 8 | Base 2 |
|---+---|
| 0 | 000 |
| 1 | 001 |
| 2 | 010 |
| 3 | 011 |
| 4 | 100 |
| 5 | 101 |
| 6 | 110 |
| 7 | 111 |

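ASCII Table - Character Encoding

ASCII Table

Special Characters and New Line Character(s)

| Dec | Hex | Char | Dec | Hex | Char | Dec | Hex | Char | Dec | Hex | Char |
|---+---+---+---+---+---+---+---+---+---+---+---|
| 0 | 00 | NUL ‘\0’ | 32 | 20 | SPACE | 64 | 40 | @ | 96 | 60 | ` |
| 1 | 01 | SOH | 33 | 21 | ! | 65 | 41 | A | 97 | 61 | a |
| 2 | 02 | STX | 34 | 22 | " | 66 | 42 | B | 98 | 62 | b |
| 3 | 03 | ETX | 35 | 23 | # | 67 | 43 | C | 99 | 63 | c |
| 4 | 04 | EOT | 36 | 24 | $ | 68 | 44 | D | 100 | 64 | d |
| 5 | 05 | ENQ | 37 | 25 | % | 69 | 45 | E | 101 | 65 | e |
| 6 | 06 | ACK | 38 | 26 | & | 70 | 46 | F | 102 | 66 | f |
| 7 | 07 | BEL ‘\a’ | 39 | 27 | ' | 71 | 47 | G | 103 | 67 | g |
| 8 | 08 | BS ‘\b’ | 40 | 28 | ( | 72 | 48 | H | 104 | 68 | h |
| 9 | 09 | HT ‘\t’ | 41 | 29 | ) | 73 | 49 | I | 105 | 69 | i |
| 10 | 0A | LF ‘\n’ | 42 | 2A | * | 74 | 4A | J | 106 | 6A | j |
| 11 | 0B | VT ‘\v’ | 43 | 2B | + | 75 | 4B | K | 107 | 6B | k |
| 12 | 0C | FF ‘\f’ | 44 | 2C | , | 76 | 4C | L | 108 | 6C | l |
| 13 | 0D | CR ‘\r’ | 45 | 2D | - | 77 | 4D | M | 109 | 6D | m |
| 14 | 0E | SO | 46 | 2E | . | 78 | 4E | N | 110 | 6E | n |
| 15 | 0F | SI | 47 | 2F | / | 79 | 4F | O | 111 | 6F | o |
| 16 | 10 | DLE | 48 | 30 | 0 | 80 | 50 | P | 112 | 70 | p |
| 17 | 11 | DC1 | 49 | 31 | 1 | 81 | 51 | Q | 113 | 71 | q |
| 18 | 12 | DC2 | 50 | 32 | 2 | 82 | 52 | R | 114 | 72 | r |
| 19 | 13 | DC3 | 51 | 33 | 3 | 83 | 53 | S | 115 | 73 | s |
| 20 | 14 | DC4 | 52 | 34 | 4 | 84 | 54 | T | 116 | 74 | t |
| 21 | 15 | NAK | 53 | 35 | 5 | 85 | 55 | U | 117 | 75 | u |
| 22 | 16 | SYN | 54 | 36 | 6 | 86 | 56 | V | 118 | 76 | v |
| 23 | 17 | ETB | 55 | 37 | 7 | 87 | 57 | W | 119 | 77 | w |
| 24 | 18 | CAN | 56 | 38 | 8 | 88 | 58 | X | 120 | 78 | x |
| 25 | 19 | EM | 57 | 39 | 9 | 89 | 59 | Y | 121 | 79 | y |
| 26 | 1A | SUB | 58 | 3A | : | 90 | 5A | Z | 122 | 7A | z |
| 27 | 1B | ESC | 59 | 3B | ; | 91 | 5B | [ | 123 | 7B | { |
| 28 | 1C | FS | 60 | 3C | < | 92 | 5C | \ | 124 | 7C | \vert |
| 29 | 1D | GS | 61 | 3D | = | 93 | 5D | ] | 125 | 7D | } |
| 30 | 1E | RS | 62 | 3E | > | 94 | 5E | ^ | 126 | 7E | ~ |
| 31 | 1F | US | 63 | 3F | ? | 95 | 5F | _ | 127 | 7F | DEL |

| Char | Caret Notation | Name | Hex | Dec | Observation |
|---+---+---+---+---+---|
| ‘\0’ | | Null character | 0x00 | 0 | - |
| ‘\t’ | | Tab | 0x09 | 9 | - |
| ’ ’ | | Space | 0x20 | 32 | - |
| ‘\r’ | ^M | (CR) - Carriage Return | 0x0D | 13 | Line separator for text files on classic Mac OS (before MacOSX) |
| ‘\n’ | ^J | (LF) - Line Feed | 0x0A | 10 | Line separator for text files on most Unix-like OSes, Linux and MacOSX |
| ‘\r\n’ | ^M^J | (CR-LF) | - | - | Line separator for text files on Windows |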
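
The character codes from the tables can be checked at compile-time, and the three line separator conventions can be unified with a small function. This is only a sketch: the function normalize_newlines is a hypothetical helper, and the static assertions assume an ASCII-compatible execution character set.

```cpp
#include <cstddef>
#include <string>

static_assert('\n' == 0x0A && '\r' == 0x0D && '\t' == 0x09 && ' ' == 0x20,
              "assumes an ASCII-compatible execution character set");
static_assert('A' == 65 && 'a' == 97 && '0' == 48,
              "letters and digits follow the ASCII table");

// Converts Windows (CR-LF) and lone CR line separators to LF.
std::string normalize_newlines(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            // Skip the LF of a CR-LF pair so that it yields one '\n'.
            if (i + 1 < text.size() && text[i + 1] == '\n') ++i;
            out += '\n';
        } else {
            out += text[i];
        }
    }
    return out;
}
```

For instance, normalize_newlines("a\r\nb\rc\n") returns "a\nb\nc\n".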

Bits, bytes, Megabytes and information units

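| Unit | In bits | In bytes | In Kbytes | In Megabytes | In Gigabytes |
|---+---+---+---+---+---|
| bit | 1 | - | - | - | - |
| byte | 8 | 1 | - | - | - |
| Kbyte (KB) | 1024 x 8 | 1024 | 1 | - | - |
| Megabyte (MB) | 1024 x 1024 x 8 | 1024 x 1024 | 1024 | 1 | - |
| Gigabyte (GB) | - | - | - | 1024 | 1 |
| Terabyte (TB) | - | - | - | - | 1024 |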

Summary:

  • Basic unit: 1 bit = (0 or 1), (True or False), (On or Off)
  • 1 Nibble = 4 bits
  • 1 byte = 8 bits
  • 1 KB (Kbyte) = 1024 bytes
  • 1 MB (Mega byte) = 1024 Kbytes
  • 1 GB (Giga byte) = 1024 Mega bytes
  • 1 TB (Tera byte) = 1024 Giga bytes
  • 1 PB (Peta byte) = 1024 Tera bytes
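
The summary above can be written down as compile-time constants. A minimal sketch, assuming the 1024-based convention of this section and hypothetical names (KB, MB, GB, TB and bytesToBits are illustrative, not taken from any library):

```cpp
#include <cstdint>

// Sizes in bytes, following the 1024-based convention of the summary.
constexpr std::uint64_t KB = 1024;
constexpr std::uint64_t MB = 1024 * KB;
constexpr std::uint64_t GB = 1024 * MB;
constexpr std::uint64_t TB = 1024 * GB;

// Hypothetical helper: number of bits in a given number of bytes.
constexpr std::uint64_t bytesToBits(std::uint64_t bytes)
{
    return bytes * 8;
}

static_assert(MB == 1048576, "1 MB = 1024 x 1024 bytes");
static_assert(bytesToBits(KB) == 8192, "1 KB = 1024 x 8 bits");
```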

Bit Manipulation for Low Level and Embedded Systems

The following bit manipulation idioms are widely used in legacy C code, embedded systems code, device driver code or for manipulating arbitrary bits of some variable:

Memory Mapped IO

The following code simulates memory mapped IO (MMIO) in an embedded system (a microcontroller), more specifically an 8-bit GPIO (General Purpose IO) port, a digital IO located at the fixed address 0xFF385A (defined in the device’s datasheet or memory map). Setting the first bit (bit 0) of this IO device turns ON the LED attached to the first pin; clearing this bit turns the LED off.

  • volatile keyword => Tells the compiler that the variable can change at any time outside the program's control, so reads and writes to it must not be optimized away.
  • reinterpret_cast => Memory reinterpretation cast: indicates that the memory at address 0xFF385A is being reinterpreted as an 8-bit unsigned integer.
  • constexpr => Compile-time constant. It typically needs no storage and costs no program memory (ROM, flash) space: the value of GPRIOA_ADDRESS is substituted where it is used.
  • The hypothetical program (firmware) runs without any operating system, therefore, it has access to all physical memory.
#include <cstdint> 

// Address taken from the device's datasheet supplied by the manufacturer. 
constexpr std::uintptr_t GPRIOA_ADDRESS = 0xFF385A;
 
// Access memory mapped IO register at 0xFF385A using a pointer. 
// The pointer itself is const; the register it points to is volatile and writable. 
volatile std::uint8_t* const pGPRIOA = reinterpret_cast<volatile std::uint8_t*>(GPRIOA_ADDRESS);    

// Access memory mapped IO register at 0xFF385A using a reference. 
volatile std::uint8_t& GPRIOA = *reinterpret_cast<volatile std::uint8_t*>(GPRIOA_ADDRESS);    
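
On a desktop operating system the address 0xFF385A is not mapped into the process, so dereferencing it would most likely crash the program. To try the idioms of the next sections on a PC, the register can be replaced by an ordinary variable. A minimal sketch under that assumption; the names fakeRegisterStorage, FAKE_GPRIOA and setLed0 are hypothetical:

```cpp
#include <cstdint>

// Stand-in for the hardware register (assumption): an ordinary variable
// plays the role of the device, since 0xFF385A is not mapped on a desktop OS.
static std::uint8_t fakeRegisterStorage = 0x00;

// Same access pattern as the memory mapped version, bound to the stand-in.
volatile std::uint8_t& FAKE_GPRIOA = fakeRegisterStorage;

// Turn the LED of pin 0 on or off by writing bit 0 of the register.
void setLed0(bool on)
{
    if (on) {
        FAKE_GPRIOA = static_cast<std::uint8_t>(FAKE_GPRIOA | 0x01);
    } else {
        FAKE_GPRIOA = static_cast<std::uint8_t>(FAKE_GPRIOA & 0xFE);
    }
}
```

The verbose read-modify-write form is used here because compound assignment on volatile objects was deprecated in C++20.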

Bitwise Operators Reminder

(|) => X_or_Y  = X | Y;  => bitwise OR 
(&) => X_and_Y = X & Y;  => bitwise AND 
(^) => X_xor_Y = X ^ Y;  => bitwise XOR 
(~) => not_X   = ~X;     => bitwise NOT => Invert all bits 

Left shift => bitshift Operator: 
X << Y = X * 2^Y    => Shift Y bits to the left (unsigned X, as long as the result fits in the type).

Right shift => bitshift Operator:
X >> Y = X / 2^Y    => Shift Y bits to the right (integer division, unsigned X).
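
A small worked example of these operators, checked at compile time. This is only an illustrative sketch: the values 0x0C and 0x0A are arbitrary choices.

```cpp
#include <cstdint>

constexpr std::uint8_t X = 0x0C;   // 0000 1100
constexpr std::uint8_t Y = 0x0A;   // 0000 1010

static_assert((X | Y) == 0x0E, "OR  => 0000 1110");
static_assert((X & Y) == 0x08, "AND => 0000 1000");
static_assert((X ^ Y) == 0x06, "XOR => 0000 0110");

// Operands are promoted to int, so the result of ~ is cut back to 8 bits.
static_assert(static_cast<std::uint8_t>(~X) == 0xF3, "NOT => 1111 0011");

static_assert((3u << 4) == 48u, "3 * 2^4 = 48");
static_assert((48u >> 4) == 3u, "48 / 2^4 = 3");
```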

Read/Get the N-th bit

bit_value = (GPRIOA >> N) & 0x01;

// Check if bit 4 is set 
if((GPRIOA >> 4) & 0x01)
{ 
  ... 
}

// Check if bit 0 is set (the extra parentheses matter: == binds tighter than &) 
if(((GPRIOA >> 0) & 0x01) == 1)
{ 
   ... 
}

// Check if bit 6 is set 
if(((GPRIOA >> 6) & 0x01) == 1)
{ 
   ... 
}
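
The read idiom can be wrapped in a small function. A minimal sketch, assuming a hypothetical helper named isBitSet (not part of this reference card). Since == binds tighter than &, the masked value is parenthesized before the comparison:

```cpp
#include <cstdint>

// Hypothetical helper: true when bit n (0 to 7) of value is set.
constexpr bool isBitSet(std::uint8_t value, unsigned int n)
{
    return ((value >> n) & 0x01) == 1;
}

static_assert(isBitSet(0x10, 4), "0x10 has bit 4 set");
static_assert(!isBitSet(0x10, 0), "0x10 has bit 0 cleared");
```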

Setting the Nth-bit

Set the N-th bit (turn bit into 1) of a general variable:

// Verbose way 
<VARIABLE> = <VARIABLE> | (1 << N); 
// Short way 
<VARIABLE> |= (1 << N); 

Set bit 4 (turns on the LED attached to that pin in this case):

// Verbose way 
GPRIOA = GPRIOA | (1 << 4); 
// Short way 
GPRIOA |= 1 << 4; 

Clear the Nth-bit

Clear the N-th bit (turn the bit into zero) of a general variable:

// Verbose way 
<VARIABLE> = <VARIABLE> & ~(1 << N); 
// Short way 
<VARIABLE> &= ~(1 << N); 

Clear bit 5 (turns off the LED attached to that pin in this case):

// Verbose way 
GPRIOA = GPRIOA & ~(1 << 5); 
// Short way 
GPRIOA &= ~(1 << 5); 

Analysis:

         Bitshift operation 
         1 << 5 = 2^5 = 32 = 0x20 = 0b00100000

                      B7  B6  B5  B4  B3  B2  B1  B0   BITS 
                      --------------------------------  
           1 << 5  =>  0   0   1   0   0   0   0   0    => Equivalent value to 1 << 5 
         ~(1 << 5) =>  1   1   0   1   1   1   1   1    => Invert all bits of (1 << 5)
          GPRIOA   =>  b7  b6  b5  b4  b3  b2  b1  b0   => Bits of GPRIOA 
                   -----------------------------------
GPRIOA & ~(1 << 5) =>  b7  b6  0   b4  b3  b2  b1  b0   => Result of AND (&) bitwise operation 

Invert all bits

VARIABLE = ~VARIABLE; 

Invert all bits of GPRIOA:

GPRIOA = ~GPRIOA;

Toggle the Nth-bit

Toggle operation: if the bit is 1, turn it into 0, if it is 0, turn it into 1.

// Verbose way 
<VARIABLE> = <VARIABLE> ^ (1 << N); 
// Short way 
<VARIABLE> ^= (1 << N); 

Toggle bit 6 of the GPRIOA register:

// Verbose 
GPRIOA = GPRIOA ^ (1 << 6); 
// Short 
GPRIOA ^= (1 << 6); 
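
The set, clear and toggle idioms can be collected into small functions. A minimal sketch, assuming hypothetical helper names (setBit, clearBit and toggleBit are not part of this reference card) that work on a plain 8-bit value instead of the hardware register:

```cpp
#include <cstdint>

// Hypothetical helpers on an 8-bit value; n is the bit position (0 to 7).
constexpr std::uint8_t setBit(std::uint8_t v, unsigned int n)
{
    return static_cast<std::uint8_t>(v | (1u << n));
}

constexpr std::uint8_t clearBit(std::uint8_t v, unsigned int n)
{
    return static_cast<std::uint8_t>(v & ~(1u << n));
}

constexpr std::uint8_t toggleBit(std::uint8_t v, unsigned int n)
{
    return static_cast<std::uint8_t>(v ^ (1u << n));
}

static_assert(setBit(0x00, 4) == 0x10, "set bit 4");
static_assert(clearBit(0xFF, 5) == 0xDF, "clear bit 5");
static_assert(toggleBit(0x00, 6) == 0x40, "toggle bit 6 on");
static_assert(toggleBit(0x40, 6) == 0x00, "toggle bit 6 off");
```

Returning the new value instead of modifying the argument keeps the helpers usable in constant expressions; for a real register the result would be written back, as in REG = setBit(REG, 4).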

Numerical Ranges for Unsigned Integers of N bits

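| N bits | Min | Max                                        | Max in Hexadecimal | Number of values |
|--------+-----+--------------------------------------------+--------------------+------------------|
| 8      | 0   | 255                                        | 0x00FF             | 256              |
| 10     | 0   | 1023                                       | 0x03FF             | 1024             |
| 12     | 0   | 4095                                       | 0x0FFF             | 4096             |
| 16     | 0   | 65535                                      | 0xFFFF             | 65536            |
| 32     | 0   | 4294967295 (~ 4.29E9, about 4.29 billion)  | 0xFFFFFFFF         | 2^32             |
| 64     | 0   | 18446744073709551615 (~ 1.84E19)           | 0xFFFFFFFFFFFFFFFF | 2^64             |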

Formula:

Maximum Unsigned Number of N bits = 2^N - 1 
Max Unsigned 8 bits  =  2^8 - 1  = 256 - 1  = 255 
Max Unsigned 10 bits =  2^10 - 1 = 1024 - 1 = 1023 
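
The maximum of 2^N - 1 can be checked against the limits reported by the standard library. A minimal sketch, assuming a hypothetical helper named maxUnsigned (not part of this reference card):

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: largest unsigned value representable in n bits (1 <= n <= 64).
constexpr std::uint64_t maxUnsigned(unsigned int n)
{
    // Shifting a 64-bit value by 64 is undefined, so n = 64 is handled apart.
    return n >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << n) - 1;
}

static_assert(maxUnsigned(8)  == 255,  "2^8 - 1");
static_assert(maxUnsigned(10) == 1023, "2^10 - 1");
static_assert(maxUnsigned(16) == std::numeric_limits<std::uint16_t>::max(), "2^16 - 1");
static_assert(maxUnsigned(32) == std::numeric_limits<std::uint32_t>::max(), "2^32 - 1");
static_assert(maxUnsigned(64) == std::numeric_limits<std::uint64_t>::max(), "2^64 - 1");
```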

Numerical Ranges for Signed Integers of N bits

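| N bits | Min                                | Max                                |
|--------+------------------------------------+------------------------------------|
| 8      | -128                               | 127                                |
| 10     | -512                               | 511                                |
| 12     | -2048                              | 2047                               |
| 16     | -32768                             | 32767                              |
| 32     | -2147483648                        | +2147483647                        |
| 64     | -9223372036854775808 (~ -9.22E18)  | +9223372036854775807 (~ 9.22E18)   |

minNumberOfNbits = -2^(N - 1)
maxNumberOfNbits = 2^(N - 1) - 1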

minNumberOfNbits[N = 8] = -2^(8 - 1)    = -2^7     = -128 
maxNumberOfNbits[N = 8] =  2^(8 - 1) - 1 = 2^7 - 1 = +127

minNumberOfNbits[N = 16] = -2^(16 - 1)    = -2^15     = -32768
maxNumberOfNbits[N = 16] =  2^(16 - 1) - 1 = 2^15 - 1 = +32767
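
Both formulas can be checked against std::numeric_limits for the fixed-width types, which use two's complement representation. A minimal sketch, assuming hypothetical helper names minSigned and maxSigned (not part of this reference card):

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: largest two's complement value of n bits (2 <= n <= 64).
constexpr std::int64_t maxSigned(unsigned int n)
{
    // 2^(n - 1) - 1, computed in unsigned arithmetic to avoid signed overflow.
    return static_cast<std::int64_t>((std::uint64_t{1} << (n - 1)) - 1);
}

// Hypothetical helper: smallest two's complement value of n bits (2 <= n <= 64).
constexpr std::int64_t minSigned(unsigned int n)
{
    // -2^(n - 1) is one below the negated maximum.
    return -maxSigned(n) - 1;
}

static_assert(minSigned(8) == -128 && maxSigned(8) == 127, "8 bits");
static_assert(minSigned(16) == std::numeric_limits<std::int16_t>::min(), "16 bits min");
static_assert(maxSigned(32) == std::numeric_limits<std::int32_t>::max(), "32 bits max");
static_assert(minSigned(64) == std::numeric_limits<std::int64_t>::min(), "64 bits min");
```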

Endianness - Big Endian vs. Little Endian

Endianness is the order in which the bytes of some data are stored or encoded in memory, on disk, in a file or in a network protocol.

Endianness matters in:

  • Embedded Systems
  • Dealing with raw binary data
  • Data Serialization
  • Processor memory layout
  • Network data transmission

Little Endian - LE

The least significant byte is stored first. In a little-endian processor or system, the number 0xFB4598B2 (bytes 0xFB 0x45 0x98 0xB2) would be stored as:

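  • LSB - Least Significant Byte
  • MSB - Most Significant Byte

| Memory Address | Order | Data | Tag |
|----------------+-------+------+-----|
| 0x100          | 0     | 0xB2 | LSB |
| 0x101          | 1     | 0x98 |     |
| 0x102          | 2     | 0x45 |     |
| 0x103          | 3     | 0xFB | MSB |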

Endianness and C++:

  • This session in CERN’s REPL shows the memory layout (endianness) of the number 0xFB4598B2 on an Intel x64 processor (ordinary desktop, IBM-PC processor). Note: on a big-endian processor, the bytes displayed in the next code block would appear in reverse order.
>> int k = 0xFB4598B2
(int) -79325006
>> 

// Print integers in hex formats 
std::cout << std::hex; 

>> std::cout << "k = " << k << "\n";
k = fb4598b2
>> 

// Access the bytes of k through a char pointer
>> char* p = reinterpret_cast<char*>(&k);

>> *p
(char) '0xb2'

>> *(p + 1)
(char) '0x98'

// Print bytes using pointer offset
>> std::cout << "p[0] = 0x" << (0xFF & (int) *(p + 0)) << "\n";
p[0] = 0xb2
>> 
>> std::cout << "p[1] = 0x" << (0xFF & (int) *(p + 1)) << "\n";
p[1] = 0x98
>> std::cout << "p[2] = 0x" << (0xFF & (int) *(p + 2)) << "\n";
p[2] = 0x45
>> std::cout << "p[3] = 0x" << (0xFF & (int) *(p + 3)) << "\n";
p[3] = 0xfb

// Print bytes using array notation 
>> std::cout << "p[0] = 0x" << (0xFF & (int) p[0]) << "\n";
p[0] = 0xb2
>> std::cout << "p[1] = 0x" << (0xFF & (int) p[1]) << "\n";
p[1] = 0x98
>> std::cout << "p[2] = 0x" << (0xFF & (int) p[2]) << "\n";
p[2] = 0x45
>> std::cout << "p[3] = 0x" << (0xFF & (int) p[3]) << "\n";
p[3] = 0xfb
>>  
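
Outside the REPL, the same inspection can be done in a compiled program. A minimal sketch, assuming a hypothetical helper named bytesInMemory (not part of this reference card) that copies the object representation with std::memcpy instead of casting pointers:

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: in-memory bytes of a 32-bit value, lowest address first.
std::array<unsigned char, 4> bytesInMemory(std::uint32_t value)
{
    std::array<unsigned char, 4> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}
```

On a little-endian machine bytesInMemory(0xFB4598B2) is expected to start with 0xB2, matching the session above; on a big-endian machine it would start with 0xFB.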

Big Endian - BE

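The most significant byte is stored first: the bytes of some data are stored in the reverse order of the little endian (LE) encoding. In a big-endian system, the number 0xFB4598B2 would be stored as:

| Memory Address | Order | Data | Tag |
|----------------+-------+------+-----|
| 0x100          | 0     | 0xFB | MSB |
| 0x101          | 1     | 0x45 |     |
| 0x102          | 2     | 0x98 |     |
| 0x103          | 3     | 0xB2 | LSB |

Detect Endianness in C++ at runtime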

Check whether current system is little endian:

bool isLittleEndian()
{
    int n = 1;
    return *(reinterpret_cast<unsigned char*>(&n)) == 1;
}

Check whether current system is big endian:

bool isBigEndian()
{
    int n = 1;
    return *(reinterpret_cast<unsigned char*>(&n)) == 0;
}
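
When data is exchanged with another system (file, network), a common alternative to detecting the host byte order is to build the byte sequence with shifts, which gives the same result on any host. A minimal sketch, assuming hypothetical helper names swapBytes32 and toBigEndianBytes (not part of this reference card):

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: reverse the byte order of a 32-bit value.
constexpr std::uint32_t swapBytes32(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

// Hypothetical helper: big-endian (MSB first) byte sequence of a value,
// built with shifts so that it does not depend on the host endianness.
std::array<std::uint8_t, 4> toBigEndianBytes(std::uint32_t v)
{
    return {{ static_cast<std::uint8_t>(v >> 24),
              static_cast<std::uint8_t>(v >> 16),
              static_cast<std::uint8_t>(v >> 8),
              static_cast<std::uint8_t>(v) }};
}

static_assert(swapBytes32(0xFB4598B2u) == 0xB29845FBu, "byte order reversed");
```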

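Processor Endianness

| Processor / CPU Family         | Endianness    | Note                                         |
|--------------------------------+---------------+----------------------------------------------|
| Intel x86, x86-64 and IA-32    | Little Endian | Default processor of the IBM-PC architecture |
| ARM                            | Little Endian | Default endianness, can also be Big Endian   |
| SPARC                          | Big Endian    |                                              |
| Motorola 68000                 | Big Endian    |                                              |
| JVM - Java Virtual Machine (*) | Big Endian    | (*) Not a processor.                         |
| MIPS                           | Supports Both |                                              |
| PowerPC                        | Supports Both |                                              |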

Quotes

Bjarne Stroustrup

  • Bjarne Stroustrup

C++11 feels like a new language: The pieces just fit together better than they used to and I find a higher-level style of programming more natural than before and as efficient as ever.

C++ is a multi-paradigm language. In other words, C++ was designed to support a range of styles. No single language can support every style. However, a variety of styles that can be supported within the framework of a single language. Where this can be done, significant benefits arise from sharing a common type system, a common toolset, etc. These technical advantages translates into important practical benefits such as enabling groups with moderately differing needs to share a language rather than having to apply a number of specialized languages.

  • Bjarne Stroustrup - C++ Programming Language

There are only two kinds of languages: the ones people complain about and the ones nobody uses.

C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off.

C++ has indeed become too “expert friendly” at a time where the degree of effective formal education of the average software developer has declined. However, the solution is not to dumb down the programming languages but to use a variety of programming languages and educate more experts. There has to be languages for those experts to use– and C++ is one of those languages.

What I did do was to design C++ as first of all a systems programming language: I wanted to be able to write device drivers, embedded systems, and other code that needed to use hardware directly. Next, I wanted C++ to be a good language for designing tools. That required flexibility and performance, but also the ability to express elegant interfaces. My view was that to do higher-level stuff, to build complete applications, you first needed to buy, build, or borrow libraries providing appropriate abstractions. Often, when people have trouble with C++, the real problem is that they don’t have appropriate libraries or that they can’t find the libraries that are available.

The technical hardest problem is probably the lack of a C++ binary interface (ABI). There is no C ABI either, but on most (all?) Unix platforms there is a dominant compiler and other compilers have had to conform to its calling conventions and structure layout rules - or become unused. In C++ there are more things that can vary - such as the layout of the virtual function table - and no vendor has created a C++ ABI by fiat by eliminating all competitors that did not conform. In the same way as it used to be impossible to link code from two different PC C compilers together, it is generally impossible to link the code from two different Unix C++ compilers together (unless there are compatibility switches).

Alexander A. Stepanov

  • Alexander A. Stepanov

I still believe in abstraction, but now I know that one ends with abstraction, not starts with it. I learned that one has to adapt abstractions to reality and not the other way around.

  • Alexander A. Stepanov - From Mathematics to Generic Programming.

To see how to make something more general, you need to start with something concrete. In particular, you need to understand the specifics of a particular domain to discover the right abstractions.

  • Alexander A. Stepanov, From Mathematics to Generic Programming

When writing code, it’s often the case that you end up computing a value that the calling function doesn’t currently need. Later, however, this value may be important when the code is called in a different situation. In this situation, you should obey the law of useful return: A procedure should return all the potentially useful information it computed.

  • Alexander A. Stepanov

Object-oriented programming aficionados think that everything is an object…. this [isn’t] so. There are things that are objects. Things that have state and change their state are objects. And then there are things that are not objects. A binary search is not an object. It is an algorithm

  • Alexander A. Stepanov

You cannot fully grasp mathematics until you understand its historical context.

By generic programming, we mean the definition of algorithms and data structures at an abstract or generic level, thereby accomplishing many related programming tasks simultaneously. The central notion is that of generic algorithms, which are parameterized procedural schemata that are completely independent of the underlying data representation and are derived from concrete, efficient algorithms.

Alan Kay

  • Alan Kay

Simple things should be simple, complex things should be possible.

  • Alan Kay

It’s easier to invent the future than to predict it.

  • Alan Kay

Normal is the greatest enemy with regard to creating the new. And the way of getting around this is you have to understand normal not as reality, but just a construct. And a way to do that, for example, is just travel to a lot of different countries and you’ll find a thousand different ways of thinking the world is real, all of which are just stories inside of people’s heads. That’s what we are too. Normal is just a construct, and to the extent that you can see normal as a construct in yourself, you have freed yourself from the constraints of thinking this is the way the world is. Because it isn’t. This is the way we are.

Edsger W. Dijkstra

  • Edsger W. Dijkstra

Program testing can be used to show the presence of bugs, but never to show their absence!

  • Edsger W. Dijkstra (1970) “Notes On Structured Programming” (EWD249), Section 3 (“On The Reliability of Mechanisms”), p. 6.

The art of programming is the art of organizing complexity, of mastering multitude and avoiding its bastard chaos as effectively as possible.

John von Neumann

  • John von Neumann

If you tell me precisely what it is a machine cannot do, then I can always make a machine which will do just that;

  • John von Neumann - The Role of Mathematics in the Sciences and in Society (1954)

A large part of mathematics which becomes useful developed with absolutely no desire to be useful, and in a situation where nobody could possibly know in what area it would become useful; and there were no general indications that it ever would be so. By and large it is uniformly true in mathematics that there is a time lapse between a mathematical discovery and the moment when it is useful; and that this lapse of time can be anything from 30 to 100 years, in some cases even more; and that the whole system seems to function without any direction, without any reference to usefulness, and without any desire to do things which are useful.

  • John von Neumann, The Computer and the Brain

Any computing machine that is to solve a complex mathematical problem must be ‘programmed’ for this task. This means that the complex operation of solving that problem must be replaced by a combination of the basic operations of the machine.

  • John von Neumann

Problems are often stated in vague terms… because it is quite uncertain what the problems really are.

  • John von Neumann

The sciences do not try to explain, they hardly even try to interpret, they mainly make models. By a model is meant a mathematical construct which, with the addition of certain verbal interpretations, describes observed phenomena. The justification of such a mathematical construct is solely and precisely that it is expected to work - that is correctly to describe phenomena from a reasonably wide area. Furthermore, it must satisfy certain esthetic criteria - that is, in relation to how much it describes, it must be rather simple.

  • John von Neumann

The calculus was the first achievement of modern mathematics and it is difficult to overestimate its importance. I think it defines more unequivocally than anything else the inception of modern mathematics; and the system of mathematical analysis, which is its logical development, still constitutes the greatest technical advance in exact thinking.