## **std::vector**

`std::vector` is a container that manages a contiguous block of **"typed"** memory (_e.g._ `float`, `int`, custom objects). Elements can be accessed through indices or iterators. Its contiguous layout is vectorization- and cache-friendly, and indexing adds no per-element overhead, which makes it the preferred container for performance in HPC systems. `std::vector` storage is heap-allocated, which allows flexible reallocation when required. For stack-allocated containers see `std::array`.

In its simplest form it is a self-describing piece of contiguous memory with 3 underlying pointers that describe:

- **data**: memory address
- **size**: number of elements 
- **capacity**: allocated memory as number of elements (not bytes) >= size

On typical 64-bit platforms, the gcc and clang implementations of `std::vector` report 24 bytes for `sizeof` (3 pointers x 8 bytes per pointer). This accounts only for the pointers, not for the heap-allocated elements.

`std::vector` object underlying contiguous memory representation:

```
|----------capacity()-------------------------------------------->|
|----------size()---------------------->|
 
|      logical valid region             |    undefined behavior   |
|       initialized memory              |   uninitialized memory  |
|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|X|X|X|X|X|X|X|X|X|X|X|X|X|
 ^                                       ^
 iterators:
 begin()                                 end()
 
 index:
 first: 0                              last: size()-1
 
 pointer:
 data() = 0x556d9d3b9170 = &v[0]
 
 |_| : element width from sizeof(T) in std::vector<T>
```

The main advantages of using a `std::vector` over raw pointers:
- Self-describing: the user need not reimplement and pass around size and capacity
- Safe memory access via iterators and for-range loops
- Minimize access to uninitialized memory
- Default deallocation with RAII when out of scope, optional manual deallocation
- Portability across algorithms relying on iterators
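
The advantages listed above can be illustrated with a minimal sketch that sums the same values through a raw pointer and through a `std::vector`. The helper names `sumRaw`, `sumVector` and `sumsAgree` are hypothetical and chosen only for this illustration.

```cpp
#include <cstddef>  // std::size_t
#include <numeric>  // std::accumulate
#include <vector>

// Raw-pointer version: the caller must pass the size separately.
double sumRaw(const double* data, std::size_t size) {
    double total = 0.0;
    for (std::size_t i = 0; i < size; ++i) {
        total += data[i];
    }
    return total;
}

// std::vector version: the size travels with the object and the
// iterators plug directly into standard algorithms.
double sumVector(const std::vector<double>& values) {
    return std::accumulate(values.cbegin(), values.cend(), 0.0);
}

// Hypothetical driver: runs both versions on the same four values.
bool sumsAgree() {
    double* raw = new double[4]{1.0, 2.0, 3.0, 4.0};  // manual allocation
    const double rawTotal = sumRaw(raw, 4);
    delete[] raw;                                      // manual deallocation

    std::vector<double> values{1.0, 2.0, 3.0, 4.0};    // released by RAII
    const double vectorTotal = sumVector(values);
    return rawTotal == vectorTotal;
}
```

In the raw-pointer path the size and the `delete[]` are the caller's responsibility; in the `std::vector` path both are handled by the container.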

Possible scenarios for using raw pointers could be:
- Direct unsafe access to malloc'ed memory (always populated)
- Small assembly footprint
- Interoperability with C libraries
- Lack of C++ compiler
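
For the C interoperability case, a `std::vector` can often still own the memory while the C routine receives the raw pointer from `data()`. The sketch below assumes a hypothetical C-style function `scale_in_place`, which stands in for a real library routine, and a hypothetical wrapper `scaledCopy`.

```cpp
#include <cstddef>  // std::size_t
#include <vector>

// Hypothetical routine with a C-style signature (pointer + element count).
// It stands in for a function that a C library might provide.
void scale_in_place(float* data, std::size_t count, float factor) {
    for (std::size_t i = 0; i < count; ++i) {
        data[i] *= factor;
    }
}

// The vector keeps ownership; only the pointer and size are handed over.
std::vector<float> scaledCopy(std::vector<float> values, float factor) {
    scale_in_place(values.data(), values.size(), factor);
    return values;
}
```

The pointer returned by `data()` stays valid only until the vector reallocates, so it should not be stored beyond the call.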

As usual, always measure your bottlenecks, but `std::vector` is [largely preferred over raw pointers for arrays](http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rsl-arrays). 

In [None]:
// Example 
#include <algorithm> // std::for_each
#include <vector>
#include <iostream> // std::cout
#include <cstdint> // std::int32_t

std::vector<std::int32_t> myVector(4);
// 3 pointers * 8 bytes/pointer on typical 64-bit platforms
std::cout << "Size of std::vector class object: " << sizeof(myVector) << "\n\n";

std::cout << "myVector information:\n";
std::cout << "Data Address: " << myVector.data() << "\n";
std::cout << "Size: " << myVector.size() << " elements\n";
std::cout << "Capacity: " << myVector.capacity() << " elements\n";
std::cout << "Memory allocated: " << sizeof(decltype(myVector.back())) * myVector.capacity() << " bytes\n";
std::cout << "values: | ";
std::for_each(myVector.cbegin(), myVector.cend(), [] (const auto v) {std::cout << v << " | ";} );
std::cout << "\n\n";

# Managing std::vector allocation

The size and capacity of a vector can be modified using the following functions to:

1. increase capacity (allocate):
- [reserve](https://en.cppreference.com/w/cpp/container/vector/reserve)

2. modify size and initialize (if required increases capacity):
- [std::vector<T,Allocator>::vector constructor](https://en.cppreference.com/w/cpp/container/vector/vector)
- [resize](https://en.cppreference.com/w/cpp/container/vector/resize)
- [push_back](https://en.cppreference.com/w/cpp/container/vector/push_back)
- [emplace_back](https://en.cppreference.com/w/cpp/container/vector/emplace_back)
- [assign](https://en.cppreference.com/w/cpp/container/vector/assign)

3. reduce capacity (deallocate):
- [swap](https://en.cppreference.com/w/cpp/container/vector/swap)
- [operator=](https://en.cppreference.com/w/cpp/container/vector/operator%3D)
- [shrink_to_fit](https://en.cppreference.com/w/cpp/container/vector/shrink_to_fit) (hint only)

4. reduce size only (does not deallocate):
- [clear](https://en.cppreference.com/w/cpp/container/vector/clear)
- [erase](https://en.cppreference.com/w/cpp/container/vector/erase)
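
The difference between categories 3 and 4 can be sketched as follows. The struct `Footprint` and the two helper functions are hypothetical names for this illustration. The capacity of an empty, default-constructed vector is assumed to be small (it is 0 in the common implementations, but that is not asserted here).

```cpp
#include <cstddef>  // std::size_t
#include <vector>

// Hypothetical helper type: size and capacity of a vector at one point in time.
struct Footprint {
    std::size_t size;
    std::size_t capacity;
};

// clear() destroys the elements but keeps the allocation.
Footprint afterClear() {
    std::vector<int> v(1000, 1);
    v.clear();
    return {v.size(), v.capacity()};
}

// Swapping with an empty temporary hands the allocation to the temporary,
// which releases it when it is destroyed at the end of the statement.
Footprint afterSwapWithEmpty() {
    std::vector<int> v(1000, 1);
    std::vector<int>().swap(v);
    return {v.size(), v.capacity()};
}
```

Both functions report a size of 0; only the second one is expected to report a capacity below 1000.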


**[reserve](https://en.cppreference.com/w/cpp/container/vector/reserve)**: increases the capacity so that memory is reserved for at least the requested number of elements. It **doesn't** modify the *size*, nor does it shrink the current *capacity*. The new storage fits the requested number of elements **without initializing** them.

- Consider using **reserve** in HPC environments to control exactly when the cost of allocating uninitialized memory is paid.
- Minimize the number of **reserve** calls per `std::vector` object. A single allocation is ideal. Each reallocation obtains a new block, moves or copies the existing elements into it and releases the old block, which is expensive and leaves the operating system (OS) to deal with fragmented memory.

For example:

In [None]:
std::cout << "Capacity: " << myVector.capacity() << "\n";

In [None]:
myVector.reserve(6);
std::cout << "Capacity: " << myVector.capacity() << "\n\n";
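
To see why a single up-front **reserve** helps, the following sketch counts how often the capacity changes while elements are appended with `push_back`. The function name `countReallocations` is hypothetical, and the count without `reserve` depends on the growth policy of the standard library in use.

```cpp
#include <cstddef>  // std::size_t
#include <vector>

// Counts how many times capacity() changes while appending `count` elements.
// The initial reserve() call itself is not counted.
std::size_t countReallocations(std::size_t count, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst) {
        v.reserve(count);  // single up-front allocation
    }
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {  // storage was reallocated
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling `countReallocations(1000, true)` returns 0, because `push_back` does not reallocate while the size stays within the reserved capacity. Without the reserve, the vector has to grow several times on the way to 1000 elements.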

**[resize](https://en.cppreference.com/w/cpp/container/vector/resize)**: increases or decreases the size; if necessary, it increases the capacity and initializes memory up to the new size. **Care** must be taken with the latter, as memory (capacity) growth is **implementation dependent** when using resize. This **must** be considered when programming with `std::vector` in an HPC environment due to the costs associated with memory allocation, deallocation and initialization.

For example: 

In [None]:
std::cout << "Capacity: " << myVector.capacity() << "\n";
std::cout << "Size: " << myVector.size() << "\n\n";

In [None]:
myVector.resize(5);
std::cout << "Capacity: " << myVector.capacity() << " unchanged\n";
std::cout << "Size: " << myVector.size() << " \n\n";

In [None]:
myVector.resize(8);
std::cout << "Capacity: " << myVector.capacity() << " increased\n";
std::cout << "Size: " << myVector.size() << "\n\n";

In [None]:
myVector.resize(15);
std::cout << "Capacity: " << myVector.capacity() << " increased and larger than requested Size\n";
std::cout << "Size: " << myVector.size() << "\n";

In the current state of `myVector`, accessing `myVector[15]` is undefined behavior even if the capacity is larger than 15, because only indices `0` to `size()-1` refer to valid elements. Some toolchains (e.g. MSVC with its debug checks) flag accesses anywhere between *size* and *capacity* as an error, while others (gcc, clang) can be more permissive by default. Prefer iterators and range-based for loops for safe access with no penalty, and incorporate memory sanitizers in your checks.

```
contents:
|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|X|...
                               accessing myVector[ >= 15] is undefined behavior
```
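
When an index comes from untrusted input, the bounds-checked member function `at()` is one alternative to `operator[]`: it throws `std::out_of_range` instead of invoking undefined behavior. `at()` is not covered elsewhere in this section, and the helper name `isRejectedByAt` is hypothetical. A minimal sketch:

```cpp
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::out_of_range
#include <vector>

// Returns true when at() refuses the index, i.e. when index >= v.size().
bool isRejectedByAt(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);  // bounds-checked access, result unused
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

For a vector of size 15, index 14 is accepted and index 15 is rejected, regardless of the current capacity. The check has a runtime cost, so it is usually kept out of hot loops.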

**[std::vector<T,Allocator>::vector constructor](https://en.cppreference.com/w/cpp/container/vector/vector)**: by default, `std::vector` constructors use `std::allocator`, which obtains contiguous memory through `operator new`. In addition, several constructor overloads allocate to a given size and capacity and initialize the memory to a default value for memory safety.

In [None]:
std::vector<float> myFloats(10, 1.f); // size 10, values = 1.f

std::cout << "values: | ";
std::for_each(myFloats.cbegin(), myFloats.cend(), [] (const auto v) {std::cout << v << " | ";} );
std::cout << "\n\n";
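
A few of the constructor overloads can be compared side by side. The sketch below uses a hypothetical function, `constructorSizes`, that returns the resulting sizes. It also shows a common pitfall: parentheses select the (count, value) overload, while braces select the initializer-list overload.

```cpp
#include <cstddef>  // std::size_t
#include <vector>

// Returns the size produced by each constructor call, in declaration order.
std::vector<std::size_t> constructorSizes() {
    std::vector<int> empty;             // default: no elements
    std::vector<int> zeros(4);          // 4 elements, value-initialized to 0
    std::vector<float> ones(10, 1.f);   // 10 elements, each 1.f
    std::vector<int> listed{1, 2, 3};   // initializer list: 3 elements
    std::vector<int> copied(listed.cbegin(), listed.cend());  // iterator range
    std::vector<int> parens(4, 1);      // 4 elements, each 1
    std::vector<int> braces{4, 1};      // 2 elements: 4 and 1
    return {empty.size(), zeros.size(), ones.size(), listed.size(),
            copied.size(), parens.size(), braces.size()};
}
```

The returned sizes are 0, 4, 10, 3, 3, 4 and 2.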