## **std::vector**

`std::vector` is a container describing a contiguous piece of **"typed"** (_e.g._ `float`, `int`, custom objects) memory. Elements can be accessed through indices or iterators. Contiguous storage makes it the preferred container for performance in HPC systems: it is vectorization and cache friendly, and indexing adds no per-element overhead. `std::vector` storage is heap-allocated by default, which allows flexible reallocation when required. For fixed-size containers that can live on the stack, see `std::array`.

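In its simplest form it is a self-describing piece of contiguous memory. Common implementations store 3 underlying pointers (start of the data, end of the valid elements, end of the allocated storage) that together describe: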

- **data**: memory address
- **size**: number of elements 
- **capacity**: allocated memory as number of elements (not bytes) >= size

On 64-bit platforms, the gcc and clang implementations of `std::vector` report 24 bytes for `sizeof` (3 pointers x 8 bytes per pointer), regardless of how many elements are stored.

`std::vector` object underlying contiguous memory representation:

```
|----------capacity()-------------------------------------------->|
|----------size()---------------------->|
 
|      logical valid region             |    undefined behavior   |
|       initialized memory              |   uninitialized memory  |
|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|X|X|X|X|X|X|X|X|X|X|X|X|X|
 ^                                       ^
 iterators:
 begin()                                 end()
 
 index:
 first: 0                              last: size()-1
 
 pointer:
 data() = 0x556d9d3b9170 = &v[0]
 
 |_| : element width from sizeof(type)
```

The main advantages of using a `std::vector` over raw pointers:
- Self-describing, the user need not reimplement and pass around size and capacity
- Safer memory access via iterators and range-based for loops
- Minimize access to uninitialized memory
- Default deallocation with RAII when out of scope, optional manual deallocation
- Portability across algorithms relying on iterators

The only scenarios in which raw pointers could be preferable are:
- Direct unsafe access to malloc'ed memory (always populated)
- Small assembly footprint

As usual, always measure your bottlenecks; in general, `std::vector` is preferred over raw pointers.

In [2]:
// Example 
#include <vector>
#include <iostream> // std::cout
#include <cstdint> // std::int32_t

std::vector<std::int32_t> myVector(4);
// 3 pointers * 8 bytes/pointer
std::cout << "Sizeof of std::vector class object: " << sizeof(myVector) << "\n\n";

std::cout << "myVector information:\n";
std::cout << "Data Address: " << myVector.data() << "\n";
std::cout << "Size:" << myVector.size() << " elements\n";
std::cout << "Capacity: " << myVector.capacity() << " elements\n";
std::cout << "Memory allocated: " << sizeof(decltype(myVector)::value_type) * myVector.capacity() << " bytes\n\n";

Sizeof of std::vector class object: 24

myVector information:
Data Address: 0x5640cce53ec0
Size:4 elements
Capacity: 4 elements
Memory allocated: 16 bytes



# Managing std::vector allocation

The size and capacity of a vector can be modified using the following functions, grouped by their effect:

1. increase capacity:
- [reserve](https://en.cppreference.com/w/cpp/container/vector/reserve)

2. reduce capacity:
- [swap](https://en.cppreference.com/w/cpp/container/vector/swap)
- [operator=](https://en.cppreference.com/w/cpp/container/vector/operator%3D)
- [shrink_to_fit](https://en.cppreference.com/w/cpp/container/vector/shrink_to_fit) (non-binding request, the implementation may ignore it)

3. modify size (if required increase capacity):
- [resize](https://en.cppreference.com/w/cpp/container/vector/resize)
- [push_back](https://en.cppreference.com/w/cpp/container/vector/push_back)
- [emplace_back](https://en.cppreference.com/w/cpp/container/vector/emplace_back)
- [assign](https://en.cppreference.com/w/cpp/container/vector/assign)

4. reduce size only:
- [clear](https://en.cppreference.com/w/cpp/container/vector/clear)
- [erase](https://en.cppreference.com/w/cpp/container/vector/erase)


**[reserve](https://en.cppreference.com/w/cpp/container/vector/reserve)**: increases the capacity so that it can hold at least the requested number of elements. It **doesn't** modify the *size*, nor does it shrink the current *capacity*. The newly reserved memory is left **uninitialized**.

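- Consider using **reserve** in HPC environments to control exactly when uninitialized memory is allocated and how much it costs.
- Minimize the number of **reserve** calls per `std::vector` object. A single allocation is ideal; each reallocation has to obtain a new block, move or copy the existing elements into it, and release the old block, which can be expensive and can leave the operating system (OS) dealing with fragmented memory.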

For example:

In [3]:
std::cout << "Capacity: " << myVector.capacity() << "\n";

Capacity: 4


In [4]:
myVector.reserve(6);
std::cout << "Capacity: " << myVector.capacity() << "\n\n";

Capacity: 6



**[resize](https://en.cppreference.com/w/cpp/container/vector/resize)**: increases or decreases the size; if necessary, it increases the capacity and initializes the new elements up to the new size. **Care** must be taken when the capacity has to increase, because the amount of growth is **implementation dependent** when using resize. This **must** be considered when programming with `std::vector` in an HPC environment due to the costs associated with memory allocation and deallocation.

For example: 

In [5]:
std::cout << "Capacity: " << myVector.capacity() << "\n";
std::cout << "Size: " << myVector.size() << "\n\n";

Capacity: 6
Size: 4



In [6]:
myVector.resize(5);
std::cout << "Capacity: " << myVector.capacity() << " unchanged\n";
std::cout << "Size: " << myVector.size() << " \n\n";

Capacity: 6 unchanged
Size: 5 



In [7]:
myVector.resize(8);
std::cout << "Capacity: " << myVector.capacity() << " increased\n";
std::cout << "Size: " << myVector.size() << "\n\n";

Capacity: 10 increased
Size: 8



In [8]:
myVector.resize(15);
std::cout << "Capacity: " << myVector.capacity() << " increased and larger than requested Size\n";
std::cout << "Size: " << myVector.size() << "\n";

Capacity: 16 increased and larger than requested Size
Size: 15


In the current state of `myVector`, accessing myVector[15] is undefined behavior even though the capacity is 16, because index 15 lies beyond *size*. Some toolchains (e.g. MSVC debug builds) flag any access between *size* and *capacity* as an error, while others (gcc, clang) can be more permissive by default. Prefer iterators and range-based for loops, which stay within the valid region at no extra cost, and incorporate memory sanitizers in your checks.

In [None]:
```
contents:
|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|X|
                               myVector[15] is undefined behavior
```