# 03 - Strings, Vectors, and the Cost of Copying 📉

## 1. The Standard Containers

In C, you use char* for strings and int* (with malloc) for arrays.
In C++, we use std::string and std::vector.

### Why switch?
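1.  Memory Management: They use RAII. The destructor releases the buffer when the object goes out of scope, so there is no free() to forget and no leak on an early return.
2.  Resizing: They grow automatically as elements are added.
3.  Safety: std::string tracks its own length and keeps the buffer null-terminated for you.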

### ⚠️ Critical for your MPI/OpenMP background
You might worry: "Is std::vector compatible with C libraries that expect raw pointers?"

YES. The C++ Standard guarantees that std::vector stores its elements in contiguous memory, exactly like a C array. (The one exception is std::vector<bool>, which packs bits and does not provide .data().)

You can get the raw pointer using .data():


In [2]:
#include <vector>
#include <iostream>
#include <cstring> // for memcpy

{
// Create a vector
std::vector<int> numbers = {10, 20, 30, 40};

// Access underlying raw pointer (int*)
int* raw_ptr = numbers.data();

// Prove it works like a C-array
std::cout << "First element via pointer: " << raw_ptr[0] << std::endl;

// Modify via pointer
raw_ptr[1] = 999;
std::cout << "Modified vector element: " << numbers[1] << std::endl;
}

First element via pointer: 10
Modified vector element: 999


---

## 2. The Copy Trap: struct A a = b;

This is the most important behavioral difference between a C struct that holds raw pointers and a C++ object that holds standard containers.

### The C Scenario (Shallow Copy)
In C, a = b copies the struct member by member, which is effectively a memcpy.
* If the struct contains a pointer, only the address is copied.
* Result: Two structs pointing to the same buffer. If one calls free(), the other is left with a dangling pointer, and freeing through both is a double free.

### The C++ Scenario (Deep Copy)
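In C++, containers like std::vector and std::string define a copy constructor and a copy assignment operator. A struct with such members copies them through those operations; a struct that holds a raw pointer is still copied shallowly, exactly as in C.
When you write a = b for a container:
1.  a allocates new heap memory (or reuses its existing buffer if that is already large enough).
2.  a copies the contents from b's buffer into its own buffer.
* Result: Two completely independent objects.
* Cost: High. This is an O(N) operation that usually involves a heap allocation.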


In [2]:
// Let's verify the Deep Copy behavior
std::vector<int> original = {1, 2, 3, 4, 5};

// In C, this would be a pointer copy.
// In C++, this ALLOCATES new memory and copies all elements.
std::vector<int> copy = original;

// Modify the copy
copy[0] = 100;

// Check the original
std::cout << "Copy[0]: " << copy[0] << std::endl;
std::cout << "Original[0]: " << original[0] << " (Unchanged!)" << std::endl;

// Pointers are different
std::cout << "Original Address: " << original.data() << std::endl;
std::cout << "Copy Address:     " << copy.data() << std::endl;


Copy[0]: 100
Original[0]: 1 (Unchanged!)
Original Address: 0x636152bfc530
Copy Address:     0x636151fdae20


---

## 3. The Solution: References (&)

Because copying is expensive, how do we pass data to functions without copying?
In C, you use pointers: void func(const int* ptr).
In C++, we use References: void func(const int& ref).

A reference is an alias for an existing object. It behaves like a pointer that can never be null or re-seated, but it has the syntax of a value.

| Feature | Pointer (T*) | Reference (T&) |
| :--- | :--- | :--- |
| Can be null? | Yes | No |
| Can be re-seated? | Yes | No (bound at birth) |
| Syntax | *ptr | ref (no dereference needed) |


In [3]:
// BAD C++: Passes by Value (Triggers a COPY)
void bad_print(std::string s) {
    std::cout << "Bad: " << s << std::endl;
}

In [4]:
// GOOD C++: Passes by Reference (No Copy)
// const: I promise not to change it
// &: Pass the memory address, not the value
void good_print(const std::string& s) {
    std::cout << "Good: " << s << std::endl;
}

In [5]:
{
std::string huge_text = "Imagine this text is 100MB big...";

// bad_print(huge_text); // Allocates 100MB, copies, prints, frees.
good_print(huge_text);   // Passes one pointer (8 bytes on a 64-bit system), prints.
}

Good: Imagine this text is 100MB big...


## 4. Range-Based Loops

The same copy-versus-reference choice appears in one of the most common C++ idioms: the range-based for loop.

Note the use of auto& and const auto&.


In [6]:
{
    std::vector<std::string> names = {"Alice", "Bob", "Charlie"};

    // 1. By Value (auto name : names)
    // COPIES every string. Slow.
    for (auto name : names) {
        // name is a temporary copy
    }

    // 2. By Reference (auto& name : names)
    // Fast. Allows modification.
    for (auto& name : names) {
        name += "_User"; // Modifies original list
    }

    // 3. By Const Reference (const auto& name : names)
    // Fast. Read-only. (Preferred if just reading)
    for (const auto& name : names) {
        std::cout << name << std::endl;
    }
}

Alice_User
Bob_User
Charlie_User


## Summary

1.  std::vector is a safe, resizable, C-compatible array.
2.  std::string is a safe string buffer.
3.  Assignment (=) of a standard container performs a Deep Copy (allocates memory and copies every element). It is safe but potentially slow.
4.  Pass complex objects (strings, vectors, classes) by const reference (const T&) when the function only reads them, to avoid unnecessary copies.
