# Vectors, Lists, and Sets

- Vector
  - declaration with template type
  - vector literal
  - push_back
  - `[]` vs `at`
  - foreach iteration
  - `auto`
- Set
  - no duplicates
  - `set` is ordered
  - `unordered_set` is not obviously ordered
  - performance comparison
- Class activity: user registration
  - use vector to track names + ID
  - use set to track IDs (so no duplicate IDs)
- HW
  - Implement something like the `uniq` utility
    - print unique, sorted lines from stdin/file

In [1]:
#include <iostream>
using std::cout, std::endl;

#include <string>
using std::string;

## Vector

In [2]:
#include <vector>
using std::vector;

In [3]:
vector<int> numbers = {1, 2, 3, 4, 5};

for (int i = 0; i < numbers.size(); i++) {
    cout << numbers[i] << endl;
}

1
2
3
4
5


- Inline vector literal using `{ }`.
- Vector access using `[ ]`
- `.size()`
- **template** types: it is not enough to have a "vector"; you have a vector of some type.
  - this way the compiler knows what type of thing is inside


In [4]:
vector<int> numbers;

for (int i = 0; i < 10; i++) {
    numbers.push_back(i);
}

for (int num : numbers) {
    cout << num << endl;
}

0
1
2
3
4
5
6
7
8
9


- Declaration of `numbers` makes empty vector
- `.push_back`
- for-each iteration syntax

```c++
vector<int> numbers;
```

**not**

```c++
vector<int> numbers();
```

The second version is interpreted as a function declaration: a function named `numbers` that takes no arguments and returns a `vector<int>`.
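As a small sketch of spellings that do create an empty vector, the helper below (the name `both_are_empty` is made up for illustration) uses the plain declaration and the empty-brace form:

```cpp
#include <vector>
using std::vector;

// Hypothetical helper: shows two spellings that really do create an
// empty vector (unlike `vector<int> numbers();`, which declares a function).
bool both_are_empty() {
    vector<int> plain;      // default-constructed, no parentheses
    vector<int> braced{};   // empty braces also default-construct
    return plain.empty() && braced.empty();
}
```

Both variables are real vector objects, so calling `.empty()` on them compiles and returns `true`.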

In [5]:
vector<int> other_numbers(numbers);

for (auto num : other_numbers) {
    cout << num << endl;
}

0
1
2
3
4
5
6
7
8
9


- copy constructor
- `auto` keyword: lets the compiler deduce the type of a variable when it can figure it out from the initializer
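To illustrate what the copy constructor gives you, here is a minimal sketch. The helper name `copy_and_append` is an assumption for this example, not something from the lecture code. It copies a vector, changes the copy, and leaves the original alone:

```cpp
#include <vector>
using std::vector;

// Hypothetical helper: copies `original`, changes the copy, and returns it.
// The caller's vector is untouched because the copy has its own storage.
vector<int> copy_and_append(const vector<int>& original, int extra) {
    vector<int> copy(original);   // copy constructor
    copy.push_back(extra);        // only the copy grows
    return copy;
}
```

Calling `auto longer = copy_and_append(numbers, 4);` also shows `auto` at work: the compiler deduces `vector<int>` from the function's return type.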

In [6]:
vector<string> words = {"Do", "you", "know", "the", "muffin", "man?"};
for (auto word : words) { cout << word << endl; }

Do
you
know
the
muffin
man?


In [7]:
words.pop_back();
words.pop_back();
words.pop_back();
for (auto word : words) { cout << word << endl; }

Do
you
know


- `.pop_back()` removes items from the back.

In [8]:
vector<string> words = {"and", "it", "came", "to", "pass"};
words.pop_back();

cout << "size is " << words.size() << endl;
cout << "last word is: " << words[words.size() - 1] << endl;
cout << "words: " << endl;

for (auto word : words) { cout << word << endl; }

cout << "----" << endl;
cout << "size is " << words.size() << endl;
cout << "last word is: " << words[words.size() - 1] << endl;
cout << "words: " << endl;

for (int i = 0; i < 5; i++) {
    cout << words[i] << endl;
}

size is 4
last word is: to
words: 
and
it
came
to
----
size is 4
last word is: to
words: 
and
it
came
to
pass


<div style='font-size: 200pt'> 🤨 </div>

- the `[ ]` operator does not check that the index is valid; reading past the end is undefined behavior.
  - maybe you get lucky: you touch invalid memory and your program crashes right away
  - maybe you get unlucky: you read memory that is valid (and thus has data) but isn't part of the vector, so your program continues with bogus data and trainwrecks somewhere down the line.

In [9]:
vector<string> words = {"and", "it", "came", "to", "pass"};
words.pop_back();

cout << "size is " << words.size() << endl;
cout << "last word is: " << words[words.size() - 1] << endl;
cout << "words: " << endl;

for (auto word : words) {
    cout << word << endl;
}

cout << "----" << endl;
cout << "size is " << words.size() << endl;
cout << "last word is: " << words[words.size() - 1] << endl;
cout << "words: " << endl;

for (int i = 0; i < 5; i++) {
    cout << words.at(i) << endl;
}

size is 4
last word is: to
words: 
and
it
came
to
----
size is 4
last word is: to
words: 
and
it
came
to


Standard Exception: vector::_M_range_check: __n (which is 4) >= this->size() (which is 4)

- `.at()` throws an exception (`std::out_of_range`) when you go out of bounds
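Because `.at()` reports the problem as an exception, a program can catch it and recover. The following is a sketch only; the helper `word_or` and its `fallback` parameter are hypothetical names chosen for this example:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>
using std::string;
using std::vector;

// Hypothetical helper: returns the word at `index`, or `fallback` if the
// index is out of range. `.at()` throws std::out_of_range on a bad index.
string word_or(const vector<string>& words, std::size_t index, const string& fallback) {
    try {
        return words.at(index);
    } catch (const std::out_of_range&) {
        return fallback;   // recover instead of crashing
    }
}
```

With `[ ]` in place of `.at()` there would be nothing to catch, since an out-of-range `[ ]` does not throw.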

## 👷🏼‍♀️ `grocery.cpp`

Write a program that queries the user for a list of items they want on their grocery list.

Then print out the grocery list.

```
Item: eggs
Item: milk
Item: cheese
Item: bread
Item:
- eggs
- milk
- cheese
- bread
```

Can you add an item to the front of a vector?

## List

In [10]:
#include <list>
using std::list;

In [11]:
list<int> numbers = {1, 2, 3, 4, 5};
for (auto num : numbers) {
    cout << num << endl;
}

1
2
3
4
5


In [12]:
numbers.push_front(0);
numbers.push_back(6);
for (auto num : numbers) {
    cout << num << endl;
}

0
1
2
3
4
5
6


In [13]:
numbers.pop_front();
numbers.pop_back();
for (auto num : numbers) {
    cout << num << endl;
}

1
2
3
4
5


- you can add and remove items at the front **or** back of a list.

In [14]:
numbers[0]

input_line_25:2:9: error: type 'list<int>' does not provide a subscript operator
 numbers[0]
 ~~~~~~~^~


Interpreter Error: 

In [15]:
numbers.at(1)

input_line_26:2:10: error: no member named 'at' in 'std::__cxx11::list<int, std::allocator<int> >'
 numbers.at(1)
 ~~~~~~~ ^


Interpreter Error: 

Lists do not give you random access (i.e. no `[]` operator and no `.at()`).
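If you do need the n-th element of a list, you have to walk to it from one end. A minimal sketch using `std::next` from `<iterator>` follows; the helper name `nth` is made up for this example, and it assumes the index is in range:

```cpp
#include <cstddef>
#include <iterator>
#include <list>
using std::list;

// Hypothetical helper: returns the element at position `index` by walking
// from the front. Assumes index < items.size(); no bounds check is done.
int nth(const list<int>& items, std::size_t index) {
    auto it = std::next(items.begin(), static_cast<std::ptrdiff_t>(index));
    return *it;
}
```

The walk visits every node before the one you want, so this takes longer the further into the list you go.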

Why would you use a `list` instead of a `vector`?

<div style='font-size: 100pt'> 🤔 </div>

## Set

In [16]:
#include <set>
using std::set;

In [17]:
set<int> numbers = {1, 2, 3, 1, 2, 3};
for (int num : numbers) {
    cout << num << endl;
}

1
2
3


- `set` does not store duplicates

In [19]:
string month;
set<string> seen;
while (std::getline(std::cin, month)) {
    if (month == "") { break; }
    seen.insert(month);
}
for (auto item : seen) { cout << item << endl; }

jan
feb
mar
jan
feb
jul

feb
jan
jul
mar


- use `set` to filter out duplicates
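The same filtering idea works on data you already have in a vector. This is a sketch, with `unique_sorted` as a hypothetical helper name; it builds a `set` from the vector and copies the result back out:

```cpp
#include <set>
#include <string>
#include <vector>
using std::set;
using std::string;
using std::vector;

// Hypothetical helper: returns the distinct values from `items`.
// Because it goes through a std::set, the result comes back sorted.
vector<string> unique_sorted(const vector<string>& items) {
    set<string> seen(items.begin(), items.end());  // duplicates are dropped
    return vector<string>(seen.begin(), seen.end());
}
```

Given the months typed in above, it returns `feb`, `jan`, `jul`, `mar`, matching the printed output.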

In [20]:
string fruit;
set<string> fruits;
while (std::getline(std::cin, fruit)) {
    if (fruit == "") { break; }
    if (fruits.find(fruit) == fruits.end()) {
        fruits.insert(fruit);
    } else {
        cout << "We already have that fruit." << endl;
    }
}

cout << endl << "FRUITS" << endl;
for (auto fruit : fruits) {
    cout << fruit << endl;
}

apple
apple
We already have that fruit.
pear
banana
pear
We already have that fruit.


FRUITS
apple
banana
pear


- `.find()`, `.end()` syntax
- Note the order the fruits are printed: sorted, not the order they were entered
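An alternative to the `.find()` / `.end()` check is to just call `.insert()` and look at what it returns: a pair whose `.second` is `false` when the value was already in the set. A minimal sketch, with `add_fruit` as a hypothetical helper name:

```cpp
#include <set>
#include <string>
using std::set;
using std::string;

// Hypothetical helper: tries to add `fruit` and reports whether it was new.
// set::insert returns a pair; `.second` is false if the value was already there.
bool add_fruit(set<string>& fruits, const string& fruit) {
    auto result = fruits.insert(fruit);
    return result.second;
}
```

Either style works; the `.find()` version in the cell above makes the "is it already there?" question more explicit.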

In [21]:
#include <unordered_set>
using std::unordered_set;

In [22]:
string fruit;
unordered_set<string> fruits;
while (std::getline(std::cin, fruit)) {
    if (fruit == "") { break; }
    if (fruits.find(fruit) == fruits.end()) {
        fruits.insert(fruit);
    } else {
        cout << "We already have that fruit." << endl;
    }
}

cout << endl << "FRUITS" << endl;
for (auto fruit : fruits) {
    cout << fruit << endl;
}

kiwi
kiwiw
kiwi
We already have that fruit.
cherry
banana
banana
We already have that fruit.
strawberry
apple
peach
pear
pear
We already have that fruit.


FRUITS
peach
pear
apple
banana
cherry
kiwiw
strawberry
kiwi


- the code is the same except that `set` became `unordered_set`
- the items in an `unordered_set` don't have a defined order
- so why have `unordered_set`?

### `set_timing.cpp`
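The contents of `set_timing.cpp` are not reproduced in these notes. As a rough sketch of how such a comparison could be set up (this is not the course file, and the helper name `time_inserts` is an assumption), one can time a batch of inserts with `std::chrono`:

```cpp
#include <chrono>
#include <set>
#include <unordered_set>

// Hypothetical helper (not the course's set_timing.cpp): inserts 0..n-1 into
// a fresh container of type SetType and returns the elapsed microseconds.
template <typename SetType>
long long time_inserts(int n) {
    SetType items;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; i++) {
        items.insert(i);
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `time_inserts<std::set<int>>(n)` and `time_inserts<std::unordered_set<int>>(n)` for growing values of `n` gives numbers to compare. The actual timings depend on the machine and compiler settings.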

## Big O

We use the term **big-O** to describe how the performance of an algorithm or data structure changes with respect to the number of elements (typically represented by **n**).

- When the time an algorithm takes to do something doesn't change, no matter the size, we say that algorithm is "constant time" or $O(1)$
  - Inserting into or searching an `unordered_set` is $O(1)$ on average
- When the time of the algorithm grows in direct proportion to the number of items, we say that algorithm is "linear" or $O(n)$
  - If the input goes up by 10x, then the runtime should go up by about 10x. 
- When the time of the algorithm grows in proportion to the logarithm of the input, we say that algorithm is "logarithmic" or $O(\log n)$
  - We see that the `set` runtime grows by a fixed amount for each power of 10 increase in the input size
  - So the `set` algorithm is $O(\log n)$
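To make the difference between $O(n)$ and $O(\log n)$ concrete, the sketch below counts how many elements each kind of search inspects in a sorted vector. The helper names `linear_steps` and `binary_steps` are made up for this example. `std::set` is typically implemented as a balanced search tree, so its lookups behave like the halving search here.

```cpp
#include <vector>
using std::vector;

// Hypothetical helpers: count how many elements each search inspects
// while looking for `target` in a sorted vector.

// Linear search: check items one by one from the front. O(n).
int linear_steps(const vector<int>& sorted, int target) {
    int steps = 0;
    for (int value : sorted) {
        steps++;
        if (value == target) { break; }
    }
    return steps;
}

// Binary search: halve the remaining range at each step. O(log n).
int binary_steps(const vector<int>& sorted, int target) {
    int steps = 0;
    int low = 0;
    int high = static_cast<int>(sorted.size()) - 1;
    while (low <= high) {
        steps++;
        int mid = low + (high - low) / 2;
        if (sorted[mid] == target) { break; }
        if (sorted[mid] < target) { low = mid + 1; } else { high = mid - 1; }
    }
    return steps;
}
```

For a vector holding 0 through 999, finding 999 takes 1000 steps with the linear search and 10 with the halving search.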

## 👷🏻 `user_registration.cpp`

Write a program that registers users with unique IDs.

The user is first prompted to provide a unique ID.
- If the ID is already in use, indicate this to the user and prompt them again.
- If the ID is available, proceed.

The user is then prompted for their first and last name.

The next user is then registered.

When a user enters nothing (i.e. empty line) for the ID, registration stops and all the registered users are printed with the following format:

`First Last (ID)`

## Key Ideas

- `vector`
- `set`, `unordered_set`
- Filtering unique values