# Arrays and Strings

## Simple Arrays

Until now, we have seen how to declare, initialize, and use variables that hold a single value, such as a character, an integer, or a float. In science, it is very common to deal with vectors. In C++, we can make **arrays** of any type of variable, including objects (we will talk about them in another lesson). A simple way to define an array is:
```c++
int array[3];     // defines an int array of length 3
char array2[5];   // defines a char array of length 5
```
When we write `T array[n]`, we reserve `n` contiguous slots in memory for elements of type `T`, which can be `int`, `char`, `double`... In C++, array indices start at `0`, so the valid indices run from `0` to `n-1`. Look at the following example:

In [None]:
!gedit src/array_01.cpp

In [None]:
!g++ -o array_01 src/array_01.cpp

In [None]:
!./array_01 

As you can see, even though we reserved memory for only 3 elements, the program lets us access a fourth and a fifth element, because C++ does not check array bounds. What happened here? You have to imagine the memory of the computer as a big column of cells. When we reserve the memory for 3 elements, 3 of those cells, which are contiguous, are blocked and used only by the array we just defined. Let's see if we can visualize it better. Consider the following piece of code:
```c++
// A
int a[3];

// B
a[0] = 20;
a[1] = 32;
a[2] = 41;

// C
```
Now look at the figure below. The letters in the code correspond to the state of the memory at that point of the code. At `A`, we don't have anything reserved or initialized. At `B`, we have declared an integer array `a` of length 3, so the memory is reserved, but we still have not initialized the values. Finally, at `C`, we have initialized the values of the array.

![figure1.png](attachment:figure1.png)

Summarizing, when we declare `int a[3]`, we reserve three consecutive memory addresses, `101`, `102`, and `103` (in this example), and the name `a` refers to the first one, much like a pointer to address `101`. We will go deeper into pointers in another lesson, so don't worry too much about this for now. When we access `a[1]`, we tell the computer to go to the memory address of `a`, `101`, and move one element forward. Nothing stops us from writing `a[3]` to reach the memory address `104`, but that address is not part of the array: it may hold other data of the program, so reading it gives meaningless values, and writing to it can corrupt data or crash the program (formally, this is undefined behavior). You probably don't want that, so one needs to be careful when playing with this kind of array.

## Vectors

As we probably all agree, dealing with arrays can be a little bit annoying. In C++ there is a safer and friendlier version of arrays: the **vector** structure. A vector is a class that works in the same way as an array, but it knows its own size and, through its `at()` function, refuses to access or modify a memory address that is not in the vector. It also has a lot of other utilities. In order to define a vector, one needs to include the `<vector>` header. Also, when declaring a vector, we need to specify the type. We can also specify the size and the initial values. You can learn more in [the cplusplus vector page](http://www.cplusplus.com/reference/vector/vector/vector/).
```c++
#include <vector>

std::vector<int> a;           // Declares a vector a of integers
std::vector<float> b(10);     // Declares a vector b of floats with size = 10 (b[0] -- b[9])
std::vector<double> c(4,0.0); // Declares a vector c of doubles with size = 4 and initializes them to 0.0.
```
Arrays can be initialized directly with values and, in C++11 and later, so can vectors:
```c++
double a[] = {1.0,2.0,3.0};
std::vector<double> b = {1.0,2.0,3.0};
```
Let's also look at what we can do with a vector. We will also introduce `size_t`, an unsigned integer type used for sizes and indices (on many 64-bit systems it is the same as `long unsigned int`). Let's assume we declared a vector `v` and filled it with a certain amount of data. These are some of the most common operations on vectors:
- `v.at(i)` or `v[i]` accesses the element at index `i` of a vector, for reading or writing. As an example, `v[2] = 3` sets the third element to 3, and `v.at(2) = 3` does the same. The difference is that `v.at(i)` checks that `i` is a valid index and throws a `std::out_of_range` exception if it is not, while `v[i]` performs no check.
- `v.size()` is a vector function that returns the number of elements in a vector. It counts every element the vector holds, including those that were created with a default value and never assigned.
- `v.push_back(var)` will add a new element at the end of the vector with value `var`. The size will increase by 1.
- `v.back()` returns the last element of the vector. It is equivalent to `v[v.size() - 1]`.
- `v.pop_back()` removes the last element of a vector.
- `v.clear()` deletes all elements.
- `v.resize(n)` changes the size of the vector to `n`. Let's see an example:

```c++
std::vector<int> v(3,0); // Vector v has 3 elements initialized to 0
std::cout << v.size() << std::endl; // Will print 3, which is the size of v

v.resize(5); // The size of the vector v is increased to 5.
             // We add 2 elements, which are initialized to 0
std::cout << v.size() << std::endl; // Will print 5, which is the current size of v
```

Look at the following code and try to guess what each statement does:

In [None]:
!gedit src/vectors_01.cpp

In [None]:
!g++ -o vectors_01 src/vectors_01.cpp

In [None]:
!./vectors_01

Since vectors are self-contained objects, we can also do operations that are not possible with simple arrays. As an example:
- We can copy them as we copy an integer or a float:
```c++
std::vector<double> a(3,0.0);
std::vector<double> b;
b = a; // Now b has length 3 and all the elements set to 0.0
```
- We can swap them as we can also swap integers:

```c++
int a = 2;   // Define integer a
int b = 5;   // Define integer b
int tmp_int; // Define temporary integer

tmp_int = a;
a = b;
b = tmp_int;  // Now a and b are swapped

// The vectors need their own names: a and b are already taken by the integers
std::vector<int> va(3,1); // 3 elements set to 1
std::vector<int> vb(5,2); // 5 elements set to 2
std::vector<int> tmp_vec;

tmp_vec = va;
va = vb;
vb = tmp_vec;  // Now vectors va and vb are swapped.
```

## 2D (and ND) arrays

More than once we will need arrays with more than one dimension. We can declare arrays with as many dimensions as we want. As an example, we will work with 2D, but everything extends to more dimensions. We can define a 2D array/vector as:
```c++
double a[3][4]; // Defines a double array of 3 rows and 4 columns.

// Note the space between > >. Before C++11, >> was always read as an operator, so older compilers will complain without it!
std::vector<std::vector<int> > av(3, std::vector<int>(4)); // Defines an int vector of 3 rows and 4 columns
```
Accessing an element is simple: `a[i][j]` accesses row `i` and column `j`. Also, in the same way as 1D arrays, we can initialize them directly with numbers. However, we have to be careful. In arrays, all the dimensions must be set except the first one.
```c++
int a[][] = { {1,2},
              {5,8},
              {3,1} }; // Will fail compilation
              
int a[][2] = { {1,2},
               {5,8},
               {3,1} }; // Will not fail compilation
```
Vectors can only be initialized this way in C++11 or later. If your compiler defaults to an older standard, this is enabled by adding the compilation flag `-std=c++11`:
```c++
std::vector<std::vector<int> > b = { {1,2},
                                     {5,8},
                                     {3,1} };
```
Look at the example below:

In [None]:
!gedit src/vectors_02.cpp

In [None]:
!g++ -std=c++11 -o vectors_02 src/vectors_02.cpp

In [None]:
!./vectors_02

IMPORTANT NOTE. Fortran stores 2D arrays column by column (column-major order), while C++ stores them row by row (row-major order). Thus, if you pass a 2D array from Fortran to C++, it will appear transposed!

## Examples

### Example 1

Complete the code below to create a vector that holds the elements of the already initialized vector in reverse order. You can see what the answer should look like by compiling and executing the solution.

In [None]:
!gedit src/example_01.cpp

In [None]:
!g++ -o example_01 src/example_01.cpp

In [None]:
!./example_01

In [None]:
!g++ -std=c++11 -o example_01_sol src/example_01_sol.cpp

In [None]:
!./example_01_sol

### Example 2

Given the 2D vector defined in the code, transpose it without creating any extra vector, and print the result. As usual, check the solution if you get stuck!

In [None]:
!gedit src/example_02.cpp

In [None]:
!g++ -std=c++11 -o example_02 src/example_02.cpp

In [None]:
!./example_02

In [None]:
!g++ -std=c++11 -o example_02_sol src/example_02_sol.cpp

In [None]:
!./example_02_sol

## Char arrays

C++ can be a little bit annoying when dealing with strings. The most important characteristic of a char-array string is that it always ends with a null character `\0`. Fortunately, the compiler takes care of this automatically, but it is something one needs to keep in mind when interfacing C++ with other languages like Fortran. The definition and initialization of char strings is identical to other types:
```c++
char name[10] = "Marc";
```
In this case, the memory reserved is for 10 characters, but only 9 are usable, since the last one must be the null character. We can also let the compiler find the length:
```c++
char name[] = "Marc";
```
When dealing with this kind of variable, if you encounter any problem, the best thing to do is to look at [cplusplus.com](http://www.cplusplus.com/) for more information.

Let's see how we can input a word to our code. The program below shows an example of how to input a word as a command line argument, store it in a char array, and print the first letter. The syntax is complicated, so for now we will just use `argv[i]`, and forget about the pointer properties.

In [None]:
!gedit src/char_arrays_01.cpp

In [None]:
!g++ -o char_arrays_01 src/char_arrays_01.cpp

In [None]:
!./char_arrays_01 myword

## Examples

### Example 3

Write a program that receives a text as a command line argument and counts how many characters it has, ignoring the spaces.

In [None]:
!gedit src/example_03.cpp

In [None]:
!g++ -o example_03 src/example_03.cpp

In [None]:
!./example_03 "This text has 23 characters"

In [None]:
!g++ -o example_03_sol src/example_03_sol.cpp

In [None]:
!./example_03_sol "This text has 23 characters"

## Strings

Dealing with char arrays, as with simple arrays, is a little bit annoying and tricky. Fortunately for us, C++ has a data structure called `string` (`std::string`). It is basically a vector of characters, and needs the `<string>` header. There are a few functions we can use for strings:
- `string.length()` or `string.size()` return the length (number of characters).
- `string.empty()` returns true if the string is "", and false if there is something inside.
- `string.find(item)` or `string.find(item,index)` returns the position of the first occurrence of item in the string, searching from position index if it is given. If it does not find it, it returns the special value `std::string::npos`. Item can be a character or another string.
- `string.substr(index,length)` returns a substring starting at position index and with length characters.
- `string.push_back(char)` appends a new character char to the end of the string.
- `string.append(str)` appends to string the string str.
- `string.insert(index,substr)` inserts the substring substr at position index in the string.
- `string.replace(index,len,str)` replaces len characters in string starting at index by str.
- `string.resize(n)` resizes string. If the new string is longer, null characters are added at the end. If shorter, extra characters are dropped.
- `string.clear()` empties string.
- `str1 == str2` will be true if the lengths and all the characters are the same.
- `str1 + str2` will return a new string with str2 appended at the end of str1.

Apart from all these string operations, we can also inspect and modify individual characters by including the `<cctype>` header. Some of the useful functions in that library are:
- `isalpha(c)` returns true if c is a-z, A-Z.
- `isdigit(c)` returns true if c is 0-9.
- `isspace(c)` returns true if c is a whitespace character, such as `' '`, `'\n'`, or `'\t'`.
- `toupper(c)` converts c to uppercase. Leaves upper case and non-alphabetic chars untouched.
- `tolower(c)` converts c to lowercase. Leaves lower case and non-alphabetic chars untouched.

Look at the following code to see how to use some of those functions, how to declare and initialize strings, and how to manipulate them.

In [None]:
!gedit src/string_01.cpp

In [None]:
!g++ -o string_01 src/string_01.cpp

In [None]:
!./string_01

There are more compact ways to manipulate strings, but they require the use of **iterators**, which we will come back to later. I put an example of that in the second problem, so feel free to use it if you want.

## Problems

### Problem 1

Given an array of length `N` that contains only integers, print the special numbers of the array. A number in the array is called a special number if it is divisible by at least one other number in the array. `N` should be passed as a command line argument, and the code will generate random integers between `1` and `5*N`.

In [None]:
!gedit src/problem_01.cpp

In [None]:
!g++ -o problem_01 src/problem_01.cpp
!g++ -o problem_01_sol src/problem_01_sol.cpp

In [None]:
print("Your solution:")
!./problem_01 10
print("\nSolution:")
!./problem_01_sol 10

### Problem 2

Given two strings as command line arguments, make a program that removes from the second string all the letters that appear in the first one. As an example, `my` and `Marc is amazing` will return `arc is aazing` (note that in this example the match ignores case: the capital `M` is removed too).

In [None]:
!gedit src/problem_02.cpp

In [None]:
!g++ -o problem_02 src/problem_02.cpp
!g++ -o problem_02_sol src/problem_02_sol.cpp

In [None]:
print("Your solution:")
!./problem_02 "ma" "Marc is amaZing" 
print("\nSolution:")
!./problem_02_sol "ma" "Marc is amaZing"