# Lecture 2

### boilerplate for snippets 

In [None]:
// general includes
#include <iostream>   // std::cout|endl 
#include <vector>     // std::vector
#include <cstddef>    // std::size_t
#include <functional> // std::function
//#include "lecture2.hpp"
using namespace std;

In [3]:
namespace iue {
using SomeType = double;
SomeType myadd(SomeType a, SomeType b){return a + b;}
SomeType mymul(SomeType a, SomeType b){return a * b;}
SomeType mydiv(SomeType a, SomeType b){return a / b;}
}

In [5]:
using namespace iue;

## Goals for today




- how to digest C++ source code: *take the perspective of the compiler*
- local reasoning about code/effects: *value vs. reference semantics*


## Expressions, types, value categories

Looking at a fragment of C++ source code, it is not immediately apparent what exactly its effects are:


In [4]:
using Type = SomeType;
Type b{1};                            // (1) 
Type a = b;                           // (2) 
Type c = a + b + 7;                   // (3) 

**What is the state of `b` after (1)?**

**How are `a` and `b` related after (2)?**

**Does (3) have side effects on `a`? What about `b`?**

To answer these questions, it is required to know details about the involved *types* and how the statements in the source code are mapped to certain functionality of those types (e.g. overloaded operators). Additionally, knowledge of the relevant parts of the language specification is required (e.g. precedence of operators). 
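
As a minimal sketch of why the involved type matters, consider a hypothetical type `Sneaky` (an assumption for illustration, not part of the lecture code) whose `operator+` modifies its left operand. The same three statements that are free of side effects for `double` then change the state of `a`:

```cpp
#include <cassert>

// Hypothetical type (not from the lecture): operator+ mutates its left operand.
struct Sneaky {
    double value;
    int uses = 0;                           // how often this object was the left operand of +
    Sneaky(double v) : value{v} {}          // converting constructor, allows `x + 7`
    Sneaky operator+(const Sneaky& rhs) {   // deliberately NOT a const member function
        ++uses;                             // side effect on the left operand
        return Sneaky{value + rhs.value};
    }
};

// The three statements from the snippet above, with Type = Sneaky.
int sneaky_demo() {
    using Type = Sneaky;
    Type b{1};              // (1)
    Type a = b;             // (2) a is an independent copy of b
    Type c = a + b + 7;     // (3) calls a.operator+(b), then operator+ on the temporary
    assert(c.value == 9.0); // 1 + 1 + 7
    assert(b.uses == 0);    // b was only ever a right operand
    return a.uses;          // a was modified once by (3)
}
```

Nothing in statement (3) itself reveals this; only the definition of `Sneaky::operator+` does.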

## Expressions

An *expression* [(cppref)](https://en.cppreference.com/w/cpp/language/expressions) describes a computation by prescribing a sequence of operations to be performed on a set of operands. For the *fundamental type* `int`, some expressions where the effect of the involved operations can be guessed are:


In [5]:
using Type = int; 
Type a{1};
Type b{2};
Type c{3};
Type res; 
res = a + b;                                 // (1)
res = a + b * c;                             // (2)
res = 2 + 1 / c + c;                         // (3)
res = 2.5 + (c + a);                         // (4) 
a + b + c;                                   // (5) 

6

Above, the operands are variables and literals, and all operators are binary operators, i.e. require two operands.

**Name the operators present in the above snippet!**

**Examples for other (non-binary) operators?**
- `++` (increment, unary)
- `*ptr` (dereferencing, unary); `ptr->member` combines dereferencing with member access
- `(double) a` (casting)
- `sqrt(3.0)` (function call)

We could transform the above by introducing functions `myadd`, `mymul`, `mydiv` to perform the binary operations:

In [6]:
res = myadd(a, b);                          // (1)
res = myadd(a, mymul(b, c));                // (2) 
res = myadd(myadd(2, mydiv(1, c)), c);      // (3)
res = myadd(2.5, myadd(c, a));              // (4)  
myadd(myadd(b, c), a);                      // (5) 

6.0000000

**Which knowledge is required for the transformation?**


**Can we also implement/substitute the assignment operator?**
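
One possible answer, as a sketch for `int` only: the helper `myassign` below is hypothetical (it is not part of the lecture code). The key observation is that the target of the assignment has to be passed in a way that lets the function modify the caller's object, here as a non-const reference.

```cpp
#include <cassert>

// Hypothetical helper mirroring the built-in assignment for int.
// The left-hand side is taken by non-const reference so it can be modified.
int& myassign(int& lhs, int rhs) {
    lhs = rhs;
    return lhs;   // like the built-in '=', yield the assigned-to object
}

int myadd_int(int a, int b) { return a + b; }

int assign_demo() {
    int a{1};
    int b{2};
    int res{0};
    myassign(res, myadd_int(a, b));   // corresponds to: res = a + b;
    assert(res == 3);
    return res;
}
```

With a by-value parameter `int lhs` the function would only modify its own copy and `res` would stay unchanged.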

The *order of evaluation* [(cppref)](https://en.cppreference.com/w/cpp/language/eval_order) of sub-expressions is, in general, unspecified; operator precedence and associativity only determine which operands are associated with each operator. This means that in the snippet below there is **no guarantee in which order the calls to the functions `myadd`, `mymul`, and `mydiv` are evaluated**; only the **combination** of the results is defined by the associativity of the operator **`+`**:

In [7]:
res = myadd(a, b) + mymul(a, b) + mydiv(a, b);

To illustrate this effect, you can think of intermediate results of sub-expressions being stored in temporary objects created during execution of an expression. The lifetime of these temporary objects ends with the full-expression: after evaluation of the expression, all temporary objects are destroyed:

In [8]:
{
    Type tmp3 = mydiv(a, b);    // (3)
    Type tmp1 = myadd(a, b);    // (1) 
    Type tmp2 = mymul(a, b);    // (2)
    res = tmp1 + tmp2 + tmp3;   // (4)
}

**Is there any situation where the order of evaluation of subexpressions is a problem?**
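
A sketch of one such situation, using hypothetical helpers (`traced`, `next_id`, and the global `call_log` are assumptions for illustration): as long as the functions have no side effects, the order does not affect the result, but once they share state, the result can depend on the order the compiler picks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helpers (not part of the lecture code).
std::vector<std::string> call_log;   // records the order in which the calls happen

int traced(const std::string& name, int value) {
    call_log.push_back(name);
    return value;
}

int next_id() {
    static int counter = 0;
    return ++counter;                // side effect: every call returns a new value
}

int order_demo() {
    call_log.clear();
    // The sum is always 6, but the order of the three log entries is up to the compiler.
    int sum = traced("add", 1) + traced("mul", 2) + traced("div", 3);
    assert(call_log.size() == 3);

    // Here the order matters: -1 if the left call runs first, +1 if the right one does.
    int diff = next_id() - next_id();
    assert(diff == 1 || diff == -1);
    return sum;
}
```

Storing the intermediate results in named variables, one statement per call, fixes the order explicitly.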

## Types

Up to now we used the *fundamental type* `int`, so we were not really paying much attention, but:

Every *entity* that is used in an *expression* has a *type* [(cppref)](https://en.cppreference.com/w/cpp/language/type); also each of the (sub)expressions.
This type prescribes the effects when an entity is used in a certain context, e.g. as an operand of an operator.



In [9]:
std::vector<double> data(100,0.0); // init  
auto copy = data;           // (1) type will be determined at compile time
const auto& d = data[10];   // (2) type will be determined at compile time 

*All* types are determined and fixed at compile time, even if they are not explicitly visible in the source, e.g. using `auto` deduction rules [(cppref)](https://en.cppreference.com/w/cpp/language/auto).
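
The deduced types can be checked at compile time. The following sketch (the function name `deduction_demo` and the extra variable `e` are additions for illustration) uses `static_assert` together with `std::is_same_v` and assumes C++17 or later:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

std::size_t deduction_demo() {
    std::vector<double> data(100, 0.0);
    auto copy = data;           // deduced: std::vector<double> (an independent copy)
    const auto& d = data[10];   // deduced: const double& (alias to one element)
    auto e = data[10];          // deduced: double (plain auto drops the reference)

    // Checked by the compiler; a wrong guess here is a compile-time error.
    static_assert(std::is_same_v<decltype(copy), std::vector<double>>);
    static_assert(std::is_same_v<decltype(d), const double&>);
    static_assert(std::is_same_v<decltype(e), double>);

    data[10] = 4.2;
    assert(d == 4.2);           // the alias observes the change
    assert(e == 0.0);           // the copy of the element does not
    assert(copy[10] == 0.0);    // neither does the copy of the whole vector
    return copy.size();
}
```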

**What are typical compile-time errors w.r.t. types?**
- Type errors: an operation is used that the involved types do not support

**Discuss pros/cons compared to interpreted languages (Python/JavaScript)?**

Let's now consider a simple *user-defined type* `Widget`:


In [6]:
struct Widget {
    int i;
};

This type is a *trivial type* [(cppref)](https://en.cppreference.com/w/cpp/named_req/TrivialType), just like all *arithmetic types* (e.g. `double`, `char`, `long`, ...).

As `Widget` is trivial, we can construct, assign, copy, and mutate it just like a fundamental type. Nevertheless, it cannot substitute for an `int`, as other operators (arithmetic, increment) are not available.


**How could we adapt `Widget` to obtain a non-trivial type?**
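
One possible adaptation, as a sketch: providing a special member function ourselves. The type `LoggedWidget` below is hypothetical; its user-provided copy constructor makes copying run our own code, so the type is no longer trivially copyable (and therefore not trivial). The checks assume C++17 or later.

```cpp
#include <cassert>
#include <type_traits>

struct Widget {
    int i;
};

// Hypothetical variant of Widget with a user-provided copy constructor.
struct LoggedWidget {
    int i;
    int copies = 0;                       // how many copy steps led to this object
    LoggedWidget(int v) : i{v} {}
    LoggedWidget(const LoggedWidget& other)
        : i{other.i}, copies{other.copies + 1} {}
};

// Widget: copying is a plain memory copy and default construction does nothing.
static_assert(std::is_trivially_copyable_v<Widget>);
static_assert(std::is_trivially_default_constructible_v<Widget>);
// LoggedWidget: copying has to call the user-provided constructor.
static_assert(!std::is_trivially_copyable_v<LoggedWidget>);

int copy_demo() {
    LoggedWidget w{1};
    LoggedWidget w2 = w;     // invokes the user-provided copy constructor
    assert(w2.i == 1);
    return w2.copies;
}
```

Other changes with a similar effect include a user-provided destructor or a default member initializer.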

The type of each expression in C++ belongs to one of these three groups:
- Value types 
- Reference types
- (Pointer types)

### Value types 

Expressions/objects of *value type* directly represent the (memory of the) underlying object.

In [9]:
Widget w{1};    // (1)
Widget w2 = w;  // (2) copy underlying representation (memory)
std::cout << w2.i << std::endl; // (3) printing i

w2.i = 3;
std::cout << w.i<< ", w2: "<< w2.i <<std::endl;

1
1, w2: 3


**Is there any benefit of a value type over a reference type for a function parameter?**

The function receives its own copy, so it can modify the parameter freely without affecting the caller's object.


### Reference types
Reference types represent an alias to an existing object. A reference must be initialized when it is declared and cannot be rebound to a different object afterwards.

In [11]:
Widget w{1};                    // (1) value type
Widget &w2 = w;                 // (2) init w2 as an alias to object w
std::cout << w2.i << std::endl; // (3) perfect alias, no de-referencing required

w2.i = 3;
std::cout << w.i<< ", w2: "<< w2.i <<std::endl;

1
3, w2: 3


**Is there any benefit using a reference instead of a value type for a function parameter?**
Yes, no copy is required!
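
The difference between the two kinds of parameters can be sketched as follows; the function names are hypothetical and chosen only for this illustration:

```cpp
#include <cassert>

struct Widget {
    int i;
};

void bump_by_value(Widget w) { w.i += 1; }        // modifies a private copy
void bump_by_reference(Widget& w) { w.i += 1; }   // modifies the caller's object
int read_only(const Widget& w) { return w.i; }    // no copy and no modification

int parameter_demo() {
    Widget w{1};
    bump_by_value(w);
    assert(w.i == 1);        // caller's object is unchanged
    bump_by_reference(w);
    assert(w.i == 2);        // caller's object was modified through the alias
    return read_only(w);
}
```

For a small type like `Widget` the copy is cheap; the saving becomes relevant for large objects such as a `std::vector` with many elements.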



### (Pointer types)
Pointer types hold the memory address of an object. The stored address can be changed at any time.
To access the underlying object, dereferencing is required.

In [13]:
Widget w{1};                        // (1)
Widget *w2 = &w;                    // (2) w2 holds/references the memory address of w
std::cout << (*w2).i << std::endl;  // (3) de-referencing required when accessing i

1



**Why should you try to avoid using *raw pointers*?**
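
One commonly cited reason, sketched below with a hypothetical helper `read_or`: unlike a reference, a pointer can be reseated to another object or hold `nullptr`, so code receiving a pointer cannot assume it refers to a valid object.

```cpp
#include <cassert>

struct Widget {
    int i;
};

// Hypothetical helper: a pointer parameter may be null, so it must be checked.
int read_or(const Widget* p, int fallback) {
    return p != nullptr ? p->i : fallback;
}

int pointer_demo() {
    Widget w{1};
    Widget other{2};

    Widget* p = &w;
    p->i = 10;                      // same as (*p).i = 10
    p = &other;                     // reseating: p now refers to a different object
    assert(read_or(p, -1) == 2);
    p = nullptr;                    // p refers to no object at all
    assert(read_or(p, -1) == -1);   // dereferencing p here would be undefined behaviour
    return w.i;
}
```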

### Read-only access: `const`
Each type can be qualified as `const`, i.e. only allowing to read, but not modify the underlying object.

### `const` value types

In [14]:
const Widget w{1};    
std::cout << w.i << std::endl;  // (2) ok
w.i = 1;                        // (3) compile error: writing access 

input_line_29:4:5: error: cannot assign to variable 'w' with const-qualified type 'const __cling_N510::Widget'
w.i = 1;                        // (3) compile error: writing access 
~~~ ^
input_line_29:2:15: note: variable 'w' declared const here
 const Widget w{1};    
 ~~~~~~~~~~~~~^~~~


Interpreter Error: 

**Is there any benefit from marking a local variable of value type `const`?**

### `const` reference types

In [None]:
Widget w{1};                    // (1)
const Widget &w2 = w;           // (2) init w2 as an alias (read-only) to object w
std::cout << w2.i << std::endl; // (3) ok
w2.i = 1;                       // (4) compile error: writing access
w.i = 1;                        // works

**Is there any benefit from marking a reference `const`?**

### (`const` pointer types)

In [None]:
Widget w{1};                        // (1)
const Widget *w2 = &w;              // (2) pointer to const: w2 holds the address of w (read-only access)
std::cout << (*w2).i << std::endl;  // (3) ok
(*w2).i = 1;                        // (4) compile error: writing access

**Again, why should you also try to avoid using `const` *raw pointers* ?**
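
Part of the difficulty is that `const` can apply to two different things: the object pointed to, or the pointer itself. A small sketch (the function name `const_pointer_demo` is hypothetical):

```cpp
#include <cassert>

struct Widget {
    int i;
};

int const_pointer_demo() {
    Widget w{1};
    Widget other{2};

    const Widget* p1 = &w;   // pointer to const: read-only access, but can be reseated
    p1 = &other;             // ok
    // p1->i = 3;            // compile error: write access through pointer to const

    Widget* const p2 = &w;   // const pointer: write access, but cannot be reseated
    p2->i = 5;               // ok
    // p2 = &other;          // compile error: the pointer itself is const

    assert(w.i == 5);
    return p1->i + p2->i;    // 2 + 5
}
```

The snippet above this question uses the first form, a pointer to a `const Widget`.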

## Value Categories

In addition to its *type*, each expression is also characterized by its *value category* [(cppref)](https://en.cppreference.com/w/cpp/language/value_category).
Each expression belongs either to
- the *lvalue expression* category (designating objects with a storage location) or to
- the *rvalue expression* category (no storage location associated).

**Note**: actually there are more *value categories*, but we come back to this in a later lecture.

The consequence is that *rvalue expressions* (rvalues) cannot stand on the left-hand-side of an assignment, as no storage location is associated, which could serve as target for a meaningful assignment. So rvalues can solely appear on the right-hand-side of an assignment, hence their name.
On the other hand, *lvalue expressions* (lvalues) can be the target for an assignment, i.e. can stand on the left-hand-side but can also be used on the right-hand-side of an assignment.

Based on this separation, the language defines exact rules for which expressions (e.g. assignments) are valid, depending on the types **and** the value categories of the operands on the left-hand side and right-hand side.

In [None]:
// LHS = RHS
int i = 1.0; 
i = 1.0;     // (1) lvalue = rvalue --> works
2.0 = i;     // (2) rvalue = lvalue --> not allowed
i = i;       // (3) lvalue = lvalue --> works
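
The value category of an expression can also be queried at compile time. The sketch below relies on the fact that `decltype((expr))`, with double parentheses, yields an lvalue reference type exactly when `expr` is an lvalue; the function name is hypothetical and C++17 or later is assumed.

```cpp
#include <cassert>
#include <type_traits>

int value_category_demo() {
    int i = 1;

    static_assert(std::is_lvalue_reference_v<decltype((i))>);   // `i` is an lvalue
    static_assert(!std::is_reference_v<decltype((i + 1))>);     // `i + 1` is an rvalue
    static_assert(!std::is_reference_v<decltype((2.0))>);       // the literal `2.0` is an rvalue

    i = i + 1;      // lvalue = rvalue --> works
    // 2.0 = i;     // rvalue = lvalue --> would not compile
    assert(i == 2);
    return i;
}
```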


### Non-const lvalue references

**Rule**: non-const lvalue references can only be initialized using lvalue expressions designating non-const objects.

In [12]:
Widget var{1}; // init
const Widget cvar{};
Widget &lref1 = var;            // (1) lvalue = lvalue --> works
Widget &lref2 = cvar;           // (2) lvalue = const lvalue --> does not work
Widget &lref3 = Widget{};       // (3) lvalue = rvalue (temporary) --> does not work

input_line_26:5:9: error: binding reference of type '__cling_N52::Widget' to value of type 'const __cling_N52::Widget' drops 'const' qualifier
Widget &lref2 = cvar;           // (2) lvalue = const lvalue --> does not work
        ^       ~~~~
input_line_26:6:9: error: non-const lvalue reference to type '__cling_N52::Widget' cannot bind to a temporary of type '__cling_N52::Widget'
Widget &lref3 = Widget{};       // (3) lvalue = rvalue (temporary) --> does not work
        ^       ~~~~~~~~


Interpreter Error: 

The situation for function arguments is identical:

In [None]:
void myfunc(Widget &lref){} 

Widget var{};
const Widget cvar{};

myfunc(var);            // (1) ... same as above
myfunc(cvar);           // (2) ... same as above 
myfunc(Widget{});       // (3) ... same as above


### Const lvalue references

**Rule**: If an lvalue reference is declared `const`, it can also be initialized with const lvalues **and rvalues**.


In [13]:
Widget var{1}; // init
const Widget cvar{};
const Widget &lref1 = var;      // (1) const lvalue ref = lvalue --> works
const Widget &lref2 = cvar;     // (2) const lvalue ref = const lvalue --> works
const Widget &lref3 = Widget{}; // (3) const lvalue ref = rvalue  --> works 

The situation for function arguments is identical:

In [14]:
void myfunc(const Widget &lref){}      

Widget var{};
const Widget cvar{};

myfunc(var);            // (1) ... same as above
myfunc(cvar);           // (2) ... same as above  
myfunc(Widget{});       // (3) ... same as above
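
Both binding rules can be observed together by providing two overloads and checking which one the compiler selects. The function `which` is a hypothetical helper for this sketch:

```cpp
#include <cassert>
#include <string>

struct Widget {
    int i;
};

// Hypothetical overload pair: the compiler picks the best match that is allowed to bind.
std::string which(Widget&) { return "non-const lvalue reference"; }
std::string which(const Widget&) { return "const lvalue reference"; }

void binding_demo() {
    Widget var{};
    const Widget cvar{};

    assert(which(var) == "non-const lvalue reference");   // (1) both bind, non-const is preferred
    assert(which(cvar) == "const lvalue reference");      // (2) only the const overload can bind
    assert(which(Widget{}) == "const lvalue reference");  // (3) rvalue: only the const overload can bind
}
```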


# Summary

- besides declarations and definitions, nearly everything is an expression or operator
- each expression has a type
- three main flavours of types: value, reference, pointer
- types can be qualified as `const`: prohibit mutating access
- each expression also has a value category: lvalue or rvalue
- binding rules for references
- do not use *raw pointers* (if there is no proper reason)

## Revisiting ex0

Equipped with this knowledge we can now look again at the function declarations in **ex0** and reason about the choice of types for the arguments for the individual functions:


In [None]:
namespace ex0 {

using Vector = std::vector<double>;
using Compare = std::function<bool(const double& a, const double& b)>;

void print(const Vector& vec);
void reset(Vector& vec);
Vector copy(const Vector& vec);
Vector concat(const Vector& a, const Vector& b);
void swap(Vector& a, Vector& b);
void fill_uniform_random(Vector& vec, std::size_t n, double lower, double upper);
void sort(Vector& vec, Compare comp);

}
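
To connect the parameter types to what the functions do, here is a sketch of how three of them might be implemented. These bodies are assumptions for illustration, placed in a separate hypothetical namespace `ex0_sketch`; they are not the reference solution of **ex0**.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

namespace ex0_sketch {   // hypothetical namespace, not the actual ex0 code

using Vector = std::vector<double>;
using Compare = std::function<bool(const double& a, const double& b)>;

// Inputs are only read (const references, no copies); the result is a new object.
Vector concat(const Vector& a, const Vector& b) {
    Vector result = a;
    result.insert(result.end(), b.begin(), b.end());
    return result;
}

// Both arguments are modified, so both are non-const references.
void swap(Vector& a, Vector& b) { a.swap(b); }

// The vector is modified in place; the comparison object is cheap to pass by value.
void sort(Vector& vec, Compare comp) { std::sort(vec.begin(), vec.end(), comp); }

}  // namespace ex0_sketch

void sketch_demo() {
    ex0_sketch::Vector a{3.0, 1.0};
    ex0_sketch::Vector b{2.0};

    ex0_sketch::Vector c = ex0_sketch::concat(a, b);
    assert(c.size() == 3 && a.size() == 2 && b.size() == 1);   // inputs untouched

    ex0_sketch::swap(a, b);
    assert(a.size() == 1 && b.size() == 2);                    // both inputs changed

    ex0_sketch::sort(c, [](const double& x, const double& y) { return x < y; });
    assert((c == ex0_sketch::Vector{1.0, 2.0, 3.0}));
}
```

Because of the binding rules above, `concat` also accepts temporaries and `const` vectors, whereas `swap` and `sort` only accept modifiable lvalues.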