# Introduction to Python and Natural Language Technologies

## Type system and built-in types

## Errata and Q&A

19 September 2017

# Errata

In this section, we shall fix some omissions from the lecture slides and try to clear up any confusion about certain concepts that we saw (or induced) during the lecture and the lab.

## 1. In-place sorting

In Python, the `sorted()` function can be used on any **iterable** (not only sequences) to return a sorted **list**:

In [1]:
l1 = sorted(['b', 'c', 'a'])  # a list
l2 = sorted(('b', 'c', 'a'))  # a tuple
l3 = sorted('bca')  # a string

print(l1, l1 == l2 == l3)
print(type(l1) == type(l2) == type(l3) == list)

['a', 'b', 'c'] True
True
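
Since `sorted()` works on any iterable, it can also be called on objects that are not sequences. A minimal sketch with a set, a dictionary (iterating over it yields its keys) and a generator expression; the variable names are only illustrative:

```python
# sorted() consumes any iterable and always builds a new list
from_set = sorted({'b', 'c', 'a'})            # a set (unordered)
from_dict = sorted({'b': 2, 'c': 3, 'a': 1})  # a dict: iterates over the keys
from_gen = sorted(ch for ch in 'bca')         # a generator expression

print(from_set, from_set == from_dict == from_gen)
print(type(from_set) == type(from_dict) == type(from_gen) == list)
```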


`sorted()` always returns a new object.

In [2]:
l = [1, 2, 3]
s = sorted(l)
print('This should be False:', id(l) == id(s))

This should be False: False


Lists can also be sorted in place with the `sort()` _method_ of `list`. Like `sorted()`, it accepts the `key` and `reverse` parameters; unlike `sorted()`, it modifies the list itself and returns `None`.

In [3]:
l = [2, 3, 1]
s = sorted(l)
print(l, s, l == s)  # l is unsorted, s is a new list
l.sort()
print(l, s, l == s)  # Now l is sorted

[2, 3, 1] [1, 2, 3] False
[1, 2, 3] [1, 2, 3] True
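
The `key` and `reverse` parameters were not shown above, so here is a small sketch of how they might be used. The word list is made up for illustration:

```python
words = ['pear', 'Fig', 'banana', 'kiwi']

# key: a one-argument function; its return value is compared instead of the item
by_length = sorted(words, key=len)
# reverse=True sorts in descending order
descending = sorted(words, key=str.lower, reverse=True)

# list.sort() takes the same parameters, but modifies the list and returns None
result = words.sort(key=str.lower)

print(by_length)
print(descending)
print(words, result)
```

Because `sort()` returns `None`, writing `l = l.sort()` loses the list; call the method on its own line instead.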


## 2. Test for equality

It is worth noting that all built-in Python types support equality testing with the operator `==`. For numbers, strings and the built-in containers, the comparison is **by value**. To test whether two names refer to the same object, use `o1 is o2`, which gives the same result as `id(o1) == id(o2)`.

In [4]:
l1 = [1, 2, 3]
l2 = [1, 2, 3]
print(l1 == l2, id(l1) == id(l2))  # True, False

True False
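
The identity test is usually written with the `is` operator. The sketch below contrasts a second name for the same list with an equal but separate list; `l3` is a name introduced only for this example:

```python
l1 = [1, 2, 3]
l2 = [1, 2, 3]   # equal value, but a separate object
l3 = l1          # a second name for the same object

print(l1 == l2, l1 is l2)  # equal, not identical
print(l1 == l3, l1 is l3)  # equal and identical

l3.append(4)               # a change made through one name is visible through the other
print(l1, l2)
```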


`str`, `tuple` and `list` also support `<`, `<=`, etc., and compare lexicographically. `set` supports these operators too, but the semantics are different: they test for subsets ($\subset$, $\subseteq$, ...).
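
A short sketch of the two behaviours, using made-up values. Sequences are compared element by element from the left, while sets are compared by inclusion, so two sets can be unordered with respect to each other:

```python
# Sequences: the first differing element decides
print('apple' < 'banana')     # 'a' < 'b'
print((1, 2, 3) < (1, 3))     # 2 < 3; the lengths do not matter here
print([1, 2] < [1, 2, 0])     # a proper prefix is smaller

# Sets: < is "proper subset", <= is "subset"
a, b, c = {1, 2}, {1, 2, 3}, {4}
print(a < b, a <= a, a < a)
print(a < c, c < a, a == c)   # neither contains the other, and they are not equal
```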

## 3. Parameters of `sorted()`

There was a bit of confusion on the part of your lecturers about the `cmp` parameter of `sorted()`. To clear it up: there is **no `cmp` parameter in Python 3**. It was available in Python 2, and it accepted a three-way comparison function, similar to how sorting is done in C, Java, Ruby, etc.

In contrast, Python 3 sorts by comparing the items with `<`, just as C++ does by default.
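
To illustrate, a class that defines only `__lt__()` (the method behind `<`) can already be sorted. `Version` is a hypothetical class written for this sketch, not something from the lecture:

```python
class Version:
    """Hypothetical class that defines only the '<' operator."""

    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __lt__(self, other):
        # Delegate to the lexicographic comparison of tuples
        return (self.major, self.minor) < (other.major, other.minor)

    def __repr__(self):
        return 'Version({}, {})'.format(self.major, self.minor)


versions = [Version(3, 6), Version(2, 7), Version(3, 5)]
print(sorted(versions))
```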

More on sorting (advanced): https://docs.python.org/3/howto/sorting.html#sortinghowto
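
If you do have an old-style three-way comparison function, the standard library's `functools.cmp_to_key()` wraps it so that it can be passed as `key`. A minimal sketch; `compare_length` is a made-up example function:

```python
from functools import cmp_to_key


def compare_length(a, b):
    """Old-style three-way comparison: negative, zero or positive."""
    return len(a) - len(b)


words = ['ccc', 'a', 'bb']
via_cmp = sorted(words, key=cmp_to_key(compare_length))
via_key = sorted(words, key=len)  # the usual Python 3 way to express the same order

print(via_cmp, via_cmp == via_key)
```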

# Q&A

Here we address some of the more interesting questions that you asked during the classes.

## 1. Immutability vs constness

The question was whether immutability in Python is similar to C/C++'s keyword `const`. Superficially, these concepts seem similar, as both mean that the value of an object cannot be changed. However, the answer is "mostly not".

In a nutshell, constness is the property of the **reference** (or pointer); immutability is the property of the **object** itself. As there is no `const` in Python, the examples below are from C++. If you are only interested in Python, just skip this question.
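
For those who skip the C++ part, here is a rough Python-only sketch of the "property of the object" half of this statement. An immutable object refuses modification through every reference to it, and rebinding a name is not the same as changing the object. The names `alias` and `mixed` are illustrative only:

```python
t = (1, 2, 3)
alias = t          # a second reference to the same tuple

errors = []
for ref in (t, alias):
    try:
        ref[0] = 42            # refused regardless of which reference is used
    except TypeError as e:
        errors.append(str(e))
print(errors)

# Rebinding a name does not mutate anything: the name just points elsewhere
t = (4, 5, 6)
print(t, alias)

# Immutability is shallow: a tuple may still hold a mutable object
mixed = (1, [2, 3])
mixed[1].append(4)
print(mixed)
```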

In C++, `const` is used in two contexts:

1. To regulate write access to mutable objects
2. To define constants

### 1. Write access to mutable objects

```C++
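#include <iostream>

// Minimal mutable class, assumed here so that the example is complete
class MutableObject {
  int value_;

 public:
  MutableObject(int value) : value_(value) {}
  void set(int value) { value_ = value; }
  int get() const { return value_; }
};

void change_value_m(MutableObject& mref) {
  mref.set(2);  // This modifies the object's value
}

void change_value_c(MutableObject const& cref) {
  cref.set(3);  // **Compilation error**: set() is not a const method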
}

int main(int argc, char* argv[]) {
  MutableObject mo(1);
  change_value_m(mo);
  change_value_c(mo);
  std::cout << mo.get() << std::endl;
}
```

However, the `const`ness of a reference can be cast away:
```C++
void change_value_c2(MutableObject const& cref) {
  MutableObject& mref = const_cast<MutableObject&>(cref);
  mref.set(3);  // This works
}
```

On the other hand, if an object is immutable, it does not even have a `set()` or `operator=()` method:
```C++
class ImmutableObject {
  // The value is hidden inside the object; being const, it also
  // disables the implicitly generated operator=()
  const int value_;

 public:
  ImmutableObject(int value) : value_(value) {}
  int get() const { return value_; }
};
```

This is also how the Python `tuple` is made immutable. "Item assignment" in the error message below refers to the (lack of the) `__setitem__` method (see next lecture).

In [5]:
(1, 2, 3)[2] = 4

TypeError: 'tuple' object does not support item assignment
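
The missing method can be checked directly. A small sketch, assuming nothing beyond the built-in `list` and `tuple` types:

```python
# list implements __setitem__, tuple does not
print(hasattr(list, '__setitem__'), hasattr(tuple, '__setitem__'))

l = [1, 2, 3]
l[2] = 4              # item assignment is carried out by list.__setitem__
print(l)

t = (1, 2, 3)
try:
    t[2] = 4
except TypeError as e:
    print('TypeError:', e)
print(t)              # unchanged
```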

### 2. Constants

In C++, the `const` keyword can also be used to define constants. Of the two use cases, this is the one that is more easily confused with immutability.
```C++
int main(int argc, char* argv[]) {
  const char* hello = "Hello World!";
}
```

However, `hello` is still just a pointer to const, and the constness can be cast away with `const_cast`. The difference is that string literals are created in static storage, and the compiler may optimize this by e.g. placing them in read-only memory. The standard does not specify how these constants are stored, and modifying a string literal is undefined behavior, so the following code may or may not work, or may even crash your program.

```C++
int main(int argc, char* argv[]) {
  const char* hello = "Hello World!";
  const_cast<char*>(hello)[0] = 'B';  // Undefined behavior
}
```

In other words, it is not so much "cannot modify" as "don't modify, or you might break something horribly". :)
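
For comparison, a sketch of the closest Python counterpart. A `str` is immutable, so the attempt below fails in a well-defined way with a `TypeError`, and the only way to get a "changed" string is to build a new one. The name `bello` is made up for this example:

```python
hello = 'Hello World!'
try:
    hello[0] = 'B'           # str has no item assignment
except TypeError as e:
    print('TypeError:', e)

bello = 'B' + hello[1:]      # build a new string instead
print(hello, bello, sep=' | ')
```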