# Lesson 2 - Intro to Python Data Types and Strings

Python has eight main datatypes:

 0. `bool`: "Boolean" values, either `True` or `False`
 1. `int`: integers, e.g. `3`, `93`, `19494`, `0`, or `83`
 2. `float`: decimal numbers, e.g. `3.14159`, `2.71828`, or `2.5`
 3. `str`: a collection ("string") of characters, e.g. `"This can be LITERALLY anything!"`
 4. `list`: a collection of anything, e.g. `["cat", 2.4, 53]`, that is _editable_ ("mutable").
 5. `tuple`: a collection of anything, e.g. `("cat", 2.4, 53)`, that is _non-editable_ ("immutable").
 6. `dict`: a collection of "key" and "value" pairs (keys correspond to values), e.g. `{"North": "Up arrow", "West": "Left arrow", "East": "Right arrow", "South": "Down arrow"}`
 7. `set`: a collection of _unique_ values, e.g. `{"a", "b", "c"}`. Duplicate values do not accumulate in a set.
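
As a quick check of the list above, the built-in `type(...)` function reports which datatype a value has. This is only a sketch; the example values and the name `unique_letters` are made up for illustration.

```python
# One example value for each of the eight datatypes
print(type(True))                    # <class 'bool'>
print(type(83))                      # <class 'int'>
print(type(2.5))                     # <class 'float'>
print(type("cat"))                   # <class 'str'>
print(type(["cat", 2.4, 53]))        # <class 'list'>
print(type(("cat", 2.4, 53)))        # <class 'tuple'>
print(type({"North": "Up arrow"}))   # <class 'dict'>
print(type({"a", "b", "c"}))         # <class 'set'>

# Duplicates do not accumulate in a set
unique_letters = {"a", "a", "b"}
print(len(unique_letters))           # 2
```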
 
 
### We will be looking at `bool`, `int`, `float`, `list`, and, especially, `str` in this week's lesson.

**`tuple`, `dict`, and `set` will be discussed in a future lesson**.
 
 

# 0. True and False: `bool`

`True` and `False` are the only values permitted to be a `bool`.

`1` and `0` are considered to be equal to `True` and `False`, respectively.

_Any_ value in Python can be _converted_ into a `bool` by using the `bool(...)` function.
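
A short sketch of these rules, using made-up values. The general pattern is that zero and "empty" values convert to `False`, while everything else converts to `True`.

```python
# True and False compare equal to 1 and 0
print(True == 1)      # True
print(False == 0)     # True

# bool(...) converts other values
print(bool(0))        # False
print(bool(42))       # True
print(bool(""))       # False (an empty string)
print(bool("False"))  # True (any non-empty string, whatever it says)
print(bool([]))       # False (an empty list)
```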

# 1 & 2. Numbers: `int` and `float`
`int` and `float` are numbers and can be treated just like numbers in normal calculations.
* `int`s are used for _discrete_ (countable) quantities because whole numbers are stored exactly
* `float`s are used for any number with a decimal place (_continuous_ quantities)

The name `float` comes from the term "floating point number". "Floating point" refers to how the number is stored in memory as a binary representation. In other languages it might be called a "double". Floating point numbers are common to almost all computers.
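
One practical consequence of the binary representation is that some decimal values can only be stored approximately. The sketch below shows the classic case; the variable name `tenth_sum` is just for illustration.

```python
tenth_sum = 0.1 + 0.2
print(tenth_sum)                   # 0.30000000000000004
print(tenth_sum == 0.3)            # False

# Whole-number arithmetic with ints has no such surprise
print(1 + 2 == 3)                  # True

# Rounding is one common way to compare floats
print(round(tenth_sum, 2) == 0.3)  # True
```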

You can use standard arithmetical operators on both `float`s and `int`s.

**Python Arithmetic Operators**
* `+`: addition
* `-`: subtraction
* `*`: multiplication
* `/`: division
* `**`: exponentiation (raise to a power)
* `//`: floor division (the quotient, rounded down to a whole number)
* `%`: modulo (the remainder left over after floor division)

Additionally, you can use parentheses, `(` and `)`, to group calculations as you would normally expect.
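
Here is each operator in action on the same pair of numbers (the numbers are arbitrary):

```python
print(7 + 2)        # 9
print(7 - 2)        # 5
print(7 * 2)        # 14
print(7 / 2)        # 3.5 (division always gives a float)
print(7 ** 2)       # 49
print(7 // 2)       # 3 (how many whole times 2 goes into 7)
print(7 % 2)        # 1 (what is left over)

# Parentheses change the order of the calculation
print(7 + 2 * 3)    # 13
print((7 + 2) * 3)  # 27
```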

The functions `int(...)` and `float(...)` can be used to convert certain values to `int`s and `float`s, respectively.
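
A few conversions that work, and one that does not, as a rough guide to what "certain values" means here. The example values are made up.

```python
print(int("28"))      # 28
print(float("34.5"))  # 34.5
print(float(3))       # 3.0
print(int(3.99))      # 3 (the decimal part is cut off, not rounded)

# Text that does not look like a whole number cannot become an int:
# int("34.5") raises a ValueError
print(int(float("34.5")))  # 34 (convert to float first, then to int)
```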

# 3. Words and Text: `str`
Strings (`str`) are one of the most common data types in Python. Most data that we read from external sources will be "parsed" (processed and interpreted) by Python as strings of characters, which Python calls `str`.

We can make strings by putting quotes around **anything**.

```python
a = "This is a string" # You can use double quotes
b = 'This is a string, too' # Or single quotes
```

These are also strings:
```python
c = "34.5" # This is a str, not a float
d = "28" # This is a str, not an int
e = 'print(2 + 4**3)' # This is a str, not a function call
```

Strings have some special characters that are known as "escaped" characters. Escaped characters start with a backslash `\`.

* `'\n'`: New line character
* `'\t'`: Tab character
* `'\r'`: Carriage return character (brings the "cursor" to the "beginning of the line")
* `'\b'`: Backspace

To actually write a backslash in your string, you have to "escape" the backslash: `"\\"`
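
A small sketch of escaped characters in use. The variable names and the file path are hypothetical. Note that each escaped character counts as a single character even though we type two.

```python
two_lines = "first line\nsecond line"
print(two_lines)      # prints on two separate lines

print(len("\n"))      # 1
print(len("\\"))      # 1

windows_path = "C:\\Users\\student"
print(windows_path)   # C:\Users\student
```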

## Representations vs Renderings

With `str`, and other kinds of objects, you can see it in one of two ways:
1. The object's "representation" (or "repr")
2. The object's "rendering"

```python
my_str = "This is a string\nsplit over\nthree lines"
my_str
```

vs.

```python
print(my_str)
```

* The "repr" is viewable whenever you "inspect" an object on the command line or when you use the `repr()` function.
* The "rendering", for a `str`, is viewable whenever you use `print()` or when you write the string to a file.

```python
my_str = "\tThis is a string\nsplit over\nthree lines"
print(my_str)
```

```
	This is a string
split over
three lines
```
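
To see both views from inside a script rather than at the command line, one option is to call `repr()` explicitly. A minimal sketch (the name `as_repr` is just for illustration):

```python
my_str = "This is a string\nsplit over\nthree lines"

as_repr = repr(my_str)
print(as_repr)   # 'This is a string\nsplit over\nthree lines'

print(my_str)    # the rendering: the text appears on three lines
```

The repr keeps the quotes and shows `\n` as two visible characters, which makes it handy for spotting hidden whitespace.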


### A brief note about string encoding

Characters in strings are actually stored as numbers in the computer's memory, e.g. "A" might be mapped to one number, "a" to a different number.

For those who have heard of the acronym "ASCII", it refers to one such character encoding. Back in the day, English-speaking developers were only worried about the Latin alphabet, so they created an encoding system that mapped the numbers 0-127 to the characters on an English keyboard.

Nowadays, we expect computers to be able to work with ANY character set. ANSI and ASCII are not capable of this, so the encoding `UTF-8` was created in 1992 to accomplish it.

UTF-8 is the default encoding for Python 3 source code and for converting a `str` to bytes, but it is possible to use other encodings if you encounter data that were encoded differently.

```python
# These strings demonstrate Python's utf-8 encodings

f = "如果您可以阅读此内容，则本课程似乎进展顺利。"
g = "إذا كنت تستطيع قراءة هذا ، يبدو أن هذا الدرس يسير على ما يرام."
h = "यदि आप इसे पढ़ सकते हैं, तो ऐसा लगता है कि यह पाठ ठीक चल रहा है।"
j = "ប្រសិនបើអ្នកអាចអានវាហាក់ដូចជាមេរៀននេះដំណើរការល្អ។"
k = "😀"
```
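
To peek at the numbers behind the characters, the built-ins `ord(...)` and `chr(...)` convert between a character and its number, and the `str` method `.encode(...)` shows how many bytes a string takes up in a given encoding. This is a sketch for the curious; the variable name `smiley` is made up.

```python
print(ord("A"))    # 65
print(ord("a"))    # 97
print(chr(65))     # A

smiley = "😀"
print(len(smiley))                   # 1 (one character)
print(len(smiley.encode("utf-8")))   # 4 (stored as four bytes in UTF-8)
print(len("A".encode("utf-8")))      # 1 (stored as one byte in UTF-8)
```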

In Python, you can convert _anything_ into a `str` with the `str(...)` function.
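
For example (values made up for illustration):

```python
print(str(34.5))           # 34.5 (now a str, not a float)
print(str(True))           # True
print(str([1, 2, 3]))      # [1, 2, 3]

# Converting is what lets a number be joined to other text
print(str(28) + " days")   # 28 days
```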

# 4. Basic collections (or "arrays"): `list`
A `list` is a collection of Python objects that can be anything you like:

```python
my_first_list = ["Apples", "Oranges", "Bananas", "Pears"]
my_second_list = [12, 43.3, 56]
my_third_list = ["Cars", 42, "😀"]
```

A `list` can also contain other lists:

```python
two_lists_within_a_list = ["abc", [12, 634], [89.3, 0.0001, 342]]
a_nested_list = [[["a"]]]
```

You can convert other _iterables_ to a list by using the `list(...)` function.

> Note: Converting a `str` to a `list` will break out each character as a list item.
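
A minimal sketch of both cases; the variable names are just for illustration.

```python
letters = list("cat")
print(letters)       # ['c', 'a', 't']

from_tuple = list(("cat", 2.4, 53))
print(from_tuple)    # ['cat', 2.4, 53]

# A single number is not an iterable, so list(53) raises a TypeError
```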

# What `str` and `list` have in common

Both `str` and `list` are considered _sequences_. In Python, every sequence is also an _**iterable**_: something whose items you can step through one at a time.

* A `str` is an iterable of individual characters
* A `list` is an iterable of other objects, which may be `str`, `int`, `float`, `bool`, other `list`s, or whatever.

Because `str` and `list` are both sequences, they have certain useful traits in common:

1. **Indexing**
2. **Add items with `+` and `*`**
3. **Built-in _methods_**

## 1. Indexing

Both characters in a `str` and objects in a `list` can be _indexed_. This means we can access individual _members_ of the `str` or `list` if we know their position within the `str` or `list`.

```python
b = 'This is a string.'
my_first_list = ["Apples", "Oranges", "Bananas", "Pears"]
```

```python
b[0] # This accesses the first character in the str
my_first_list[2] # This accesses the third object in the list
```

However, `int`s and `float`s are not indexable (or "subscriptable"):
```python
m = 234.234009234
n = 8982389482
```

```python
m[1] # This won't work to get the second number
n[5] # This won't work either (to get the sixth number)
```

```python
m = "234.234009234" # Now these are subscriptable
n = "8982389482" # Now these are subscriptable
```
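
If you have a number and want one of its digits, one approach is to convert it with `str(...)` first and then index the result. A sketch, with the names `n_as_text` and `sixth_digit` made up for illustration:

```python
n = 8982389482

# n[5] raises TypeError: 'int' object is not subscriptable
n_as_text = str(n)
print(n_as_text[5])                # 8 (still a str)

sixth_digit = int(n_as_text[5])    # convert back to do arithmetic with it
print(sixth_digit + 1)             # 9
```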

### Note: Python is "zero indexed"
You may have noticed that my numbering system does not start from `1`. I have been numbering my bullets starting from `0`. Python is what is called a "zero indexed" language. This means that all numbering starts from `0` and goes up.

When indexing a sequence, you have to remember that `0` is the _first_ number, not `1`. It takes some getting used to but it eventually becomes second-nature.

This is in contrast to something like MS Excel, which is "one indexed" (numbers start from `1`).
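
A consequence worth seeing once: in a list of four items, the positions run from `0` to `3`, so the last position is always one less than the length.

```python
my_first_list = ["Apples", "Oranges", "Bananas", "Pears"]

print(my_first_list[0])     # Apples (the first item)
print(len(my_first_list))   # 4
print(my_first_list[3])     # Pears (the last item, at position 4 - 1)
```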

### Indexing syntax `[start:stop:step]`

Indexing is not just used to get single members of a sequence. It can be used to get a range of members or some selection of members.

Here are examples:
```python
diary_entry = 'If you want to destroy my sweater/Hold this thread as I walk away'

diary_entry[3:6] # you
diary_entry[15:22] # destroy
```

The example above demonstrates how to extract sub-strings from a larger string.


```python
shopping_list = ["Apples", "Oranges", "Bananas", "Pears", "Mangos", "Mangosteens", "Pandan leaf", "Betel leaf"]

shopping_list[2:6] # ["Bananas", "Pears", "Mangos", "Mangosteens"]
shopping_list[6:] # Read as "start at position six, do not stop": ["Pandan leaf", "Betel leaf"]
```

The example above demonstrates how to extract list items from a larger list.

```python
numbers_0_thru_20 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
two_times_table = numbers_0_thru_20[0::2] # Read as "start at position zero, do not stop, every 2nd item"
two_times_table # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```
This example shows you how you can use the `step` parameter to "skip" items in a list.

### Understanding `[start:stop:step]`

* `[x]` - Get the item at position `x`
* `[x:y]` - Get the items starting at `x` and stop BEFORE `y` (i.e., does not include position `y`)
* `[x:y:z]` - Get the items starting at `x` and stop before `y`, retrieving every `z`th item.
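
The three forms above, tried on a short string (the string `letters` is made up for illustration; position 0 is `"a"`, position 9 is `"j"`):

```python
letters = "abcdefghij"

print(letters[2])       # c
print(letters[2:5])     # cde (positions 2, 3, 4; position 5 is not included)
print(letters[2:9:3])   # cfi (positions 2, 5, 8)
```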

#### Variations
* `[x:]` - Get the items starting at `x` and do not stop
* `[x::]` - Get the items starting at `x` and do not stop, retrieving every item (effectively same as above so you wouldn't write this)
* `[x::z]` - Get the items starting at `x` and do not stop, retrieving every `z`th item
* `[::z]` - Get the items starting at the beginning, do not stop, retrieving every `z`th item
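
The same variations on a short example string (`letters` is made up for illustration):

```python
letters = "abcdefghij"

print(letters[7:])     # hij
print(letters[7::])    # hij (same result as above)
print(letters[1::4])   # bfj (positions 1, 5, 9)
print(letters[::4])    # aei (positions 0, 4, 8)
```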

#### Going backwards
* `[-1]` - Get the last item
* `[-2]` - Get the second-to-last item
* `[::-1]` - Start at the end, do not stop, going backward (this reverses the sequence)
* `[-2:-5:-1]` - Start at the second-to-last position, stop before the fifth-to-last position, retrieving every item going backward
* `[-2:-5:-2]` - Start at the second-to-last position, stop before the fifth-to-last position, retrieving every second item going backward
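
Each of the backward forms, tried on a short example string (`letters` is made up for illustration):

```python
letters = "abcdefghij"

print(letters[-1])         # j
print(letters[-2])         # i
print(letters[::-1])       # jihgfedcba
print(letters[-2:-5:-1])   # ihg (stops before "f", the fifth-to-last)
print(letters[-2:-5:-2])   # ig
```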

**This week's workbook will largely be focussed on practicing indexing**

## 2. Add items with `+` and `*`

The `+` and `*` operators, in addition to being used with numbers, can be used on `str` and `list`. However, `-` and `/` cannot be.

Explanation by examples:

```python
a = "cat"
a*5 # "catcatcatcatcat"

b = ["list item 1"]
b*5 # ["list item 1", "list item 1", "list item 1", "list item 1", "list item 1"]
```

```python
c = "run"
a + c # "catrun"

d = ["list item 2", "list item 3"]
d + b # ["list item 2", "list item 3", "list item 1"]
```

### More detailed explanation

`str`s can be _concatenated_ (joined together in a chain) to other `str`s using `+`. 

And, `list`s can be _concatenated_ to other `list`s using `+`. 

However, `str`s _cannot_ be added to a `list` using `+`, and vice-versa.

Using `*`, `str`s and `list`s can be concatenated to _themselves_ many times.
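
When you do need to combine a `str` with a `list`, one approach is to make the types match first. A sketch under that assumption, with made-up names:

```python
word = "cat"
items = ["list item 1"]

# word + items raises a TypeError because the types differ

# Wrap the str in its own list to add it as ONE item
combined = items + [word]
print(combined)               # ['list item 1', 'cat']

# list(word) would add each character separately instead
print(items + list(word))     # ['list item 1', 'c', 'a', 't']

# The same idea applies to numbers and str
print(word + str(5))          # cat5
```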

## 3. Built-in _methods_

In Lesson 1 and Workbook 1, we were introduced to the idea of _functions_ and how to "call" them. 

> We have some object, called a function, that exists on its own. We provide it some input (parameters) and it gives us an output.

**A _method_ is like a function that is _built-in_ to the data. It's like a built-in _process_ that the data can use to transform itself.**

### Explanation of methods by example:
```python
greeting = "good day, archibald"
greeting.title() # "Good Day, Archibald"

shopping_list = ["cheese", "eggs", "bread"]
shopping_list.append("yoghurt") # shopping_list is now ["cheese", "eggs", "bread", "yoghurt"]
```
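
One difference between the two examples is easy to miss. A `str` cannot be edited, so `str` methods hand back a _new_ string. A `list` is editable, so methods like `.append()` change the list itself and hand back nothing (`None`). A sketch, with the names `titled` and `result` made up for illustration:

```python
greeting = "good day, archibald"
titled = greeting.title()
print(titled)          # Good Day, Archibald
print(greeting)        # good day, archibald (the original is unchanged)

shopping_list = ["cheese", "eggs", "bread"]
result = shopping_list.append("yoghurt")
print(result)          # None
print(shopping_list)   # ['cheese', 'eggs', 'bread', 'yoghurt']
```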

### Detailed explanation:

```python
greeting = "good day, archibald"
```
When we assign the string `"good day, archibald"` to the variable `greeting`, we can consider `greeting` to represent an _instance_ of a `str`.

Because `greeting` now represents an instance of a `str`, we can use the "dot notation" to access all of the `str` methods.



```python
greeting.<method name>()
```

Similarly, when we do:
```python
shopping_list = ["cheese", "eggs", "bread"]
```
`shopping_list` now represents an _instance_ of a `list` and we can use the "dot notation" to access all of the `list` methods.

```python
shopping_list.<method name>()
```

**`str` instances and `list` instances both have methods but they have _different_ methods appropriate for their datatype**
> Use the built-in Python function `dir(...)` to see all of the methods available for a particular datatype

```python
dir(greeting)
```

```python
dir(shopping_list)
```
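
Because `dir(...)` returns a list of names, you can also ask whether a particular method name is in it. A small sketch of that idea:

```python
greeting = "good day, archibald"
shopping_list = ["cheese", "eggs", "bread"]

print("title" in dir(greeting))          # True
print("append" in dir(greeting))         # False (a str has no .append)
print("append" in dir(shopping_list))    # True
print("title" in dir(shopping_list))     # False (a list has no .title)
```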

**And, you can learn more about each of these methods by using `help(...)`**

```python
help(greeting.title)

help(shopping_list.append)
```

### An (incomplete, but useful) list of methods for `str`

| Transformation             | Testing                 | Investigation      | Creating           |
| -------------------------- | ----------------------- | ------------------ | ------------------ |
| `.upper()`                 | `.isalpha()`            | `.count(sub_str=)` | `.format(var=)`    |
| `.lower()`                 | `.isalnum()`            | `.find(sub_str=)`  | `.join(iterable=)` |
| `.capitalize()`            | `.isdigit()`            | `.rfind(sub_str=)` |                    |
| `.title()`                 | `.islower()`            |                    |                    |
| `.strip()`                 | `.isupper()`            |                    |                    |
| `.lstrip()`                | `.startswith(sub_str=)` |                    |                    |
| `.rstrip()`                | `.endswith(sub_str=)`   |                    |                    |
| **`.replace(old=, new=)`** |                         |                    |                    |
| **`.split(sub_str=)`**     |                         |                    |                    |
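
A sketch that exercises several of these methods on one made-up string. The `sub_str=` style names in the table describe what each argument is for; in the calls below the arguments are simply passed by position.

```python
raw = "  Good Day, Archibald  "

clean = raw.strip()                    # removes the spaces at both ends
print(clean)                           # Good Day, Archibald
print(clean.upper())                   # GOOD DAY, ARCHIBALD
print(clean.lower())                   # good day, archibald

print(clean.startswith("Good"))        # True
print(clean.isalpha())                 # False (it has spaces and a comma)
print(clean.count("o"))                # 2
print(clean.find("Day"))               # 5 (the position where "Day" starts)

print(clean.replace("Day", "Night"))   # Good Night, Archibald

words = clean.split(" ")
print(words)                           # ['Good', 'Day,', 'Archibald']
print("-".join(words))                 # Good-Day,-Archibald
```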


### A (complete) list of methods for `list`

| Transformation   | Investigation     | Creating/Editing          |
| ---------------- | ----------------- | ------------------------- |
| `.reverse()`     | `.count(item=)`   | **`.append(item=)`**      |
| `.sort()`        | `.index(item=)`   | `.extend(iterable=)`      |
|                  |                   | `.insert(index=, item=)`  |
|                  |                   | `.pop(index=)`            |
|                  |                   | `.remove(item=)`          |
|                  |                   | `.clear()`                |
|                  |                   | `.copy()`                 |
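
A sketch that walks one made-up list through every method in the table. The `item=` style names in the table describe what each argument is for; the calls below pass the arguments by position. The comments show the list after each step.

```python
fruits = ["pears", "apples", "mangos"]

fruits.append("apples")            # ['pears', 'apples', 'mangos', 'apples']
fruits.extend(["kiwis", "figs"])   # adds each item of the other list to the end
fruits.insert(1, "bananas")        # 'bananas' now sits at position 1

print(fruits.count("apples"))      # 2
print(fruits.index("mangos"))      # 3

last = fruits.pop()                # removes and returns the last item, 'figs'
fruits.remove("apples")            # removes only the FIRST 'apples'
print(fruits)                      # ['pears', 'bananas', 'mangos', 'apples', 'kiwis']

fruits.sort()                      # ['apples', 'bananas', 'kiwis', 'mangos', 'pears']
fruits.reverse()                   # ['pears', 'mangos', 'kiwis', 'bananas', 'apples']

backup = fruits.copy()             # a separate list with the same items
fruits.clear()                     # fruits is now []
print(backup)                      # ['pears', 'mangos', 'kiwis', 'bananas', 'apples']
```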

# Putting Variables into `list`s and `str`s

## To add your variables to a `list`

**Adding items to a list directly**
```python
item_1 = "my list item"
item_2 = "another list item"
my_items_in_a_list = [item_1, item_2]
```

**Adding items to the end of a list with `.append(item=)`**
```python
item_1 = "my list item"
my_list = [item_1]
item_2 = "another list item"
my_list.append(item_2)
```

**Inserting items at a certain position with `.insert(index=, item=)`**
```python
item_1 = "my list item"
item_2 = "another list item"
my_items_in_a_list = [item_1, item_2]
item_3 = "to go in the middle"
my_items_in_a_list.insert(1, item_3) # Arguments are given by position: index first, then the item
```

## To add your variables into a `str`

**Using f-strings**

```python
my_number = 34.2
my_string = f"My number is {my_number}"
```

```python
name = "Connor"
city = "Vancouver"
statement = f"{name} is from {city}"
```
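
The curly braces in an f-string can hold more than a bare variable name. The sketch below shows a calculation and a rounding instruction inside the braces, plus the older `.format()` method for comparison. The variable `distance_km` and the names `doubled`, `rounded`, and `older_style` are made up for illustration.

```python
name = "Connor"
city = "Vancouver"
distance_km = 34.2

doubled = f"Twice the distance is {distance_km * 2} km"
print(doubled)       # Twice the distance is 68.4 km

rounded = f"About {distance_km:.0f} km"   # .0f means "no decimal places"
print(rounded)       # About 34 km

older_style = "{} is from {}".format(name, city)
print(older_style)   # Connor is from Vancouver
```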

## Errors you may run into in this week's Workbook

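`IndexError`: When you try to access an index outside of the range of the collection, you will get an `IndexError`.

`NameError`: When you try to reference a variable name that you have not actually defined yet, you will get a `NameError`. You may have forgotten to run a cell above where you defined your variable.

`TypeError`: When you call a function or method with the wrong number of arguments, or with an argument of the wrong datatype, you will get a `TypeError`.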




```python
# My first IndexError

my_first_list = ["Apples", "Oranges", "Bananas", "Pears"]

my_first_list[5] # There is nothing in the '5' position in this list, it's not that long
#my_first_list[2] # <- This one will work
```

```
IndexError: list index out of range
```

```python
# My first TypeError
my_string = "Hey! I am a string! I am just a *list* of individual characters!"

my_string.replace()
#my_string.replace('just')
#my_string.replace(43, "Get")
#my_string.replace('string', 'sentence') # <- This one will work
```
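
For completeness, here is a sketch of a `NameError`. It uses `try` and `except`, which have not been covered yet; they are only here so the example can show the error message without stopping the program. The variable name `my_undefined_value` is deliberately never defined.

```python
# My first NameError
try:
    print(my_undefined_value)   # this name was never assigned a value
except NameError as error:
    message = str(error)
    print(message)              # the message mentions the missing name
```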