# Data types
In this chapter we'll explore the different base data types of Python, both **scalar** and **collection**.  
The main aspects of working with collection types are accessing their elements individually or iteratively.  

## Scalar types

A scalar data type, or just **scalar**, is any non-collection value. Put differently, a scalar is a singular value. Only a few scalar types appear in the vast majority of Python code, and they are the same, or at least similar, in most other programming languages.

- **Numeric types** represent numbers and come in two flavours: with and without a decimal part.
    - integers have no decimal part (`int`) e.g. `2`, `3`, `140000000198`
    - floating point numbers do have a decimal part (`float`) e.g. `3.211`, `2.0`, `5E-5`
- **Text**: aka string data (`str`) e.g. `"It's a wonderful world"` or `'What a challenge!'`
- **Logical**: aka boolean (`bool`) with only two possible values: `True` and `False`

(I skipped binary and complex types for conciseness)

Since Python is a dynamically typed language, you do not need to specify a type when creating a variable. You can, however, always get the type of any literal or variable by using the `type()` function.

In [8]:
print(type(True))

m = "It's a wonderful world" # You can use single quotes in a double-quoted string, but not double quotes (unless escaped)

print(type(m))

print(type(5E5)) # literals in exponent notation are always floats

<class 'bool'>
<class 'str'>
<class 'float'>


Variables can change type when their value changes:

In [9]:
x = 42

print(f'x is of type {type(x)} and has value {x}')

x = True

print(f'x is of type {type(x)} and has value {x}')

x is of type <class 'int'> and has value 42
x is of type <class 'bool'> and has value True
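
Rebinding a name to a value of another type is not the same as implicit conversion. The sketch below (with illustrative values, not taken from the cell above) suggests that Python does not silently mix unrelated types in an expression:

```python
# A name can be rebound to a value of a different type...
x = 42
x = "42"

# ...but the value itself is not silently converted when used.
try:
    result = x + 1          # adding a str and an int is not defined
except TypeError as err:
    result = None
    print(f"TypeError: {err}")
```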


You can also explicitly change between types, as long as it is a legal conversion.

In [17]:
print(float("42.0"))  # OK
#print(int("42.0"))   # fails
print(int("42"))      # OK
print(bool("42.0"))   # OK - any non-empty string is considered True
print(bool(""))       # OK - an empty string is False
# print(int(""))      # fails

42.0
42
True
False
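
An illegal conversion raises a `ValueError`. As a minimal sketch, the helper below (`try_convert` is a hypothetical name, not a built-in) catches that error so you can probe several conversions without the program stopping at the first failure:

```python
def try_convert(converter, text):
    """Return converter(text), or None when the conversion is not legal."""
    try:
        return converter(text)
    except ValueError:
        return None

print(try_convert(int, "42"))      # 42
print(try_convert(int, "42.0"))    # None: int() rejects a decimal point in a string
print(try_convert(int, ""))        # None: an empty string is not a number
print(try_convert(float, "42.0"))  # 42.0
print(int(float("42.0")))          # 42: convert to float first, then to int
```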


When you read command-line arguments (terminal) or a text file, the data always arrive as strings, even when they represent numbers. You always need to do the conversion yourself (unless you use dedicated libraries for it). 
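
To illustrate, here is a small sketch of converting command-line style input. The list `fake_argv` is a made-up stand-in for `sys.argv` so that the example runs anywhere, and `parse_numbers` is a hypothetical helper name:

```python
def parse_numbers(args):
    """Convert a list of strings to a list of floats."""
    return [float(arg) for arg in args]

# In a real script this would be sys.argv: element 0 is the script name,
# the remaining elements are the arguments, all of them strings.
fake_argv = ["script.py", "3", "4.5"]

print(fake_argv[1] + fake_argv[2])   # 34.5 - string concatenation, not addition!

numbers = parse_numbers(fake_argv[1:])
print(numbers)                       # [3.0, 4.5]
print(sum(numbers))                  # 7.5
```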

### Exercise
Try some conversions yourself. Figure out why some fail and some do not (not always where you expect), what the result is, and what the logic behind the conversion is. Conversions to `bool` are especially interesting and very relevant in `if <condition>:` blocks. 

## Collection types

Collection types are exactly what their name implies: they are composed of other types (scalar or collection). The number of collection types in the base language is limited, and only a few are used in the majority of cases.

- **Sequence types** have elements in a specific order or sequence, and these can be addressed using the position of the element - its **index**.
    - **list** In a list, order matters, and that is why you can fetch elements by their position, starting at zero. Lists can change: you can add and delete elements.
    - **tuple** A tuple is much like a list, but with a very important distinction: tuples are **immutable**. Once created they can't change.
    - **range** A range is a series of numbers that can be used for iteration or for creating lists or tuples.
    - (**str**) Strings behave A LOT like other sequence types!
- **set** A set is a collection of unique elements; no duplicates are allowed.
- **dict** In a dictionary (called a map in other languages) there are **entries** in which a **key** is coupled to a corresponding **value**, so that the value can be retrieved by its key.
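
As a quick illustration of the types listed above, here is a minimal sketch that creates one of each. All variable names and values are made up for this example:

```python
colors = ["red", "green", "blue"]    # list: ordered and mutable
colors.append("yellow")              # lists can grow
print(colors[0])                     # red - positions start at zero

point = (4, 2)                       # tuple: ordered but immutable
try:
    point[0] = 5
except TypeError:
    print("a tuple cannot be changed")

numbers = list(range(1, 6))          # range used to create a list
print(numbers)                       # [1, 2, 3, 4, 5]

unique = set("banana")               # set: duplicates are dropped
print(len(unique))                   # 3

ages = {"Ann": 31, "Bob": 27}        # dict: each key is coupled to a value
print(ages["Ann"])                   # 31
```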

### Slicing
Any sequence type can be compared to a street with houses, where the index is the address that points to a house. In Python, addresses start at zero. You can access a single house, a range of houses, or every second or third house. All this is done using **slicing**. Its general syntax is  
`[start:stop:step]`. The `step` is 1 by default, and an omitted `start` or `stop` means "from the beginning (0)" or "to the end" (with a negative `step` these defaults are reversed). Note that `stop` is NOT included!  
Here are some examples using strings. They work the same in lists and tuples.

In [31]:
letters = 'ABCDEFGHIJK'

print(f'The character(s) selected by letters[0] are {letters[0]}')
print(f'The character(s) selected by letters[3] are {letters[3]}')
print(f'The character(s) selected by letters[-3] are {letters[-3]}')
print(f'The character(s) selected by letters[2:6] are {letters[2:6]}')
print(f'The character(s) selected by letters[::2] are {letters[::2]}')
print(f'The character(s) selected by letters[:] are {letters[:]}')
print(f'The character(s) selected by letters[::-2] are {letters[::-2]}')
print(f'The character(s) selected by letters[:-5:-2] are {letters[:-5:-2]}')



The character(s) selected by letters[0] are A
The character(s) selected by letters[3] are D
The character(s) selected by letters[-3] are I
The character(s) selected by letters[2:6] are CDEF
The character(s) selected by letters[::2] are ACEGIK
The character(s) selected by letters[:] are ABCDEFGHIJK
The character(s) selected by letters[::-2] are KIGECA
The character(s) selected by letters[:-5:-2] are KI
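
Two further details are worth a quick look. The sketch below reuses the `letters` string from the cell above and suggests how the defaults behave with a negative step, and how a slice differs from a single index when it runs past the end:

```python
letters = 'ABCDEFGHIJK'

# With a negative step the defaults flip: start at the end, stop at the beginning.
print(letters[::-1])      # KJIHGFEDCBA

# A slice that runs past the end is silently cut off...
print(letters[8:100])     # IJK

# ...but a single index past the end raises an IndexError.
try:
    letters[100]
except IndexError as err:
    print(f"IndexError: {err}")
```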


#### Exercise

Given the string `txt = 'aA.bB.cC.dD.eE.fF.gG.hH.zZ'`, write code using string slicing to print to screen  
 
- `"abcdefgz"`
- `"........"` 
- `"BCDEFGH"` 
- `'bC.fG.'`

In [None]:
txt = 'aA.bB.cC.dD.eE.fF.gG.hH.zZ'
# YOUR CODE

#### Exercise

Given the list below, investigate whether lists behave the same as strings.

In [33]:
fruits = ["apple", "orange", "kiwi", "pear", "banana", "plum"]


### Slicing is not all - meet the dot operator

So far I have skipped the point that Python is an **Object-Oriented** programming language. Being object-oriented means that (almost) everything is modeled as an entity with data and behaviour (e.g. methods). Let's explore this concept with the string type. In Python, strings have only a single property - their character sequence. They do, however, have many methods. Both properties and methods are accessed on an object using the **dot operator**. The difference is that a method call has parentheses after the name, which (optionally) hold the method arguments.  

Here are some examples of methods on string objects.

In [56]:
s1 = "Hello"
print(s1.upper())

s2 = "Howdy"
print(s2.upper())

print(s1.__eq__(s2))  # the method behind s1 == s2

print("+".join("ABC"))

print(s1.rjust(10))

str.title("foo bar baz")  # called on the class; same as "foo bar baz".title()


HELLO
HOWDY
False
A+B+C
     Hello


'Foo Bar Baz'

Do you need to know these methods?  

**NO!**

You will simply remember the ones you use most.  
For everything else, you need to learn how to find it quickly. Here are the most relevant sources (in logical order of usage):

1. Use the dot operator within your editor. This usually suggests possible methods on an object (or class).
2. Use `help()` in Jupyter or the Python console, e.g. `help(str)`
3. Use [the Python docs](https://docs.python.org/3/), in particular [The Python Standard Library](https://docs.python.org/3/library/index.html)
4. Google

Also, it may be worth your while to keep some cheat sheets on your desktop (physical or on your computer).
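
When no editor with suggestions is at hand, the built-in `dir()` function gives a similar overview from within Python. This is a minimal sketch; the variable name `public_methods` is made up for the example:

```python
# List the public methods of str (names that do not start with an underscore).
public_methods = [name for name in dir(str) if not name.startswith("_")]
print(len(public_methods), "public string methods, for example:")
print(public_methods[:5])

# Print the documentation text of one method.
# In an interactive session, help(str.title) shows the same information.
print(str.title.__doc__)
```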



#### Exercise

Use the above sequence of resources to find out how to ...
- print three string variables as one, with a `+` between each string
- split a sentence in a list of separate words
- get a random number between 1 and 100
- round a number to 2 decimals


In [73]:
set([3, "A", (4, 2)])


{(4, 2), 3, 'A'}