# <img style="float: right;"  src="images/jp.png" width="200">

# Python Basic Types

Version 1.0 (12/4/2019)  
License information is at the end of the document

---

This document describes the basic Python data types:

* integers
* float numbers
* booleans
* strings and raw strings
* bytes and bytearrays

This document contains several **code cells**. Execute them as you find them while reading the document.  
Feel free to **modify them** to experiment with the code.

There is also a [Sandbox](#sandbox) at the end of this document where you can test any code of your own.  

## Python Comments

Before dealing with **types** it is important to explain that **Comments** in **Python** start with the `#` symbol and end at the end of the line.  
In the code examples below you will find a lot of comments.

## Python Objects

Python is an **object** oriented language. Python **variables** just point to objects. There are two kinds of objects:

* immutable objects
* mutable objects

**Immutable objects** are objects that cannot be modified. Most of the objects described in this document are **immutable**. **Mutable** objects are objects that can be modified. This has important consequences in **Python coding** as we will see later.

In Python you use the `=` sign to assign one object to a variable. The variable has no type in itself, only the object has a datatype.

In [None]:
a = 1            # Create an 'a' variable and assign the '1' integer object to it
print(a)         # Show the contents of the variable 'a'
print(type(a))   # Show the type associated to the contents of the variable 'a'

In [None]:
a = 4.5          # We now associate the 'a' variable with a floating point object
                 # 'a' no longer points to the '1' object; once nothing points to it, it can disappear from memory
    
print(a)         # Show the contents of the variable 'a'
print(type(a))   # Show the type associated to the contents of the variable 'a'   

It is very important to remember that, in Python, **variables point to objects**. They are **links**; they don't contain anything, and they don't have any particular **type**. This dynamic typing is commonly associated with [Duck Typing](https://en.wikipedia.org/wiki/Duck_typing):

$\qquad \text{"If it walks like a duck and it quacks like a duck, then it must be a duck"}$
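
As a minimal sketch of what duck typing means in practice, the hypothetical helper `describe` below (it is not part of the original notebook) never checks the type of its argument. It only assumes that the object supports `len()`, so it works with any of the sequence types described later in this document.

```python
def describe(obj):
    """Hypothetical helper: works with any object that supports len()."""
    return type(obj).__name__ + ' with ' + str(len(obj)) + ' elements'

print(describe('duck'))              # A str object
print(describe(b'duck'))             # A bytes object
print(describe(bytearray(b'duck')))  # A bytearray object
```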

## Integer Objects

Integer objects are objects that represent integer numbers like 1, 4 or 7689.  
You don't need to define the number of bits of the integer number. **Python** assigns the number of bits needed, so you can always represent integer numbers as big as you please.

The following code cell (you don't need to understand it now) computes the factorials from $1!$ to $30!$. As you can see, you never get an [overflow](https://en.wikipedia.org/wiki/Integer_overflow).

In [None]:
# Factorials example
fact = 1                      # Set fact to 1
for i in range(1,31):         # Loop through all numbers from 1 to 30
    fact = fact*i             # Compute factorial
    print(str(i)+'! =',fact)  # Show factorial 
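
To see how the storage grows with the value, the following sketch uses the `bit_length()` method of integer objects and `math.factorial` from the standard library. It is only an illustration; the variable name `big` is an assumption introduced here.

```python
import math

big = math.factorial(30)     # Same value as the last line printed by the loop above
print(big)                   # Show the value of 30!
print(big.bit_length())      # Number of bits needed to represent the value
print(big > 2**64)           # True: it would not fit in a 64 bit integer
```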

You can use the normal integer operators in Python:

In [None]:
print('1 + 2 =',1+2)   # Add
print('4 - 2 =',4-2)   # Subtract
print('5 * 4 =',5*4)   # Multiplication
print('8 / 2 =',8/2)   # Division

In **Python 2.x** the division of two integers always returns an integer, discarding the remainder. In **Python 3.x** we get true division: the `/` operator applied to two integers always returns a **float**, even when the result is a whole number.  
If you want integer division you need to use the special `//` operator.  
You can also use the modulo `%` operator to get the remainder.

In [None]:
print('7 / 3 =',7/3)     # True division
print('7 // 3 =',7//3)   # Integer division
print('7 % 3 =',7%3)     # Remainder
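
A short complementary sketch: the built-in `divmod()` function returns the quotient and the remainder in one call, and with negative operands `//` rounds towards minus infinity rather than towards zero.

```python
print(divmod(7, 3))     # Quotient and remainder at once: (2, 1)
print(-7 // 3)          # Floor division rounds towards minus infinity: -3
print(-7 % 3)           # The remainder takes the sign of the divisor: 2
print(type(8 / 2))      # True division of two integers gives a float
```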

## Float Objects

Float numbers are usually coded using the [IEEE-754](https://en.wikipedia.org/wiki/IEEE_754) double precision standard. They represent numbers with a mantissa and an exponent. Due to representation limitations, they are all rational numbers, as they must have a finite number of digits.

There is support for exact [rational numbers](https://docs.python.org/2/library/fractions.html) in Python, but through the `fractions` module of the standard library rather than a built-in data type.
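
As a small illustration of the difference, the sketch below compares float arithmetic with the `Fraction` class of the standard `fractions` module.

```python
from fractions import Fraction

print(0.1 + 0.2 == 0.3)                                      # False: binary floats are approximations
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True: exact rational arithmetic
print(Fraction(1, 3) + Fraction(1, 6))                       # 1/2
```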

In [None]:
a = 7.45         # Set a value
print(a)         # Show the contents of the variable 'a'
print(type(a))   # Show the type associated to the contents of the variable 'a'  

You can use the normal operations in Python just like with integer numbers:

In [None]:
print('1.5 + 2.3 =',1.5+2.3)   # Add
print('4.5 - 1.2 =',4.5-1.2)   # Subtract
print('5.7 * 4.1 =',5.7*4.1)   # Multiplication
print('8.8 / 2.2 =',8.8/2.2)   # Division

Observe that, although the last division gives a whole number, Python still generates an object of the **float** class. Arithmetic operations that involve at least one **float** return a **float**; the exceptions are functions that always return integers, such as `math.floor`.
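
The following sketch checks the resulting types for a few cases. The particular operations were chosen here for illustration only.

```python
import math

print(type(1 + 2.5))          # int + float gives a float
print(type(8.8 / 2.2))        # float, even when the value is a whole number
print(type(math.floor(4.7)))  # math.floor returns an int
print(type(round(4.7)))       # round with a single argument returns an int
print(type(int(4.7)))         # int() truncates and returns an int
```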

You can use libraries like [math](https://docs.python.org/3/library/math.html) or [numpy](https://www.numpy.org/) to manipulate float numbers.

In [None]:
import numpy as np   # Import the numpy module
import math          # Import the math module  

print(np.exp(1))     # Calculate e^1
print(math.exp(1))   # Calculate e^1

print(np.sqrt(2))    # Calculate the square root of 2
print(math.sqrt(2))  # Calculate the square root of 2

a = math.sqrt(2)     # Store the square root of 2
print(math.floor(a)) # Calculate its floor
print(math.ceil(a))  # Calculate its ceil

As you can see we can use the **numpy** and **math** modules to do the same things. It is quite usual in **Python** to have many ways to skin a cat.

## Boolean Objects

Boolean objects can be used to define [boolean values](https://en.wikipedia.org/wiki/Boolean_data_type).

In [None]:
a = True
print(a)
print(type(a))
a = False
print(a)
print(type(a))

Boolean objects are a subset of the integer objects: **True** and **False** are just numbers.

* **False** is just the integer 0
* **True** is just the integer 1
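
A quick way to check this relationship is with the built-in `isinstance()` and `issubclass()` functions, as in this minimal sketch:

```python
print(isinstance(True, int))     # True: a boolean is also an integer
print(issubclass(bool, int))     # True: bool is a subclass of int
print(True == 1, False == 0)     # True True
print(True + True)               # 2
```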

You can operate on booleans but the result is not boolean when using arithmetic operators.

In [None]:
a = True
print(a)
print(a+1)
print(type(a+1))

In [None]:
a = False
print(a)
print(a+1)
print(type(a+1))

Normal boolean operators use full English words like **and**, **or** and **not**.

In [None]:
a = True
b = False
print(a and b)
print(a or b)

In **Python** every object evaluates to **True** or **False**. Empty objects, objects with zero length or objects with a value of zero evaluate to **False**. Every other object evaluates to **True**.
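
The built-in `bool()` function shows how any object evaluates. The values below are just a sample chosen for illustration:

```python
print(bool(0), bool(0.0))      # False False: zero values
print(bool(''), bool(b''))     # False False: empty string and empty bytes
print(bool(7), bool(-0.5))     # True True: non-zero numbers
print(bool('0'), bool(' '))    # True True: non-empty strings, whatever they contain
```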

Also, boolean operators don't always return **boolean values**.  
For instance, the boolean **and** operator just returns the first element that evaluates to **False**, or the last element if both evaluate to **True**.  
Note that the `'Text'` object is a **String** object that is [described later](#string) in this document, and the `b''` object is a **Bytes** object that is also [described later](#bytes). You may want to read those sections first and then come back to the boolean operator examples.

In [None]:
print('True and 1 =',True and 1)
print('1 and True =',1 and True)
print("True and 'Text' =",True and 'Text')
print('True and 0 =',True and 0)
print("True and b'' =",True and b'')
print("False and b'' =",False and b'')
print("b'' and False =",b'' and False)

In a similar way, the **or** operator just returns the first element that evaluates to **True**, or the last element if both evaluate to **False**.

In [None]:
print('True or 1 =',True or 1)
print('1 or True =',1 or True)
print("'Text' or True =",'Text' or True)
print('True or 0 =',True or 0)
print("b'' or True =",b'' or True)
print("False or b'' =",False or b'')
print("b'' or False =",b'' or False)

So, in the end, the **and** and **or** operators always return one of their arguments, whatever object it is.
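
One common use of this behaviour is providing a default value. The sketch below uses a hypothetical `greeting` function, introduced here only for illustration, that falls back to a default text when it receives an empty string.

```python
def greeting(name):
    """Hypothetical helper: uses a default when name is empty."""
    return 'Hello ' + (name or 'stranger')

print(greeting('Ada'))   # Hello Ada
print(greeting(''))      # Hello stranger
```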

If you find this complicated, just play with the code cells using your own contents.

<a id='string'></a>

## String Objects

String objects contain text. In **Python 3** normal strings are **Unicode**, so they can contain any character available in **Unicode**.

Strings can also include [escape sequences](https://docs.python.org/2.0/ref/strings.html) starting with `\`. In fact, you can use escape sequences to include **Unicode** characters.

Strings can be delimited with `"` or `'` characters. You just need to end with the same character you started with. For instance, if your string contains `'` characters, it is wise to use `"` as delimiters. 

In [None]:
print('This is a text string and contains "')
print("This is also a text string and contains '")
print('This string contains escape characters \\ \t \'')
print('This string contains a Chinese character \u2E86')

The **len** function gives the number of characters in a string. Note that, due to Unicode, the string length does not need to match the space used in bytes.

In [None]:
a = 'chinese \u2E86'  # Use a to point to a string
print(a)              # Show the string
print(len(a))         # Show the string length
print(type(a))        # Type is always str

You can get the elements of a string by using the `[]` operator.

In [None]:
print(a[0])   # Show character in position 0
print(a[8])   # Show character in position 8

Strings are **immutable**, so you cannot change their contents. Trying to do that generates an **exception**.

In [None]:
a[0] = 'C'    # Raises a TypeError: str objects don't support item assignment
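
If you want to see the exception without stopping the execution, you can catch it. The following sketch also shows the usual alternative, which is building a new string. It uses `try`/`except` and the slice `text[1:]`, two features that are not covered in this document, and the variable names are assumptions introduced here.

```python
text = 'chinese'                 # A str object
try:
    text[0] = 'C'                # Item assignment is not supported by str
except TypeError as error:
    print('TypeError:', error)   # Show the error message

capitalized = 'C' + text[1:]     # Build a new string instead
print(capitalized)               # Chinese
print(text)                      # chinese (the original object is unchanged)
```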

You can concatenate strings by using the `+` operator. Note that, as strings are **immutable**, this generates a new object; it does not modify the previous objects.

You can also use the `*` operator to replicate the string contents.

In [None]:
a = 'First '    # Use a to point to a string
b = 'Second'    # Use b to point to another string
print(a+b)      # Show a+b
print(a*4)      # Show a*4

### Raw Strings

Raw strings are just like normal strings, but the `\` character is not interpreted as the start of an escape sequence. They are useful if you want to include a lot of `\` characters. 

In fact they are not new class types. They are another way to write down string literals. So the object you obtain is just a **string**.

In [None]:
a = r'This is a \Raw String\.'   # Define a raw string literal
print(a)                         # Show it
print(type(a))                   # See that it is just a normal string
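
To make the difference with a normal literal explicit, this small sketch compares the same characters written both ways. The variable names are chosen here only for illustration.

```python
normal = 'a\tb'          # \t is an escape sequence: a single tab character
raw = r'a\tb'            # In a raw literal the backslash is kept as is

print(len(normal))       # 3
print(len(raw))          # 4
print(raw == 'a\\tb')    # True: same contents as a normal string with an escaped backslash
```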

### Str function

You can apply the **str()** function to any object to obtain a string with a representation of the object content.

In [None]:
a = str(45)
print(a)
print(type(a))

It can be useful for generating complex strings.

In [None]:
x = 17
print('x content is '+str(x))

<a id='bytes'></a>

## Bytes Objects

A **bytes** object is a **string** of bytes. Byte values below 128 are interpreted as ASCII codes.  
As in the case of **strings** they are **immutable**.

In [None]:
a = b'This is a bytes object'   # Create a bytes object
print(a)                        # Show it   
print(type(a))                  # Show its type

Remember that, in Python 3, strings natively support Unicode characters. You can **encode** a string to a standard Unicode format like [utf-8](https://en.wikipedia.org/wiki/UTF-8). This will generate a **bytes** object.  

In [None]:
a = 'chinese \u2E86'     # Use a to point to a string
print(len(a))            # Show number of characters in a
d = a.encode('utf-8')    # Encode a in UTF-8 as a bytes object
print(d)                 # Show the encoded byte sequence
print(len(d))            # Show the encoded number of bytes

Observe how, in the example above, the Chinese character is encoded with 3 bytes in **utf-8**, in particular as the sequence $E2h$, $BAh$, $86h$.
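
The opposite operation is `decode()`, which turns a **bytes** object back into a string. The sketch below repeats the encoding of the same string so that it can be run on its own.

```python
a = 'chinese \u2E86'              # Same string as in the cell above
d = a.encode('utf-8')             # str -> bytes
print(d[-3:])                     # The last three bytes: b'\xe2\xba\x86'
print(d.decode('utf-8') == a)     # True: decoding recovers the original string
print(len(a), len(d))             # 9 characters, 11 bytes
```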

Remember that, as the **bytes** type is **immutable** you cannot change its contents.

In [None]:
d[0] = b'C'    # Try to change a bytes content

## Bytearray Objects

A **bytearray** object is just like a **bytes** object. The only difference is that it is **mutable** so you can change its contents.

In [None]:
d = bytearray(a.encode('utf-8'))    # Encode a in UTF-8 as a bytearray object
print(d)                            # Show the encoded byte sequence
print(len(d))                       # Show the encoded number of bytes
d[0] = 67                           # Capitalize first character
print(d)                            # Show the encoded byte sequence

## Mutability Warning

Remember that variables in Python don't contain anything, they just point to objects. This has an important implication in **mutable objects** as you can see in the following code.

In [None]:
a = bytearray('Hello'.encode('utf-8'))   # Encode 'Hello' in UTF-8 in a bytearray
b = a                                    # b now also points to the bytearray
print(b)                                 # Show b
a[1]=117                                 # Modify the object pointed by a
print(b)                                 # Show b

See that, as both $a$ and $b$ point to the same mutable object, if you modify the object using the $a$ variable, the same modified object will be seen through the $b$ variable. This can give some headaches to programmers coming from other languages like $C$, where most variables store content, not pointers to objects. If you come from $C$ it is best to think that all Python variables are **pointers**.
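
You can check whether two variables point to the same object with the `is` operator, while `==` only compares contents. The following minimal sketch adds a third variable `c`, an assumption introduced here, that points to a separate object with equal contents.

```python
a = bytearray(b'Hello')    # A mutable object
b = a                      # Second name for the same object
c = bytearray(b'Hello')    # A different object with equal contents

print(a is b)              # True: same object
print(a is c)              # False: different objects
print(a == c)              # True: equal contents
a[1] = 117                 # Modify the shared object (117 is the code of 'u')
print(b)                   # bytearray(b'Hullo')
print(c)                   # bytearray(b'Hello')
```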

In some cases you can use the **copy()** method to create a copy of one object so that now, the objects pointed by $a$ and $b$ are independent.

In [None]:
a = bytearray('Hello'.encode('utf-8'))   # Encode 'Hello' in UTF-8 in a bytearray
b = a.copy()                             # b now points to a copy of a
print('b =',b)                           # Show b
a[1]=117                                 # Modify the object pointed by a
print('b =',b)                           # Show b
print('a =',a)                           # Show a

## Space Recall

As explained several times, **Python variables** just point to objects.  
Python associates a **reference counter** with each object that holds the number of references that point to it. If the counter reaches zero, that means that nobody points to this object and its memory space is returned to the memory pool.  

In [None]:
a = "String object"     # We create an object. String counter = 1
b = a                   # b also points to the same object. String counter = 2
a = 1                   # a no longer points to the object. String counter = 1

b = 7.2                 # b no longer points to the object. String counter = 0
                        # String object is removed from memory
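
The sequence above can be observed with a weak reference, which lets you look at an object without counting as a reference to it. This is only a sketch that assumes the CPython interpreter, where objects are reclaimed as soon as their counter reaches zero. The `Thing` class is a hypothetical placeholder, needed because string objects cannot be weakly referenced.

```python
import weakref

class Thing:
    """Hypothetical empty class used only for this illustration."""

a = Thing()                  # We create an object. Counter = 1
b = a                        # b points to the same object. Counter = 2
watcher = weakref.ref(a)     # A weak reference does not increase the counter

a = 1                        # a no longer points to the object. Counter = 1
print(watcher() is None)     # False: b still points to the object
b = 7.2                      # b no longer points to the object. Counter = 0
print(watcher() is None)     # True in CPython: the object has been reclaimed
```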

Reference counting alone fails when there are circular references. That's why **Python** usually uses an additional [garbage collector](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)) to detect inaccessible objects that don't have a zero reference counter.
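
A minimal sketch of such a circular reference, assuming the CPython interpreter and using the standard `gc` and `weakref` modules. The `Node` class is hypothetical and exists only to build the cycle.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only to build a reference cycle."""
    def __init__(self):
        self.other = None

a = Node()
b = Node()
a.other = b                  # a refers to b ...
b.other = a                  # ... and b refers to a: a circular reference
watcher = weakref.ref(a)     # Lets us observe the object without keeping it alive

del a, b                     # No variable points to the objects any more,
                             # but their counters are not zero because of the cycle
gc.collect()                 # Ask the garbage collector to look for unreachable cycles
print(watcher() is None)     # True once the cycle has been collected
```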

<a id='sandbox'></a>

## Sandbox

This last section of the document is a sandbox. You can write any code you please in the box below and see what it does when you execute the **run** command.

In [None]:
# Sandbox code

## Document license

Copyright  ©  Vicente Jiménez (2019)  
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.   
This license is available at http://creativecommons.org/licenses/by-sa/4.0/

<img  src="images/cc_sa.png" width="200">