# Python Types

## Data Types

Python has a few basic data types: integers, floating point numbers, Booleans, and strings. Integers are ***whole*** numbers and have no fractional part. Floating point numbers are numbers with a decimal point. Booleans take one of two values, `True` or `False`. Lastly, strings hold text data.

Knowing which data type you are dealing with is important when writing code because each type supports only certain operations.

We can check the data type of a value in Python with the `type(x)` function. _Push the "play" button to run a code block._

In [7]:
# Integer Numbers
a = 5
print(f"a is a {type(a)}")

a is a <class 'int'>


In [8]:
# Floating point (real) numbers
b = 5.5
print(f"b is a {type(b)}")

b is a <class 'float'>


In [9]:
# Boolean True/False
c = True
print(f"c is a {type(c)}")

c is a <class 'bool'>


In [10]:
# Strings
d = "String"
print(f"d is a {type(d)}")

d is a <class 'str'>


Notice that floating point numbers are always displayed with a decimal point, even when their value is whole (for example, `5.0`).

In Python, addition isn't limited to numbers. We can also add strings!

**Note:** Addition and multiplication are the only arithmetic operations we can do with strings: adding two strings joins them, and multiplying a string by an integer repeats it. Subtraction, division, etc. are not valid operations for this data type.
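A minimal sketch of the two string operations described above. The variable names are illustrative and not part of the original notebook.

```python
# String "addition" joins (concatenates) two strings
greeting = "Hello, " + "world"
print(greeting)  # Hello, world

# String "multiplication" repeats a string a whole number of times
echo = "ab" * 3
print(echo)  # ababab

# Subtraction is not defined for strings and raises a TypeError
try:
    "abc" - "a"
except TypeError as err:
    print(f"TypeError: {err}")
```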

Expressions evaluate to a value with its own type. A comparison, for example, produces a `bool`:

In [11]:
y = 9 < 10
print(f"y is: {y}")
type(y)

y is: True


bool

## Arithmetic

### Operations

Much like you can with regular numbers, you can add, subtract, multiply, divide, exponentiate, etc. integers and floating point numbers in Python.

In [12]:
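# Addition/Subtraction
print(1 + 1)      # int + int gives an int
print(1. + 1.)    # float + float gives a float
print(1.0 + 1)    # mixing int and float gives a float

# Multiplication/Division
print(10 / 5)     # true division always gives a float
print(10 // 5)    # floor division
print(10 % 5)     # remainder (modulo)
print(5 * 3)
print(5. * 3)

# Exponentiation
print(3 ** 2)
print(3. ** 2.)
print(3. ** 2)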

2
2.0
2.0
2.0
2
0
15
15.0
9
9.0
9.0


### Expressions

In arithmetic expressions, `True` is converted to `1` and `False` is converted to `0`.

This is called **Boolean arithmetic**. It's a surprisingly rich field of mathematics and very useful in programming!

Here are some examples:

In [13]:
bool_list = [True, False, False, True]

sum(bool_list) # They're converted to [1, 0, 0, 1]

2

In [14]:
True + True

2

In [15]:
True * 5 * False

0
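One common use of this behaviour is counting how many items satisfy a condition. The sketch below uses a made-up list of temperatures (`temps` is hypothetical data, not from the notebook).

```python
# Hypothetical temperature readings (illustrative data only)
temps = [18, 25, 31, 22, 35]

# Each comparison yields True or False; sum() treats them as 1 or 0
hot_days = sum(t > 30 for t in temps)
print(hot_days)  # 2

# bool is a subclass of int, which is why the arithmetic works
print(isinstance(True, int))  # True
```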

## Boolean Conditionals

As mentioned above, Boolean statements are simply statements that evaluate to True or False. Using this, we can write code that executes only when the data we're given meets a certain criterion.

For example:

In [16]:
x = 5
if x == 5:
    print ("x is 5!")

x is 5!


Since the statement `x == 5` evaluated to True, Python was able to print out the message. Let's see what happens if `x` isn't equal to 5.
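A minimal sketch of that case, assuming `x` is set to an arbitrary value other than 5 (here `7`). The `if` block prints nothing; the extra `condition` variable is only there to make the result of the comparison visible.

```python
x = 7  # any value other than 5
if x == 5:
    print("x is 5!")  # skipped, because x == 5 is False

condition = (x == 5)
print(condition)  # False
```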


When `x` is not equal to 5, the Boolean condition `x == 5` evaluates to False and Python skips the print command. We will see more of this when we talk about loops.

### Converting Between Data Types

Sometimes when programming, we run into issues where we need to convert between data types. For example, say we wanted to add one number to another, but since one of the numbers was stored as a string, the operation failed.

In [17]:
"5" + 0

TypeError: can only concatenate str (not "int") to str

In [None]:
A = float(5)  # int to float
B = str(5)    # int to string
C = int(4.5)  # float to int (the fractional part is dropped)

print(A)
print(B)
print(C)
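The same conversion functions also parse numbers stored as text, which is what resolves the `"5" + 0` error above. A short sketch, with `text_number` as a hypothetical input:

```python
text_number = "5"  # a number stored as text (hypothetical input)

# Convert the string first, then do the arithmetic
total = int(text_number) + 0
print(total)  # 5

# float() parses strings that contain a decimal point
price = float("4.5") * 2
print(price)  # 9.0

# Text that is not a number cannot be converted and raises ValueError
try:
    int("five")
except ValueError as err:
    print(f"ValueError: {err}")
```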

# Containers

Containers are types that store collections of (not necessarily type-homogeneous) data.

# Lists

Square brackets denote a Python `list`:

In [3]:
A = [3, 5, 6, 7, 8, 9, 3, False, True]
A

[3, 5, 6, 7, 8, 9, 3, False, True]

In [4]:
A.append("String")
A

[3, 5, 6, 7, 8, 9, 3, False, True, 'String']

You can put objects into lists. Note that the object in the list is just a **reference** to the underlying object:

In [5]:
b = ["cat", "dog"]
A.append(b) 

b.append('Mouse')
A

[3, 5, 6, 7, 8, 9, 3, False, True, 'String', ['cat', 'dog', 'Mouse']]

Now if we change `b`, the value changes in `A` as well:

In [7]:
b.append("horse")
A

[3, 5, 6, 7, 8, 9, 3, False, True, 'String', ['cat', 'dog', 'Mouse', 'horse']]
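If the nested list should not follow later changes, one option is to append a copy instead of the original. The sketch below contrasts the two; the names `pets`, `shared`, and `independent` are illustrative.

```python
pets = ["cat", "dog"]

shared = [1, 2]
shared.append(pets)             # stores a reference to the same list object

independent = [1, 2]
independent.append(list(pets))  # list(...) builds a new (shallow) copy

pets.append("mouse")

print(shared)       # [1, 2, ['cat', 'dog', 'mouse']]
print(independent)  # [1, 2, ['cat', 'dog']]
print(shared[-1] is pets)       # True
print(independent[-1] is pets)  # False
```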

In [22]:
A[3]

7

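A handy feature common to sequence containers such as lists, tuples, and strings is that you can "pick out" individual elements by index using square brackets (dictionaries use the same brackets with keys, while sets do not support indexing). NOTE: The type of bracket matters; curly brackets, for example, raise an error: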

In [23]:
A{2}

SyntaxError: invalid syntax (2920254822.py, line 1)

Note that element #3 is actually the fourth element because Python always counts from zero.

Starting from 0 rather than 1 has certain advantages, but overall it is no better or worse than coding in a language that indexes from 1.

We can also slice using a colon inside the square brackets. Slicing lets us pick out multiple entries in a list between a starting and ending index that we set:

In [24]:
A[0:4]
# A[:] omits both indices and returns the whole list, the same as A[0:len(A)]

[3, 5, 6, 7]

**NOTE:** A slice includes the element at the starting index but *excludes* the element at the ending index; the end index marks where the slice stops.

In [25]:
A[4]

8

As you can see, the element with index 4 is 8, and it was not included in the slice above.

The general rule is that `A[m:n]` returns `n - m` elements, starting at `A[m]`.
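The rule can be checked directly. This sketch uses an illustrative list `letters` and assumes `0 <= m <= n <= len(letters)`, which is the case where the `n - m` count holds.

```python
letters = ["a", "b", "c", "d", "e", "f"]  # illustrative list

m, n = 1, 4
chunk = letters[m:n]
print(chunk)                # ['b', 'c', 'd']
print(len(chunk) == n - m)  # True

# The end index is excluded, so adjacent slices fit together without overlap
print(letters[:3] + letters[3:] == letters)  # True
```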

Negative indices also work and **count back from the end of the list**, so `-1` is the last element.

In [26]:
A[-1]

['cat', 'dog', 'Mouse', 'horse']

In [27]:
A[-4:-1] #Start at -4 ends at -1 (EXCLUDES last item)


[False, True, 'String']

In [28]:
A[2:]

[6, 7, 8, 9, 3, False, True, 'String', ['cat', 'dog', 'Mouse', 'horse']]

Elements in lists can also be overwritten:

In [29]:
A[0] = 'First'
A

['First',
 5,
 6,
 7,
 8,
 9,
 3,
 False,
 True,
 'String',
 ['cat', 'dog', 'Mouse', 'horse']]

Here, I replaced the first element in our list with "First".

### List of Lists

A list of lists is a list whose entries or elements are all lists themselves. Calling certain elements in this data type requires an additional step, as you can see below.

In [30]:
a = [[1,2],[34,4],[3,4],[5,6]]
# Slicing the first three elements in a
print(a[:3])
# Calling the Second Element in a
print(a[1])
# Calling the first entry in the Second Element of a
print(a[1][0])
# Indexing a slice: a[0:3] is a list of the first 3 elements, so [1] picks its second element
print(a[0:3][1])

[[1, 2], [34, 4], [3, 4]]
[34, 4]
34
[34, 4]


Note that since strings are effectively containers of single characters we can slice strings as well:

In [8]:
Mystring = "Journey"
Mystring[0:4]

'Jour'

In [9]:
lis = list("Journey")
print(lis)

['J', 'o', 'u', 'r', 'n', 'e', 'y']


# Tuples

Parentheses denote a Python `tuple`. Tuples are the default container: if you put commas between objects without any brackets, Python builds a tuple.

In [33]:
x = ('a', 'b')
y = 'a', 'b'  # You can skip the parentheses
print(x)
print(x == y)
print(type(y))

('a', 'b')
True
<class 'tuple'>


Tuples are like lists except they're immutable (i.e. not mutable; they can't be modified after creation):

In [34]:
x = ('a', 'b')
x[0] = "this will fail"

TypeError: 'tuple' object does not support item assignment

In [35]:
l = [True, 0, 5.5]
l[0] = "This works on a list"
l

['This works on a list', 0, 5.5]

Watch out: Leaving an accidental comma after an expression will turn it into a tuple with a single element:

In [36]:
x = 5,
print(f"X is a {type(x)}")
x

X is a <class 'tuple'>


(5,)
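It is the comma, not the parentheses, that creates the tuple. A short sketch with illustrative names:

```python
not_a_tuple = (5)  # parentheses alone only group the expression
one_tuple = (5,)   # the trailing comma is what makes a tuple

print(type(not_a_tuple))  # <class 'int'>
print(type(one_tuple))    # <class 'tuple'>
print(len(one_tuple))     # 1
```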

# Set

A **set** is a container where objects are forced to be unique. It's denoted by *curly brackets*:

In [37]:
s = {1, 1, 1, 1, 1, 2}
print(type(s))
s

<class 'set'>


{1, 2}

One useful trick for now is calling `set(l)` on a container to keep only its unique elements. Sets are unordered, so do not rely on the order of the result:

In [38]:
a = [1, 2, 3, 2, 1, 4, 5, 4, 2, 3, 5, 77, 33]
a = list(set(a))
a

[1, 2, 3, 4, 5, 33, 77]
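Because a set does not guarantee any particular order, it can help to make the order explicit when it matters. The sketch below shows two options on illustrative data: sorting the unique values, or keeping first-seen order with `dict.fromkeys` (dictionaries preserve insertion order in Python 3.7 and later).

```python
values = [3, 1, 3, 2, 1]  # illustrative data

# set() removes duplicates; sorted() then fixes the order explicitly
unique_sorted = sorted(set(values))
print(unique_sorted)  # [1, 2, 3]

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(values))
print(unique_in_order)  # [3, 1, 2]
```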

# Unpacking

You can assign the elements of a container to several variables at once by putting a comma-separated list of names on the **left hand side** of an assignment. This is called unpacking:

In [39]:
tp = [11, 22, 45]
x, y, z = tp
x

11
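A few more unpacking patterns, sketched with an illustrative tuple `point`. The number of names on the left has to match the number of elements unless a starred name collects the remainder.

```python
point = (11, 22, 45)

x, y, z = point
print(y)  # 22

# Swap two variables in one statement
x, z = z, x
print(x, z)  # 45 11

# A starred name collects whatever is left over into a list
first, *rest = point
print(first, rest)  # 11 [22, 45]

# Mismatched counts raise ValueError
try:
    p, q = point
except ValueError as err:
    print(f"ValueError: {err}")
```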

# Associative Containers

The Python `dict` is an "associative array" or a "map": it associates (maps) keys to values.

In [40]:
d = {'name': 'Sam', 
     'age': 31,
     #'age':[31,41],
     'race': 'hobbit',
}
type(d)

dict

One way to think of dictionaries is that they are like lists except that the items are named instead of numbered

In [41]:
d['age']

31

The names `'name'`, `'age'`, and `'race'` are called the *keys*.

Keys are unique in a dictionary, so assigning to an existing key replaces its value.

This is one way to remember why sets and dictionaries both use curly brackets: the keys of a dictionary behave like a set.

In [42]:
d['age'] = 999
d

{'name': 'Sam', 'age': 999, 'race': 'hobbit'}
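A short sketch of other everyday dictionary operations, using a hypothetical `person` record: adding a new key, looping over entries, and looking up a key that may not exist.

```python
person = {"name": "Sam", "age": 31}  # hypothetical record

# Assigning to a new key adds an entry
person["home"] = "Shire"

# Iterate over key/value pairs in insertion order
for key, value in person.items():
    print(f"{key}: {value}")

# Indexing a missing key raises KeyError; .get() returns a default instead
height = person.get("height", "unknown")
print(height)  # unknown

# Membership tests look at the keys
print("age" in person)  # True
```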

## Arrays

Arrays are containers that carry mainly numeric information. For example, if I had a matrix:

$$
\begin{bmatrix}
1 & 2 & 1 \\
3 & 0 & 1 \\
0 & 2 & 4
\end{bmatrix}
$$


One way to represent this in Python is with an array. The package that deals with arrays and array calculations is called NumPy.

In [43]:
import numpy as np #Don't worry about library imports right now

matrix = [[1,2,1],[3,0,1],[0,2,4]]
print(matrix)
Matrix = np.array(matrix)
print(Matrix)

[[1, 2, 1], [3, 0, 1], [0, 2, 4]]
[[1 2 1]
 [3 0 1]
 [0 2 4]]


In [44]:
type(matrix)

list

In [45]:
type(Matrix)

numpy.ndarray

Arrays are different from lists of lists in that I can carry out operations **elementwise**

In [46]:
Matrix + 5

array([[6, 7, 6],
       [8, 5, 6],
       [5, 7, 9]])

In [47]:
matrix + 5

TypeError: can only concatenate list (not "int") to list

As you can see above, I was able to add 5 to every element in the array but ran into an error when I ran the same code with a list of lists. Arrays have the added benefit that their elements can be selected with Boolean conditions (often called Boolean masking), in addition to the usual index and slice notation.

In [48]:
Matrix

array([[1, 2, 1],
       [3, 0, 1],
       [0, 2, 4]])

In [49]:
Matrix[:,:2] #Grabs first 2 columns

array([[1, 2],
       [3, 0],
       [0, 2]])

In [50]:
Matrix[:2] #Grabs first 2 rows

array([[1, 2, 1],
       [3, 0, 1]])

In [51]:
Matrix[:2,1]
#Grabs first 2 rows and column index 1

array([2, 0])

In [52]:
Matrix[:2][1] # Grabs the first 2 rows, then picks the row at index 1 from those

array([3, 0, 1])

In [53]:
Matrix

array([[1, 2, 1],
       [3, 0, 1],
       [0, 2, 4]])

In [54]:
Matrix[Matrix<5]

array([1, 2, 1, 3, 0, 1, 0, 2, 4])
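Since every entry of `Matrix` is below 5, the condition above keeps everything. A sketch with a stricter condition makes the mechanism easier to see; it assumes NumPy is installed and rebuilds the same values under the illustrative name `grid`.

```python
import numpy as np

grid = np.array([[1, 2, 1],
                 [3, 0, 1],
                 [0, 2, 4]])

# The comparison is applied elementwise and yields an array of booleans
mask = grid > 1
print(mask)

# Indexing with the mask keeps only the True positions, flattened row by row
selected = grid[mask]
print(selected)  # [2 3 2 4]

# Because True counts as 1, summing the mask counts the matches
print(mask.sum())  # 4
```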

In [55]:
lis = [2,4,5,6,7,8,9]
lis = np.array(lis)
x = lis[lis<=6]
lis = list(x)
print(lis)

[2, 4, 5, 6]


In [56]:
x

array([2, 4, 5, 6])

## Data Frames

Lastly, we have the most commonly used data container in data science: the data frame. Data frames are versatile containers because they can store multiple types of data and attach labels to rows and columns. The package that handles data frames is "pandas".

In [57]:
# Creating a data frame - will look more when we look at python libraries
import pandas as pd
dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
df

| | A | B | C | D |
|---|---|---|---|---|
| 2013-01-01 | 0.260387 | -2.257405 | 1.046962 | 0.55902 |
| 2013-01-02 | 0.121074 | -0.219977 | 2.214922 | -0.228332 |
| 2013-01-03 | 0.044422 | -2.367109 | -1.649273 | 0.428935 |
| 2013-01-04 | -2.447554 | 0.112796 | -0.395942 | 0.11108 |
| 2013-01-05 | -0.113939 | 0.169123 | -0.852692 | 1.014481 |
| 2013-01-06 | 0.019017 | 0.943597 | -0.597067 | -0.892586 |


In [58]:
s = pd.Series([1, 3, 5, np.nan, 6, 8])  # A Series can be thought of as a one-dimensional data frame, much like a vector
s

0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64

Here, I created a data frame whose columns are filled with random numbers. We can also append columns to pandas data frames as needed. Each individual column is known as a "Series".

In [59]:
# Adding a List of fruits as a column in the data frame
df['E'] = ['Apples', 'Oranges', 'Pears', 'Grapes', 'Banana', 'Guava']
df

| | A | B | C | D | E |
|---|---|---|---|---|---|
| 2013-01-01 | 0.260387 | -2.257405 | 1.046962 | 0.55902 | Apples |
| 2013-01-02 | 0.121074 | -0.219977 | 2.214922 | -0.228332 | Oranges |
| 2013-01-03 | 0.044422 | -2.367109 | -1.649273 | 0.428935 | Pears |
| 2013-01-04 | -2.447554 | 0.112796 | -0.395942 | 0.11108 | Grapes |
| 2013-01-05 | -0.113939 | 0.169123 | -0.852692 | 1.014481 | Banana |
| 2013-01-06 | 0.019017 | 0.943597 | -0.597067 | -0.892586 | Guava |


We'll dive deeper into data frames in later workshops, as there is quite a bit to learn about the specifics of manipulating data in a data frame.

## Converting Between Data Containers

In the same way we can convert between data types, we can convert between data containers as well, as you can see below:

In [60]:
# List to Set
a = [1,2,3,2,1,4,5,4,2,3,5,77,33]
a = set(a)
print(a)

{1, 2, 3, 4, 5, 33, 77}


In [61]:
# Tuple to List
a = (1,2,3,4,'a','d','f')
a= list(a)
print(a)

[1, 2, 3, 4, 'a', 'd', 'f']


In [62]:
# List to Tuple
a = [1,2,3,4,5,5,6,67,7]
a = tuple(a)
print(a)

(1, 2, 3, 4, 5, 5, 6, 67, 7)


In [63]:
a = [[1,2],[34,4],[3,4],[5,6]]
[sublist[1] for sublist in a[0:3]] # Take index 1 from each of the first three sublists (indices 0 to 2)

[2, 4, 4]