# Foundations of Data Science - CMU PORTUGAL ACADEMY

> In this notebook you can get an overview of common Python operations and features.
> 
> Instructors:
>   - David Semedo (df.semedo@fct.unl.pt)
>   - Rafael Ferreira (rah.ferreira@fct.unl.pt)

## Basic data types

### Numbers

Integers and floats work as you'd expect from other programming languages. Let's create two variables: 

In [2]:
some_integer = 73
some_float = 3.14159

We can print out these variables and their types as follows:

In [3]:
print(some_integer, type(some_integer))
print(some_float, type(some_float))

73 <class 'int'>
3.14159 <class 'float'>


What if you want to print some text and then some numbers? One way to do this is to _cast_ the number as a string and then print it:

In [4]:
print('My integer was ' + str(some_integer))
print('My float was ' + str(some_float))

My integer was 73
My float was 3.14159


Alternatively, pass comma-separated values to the `print` function, which inserts a space between them:

In [None]:
print('My integer was', some_integer)
print('My float was', some_float)

My integer was 73
My float was 3.14159
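
A third option is an f-string (formatted string literal), which later cells in this notebook also use. The sketch below assumes Python 3.6 or later; the variable name `message` and the `.2f` format are only illustrative.

```python
some_integer = 73
some_float = 3.14159

# An f-string evaluates the expressions inside {} and inserts them into the text.
# The optional ':.2f' format specifier rounds the float to two decimal places.
message = f'My integer was {some_integer} and my float was {some_float:.2f}'
print(message)
```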


Summary of Python operators over numbers:

| Operator     | Name           | Description                                            |
|--------------|----------------|--------------------------------------------------------|
| ``a + b``    | Addition       | Sum of ``a`` and ``b``                                 |
| ``a - b``    | Subtraction    | Difference of ``a`` and ``b``                          |
| ``a * b``    | Multiplication | Product of ``a`` and ``b``                             |
| ``a / b``    | True division  | Quotient of ``a`` and ``b``                            |
| ``a // b``   | Floor division | Quotient of ``a`` and ``b``, rounded down              |
| ``a % b``    | Modulus        | Remainder after division of ``a`` by ``b``             |
| ``a ** b``   | Exponentiation | ``a`` raised to the power of ``b``                     |
| ``-a``       | Negation       | The negative of ``a``                                  |

In [5]:
print('Sum:', some_integer + some_float)
print('Multiplication:', some_integer * some_float)
print('Division:', some_integer / some_float)
print('Power:', 10 ** some_integer)

Sum: 76.14159
Multiplication: 229.33606999999998
Division: 23.236641318567987
Power: 10000000000000000000000000000000000000000000000000000000000000000000000000


We can also store the result from math operations in new variables, e.g.

In [6]:
my_sum = some_integer + some_float
print('My sum was', my_sum)

My sum was 76.14159


**Different types of division in Python.**

In [7]:
3 / 2  # True division: always returns a float

1.5

In [8]:
3 // 2  # Floor division: rounds the quotient down

1
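
The difference between the two matters most for negative numbers, because floor division rounds down rather than toward zero. A small sketch of this behaviour (the variable names are illustrative):

```python
# Floor division rounds toward negative infinity, not toward zero.
positive_quotient = 7 // 2    # 3
negative_quotient = -7 // 2   # -4, not -3

# For integers, (a // b) * b + (a % b) == a, so the remainder here is 1.
negative_remainder = -7 % 2

# divmod returns the floor quotient and the remainder in one call.
quotient, remainder = divmod(-7, 2)
print(positive_quotient, negative_quotient, negative_remainder, quotient, remainder)
```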

### Booleans

Python implements all of the usual operators for Boolean logic, but uses English words rather than symbols such as `&&` and `||` that are found in other languages:

In [9]:
t = True
f = False
print(type(t), type(f))
# logical AND
print(t and f)
# logical OR
print(t or f)
# logical NOT
print(not t)
# logical XOR
print(t != f)

<class 'bool'> <class 'bool'>
False
True
False
True


### Strings

Python has powerful and flexible string processing capabilities. You can write _string literals_ using either single quotes `'` or double quotes `"`:

In [10]:
a = 'one way of writing a string'
b = "another way"

We can also get the number of characters in a string with `len`:

In [11]:
hello = 'hello'
len(hello)

5

We can also access each character in a string and print its value:

In [12]:
for letter in hello:
    print(letter)

h
e
l
l
o


Adding two strings together concatenates them and produces a new string:

In [13]:
world = 'world'
hello + ' ' + world 

'hello world'

String objects also come with a range of built-in methods that return a converted copy of the string:

In [14]:
hello.capitalize()

'Hello'

In [15]:
hello.upper()

'HELLO'

In [16]:
# replace all instances of one substring with another
s = 'hitchhiker'
s.replace('hi', 'ma')

'matchmaker'

## Container Data Structures

Python includes several built-in container types:

* Lists
* Dictionaries
* Sets
* Tuples

### Lists

A list is the Python equivalent of an array, but is _resizeable_ and can contain elements of _different types_:

In [32]:
some_list = [1,1,2,3,5,8,13,21,34,55,89]
print('This is a list:', some_list)

This is a list: [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]


List elements are accessed by index. In Python, indices start at zero:

In [33]:
print(some_list[0])

1


To get elements at the end of a list, we can use _negative indices_, e.g.

In [None]:
print(some_list[-1])

89


We can also replace values in a list based on their index, e.g.

In [None]:
some_list[-1] = 148
print(some_list)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 148]


Finally, we can use the `pop` method to remove and return the last element of a list:

In [None]:
last_element = some_list.pop()
print(last_element)

148
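
Because lists are resizeable, elements can also be added. A minimal sketch with a fresh list (the name `fib` is only for illustration), using the built-in list methods `append`, `insert`, and `pop`:

```python
fib = [1, 1, 2, 3, 5]

fib.append(8)        # add one element at the end
fib.insert(0, 0)     # insert the value 0 at index 0
print(fib)

removed = fib.pop()  # remove and return the last element
print(removed, fib)
```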


#### Slicing

In addition to accessing list elements one at a time, Python provides concise syntax to access sublists; this is known as _slicing_. 
Python's built-in `range` function, combined with `list`, can be used to create a list of integers:

In [34]:
L = list(range(10))
L

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

To get a slice from $[2,4)$ we run

In [35]:
L[2:4]

[2, 3]

while to get a slice from index 2 to the end of the list we run

In [36]:
L[2:]

[2, 3, 4, 5, 6, 7, 8, 9]

To get a slice from the start to index 5 (exclusive) we run

In [37]:
L[:5]

[0, 1, 2, 3, 4]

To get a slice of the whole list:

In [38]:
L[:]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Slice indices can also be negative, counting from the end of the list:

In [39]:
L[:-1]

[0, 1, 2, 3, 4, 5, 6, 7, 8]

Assigning to a slice replaces the selected elements in place:

In [40]:
L[2:4] = ['a', 'b']
L

[0, 1, 'a', 'b', 4, 5, 6, 7, 8, 9]
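
Slices also accept an optional third value, the step. The sketch below uses a fresh list and illustrative variable names to show the `start:stop:step` form:

```python
L = list(range(10))

every_second = L[::2]      # from start to end, taking every second element
reversed_copy = L[::-1]    # a negative step walks the list backwards
stepped_window = L[2:8:3]  # indices 2 and 5 (index 8 is excluded)

print(every_second)
print(reversed_copy)
print(stepped_window)
```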

#### Loops

In [41]:
animals = ['cat', 'dog', 'monkey']
for animal in animals:
    print(animal)

cat
dog
monkey
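
When the position of each element is needed as well, the built-in `enumerate` function yields (index, element) pairs. A small sketch, where the `labels` list and the choice to start counting at 1 are only illustrative:

```python
animals = ['cat', 'dog', 'monkey']

labels = []
# enumerate pairs each element with a counter; start=1 makes it begin at 1.
for idx, animal in enumerate(animals, start=1):
    labels.append(f'#{idx}: {animal}')

print(labels)
```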


#### List comprehension

Creating a list with a for loop:

In [28]:
nums = [0, 1, 2, 3, 4]
squares = []
for x in nums:
    squares.append(x ** 2)

squares

[0, 1, 4, 9, 16]

The same list can be created more concisely with a list comprehension:

In [42]:
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]; squares

[0, 1, 4, 9, 16]

A list comprehension can also filter elements with an `if` clause:

In [44]:
nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]; even_squares

[0, 4, 16]

### Dictionaries

A dictionary, or `dict`, is a collection of _key-value_ pairs, where _key_ and _value_ are Python objects. The simplest way to create a dictionary is with curly braces `{}`:

In [45]:
empty_dict = {}

In [46]:
d = {'cat': 'cute', 'dog': 'furry'}; d

{'cat': 'cute', 'dog': 'furry'}

We can access, insert, or set elements with the same square-bracket syntax as for lists, indexing by key instead of by position:

In [47]:
d['cat']

'cute'

In [48]:
d['fish'] = 'wet'; d

{'cat': 'cute', 'dog': 'furry', 'fish': 'wet'}

You can check if a key exists as follows:

In [49]:
'cat' in d

True

Finally, you can delete a key and its value using the `del` keyword:

In [50]:
del d['fish']; d

{'cat': 'cute', 'dog': 'furry'}
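
Accessing a missing key with square brackets raises a `KeyError`. As a sketch of two related built-in `dict` methods, `get` returns a fallback value instead, and `pop` removes a key while returning its value (the fallback `'unknown'` and the variable names are illustrative):

```python
d = {'cat': 'cute', 'dog': 'furry'}

# d['fish'] would raise KeyError; get() returns the given default instead.
fish_description = d.get('fish', 'unknown')

# pop() removes the key and returns the value that was stored under it.
dog_description = d.pop('dog')

print(fish_description, dog_description, d)
```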

#### Loops

In [51]:
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal in d:
    legs = d[animal]
    print(f'A {animal} has {legs} legs')

A person has 2 legs
A cat has 4 legs
A spider has 8 legs


In [52]:
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal, legs in d.items():
    print(f'A {animal} has {legs} legs')

A person has 2 legs
A cat has 4 legs
A spider has 8 legs


#### Dictionary comprehensions

Similarly to lists, we can create dictionaries with dictionary comprehensions:

In [53]:
nums = [0, 1, 2, 3, 4]
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square)

{0: 0, 2: 4, 4: 16}


### Sets

A set is an unordered collection of unique elements. Sets are similar to dicts, but with just keys (no values). The simplest way to create a set is as follows:

In [54]:
animals = {'cat', 'dog'}; animals

{'cat', 'dog'}

Sets allow us to perform the standard set operations like union, intersection, difference and symmetric difference. For example:

In [55]:
felines = {'cat', 'tiger', 'lion'}

In [56]:
animals.union(felines)

{'cat', 'dog', 'lion', 'tiger'}

In [57]:
animals.intersection(felines)

{'cat'}
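
The remaining two operations mentioned above, difference and symmetric difference, work the same way. A short sketch reusing the same two sets (the result variable names are illustrative):

```python
animals = {'cat', 'dog'}
felines = {'cat', 'tiger', 'lion'}

# Elements of animals that are not in felines.
only_animals = animals.difference(felines)

# Elements that are in exactly one of the two sets.
in_one_only = animals.symmetric_difference(felines)

# The operators -, ^, | and & are shorthand for the corresponding methods.
same_as_difference = animals - felines

print(only_animals, in_one_only)
```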

### Tuples

A tuple is an immutable, ordered sequence of values. The simplest way to create one is with a comma-separated sequence of values:

In [58]:
tup = (4, 5, 6)
tup

(4, 5, 6)

In [59]:
nested_tup = (4, 5, 6), (7, 8)
nested_tup

((4, 5, 6), (7, 8))

Multiplying a tuple by an integer has the effect of concatenating together copies of the tuple:

In [60]:
('foo', 'bar') * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')
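
Immutability means that a tuple's elements cannot be reassigned after creation. A minimal sketch of what happens on an attempted assignment, followed by unpacking a tuple into separate variables (the names `error_message`, `a`, `b`, and `c` are illustrative):

```python
tup = (4, 5, 6)

# Item assignment is not allowed on tuples and raises a TypeError.
try:
    tup[0] = 99
except TypeError as err:
    error_message = str(err)

# Unpacking assigns each element to its own variable.
a, b, c = tup

print(error_message)
print(a, b, c)
```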

### Strings

In Python, strings behave like list-like containers of characters. This section covers a handful of useful operations for processing them.

In [62]:
string = 'This is a string!\n(But not a very interesting one)\n\n\tEnd.'
print(string)

This is a string!
(But not a very interesting one)

	End.


Strings are sequences of characters (immutable ones, unlike lists), so we can iterate through them just like lists:

In [63]:
for character in string:
    print(character)

T
h
i
s
 
i
s
 
a
 
s
t
r
i
n
g
!


(
B
u
t
 
n
o
t
 
a
 
v
e
r
y
 
i
n
t
e
r
e
s
t
i
n
g
 
o
n
e
)




	
E
n
d
.


Check their length like lists:

In [64]:
len(string)

57

Check if they contain certain elements like lists:

In [65]:
'!' in string

True

In [66]:
'?' not in string

True

We can also check if a **substring** is present in a string:

In [67]:
'very' in string

True

**Capitalisation:**
There are different ways to manipulate the casing of strings:

In [68]:
'test'.upper()

'TEST'

In [69]:
'TEST'.lower()

'test'

In [70]:
'test'.capitalize()

'Test'

**Adding strings:**

In [71]:
result = 'a'+'b'
print(result)

ab


**Splitting strings:**

Often we need to split sentences into words or file paths into components. For this task we can use the `split()` method. By default, a string is split on any run of whitespace (spaces, tabs `\t`, or newlines `\n`); a different separator can be passed as an argument:

In [72]:
string.split()

['This',
 'is',
 'a',
 'string!',
 '(But',
 'not',
 'a',
 'very',
 'interesting',
 'one)',
 'End.']

In [73]:
'path/to/file/image.jpg'.split('/')

['path', 'to', 'file', 'image.jpg']

**Stripping strings:**

Sometimes strings contain leading or trailing characters that we want to get rid of, such as whitespace or unnecessary punctuation. We can remove them with the `strip()` method. Like `split()`, it targets whitespace by default, but we can pass a string of characters to remove instead; each of those characters is stripped from both ends, in any order:

In [74]:
'_path/to/file/image.jpg_'.strip('_')

'path/to/file/image.jpg'

In [75]:
'_-_path/to/file/image.jpg,_,'.strip(',_-')

'path/to/file/image.jpg'
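
To strip only one side, the related built-in methods `lstrip()` and `rstrip()` can be used. A short sketch with an illustrative string `raw`:

```python
raw = '  \tsome text\n'

both_sides = raw.strip()   # remove whitespace from both ends
left_only = raw.lstrip()   # remove leading whitespace only
right_only = raw.rstrip()  # remove trailing whitespace only

# repr() makes the remaining whitespace visible in the printed output.
print(repr(both_sides), repr(left_only), repr(right_only))
```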

**Replacing:**

With the `replace()` function one can replace substrings in a string.

In [76]:
'one plus one equals two!'.replace('two','three')

'one plus one equals three!'

**Joining strings**

Sometimes we split strings into a list of words for processing (like stemming or stop word removal) and then want to join them back into a single string. To do this we can use the `join()` method:

In [None]:
' '.join(['this', 'is', 'a', 'list', 'of', 'words'])

'this is a list of words'

In [None]:
'-'.join(['this', 'is', 'a', 'list', 'of', 'words'])

'this-is-a-list-of-words'
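
Putting `split()` and `join()` together, the split-process-join pattern described above could look like the sketch below. The stop word set here is a made-up example, not a standard list:

```python
sentence = 'this is a list of words'

# Hypothetical stop words, chosen only for this illustration.
stop_words = {'is', 'a', 'of'}

# Split into words, drop the stop words, then join the rest back together.
kept_words = [word for word in sentence.split() if word not in stop_words]
cleaned = ' '.join(kept_words)

print(cleaned)
```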

## Functions

Functions are declared with the `def` keyword and return values with the `return` keyword, e.g.

In [77]:
def area_of_a_circle(radius):
    area = 3.14159 * radius ** 2
    return area

In [78]:
area_of_a_circle(5)

78.53975
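
The function above hard-codes an approximation of π. As a possible variation, the standard library's `math.pi` gives a more precise constant; the function name `area_of_a_circle_precise` is hypothetical and used only to keep the two versions apart:

```python
import math


def area_of_a_circle_precise(radius):
    # math.pi holds pi to full float precision.
    return math.pi * radius ** 2


precise_area = area_of_a_circle_precise(5)
print(precise_area)
```

The result differs from the hard-coded version only from the fifth decimal place onwards.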

Note that we can have multiple return statements, e.g. based on the result of some conditional statements:

In [79]:
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

for x in [-1, 0, 1]:
    print(sign(x))

negative
zero
positive


We can define _keyword_ arguments and specify default values:

In [80]:
def hello(name, loud=False):
    if loud:
        print('G\'day, %s!' % name.upper())
    else:
        print('G\'day, %s' % name)

hello('Bob') 
hello('Fred', loud=True)

G'day, Bob
G'day, FRED!


In Python, a function can return _multiple_ objects, which are packed into a tuple:

In [81]:
def is_number_positive(number):
    if number > 0:
        return True, number
    else:
        return False, number

In [82]:
is_number_positive(-1)

(False, -1)

In [83]:
is_number_positive(42)

(True, 42)

One useful aspect of such functions is that the result can be received in two different ways, either as a single tuple or unpacked into separate variables:

In [84]:
tup = is_number_positive(3)
tup

(True, 3)

In [85]:
is_positive, number = is_number_positive(-10)

print(is_positive)
print(number)

False
-10


## Exercises from Slides

Exercise from Slide 30 - Session 1 - Introduction to Python Programming.

In [86]:
probability = 0.7  # 70% chance of winning
reward_amount = 100  # 100€ reward
tax_rate = 0.23  # 23% tax

# Reading the bet amount from the user
bet_amount = float(input("Enter the bet amount: "))

# Calculate expected winnings before tax
expected_win = probability * reward_amount

# Apply tax to the expected winnings
expected_win_after_tax = expected_win * (1 - tax_rate)

net_winnings = expected_win_after_tax - bet_amount

print("Net expected winnings after tax and bet:" , round(net_winnings, 2))



Net expected winnings after tax and bet: 43.9


Exercise from Slide 40 - Session 1 - Introduction to Python Programming.

In [87]:
def expected_reward(bet_amount):

    probability = 0.7  # 70% chance of winning
    reward_amount = 100  # 100€ reward
    tax_rate = 0.23  # 23% tax

    # Calculate expected winnings before tax
    expected_win = probability * reward_amount

    # Apply tax to the expected winnings
    expected_win_after_tax = expected_win * (1 - tax_rate)

    net_winnings = expected_win_after_tax - bet_amount

    return round(net_winnings, 2)



In [88]:
bet = float(input("Enter the bet amount: "))
result = expected_reward(bet)
print("Net expected winnings after tax and bet:" , result, "€")


Net expected winnings after tax and bet: 33.9 €
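
The function above fixes the probability, reward, and tax rate inside its body. One possible variation, not taken from the slides, exposes them as keyword arguments with default values, as introduced in the Functions section. The name `expected_reward_with_params` is hypothetical:

```python
def expected_reward_with_params(bet_amount, probability=0.7,
                                reward_amount=100, tax_rate=0.23):
    # Expected winnings before tax.
    expected_win = probability * reward_amount
    # Apply tax to the expected winnings, then subtract the bet.
    expected_win_after_tax = expected_win * (1 - tax_rate)
    return round(expected_win_after_tax - bet_amount, 2)


default_result = expected_reward_with_params(20)
untaxed_result = expected_reward_with_params(20, tax_rate=0.0)
print(default_result, untaxed_result)
```

Called with only the bet amount, it behaves like the original function; the defaults can be overridden one at a time.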


Exercise from Slide 51 - Session 1 - Introduction to Python Programming.

In [91]:
def is_prime(n):
    """is_prime : int [positive] -> bool
    Description: checks if n is prime.
    Examples: is_prime(3)->True; is_prime(4)->False
    """
    v = n - 1
    while (v > 1) and (n % v != 0):
        v = v - 1

    return v == 1


In [92]:
print(is_prime(7), is_prime(123), is_prime(1009))

True False True
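
The solution above tries every candidate divisor below `n`. As an alternative sketch, not the version from the slides, it is enough to test divisors up to the integer square root of `n`. The name `is_prime_fast` is hypothetical, and `math.isqrt` assumes Python 3.8 or later:

```python
import math


def is_prime_fast(n):
    """is_prime_fast : int -> bool
    Description: checks if n is prime, testing divisors up to sqrt(n).
    """
    if n < 2:
        return False
    # If n has a divisor larger than sqrt(n), it also has one smaller than it.
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True


print(is_prime_fast(7), is_prime_fast(123), is_prime_fast(1009))
```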
