# Intro to Python for data science - SOLUTION

## 1. Introducing Python syntax
Let's illustrate three key aspects of Python:
* Variables
* Data structures
* Logic and loops


### 1. Variables

#### 1.1 Variable types

Python has several built-in types of variables, including strings, integers and floats. You do not need to declare a variable's type: Python infers it from the value you assign, which makes these objects simpler to work with than in languages like C++.

Let's create some:

In [2]:
title = "Introductory Python"
date = "March 2019"

date_founded = 1946
my_salary = 250000

pi_5_decimal_places = 3.14159

In [3]:
# To inspect a variable, just call it:

title

'Introductory Python'

In [4]:
# To check a variable's type:

type(date_founded)

int

Note: Python has many other types of objects, such as datetime objects, numerical arrays, pandas DataFrames and geometry objects.

#### 1.2 Manipulating strings

In [5]:
# How long is the string?

len(title)

19

In [6]:
# Add (concatenate) two strings together

full_title = title + ": " + date

full_title

'Introductory Python: March 2019'

In [7]:
# Split the string up.
# Note: Each data type has 'methods' associated with it. You can access these with dot notation. '.split' is just one example of the methods associated with a string object.

full_title.split()


['Introductory', 'Python:', 'March', '2019']

In [8]:
# Slice strings using Python's index notation (this will quickly come in very handy).
# Indexing starts at zero!

print(title[:5])       # first 5 characters
print(title[5:20])     # index 5 to the end (a slice may run past the end of the string)

Intro
ductory Python
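
The slicing rules above can be summarised in a short sketch. It redefines `title` so that it runs on its own. The start index is included, the end index is excluded, and negative indices count from the end of the string.

```python
title = "Introductory Python"

first_word = title[:12]     # indices 0 to 11 (the end index 12 is excluded)
last_word = title[-6:]      # the last six characters
middle = title[5:12]        # indices 5 to 11

print(first_word)
print(last_word)
print(middle)
```

This should print 'Introductory', 'Python' and 'ductory' on separate lines.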


In [9]:
# EXERCISE:

# 1. Use tab completion to find what other useful methods you can access for (i) string objects; and (ii) integers or floats.
# 2. Call up help for the '.split' method. Figure out how to split full_title on the colon rather than the spaces. (Hint: this should return two items instead of four)
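
One possible answer to the second part, as a sketch: `.split` accepts the separator as an argument. The variable names `parts` and `clean_parts` are chosen here for illustration only.

```python
full_title = "Introductory Python: March 2019"

# Pass the separator as an argument; without one, split() uses whitespace.
parts = full_title.split(":")
print(parts)

# Splitting on the colon plus the space avoids the leading space in the second item.
clean_parts = full_title.split(": ")
print(clean_parts)
```

Both calls return two items instead of four. The first keeps the space before 'March', the second does not.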

#### 1.3 Manipulating integers and floats

In [10]:
# Basic math operators work as expected

print("Addition: ", 3+4)
print("Multiplication:", 3 * 4)
print("Division: ",3 / 4)

Addition:  7
Multiplication: 12
Division:  0.75


In [12]:
# Math can be applied to variables. The '/' operator always returns a float, even when both operands are integers.

my_daily_salary = my_salary / 365

print("my_daily_salary is ", my_daily_salary)
type(my_daily_salary)

my_daily_salary is  684.931506849315


float

In [13]:
# We can convert datatypes using str(), int() or float(). Note that int() drops the decimal part rather than rounding.

my_daily_salary = int(my_daily_salary)

print("converted back to an integer: ",my_daily_salary)

converted back to an integer:  684
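
Note that 684.93 became 684, not 685. The sketch below contrasts `int()` with the built-in `round()`; the value is copied from the output above so that the block runs on its own.

```python
daily = 684.931506849315

truncated = int(daily)        # drops the fractional part
rounded = round(daily)        # rounds to the nearest integer
two_places = round(daily, 2)  # rounds to two decimal places

print(truncated, rounded, two_places)
```

This should print `684 685 684.93`.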


In [16]:
# Datatypes matter. One of the following will fail, one will succeed:

# YOUR ACTION REQUIRED: Fix this cell if it fails when you run it. (You can just comment out the broken command)

print("Every day I earn " + my_daily_salary)

print("Every day I earn " + str(my_daily_salary))

TypeError: can only concatenate str (not "int") to str
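
The error arises because '+' cannot join a string and an integer. As a sketch of two ways around it: convert with `str()` as above, or use an f-string (available from Python 3.6), which converts the value inside the braces for you.

```python
my_daily_salary = 684

# str() converts the integer so that '+' can join the two strings.
message_concat = "Every day I earn " + str(my_daily_salary)

# An f-string converts the value for you inside the braces.
message_fstring = f"Every day I earn {my_daily_salary}"

print(message_concat)
print(message_fstring)
```

Both lines print the same message.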

In [17]:
# EXERCISE 2: Sports star salaries

# 1. Define three variables, player, season_length, and annual_salary. Just use a Google search to find some values for this.
# 2. Write a script to calculate how much the player earns per month, week, day, and for the whole season.
# 3. Print it out neatly.
# 4. Your player got bought by Real Madrid and their salary doubled. Recalculate.

# BONUS POINTS: find a way to limit the salary to two decimal places when printing it.

In [None]:
# Your code here:


In [None]:
## SOLUTION CODE

player = 'Ronaldo'
season_length = 70
annual_salary = 2500000

daily_sal = annual_salary / 365
season_sal = daily_sal * season_length

In [None]:
print(player + " earns " + str(daily_sal) + " per day and a total of " + str(season_sal) + " during the " + str(season_length) + "-day long season.")

In [None]:
print("{} earns {:.2f} per day and total of {:.2f} during the {}-day long season".format(player,daily_sal,season_sal, season_length))

In [None]:
annual_salary = annual_salary * 2

daily_sal = annual_salary / 365
season_sal = daily_sal * season_length

print("{} earns {:.2f} per day and total of {:.2f} during the {}-day long season".format(player,daily_sal,season_sal, season_length))
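
The solution above covers the daily and season totals. A possible sketch for the monthly and weekly figures follows. It assumes a year of 12 months and 52 weeks, which is a simplification, and the names `monthly_sal` and `weekly_sal` are not part of the original solution.

```python
player = 'Ronaldo'
annual_salary = 2500000

# Assumption: a year is treated as 12 months and 52 weeks.
monthly_sal = annual_salary / 12
weekly_sal = annual_salary / 52

print("{} earns {:.2f} per month and {:.2f} per week".format(player, monthly_sal, weekly_sal))
```

The `{:.2f}` placeholder limits each figure to two decimal places, as in the cells above.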

### 2. Data structures: Lists and dictionaries
In data analysis we're usually concerned with groups of objects. Python provides lists, dictionaries and tuples as ways to aggregate your variables.

#### 2.1 Lists
Lists can be thought of as bags of objects:
* An empty list is denoted []
* It can contain any number of objects. You can mix different types together (e.g. list1 = [2, 6, 'hello'])
* Learn the methods to append new items to lists, add lists together, check whether a given item is included in a list, etc. 

In [18]:
list1 = [10,12,'hello']
list1

[10, 12, 'hello']

In [19]:
evens = [2,4,6,8]
odds = [3,5,7,9]
all_nums = evens + odds
all_nums

[2, 4, 6, 8, 3, 5, 7, 9]

In [20]:
# Use list methods like .sort

all_nums.sort()
all_nums

[2, 3, 4, 5, 6, 7, 8, 9]

In [21]:
# Pull items out of lists with indexing:

all_nums[2:5]

[4, 5, 6]
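
Appending items and checking membership, both mentioned in the list at the start of this section, can be sketched as follows. The list is redefined so that the block runs on its own.

```python
all_nums = [2, 3, 4, 5, 6, 7, 8, 9]

all_nums.append(10)          # add one item to the end of the list
has_seven = 7 in all_nums    # membership test, gives True or False
has_one = 1 in all_nums

print(all_nums)
print(has_seven, has_one)
print(len(all_nums))         # number of items in the list
```
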

In [22]:
# EXERCISE: 

#### 2.2 Dictionaries
Dictionaries are a widely used data structure that stores key-value pairs: each key is paired with one value.
They are denoted with curly brackets, like this: {'hello': 'bonjour'}.

Examples:

In [23]:
employees = {'Mary': 'Economist', 'John': 'Urban Specialist', 'Yixuan': 'Communications Officer'}
employees['Mary']

'Economist'

In [24]:
print(list(employees.values()))

['Economist', 'Urban Specialist', 'Communications Officer']
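
A few more common dictionary operations, as a sketch. The entries 'Amina' and 'Pedro' are made up for illustration.

```python
employees = {'Mary': 'Economist', 'John': 'Urban Specialist'}

# Assign to a new key to add a pair ('Amina' is a made-up example entry).
employees['Amina'] = 'Data Scientist'

# .get() returns a default instead of raising KeyError for a missing key.
missing_role = employees.get('Pedro', 'not found')

# .items() yields each key together with its value.
for name, role in employees.items():
    print(name, "is a", role)

print(missing_role)
```

Looking up a missing key with square brackets, as in `employees['Pedro']`, would raise a KeyError, which is why `.get()` is often the safer choice.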


### 3. Logic and control flow
When writing scripts for data analysis, loops and logic operations are very helpful. Examples:
* Check if two quantities are the same or different
* Filter out erroneous data by dropping observations more than 3 standard deviations above the mean
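
The second example can be sketched with the standard library's `statistics` module. The observations are made up for illustration, and the loop and 'if' syntax used here is explained later in this section.

```python
import statistics

# Made-up observations: 19 plausible readings and one obvious error (100).
observations = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
                11, 9, 10, 12, 10, 9, 11, 10, 10, 100]

mean = statistics.mean(observations)
std_dev = statistics.stdev(observations)   # sample standard deviation
threshold = mean + 3 * std_dev

cleaned = []
for value in observations:
    if value <= threshold:       # keep only values at or below the cutoff
        cleaned.append(value)

print(len(observations), "observations before filtering")
print(len(cleaned), "observations after filtering")
```

With very small samples no single value can lie 3 standard deviations above the mean, so this kind of filter only makes sense with enough observations.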

#### 3.1 Logic operators

In [25]:
# Assign values to variables using '='
a = 5
b = 7

In [26]:
# Compare variables using '<', '>', or '=='

a == b

False

In [27]:
a < b

True

In [28]:
condition = a < b
print(condition)
type(condition)

True


bool
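
Boolean values can be combined with 'and', 'or' and 'not'. A minimal sketch, reusing the values of a and b from above:

```python
a = 5
b = 7

both_true = (a < b) and (b < 10)     # True only if both tests pass
either_true = (a > b) or (a == 5)    # True if at least one test passes
not_equal = a != b                   # '!=' means "not equal to"
negated = not (a == b)               # 'not' flips True to False and vice versa

print(both_true, either_true, not_equal, negated)
```

All four expressions evaluate to True for these values.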

#### 3.2 Conditional statements with if
Let's illustrate the power of 'if' constructions using a dietary example. Our Python coder is vegetarian. We will test whether the variable 'food' is 'burger', 'chicken' or 'veg', then decide whether to eat. For this, we will use 'if', 'elif' (else if), and 'else'.

How to use this structure:
* start with an 'if' statement, specifying the logical test to apply
* make sure your 'if' statement ends with :
* your next line sets out the action to take if the tested condition was true

Test additional conditions using 'elif', and handle all remaining cases with 'else'.

In [29]:
food = 'chicken'

In [30]:
if food == 'veg':
    print('yum')
elif food == 'chicken':
    print('hmm maybe')
elif food == 'burger':
    print('no thanks')
else:
    print('more information needed')

hmm maybe


#### 3.3 For loops
This code structure allows you to perform calculations several times in a row. 

* Declare your for loop with respect to any iterable. The range() function is a helpful way to count up to a specified number (it starts at zero and stops before the number you give it).
* End the for loop declaration with : then indent the next lines.
* Note: it's common to nest 'if' statements within for loops. If so, you need two levels of indentation.

In [31]:
print("The omelettes I ate this week:")
print()

n_omelettes = 6

for i in range(n_omelettes):
    print("omelette ",i)

The omelettes I ate this week:

omelette  0
omelette  1
omelette  2
omelette  3
omelette  4
omelette  5
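
A for loop can also run directly over a list, and an 'if' statement can be nested inside it. The sketch below uses a made-up list of meals and shows the two levels of indentation.

```python
foods = ['veg', 'burger', 'chicken', 'veg']
n_veg = 0

for food in foods:            # first level of indentation: inside the loop
    if food == 'veg':         # second level: inside the 'if'
        n_veg = n_veg + 1
        print(food, "- yum")
    else:
        print(food, "- no thanks")

print("Vegetarian meals:", n_veg)
```
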


In [32]:
# EXERCISE: FizzBuzz

# Write a script to count from 1 to 20, but replace each number with 'fizz' if it is divisible by 3, 'buzz' if divisible by 5, and 'fizzbuzz' if divisible by both 3 and 5.

# Hint: use the modulo operator '%' to check divisibility of numbers. (Any questions about a Python function, just google it!) Example: 10 % 2 == 0.
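
As a quick sketch of what '%' does: it returns the remainder after division, so a result of zero means the first number is divisible by the second. The variable names are chosen for illustration.

```python
remainder_even = 10 % 2        # 10 divides evenly by 2, so nothing is left over
remainder_odd = 10 % 3         # 10 = 3 * 3 + 1, so the remainder is 1
divisible_by_5 = 15 % 5 == 0   # comparing the remainder with zero gives a boolean

print(remainder_even, remainder_odd, divisible_by_5)
```

This should print `0 1 True`.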

In [33]:
# SOLUTION:

for i in range(1,21):
    if (i % 3 == 0) and (i % 5 == 0):
        print('fizzbuzz')
    elif i % 3 == 0:
        print('fizz')
    elif i % 5 == 0:
        print('buzz')
    else:
        print(i)
    

1
2
fizz
4
buzz
fizz
7
8
fizz
buzz
11
fizz
13
14
fizzbuzz
16
17
fizz
19
buzz
