# Python for Data Analytics
## Week 1: Variables and Scalar data types

Acknowledgement:

This notebook is based on open teaching materials from the World Bank (https://github.com/worldbank) and examples from McKinney's Python for Data Analysis (https://wesmckinney.com/book/).

## Jupyter Notebooks and Colab environment
* Make sure you're signed in to your Google account, then click Connect in the top right corner.

#### Notebooks

Notebooks comprise two types of cells:
* _Code cells._ These contain executable commands in Python.
* _Text cells._ These include plain text, or you can use [markdown](https://commonmark.org/help/) to add formatting.

__EXERCISE:__ Spend a couple of minutes learning to navigate Colab. Perform the following:
 * Add a new code cell, first with the point-and-click method, then using the keyboard shortcut.
 * Write your first program: `print("hello world!")`
 * Run the program two ways: using CTRL-ENTER and SHIFT-ENTER. Note the difference in which cell is selected.
 * Delete the code cell when you're finished with it.

#### Keyboard shortcuts

Action | Colab Shortcut
---|---
Execute current cell | `<CTRL-ENTER>`
Execute current cell and move to next cell | `<SHIFT-ENTER>`
Insert cell above | `<CTRL-M> <A>`
Append cell below | `<CTRL-M> <B>`
Convert cell to code | `<CTRL-M> <Y>`
Convert cell to Markdown | `<CTRL-M> <M>`
Delete cell | `<CTRL-M> <D>`
Autocomplete | `<TAB>`
Go from edit to "command" mode | `<ESC>`
Go from "command" to edit mode | `<ENTER>`
<p align="center"><b>Note:</b> On OS X use `<COMMAND>` instead of `<CTRL>`</p>

## Variables and Math in Python

### Math operators

In [23]:
# add two integers
2 + 2

4

In [24]:
# multiply two integers
2 * 2

4

In [25]:
# spaces around operators don't change the result, but keep them consistent (PEP 8 good practice)
2*3   +   10

16

In [26]:
# divide two integers
6 / 3

2.0

In [27]:
# raise 2 to the 4th power
2 ** 4

16

In [28]:
# the modulo operator (%) returns the remainder after division. Useful to check divisibility (among other things)
10 % 3

1

| Symbol | Task performed |
|----|---|
| +  | Addition |
| -  | Subtraction |
| /  | Division |
| *  | Multiplication |
| **  | Exponentiation (to the power of) |
| %  | Modulo (remainder after division) |
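
As a small illustration of the divisibility check mentioned above: a number divides evenly when the remainder is zero. The numbers below are arbitrary examples chosen for this sketch.

```python
# a remainder of 0 means the first number divides evenly by the second
print(12 % 4)   # 0, so 12 is divisible by 4
print(12 % 5)   # 2, so 12 is not divisible by 5

# dividing by 2 tells odd and even numbers apart
print(7 % 2)    # 1, so 7 is odd
print(8 % 2)    # 0, so 8 is even
```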

### Variables

In [29]:
# variables, such as x here, contain values and their values can vary
x = 5

In [30]:
# to inspect a value, just call it
x

5

In [31]:
# you can perform calculations on variables
x + 3

8

In [32]:
# what's the value of x now?
x

5

In [33]:
# to update the value of a variable, you have to do assignment again
x = x + 3

In [34]:
# now what's the value of x?
x

8

In [35]:
# create a new variable y from a calculation involving x
y = x + 2
y

10

In [36]:
# to modify a variable in place through addition or subtraction, use the shorthand += or -=
x += 10
x

18

In [37]:
# calling two variables only displays the last one
x
y

10

In [38]:
# use the print() function to output value(s) to the console
print(x)
print(y)

18
10


In [39]:
# separate two values by commas to output on the same line
print(x,y)

18 10


In [40]:
# you can also print the output of an expression
print(x * y)

180


NOTE: Use valid variable names!
* Variable names can contain letters, numbers, and the underscore character.
* You can't begin variable names with a digit, or use any of Python's _reserved words_ (e.g. False, None, else, class, ...).
* Avoid the names of built-ins such as list or zip: they are legal variable names, but assigning to them hides the built-in type or function.
* Don't use a space in the middle of a variable name.

| result | variable name |
|----|----|
| Valid | my_float, xyz_123, zip_code |
| Error! | my float, 123_xyz, class |
| Valid, but hides a built-in | zip, list |
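
One possible way to check a candidate name without trial and error is sketched below. It relies on the standard-library `keyword` module and the string method `.isidentifier()`; the list of candidate names is made up for this example. Note that built-in names such as `zip` pass both checks: they are not reserved words, although reusing them hides the built-in.

```python
import keyword

# hypothetical candidate names chosen for this sketch
candidates = ["my_float", "xyz_123", "my float", "123_xyz", "class", "zip"]

results = {}
for name in candidates:
    # isidentifier() checks the characters; iskeyword() checks the reserved words
    is_valid = name.isidentifier() and not keyword.iskeyword(name)
    results[name] = is_valid
    print(name, "->", is_valid)
```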

### Getting Help

In [41]:
# get IPython help on an object by putting ? after it
len?
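
The `?` syntax is an IPython/Colab feature. In a plain Python interpreter, the built-in `help()` function and an object's `__doc__` attribute give similar information; a minimal sketch:

```python
# help() prints the full documentation of an object
help(len)

# the docstring alone is stored in the __doc__ attribute
print(len.__doc__)
```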

In [42]:
# press <TAB> after a partial name to complete statements, functions and methods (running the cell as typed raises a NameError)
prin

NameError: name 'prin' is not defined

In [None]:
# also use it to complete variables or functions that you defined yourself
name_of_course = "Python for Data Analytics"

In [None]:
name_of_cou

## Basic data types: int, float, string, Boolean
These object types are the most basic building blocks when handling data in Python. Note that Python is an object-oriented language. Each object has a type, which determines what can be done with it. For instance, an object of type _int_ can be added to another _int_.

In statically typed, compiled languages like C++, the programmer has to declare the type of any variable before using it. By contrast, Python will **infer the type of variable you want** at run-time. It does this based on how you write the value: for example, whether it contains a decimal point, or is surrounded by quote marks or brackets. This keeps the syntax much more 'natural' - but take care to learn the rules your Python interpreter applies.
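
A minimal sketch of this inference, using arbitrary example values (the variable name `value` is made up for this illustration):

```python
# the way a value is written determines the type Python gives it
print(type(10))      # <class 'int'>: digits only
print(type(10.0))    # <class 'float'>: a decimal point
print(type("10"))    # <class 'str'>: quote marks
print(type([10]))    # <class 'list'>: square brackets

# a name is not tied to one type; it can later be bound to a value of another type
value = 10
value = "ten"
print(type(value))   # <class 'str'>
```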

In [None]:
# integers are whole numbers
x = 10
type(x)

In [None]:
# floats are floating point (or decimal) numbers
y = 4.25
type(y)

In [None]:
# strings are sequences of characters, denoted by single or double quotes
course_name = 'Python for Data Science'

In [None]:
# the possible values for a Boolean are True or False

my_enrollment_status = True
type(my_enrollment_status)

In [None]:
# use isinstance to check an object's type (answer is a Boolean)
isinstance(course_name, int)
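
Two details of `isinstance` that may be worth knowing, sketched with arbitrary values: it accepts a tuple of types, and Booleans count as integers.

```python
# a tuple of types: the result is True if the object matches any of them
print(isinstance(4.25, (int, float)))   # True

# bool is a special kind of int, so True behaves like 1 in arithmetic
print(isinstance(True, int))            # True
print(True + True)                      # 2
```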

### Manipulating strings

In [None]:
# this is a string. It can be assigned to a variable. 
mystring = 'I am a string. Humans can interpret me easily'

In [None]:
# we can print this string as follows:
print(mystring)

In [None]:
# Data types are defined as classes. Classes have methods attached to them, which you can access with dot notation.
# Example: strings have a method '.split()' that returns a list of component parts. 

mystring.split('.')

In [None]:
# we can use this to access just part of the string
split_up_list = mystring.split('.')
split_up_list[1]

In [None]:
# strings are sequences - we can print single characters or chunks, depending on how we index or slice them:
print(mystring[0])
print(mystring[3])

In [None]:
# a slice [start:stop] runs from index start up to, but not including, index stop
for q in range(0, 15):
    print(mystring[q:15])

In [None]:
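# a negative stop counts from the end of the string; -0 is just 0, so the first slice [15:0] is empty
for q in range(0, 15):
    print(mystring[15:-q])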
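
The slicing rules used in the two loops above can be seen more easily on a short string. This is a small sketch; the variable `word` and its value are chosen only for illustration.

```python
word = "python"   # example value chosen for this sketch

print(word[0])    # p: indexing starts at 0
print(word[-1])   # n: negative indices count from the end
print(word[0:2])  # py: characters 0 and 1 (the stop index is excluded)
print(word[:2])   # py: an omitted start means "from the beginning"
print(word[2:])   # thon: an omitted stop means "to the end"
print(word[:-2])  # pyth: everything except the last two characters
print(word[2:0])  # empty line: the stop does not come after the start
```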

In [None]:
# the .replace() method is handy too. This operation can be chained for entertainment value:
print(mystring)

new_string = mystring.replace("I am","Nicholas is").replace("string", "human").replace("Humans", "Other humans").replace("me","him")

print(new_string)
print(new_string.replace('Nicholas', 'Charles').replace('easily','from time to time'))

In [None]:
# strings can be added (concatenated) together
add_chunk = '. I love strings'
mystring + add_chunk

In [None]:
# ... but not subtracted (this raises a TypeError)
subtract_bit = 'easily.'
mystring - subtract_bit

In [None]:
# the backslash is special in strings - it is called an escape character. 
# It does a number of different things depending on the next letter:
print('using \n generates a new line!')

In [None]:
print('using \t generates a tab!')

In [None]:
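# you can tell Python to ignore escape sequences by adding 'r' to the start of a string (a "raw" string).
# without it, \U is read as the start of a Unicode escape and this line raises a SyntaxError:
string_will_fail = 'C:\Users\Student001'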

In [None]:
string_will_work = r'C:\Users\Student001'
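
An alternative to the raw string is to escape each backslash with a second backslash. The sketch below compares the two spellings; the variable names `raw_path` and `escaped_path` are made up for this example.

```python
raw_path = r'C:\Users\Student001'        # raw string: backslashes are kept as typed
escaped_path = 'C:\\Users\\Student001'   # regular string: each \\ stands for one backslash

print(raw_path == escaped_path)   # True: both spell the same 19 characters
print(len(raw_path))              # 19

# the difference is visible in the length of a string containing \t
print(len('a\tb'))    # 3: \t is a single tab character
print(len(r'a\tb'))   # 4: the backslash and the t are separate characters
```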

In [None]:
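# the % operator inserts a value into a string: the placeholder %s is replaced by the value given after the %
import time
the_time = time.ctime()
print('the date and time is currently: %s' % the_time)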

In [None]:
# be aware of the letter after the first percent sign - it changes the nature of the string formatting:
q = 22 / 7

print('%d' % q)    # d: integer (the fractional part is dropped)
print('%f' % q)    # f: float
print('%e' % q)    # e: exponential
print('%s' % q)    # s: string

In [None]:
# a more recent and easy-to-use way to print variables is with f-strings (formatted string literals)
# to keep your outputs neat, you might want to limit the decimal places

print(f"My output: {q}")
print(f"My output: {q:.3f}")
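
For comparison, the same value can be formatted with the `%` operator, the `str.format()` method, or an f-string. A short sketch (the variable names are chosen for this example):

```python
q = 22 / 7

old_style = 'q is %.3f' % q                # % operator
format_method = 'q is {:.3f}'.format(q)    # str.format() method
f_string = f'q is {q:.3f}'                 # f-string (Python 3.6 and later)

print(old_style)                                # q is 3.143
print(old_style == format_method == f_string)   # True: all three give the same text
```

In new code, f-strings are usually the most readable of the three.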

### Converting between types
Often you need to convert variables to other types, especially to make them work together. Use the _int()_, _str()_ or _float()_ functions to convert to these data types.
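
A minimal sketch of these three functions, using made-up values of the kind you might read from a file or a form:

```python
count = int("42")       # text to integer
price = float("3.50")   # text to float
label = str(42)         # integer to text

print(count + 1)        # 43
print(price * 2)        # 7.0
print(label + "!")      # 42!

# int("3.50") would raise a ValueError, because the text is not a whole number;
# converting to float first and then to int works
print(int(float("3.50")))   # 3
```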

In [None]:
# sometimes Python will change a variable's data type for you. Take this variable:
my_salary = 500000
print(f"Variable my_salary has value {my_salary} and type {type(my_salary)}")

In [None]:
# now divide it by another integer. The result is necessarily a float:
daily_rate = my_salary / 365
print(f"Variable daily_rate has value {daily_rate} and type {type(daily_rate)}")

In [None]:
# changing a float to an integer lops off everything after the decimal point
int(daily_rate)

In [None]:
# to round to the nearest integer instead (here 1370), use round()
round(daily_rate)
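
Two further properties of `round()` that are easy to trip over, shown as a small sketch: an optional second argument sets the number of decimal places, and values exactly halfway between two integers are rounded to the nearest even one.

```python
daily_rate = 500000 / 365

print(round(daily_rate, 2))   # 1369.86: keep two decimal places
print(round(2.5))             # 2: a tie goes to the nearest even integer
print(round(3.5))             # 4
```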

In [None]:
# you can't concatenate a string and an integer

address = "1808 H ST NW, DC"
WB_zip = 20037

address + " " + WB_zip

In [None]:
# instead, change the integer to a string first
WB_zip = str(WB_zip)
type(WB_zip)

In [None]:
# does it work now?
address + " " + WB_zip
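
As an alternative sketch, an f-string converts the integer to text for you, so no explicit `str()` call is needed. The variable `full_address` is introduced only for this example.

```python
address = "1808 H ST NW, DC"
WB_zip = 20037   # still an integer here

# values inside {} are converted to text automatically
full_address = f"{address} {WB_zip}"
print(full_address)   # 1808 H ST NW, DC 20037
```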

# Exercises (homework)

Please organise your solutions in a separate notebook and upload both files to the LMS (Moodle):
* Notebook (.ipynb)
* PDF export (File -> Download as -> PDF)

The PDF export requires Pandoc, which can be installed by running

_conda install -c conda-forge pandoc_

in the Anaconda prompt.

### Exercise 1.
Calculate the sum and product of two integers

In [None]:
number1 = 5

number2 = 7

# YOUR CODE HERE

# Expected result: "Sum of 5 and 7 is 12; product of 5 and 7 is 35" printed

### Exercise 2.
Repeat the string n times (Hint: check the * operator for strings)

In [None]:
times = 5
s = "Hey! "


# YOUR CODE HERE

# Expected result: "Hey! Hey! Hey! Hey! Hey! " printed

### Exercise 3.
Calculate the age given the year of birth

In [None]:
birth_year = 1980
# YOUR CODE HERE

# Expected result: "Your age is 41 or 42 years"

### Exercise 4.
Divide two integers with a remainder

In [None]:
dividend = 17
divisor = 6
# YOUR CODE HERE

# Expected result: "The quotient is 2 and the remainder is 5"

### Exercise 5.
Print pi (math.pi) with 10 decimal places

In [None]:
import math
# YOUR CODE HERE

# Expected result: "Pi is 3.1415926536"

### Exercise 6.
Introduce variables with current exchange rates and convert 10 euro to pounds (GBP) and dollars (USD)

In [None]:
value = 10 # euro
# YOUR CODE HERE

# Expected result (approx.): "10 euro is 8.68 pounds (GBP) or 10.02 dollars (USD)"

### Exercise 7.
Print a string one character per line

In [None]:
name = "Dmitry" # replace with your name
# YOUR CODE HERE

# Expected result:
# D
# m
# i
# t
# r
# y

### Exercise 8.
Calculate the age given the date of birth

In [None]:
birth_date = "1980-Jul-6"
# YOUR CODE HERE

# Expected result: "Your age is 42 years"