# Week 1: Overview of Python

**Sources:**

- Python for Marketing Research and Analytics. J. Schwarz, C. Chapman, and E.M. Feit. Springer 2020.

## 2. Basic Python Types

- Almost all entities in Python are objects (strings, classes, functions, etc.)
- Python is a strongly but dynamically typed language: types belong to objects, not to variable names. This means (1) a name can be rebound to an object of a different type (e.g. from a number to a string), and (2) many basic operators are overloaded, so their behavior depends on the operand types (e.g. the + operator)
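
The two points above can be illustrated with a minimal sketch. The variable names here are made up for illustration; the last part shows that Python does not silently convert between unrelated types.

```python
# A name can be rebound to objects of different types.
value = 42
first_type = type(value).__name__    # 'int'
value = 'forty-two'
second_type = type(value).__name__   # 'str'

# The + operator is overloaded: what it does depends on the operand types.
number_sum = 2 + 3          # integer addition
text_sum = 'ab' + 'cd'      # string concatenation
list_sum = [1, 2] + [3]     # list concatenation

# Mixing a string and an integer with + raises an error instead of converting.
try:
    result = 'total: ' + 5
except TypeError as err:
    result = 'TypeError: {}'.format(err)

print(first_type, second_type)
print(number_sum, text_sum, list_sum)
print(result)
```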



### 2.1 Numeric Types

- Python has three built-in numeric types: int, float, and complex
- float: floating-point numbers, which approximate real numbers. Unlike ints, floats can represent decimal values
- int: whole numbers, which are stored exactly and can be arbitrarily large, whereas floats have limited precision
- complex: complex numbers, which include an imaginary component
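
The cells below cover int and float only, so here is a small sketch of the complex type and of the difference between exact ints and approximate floats. The variable names are illustrative.

```python
# complex: written with a j suffix on the imaginary part
c = 3 + 4j
real_part = c.real      # real component, stored as a float
imag_part = c.imag      # imaginary component, stored as a float
magnitude = abs(c)      # distance from the origin
shifted = c + 1         # arithmetic with an int yields a complex

# int vs float: / always returns a float, // is floor division
true_div = 7 / 2
floor_div = 7 // 2

# floats are approximations, so some decimal sums are not exact
exact_sum = (0.1 + 0.2 == 0.3)

print(type(c).__name__, real_part, imag_part, magnitude, shifted)
print(true_div, floor_div, exact_sum)
```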

In [1]:
# int

x = 2
y = 4
x+y # addition operator

6

In [2]:
type(x)

int

In [3]:
# float

w = x/y # division operator 
w

0.5

In [4]:
type(w) # object type

float

In [5]:
x**y # exponentiation

16

In [6]:
z = 3.2
type(x*z)

float

### 2.2 Sequence Types

- Python has three basic sequence types: lists, tuples, and ranges
- Each sequence type is an ordered collection of objects

#### 2.2.1 Lists

- Lists are ordered, mutable sequences of objects
- Defined with square brackets []

In [7]:
x = [0, 1, 2, 3, 4, 5]
y = ['a', 'b', 'c']

- When we add two lists, we concatenate them together

In [8]:
x+y

[0, 1, 2, 3, 4, 5, 'a', 'b', 'c']

In [9]:
type(x)

list

In [10]:
type(x+y)

list

- Use the append() method to add an element to the end of the list
- We pass the element we want to add as an argument to the method append(object)

In [11]:
x.append('r')
print(x)

[0, 1, 2, 3, 4, 5, 'r']


- Lists can contain a mix of types (e.g. int and strings)

In [12]:
x

[0, 1, 2, 3, 4, 5, 'r']

- Use the sort() method to sort the list numerically or alphabetically

In [13]:
a = [5, 3, 7, 2, 4, 1]
a.sort()
print(a)

[1, 2, 3, 4, 5, 7]
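
As a complement, the sketch below contrasts the in-place sort() method with the built-in sorted() function, which returns a new list. It also shows that a list mixing numbers and strings cannot be sorted in Python 3. The variable names are illustrative.

```python
numbers = [5, 3, 7, 2, 4, 1]

# sorted() returns a new list and leaves the original untouched
ascending = sorted(numbers)
descending = sorted(numbers, reverse=True)

# sort() modifies the list in place and returns None
in_place_result = numbers.sort()

# strings are sorted alphabetically
words = ['pear', 'apple', 'fig']
words.sort()

# ints and strings cannot be compared with <, so sorting a mixed list fails
mixed = [0, 1, 'r']
try:
    mixed.sort()
    mixed_sortable = True
except TypeError:
    mixed_sortable = False

print(ascending, descending, in_place_result)
print(words, mixed_sortable)
```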


- We can find the length of a list using Python's built-in len() function

In [14]:
len(x)

7

- List indices start at 0
- The index represents the position of the element in the list
- The element with index 0 is the first element and the element with index 1 is the second, so the following line of code returns the second element


In [15]:
x[1]

1

- We can index a range of values (a slice) using the : operator
- In the code below, we retrieve the elements starting from index 2 up to but not including index 4 (i.e. indices 2 and 3)
- In Python slices, the lower bound is inclusive and the upper bound is exclusive

In [16]:
x[2:4]

[2, 3]

- To start from the beginning of the list, the start index can be omitted

In [17]:
x[:2]

[0, 1]

- We can index the list all the way till the end by not specifying the end index

In [18]:
x[1:]

[1, 2, 3, 4, 5, 'r']

- Negative indices are relative to the end of the list
- The code below retrieves the last two elements of the list x

In [19]:
x[-2:]

[5, 'r']

- In a programming language that does not support negative indices the way Python does, we would compute the same slice as follows:

In [20]:
x[len(x)-2:len(x)]

[5, 'r']

- Lists are mutable
- Mutable means that we can append elements and substitute elements (i.e. change the content)


In [21]:
x[2] = 'freez'
x

[0, 1, 'freez', 3, 4, 5, 'r']

#### 2.2.2 Tuples

- Tuples are similar to lists with one major caveat: they're immutable
- Tuples are defined with parentheses ()

In [22]:
#tuples
z = (7,8,9)

- We index tuples just like lists

In [23]:
z[1]

8

- Attempting to modify a tuple leads to an error

In [24]:
# z[1] = 'boil'
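
To see the error without stopping the notebook, the assignment can be wrapped in a try/except block. This is a minimal sketch; `message` and `z_new` are illustrative names.

```python
z = (7, 8, 9)

# Item assignment on a tuple raises a TypeError
try:
    z[1] = 'boil'
except TypeError as err:
    message = str(err)

print(message)

# To "change" a tuple, build a new one from slices of the old one
z_new = z[:1] + ('boil',) + z[2:]
print(z_new)
```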

#### 2.2.3 Ranges

- Ranges are immutable sequences of numbers
- Mostly used with for loops (discussed in the Control Flow section below)
- range() takes up to three positional arguments (start, stop, and step)
- In the code below, we create a range object. The range of values starts at 5, stops at 30 (not inclusive), with steps of 2

In [25]:
range(5, 30, 2)

range(5, 30, 2)

In [26]:
list(range(5, 30, 2))

[5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29]

- Only the stop argument is required
- If only the stop argument is provided, the range starts at 0 and increments by 1 up to, but not including, that value
- In the following code, we specify stop = 10, so we get a range of numbers starting at 0 and ending at 9 (10 is not included)

In [27]:
range(10)

range(0, 10)

In [28]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

- In the following code, we start at 2 and end at 12 (not inclusive, so we stop at 11)

In [29]:
list(range(2,12))

[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

### 2.3 Text Sequence Type

- Python has a type for text: str (string)
- Strings can be specified using 'single', "double", or '''triple''' quotes
- The following code concatenates two string objects using the + operator

In [30]:
x = 'Hello'
y = "World!"

x+y

'HelloWorld!'

- We can index the string similar to lists using square brackets []

In [31]:
x[3:]

'lo'

- String objects have many string-specific methods
- In the following code, we use the lower() method to convert the letters to lower case, and the upper() method to convert them to UPPER case

In [32]:
x.lower()

'hello'

In [33]:
x.upper()

'HELLO'

- Strings are immutable: string methods return a new string and leave the original unchanged
- We can use the replace() method to get a copy of the string with part of it replaced
- In the following line of code, we replace the 'lo' portion of the string 'Hello' with the letter 'p'

In [34]:
x.replace('lo','p')

'Help'
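
A quick check confirms that replace() returns a new string and leaves the original alone, and that assigning to a position in a string is not allowed. The names `replaced` and `message` are illustrative.

```python
x = 'Hello'

# replace() returns a new string; x itself is not modified
replaced = x.replace('lo', 'p')
print(replaced, x)

# Item assignment on a string raises a TypeError
try:
    x[0] = 'J'
except TypeError as err:
    message = str(err)

print(message)
```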

- A list of strings can be joined on another string

In [35]:
', '.join([x,y, 'what a day!'])

'Hello, World!, what a day!'

- A string can be split on a delimiter
- In the following code, we split the string 'Hello, world, what a day!' on the comma delimiter ','

In [36]:
'Hello, world, what a day!'.split(',')

['Hello', ' world', ' what a day!']

- The format() method is used to insert values from variables into a string
- The substitution locations are specified using {}
- The values to be substituted in are passed as arguments to format()

In [37]:
temperature = 21.34

'The temperature today is {} degrees'.format(temperature)

'The temperature today is 21.34 degrees'

- We can also specify names for each substitution

In [38]:
x = 18.93
y = 345.234

'{x} plus {x} plus {y} equals {r}'.format(x = x, y = y, r = x + x + y)

'18.93 plus 18.93 plus 345.234 equals 383.094'
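
As an aside not covered in the cells above, the same substitutions can be written with f-strings (available in Python 3.6 and later), and a format specification such as `:.1f` controls the number of decimals. A minimal sketch, with illustrative variable names:

```python
temperature = 21.34
x = 18.93
y = 345.234

# f-string: the variable is referenced directly inside the braces
plain = f'The temperature today is {temperature} degrees'

# format specification: one digit after the decimal point
rounded = 'The temperature today is {:.1f} degrees'.format(temperature)

# expressions and format specifications can be combined in an f-string
named = f'{x} plus {x} plus {y} equals {x + x + y:.2f}'

print(plain)
print(rounded)
print(named)
```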

### 2.4 Booleans

- A boolean (or bool) can have only one of two values: `True` or `False`
- Bools are often produced from comparisons




In [39]:
1==1 # Is 1 equal to 1?

True

In [40]:
1 < 2 # Is 1 less than 2?

True

In [41]:
1 == 2 # Is 1 equal to 2?

False

- We can assign the result of a comparison to a variable

In [42]:
x = 1 == 1
x

True

- Bools can also be combined using the `and`, `or`, and `not` operators
- Bools are used a lot in control flow statements (see the Control Flow section below)
- Bools are also used when indexing dataframes (in future class sessions)

In [43]:
x = True
y = False 
x or y

True

In [44]:
x and y

False

In [45]:
x and not y

True

### 2.5 Mapping Types (Dictionaries)

- Dictionaries (dicts) are data structures that use one object (the key) to index another object (the value)
- Lists and tuples can store any object, but only under an integer index
- A dictionary holds two kinds of objects: keys and values
- Dictionaries are very efficient: looking up a value by its key is fast even in large dictionaries
- In the following line of code, we create a dictionary using the `dict()` function
  - Keys: a, b, and c
  - Values: 1, 2, 3

In [46]:
x = dict(a = 1, b = 2, c = 3)
x

{'a': 1, 'b': 2, 'c': 3}

- We can also define a dictionary using curly brackets

In [47]:
x = {'a': 1, 'b':2, 'c':3}
x

{'a': 1, 'b': 2, 'c': 3}

- Just like lists and tuples, dictionaries are indexed with square brackets, but using keys instead of integer positions

In [48]:
x['a']

1

- The key-value pairs can be accessed directly as tuples using the `items()` method

In [49]:
x.items()

dict_items([('a', 1), ('b', 2), ('c', 3)])

- Keys in a dictionary can be accessed using the `keys()` method

In [50]:
x.keys()

dict_keys(['a', 'b', 'c'])

- Values in a dictionary can be accessed using the `values()` method

In [51]:
x.values()

dict_values([1, 2, 3])
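
The cells above only read from a dictionary. The sketch below shows a few other common operations: adding and updating entries, testing for a key, and reading a missing key safely with get(). The `prices` dictionary and its contents are made up for illustration.

```python
# Hypothetical example data: item names mapped to prices
prices = {'apple': 0.5, 'bread': 2.25}

prices['milk'] = 1.2     # assigning to a new key adds an entry
prices['apple'] = 0.6    # assigning to an existing key updates its value

has_bread = 'bread' in prices      # membership tests look at the keys
missing = prices.get('eggs', 0.0)  # get() returns a default instead of raising KeyError

# items() yields (key, value) tuples, which can be unpacked in a loop
for item, price in prices.items():
    print(item, price)
```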

## 3. Control Flow

- Control flow is the order in which the statements in the program are evaluated
- There are two types of control flow: conditionals and loops


### 3.1 If statement

- Conditional statements use boolean conditions to create branch points in the code
- The condition is evaluated using boolean logic. If the result is True, the indented block that follows is executed; if it is False, that block is skipped
- In the following piece of code, the condition x > 2 evaluates to True since the value of x is 5, and 5 is greater than 2

In [52]:
x = 5
if x > 2:
  print('x = {}, which is greater than 2'.format(x))
print('Done!')

x = 5, which is greater than 2
Done!


- In the following piece of code, the condition evaluates to False because 0 is not greater than 2

In [53]:
x = 0
if x > 2: 
  print('x = {}, which is greater than 2'.format(x))
print('Done')

Done


- `if` statements often include a paired `else` statement
- In the following piece of code, the `else` block will be executed since 0 is not greater than 2

In [54]:
x = 0
if x > 2:
  print('x = {}, which is greater than 2'.format(x))
else:
  print('x = {}, which is less than or equal to 2'.format(x))

x = 0, which is less than or equal to 2


- There is also an `elif` (else if) statement, whose condition is evaluated only if all previous `if` or `elif` conditions evaluated to False
- In the following code, x is not greater than 2, so the `elif` condition is checked; x equals 2, so the `elif` block is executed. The `else` block would run only if all the previous `if` and `elif` conditions evaluated to False

In [55]:
x = 2
if x > 2:
  print('x = {}, which is greater than 2'.format(x))
elif x == 2:
  print('x = {}, which equals 2!'.format(x))
else:
  print('x = {}, which is less than 2'.format(x))

x = 2, which equals 2!


### 3.2 For loop statement

- A `for` loop iterates through an *iterable*, which is a collection of objects: e.g. lists, tuples, strings, and sets
- In the following code, we iterate through a collection of integer values in the list `a`
- Note that the iteration is through the elements of the list directly, and not through an index into the list

- An *iterator* is an object that contains a countable number of values (source: W3 Schools)
- An *iterator* is an object that can be iterated upon, meaning that you can traverse through all the values (source: W3 Schools)
- Python obtains an iterator from the iterable behind the scenes when a `for` loop starts
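
What a `for` loop does with an iterator can be imitated by hand with the built-in functions iter() and next(). This is only a sketch to make the idea concrete; the variable names are illustrative.

```python
a = [4, 2, 5]

# iter() asks the list for an iterator over its elements
it = iter(a)

# next() returns one value at a time, in order
first = next(it)
second = next(it)
third = next(it)

# When no values are left, next() raises StopIteration,
# which is the signal a for loop uses to stop
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```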
 

[For loop flow chart](https://cdn.techbeamers.com/wp-content/uploads/2018/08/Regular-Python-for-loop-flowchart.png)

In [56]:
# given a list a, find the square of each value in a
a = [4, 2, 5, 1, 12, 13]
a_squared = []

for x in a:
    print(x**2)

16
4
25
1
144
169


In [57]:
a = [4, 2, 5, 1, 12, 13]
a_squared = []

for x in a:
  a_squared.append(x**2)

print('a_squared generated: {}'.format(a_squared))

a_squared generated: [16, 4, 25, 1, 144, 169]


- We can also iterate through a sequence of numbers using the `range()` function to produce an iterable

In [58]:
for i in range(5):
  print(i)

0
1
2
3
4


In [59]:
type(range(5))

range

In [60]:
for i in range(21, 100, 12):
  print(i)

21
33
45
57
69
81
93


- We can iterate through a list indirectly by using a range of numbers as indices into the list

In [61]:
a_squared = []

for i in range(len(a)):
  a_squared.append(a[i]**2)

print('a_squared generated: {}'.format(a_squared))

a_squared generated: [16, 4, 25, 1, 144, 169]


- The `zip()` function "zips" together two collections and iterates through a pair of values
- In the following code, we define two ranges, and then we iterate through pairs of values using the `zip()` function

In [62]:
range1 = range(6)
range2 = range(6, 18, 2)

for x, y in zip(range1, range2):
  print(x, y)

# for x, y in zip(range(6), range(6, 18, 2)):
#   print(x, y)

0 6
1 8
2 10
3 12
4 14
5 16


- If one of the collections in the `zip()` function is shorter, then the iteration proceeds for the length of the shorter collection
- In the following, `range(6)` produces 6 values and `range(6, 12, 2)` produces 3 values, so the output of the `zip()` function has length 3 (since 3 is less than 6)

In [63]:
for j, k in zip(range(6), range(6, 12, 2)):
  print(j, k)

0 6
1 8
2 10


- The `enumerate()` function returns not only each value from a collection, but also its index

In [64]:
print(a)

[4, 2, 5, 1, 12, 13]


In [65]:
for i, x in enumerate(a):
  print(i, x)

0 4
1 2
2 5
3 1
4 12
5 13


### 3.3 List Comprehension

- List comprehension is a concise syntax for generating a list from another list
- In the following example, the code takes a list of numbers as input and produces a new list where each element has been incremented by one


In [66]:
a = [4, 2, 5, 1, 12, 33]

In [67]:
a_plus_one = [] # empty list

for x in a:
  a_plus_one.append(x+1)

a_plus_one

[5, 3, 6, 2, 13, 34]

- Instead of (1) instantiating an empty list, (2) writing a `for` statement to iterate through the source list, and (3) writing statements to append to the new list, all of those operations can be done in a single line using list comprehension
- List comprehension takes the following form

`newlist = [expression for item in iterable if condition == True]`

- The return value is a new list, leaving the old list unchanged (source: w3 schools)

- The *iterable* can be any iterable object, like a list, tuple, set etc.(source: w3 schools)
- The *expression* is the current item in the iteration, but it is also the outcome, which you can manipulate before it ends up like a list item in the new list (source: w3 schools)


In [68]:
a_plus_one = [x+1 for x in a]
a_plus_one

[5, 3, 6, 2, 13, 34]

- The *condition* is like a filter that only accepts the items that evaluate to `True` (source: w3 schools)
- The condition below, `x < 12`, includes only the elements that are less than 12 in the result


In [69]:
a_plus_one_filtered = [x+1 for x in a if x < 12]
a_plus_one_filtered

[5, 3, 6, 2]

- If we want different behavior based on a particular condition (an if-else expression), we can place the `if else` expression before the `for`

In [70]:
a_modified = [x+1 if x <12 else x*100 for x in a]
a_modified

[5, 3, 6, 2, 1200, 3300]

- The following code generates a list of tuples, where each tuple pairs a number with its square

In [71]:
a_square_tuples = [(v, v**2) for v in a]
a_square_tuples

[(4, 16), (2, 4), (5, 25), (1, 1), (12, 144), (33, 1089)]

- In the following code, we iterate over a list of tuples, unpacking each tuple into `v` and `w`

In [72]:
a_reconstructed = [w/v for v, w in a_square_tuples]
a_reconstructed

[4.0, 2.0, 5.0, 1.0, 12.0, 33.0]

- We can generate dictionaries as follows using dictionary comprehension

In [73]:
a_square_dict = {v: v**2 for v in a}
a_square_dict

{4: 16, 2: 4, 5: 25, 1: 1, 12: 144, 33: 1089}

### 3.4 While loop statement

- Loops allow one to run the same code repeatedly while systematically changing specific variables
- `while` loops will iteratively run the code as long as the loop condition is True
- In the code below, we initialize the variable x to 0
- The variable x is used to control the flow of the loop
- The following loop produces a sequence of integers starting at 0 up to 5 (not inclusive, so the count is to 4)

In [74]:
x = 0

while x <5: 
  print(x)
  x += 1

0
1
2
3
4


- The following loop runs until `i` is no longer less than `len(a)`, i.e. the length of the list `a`

In [75]:
a = [4, 2, 5 ,1, 12, 33]
a_squared = []
i = 0 

while i < len(a):
  a_squared.append(a[i]**2)
  i += 1
print('a_squared generated: {}'.format(a_squared))

a_squared generated: [16, 4, 25, 1, 144, 1089]


In [76]:
x = 34
y = x - 1

while True:
  if x % y == 0:
    break
  y -= 1

print('{y} is the largest factor of {x}, \n{f2} times {y} equals {x}'.format(y = y, x = x, f2 = x/y))

17 is the largest factor of 34, 
2.0 times 17 equals 34
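
The last cell uses `while True` with a `break` statement, which exits the loop as soon as a divisor is found. One possible variation, sketched below, wraps the search in a function and uses floor division (`//`) so that the cofactor prints as the integer 2 rather than the float 2.0. The function name `largest_proper_factor` is hypothetical, and the sketch assumes `n` is an integer of at least 2.

```python
def largest_proper_factor(n):
    """Return the largest factor of n that is smaller than n (assumes n >= 2)."""
    candidate = n - 1
    # Count down until a divisor is found; 1 divides every integer,
    # so the loop condition guarantees termination.
    while candidate > 1:
        if n % candidate == 0:
            break  # leave the loop at the first (largest) divisor
        candidate -= 1
    return candidate


x = 34
y = largest_proper_factor(x)
cofactor = x // y  # floor division keeps the result an int

message = '{y} is the largest factor of {x}, \n{f2} times {y} equals {x}'.format(
    y=y, x=x, f2=cofactor)
print(message)
```

For a prime number such as 13 the function returns 1, since no larger proper factor exists.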
