# Week 7

### Table of Contents
1. [Introduction to characters and strings](#bullet1)
2. [Printing](#bullet2)
3. [Count number of characters](#bullet3)
4. [String manipulation](#bullet4)
5. [Sets: Another Python Data Object](#bullet5)
6. [Formatting Strings](#bullet6)
7. [String manipulation with Regex](#bullet7)
8. [Decorators](#bullet8)
9. [Classes](#bullet9)
10. [Modules](#bullet10)

In [1]:
import pandas as pd
import numpy as np

## 1. Introduction to characters and strings <a class="anchor" id="bullet1"></a>

In [3]:
# use quotation marks to assign a string/char to an object

a = "learning to create"
b = "a"
c = 'character string'
d = 3.14

##### You can use single or double quotation marks; it makes no difference. If you want quotation marks inside a string, you have to pay a bit more attention: wrap the string in the other kind of quotation mark, or escape the inner ones.

In [9]:
quote = "Steve Jobs was a believer in Da Vinci's saying \n'Simplicity is the ultimate sophistication.'"
print(quote)

Steve Jobs was a believer in Da Vinci's saying 
'Simplicity is the ultimate sophistication.'


##### Check for string data objects with `type()`; convert objects to str with `str()`

In [10]:
print(type(a))
print(type(d))
print(type(a) == str)
print(type(d) == str)

<class 'str'>
<class 'float'>
True
False


In [11]:
d = str(d)
print(d)
print(type(d))

3.14
<class 'str'>


##### Ways to combine strings:

In [12]:
a + b + c  # the + operator concatenates strings with no separator

'learning to createacharacter string'

In [13]:
a + ' ' + b + ' ' + c  # same operator, but with spaces added between the pieces


'learning to create a character string'

##### To use the + operator for string concatenation, both operands must be strings

In [14]:
'hello' + 4

TypeError: can only concatenate str (not "int") to str

In [15]:
### thus, we have to coerce numeric data types to be a string to concatenate 
'hello' + str(4)

'hello4'

##### join() method

In [16]:
" ".join([a, b, c, "in Python Version" ,d])

'learning to create a character string in Python Version 3.14'

The `join()` method and the `+` operator are the most common ways to combine strings. Python offers three more: %-formatting, `str.format()`, and f-strings (see section 6). Link to resource: https://www.digitalocean.com/community/tutorials/python-string-concatenation
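
The two common approaches can be sketched side by side. This is a minimal illustration rather than part of the original notebook, and the list `parts` is a made-up example. Note that `join()` only accepts strings, so the number is converted with `str` first.

```python
# Hypothetical example values, chosen only for illustration
parts = ["learning", "to", "create", 3.14]

# join() needs every element to be a string, so convert with map(str, ...)
joined = " ".join(map(str, parts))
print(joined)

# The + operator builds the same result, but gets clumsy as pieces are added
plus = parts[0] + " " + parts[1] + " " + parts[2] + " " + str(parts[3])
print(plus == joined)
```

`join()` tends to be the more convenient choice once you have more than two or three pieces, because the separator is written only once.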

## 2. Printing <a class="anchor" id="bullet2"></a>

Whenever you write your own functions, there will often be messages or warnings that you want to print as the function executes. For these cases, you simply use the `print()` function.

__Escape characters__ let you include characters in a string that would otherwise cause problems. An escape sequence is a backslash (`\`) followed by the character you want to insert. For example, suppose you want to use double quotes inside a double-quoted string:

In [17]:
txt = "We are the so-called "Vikings" from the north."  # produces an error

SyntaxError: invalid syntax (2847221345.py, line 1)

In [18]:
txt = "We are the so-called \"Vikings\" from the north."  # use an escape character \" to get around this 
txt

'We are the so-called "Vikings" from the north.'

And there are several other escape characters you will frequently see: 

`\'` Single Quote  
`\\` Backslash  
`\n` New Line  
`\r` Carriage Return  
`\t` Tab  
`\b` Backspace  

In [20]:
txt = 'testing \'quotes\''
txt

"testing 'quotes'"

In [21]:
txt = 'testing \nquotes' # creates a string object containing a newline escape character
txt  # notice that the newline is shown as \n and is not rendered as a line break until print() is called

'testing \nquotes'

In [22]:
print(txt)

testing 
quotes


In [23]:
txt = 'Python is included in CodeSpeedy\r123456'
txt 

'Python is included in CodeSpeedy\r123456'

In [24]:
print(txt) # \r moves the cursor back to the start of the line, so 123456 overwrites the first six characters

123456 is included in CodeSpeedy


In [25]:
txt = 'Python is included in CodeSpeedy \b123456'
print(txt)

Python is included in CodeSpeedy123456


In [26]:
txt = 'Python is included in CodeSpeedy \t123456'
print(txt)

Python is included in CodeSpeedy 	123456


In [27]:
import warnings

In [28]:
warnings.warn("Warning...........Message")  # Can embed these in functions too!
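
As a sketch of what embedding a warning in a function could look like, the hypothetical `safe_divide` below reports a problem with `warnings.warn()` and then carries on, rather than stopping with an error. The function name and message are assumptions made for illustration.

```python
import warnings

def safe_divide(numerator, denominator):
    """Divide two numbers, warning instead of failing on division by zero."""
    if denominator == 0:
        # warn() reports the problem but lets the function keep running
        warnings.warn("Division by zero; returning None instead.")
        return None
    return numerator / denominator

print(safe_divide(10, 4))
print(safe_divide(10, 0))
```

Use a warning when the caller should know something is off but the program can still continue; raise an exception when it cannot.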



Since the backslash character in a string introduces *escape sequences*, if you wish to include a literal backslash in a string you have to write two backslashes, `\\`. This can make code harder to read, so you can use a raw string instead. This is particularly useful when you are copy-pasting directory locations.

In [29]:
# these two are equivalent
path = 'C:\\MyFolder\\MySubFolder\\MyFile.txt' # to enter a single \, we need to escape it so we need to use \\
print(path)
path = r'C:\MyFolder\MySubFolder\MyFile.txt'   # notice the r placed in front of the string, making a raw string which ignores escape characters
print(path)

C:\MyFolder\MySubFolder\MyFile.txt
C:\MyFolder\MySubFolder\MyFile.txt


## 3. Count number of characters<a class="anchor" id="bullet3"></a>

In [30]:
len(b)

1

In [31]:
list1 = ["Count", "length", "of", "each", "element"]
[len(x) for x in list1]

[5, 6, 2, 4, 7]

## 4. String Manipulation<a class="anchor" id="bullet4"></a>

##### Case conversion

In [32]:
x = "Learning To MANIPULATE strinGS in Python"

In [33]:
# what does this do?
print(x.lower())  # print in lower case
print(x.upper())  # print in upper case

learning to manipulate strings in python
LEARNING TO MANIPULATE STRINGS IN PYTHON


Last week we learned about **camelCase** and **snake_case** as good practice when it comes to naming functions. Upper and lower case are best used for naming variables - I think most people prefer lower case.

Note that there are many more *cases* you can use to name different types of objects or functions. You can read about all of them here. I personally think **kebab case** is good for naming repos on GitHub!

##### Subset a string

In [35]:
# extract the 3rd character in string
x[2]

'a'

In [36]:
# extract multiple characters/a substring in a string
x[12:22]

'MANIPULATE'

##### Other useful string methods/functions

In [37]:
# replace a character/substring
x.replace('MANIPULATE', 'change')

'Learning To change strinGS in Python'

In [38]:
# split elements of a string
x.split()

['Learning', 'To', 'MANIPULATE', 'strinGS', 'in', 'Python']

In [39]:
y = "Alabama-Alaska-Arizona-Arkansas-California"
y.split(sep='-')

['Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California']

In [43]:
[ ("id" + str(x)) for x in range(1,11)]  # iterate from 1-10  (since the ending 11 index here is NOT inclusive)

['id1', 'id2', 'id3', 'id4', 'id5', 'id6', 'id7', 'id8', 'id9', 'id10']

In [44]:
["test string"] * 4   # repeat the list's contents n times

['test string', 'test string', 'test string', 'test string']

In [46]:
x = ["test", "string"]
y = np.repeat(x, 3)
y

array(['test', 'test', 'test', 'string', 'string', 'string'], dtype='<U6')

In [48]:
type(y) # we used a numpy function, and so we got back a numpy object

numpy.ndarray

In [49]:
# Remove white space (or any of the characters you pass in) from the left/right ends of a string
print('  Hello World!  '.strip())
print('  Hello World!  '.strip(" Hel"))
print('  Hello World!  '.strip(" lo"))
print('  Hello World!  '.rstrip())
print('  Hello World!  '.lstrip())

Hello World!
o World!
Hello World!
  Hello World!
Hello World!  
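
One point worth illustrating: the argument to `strip()`, `lstrip()` and `rstrip()` is treated as a set of characters, not as an exact substring. The sketch below uses a made-up file name to show the difference, and contrasts it with `str.removesuffix()`, which assumes Python 3.9 or newer.

```python
filename = "data_set.txt"   # hypothetical file name

# rstrip() treats ".txt" as the set of characters {'.', 't', 'x'} and keeps
# stripping from the right until it meets a character outside that set
stripped = filename.rstrip(".txt")
print(stripped)

# removesuffix() (Python 3.9+) removes the exact substring once, if present
trimmed = filename.removesuffix(".txt")
print(trimmed)
```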


## 5. Sets: Another Python Data Object <a class="anchor" id="bullet5"></a>

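Sometimes you want to get the __union__, __intersection__, or __difference__ of two lists/arrays of strings. Base Python has a built-in data type called a __set__, which is particularly useful here. You define a set the same way you do a list, except you use `{}` rather than `[]`. Sets are unordered and hold only unique elements, so the order in which elements print can vary. An empty set must be created with `set()`, because `{}` creates an empty dictionary.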

In [50]:
x = {"apple", "banana", "cherry"}
type(x)

set

In [51]:
# Unions
x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}
z = x.union(y)
print(z)

{'banana', 'google', 'cherry', 'apple', 'microsoft'}


In [52]:
# Intersection
x.intersection(y)

{'apple'}

In [53]:
# Difference -- shows the elements in the left set that are not in the right set
x.difference(y)

{'banana', 'cherry'}

In [54]:
y.difference(x)

{'google', 'microsoft'}

In [55]:
# Symmetric difference -- shows all differences
y.symmetric_difference(x)

{'banana', 'cherry', 'google', 'microsoft'}

In [56]:
# You can also check if two sets are disjoint, that is, whether the two sets share NO common elements
x.isdisjoint(z)

False

In [57]:
# There are also shortcuts to all the operations above!
print(x | y)    # Union
print(x & y)    # Intersection
print(x - y)    # Difference
print(x ^ y)    # Symmetric Difference

{'banana', 'google', 'cherry', 'apple', 'microsoft'}
{'apple'}
{'banana', 'cherry'}
{'cherry', 'google', 'banana', 'microsoft'}
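
A typical use of these operations is comparing two lists of strings. The sketch below is an illustration with made-up data (the `january` and `february` lists are hypothetical): it converts both lists to sets and then asks which names appear in both, or in only one.

```python
# Hypothetical lists of customer names from two different months
january = ["ana", "bob", "cho", "bob"]
february = ["cho", "dee", "ana"]

# Converting to sets drops duplicates ("bob" appears twice in january)
jan_set, feb_set = set(january), set(february)

returning = jan_set & feb_set      # in both months
lost = jan_set - feb_set           # in January only
new = feb_set - jan_set            # in February only

# sorted() gives a list in a predictable order, since sets are unordered
print(sorted(returning))
print(sorted(lost))
print(sorted(new))
```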


## 6. Formatting strings <a class="anchor" id="bullet6"></a>

Before Python 3.6, you had two main ways of embedding Python expressions inside string literals for formatting: __%-formatting__ and __str.format()__. As of Python 3.6, __f-strings__ are a great new way to format strings. Not only are they more readable, but they are more concise and less prone to error than other ways of formatting. On top of that, they are also faster!

In [62]:
import timeit
timeit.timeit("""name = "Eric"
age = 74
'%s is %s.' % (name, age)""", number = 10000)

0.008942799999658746

In [64]:
# Notice this is a lot faster!
timeit.timeit("""name = "Eric"
age = 74
f'{name} is {age}.'""", number = 10000)

0.004069300000082876

In [65]:
name = "Eric"
age = 74
f"Hello, {name}. You are {age}."

'Hello, Eric. You are 74.'

In [66]:
# You can also perform mathematical calculations inside f-strings
f"{2 * 37} pints"

'74 pints'

In [67]:
# And use functions/methods
name = "Eric Idle"
f"{name.lower()} is funny."

'eric idle is funny.'

In [68]:
# Multiline f-strings  --> notice the f placed in front of each line; without it, the placeholders on that line are not filled in!
name = "Eric"
profession = "comedian"
affiliation = "Monty Python"
message = (
     f"Hi {name}. "
     f"You are a {profession}. "
     f"You were in {affiliation}."
 )
message

'Hi Eric. You are a comedian. You were in Monty Python.'

<div class="alert alert-block alert-info">
<b>Did you know?:</b> Python was actually named after the BBC comedy series Monty Python's Flying Circus!</div>

In [69]:
# You can use these with dictionaries
comedian = {'name': 'Eric Idle', 'age': 74}
f"The comedian is {comedian['name']}, aged {comedian['age']}."

'The comedian is Eric Idle, aged 74.'

In [72]:
# if you want the curly braces to appear, double them up!
f"{{70 + 4}}"

'{70 + 4}'

#### Formatting inside f-strings

By default, Python assumes you want to display the values as a string. But sometimes, you will want to specify another type. For example, you might want to round a float to 2 decimal places. In the code below, the f (after the .2) is called a __presentation type__; it indicates that you want Python to print the result as a float rounded to 2 decimals.

In [73]:
x = 17.489
f'{x:.2f}'  

'17.49'

The __d presentation type__ prints integer values as strings.

In [74]:
f'{10:d}'

'10'

The __c presentation type__ prints an integer character code as the corresponding string.

In [75]:
f'{65:c} {97:c}'

'A a'

The __e presentation type__ prints exponential scientific notation.

In [76]:
from decimal import Decimal
print(f'{Decimal("1000000000000000000000.0"): .3e}')
print(f'{Decimal("10000000000000000000.0"): .3E}')  # notice the Capital E

 1.000e+21
 1.000E+19


The __s presentation type__ prints a string, this is the default.

In [77]:
f'{"hello":s} {7}'

'hello 7'

In [78]:
f'{'hello':s} {7}'  # before Python 3.12 you have to switch between double and single quotation marks to make this work!

SyntaxError: f-string: expecting '}' (3754223557.py, line 1)

##### By default, Python *right-aligns* numbers and *left-aligns* other values such as strings

In [79]:
f'{27:10d}'   # right-aligned with a width of 10 spaces

'        27'

In [80]:
f'{3.5:10f}' # right-aligned with a width of 10 spaces

'  3.500000'

In [81]:
f'{"hello":10}' # left-aligned with a width of 10 spaces

'hello     '

##### You can modify this default behavior with the `<` (left-aligned) and `>` (right-aligned) symbols. You can also center values with the `^` symbol.

In [82]:
f'{27:<10d}' 

'27        '

In [83]:
f'{"Hello":>10s}' 

'     Hello'

In [84]:
f'{27:^10d}' 

'    27    '

##### You can add a positive number sign too!

In [85]:
f'{27:+10d}' 

'       +27'

In [86]:
f'{27:+010d}'    

'+000000027'

##### Use a space to make positive and negative numbers line up:

In [87]:
print(f'{27:d}\n{27: d}\n{-27: d}')

27
 27
-27
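
The width, alignment and precision specifiers above combine naturally when printing a small table. The following is a sketch with made-up data (the `rows` list is hypothetical), not part of the original notebook.

```python
# Hypothetical data, used only to demonstrate the format specifiers
rows = [("apples", 3, 0.5), ("watermelon", 12, 3.25), ("kiwi", 150, 0.2)]

# Header: left-align the text column, right-align the numeric columns
lines = [f"{'item':<12}{'qty':>5}{'price':>8}"]
for item, qty, price in rows:
    # <12 pads text on the right, >5d pads integers on the left,
    # >8.2f right-aligns a float rounded to 2 decimals
    lines.append(f"{item:<12}{qty:>5d}{price:>8.2f}")

print("\n".join(lines))
```

Because every row uses the same widths, the columns line up regardless of how long each value is, as long as it fits within the width.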


## 7. String manipulation with Regex<a class="anchor" id="bullet7"></a>

A **regular expression** (aka **regex**) is a sequence of characters that defines a search pattern, mainly used for pattern matching with strings. Regex is a *universal* pattern-matching syntax that is supported in many different languages and software packages. For example, in R the stringr package is built around it.

There are a few main uses of regular expressions:
1. **Validating data**, like e-mail addresses, ZIP codes, web page addresses, social security numbers etc. If this is what you need to do, you'll rarely have to create your own regular expressions. For common items like these, visit websites like:  
    - https://regex101.com
    - https://www.regexlib.com
    - https://www.regular-expressions.info
2. **Extracting data from text**, sometimes known as scraping
3. **Cleaning data**
4. **Transforming data into other formats**, for example transforming data that was collected as tab-separated or space-separated into comma-separated values


There are two things you need to learn to work with regex in Python: (1) the syntax used for pattern matching and (2) the Python functions which use regex patterns to search strings. Typically, regex patterns consist of a combination of alphanumeric characters and special characters, or metacharacters: `. \ | ( ) [ { $ * + ? ^`. To match a metacharacter literally, you need to escape it with a `\`. We will start by learning about some of the more useful functions below.

In [88]:
import regex as re  # third-party package (pip install regex) that extends the built-in re module

##### Useful Functions

In [89]:
# re.sub function: Substitute one string for another
txt = 'I love R'
re.sub(pattern="R", repl='Python', string=txt)

'I love Python'

In [90]:
# re.findall function: Returns all matches 
txt = 'I love R, and I also love Python'
re.findall("o", txt)

['o', 'o', 'o', 'o']

In [92]:
# re.search function: Returns a match object for the first occurrence of a pattern; .start() gives its index
txt = "The rain in Spain"
x = re.search(r"\s", txt)
print("The first white-space character is located in position:", x.start())

The first white-space character is located in position: 3


In [93]:
# re.split function: Split string based on a pattern
txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)
print(re.split(r"\s", txt, maxsplit=1)) # Split at first occurrence only

['The', 'rain', 'in', 'Spain']
['The', 'rain in Spain']


In [94]:
# re.match function: Returns a "match" object if a pattern is found at the beginning of a string
x = re.match("The", "The rain in Spain")
print(x)
print(x.group())  # returns the match

<regex.Match object; span=(0, 3), match='The'>
The


##### Regex syntax

In [95]:
re.sub("\\$", "!", "I love Python$")

'I love Python!'

In [96]:
re.sub("$", "!", "I love Python$")  # didn't work as expected since we need to escape the $

'I love Python$!'

In [97]:
re.sub("\\*", "!", "I love Python*")

'I love Python!'

In [98]:
re.sub("\\\\", " ", "I\\need\\space") # to match a single literal \, you have to escape twice: once for the Python string and once for the regex!

'I need space'

In [99]:
re.sub(r"\\", " ", "I\\need\\space") #use a raw string to make the code easier to follow

'I need space'
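
When the text you want to match literally contains several metacharacters, escaping each one by hand gets error-prone. As a sketch, the standard library's `re.escape()` does the escaping for you. The example imports the built-in module under the alias `std_re` (an arbitrary name chosen here) so that it does not replace the notebook's `re`; the sample text is made up.

```python
import re as std_re  # standard-library module; aliased so it does not replace the notebook's `re`

text = "Total: $9.99 (was $19.99)"   # hypothetical text
literal = "$9.99"

# Unescaped, "$" means end-of-string and "." means any character,
# so the pattern cannot match the literal price
unescaped_hits = std_re.findall(literal, text)
print(unescaped_hits)

# re.escape() backslash-escapes the metacharacters for us
escaped_hits = std_re.findall(std_re.escape(literal), text)
print(escaped_hits)
```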

In [100]:
# substitute any digits
re.sub("\\d", '_', "I'm working in Python 3.10.2")

"I'm working in Python _.__._"

In [101]:
# substitute any non-digits
re.sub("\\D", '_', "I'm working in Python 3.10.2")

'______________________3_10_2'

In [102]:
# substitute any alpha-numeric characters (\w also matches the underscore)
re.sub("\\w", '_', "I'm working in Python 3.10.2")

"_'_ _______ __ ______ _.__._"

In [103]:
# substitute any non-alpha-numeric characters
re.sub("\\W", '_', "I'm working in Python 3.10.2")

'I_m_working_in_Python_3_10_2'

In [104]:
# substitute any white-space characters, including \n, \t, etc.
re.sub("\\s", '_', "I'm working in Python 3.10.2")

"I'm_working_in_Python_3.10.2"

In [105]:
# substitute any non-white-space characters
re.sub("\\S", '_', "I'm working in Python 3.10.2")

'___ _______ __ ______ ______'

To match one of several characters in a specified set we can enclose
the characters of concern with square brackets `[ ]`. For example:  
- `[aeiou]` to match any lower case vowel
- `[AEIOU]` to match any upper case vowel
- `[a-z]` to match any lower case letter
- `[A-Z]` to match any upper case letter
- `[0123456789]` to match any digit 
- `[0-9]` to match any numbers in the 0-9 range 
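- `[[:blank:]]` to match any blank character (space or tab)
- `[[:punct:]]` to match any punctuation or special character
- `[[:alnum:]]` to match any alpha-numeric character
- `[^aeiou]` to match any character that is not a lower-case vowel; the `^` negates the set

The `[[:...:]]` forms are POSIX character classes. They work here because we imported the third-party `regex` package; the built-in `re` module does not support them.

In addition, to match any characters not in a specified character set
we can include the caret `^` at the beginning of the set within the
brackets.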

In [106]:
x = "I like pizza! #pizza, @wheres_my_pizza, I like R (v3.2.2) #rrrrrrr2015"

In [107]:
re.findall("[A-Za-z]", x) # this matches all letters, but only a single letter at a time

['I',
 'l',
 'i',
 'k',
 'e',
 'p',
 'i',
 'z',
 'z',
 'a',
 'p',
 'i',
 'z',
 'z',
 'a',
 'w',
 'h',
 'e',
 'r',
 'e',
 's',
 'm',
 'y',
 'p',
 'i',
 'z',
 'z',
 'a',
 'I',
 'l',
 'i',
 'k',
 'e',
 'R',
 'v',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r']

In [108]:
# remove space or tabs
re.sub("[[:blank:]]","", x)

'Ilikepizza!#pizza,@wheres_my_pizza,IlikeR(v3.2.2)#rrrrrrr2015'

In [109]:
# replace punctuation with whitespace
re.sub(pattern = "[[:punct:]]", repl = " ", string=x)

'I like pizza   pizza   wheres my pizza  I like R  v3 2 2   rrrrrrr2015'

In [110]:
# remove alphanumeric characters
re.sub("[[:alnum:]]","", x)

'  ! #, @__,    (..) #'

In [111]:
# use ^ for negation
re.sub("[^[:alnum:]]","", x)

'IlikepizzapizzawheresmypizzaIlikeRv322rrrrrrr2015'

In [113]:
x = ["Python", "v.0.99.484", "2015", "09-22-2015"]

In [114]:
[re.findall("[0-9]", y) for y in x]

[[],
 ['0', '9', '9', '4', '8', '4'],
 ['2', '0', '1', '5'],
 ['0', '9', '2', '2', '2', '0', '1', '5']]

In [115]:
[re.findall("[8-9]", y) for y in x]

[[], ['9', '9', '8'], [], ['9']]

In [116]:
[re.findall(r"[P]\w+", y) for y in x]  # find a "P" followed by one or more word characters

[['Python'], [], [], []]

In [117]:
re.findall(r"New\s\w+", "New York") 

['New York']

In [118]:
locations = ['New York', 'D.C.', 'Baltimore', 'Atlanta', 'Miami', 'Dallas', 'New Hampshire', 
             'San Francisco', 'San Diego', 'San Antonio', 'Detroit', 'Chicago']
regex = r"New\s\w+"
[re.findall(regex, x) for x in locations if re.findall(regex, x) != []]

[['New York'], ['New Hampshire']]

__Quantifiers:__ Used when we want to match a certain number of characters:
- `?` the preceding item is optional and will be matched at most once
- `*` the preceding item is matched zero or more times
- `+` the preceding item is matched one or more times
- `{n}` the preceding item is matched exactly n times
- `{n,}` the preceding item is matched n or more times
- `{n,m}` the preceding item is matched at least n times, but not more than m times

In [120]:
regex = "New Y?"
[re.findall(regex, x) for x in locations]

[['New Y'], [], [], [], [], [], ['New '], [], [], [], [], []]

In [121]:
regex = "New Y?"
[re.findall(regex, x) for x in locations if re.findall(regex, x) != []]

[['New Y'], ['New ']]

In [122]:
regex = "New Y*"
[re.findall(regex, x) for x in locations if re.findall(regex, x) != []]

[['New Y'], ['New ']]

In [123]:
regex = "New Y+"
[re.findall(regex, x) for x in locations if re.findall(regex, x) != []]

[['New Y']]

In [124]:
regex = r"\w*\s*A{1}\w+"
[re.findall(regex, x) for x in locations if re.findall(regex, x) != []]

[['Atlanta'], ['San Antonio']]

In [125]:
regex = r"\w*l{1,}\w+"
[re.findall(regex, x) for x in locations if re.findall(regex, x) != []]

[['Baltimore'], ['Atlanta'], ['Dallas']]

In [126]:
regex = r"\w*l{2,}\w+"
[re.findall(regex, x) for x in locations if re.findall(regex, x) != []]

[['Dallas']]

##### Challenge: Complete the questions below:

In [127]:
# Using a regex, check for three digit number followed by space followed by two digit number
string = '39801 356, 2102 1111'

In [128]:
# check if 'Python' is at the beginning
string = "Python is fun"

In [129]:
# Match all string elements starting with "The" - Hint, use a for loop
strings = ['The quick brown fox', 'The lazy dog', 'A quick brown fox']

In [130]:
# Print only the year listed in each string, using a for loop
string0 = ["I went to him at 11 A.M. on 4th July 1886", "She went to him at 10 A.M. on 4th July 1890"]
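
One possible set of answers is sketched below; try the questions yourself first. This is not an official solution, and other patterns would work equally well. It uses the standard-library `re` module under the alias `std_re` (an arbitrary name chosen here) so that it runs on its own. The result names `pairs`, `starts_with_python`, `starts_with_the` and `years` are likewise made up for this sketch.

```python
import re as std_re  # standard library; aliased to leave the notebook's `re` untouched

# 1. Three digits, a space, then two digits
string = '39801 356, 2102 1111'
pairs = std_re.findall(r"\d{3} \d{2}", string)
print(pairs)

# 2. Is 'Python' at the beginning? match() only looks at the start of the string
string = "Python is fun"
starts_with_python = std_re.match(r"Python", string) is not None
print(starts_with_python)

# 3. Keep the elements that start with "The"
strings = ['The quick brown fox', 'The lazy dog', 'A quick brown fox']
starts_with_the = []
for s in strings:
    if std_re.match(r"The\b", s):
        starts_with_the.append(s)
print(starts_with_the)

# 4. Print only the four-digit year in each string
string0 = ["I went to him at 11 A.M. on 4th July 1886", "She went to him at 10 A.M. on 4th July 1890"]
years = []
for s in string0:
    found = std_re.search(r"\d{4}", s)
    if found:
        years.append(found.group())
print(years)
```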

##### It is impossible to memorize the entire regex syntax and what everything stands for, so it is good to keep a solid cheatsheet on hand. 
- https://cheatography.com/davechild/cheat-sheets/regular-expressions/

## 8. Decorators<a class="anchor" id="bullet8"></a>

Functions are just like any other object in Python, which means __functions can be passed around and used as arguments.__ We learned this last week, and today we will build on this idea and explore a special type of function known as a **decorator**.

In [131]:
def say_hello(name):
    return f"Hello {name}"

def be_awesome(name):
    return f"Yo {name}, together we are the awesomest!"

def greet_bob(greeter_func):
    return greeter_func("Bob")

In [132]:
greet_bob(say_hello)

'Hello Bob'

In [133]:
greet_bob(be_awesome)

'Yo Bob, together we are the awesomest!'

__Inner functions:__ It’s possible to define functions inside other functions. Such functions are called inner functions. Below is an example of a function with two inner functions. Also, this example shows that Python also allows you to **use functions as return values.** The following example returns one of the inner functions from the outer parent() function.

In [134]:
def parent(num):
    def first_child():
        return "Hi, I am Emma"

    def second_child():
        return "Call me Liam"

    if num == 1:
        return first_child
    else:
        return second_child
first = parent(1)
second = parent(2)
print(first)
print(second)

<function parent.<locals>.first_child at 0x000001D47B1A5820>
<function parent.<locals>.second_child at 0x000001D47B1A5940>


In [135]:
first()

'Hi, I am Emma'

In [136]:
second()

'Call me Liam'

### Simple Decorators
Now that you’ve seen that functions are just like any other object in Python, you’re ready to move on and see the magical beast that is the Python decorator. Let’s start with an example:

In [137]:
def my_decorator(func):
    def wrapper():
        print("Something is happening before the function is called.")
        func()
        print("Something is happening after the function is called.")
    return wrapper

def say_whee():
    print("Whee!")

say_whee = my_decorator(say_whee)
say_whee()

Something is happening before the function is called.
Whee!
Something is happening after the function is called.


  
The so-called decoration happens at the following line: `say_whee = my_decorator(say_whee)`. In effect, the name say_whee now points to the wrapper() inner function. Remember that you return wrapper as a function when you call my_decorator(say_whee):

In [138]:
say_whee

<function __main__.my_decorator.<locals>.wrapper()>

Put simply: __decorators wrap a function, modifying its behavior.__  
  
The way you decorated say_whee() above is a little clunky. First of all, you end up typing the name say_whee three times. In addition, the decoration gets a bit hidden away below the definition of the function. Instead, __Python allows you to use decorators in a simpler way with the @ symbol__, sometimes called the “pie” syntax. The following example does the exact same thing as the first decorator example:

In [140]:
@my_decorator
def say_wheeee():
    print("Wheeee!")

In [141]:
say_wheeee()

Something is happening before the function is called.
Wheeee!
Something is happening after the function is called.


Recall that a __decorator is just a regular Python function.__ All the usual tools for easy reusability are available, such as saving it in a **module** (which is simply a .py file).

In [142]:
def do_twice(func):
    def wrapper_do_twice():
        func()
        func()
    return wrapper_do_twice
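
The `do_twice` decorator above only works for functions that take no arguments and it discards their return value. As a sketch, the variant below (named `do_twice_args` here, a hypothetical name) forwards any arguments, returns the result of the second call, and uses `functools.wraps` so the decorated function keeps its own name. These extensions are assumptions added for illustration, not part of the original notebook.

```python
import functools

def do_twice_args(func):
    @functools.wraps(func)            # keep the decorated function's name and docstring
    def wrapper_do_twice(*args, **kwargs):
        func(*args, **kwargs)         # first call; its return value is discarded
        return func(*args, **kwargs)  # second call; pass its return value back
    return wrapper_do_twice

calls = []   # hypothetical list used to record each call

@do_twice_args
def greet(name):
    """Record and return a greeting."""
    calls.append(name)
    return f"Hello {name}"

print(greet("World"))
print(calls)
print(greet.__name__)
```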

## 9. Classes<a class="anchor" id="bullet9"></a>

Classes are part of a programming paradigm called **object-oriented programming**. Object-oriented programming, or OOP for short, focuses on building reusable blocks of code called **classes**. When you want to use a class in one of your programs, you make an object from that class, which is where the phrase "object-oriented" comes from. Python itself is not tied to object-oriented programming, but you will be using objects in most or all of your Python projects. In order to understand classes, you have to understand some of the language that is used in OOP.

A __class__ is a body of code that defines the attributes and behaviors required to accurately model something you need for your program. You can model something from the real world, such as a rocket ship or a guitar string, or you can model something from a virtual world such as a rocket in a game, or a set of physical laws for a game engine.

An __attribute__ is a piece of information. In code, an attribute is just a variable that is part of a class.

A __behavior__ is an action that is defined within a class. These are made up of __methods__, which are just functions that are defined for the class.

An __object__ is a particular **instance** of a class. An object has a certain set of values for all of the attributes (variables) in the class. You can have as many objects as you want for any one class.

There is much more to know, but these words will help you get started. They will make more sense as you see more examples, and start to use classes on your own.

The vast majority of object-oriented programming you'll do in Python is object-based programming in which you primarily create and use objects of existing classes. For example, Python's in-built types like int, float, str, tuple, dict and set are classes. You've also used NumPy arrays and pandas Series and DataFrame objects - which are all custom classes built by someone else.  

__Classes are new custom data types that you can develop.__ For example, below we will develop an application-specific class called *Account*. This class will hold an account holder's name and balance. The *Account* class accepts deposits that increase the balance and withdrawals that decrease the balance. An actual bank account class would likely include lots of other information, such as address, birth date, telephone number, account number and more.

In [143]:
# this class will use the Decimal data type
from decimal import Decimal
print(Decimal('12.34'))
print(type(Decimal('12.34')))

12.34
<class 'decimal.Decimal'>


In [144]:
value = Decimal('12.34')

In [146]:
from decimal import Decimal  # When defining a class, always load dependencies before you define it

class Account:
    """Account class for maintaining a bank account balance."""      # class docstring
    
    def __init__(self, name, balance):
        """Initialize an Account object."""
        
        # if balance is less than 0.00, raise an exception
        if balance < Decimal('0.00'):
            raise ValueError('Initial balance must be >= to 0.00.')
            
        self.name = name
        self.balance = balance

Some notes about the above:
- We start by using the `class` keyword, followed by the name we want to use for our custom class. Note the **Style Guide for Python Code** (PEP 8) recommends the *CapWords* convention (also called *CamelCase*) for class names, e.g. `BankAccount`. 
- The `__init__` method initializes a new instance of a class. That is, when you create an object, Python calls `__init__` to set up the new object in memory. By convention, Python programmers call this method's first parameter `self`; it refers to the object being initialized. 
- A class's methods must use that reference (`self`) to access the object's attributes and other methods. For example, the name and balance attributes here are set through `self`. 
- Class `Account`'s `__init__` method also specifies parameters for the name and balance. The `__init__` method is an example of a __special method__, which can be identified by the leading and trailing double underscores (`__`).

In [153]:
# We use a constructor expression to create a new instance of the class, which initializes a new object in memory by
# calling the class's __init__ method
account1 = Account('John Doe', Decimal('50.00')) 
print(type(account1))
print(account1.name)
print(account1.balance)
account2 = Account('John Doe', Decimal('-50.00')) # Raises an exception since the initial balance cannot be negative

<class '__main__.Account'>
John Doe
50.00


ValueError: Initial balance must be >= to 0.00.

We can add methods and attributes to our class. Below, we add a `deposit` method: if the deposit amount is less than 0.00, an exception is raised since deposits must be positive.

In [154]:
from decimal import Decimal
class Account:
    """Account class for maintaining a bank account balance."""
    
    def __init__(self, name, balance):
        """Initialize an Account object."""
        
        # if balance is less than 0.00, raise an exception
        if balance < Decimal('0.00'):
            raise ValueError('Initial balance must be >= to 0.00.')
            
        self.name = name
        self.balance = balance
        
    def deposit(self, amount):
        """Deposit money into the account."""
        
        # if the amount is less than 0.00, raise an exception
        if amount < Decimal('0.00'):
            raise ValueError('Deposit must be positive.')
            
        self.balance += amount

In [155]:
account1 = Account('John Doe', Decimal('50.00')) 
account1.deposit(Decimal('25.53'))
account1.balance

Decimal('75.53')

In [156]:
# We can also directly modify attributes
account1.balance = Decimal('19.23')
account1.balance

Decimal('19.23')

In [157]:
print(account1.name)
account1.name = 'Jane Doe'
account1.name

John Doe


'Jane Doe'

In [158]:
# Useful for data validation
account1.deposit(Decimal('-25.53'))

ValueError: Deposit must be positive.

##### Challenge: Add a withdraw method to the Account class. Make it so that you cannot withdraw an amount greater than your current balance, and so that the withdrawal has to be positive.

In [159]:
from decimal import Decimal
class Account:
    """Account class for maintaining a bank account balance."""
    
    def __init__(self, name, balance):
        """Initialize an Account object."""
        
        # if balance is less than 0.00, raise an exception
        if balance < Decimal('0.00'):
            raise ValueError('Initial balance must be >= to 0.00.')
            
        self.name = name
        self.balance = balance
        
    def deposit(self, amount):
        """Deposit money into the account."""
        
        # if the amount is less than 0.00, raise an exception
        if amount < Decimal('0.00'):
            raise ValueError('Deposit must be positive.')
            
        self.balance += amount
        
    # define new method here
    

###### Solution below 

In [160]:
from decimal import Decimal
class Account:
    """Account class for maintaining a bank account balance."""
    
    def __init__(self, name, balance):
        """Initialize an Account object."""
        
        # if balance is less than 0.00, raise an exception
        if balance < Decimal('0.00'):
            raise ValueError('Initial balance must be >= to 0.00.')
            
        self.name = name
        self.balance = balance
        
    def deposit(self, amount):
        """Deposit money into the account."""
        
        # if the amount is less than 0.00, raise an exception
        if amount < Decimal('0.00'):
            raise ValueError('Deposit must be positive.')
            
        self.balance += amount
        
    def withdraw(self, amount):
        """Withdraw money from the account."""
        
        # if the amount is less than 0.00 or greater than the balance, raise an exception
        if amount < Decimal('0.00'):
            raise ValueError('Withdraw must be positive.')
        elif amount > self.balance:
            raise ValueError('Withdraw must be less than balance.')            
            
        self.balance -= amount       

In [161]:
account1 = Account('John Doe', Decimal('50.00')) 
account1.withdraw(Decimal('25.00'))
print(account1.balance)

25.00


In [162]:
account1.withdraw(Decimal('100'))

ValueError: Withdraw must be less than balance.

In [163]:
account1.withdraw(Decimal('10'))

In [164]:
account1.withdraw(Decimal('-3'))

ValueError: Withdraw must be positive.

In [165]:
# Warning: assigning to an attribute directly bypasses the validation that methods perform
account1.balance = Decimal('-1000')
account1.balance

Decimal('-1000')
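
One common way to close this gap is Python's built-in `property`, which routes attribute assignment through a method so it can be validated. The sketch below is only an illustration: `ValidatedAccount` and `account2` are hypothetical names that are not part of the class activity, and the class is trimmed down to the balance check.

```python
from decimal import Decimal


class ValidatedAccount:
    """Hypothetical variant of Account that validates direct balance assignment."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance  # this assignment also goes through the setter below

    @property
    def balance(self):
        """Return the current balance."""
        return self._balance

    @balance.setter
    def balance(self, value):
        """Reject negative balances, even on direct assignment."""
        if value < Decimal('0.00'):
            raise ValueError('Balance must be >= 0.00.')
        self._balance = value


account2 = ValidatedAccount('John Doe', Decimal('50.00'))

try:
    account2.balance = Decimal('-1000')
except ValueError as error:
    message = str(error)

print(message)           # Balance must be >= 0.00.
print(account2.balance)  # 50.00 (the invalid assignment was rejected)
```

The leading underscore in `_balance` is only a naming convention signaling "internal use"; Python does not prevent code from assigning to it directly.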

In [166]:
# You can also use f-strings to return formatted output from within a class
class Comedian:
    def __init__(self, first_name, last_name, age):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age

    def __repr__(self):
        return f"{self.first_name} {self.last_name} is {self.age}. Surprise!"
    
new_comedian = Comedian("Eric", "Idle", "74")
f"{new_comedian}"

'Eric Idle is 74. Surprise!'

The `__str__()` and `__repr__()` methods deal with how objects are presented as strings, so you’ll need to make sure you include at least one of those methods in your class definition. The string returned by `__repr__()` is the official representation of the object and should be unambiguous.
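
As a minimal sketch of how the two string methods differ, the variant of `Comedian` below defines both. `print()`, `str()`, and plain f-strings use `__str__()` when it exists and fall back to `__repr__()` otherwise. The exact wording of the returned strings is an illustrative choice, not something prescribed by the class activity.

```python
class Comedian:
    def __init__(self, first_name, last_name, age):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age

    def __repr__(self):
        # Unambiguous, developer-facing representation
        return f"Comedian({self.first_name!r}, {self.last_name!r}, {self.age!r})"

    def __str__(self):
        # Readable, user-facing representation
        return f"{self.first_name} {self.last_name} is {self.age}."


new_comedian = Comedian("Eric", "Idle", 74)

print(str(new_comedian))    # Eric Idle is 74.
print(repr(new_comedian))   # Comedian('Eric', 'Idle', 74)
print(f"{new_comedian}")    # uses __str__()
print(f"{new_comedian!r}")  # the !r conversion forces __repr__()
```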

### Inheritance
One of the most important goals of the object-oriented approach to programming is the creation of stable, reliable, reusable code. If you had to create a new class for every kind of object you wanted to model, you would hardly have any reusable code. In Python and any other language that supports OOP, one class can inherit from another class. This means you can base a new class on an existing class; the new class inherits all of the attributes and behavior of the class it is based on. A new class can override any undesirable attributes or behavior of the class it inherits from, and it can add any new attributes or behavior that are appropriate. The original class is called the parent class, and the new class is a child of the parent class. The parent class is also called a superclass, and the child class is also called a subclass.

The child class inherits all attributes and behavior from the parent class, but any attributes that are defined in the child class are not available to the parent class. This may be obvious to many people, but it is worth stating. This also means a child class can override behavior of the parent class. If a child class defines a method that also appears in the parent class, objects of the child class will use the new method rather than the parent class method.

To better understand inheritance, let's look at an example of a class that is based on the Rocket class. In the file `Rocket_and_shuttle_classes.py`, you'll notice that the Shuttle class is defined with Rocket in parentheses after the class name, which makes Rocket its parent class.

The `__init__()` function of the new class needs to call the `__init__()` function of the parent class. The `__init__()` function of the new class needs to accept all of the parameters required to build an object from the parent class, and these parameters need to be passed to the `__init__()` function of the parent class. The `super().__init__()` function takes care of this (refer to the code in the class activity to see!).
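
A rough sketch of this pattern follows. The attributes and methods shown here (`x`, `y`, `move_rocket()`, `flights_completed`) are assumptions made for illustration, so the actual classes in `Rocket_and_shuttle_classes.py` may look different.

```python
class Rocket:
    """Hypothetical stand-in for the Rocket class from the class activity."""

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def move_rocket(self, x_increment=0, y_increment=1):
        self.x += x_increment
        self.y += y_increment


class Shuttle(Rocket):
    """Child class: accepts the parent's parameters plus one of its own."""

    def __init__(self, x=0, y=0, flights_completed=0):
        # Pass the parent's parameters up to Rocket.__init__()
        super().__init__(x, y)
        # Attribute that only Shuttle objects have
        self.flights_completed = flights_completed


shuttle = Shuttle(10, 0, flights_completed=3)
shuttle.move_rocket(y_increment=5)  # inherited from Rocket

print(shuttle.x, shuttle.y, shuttle.flights_completed)  # 10 5 3
```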

## 10. Modules<a class="anchor" id="bullet10"></a>
A Python __module__ is a file containing Python definitions and statements. A module can define functions, classes, and variables. A module can also include runnable code. Grouping related code into a module makes the code easier to understand and use. It also makes the code logically organized.

When you create classes or functions, you'll save them in a Python script, that is, a file with the `.py` extension, which we can also call a module. In the class activity, you will first need to create a module that holds a class. Then, we will import this module into another Python script to use the class and do a few exercises.

To import your module into another Python script/program, simply run the code below, where the names of the classes follow the `import` keyword and the name of the Python file follows the `from` keyword, in this case `Rocket_and_shuttle_classes.py` (notice we don't use the `.py` extension).

In [None]:
# Import rocket and shuttle class
from Rocket_and_shuttle_classes import Rocket
from Rocket_and_shuttle_classes import Shuttle
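
To try the import mechanics without the class activity file, the sketch below writes a tiny module to a temporary folder and imports from it. The module name `demo_rockets` and its contents are made up for this illustration. In practice you would simply save your `.py` file next to the script or notebook that imports it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway folder holding a small module file, demo_rockets.py
module_dir = Path(tempfile.mkdtemp())
(module_dir / "demo_rockets.py").write_text(
    "class Rocket:\n"
    "    def __init__(self, x=0, y=0):\n"
    "        self.x = x\n"
    "        self.y = y\n"
)

# Python looks for modules in the folders listed in sys.path
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

# Import one class by name (file name without the .py extension)
from demo_rockets import Rocket

# Or import the whole module under a shorter alias
import demo_rockets as dr

rocket = Rocket(1, 2)
print(rocket.x, rocket.y)    # 1 2
print(dr.Rocket is Rocket)   # True: both names refer to the same class
```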