# Python Fundamentals

## Contents

- [Setting Up Anaconda](#Setting-Up-Anaconda)
- [Running Code](#Running-Code)
- [Loading Packages](#Loading-Packages)
- [Working Directory and Path](#Working-Directory-and-Path)
- [Data Types](#Data-Types)
- [Functions](#Functions)
- [Classes and Objects](#Classes-and-Objects)
- [Collections](#Collections)
- [Slicing](#Slicing)
- [Conditional Statements](#Conditional-Statements)
- [Loops and List Comprehension](#Loops-and-List-Comprehension)

## Setting Up Anaconda


### Installing the Anaconda Python Distribution

A common way to install Python for analytics and data science is by installing **Anaconda**. Anaconda is a pre-packaged Python distribution that comes with Python and an additional set of packages for analytics and data science. See the file [Downloading and Installing Anaconda](Download_Anaconda_Instructions.docx) for details.

### Helpful Python Resources

* Official Python 3 Documentation: https://docs.python.org/3/
* Real Python: https://realpython.com/
* Full Stack Python: https://www.fullstackpython.com/best-python-resources.html
* Python Wiki: https://wiki.python.org/moin/BeginnersGuide/Programmers
* The Hitchhiker's Guide to Learning Python: https://docs.python-guide.org/intro/learning/
* Interactive Python Tutorials: https://www.learnpython.org/

## Running Code

### Getting started with Python
There are several different ways to interact with Python. You can run Python code directly from the command line, from a text editor (Sublime Text, Notepad++, Atom), in Jupyter notebooks like the one we're using here, or in an Integrated Development Environment, or IDE (Visual Studio, Spyder).

#### Running Python via Command Line
Let's walk through some examples with a demo! 

#### Running Python in Visual Studio
Let's again walk through some examples with a demo!

#### Running Python in Jupyter Notebooks

In Jupyter Notebooks (like this one), you can run Python code **interactively**, one cell at a time, by writing Python code and executing it in real time:

In [1]:
# Your first Python code!
# Below is a print function.
# Run this cell by clicking Run or pressing Shift+Enter to see the output.

print("hello!")

hello!


The **order** of statements matters. Python, generally speaking, will execute statements in a **sequential** order, line-by-line:

In [2]:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
for letter in letters:
    print(letter.upper())

A
B
C
D
E
F
G


Don't worry if you aren't familiar with the `for letter in letters:` construct here. We'll revisit this later.

**Variables** hold data so that you can give that data a name. You assign to a variable with `=`:

In [3]:
my_variable = 5

print(my_variable * 5)

25


**Case and capitalization matter**. Python is strict about these:

In [4]:
# This will cause an error, since the variable name is my_variable
print(My_Variable * 5)

NameError: name 'My_Variable' is not defined

**Indentation and spacing matter**. This is how Python allows you to express **logic** and run code according to **conditions** (more on the specifics of functions and conditional statements to come):

In [5]:
# Define a function called add_numbers that adds its two inputs

def add_numbers(first, second):
    return first + second

# Use the function we defined to determine if 5 + 6 > 10 
if add_numbers(5, 6) > 10:
    print("Sum is greater than 10")
else:
    print("Sum is not greater than 10")

Sum is greater than 10


## Loading Packages

A **module** is an individual `.py` file, such as `my_module.py`, that is a collection of functions, variables, and classes designed with a common theme or purpose.

The Python **standard library** includes some commonly-used modules:

- [os](https://docs.python.org/3/library/os.html): a portable way of using operating system-dependent functionality.
- [datetime](https://docs.python.org/3/library/datetime.html): supplies classes for manipulating dates and times.

A **library** or **package** is a collection of multiple modules that bundles up some type of functionality for other developers and data scientists to use. Commonly used data science packages include:

* **pandas**: DataFrame and Series manipulation for data management
* **NumPy**: Array/matrix manipulation and management (similar to MATLAB)
* **Seaborn**: High-level data visualization interface
* **Matplotlib**: Create and manipulate figures, visualizations, and plots

In [8]:
# load some packages:
import sys #Provides information on the runtime environment
import os #Provides information on your operating system

If a package has a long name, you can give it a shorter "alias" name that you can use throughout your code instead. Below, we use the alias "np" for Python package numpy:

In [6]:
import numpy as np #Now "numpy" will be referred to as "np" throughout the rest of your code.

These imports are put into your **namespace**. You can use **dot-notation** to reference the functions that they export. For instance, you can use the `getlogin()` function from the `os` module to find the active username:

In [31]:
os.getlogin()

'sarahansen'
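As a further sketch of aliases and dot-notation, the cell below imports the standard-library `math` and `datetime` modules. The alias `dt` and the variable names are illustrative choices, not conventions required by Python.

```python
import math
import datetime as dt  # "dt" is an illustrative alias

# Dot-notation: module_name.function_name(...)
root = math.sqrt(16)

# You can also import a single name directly from a module
from math import floor

rounded_down = floor(3.7)

# Classes inside a module are reached the same way
new_year = dt.date(2020, 1, 1)

print(root, rounded_down, new_year.year)
```

Importing a single name with `from ... import ...` saves typing, while importing the whole module keeps it clear where each function comes from.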

## Working Directory and Path

Python uses the concept of a working directory, as many other programs do. Every time we start Python, it is started "somewhere" on our computer's filesystem. Where Python starts depends on environment variables in your operating system and on how you launched it. It may be helpful to set the working directory for your session to wherever your project files are. Organize a directory like the structure below, then set the working directory to the top level:

`project1 --->
        --> Data
        --> Code
        --> Results
        --> Etc...`
        
So in this case, we would set the working directory to be the file location for project1. You should have downloaded a group of files for this class. You need to modify the working directory to be wherever you have downloaded the files to on your computer (on your filesystem). If you downloaded them to your 'documents' folder and named the folder 'python_course' (and your username is sarahansen), the file path might be this:

`c:\users\sarahansen\documents\python_course\`

Figure out where you have downloaded the folder to, and change the code below to assign the correct working directory. We use the **`os` module** to change the working directory.

- `os.getcwd()` returns the current working directory
- `os.chdir(path)` changes the current working directory to the given `path`

In [12]:
# Working Directory
print("My working directory:\n" + os.getcwd())
# Set Working Directory - CHANGE THIS CODE
os.chdir(r"C:\Users\sarahansen\Documents\python_course")
# Confirm it changed the working Directory
print("My working directory:\n" + os.getcwd())

My working directory:
C:\Users\sarahansen\Desktop
My working directory:
C:\Users\sarahansen\Documents\python_course
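Hard-coded paths differ between Windows, macOS, and Linux. A minimal sketch, assuming the `Data` subfolder from the example layout above, builds a path with `os.path.join()` so the correct separator is used on each system. The folder may not exist on your machine, so the code only checks for it.

```python
import os

# Start from wherever Python is currently running
project_dir = os.getcwd()

# Build a path to the (assumed) Data subfolder in a portable way
data_dir = os.path.join(project_dir, "Data")

print("Data folder path:", data_dir)
print("Does it exist?", os.path.isdir(data_dir))
```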


## Data Types

Data types are attributes of the data that tell us how that data is going to be used or interpreted by the program.

Important data types to know and understand:

- __String__: Represents data as character values. These are not just letters but can also include digits, e.g. "Apple" or "18"
- __Integer__: Whole number data type, e.g. 17
- __Float__: Decimal number data type, e.g. 17.7465 or 17.00
- __Boolean__: Data type representing `True` or `False`. Used for logic statements


Variables are containers for storing data values and are created using `=`.

In [1]:
string_variable = "hello"
integer_variable = 8
float_variable = 8.45
boolean_variable = False

In [30]:
#Boolean variables can be created by any "condition," or something that returns a true/false value.
print(boolean_variable)
boolean_variable = (2==2)
print(boolean_variable)

False
True


### Converting Between Types

Python objects have a **type**, as shown above.

This matters because **type determines the behavior of the object.** Using the `+` operator will behave differently for an integer (`int`) versus a string (`str`).

Using `+` with integers adds them:

In [72]:
a = 5
b = 7
type(a), type(b)

(int, int)

In [73]:
a + b

12

Whereas using `+` with strings **concatenates them** (joins them together):

In [74]:
a = "multiple"
b = "words"
type(a), type(b)

(str, str)

In [75]:
a + " " + b

'multiple words'

You can use Python's built-in functions to convert from one type to another and assign the result into a new variable.

In [1]:
user_answer = input("What year is it? Enter a number: ")

type(user_answer)

What year is it? Enter a number: 2020


str

Let's subtract the year 2000 from this input:

In [2]:
years_since_y2k = user_answer - 2000

TypeError: unsupported operand type(s) for -: 'str' and 'int'

Uh-oh. `input()` produces a `str`. How do we subtract an `int`? You can use `int()` to explicitly convert the type:

In [4]:
years_since_y2k = int(user_answer) - 2000
print(type(years_since_y2k))
print(years_since_y2k)

<class 'int'>
20
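Other built-in conversion functions work the same way. The sketch below uses `float()` and `str()`, and shows one hedged way to handle text that cannot be converted; the variable names are illustrative.

```python
# Convert text to a decimal number
price = float("19.99")

# Convert a number to text so it can be concatenated with other strings
label = "Year: " + str(2020)

# int() raises a ValueError if the text is not a whole number
try:
    bad_value = int("twenty")
except ValueError:
    bad_value = None  # fall back to None when conversion fails

print(price, label, bad_value)
```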


To print integers or other data types in the middle of a string, we can use a Python feature called "f-string formatting." If we place the letter `f` before a string and put the name of a variable in curly braces, Python will include the variable's value as part of the string:

In [5]:
print(f"It has been {years_since_y2k} years since Y2K. The computers still haven't broken!")

It has been 20 years since Y2K. The computers still haven't broken!
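F-strings can also evaluate expressions and format numbers inside the curly braces. A short sketch, using made-up values:

```python
name = "Ada"             # illustrative values
average_score = 91.4567

# Expressions are evaluated inside the braces
greeting = f"{name.upper()} has {len(name)} letters in her name."

# A format specifier after ':' controls the display; .2f means two decimals
summary = f"Average score: {average_score:.2f}"

print(greeting)
print(summary)
```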


## Functions

Functions in Python all follow a standard format. Below is an example of a function that prints the sum of `first_number` and `second_number`.

In [80]:
def sum_function(first_number, second_number):
    total =  first_number + second_number
    print(total)

x = 5
y = 6

sum_function(x, y)
sum_function(9, 14)

11
23


It looks daunting at first, but it actually breaks down pretty nicely. A function is composed of:

- The `def` keyword
- A function name (in the example, `sum_function`)
- Parameters passed in to be used in the function (in this case `first_number` and `second_number`)
- A colon (`:`) after the parameter list
- The body of the function containing the code to be executed

_Note: After the definition statement, the body of the function is always indented. This is how Python distinguishes what is and what is not part of the function_

The above example uses `print()` rather than a `return` statement. The two are different: the `print()` function displays the value of `total` on screen, while a `return` statement hands the object `total` back to the caller so that its value can be used elsewhere, including in other functions. This is where it helps to understand passing things around as objects, as in object-oriented programming.

What follows the function definition is the assignment of values to two variables, `x` and `y`. The function is then called by typing its name and passing the two variables as **arguments**, which are matched to the function's **parameters**.

In the above example, the function knows 'x' refers to 'first_number' positionally, and the same goes for 'y' and 'second_number'.

This is just a simple example function with two parameters. A function can have any number of parameters, and Python also provides catch-all syntax (`*args` and `**kwargs`) for capturing a variable number of arguments without naming each one, but that is a more advanced topic outside the scope of this course. Functions can also be called within one another, so a single line of code can chain several function calls together.
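To illustrate calling one function from within another, and passing arguments by name rather than by position, here is a minimal sketch. The functions `square` and `sum_of_squares` are hypothetical examples, not part of the course files.

```python
def square(number):
    """Return the number multiplied by itself."""
    return number * number


def sum_of_squares(first_number, second_number=0):
    """Add the squares of two numbers; second_number defaults to 0."""
    # One function calling another
    return square(first_number) + square(second_number)


# Positional arguments are matched in order
result_positional = sum_of_squares(3, 4)

# Keyword arguments are matched by name, so order does not matter
result_keyword = sum_of_squares(second_number=4, first_number=3)

# Omitting second_number uses its default value
result_default = sum_of_squares(3)

print(result_positional, result_keyword, result_default)
```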

### What's the Difference Between `print()` and `return` ?

One nice thing about functions is that they are **reusable**.

Another developer or end user can apply their own input values and use the output (return value) from the function that you define.

- Using `print()` will only, as its name implies, print a string to your screen. The function doesn't have a "result." (Technically, its result is `None`.)
- Using `return` will return a value from the function. You can call the function, grab the return value, and use it where you see fit.

In [83]:
def sum_function(first_number, second_number):
    total =  first_number + second_number
    # It is more common to *return* values from functions than print() them.
    return total

print(sum_function(5, 6))

11


Additional function examples in this tutorial will mostly use `return` rather than `print()` for this reason.
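To see why this matters, compare the two hypothetical functions below. Capturing the result of a function that only prints gives `None`, which cannot be used in later calculations.

```python
def print_sum(first_number, second_number):
    print(first_number + second_number)  # displays the value, returns nothing


def return_sum(first_number, second_number):
    return first_number + second_number  # hands the value back to the caller


printed = print_sum(5, 6)    # displays 11, but printed is None
returned = return_sum(5, 6)  # displays nothing, but returned is 11

print(printed)
print(returned + 1)
```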

### Practice: Creating Functions

Create a function that calculates the hypotenuse, `c`, of a right-angled triangle, given two inputs, `a` and `b`, that correspond to the two perpendicular sides of the triangle.

Let this function return the value of `c`.

**Hint**: The equation for calculating the hypotenuse (`c`) from inputs `a` and `b` is:

$$
c = \sqrt{a^2 + b^2}
$$

In [84]:
from math import sqrt

def pythagorean_function(a, b):
    pass  # Define me!

## Classes and Objects

#### Python is an object-oriented programming (OOP) language. What does that mean?
The OOP paradigm is centered around the concept of objects that represent real-world things (example: a person). Objects have two main characteristics: data (attributes) and behavior (methods). 

A **class** is the template from which objects are built, and an object is one particular instance of the class. 

For example: If **Person** represents a class, then **John Doe** could represent an object of that class.

- You define a class once.
- You create **instances** of a class multiple times. Each instance represents a distinct thing.

For example, say that **John Doe** defines the following function:

In [81]:
def longest_name(names) -> str:
    """Find longest name in a sequence of names."""
    # Check for an empty sequence first; names[0] would raise an IndexError
    if len(names) == 0:
        return ""
    longest = names[0]
    for n in names[1:]:
        if len(n) > len(longest):
            longest = n
    # Give the result value back to the caller of the function
    return longest

Now **Jane Doe** wants to _use_ the `longest_name()` function:

In [82]:
names = ("Ted", "Phoebe", "Mathangi", "Shana")

# Jane Doe is the *caller* of the longest_name() function.
# She assigns the *return value* to the variable `result`.
result = longest_name(names)

print(f"The longest name from the group is {result}.")

The longest name from the group is Mathangi.


##### Exercise: Let's create an object John Doe of class Person. 

John has the following attributes (data points):
- First name: John
- Last name: Doe
- Birthday: June 26, 1992
- Address: 1234 Main Street
- Phone number: 123-456-7891
- Email: `john.doe@example.com`

In [85]:
import datetime 

class Person:
    """Person is a *class* that stores information about a person.
    
    It can define *methods* to calculate or derive information
    about that person.
    """

    def __init__(self, name, surname, birthdate, address, telephone, email):
        self.name = name
        self.surname = surname
        self.birthdate = birthdate

        self.address = address
        self.telephone = telephone
        self.email = email

    def age(self) -> int:
        """Calculate Person's age in years, an integer (number)."""
        today = datetime.date.today()
        age = today.year - self.birthdate.year

        if (today.month, today.day) < (self.birthdate.month, self.birthdate.day):
            # Subtract 1 if this year's birthday has not happened yet
            age -= 1

        return age

    def full_name(self) -> str:
        """Combine name and surname into a single string."""
        return f"{self.name} {self.surname}"

In [86]:
# John Doe *is-a* Person.
# We can create multiple *people* from the *Person* class.

person = Person(
    "John",
    "Doe",
    datetime.date(1992, 6, 26), # year, month, day
    "1234 Main Street",
    "123-456-7891",
    "john.doe@example.com"
)

print(person.name)
print(person.email)
print(person.age())
print(person.full_name())

John
john.doe@example.com
28
John Doe


In [87]:
# Jane Doe *is-a* Person.
# We can create multiple *people* from the *Person* class.

other_person = Person(
    "Jane",
    "Doe",
    datetime.date(1990, 1, 23), # year, month, day
    "1234 Market Lane",
    "123-456-9999",
    "jane.doe@example.org"
)

print(other_person.name)
print(other_person.email)
print(other_person.age())
print(other_person.full_name())

Jane
jane.doe@example.org
30
Jane Doe


Above, you created 2 people from the `Person` class.
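To show that each instance keeps its own data, here is a smaller sketch using a hypothetical `BankAccount` class (not part of the course material). Changing one instance leaves the other untouched.

```python
class BankAccount:
    """A hypothetical class that tracks an owner and a balance."""

    def __init__(self, owner, balance=0):
        self.owner = owner      # attribute (data)
        self.balance = balance  # attribute (data)

    def deposit(self, amount):
        """Method (behavior): add money and return the new balance."""
        self.balance += amount
        return self.balance


# Two instances built from the same class
johns_account = BankAccount("John Doe", 100)
janes_account = BankAccount("Jane Doe")

johns_account.deposit(50)

print(johns_account.balance)
print(janes_account.balance)
```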

## Collections

What if we want to organize multiple Python objects into a single object? We call this a "collection" of objects. An example might be a list of customer IDs or a list of addresses.  
There are multiple collections (or containers) in Python we can use. We will go over the basics here:
- __List__: A sequence of values
- __Dictionary__: Key-value pair representation that **maps** keys to values
- __Tuple__: An **immutable** (unchangeable) sequence of objects

In [3]:
list_variable = [1, 2, 3]
dictionary_variable = {"brand": "Ford", "model": "Mustang", "year": 1964}
tuple_variable = ("tuples", "are", "not", "mutable")

Note that Python is zero-indexed, meaning the first element of any sequence (such as a list or tuple) is referenced as the "0th" element.

In [4]:
# Lists are mutable, meaning they can be changed/modified/edited/updated
print(list_variable)

#Change the first value of the list:
list_variable[0] = 5
list_variable

[1, 2, 3]


[5, 2, 3]

In [5]:
# Tuples are immutable; reassignment of items is not allowed
print(tuple_variable)
tuple_variable[0] = "mod"
tuple_variable

('tuples', 'are', 'not', 'mutable')


TypeError: 'tuple' object does not support item assignment

> **A note on tuples and immutability**: 
>
> - Tuples are immutable sequences in that they don't support item *reassignment*; `my_tup[5] = 10` will raise an error; you cannot _reassign_ the element at index 5 (the sixth element) of the tuple.
> - However, this does not necessarily mean that their member items are immutable; `f = (1, 2, [3, 4]); f[2][0] = -6` is valid because the item modified is a reference to a mutable `list`
> - None of the above means that you cannot reassign the _variable's_ value, such as `f = (1, 2, 3); f = [4, 5]`.

In [6]:
#Dictionaries store values that can be accessed with "keys"
print(dictionary_variable)
# Access value for key=brand
dictionary_variable["brand"]

{'brand': 'Ford', 'model': 'Mustang', 'year': 1964}


'Ford'
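Beyond reading and reassigning items, collections come with methods for adding and looking up data. A minimal sketch, reusing the variable names from the cells above with fresh values so it runs on its own:

```python
list_variable = [1, 2, 3]
dictionary_variable = {"brand": "Ford", "model": "Mustang", "year": 1964}
f = (1, 2, [3, 4])  # the tuple from the note on immutability

# Lists: add an item to the end and count the items
list_variable.append(4)
list_length = len(list_variable)

# Dictionaries: add a new key, then look up keys safely with .get()
dictionary_variable["color"] = "red"
color = dictionary_variable.get("color")
missing = dictionary_variable.get("price", "unknown")  # default if key is absent

# Tuples: items cannot be reassigned, but a list stored inside can change
f[2][0] = -6

print(list_variable, list_length)
print(color, missing)
print(f)
```

Using `.get()` with a default avoids the `KeyError` that `dictionary_variable["price"]` would raise for a missing key.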

## Slicing

#### Slicing (Indexing)

* We have seen this a few times; let's formally address it.
* We can slice (index) sequence-like objects such as lists, tuples, and strings in the same way:
  * Use `[]`. Python starts counting at 0, so `var[0]` is the first element
  * We can get a range (a slice) using `:`, for example `var[0:2]`
  * Slices are __start__ inclusive, __stop__ exclusive

In [22]:
# Slicing: first create a list of numbers
seq1 = list(range(1,11))
seq1

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [23]:
## Different slicing techniques
# first and second
seq1[0:2]
# Third to the end
seq1[2:]
# up to the 8th element
seq1[:8]
# only the 8th element
seq1[7:8]

[1, 2]

[3, 4, 5, 6, 7, 8, 9, 10]

[1, 2, 3, 4, 5, 6, 7, 8]

[8]

A slice takes up to three arguments separated by `:` characters: *start*, *stop*, *increment* 

So far, we are leaving the increment blank and therefore it is the default value of 1. 

*Start*/*Stop* can be negative, implying count from the end. We'll explore this below. *Increment* default value is 1, but it can be any integer (including negative values).

In [24]:
# Slice stepping 
print("Starting sequence:", seq1)

Starting sequence: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [25]:
# first from the end
seq1[-1]

10

In [26]:
# first to the 9th by 2
seq1[:9:2]

[1, 3, 5, 7, 9]

In [27]:
# 10th down to the 2nd by negative 2 (the stop index, 0, is excluded)
seq1[9:0:-2]

[10, 8, 6, 4, 2]

In [28]:
# why does this not print anything?
seq1[-2:-5]

[]

In [29]:
# Because of the step! (increment) -> counting backwards requires negative increment
seq1[-2:-5:-1]

[9, 8, 7]

In [30]:
# How would you Reverse the entire sequence? 
seq1[-1::-1]

[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
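The same slice syntax works on other sequence-like objects, such as strings and tuples. A short sketch with made-up values:

```python
word = "Python"
point = (10, 20, 30, 40)

first_three = word[0:3]     # characters at index 0, 1, 2
last_two = word[-2:]        # the final two characters
reversed_word = word[::-1]  # omit start and stop, step backwards by 1

middle = point[1:3]         # slicing a tuple returns a tuple

print(first_three, last_two, reversed_word, middle)
```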

## Conditional Statements

**Conditional statements** are the way in which a program makes decisions computationally.

This computational decision making follows a similar logic to everyday decision making. When you wake up in the morning and ask yourself "What am I going to have for breakfast?" (or, for some, "Am I going to have breakfast?"), you'll likely base your action on some criterion such as "I'm in the mood for something with cheese" or "I'm running late, so I'll just eat a big lunch." Decision making, in its simplest form, is a two-step process: make a selection based on criteria, then follow with action.

Decision making by a computer is done through code that is evaluated and executed conditionally. Here's a function that uses an `if` statement to determine if a number is an even or odd number.

In [88]:
def even_or_odd(number):
    # '%' is the modulo operator:
    # https://docs.python.org/3/reference/expressions.html#binary-arithmetic-operations
    mod = number % 2
    if mod > 0:
        return str(number) + " is an odd number."
    else:
        return str(number) + " is an even number."

In [89]:
even_or_odd(5)

'5 is an odd number.'

In [90]:
even_or_odd(16)

'16 is an even number.'

In [91]:
# What about 0?
even_or_odd(0)

'0 is an even number.'

But what happens when you put in a 0? The function tells you that 0 is an even number, which is mathematically correct because 0 is divisible by 2 with no remainder. Suppose, though, that we want to call out zero as a special case. This brings us to nested `if` statements. You can put `if` statements within `if` statements to do further conditional checks. Run the function below to see how the addition of a nested `if` statement expands the function's capability.

In [92]:
def even_or_odd(number):
    mod = number % 2
    if mod == 0:
        if number == 0:
            return str(number) + " is zero, which is an even number."
        else:
            return str(number) + " is an even number."
    else:
        return str(number) + " is an odd number."

In [93]:
even_or_odd(2)

'2 is an even number.'

In [94]:
even_or_odd(1)

'1 is an odd number.'

In [95]:
even_or_odd(0)

'0 is zero, which is an even number.'

Conditional statements are very powerful. Together with iterative statements (loops), they control the logical flow of a program and form the basis of most Python applications.
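When there are more than two outcomes, `elif` (short for "else if") avoids deep nesting. The function `describe_sign` below is a hypothetical example, not part of the course files:

```python
def describe_sign(number):
    """Return whether number is negative, zero, or positive."""
    if number < 0:
        return "negative"
    elif number == 0:
        return "zero"
    else:
        return "positive"


print(describe_sign(-3))
print(describe_sign(0))
print(describe_sign(7))
```

Python checks the conditions from top to bottom and runs only the first branch whose condition is true.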

## Loops and List Comprehension

#### For Loops

There are two distinct types of iterative statements often used in Python: the `for` loop and the `while` loop. `for` loops execute code by iterating over a sequence. They can iterate through lists, tuples, dictionaries, strings, and DataFrames. Run the function below, which uses a `for` loop to step through a list, multiply each value by 2, and store the results in a new list.

In [96]:
def double_each_value(numeric_list):
    new_doubled_list = []
    
    for x in numeric_list:
        new_doubled_list.append(x * 2)
        
    return new_doubled_list

In [97]:
numeric_list = [4, 7, 23, 8, 1, 86, 34, 76, 56, 34, 2, 8, 9]
double_each_value(numeric_list)

[8, 14, 46, 16, 2, 172, 68, 152, 112, 68, 4, 16, 18]

Following the above example, a `for` loop is composed of the following:

- The `for` keyword
- A loop variable that can be called anything (in this case it's `x`) followed by the `in` keyword
- A sequence to loop through (in this case it's `numeric_list`)
- The statement to be run on each iteration
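The same pattern applies to other sequences. A hedged sketch, using a small made-up dictionary and string, shows a `for` loop over each:

```python
car = {"brand": "Ford", "model": "Mustang"}

# Looping over a dictionary with .items() yields key-value pairs
descriptions = []
for key, value in car.items():
    descriptions.append(f"{key}: {value}")

# Looping over a string yields one character at a time
vowel_count = 0
for character in "fundamentals":
    if character in "aeiou":
        vowel_count += 1

print(descriptions)
print(vowel_count)
```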

`for` loops that build a list can often be written as list comprehensions. A comprehension is often more concise than the equivalent `for` loop, since it typically fits on one line, and it can include a conditional clause. Run the function below, which uses a list comprehension in place of the `for` loop from the previous function.

In [98]:
def double_each_value(numeric_list):
    return [x * 2 for x in numeric_list]

In [99]:
numeric_list = [4, 7, 23, 8, 1, 86, 34, 76, 56, 34, 2, 8, 9]
double_each_value(numeric_list)

[8, 14, 46, 16, 2, 172, 68, 152, 112, 68, 4, 16, 18]
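A comprehension can also filter items with an `if` clause. The sketch below, using a hypothetical function name, doubles only the even values in the list:

```python
def double_even_values(numeric_list):
    """Return a new list with each even value doubled; odd values are skipped."""
    return [x * 2 for x in numeric_list if x % 2 == 0]


numeric_list = [4, 7, 23, 8, 1, 86]
doubled_evens = double_even_values(numeric_list)
print(doubled_evens)
```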

#### While Loop

`while` loops iterate for as long as a specific condition is true, in contrast to the `for` loop, which runs once per item in a sequence. They are useful when a program needs to keep running until a specific goal (condition) is met. An example might be a counter variable: once the counter hits a specific point, the loop stops. The function below does just that. A loop like this could be used alongside other code as a simple stopping mechanism.

In [100]:
def while_loop(counter):

    while counter < 9:
        print(f'The count is: {counter}')
        counter += 1

    print("All done!")

counter = 6
while_loop(counter)

The count is: 6
The count is: 7
The count is: 8
All done!


Following the above example, a `while` loop is composed of:

- The `while` keyword
- The condition that keeps the loop running while it is true (i.e. `counter < 9`); the loop stops once the condition becomes false
- A colon, `:`
- The statements to be executed within the `while` loop

Whatever you come across, make sure that your `while` loop has an end condition that can be met! An infinite `while` loop can massively slow down your machine or application, and you may need to interrupt or restart the Jupyter kernel to stop it.
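One way to guard against an infinite loop is to cap the number of iterations. This is a minimal sketch; the function name and the `max_steps` limit are assumptions chosen for illustration.

```python
def halve_until_small(value, max_steps=100):
    """Halve value until it is below 1, stopping early if max_steps is reached."""
    steps = 0
    while value >= 1:
        if steps >= max_steps:
            break  # safety exit so the loop can never run forever
        value = value / 2
        steps += 1
    return steps


print(halve_until_small(10))
```

The `break` statement leaves the loop immediately, so even an input that never drops below 1 (such as infinity) stops after `max_steps` iterations.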