# Python Fundamentals

Prepared by: Gregory J. Bott, Ph.D.
(Designed to parallel our textbook, Python for Everybody by Charles Severance)


## Why should a business student learn Python?

Information is the lifeblood of nearly every organization. The purpose of this notebook is to help business students master the fundamental concepts and skills required to use Python effectively. Python skills are in high demand. One reason for this demand is Python's ability to efficiently acquire, manipulate, analyze, and visualize data. However, prior to performing data analytic tasks, business students must learn the fundamentals.

## Data Analysis is Part of *Every* Job
It's not just data scientists or data analysts who need analysis skills. Nearly every job intersects with data. Even if your job doesn't have "analyst" or "scientist" in the title, you'll very likely still benefit from understanding how to acquire, handle, manipulate, and report data.

> ### Deloitte: "...skills that were highly appreciated in Deloitte and projects were Java, Python/R..."


## Python skills are in high demand

2018 Developer Survey by StackOverflow

![](images/2018MostWantedLanguages.jpg)

## Programming Teaches Problem solving

The ability to think critically and solve problems is a general life skill. Problem solving applies to all facets of life. In this course you'll learn Python syntax and structures, but more importantly you'll learn to abstract a problem and code a solution.

## Python is the new Excel
Businesses rightly assume that you have solid Excel skills. However, the new expectation is that you already possess the skills necessary to handle data acquisition, analysis, and visualization. And although this can arguably still be done in Excel, Python's tools and libraries are often far more efficient.

Python is the new Excel. (see https://www.fincad.com/blog/python-new-excel)

## About the Python language
(Sources: Wikipedia, Dr. Nickolas K. Freeman)

>Python is an interpreted high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, and a syntax that allows programmers to express concepts in fewer lines of code, notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales.

>Python is a multi-paradigm programming language. Object-oriented programming and structured programming are fully supported, and many of its features support functional programming and aspect-oriented programming (including by metaprogramming and metaobjects (magic methods)). Many other paradigms are supported via extensions, including design by contract and logic programming.

>The language's core philosophy is summarized in the document The Zen of Python (PEP 20), which includes aphorisms such as:

> - Beautiful is better than ugly
> - Explicit is better than implicit
> - Simple is better than complex
> - Complex is better than complicated
> - Readability counts

> Rather than having all of its functionality built into its core, Python was designed to be highly extensible. This compact modularity has made it particularly popular as a means of adding programmable interfaces to existing applications. Van Rossum's vision of a small core language with a large standard library and easily extensible interpreter stemmed from his frustrations with ABC, another programming language that espoused the opposite approach.

> While offering choice in coding methodology, the Python philosophy rejects exuberant syntax (such as that of Perl) in favor of a simpler, less-cluttered grammar. As Alex Martelli put it: "To describe something as 'clever' is not considered a compliment in the Python culture." Python's philosophy rejects the Perl "there is more than one way to do it" approach to language design in favor of "there should be one—and preferably only one—obvious way to do it".

>Python's developers strive to avoid premature optimization, and reject patches to non-critical parts of CPython that would offer marginal increases in speed at the cost of clarity. When speed is important, a Python programmer can move time-critical functions to extension modules written in languages such as C, or use PyPy, a just-in-time compiler. Cython is also available, which translates a Python script into C and makes direct C-level API calls into the Python interpreter.

>An important goal of Python's developers is keeping it fun to use. This is reflected in the language's name—a tribute to the British comedy group Monty Python—and in occasionally playful approaches to tutorials and reference materials, such as examples that refer to spam and eggs (from a famous Monty Python sketch) instead of the standard foo and bar.

>A common neologism in the Python community is *pythonic*, which can have a wide range of meanings related to program style. To say that code is pythonic is to say that it uses Python idioms well, that it is natural or shows fluency in the language, that it conforms with Python's minimalist philosophy and emphasis on readability. In contrast, code that is difficult to understand or reads like a rough transcription from another programming language is called unpythonic.

>Users and admirers of Python, especially those considered knowledgeable or experienced, are often referred to as Pythonists, Pythonistas, and Pythoneers.

# What is a Program?

> ## A program is a sequence of instructions that specifies how to perform a computation. 

## Building Blocks of Nearly Every Language
* **input** - get data--from user via keyboard, from sensors, from other programs, from databases, etc.
* **output** - display results in the console on the screen, on paper, to another program, a web page, etc.
* **math** - perform mathematical operations (addition, multiplication, etc.)
* **conditional execution** - check for certain values or states and run the appropriate code
* **repetition** - repeatedly perform some action a certain number of times
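
As a rough sketch, the five building blocks listed above can appear together in just a few lines. The prices and the 100-dollar threshold are made-up values for illustration, and a hardcoded list stands in for keyboard input.

```python
# input: a hardcoded list stands in for data typed by a user or read from a file
prices = [19.99, 5.49, 102.50]   # hypothetical sample data

# repetition + math: add up the prices one at a time
total = 0.0
for price in prices:
    total = total + price

# conditional execution: choose a message based on the total
if total > 100:
    message = "Large order"
else:
    message = "Small order"

# output: display the result
print(message, round(total, 2))
```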

### Python is interpreted, not compiled.
Programming languages generally fall into one of two categories: Compiled or Interpreted. With a compiled language, code you enter is reduced to a set of machine-specific instructions before being saved as an executable file. With interpreted languages, the code is saved in the same format that you entered. Compiled programs generally run faster than interpreted ones because interpreted programs must be reduced to machine instructions at runtime. However, with an interpreted language you can do things that cannot be done in a compiled language. For example, interpreted programs can modify themselves by adding or changing functions at runtime. It is also usually easier to develop applications in an interpreted environment because you don't have to recompile your application each time you want to test a small section. (source: http://www.vanguardsw.com/dphelp4/dph00296.htm)
<br>

> ![image](images\ComplieGraphic2.jpg)

<br>

> ![image](images\InterpretedPython2.gif)



In [None]:
# cost = most recent landed cost
cost = 1.27

# Setting Up Your Environment
<a id="Setting_up_your_environment"> </a>



## Installing Anaconda
Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing, that aims to simplify package management and deployment. Package versions are managed by the package management system conda. (source: Wikipedia)

Watch a [video](https://vimeo.com/309189712)  that explains how to install Anaconda in a Windows environment.

## Start Jupyter Lab in specific directory

If you wish to control the starting folder (home folder) of Jupyter Lab, then follow these instructions.

1. Open Anaconda Prompt
2. Navigate to starting folder (e.g., an external drive, G:\).
3. Type jupyter lab and press ENTER
    * The Home Folder is the starting folder of the Anaconda prompt.

## Loading the TOC plugin for Jupyter Lab

### Install dependencies
1. Right-click the Anaconda prompt icon and then click Run as Administrator.
2. In the console window, type the following commands:
  * conda update conda
  * conda install nodejs
  * conda install npm
  * jupyter labextension install @jupyterlab/toc
1. Then to start Jupyter Lab, type:
  * jupyter lab --watch
  
  
  (Source: https://github.com/jupyterlab/jupyterlab-toc)




## Python Coding basics


### Indentation
In many languages, white space has little or no meaning. In Python, indentation defines blocks of code, and improper indentation raises an IndentationError (a type of syntax error).
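
A minimal sketch of the difference follows. The built-in compile() function is used here only so that the badly indented code can be checked without stopping the notebook; the variable names are illustrative.

```python
# Correct: the statement under the if is indented (four spaces by convention)
if 4 > 1:
    result = "4 is greater than 1"
print(result)

# Incorrect: compiling unindented code raises IndentationError,
# which is a subclass of SyntaxError
bad_code = "if 4 > 1:\nprint('4 is greater than 1')"
try:
    compile(bad_code, "<example>", "exec")
    error_name = None
except SyntaxError as err:
    error_name = type(err).__name__
print(error_name)
```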

### Comments

Make a habit of clearly commenting your code...even if the purpose of the code seems obvious. Even for code you have written yourself, it is often difficult to remember why you chose to implement something in a specific way. 

Start a comment using the hash (#) symbol.

# Variables, Expressions, and Types

Python is a *dynamically typed* language. A programming language is said to be dynamically typed, or just 'dynamic', when the majority of its type checking is performed at run-time as opposed to at compile-time. 

In [None]:
#Store a string in a
a = "apple"
print(a, type(a))

#Store a float in a
a = 3.141
print(a, type(a))

#Store a list in a
a = ["apple", 3.141, "banana"]
print(a, type(a))

In [None]:
# Error: statements following an if statement must be indented
if 4 > 1:
print("4 is greater than 1")

## Variables in Python
Variables store values. Variable names can be as long as you want and can contain letters, numbers, and underscores, but they must not begin with a number or be a Python keyword (e.g., True, for, from, lambda).

> **Python is case-sensitive.** unit_cost is not the same as Unit_cost. 

By convention variable names are lower case and use the underscore character to separate words. 

earnings_after_tax <br>
default_gateway
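
One way to check a name against these rules is sketched below, using the standard keyword module and the str.isidentifier() method. The candidate names are only examples.

```python
import keyword

candidate_names = ["unit_cost", "76trombones", "lambda", "earnings_after_tax"]

for name in candidate_names:
    # str.isidentifier() checks the characters; keyword.iskeyword() checks reserved words
    is_valid = name.isidentifier() and not keyword.iskeyword(name)
    print(name, is_valid)

valid_names = [n for n in candidate_names
               if n.isidentifier() and not keyword.iskeyword(n)]
```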

In [None]:
# Variables are case-sensitive
Fruit = "apple"
print(fruit)


In [None]:
# illegal -- must not start with a number
76trombones = 0

In [None]:
although_difficult_to_use_this_is_a_valid_variable = 1

## Basic Data Types in Python

### Integers
---
A number with no fractional part. 

![image](\images\int-number-line.svg)

#### Includes: 
* the counting numbers {1, 2, 3, ...}, 
* zero {0}, 
* and the negative of the counting numbers {-1, -2, -3, ...}

We can write them all down like this: {..., -3, -2, -1, 0, 1, 2, 3, ...}

Examples of integers: -16, -3, 0, 1, 198

Integer size is limited only by your machine's available memory.

In [None]:
bigInt = 1234568901234568901234568901234568901234564568901234568901234568901234568901234568901234568901234568900

print(bigInt + 1)

print(type(bigInt))

### Float type
* Platform dependent
* Typically equivalent to IEEE754 64-bit C double
* Smallest positive normalized float is approximately 2.225 x 10^-308
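
The short sketch below shows where these limits can be inspected and one practical consequence of binary floating point. It assumes the typical IEEE 754 double described above.

```python
import math
import sys

# Smallest positive normalized float on this platform
print(sys.float_info.min)

# Binary floats cannot represent 0.1 exactly, so small rounding errors appear
total = 0.1 + 0.2
print(total == 0.3)              # False on IEEE 754 doubles
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead
```

When comparing currency or other decimal amounts, comparing with a tolerance (or rounding first) avoids surprises like the one above.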

In [None]:
type(1.0)

In [None]:
b = 2

print("b = {} and is type {}".format(b, type(b)))

In [None]:
b = 2 * 1.1

print("b = {} and is type {}".format(b, type(b)))

### Boolean type
---

In [None]:
is_fte = 1

# Boolean values indicate True or False and must be title-case
is_fte == True


# Reminder: do comparisons with double = sign (x == 5)
#    single = is assignment, let x = 5.
is_fte == False


In [None]:
# Error: Boolean values must be capitalized (True, not true or TRUE); true is an undefined name
if is_fte == true:
    print("Full-time employee")


### String Type
---
A string is a sequence of characters. 

In [None]:
s = 'Monty Python'

# use len() function to get the length of a string
print(len(s))

# Print part of a string, a slice
print(s[0:5])

print(s[6])

In [None]:
#Strings are immutable (can't change Python to Jython)
s[6] = "J"

### None type

Languages such as Java and C++ provide a null value (null in Java, nullptr in C++). Null means empty. It is not equivalent to a zero-length string, nor is it equivalent to zero (0). In Python, the keyword None is the equivalent of null. None is the only value of the type NoneType, and it is not equivalent to the string "None". In my humble opinion, None is more logical than null. None means the object is nothing, non-existent. 

#### Why use None?

When instantiating (creating) an object, you may need to check whether the instantiation was successful. A common pattern is to set a variable to None first, attempt to create the object, and then check whether the variable is still None.
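
None also appears as a "nothing found" result. As a small sketch, the dictionary method get() returns None when a key is missing. The product codes and prices are hypothetical.

```python
# dict.get() returns None when the key is missing
prices = {"A100": 1.27, "B200": 3.50}

price = prices.get("C300")

# Use "is None" rather than "== None" to test for None
if price is None:
    status = "No price on file"
else:
    status = "Price found"
print(status)
```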

---

In [None]:
print(None == "None")
print(type(None))

In [None]:
#Source: https://www.pythoncentral.io/python-null-equivalent-none/
database_connection = None
 
# Try to connect (none of the variables for the connect have values...)
try:
    database = MyDatabase(db_host, db_user, db_password, db_database)
    database_connection = database.connect()
except Exception:
    # MyDatabase and the db_* variables are not defined, so the attempt fails
    pass
 
if database_connection is None:
    print('The database could not connect')
else:
    print('The database could connect')

### Complex numbers
---
A Complex Number is a combination of a Real Number and an Imaginary Number. [1]

![image](\images\complex-example.svg)

   

In [None]:
print("3j is of type: " + str(type(3j)))
print(7 + 3j)

   
The "unit" imaginary number (like 1 for Real Numbers) is i, which is the square root of −1.   

![image](\images\imaginary-square-root.svg)

> **Except in Python, "j" is used instead of "i".**

In [None]:
1j * 1j == -1

## Converting values between types

Often you may need to convert values from one type to another. For example, you may need to convert the values received from the input() function from a string to an int or a float.
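
A minimal sketch of the most common conversions is shown here. Hardcoded strings stand in for what input() would return, and the values are made up.

```python
quantity_text = "12"      # e.g., what input() would return
price_text = "3.50"

quantity = int(quantity_text)     # str -> int
price = float(price_text)         # str -> float
total = quantity * price
label = "Total: " + str(total)    # float -> str for concatenation

print(type(quantity), type(price))
print(label)
```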

In [None]:
user_number = input("Enter a number and I'll tell you if it is even or odd:\n")
print(type(int(user_number)))


## Mandelbrot set
What exactly is a Mandelbrot set?
The term Mandelbrot set is used to refer both to a general class of fractal sets and to a particular instance of such a set. In general, a Mandelbrot set marks the set of points in the complex plane such that the corresponding Julia set is connected and not computable. (source: mathworld.wolfram.net)

![image](\images\220px-Mandelbrot_sequence_new.gif)

In [None]:
#Source: https://gist.github.com/jfpuget/60e07a82dece69b011bb -- Jean-François Puget

import numpy as np
from matplotlib import pyplot as plt
from matplotlib import colors
%matplotlib inline 


def mandelbrot_image(xmin,xmax,ymin,ymax,width=12,height=12,maxiter=80,cmap='hot'):
    dpi = 72
    img_width = dpi * width
    img_height = dpi * height
    x,y,z = mandelbrot_set(xmin,xmax,ymin,ymax,img_width,img_height,maxiter)
    
    fig, ax = plt.subplots(figsize=(width, height),dpi=72)
    ticks = np.arange(0,img_width,3*dpi)
    x_ticks = xmin + (xmax-xmin)*ticks/img_width
    plt.xticks(ticks, x_ticks)
    y_ticks = ymin + (ymax-ymin)*ticks/img_width
    plt.yticks(ticks, y_ticks)
    
    norm = colors.PowerNorm(0.3)
    ax.imshow(z.T,cmap=cmap,origin='lower',norm=norm)
    
def mandelbrot(c,maxiter):
    z = c
    for n in range(maxiter):
        if abs(z) > 2:
            return n
        z = z*z + c
    return 0

def mandelbrot_set(xmin,xmax,ymin,ymax,width,height,maxiter):
    r1 = np.linspace(xmin, xmax, width)
    r2 = np.linspace(ymin, ymax, height)
    n3 = np.empty((width,height))
    for i in range(width):
        for j in range(height):
            n3[i,j] = mandelbrot(r1[i] + 1j*r2[j],maxiter)
    return (r1,r2,n3)

mandelbrot_image(-2.0,0.5,-1.25,1.25,maxiter=80,cmap='gnuplot2')

## Order of Operations
<a id="Setting_up_your_environment"> </a>

The order of evaluation of expressions with more than one operator follows *rules of precedence* -- PEMDAS

* **Parentheses**
* **Exponentiation**
* **Multiplication and Division**
* **Addition and Subtraction**
* **Left to Right** - operators with the same precedence are evaluated left to right (the exception is exponentiation, which is evaluated right to left)
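
A few expressions, chosen only for illustration, show each rule in action:

```python
print(2 + 3 * 4)      # multiplication first: 2 + 12 = 14
print((2 + 3) * 4)    # parentheses first: 5 * 4 = 20
print(3 * 2 ** 3)     # exponentiation first: 3 * 8 = 24
print(10 - 4 - 3)     # left to right: (10 - 4) - 3 = 3
print(2 ** 3 ** 2)    # exponentiation groups right to left: 2 ** 9 = 512
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.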

In [None]:
# Exponentiation, then Multiplication
3*1**3

## Modulus operator
<a id="modulus_operator"> </a>

In computing, the modulo operation finds the remainder after division of one number by another (sometimes called modulus). Given two positive numbers, a (the dividend) and n (the divisor), a modulo n (abbreviated as a mod n) is the remainder of the Euclidean division of a by n.


In [None]:
# Divide 7 by 3
7/3

In [None]:
# Return the quotient
7//3

In [None]:
# Return the remainder
7 % 3
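
The quotient and remainder are often needed together. As a sketch, the built-in divmod() function returns both at once; the minutes value is hypothetical.

```python
total_minutes = 135

# divmod() returns the quotient and the remainder in one call
hours, minutes = divmod(total_minutes, 60)
print(hours, minutes)

# The same results using // and %
print(total_minutes // 60, total_minutes % 60)

# A remainder of 0 means the number divides evenly
is_even = total_minutes % 2 == 0
print(is_even)
```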

## Assignment vs. Comparison
---
### In Python, assignment of a value to a variable is accomplished using a single equal sign.  


In [None]:
x = 7
print(x)
x = 1000
print(x)

### Comparison is performed using a double equal sign.

In [None]:
# Must use a double equal sign to compare values
if x == 7:
    print("Lucky Seven")
else:
    print("You lose!")

In [None]:
# Error when adding string and integer
userinput = "5"

total = 7 + userinput
print(total)

In [None]:
# Convert the string to an integer before adding
total = 7 + int(userinput)
print("total = {}".format(total))

Style guide - http://www.voidspace.org.uk/python/articles/python_style_guide.shtml


[1]: Source: https://www.mathsisfun.com

# Conditional execution

In Python, use the if statement to perform decision-making by allowing conditional execution of a statement or group of statements based on the value of an expression.

In [None]:
CardTotal = 20
if CardTotal > 21:
    print("busted!")

You can use an if statement to execute a set of statements based on whether the value of a variable is even or odd.

![image](\images\if-then-elselogic.jpg)

The basic if statement form:

if expr: <br>
&nbsp;&nbsp;&nbsp;&nbsp;statement

In [None]:
x = 9

if x%2 == 0:
    print('x is even')
else:
    print('x is odd')

## Conditionals with multiple expressions


In [None]:
shave = True
haircut = True

if shave and haircut:
    print("You know the secret knock!")
else:
    print("You're not one of us.")

In [None]:
#BOTH statements must be true to satisfy the statement and print True
if 1 < 10 and -2 > -7:
    print(True)
else:
    print(False)

In [None]:
#Only ONE expression must be true to satisfy the statement and print True
if 100 < 10 or -2 > -7:
    print(True)
else:
    print(False)

## Chained Conditionals

If more than two possibilities exist, one way to programmatically express this is using elif.

In [None]:
x = 2
y = 2

if x > y:
    print("x is greater than y")
elif y > x:
    print("y is greater than x")
else:
    print("x and y are equal")

## Nested Conditionals

You can also nest one conditional inside another conditional. Consider the previous example:

In [None]:
if x == y:
    print("x and y are equal")
else:
    if x < y:
        print("x is less than y")
    else:
        print("x is greater than y")

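> ### No Switch or Select Statement in Python
> Python has no switch or select statement, although Python 3.10 added the match statement for structural pattern matching. In some cases a dictionary can be used in place of a switch statement. 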
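
A minimal sketch of the dictionary approach follows. The region codes, rates, default value, and function name are all hypothetical.

```python
# Shipping rates keyed by region code
shipping_rates = {
    "US": 5.00,
    "CA": 8.50,
    "MX": 9.25,
}

def get_shipping_rate(region):
    """Return the rate for a region, or a default rate if the region is unknown."""
    return shipping_rates.get(region, 15.00)

print(get_shipping_rate("CA"))
print(get_shipping_rate("FR"))
```

The second argument to get() plays the role that a default case would play in a switch statement.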

## Grouping comparison operators
Comparison operators can be grouped.

In [None]:
x = 111
if 0 < x < 10:
    print("x is a positive single-digit number")
elif x < 0:
    print("x is a negative number")
elif 10 <= x < 100:
    print("x is a positive two-digit number")
else:
    print("x is zero or has three or more digits")

In [None]:
x = 1
y = 2
z = 6

# The entire expression must be true to print values
if x < y < z:
    print(x)
    print(y)
    print(z)

## Catching exceptions using try and except

Robust programs anticipate and gracefully handle unexpected situations and errors. For example, when asking a user to input a number, a robust program gracefully handles unexpected or erroneous input. Other examples include attempting to open a file or connect to a database.
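
As a small illustration of the first case, the sketch below converts text to a whole number and handles bad input without crashing. The function name and sample entries are hypothetical; a list of strings stands in for values typed by a user.

```python
def parse_quantity(text):
    """Return text as an int, or None if it is not a whole number."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for input such as "ten" or "3.5"
        return None

for entry in ["12", "ten", "3.5"]:
    quantity = parse_quantity(entry)
    if quantity is None:
        print(f"'{entry}' is not a valid whole number")
    else:
        print(f"Quantity: {quantity}")
```

Catching the specific exception (ValueError) rather than every possible error keeps unrelated bugs from being silently hidden.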

In [None]:
#Source: https://www.pythoncentral.io/python-null-equivalent-none/
database_connection = None
 
# Try to connect (none of the variables for the connect have values...)
try:
    database = MyDatabase(db_host, db_user, db_password, db_database)
    database_connection = database.connect()
except Exception:
    # MyDatabase and the db_* variables are not defined, so the attempt fails
    pass
 
if database_connection is None:
    print('The database could not connect')
else:
    print('The database could connect')

## Operators and Operands
<a id="operators_and_operands"> </a>
Operators are special symbols that represent computations like addition and multiplication. The values the operator is applied to are called operands.
The operators +, -, *, /, and ** perform addition, subtraction, multiplication, division, and exponentiation, as in the following examples:


In [None]:
#Addition and subtraction
20+33-10

In [None]:
# Five squared
5**2

In [None]:
# Multiplication
(3+2)*(9+2)

In [None]:
# Division
100/25

# Iteration
Computers are very good at doing repetitive tasks. You will use iteration for many operations in Python. For example, you may loop through records in a database or examine lines in a text file. 

Two methods for iterating are the while statement and the for loop.

## While statement


In [None]:
n = 0
while n < 7:
    print("day " + str(n))
    n = n + 1

## For Loop

In [None]:
# Range function (see "Functions" section for more information about functions)
# range(start_value, end_value, step_value)

for x in range(0,11):
    print(x, end=",")

In [None]:
# To count down, set the step value to negative
for x in range(10,0,-1):
    print(x, end=" ")

Adding a counter variable

In [None]:
#Define z. Set it to 0.
z = 0

#Loop ten times (1 through 10)
for y in range(1,11):

    #Increment z by 1 during each loop. 
    z += 1 # This is shorthand for z = z + 1
    
    print(z)

# Functions
A function is a discrete set of instructions typically designed to receive one or more values and return a value. A function call receives values called "arguments" and it "returns" a value. 

## Built-in functions
The print() function takes an argument and sends output to the console.

The type() function takes a value or object and returns its type.

In [None]:
# Print and Type functions 

print(type(3.141))

In [None]:
# Print the highest value letter
print(max("Hello world"))
print(f"w= {ord('w')}")
print(f"o= {ord('o')}")


In [None]:
# Display the lowest value
min("Hello world")

In [None]:
#Display the number of characters
len('Hello world')

## Getting User Input

To get input from the user Python provides a built-in function **input** that captures input from keyboard as a string.

In [None]:
# Get card total from user and store in card_total 
card_total = input("Card total?")
print("card_total type is: {}".format(type(card_total)))

In [None]:
# Error -- card_total is a str, and a str cannot be compared with an int
if card_total > 21:
    print("Busted")
else:
    print("Hit me")

In [None]:
# Must cast to appropriate value type (int)
if int(card_total) > 21:
    print("Busted")
else:
    print("Hit me")

## Type Conversion Functions

Python includes functions to convert values from one data type to another. 

For example, when requesting a number value from a user you may need to convert the resulting string input to a number type such as int.

In [None]:
# Input values are strings. Convert strings to the appropriate number type, if necessary.
# Entering a decimal value (e.g., 31.5) raises a ValueError because int() cannot parse it.

tirepressure = int(input("Input current tire pressure:"))

if tirepressure < 32:
    print("Add air to tire.")
else:
    print(f"At {tirepressure} psi the tire does not require additional air pressure.")

## Misc Functions
Below are common functions and explanations of how they work and when you might use them.

### Generating random numbers
To generate random numbers, use the random module. Note that this module is not designed for cryptographic use. 

In [None]:
import random

# Print 10 numbers between 1 and 100 (inclusive)
for x in range(10):
    print(x, random.randint(1, 100))
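
Two related techniques are sketched below: seeding the generator so results can be repeated, and using the standard library's secrets module when a value must be unpredictable (for example, a token or temporary password). The seed value 42 is arbitrary.

```python
import random
import secrets

# Seeding makes the pseudo-random sequence repeatable, which helps when testing
random.seed(42)
first_run = [random.randint(1, 100) for _ in range(5)]
random.seed(42)
second_run = [random.randint(1, 100) for _ in range(5)]
print(first_run == second_run)

# secrets draws from the operating system's secure random source
token = secrets.token_hex(16)   # 16 random bytes as 32 hexadecimal characters
print(len(token))
```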

## Creating your own Functions
Use the def keyword to define custom functions. Empty parentheses following the function name indicate the function takes no arguments.

In [None]:
def print_lyrics():
    """ Prints lumberjack lyrics! """
    print("I'm a lumberjack and I'm okay.")
    
def repeat_lyrics():
    print_lyrics()
    print_lyrics()
    
repeat_lyrics()

## Docstrings
Docstrings (documentation strings) provide a helpful and convenient method of
displaying documentation with Python modules, functions, classes, and methods. 

An object's docstring is defined by including a string constant as the first
statement in the object's definition and can be viewed by calling help(function).

In [None]:
help(print_lyrics)
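
The docstring is also stored on the object itself, in its __doc__ attribute. The function below is a hypothetical example written only to show this.

```python
def calc_margin(price, cost):
    """Return the gross margin as a fraction of price."""
    return (price - cost) / price

# help(calc_margin) displays this text; __doc__ holds the raw string
print(calc_margin.__doc__)
print(calc_margin(2.00, 1.27))
```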

## Passing values
Functions defined with arguments accept values. 

In [None]:
def print_stuff(mystuff):
    """ Prints the string passed to it. """
    print(mystuff)
    
newstuff = "really cool stuff"
print_stuff(newstuff)

## Accessing functions in modules

One of the strengths of the Python language is the large number of modules available to it. To add functionality to your program, you make modules available using the import keyword. Below we import the math module.


In [None]:
# Get volume of a sphere using radius (r)

import math

def get_sphere_volume(r):
    """Returns volume of a sphere given the radius (r)."""
    volume = (4/3) * math.pi * r**3
    return volume

#Call the function to find the volume of a sphere with a radius of 2.
get_sphere_volume(2)



In [None]:
#What other functions are available in the math module? Use the dir() function to list a directory of math attributes.
dir(math)

## Void and Return Functions
PY4E calls functions that return values "fruitful." Functions that do not return values are void functions.

In [None]:
def addtwo(a,b):
    """ Returns the sum of two numbers."""
    added = a + b
    return added

x = addtwo(7,6)
print(x)
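
By contrast, a void function performs an action but has no return statement. As a sketch (the function name is hypothetical), assigning its result to a variable shows that Python returns None in that case.

```python
def print_total(a, b):
    """Print the sum of two numbers; there is no return statement."""
    print(a + b)

result = print_total(7, 6)   # prints 13
print(result)                # None: a void function returns None implicitly
```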

In [None]:
help(addtwo)

## Currency and date formatting
Because currency and date formats vary by locale, a recommended way of formatting currency and dates is to use the locale module. This module accesses the locale of your current system and applies it to format values.

In [None]:
import locale
import datetime

#Sets locale for all categories to the user's default setting
locale.setlocale(locale.LC_ALL, '')

#To add commas, set grouping = True
print(locale.currency(100000.55977, grouping=True))

today = datetime.date.today()
print(today)

In [None]:
dir(datetime)

In [None]:
#Source: https://www.programiz.com/python-programming/datetime#datetime
now = datetime.datetime.now() # current date and time

year = now.strftime("%Y")
print("year:", year)

month = now.strftime("%m")
print("month:", month)

day = now.strftime("%d")
print("day:", day)

time = now.strftime("%H:%M:%S")
print("time:", time)

date_time = now.strftime("%m/%d/%Y, %H:%M:%S")
print("date and time:",date_time)	

In [None]:
#Source: https://www.programiz.com/python-programming/datetime#datetime

from datetime import datetime, date

t1 = date(year = 2018, month = 7, day = 12)
t2 = date(year = 2017, month = 12, day = 23)
t3 = t1 - t2
print("t3 =", t3)

t4 = datetime(year = 2019, month = 1, day = 12, hour = 7, minute = 9, second = 33)
t5 = datetime(year = 2019, month = 12, day = 25, hour = 5, minute = 55, second = 13)
t6 = t5 - t4
print("t6 =", t6)

print("type of t3 =", type(t3)) 
print("type of t6 =", type(t6))

## Using Keyword (default) and required arguments

In [None]:
# Function using keyword and default arguments
def calc_tip(amount, percentage = .15):
    """ Calculate a tip based on an amount. 15% is default. """
    tip = amount * percentage
    return tip

print(calc_tip(10))

print(calc_tip(20, percentage = .25))

# String Operations
Text in Python is represented by a string. A string is a sequence of characters. It is a derived data type. Strings are immutable. This means that once defined, they cannot be changed.

You can access characters one at a time using the bracket [] operator.

You may use single, double, or triple quotes. Use double quotes when a string contains an apostrophe, and use triple quotes when it contains both apostrophes and double quotation marks.


## Using single, double, and triple quotes

In [None]:
#Double quotes specify a string.
statement = "I'm a Python programmer."

#Single quotes also specify a string. Triple quotes, too.
howdy = 'hello, world!'
print(howdy)

#To print quotes, you can use the escape character (\)
as_good_as_it_gets = 'Sell crazy someplace else. We\'re all stocked up here.'
print(as_good_as_it_gets)

#Triple quotes are helpful when you want to display single or double quotes within a string without using an escape character.
cannoli = """Clemenza said, "Leave the gun, take the cannoli." It's one of my 'fav' movie quotes.""" 
print(cannoli)




## String Capitalization

In [None]:
# Capitalize the first letter of the string
print(howdy.capitalize())

In [None]:
# Capitalize each word
print(howdy.title())
print("this is title case".title())

In [None]:
# Convert all letters to uppercase
print(howdy.upper())
print("this is all caps".upper())

#Use upper() to compare strings ignoring case
myfavfruit = "Kiwi"

if myfavfruit.upper() == "KIWI":
    print("That's my fav!")
else:
    print("Not my fav")

In [None]:
book_title = "THE UNOFFICIAL GUIDE TO ETHICAL HACKING"

# Lowercase
print(book_title.lower())

In [None]:
# Store the string "banana" in the favorite_fruit variable. 
favorite_fruit = "banana"

# A string is essentially an array of characters and you may access them like you would an array.
print(favorite_fruit)
print(favorite_fruit[0])
print(favorite_fruit[0:2])

In [None]:
# Strings are immutable. Unlike a typical array, you may NOT modify the items (characters) in the array.
# Rather than changing the first letter from 'b' to 'B', this code results in an error.
favorite_fruit[0] = 'B'

In [None]:
print(favorite_fruit)

# You can, however, replace the string value (e.g., "banana") with something else (e.g., "apple")
favorite_fruit = "apple"

print(favorite_fruit)

## String Concatenation

Use the '+' operator to join strings.

In [None]:
first_name = "Gregory"
middle_initial = "J"
last_name = "Bott"

full_name = first_name + " " + middle_initial + " " + last_name + ", Ph.D."
print(full_name)

## Removing white space
A common task when working with data is to remove white space (spaces, tabs, newlines) from the beginning and end of a string. To remove white space, use the **strip** method.

In [None]:
# value is preceded by three tabs and followed by a line break
data_column = "\t\t\t 133422.88\n"
print(str(data_column) + "[end]")

#tabs and the line break are stripped from the string
print(data_column.strip() + "[end]")

## Format Operator
To substitute values from variables or functions into a string, use the *format operator* %. 

Do not confuse the format operator with the modulus operator. In the expression 4 % 2, which evaluates to 0, % is the modulus operator. 

When % appears between numbers, it is the modulus operator. When the left operand is a string, % is the *format operator*, and conversion types within the string mark where values are inserted:
<br>%d = signed integer decimal
<br>%s = string
<br>%f = float

For more conversion types go to https://docs.python.org/3/library/stdtypes.html#old-string-formatting
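
A conversion type can also carry a precision. The sketch below, with made-up values, uses %.2f to show a float with two decimal places.

```python
item = "widget"
quantity = 3
unit_price = 4.5

# %s inserts a string, %d an integer, and %.2f a float rounded to two decimal places
line = "%d x %s at $%.2f each" % (quantity, item, unit_price)
print(line)

# A single value does not need a tuple
total_line = "Total: $%.2f" % (quantity * unit_price)
print(total_line)
```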

In [None]:
b_of_b_on_wall = 99
beverage = "beer"

for bottle_num in range(b_of_b_on_wall, 0, -1):
    print("%d bottles of %s on the wall, %d bottles of %s." % (bottle_num, beverage, bottle_num, beverage))
    print("Take one down and pass it around, %d bottles of %s on the wall." % (bottle_num - 1, beverage))
    

## Using str.format()
The format operator is a good option, but when you have multiple placeholders in a string, code becomes less readable. 

One advantage of the str.format() method is that you can use the replacement fields in any order. Simply use their index values.

In [None]:
name = "Greg"
age = "82"

print("Hello, {}. You are {}.".format(name, age))

In [None]:
name = "Greg"
age = "82"

#Reference index to use out of sequence order.
print("Hello, {1}. You are {0}.".format(age, name))

In [None]:
# Use dictionary values
person = {'name': 'Greg', 'age': 82}
print("Hello, {name}. You are {age}.".format(**person))

## Using f-Strings
Beginning with Python 3.6, you can use f-strings ("formatted string literals"). The syntax for f-strings is similar to str.format() but results in more readable code.

In [None]:
name = "Greg"
age = "82"

print(f"Hello, {name}. You are {age}.")
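
f-strings can also evaluate expressions and apply a format specification inside the braces. The sketch below assumes a made-up `balance` value; the `:,.2f` specification adds thousands separators and keeps two decimal places.

```python
# Hypothetical values used only for illustration
name = "Greg"
balance = 1234567.891

# Expressions are evaluated inside the braces; ":,.2f" formats the number
message = f"{name.upper()} has a balance of ${balance:,.2f}."
print(message)  # GREG has a balance of $1,234,567.89.
```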

## Splitting and Joining strings

In [None]:
# Below is a database record exported using the pipe symbol ("|") to separate fields
exported_record = "Quin J. Alford|Proin Company|Ap #664-5782 Felis St.|Butte|35565|MT|-72.72653, -167.07764|4716 4071 8086 1415|436|eu@pellentesque.net"
print("original data:")
print(exported_record)
print()

#Split the data at each pipe symbol. The result of the split() method is a Python list (essentially an array)
exported_record = exported_record.split("|")
print("converted to a list: ")
print(exported_record)
print()

In [None]:
#Print the first and last member of the list. Access the first element (element 0), and the last element (-1).
#The second to last element would be accessed using [-2].
print("Name: " + exported_record[0], "   email: " + exported_record[-1] +"\n")

#Join the list items with commas and store the result in exported_record_csv
exported_record_csv = ",".join(exported_record)
print(exported_record_csv)

## Find() method for strings

In [None]:
gburg_text = "Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal."

#Find first instance of a word, find(value, start default = 0, end default = end of string)
find_pos = gburg_text.find("score")
print(find_pos)
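
Two details of find() are worth seeing in action: it returns -1 when the value is not present, and the optional start argument lets you search past an earlier match. A minimal sketch using a shortened version of the text above:

```python
text = "Four score and seven years ago"

# find() returns the index of the first match, or -1 when there is no match
first_and = text.find("and")
missing = text.find("Python")

# The optional second argument tells find() where to start searching,
# so this locates the second "e" in the text
second_e = text.find("e", text.find("e") + 1)

print(first_and, missing, second_e)  # 11 -1 16
```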

# File Operations

Use the open() function to open a file for reading (r), appending (a), or writing (w). Opening a file returns a file handle, not the actual data in the file. After opening the file, you can read from or write to it. When you are finished with the file, ensure it is closed. Failing to close a file may lead to memory issues, inaccessible files, and possibly data loss.

In [None]:
# use the os module to access operating system information such as the current working directory (getcwd())
import os

# Create (or overwrite) a file
# If no path is specified, the file will be created in the current working directory

# If the file exists, opening it in "w" mode overwrites the existing file. To avoid overwriting a file, open it in "a" (append) mode.
f = open("demofile.txt", "w")

f.write("This is the first line of the file.\n")

# Be sure to close your file. Failure to do so will cause problems.
f.close()

# Get the current directory
print(os.getcwd())

## Using the with statement for opening files
One advantage of using the with statement is that files you open this way are closed automatically when the block ends.

In [None]:
# Append to the file
with open("demofile.txt", "a") as f:
    f.write("This is the second line.\n")

    # No need to explicitly close the file. close() is called automatically.
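
To confirm that the file really is closed once the block ends, you can inspect the file handle's closed attribute. This sketch writes to a throwaway file in the system's temporary directory; the file name with_demo.txt is arbitrary.

```python
import os
import tempfile

# Write to a throwaway file in the system temp directory (the file name is arbitrary)
path = os.path.join(tempfile.gettempdir(), "with_demo.txt")

with open(path, "w") as f:
    f.write("first line\n")
    inside = f.closed   # False: the file is still open inside the block

outside = f.closed      # True: the with statement closed the file on exit
print(inside, outside)  # False True

os.remove(path)  # clean up the demo file
```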

## Reading files
There are several ways to read data from a file, including reading the entire file, reading a specified number of characters, reading one line at a time, and reading all lines into a list.

### Reading an entire file

In [None]:
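# Build the full path with os.path.join() so the code works on Windows, macOS, and Linux
file_path = os.getcwd()
file_abs = os.path.join(file_path, "gettysburg.txt")
with open(file_abs, "r") as fh_getty:
    # read() with no argument reads the entire file. Not a good option for large files.
    print(fh_getty.read())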

In [None]:
with open(file_abs, "r") as fh_getty:
    n = 100
    # read(n) reads at most n characters instead of the entire file
    print(fh_getty.read(n)) # Read the first n characters

In [None]:
with open(file_abs, "r") as fh_getty:
    # readline() reads a single line; each call continues where the previous one stopped
    print(fh_getty.readline()) # Read the first line
    print(fh_getty.readline()) # Read the second line
    print(fh_getty.readline()) # Read the third line

In [None]:
with open("gettysburg.txt", "r") as fh_getty:
    # readlines() reads the entire file and returns a list with one string per line
    # (each string keeps its trailing newline character). Not a good option for large files.
    x = fh_getty.readlines()
    print(x[0])

In [None]:
with open("gettysburg.txt","r") as fh_getty:
    line_number = 0
    for x in fh_getty: # "x" will represent a line
        print(str(line_number) + ": " + x)
        line_number += 1
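
The same numbered listing can be written with the built-in enumerate() function, which supplies the counter for you. The sketch below creates its own small sample file (lines_demo.txt, an arbitrary name) so that it runs on its own, and strips the trailing newline from each line to avoid blank lines in the output.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained
path = os.path.join(tempfile.gettempdir(), "lines_demo.txt")
with open(path, "w") as fh:
    fh.write("alpha\nbeta\ngamma\n")

numbered = []
with open(path, "r") as fh:
    # enumerate() yields (counter, line) pairs, starting at 0
    for line_number, line in enumerate(fh):
        clean = line.rstrip("\n")  # drop the trailing newline
        numbered.append(f"{line_number}: {clean}")

print(numbered)  # ['0: alpha', '1: beta', '2: gamma']

os.remove(path)  # clean up the sample file
```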

In [None]:
with open("fake_customer_list.txt", "r") as fh_customers:

    for record in fh_customers:
        # strip() removes the trailing newline so it does not end up in the last field
        customer_list = record.strip().split("|")
        full_name = customer_list[0]
        email = customer_list[-1]
        print(full_name + " -- " + email)

In [None]:
import os
print(os.listdir(os.getcwd()))

## Brief mention: Pandas and NumPy
Although Python's built-in file operations are useful for simple tasks, Pandas and NumPy are much more effective for reading, shaping, and analyzing data. Using these libraries is beyond the scope of this course, but you should be aware of them.

In [None]:
import numpy as np
import pandas as pd

df = pd.read_csv("fake_customer_list.txt", sep="|")
df.head()


In [None]:
df.describe()

# Data Structures
sources: (W3Schools, RealPython.com)

## What's a data structure?
As its name implies, a data structure is a container that holds data. Just like some post office boxes hold packages and others hold letters, Python's built-in data structures have different purposes and uses. Use data structures to organize and perform operations on data. Python has the following built-in data structures: Lists, Dictionaries, Sets, and Tuples. Each container has different attributes and is used for a different purpose.

<img src="images/post_office_boxes.jpg" align="middle">


## Comparing Built-in Data Structures
Below is a comparison of four built-in data structures in Python.

![](images/Structures.jpg)

## Lists

A list is an ordered sequence of *items*. Lists are similar to arrays in other languages. One difference is that Lists can contain different types of data.

### Creating Lists
 
Lists are created using several methods.

In [1]:
#Use square brackets to make list

my_list = [] # an empty list

cities = ["Dallas","Chicago","Miami","Grand Rapids" ]
print(cities, type(cities))

#Ordered -- accessible via index
print(cities[2])

# Lists are not limited to containing only values of a single type
# A list may contain objects such as another list 
my_list = [True, 0, "Greg Bott", 3.14159, ["steak","eggs","donuts"]]

['Dallas', 'Chicago', 'Miami', 'Grand Rapids'] <class 'list'>
Miami


In [3]:
# The split() operation results in a list

addr = 'monty@python.org'
uname, domain = addr.split('@')
email = addr.split('.' )
print(f"user name = {uname}")
print(f"domain name = {domain}")
print(type(uname))
print(type(email))
print(email[1])

user name = monty
domain name = python.org
<class 'str'>
<class 'list'>
org


### Testing for membership

Use the in keyword to test for list membership.

In [4]:
print("Dallas" in cities)
print("Tuscaloosa" in cities)

True
False


### Iterating a list

Use a for loop to iterate a list.

Use the len() function to determine how many items are in the list and use that within a range() function.

In [6]:
for i in range(len(cities)):
    print(cities[i])

Dallas
Chicago
Miami
Grand Rapids
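
When the index itself is not needed, a for loop can iterate over the items directly, and enumerate() can supply a position when one is wanted. A brief sketch (the `labeled` list is introduced only for this example):

```python
cities = ["Dallas", "Chicago", "Miami", "Grand Rapids"]

# Iterate over the items directly; no index arithmetic required
for city in cities:
    print(city)

# enumerate() pairs each item with a counter (here starting at 1)
labeled = []
for position, city in enumerate(cities, start=1):
    labeled.append(f"{position}. {city}")

print(labeled)  # ['1. Dallas', '2. Chicago', '3. Miami', '4. Grand Rapids']
```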


### List operations


In [13]:
# Use the '+' operator to concatenate lists
a = [1,2,3]
b = [4,5,6]
c = a + b # does not alter 'a' or 'b'
print(c, type(c))

print(a)

a.extend(b) # This combines a and b, alters a but not b
print("list 'a' = ", a)
print("list 'b' =", b)
print(c)

[1, 2, 3, 4, 5, 6] <class 'list'>
[1, 2, 3]
list 'a' =  [1, 2, 3, 4, 5, 6]
list 'b' = [4, 5, 6]
[1, 2, 3, 4, 5, 6]


In [14]:
# Use the '*' operator to repeat items
print(a * 3)

[1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]


### Slicing Lists
You can return parts of a list using slicing operators. Tuples can also be sliced.

In [15]:
# Slicing operations

t = ['a', 'b','c','d','e','f','g']

# return the 2nd and 3rd elements in t
print(t[1:3])

['b', 'c']


In [16]:
# Omitting the first parameter tells the interpreter to start at the beginning
print(t[:3])

['a', 'b', 'c']


In [17]:
# Omitting the second parameter tells the interpreter to continue to the end
# start with the fourth element (index 3) and return all elements to the end of the list
print(t[3:])

['d', 'e', 'f', 'g']
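
Slices also accept negative indexes, which count from the end of the list, and an optional third value, the step. A short sketch using the same list:

```python
t = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

last_two = t[-2:]      # negative index: start two items from the end
every_other = t[::2]   # step of 2: take every second item
backwards = t[::-1]    # step of -1: a reversed copy of the list

print(last_two)     # ['f', 'g']
print(every_other)  # ['a', 'c', 'e', 'g']
print(backwards)    # ['g', 'f', 'e', 'd', 'c', 'b', 'a']
```

Slicing always returns a new list; the original list `t` is not changed.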


### Sorting Lists

In [18]:
my_letters = ['n','r','y','x','a','w']

# Use the sort() method to sort a list
my_letters.sort()

print(my_letters)

my_letters = my_letters.sort() # Don't do this...sort() returns "None"
print(my_letters)

['a', 'n', 'r', 'w', 'x', 'y']
None
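
If you need a sorted copy while leaving the original list untouched, the built-in sorted() function returns a new list instead of None. A minimal sketch:

```python
my_letters = ['n', 'r', 'y', 'x', 'a', 'w']

# sorted() returns a new list and does not modify the original
in_order = sorted(my_letters)
descending = sorted(my_letters, reverse=True)

print(in_order)    # ['a', 'n', 'r', 'w', 'x', 'y']
print(descending)  # ['y', 'x', 'w', 'r', 'n', 'a']
print(my_letters)  # ['n', 'r', 'y', 'x', 'a', 'w']
```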


### Appending and Deleting Lists

In [None]:
# Deleting elements by index
t = ['a', 'b', 'c']


# We want to delete 'b' and we know its index value is 1
x = t.pop(1)

# The pop() method deletes the element from the list and returns the deleted value (stored in x)

print(t) # New list without 'b'
print(x) # 'b' stored in x
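
Besides pop(), lists offer several other methods for adding and removing items. The sketch below walks through append(), insert(), remove(), and the del statement on a small example list:

```python
t = ['a', 'b', 'c']

t.append('d')     # add to the end        -> ['a', 'b', 'c', 'd']
t.insert(1, 'z')  # insert at index 1     -> ['a', 'z', 'b', 'c', 'd']
t.remove('c')     # delete by value       -> ['a', 'z', 'b', 'd']
del t[0]          # delete by index       -> ['z', 'b', 'd']

print(t)  # ['z', 'b', 'd']
```

Use remove() when you know the value but not its position, and del (or pop()) when you know the index.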

### List Comprehensions
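
A list comprehension builds a new list from an existing iterable in a single expression, optionally filtering items with an if clause. The following is a brief illustrative sketch; the variable names are chosen only for this example.

```python
cities = ["Dallas", "Chicago", "Miami", "Grand Rapids"]

# Transform every item
upper_cities = [city.upper() for city in cities]

# Keep only the items that satisfy a condition
long_names = [city for city in cities if len(city) > 5]

# Build a list from a range of numbers
squares = [n * n for n in range(1, 6)]

print(upper_cities)  # ['DALLAS', 'CHICAGO', 'MIAMI', 'GRAND RAPIDS']
print(long_names)    # ['Dallas', 'Chicago', 'Grand Rapids']
print(squares)       # [1, 4, 9, 16, 25]
```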

## Dictionaries

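Think of a dictionary as a list with a flexible index. A list index must be an integer, but the keys used to look up values in a dictionary can be of other types (for example, strings).

Dictionaries use key-value pairs to store and retrieve data. Values are looked up by key rather than by position. Since Python 3.7, a dictionary also preserves the order in which its keys were inserted; in earlier versions dictionaries were **unordered**. In other languages this structure might be called an *associative array*.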



### Creating dictionaries
Use curly braces and a colon to indicate to the interpreter that you are creating a dictionary data structure.

In [None]:
# The employee ID is associated with the employee name
employees = {"2334":"Greg Bott", "2335":"John Gilbert", "2336":"Bill Hampton","2337":"Joe Odom"}
print(employees)

In [None]:
# Using the employee ID (key), display the name of the employee (value)
print(employees["2334"])

In [None]:
# Use a List within a dictionary (see Golf web-scraping example)
make_model = {"Ford":["Mustang","Explorer","Focus"],"Volkswagen":["Passat","Jetta","Beetle"]}
print(make_model["Ford"])


In [None]:
# Create an empty dictionary
person = {}

#Display the type of the 'person' variable
print(type(person))

person['fname'] = 'Greg'
person['lname'] = 'Bott'
person['spouse'] = 'Amy'
person['children'] = ['John Davis', 'Piper', 'Will', 'Truett']
person['pets'] = {'dog': 'Bama', 'cat': 'TJ'}

print(person)

In [None]:
print(person['fname'])

In [None]:
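# Add a pet to the existing 'pets' dictionary
# (assigning a new dictionary to person['pets'] would replace the existing pets)
person['pets']['flying squirrel'] = 'Rocky'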

### Check for Values in a Dictionary

To determine whether a value is present in the list stored under a key, use the *in* keyword.

In [None]:
print("Focus" in make_model["Ford"])
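
Note that applying in to the dictionary itself tests its keys, not its values. The sketch below uses a shortened copy of the employees dictionary to show the difference, along with the get() method, which returns a default instead of raising an error when a key is missing (the key "9999" is made up for illustration).

```python
employees = {"2334": "Greg Bott", "2335": "John Gilbert"}

key_found = "2334" in employees                   # True: in checks the keys
name_as_key = "Greg Bott" in employees            # False: names are values, not keys
name_found = "Greg Bott" in employees.values()    # True: search the values explicitly

# get() returns the second argument when the key does not exist
unknown = employees.get("9999", "not found")

print(key_found, name_as_key, name_found, unknown)  # True False True not found
```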

## Sets
* Sets are unordered.
* Set elements are unique. Duplicate elements are not allowed.
* You may add or remove items from the set, but you cannot edit an item in a set.
* Accessing items by index (e.g., myset[1]) is NOT supported.

You can define a set using the set() function.
```python
x = set(<iter>)
```

In [None]:
my_list = ['a','b',1, 'c']
set2 = set(my_list)
print(set2)
print(my_list)

You can also create a set using curly braces {}.

In [None]:
# Use curly braces to create a set
my_set = {1,1, 6,7, 3, 5, 'red'}
print(type(my_set))
print(my_set)

### Why do I care about sets?
Sets in Python are the same as sets in mathematics. Sets contain a well-defined collection of distinct objects called elements. Using the set object enables you to perform set operations such as union and intersection.

![](images/data_science_diagram.png)
(image source: https://towardsdatascience.com)

In [None]:
# Persons with expertise in specific areas
cs_expertise = {"Bill", "Matt", "Alexandra", "Joe", "Dexter"}
stats_expertise = {"Dexter", "Subha", "Brad", "Bruce"}
business_expertise = {"Kay","Jonathan","Dexter","Suzanne", "Matt"}

# Who might be suited for Data Science (intersection of three topics)
data_scientists = cs_expertise.intersection(stats_expertise, business_expertise)
print(data_scientists)
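
Set operations are also available as operators: | for union, & for intersection, and - for difference. The sketch below uses two small, made-up sets and sorts the results before printing, because sets do not guarantee any particular order.

```python
# Small illustrative sets (names chosen only for this example)
cs = {"Bill", "Matt", "Dexter"}
stats = {"Dexter", "Subha"}

either = cs | stats    # union: members of at least one set
both = cs & stats      # intersection: members of both sets
cs_only = cs - stats   # difference: in cs but not in stats

print(sorted(either))   # ['Bill', 'Dexter', 'Matt', 'Subha']
print(sorted(both))     # ['Dexter']
print(sorted(cs_only))  # ['Bill', 'Matt']
```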

In [None]:
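# Why does the commented-out line below raise an error?
# [answer: set() accepts at most one argument, and it must be an iterable such as a list or a tuple]
# tropical_fruits = set("Guava", "Dragon Fruit", "Banana")   # TypeError
tropical_fruits = set(["Guava", "Dragon Fruit", "Banana"])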
temperate_fruits = {"Apple", "Peach", "Plum"}

all_fruit = tropical_fruits.union(temperate_fruits)
print(all_fruit)

In [None]:
#Empty sets are evaluated as False
loch_ness_monsters = set()
print("The set of Loch Ness Monsters is " + str(bool(loch_ness_monsters)))
print()

#You can add, update, and remove items, but you cannot change items in a set
loch_ness_monsters.add("Marvin")
print("Added Marvin to monster set...")
print("The set of Loch Ness Monsters is " + str(bool(loch_ness_monsters)), loch_ness_monsters)
print("The length of the monster set is " + str(len(loch_ness_monsters)))
print()

#Find the unique grade values
grades = {81,100,81,89,76,94,93,86,75,88,96,76,87,90,81,78,99,83,94,75,83,92,96,81,99,89,99,98,100,95,84,94,97,100,92,97,98,92,95,88,90,98,87,86,95,86,84,91,87,88,83,89,84,98,75,90,100,79,83,94,89,93,84,83,94,84,93,97,75,81,91,84,78,89,96,97,99,90,98,83,93,96,98,91,77,98,97,76,98,75,89,92,81,83,84,82,94,89,77,96,94,100,86,79,87,78,83,86,89,99,77,96,88,91,86,89,99,82,83,92,91,84,83,76,89,90,82,75,84,83,81,96,87,90,82,93,76,86,100,81,88,100,94,84,99,77,91,92,98,88,90,83,88}
print(grades)
b_and_higher = set(range(75,101))

# difference() returns the values in b_and_higher that do not appear in grades
missing_grades = b_and_higher.difference(grades)


print("What grades are missing from 75-100?: " + str(missing_grades))

In [None]:
#Use the set() function to create a set; the argument must be <iter> (an iterable, e.g., a list or, as here, a tuple)
my_set2 = set(('foo', 'bar', 3.141))


## Tuples
What is the proper pronunciation of "tuple"? Answer: either TEW-pull or TUP-pull.



In [None]:
s = 'abc'
t = [0, 1, 2]

# zip() pairs up items from both sequences; each pair is a tuple
for pair in zip(s, t):
    print(pair)
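
Each pair produced by zip() is a tuple. A tuple behaves much like a list, except that it cannot be changed after it is created. The following minimal sketch (with made-up variable names) shows how to create a tuple, unpack it into variables, and what happens when you try to modify it:

```python
# Create a tuple with parentheses; a one-item tuple needs a trailing comma
point = (3, 4)
single = ("solo",)

# Unpack the tuple into separate variables
x, y = point

# Tuples are immutable: assigning to an element raises a TypeError
try:
    point[0] = 10
    changed = True
except TypeError:
    changed = False

pairs = list(zip("abc", [0, 1, 2]))

print(x, y, len(single), changed)  # 3 4 1 False
print(pairs)                       # [('a', 0), ('b', 1), ('c', 2)]
```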


# Regular Expressions

# Network Programs

# Using Web Services

# Using Databases and SQLite

# Data Visualization