## BMIS-2542: Data Programming Essentials with Python
##### Katz Graduate School of Business, Fall 2019


## Session 1: Introduction and Getting Started with Python
***

Welcome to **Data Programming with Python**!

Jupyter Notebook documentation is [here](http://jupyter-notebook.readthedocs.io/en/latest/index.html). You can start with the [basics](http://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/Notebook%20Basics.html) first.

A quick reference to Jupyter tips and tricks can be found [here](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/).

Markdown is a lightweight markup language that is rendered to HTML (inline HTML is also allowed in Markdown cells). Markdown's syntax documentation is [here](https://daringfireball.net/projects/markdown/syntax), and cheat sheets like [this](http://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/Working%20With%20Markdown%20Cells.html) and [this](https://medium.com/ibm-data-science-experience/markdown-for-jupyter-notebooks-cheatsheet-386c05aeebed) can be quite handy.

Here's the [HTML color coding cheat sheet](https://www.w3schools.com/colors/colors_names.asp) in case you want to decorate your notebook.

Knowing the keyboard shortcuts is useful too: <mark>ctrl + shift + p (Windows)</mark> or <mark>cmd + shift + p (Mac)</mark> opens the command palette, where you can search for them.

Let's get started!

In [None]:
print('Hello, Python, here we come!')

#### Comments
 - Any text preceded by the hash mark (pound sign) <b>#</b> is ignored by the Python interpreter 
 - Useful to explain your code and also when you want to exclude certain blocks of code without deleting them

In [None]:
# This statement displays the words "Hello Python!" 
print("Hello Python!") # This is the Python-3 Syntax. The Python-2 syntax is: print "Hello Python!"

### Magic Commands
 - Magic commands (i.e., functions) are IPython's special commands which are not built into Python itself.
 - These are designed to facilitate common tasks and enable the user to easily control the behavior of the IPython system. 
 - Magic commands are prefixed by the % symbol. 
 - They can be used without the % sign, as long as no variable is defined with the same name as the magic function in question (automagic!)
<br><br>
Let's try out some useful magic commands:

In [None]:
# Returns the current working directory path
%pwd

In [None]:
# Suppose you have the Python script "Hello.py" stored in the same directory as this Jupyter notebook
# You can run "Hello.py", using the %run magic command
%run Hello.py

In [None]:
# Loads the contents of Hello.py into this cell so you can view or edit the code
%load Hello.py

In [None]:
# Time it takes to run Hello.py
%time %run Hello.py

In [None]:
# Time it takes to sum up the numbers from 0 to 9999
%timeit sum(range(10000))

In [None]:
# Displays detailed documentation for all of the available magic commands
%magic

# Python Language Basics

### Python Scalar Types
 - Python has a small set of built-in types to handle numerical data, strings, boolean (True or False) values, and date and time. These are called scalar types or scalars. 
 - A scalar is a type that can have a single value such as 2, 7.14, or "Anne".
 - The commonly used scalar types in Python:
         - int (integer)
         - float (floating point number)
         - bool (True or False value)
         - str (String)
         - bytes (Raw ASCII bytes or Unicode encoded as bytes)
         - None (the Python "null" value)

In [None]:
type(5)

In [None]:
type(1.75)

In [None]:
type("Hello!")

In [None]:
type("5")

In [None]:
type(True)
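
The cells above cover every scalar type in the list except `bytes`. The following is a minimal sketch of how a `str` relates to `bytes`; the sample word and the choice of UTF-8 as the encoding are assumptions made for illustration.

```python
# Encode a str to bytes and decode it back (UTF-8 is an assumed encoding choice)
text = 'caf\u00e9'                 # the word "cafe" with an accented e
raw = text.encode('utf-8')         # str -> bytes
print(type(raw))                   # <class 'bytes'>
print(len(text), len(raw))         # 4 characters, but 5 bytes: the accented e takes two bytes

round_trip = raw.decode('utf-8')   # bytes -> str
print(round_trip == text)          # True
```

A `str` counts characters, while `bytes` counts raw bytes, so the two lengths can differ for non-ASCII text.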

"None" is the Python null value type. "None" is also a reserved keyword in Python. Every instance of "None" is of NoneType.

In [None]:
a = None

a is None

In [None]:
b = 5
b is not None

In [None]:
type(None)

### Type casting

The `str`, `bool`, `int`, and `float` types can be used as functions to cast values to those types.

In [None]:
s = '2.14159'
fval = float(s)

type(fval) 

In [None]:
int(fval)

In [None]:
bool(fval)

In [None]:
bool(0)
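
A few casting behaviors tend to surprise newcomers. The sketch below reuses the string from the cells above; the other values are arbitrary examples chosen for illustration.

```python
# int() truncates toward zero; it does not round
print(int(2.99))     # 2
print(int(-2.99))    # -2
print(round(2.99))   # 3, use round() when rounding is what you want

# int() cannot parse a string that contains a decimal point
try:
    int('2.14159')
except ValueError as err:
    print('ValueError:', err)

print(int(float('2.14159')))   # 2, convert to float first, then to int

# Zero and the empty string are False; other numbers and non-empty strings are True
print(bool(''), bool('0'), bool(0.0), bool(-1))   # False True False True
```

Note that `bool('0')` is `True` because the string is not empty, even though it looks like a zero.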

### Variables

A variable is a name that refers to information stored in the computer's memory. 
 - <mark>Variable names must begin with a letter or an \_ (underscore); they cannot begin with a number.</mark>
 - You can create a new variable and assign a value to it using **assignment** statements. e.g., `x = 5`<br>
   The `=` symbol is called the **assignment operator** and tells Python to store the data on the right side of the statement under the variable name written on the left side (i.e., `x`).
 - When assigning a variable in Python, you are actually creating a reference to the object in the right hand side of the equal sign
 - Assignments are unidirectional in Python
 - Variable names are case sensitive.
 - Python's keywords cannot be used as variable names
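
The point about references is easiest to see with an object that can be changed in place. The sketch below uses a list (a sequence type not covered in this session) purely as an illustration; the variable names are arbitrary.

```python
a = [1, 2, 3]
b = a            # b refers to the same list object as a; no copy is made
b.append(4)      # changing the object through b ...
print(a)         # [1, 2, 3, 4]  ... is visible through a as well
print(a is b)    # True

c = list(a)      # an explicit copy creates a separate object
c.append(5)
print(a)         # [1, 2, 3, 4], unchanged by the edit to c
```

With immutable values such as `int` or `str` the sharing is harmless, because the object itself can never change.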

In [None]:
greeting = 'Have a Lovely Day!' 
greeting

In [None]:
n = 100  
n

In [None]:
100 = n # Assignments are unidirectional

In [None]:
pi = 3.1428571428571428571428571428571 # 22/7, a rough approximation of pi; a float keeps only about 16 significant digits
pi

In [None]:
1stName = 'John' # Variable starting with a number

In [None]:
place@ = 'Pittsburgh' # Variable names cannot contain special characters such as @

In [None]:
firstName = 'Harry'
lastName = 'Potter'
print(firstName)
print(lastname) # NameError: names are case sensitive, so lastname is not lastName

In [None]:
class = 'Quantum Physics' # class is a Python keyword, so it cannot be a variable name

### Expressions vs. Statements
 - An expression is a combination of values, variables, and operators
 - A statement is a unit of code that has an effect, like creating a variable or displaying a value

In [None]:
42 # expression

In [None]:
10 + 2 # expression

In [None]:
n = 17 # statement
print(n) # statement

### Binary Operators and Comparisons

Most of the binary math operations and comparisons are as you might expect.  <br>
It is a good practice to surround the following binary operators with a single space on either side:
 - assignment (=)
 - augmented assignment (+=, -=)
 - comparisons (==, <, >, !=, <=, >=, `in`, `not in`, `is`, `is not`)
 - Booleans (`and`, `or`, `not`)

If multiple operators are used, consider adding whitespace around the operators with the lowest priority.
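
The spacing guidelines above might look like this in practice. The variable names and values are hypothetical and only serve to show the layout.

```python
x, y = 3, 4
i = 0

i += 1                       # augmented assignment: one space on either side
hypot_sq = x*x + y*y         # tight around *, spaces around the lower-priority +
c = (x+y) * (x-y)            # spaces around the operator that is applied last
is_small = x < 5 and y < 5   # comparisons and Booleans: one space on either side

print(i, hypot_sq, c, is_small)   # 1 25 -7 True
```

The whitespace has no effect on the result; it only helps the reader see which operations group together.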

In [None]:
5+7.3

In [None]:
10-2.5

In [None]:
20*5

In [None]:
10/3

In [None]:
10//3 # Floor-divide 10 by 3, dropping any fractional remainder

In [None]:
2**3 # 2, raised to the third power

In [None]:
10%3 # modulo operator yields the remainder after division

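When an expression contains multiple operators, the result depends on the [order of operations](https://docs.python.org/3.6/reference/expressions.html#operator-precedence). Python follows the familiar PEMDAS convention: parentheses first, then exponentiation, then multiplication and division, then addition and subtraction. 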

In [None]:
(1+1)**(5-2)

In [None]:
1 + 2**3
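
A few cases where the precedence rules are easy to misread are sketched below. The expressions are arbitrary examples; the comments show the values Python produces.

```python
print(-2**2)       # -4, because ** binds tighter than the unary minus
print((-2)**2)     # 4
print(2**3**2)     # 512, because ** groups right to left: 2**(3**2)
print(8 / 2 * 4)   # 16.0, because / and * share a priority and group left to right
print(7 - 4 + 2)   # 5, for the same reason with - and +
```

When in doubt, add parentheses: they cost nothing and make the intended order explicit.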

### Boolean Logic
There are two boolean values: `True` and `False`.<br>
`True` and `False` can be assigned to variables just like strings or numbers.

The comparison operators `==` and `!=` and the logical operators `and`, `or`, and `not` can all be applied to boolean values:


 - **<mark>Equivalence</mark>**: <br>**`a == b`** evaluates to `True` when `a` and `b` are the same. 
     - Both `a` and `b` are `True`
     - Both `a` and `b` are `False`  

 - **<mark>Inequality</mark>**: <br>**`a != b`** evaluates to `True` when `a` and `b` are not the same.
     - `a` is `True` and `b` is `False` 
     - `a` is `False` and `b` is `True`
    
 - **<mark>and</mark>**:<br> **`a and b`** evaluates to `True` only when both `a` and `b` are `True`
     - `a` is `True` and `b` is `True`
    
 - **<mark>or</mark>**:<br> **`a or b`** evaluates to `True` when at least one of `a` and `b` is `True` 
     - `a` is `True` and `b` is `True`
     - `a` is `True` and `b` is `False`
     - `a` is `False` and `b` is `True`

 - **<mark>not</mark>**:<br> **`not a`** evaluates to `True` when `a` is `False`, and to `False` when `a` is `True`

Can you complete the following truth table?

|`a`|`b`|`a == b`|`a != b`|`a and b`|`a or b`|`not a`|
|---|---|---|---|---|---|---|
|False|False|True|False|False|False|True|
|False|True||||||
|True|False||||||
|True|True||||||
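
Once you have filled in the table by hand, a short loop can check your answers. This is a minimal sketch; it uses `itertools.product` from the standard library to generate the four combinations, and the column widths are an arbitrary layout choice.

```python
from itertools import product

# Build one row per combination of a and b, in the same order as the table
rows = []
for a, b in product([False, True], repeat=2):
    rows.append((a, b, a == b, a != b, a and b, a or b, not a))

print('a     b     a==b  a!=b  and   or    not a')
for row in rows:
    # Pad every value to six characters so the columns line up
    print(''.join(str(value).ljust(6) for value in row))
```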

In [3]:
# Comparison (Relational) Operators: ==, !=, >, <, >=, and <=

a, b = 5, 6 # Multiple assignments in one go
a == b # Returns True if a equals b

False

In [4]:
a != b # Returns True if a is not equal to b

True

In [5]:
a < b # Returns True if a is less than b. Use <= for less than or equal

True

In [None]:
a > b # Returns True if a is greater than b. Use >= for greater than or equal

In [None]:
a > 1 and b > 1  # Both conditions must be satisfied (i.e., a is greater than 1 AND b is greater than 1)

In [6]:
a > 1 and b > 8

False

In [7]:
a > 1 or b > 8

True

In [9]:
# "is" evaluates to True if the variables on either side of the operator point to the same object and False otherwise
x = 'hello'
y = x
y is x

True

In [10]:
y is not x

False
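
It can help to contrast `is` with `==`. The sketch below uses lists (not covered in this session) because two separately created lists are always distinct objects; the names `p`, `q`, and `r` are arbitrary.

```python
p = [1, 2, 3]
q = [1, 2, 3]    # equal contents, but a separate object
r = p            # a second name for the same object

print(p == q)    # True: the values are equal
print(p is q)    # False: they are two different objects
print(p is r)    # True: both names point to the same object
```

As a rule of thumb, use `==` to compare values and reserve `is` for identity checks such as `x is None`.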

### String Operations

In [11]:
# You can have single or double quotes to represent strings
print('Hello World')

Hello World


In [13]:
print("Hello World!")

Hello World!


In [14]:
print('Hello' + 'World')

HelloWorld


In [15]:
print(len('Python'))

6


In [16]:
print('Hello' + 5)

TypeError: can only concatenate str (not "int") to str
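
The error above occurs because Python does not convert between `str` and `int` automatically. Two common ways around it are sketched below; the variable names are arbitrary.

```python
count = 5

# Convert the number to a string before concatenating
message = 'Hello' + str(count)
print(message)                  # Hello5

# Or pass the values as separate arguments; print() converts each one
# and separates them with a space
print('Hello', count)           # Hello 5

# Going the other way, convert a numeric string before doing arithmetic
total = int('5') + count
print(total)                    # 10
```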

In [17]:
print('Python' * 5)

PythonPythonPythonPythonPython


In [18]:
print('Hello' * 'Python')

TypeError: can't multiply sequence by non-int of type 'str'

In [None]:
print('What is your name?')
myName = input()
print('Hi ' + myName)

What is your name?


Many Python objects can be converted to a string using the str function.

In [20]:
a = 5.2
s = str(a)
type(s)

str

In [None]:
s = 'python'
s[0] # the character at index 0

In [None]:
s[1:4]

In [None]:
s[:2] # slicing from the beginning of the string

In [None]:
s[4:] # slicing from middle to end
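
Slicing supports a few more forms than the cells above show. The following sketch continues with the same string and illustrates negative indices and the optional step value.

```python
s = 'python'

print(s[-1])     # n, negative indices count from the end
print(s[-3:])    # hon, the last three characters
print(s[::2])    # pto, every second character
print(s[::-1])   # nohtyp, a step of -1 reverses the string
print(s[1:4])    # yth, the end index 4 is not included
print(s[10:])    # an empty string: out-of-range slices do not raise an error
```

Indexing a single position out of range, such as `s[10]`, does raise an `IndexError`.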

#### String Formatting

In [21]:
first_name = "Mary"
last_name = "Smith"
age = 24

print('She is {} {} and she is {} years old.'.format(first_name, last_name, age))

She is Mary Smith and she is 24 years old.


In [22]:
print('She is {0} {1} and she is {2} years old.'.format(first_name, last_name, age))

She is Mary Smith and she is 24 years old.


In [23]:
print('She is {2} years old and she is {0} {1}.'.format(first_name, last_name, age))

She is 24 years old and she is Mary Smith.
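
Python 3.6 and later also offer formatted string literals (f-strings) as an alternative to `format`. The sketch below repeats the example above in that style; the `price` variable is a hypothetical addition used to show a format specifier.

```python
first_name = "Mary"
last_name = "Smith"
age = 24

# An f-string evaluates the expressions inside the braces
sentence = f'She is {first_name} {last_name} and she is {age} years old.'
print(sentence)

# Format specifiers work in both styles; .2f keeps two decimal places
price = 3.14159
print(f'{price:.2f}')          # 3.14
print('{:.2f}'.format(price))  # 3.14
```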


 - The backslash character `\` is an *escape character*: it changes the meaning of the character that follows it.
 - E.g., `\'` (single quote), `\"` (double quote), `\t` (tab), `\n` (new line), `\\` (Backslash)
 - If you need to write a string literal with special characters you need to escape them with a backslash.

In [None]:
print('That is Shakya\'s Cat!')

In [25]:
print('hey, what\'s up? \n I\'m doing fine')

hey, what's up? 
 I'm doing fine


### Try it out

Write a print statement to produce the following output:

`Hey, what's up?
I'm doing fine`

In [None]:
# When a string has lots of backslashes but no special characters, you can prefix the opening quote with r.
# The r prefix (a raw string) means that the characters should be interpreted as is.

s = r'this\has\no\special\characters'
print(s)
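
Raw strings matter most when a backslash happens to be followed by a letter that forms an escape sequence. The sketch below uses a made-up Windows-style path as an illustration.

```python
# Hypothetical path: \n and \t are read as a newline and a tab
normal = 'C:\new\table'
# With the r prefix every backslash is kept as a literal character
raw = r'C:\new\table'

print(normal)                  # printed across two lines, with a tab in the second
print(raw)                     # C:\new\table
print(len(normal), len(raw))   # 10 12
```

The raw version is two characters longer because each backslash and the letter after it are stored separately.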

#### Multiline Strings
A multiline string begins and ends with three single quotes or three double quotes.

In [26]:
print('''Dear Amy,

Shakya's cat has been arrested for cat burglary.

Sincerely,
Bob''')

Dear Amy,

Shakya's cat has been arrested for cat burglary.

Sincerely,
Bob


### Try it out
Can you write code to produce the output above without using a multiline string?

### Useful String Methods

In [41]:
myStr = 'Hello World!'

myStrUpper = myStr.upper()
myStrUpper
mystr1 = myStr.lower()
mystr1


'hello world!'

In [None]:
print(myStrUpper.lower())

In [None]:
'HELLO'.isupper()

In [None]:
'abc123'.islower()

In [None]:
'Hello'.upper().lower().upper()

In [None]:
'Hello'.upper().isupper()

In [None]:
'hello'.isalpha() # returns True if the string consists of only letters

In [None]:
'hello123'.isalpha()

In [None]:
'hello123'.isalnum() # returns True if the string consists only of letters and numbers

In [None]:
'123'.isdecimal() # returns True if the string consists only of numeric characters

In [None]:
' '.isspace() # checks for spaces, tabs, and new lines

#### Join, Split, and Strip

In [42]:
', '.join(['cat','bat', 'rat']) # want to combine a list of words with a comma followed by a space

'cat, bat, rat'

In [43]:
' '.join(['My','name', 'is', 'Harry'])

'My name is Harry'

In [44]:
'*'.join(['My','name', 'is', 'Harry'])

'My*name*is*Harry'

In [45]:
'My name is Harry'.split()

['My', 'name', 'is', 'Harry']

In [46]:
'My*name*is*Harry'.split('*')

['My', 'name', 'is', 'Harry']

In [None]:
myStr = '    Hello   '
myStr.strip()

In [47]:
myStr.lstrip()

'Hello   '

In [None]:
myStr.rstrip()
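
The strip methods also accept an optional argument listing the characters to remove, and `split` combined with `join` is a handy way to tidy up spacing. The strings below are arbitrary examples.

```python
# strip() only removes characters from the two ends, never from the middle
padded = '   Hello   World   '
print(repr(padded.strip()))      # 'Hello   World'

# Pass the characters to remove instead of whitespace
starred = '***Hello***'
print(starred.strip('*'))        # Hello
print(starred.lstrip('*'))       # Hello***

# split() with no argument splits on any run of whitespace,
# so joining the pieces with one space normalizes the spacing
messy = '  My   name is  Harry '
clean = ' '.join(messy.split())
print(clean)                     # My name is Harry
```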

## Imports
 - <b>Module</b>: A file with <i>.py</i> extension containing Python code 

In [48]:
import keyword
keyword.kwlist # kwlist is a list defined in the keyword module; it holds all of Python's keywords

['False',
 'None',
 'True',
 'and',
 'as',
 'assert',
 'async',
 'await',
 'break',
 'class',
 'continue',
 'def',
 'del',
 'elif',
 'else',
 'except',
 'finally',
 'for',
 'from',
 'global',
 'if',
 'import',
 'in',
 'is',
 'lambda',
 'nonlocal',
 'not',
 'or',
 'pass',
 'raise',
 'return',
 'try',
 'while',
 'with',
 'yield']

In [49]:
numbers = [1,2,3,4,5] # a list of numbers

In [None]:
import statistics

mean = statistics.mean(numbers)
print(mean)

In [None]:
import statistics as s

mean = s.mean(numbers)
print(mean)

In [None]:
from statistics import mean

avg = mean(numbers) # a different variable name avoids overwriting the imported mean function
print(avg)

In [None]:
from statistics import mean as m
mean = m(numbers)
print(mean)

In [None]:
from statistics import mean, median
# Store the results under new names so the imported functions stay usable
avg = mean(numbers)
mid = median(numbers)
print(avg)
print(mid)

### Closing notes:
- Practice, practice, and practice: continue your exploration using the books in the references
- Read the requirements of the five-minute analytics challenge and become familiar with the video presentation tools (e.g., Panopto)

### References
 - *Python for Data Analysis*, 2<sup>nd</sup> edition, by Wes McKinney (O'Reilly)
 - [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/) by Al Sweigart
 - *Think Python*, 2<sup>nd</sup> edition, by Allen B. Downey ([PDF](http://greenteapress.com/thinkpython2/thinkpython2.pdf))