# Introduction to Python for Data Analysis 
### C Kaligotla | BUS 774 G200 
### 04 Feb 2023

#### Chapter 1. Basic Intro

This workshop is designed as an introduction to Python specifically for Data Analysis used in BUS 774 G200. 
If you are new to Python and programming and want a more general introduction to Python programming, I'd recommend additional tutorials. I will provide some options at the end of the course.

Things to keep in mind: 

1. Python is the programming language

2. Jupyter Notebook is the environment for writing and executing Python interactively (one or a few lines at a time)

3. Anaconda is one of the distributions that provides Python, some standard Python libraries, Jupyter, and a bunch of other tools.


The Jupyter notebook interface is very simple: it is a web page with interactive cells in which you type short snippets of Python. You then hit "Shift-Enter" to run the code and the results are shown immediately below. 
    Note: Unless you use print(), Jupyter will only display the output (if any) of the last line in a cell.

You can also enter plain text into cells (called “markdown”) to document what you are doing or any notes you wish to take.

You can also use comments in your code by using '#'. Note that Python will not run anything after the '#' sign on that line. See example below.

Careful with indenting in Python
    Python uses indentation to group lines of code into blocks (e.g., the lines that belong to an if statement or a loop)
    It's fine to indent your text or notes, but an unexpected indent in code causes an error!
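Here is a minimal sketch of how indentation groups code (if statements are covered in Chapter 4; the names `temperature` and `messages` are made up for this illustration). The two indented lines belong to the `if`; the last append is not indented, so it runs no matter what.

```python
# Hypothetical example: indentation decides which lines belong to the 'if'
temperature = 30   # made-up value for illustration
messages = []      # an empty list to collect text

if temperature > 25:
    messages.append('it is warm')    # indented: runs only when the check is True
    messages.append('drink water')   # same indent level: part of the same block
messages.append('done')              # not indented: always runs

print(messages)
```

Try changing `temperature` to 10 and rerunning: only 'done' should remain in the list.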

First Things First.

1. Install Anaconda on your computer.
2. Then open Anaconda and ensure Jupyter Notebook is installed.
3. Click on Jupyter Notebook and navigate to the location of the notebook you want to open, or where you want to save your notebook.

A note on Jupyter Notebooks:
1. To enter a new line of code or text, simply choose the "+" button from the menu.
    -> The default is a code entry cell -- you will see an "In []" symbol to the left of the cell.
    -> To enter text simply choose the dropdown list on the right of the menu options and choose "Markdown". 

In [None]:
# Python's hidden mission statement
import this

In [None]:
# a typical first line of any programming language
print('Hello World')
# you should see an output below when you press "Shift+Enter"

In [None]:
print('DONT PANIC')

#### Chapter 2. Variables, Data Types, and Data Structures

##### 2.1 Variables
Variables are basic building blocks of any program. 
    They are names (of a certain class or type) that are assigned to individual values
'=' is the assignment operator
    'a=5' => the value 5 is assigned to variable name 'a'

A few rules about variable names:
    1. Cannot start with a number, but a number can follow a letter
    2. Cannot have special characters except an underscore
    3. Typically start variable names with lower case letters
        3b. Names starting with upper case letters are typically used for Classes, which are a different thing
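One quick way to check the naming rules above is the string method `isidentifier()`, which reports whether a piece of text would be accepted as a name. This is only a rough sketch: the candidate names are made up, and `isidentifier()` does not flag reserved words such as 'if', so treat it as a sanity check rather than the full rule book.

```python
# Each check returns True if the text would be a valid variable name
letter_then_number = 'x1'.isidentifier()        # a number can follow a letter
number_first = '1x'.isidentifier()              # cannot start with a number
with_underscore = 'my_value'.isidentifier()     # underscore is allowed
with_hyphen = 'my-value'.isidentifier()         # '-' is a special character

print(letter_then_number)   # True
print(number_first)         # False
print(with_underscore)      # True
print(with_hyphen)          # False
```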

In [None]:
#2.1 define a variable
x  = 5 # value 5 assigned to variable name x
x1 = 7
# now let's see what's assigned to this variable

#  you can either say print, or you can simply just use the variable name - try it!
    # the only difference is that print's output is shown separately from the output of the last line

print(x)
print(x1)
#x
#x1
# Now rerun this cell without the print commands and instead simply call x and x1
    #-> remember - just remove the # sign to turn a comment into code.
    # if you don't use print, the output of the cell is the value of the last line
    # if you use print, you see all print commands executed!


# To type some text in your print command, the basic structure is
print('x=',x) 
    # everything within '..' is considered text, 
    # the comma and the variable name tells what you want to print after that line

# now type out the same for x1!
print()

In [None]:
# getting comfortable with looking up errors
1x = 5
# remember you cannot start a variable name with a number!
# the error is telling you the syntax is invalid
# a lot of programming is looking things up - so please get used to seeing errors without panicking

##### 2.2 Basic Variable Data Types

A sampling of some basic variable types
To see the type of a variable, simply use the type() command, pointing it at the value or variable name

1. Numbers (including integers, floats, and complex numbers)
    e.g., 
    num1 = 52
    num2 = 5.2
        Floats are numbers with a decimal point (the name just means the decimal point can "float" around)
    num3 = 5
        Integers don't have decimal points 
    num4 = 5j
        Complex numbers: 'i' (sqrt of -1) is denoted as j in Python!

In [None]:
num1 = 52
num2 = 5.2    
num3 = 5
num4 = 5j

In [None]:
# To see what type of variable, simply use type() command pointing it at the value or variable name
print('num 1=',num1, 'is type', type(num1)) # you can use multiple ('<text>',<var/value>) structure in print commands
print('num 2=',num2, 'is type', type(num2))
print('num 3=',num3, 'is type', type(num3))
print('num 4=',num4, 'is type', type(num4))
# variables are stored as "class" types - i.e., a structure common to all variables in that class

2. Strings - characters enclosed in single or double quotes

    let's try the following examples:
    s1 = "String 1"
    s2 = 'String 2'
    
    Now say you want to join the strings (concatenate)
            You can simply add them together

In [None]:
s1= "String 1"
s2= 'String 2'
s1 + s2 # just joins the 2 strings!

In [None]:
# say you want to add a space
print(s1 + ' ' + s2) #(you can simply add a space within + and write a chain of additions)

In [None]:
# now try this: 
s1 = '1'
s2 = '1'
# we declare both these as strings or characters
#If you add them, say 
#s1 + s2 -  What is the answer?
s1 + s2 # recall you are adding as strings

In [None]:
# now try this
'1' + 1

You cannot add an int to a character string!
Get comfy reading errors!
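If you do want to combine a string and a number, you first have to convert one of them so both sides have the same type. A minimal sketch, using the built-in `int()` and `str()` functions (the variable names are made up for illustration):

```python
text_number = '1'                    # a string that happens to contain a digit

as_number = int(text_number) + 1     # convert the string to an int, then add
as_text = text_number + str(1)       # convert the int to a string, then join

print(as_number)   # 2
print(as_text)     # 11  (this is the string '11', not the number eleven)
```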

3. Booleans 
    - simply True or False
    - Used for logic operations or statements

In [None]:
x=True
y=False

In [None]:
x==y

In [None]:
x!=y # != is Not Equal to

In [None]:
#what is this program doing?
n1 = 5
n2 = 5
ans = (n1==n2)
# What will the output be?

In [None]:
print(ans)
# let's check the type of ans
type(ans) 
# bool is boolean

##### 2.3 Data Structures

While variables are assigned a single value, 
    there are other data structures in Python that can take multiple values or an array of values

Some common Data Structures in Python:
    1. Lists
        List of numbers
        List of strings
        List of Lists
    2. Sets
        All elements of sets have to be unique!
        Set of Sets!
    3. Tuples
    4. Dictionaries

Some useful commands for data structures:
    len() -> gives the length of the data structure, i.e., # of elements
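As a quick sketch, here is `len()` applied to a few different structures (each one is explained in the subsections that follow; the values are made up):

```python
list_length = len([10, 20, 30])         # list with 3 elements
string_length = len('beer')             # string with 4 characters
set_length = len({1, 1, 2})             # set: duplicates are only counted once
dict_length = len({'a': 1, 'b': 2})     # dictionary: counts the keys

print(list_length, string_length, set_length, dict_length)   # 3 4 2 2
```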


###### 2.3.1 LISTS
think about lists as an array of values, of any data type
we define lists using '= [ , ]' structure, i.e., square brackets where each element is separated by a comma
let's start with a list of numbers


In [None]:
my_list_1=[1,2,3,4] 
# the commas tell Python to store each value at a separate index (think cells in excel)
# ORDER MATTERS - the first element has index 0, the second has index 1, and so on...

print('length:',len(my_list_1))
print(my_list_1)
print(type(my_list_1))

In [None]:
# now let's make a list of characters or strings
my_list_2 = ['i','want','beer']
print('length:',len(my_list_2))
print(my_list_2)

In [None]:
# now let's make a list of lists
my_list_3 = [my_list_1,my_list_2]
print(my_list_3)
print('length:',len(my_list_3))
# let's print the second element of my_list_3
my_list_3[1] # remember - python starts indexing with 0!

###### 2.3.2 Sets
sets have unique elements
sets are defined by using '= { , }' structure where commas do the same thing as in lists
sets are unordered: what matters is which elements are in the set, not the order they were entered
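One common use of sets in data analysis is pulling the unique values out of a list with the built-in `set()` function. A small sketch with made-up survey data:

```python
survey_answers = ['yes', 'no', 'yes', 'yes', 'no']   # made-up data with repeats

unique_answers = set(survey_answers)   # convert the list to a set: repeats are dropped

print(unique_answers)        # the display order of a set is not guaranteed
print(len(unique_answers))   # 2
```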

In [None]:
my_set1 = {1,2,3,4}
print(type(my_set1))
print(my_set1)
print('length:',len(my_set1))

In [None]:
my_set2 = {1,1,2,2,3,3}
    # even though this set has 6 values, there are only 3 unique values
print(type(my_set2))
print(my_set2) 
print('length:',len(my_set2))

In [None]:
# Ordering Matters for Lists

[1,2] == [2,1] # list


In [None]:
# order does not matter for sets
{1,2}=={2,1}

In [None]:
# repeated values do not matter for sets either
{1,1,2,3,3,3,3}=={3,2,1}
# useful for storing unique values in your data!

###### 2.3.3. Tuples 
-similar to lists - same length calculation, i.e., values do NOT have to be unique
-order matters
-Tuples are immutable, i.e., you cannot add, remove, or change a value once the tuple is defined
    => tuples, once set, can only be rewritten completely - you cannot change individual values
-useful to store (x,y) values 
-typically a bit more efficient with memory than lists
    
-Tuples are defined by using '= ( , )' structure where commas do the same thing as in lists
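A minimal sketch of the (x,y) use case mentioned above. The names `point`, `x_val`, and `y_val` are made up, and the line that would fail is left as a comment so the cell still runs:

```python
point = (3, 4)           # a hypothetical (x, y) pair

x_val, y_val = point     # "unpacking": one variable per element, in order
print(x_val, y_val)      # 3 4

# point[0] = 10          # uncomment to see the TypeError: tuples cannot be changed

bigger_point = point + (5,)   # adding tuples builds a NEW tuple; 'point' is untouched
print(bigger_point)           # (3, 4, 5)
print(point)                  # (3, 4)
```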

In [None]:
my_tuple= (1,2,3)
print(my_tuple)
print(type(my_tuple))
print('length',len(my_tuple))    

In [None]:
# recall my_list_3 from earlier
print(my_list_3)

In [None]:
# now let us append this list, i.e., add elements using <obj name>.append(<what to add>) function
my_list_3.append('NOW') 
print(my_list_3)
    # now try to append a list at the end to my_list_3!
    

In [None]:
# this won't work for tuples
my_tuple.append(4)
# remember - read the error message

###### 2.3.4. Dictionaries
- just like real dictionaries - think of a lookup type function
- structure is within { }, separated by ',', and each element has 2 values in this format: 'key' : 'return' 
- if you enter <dictionary_name>[key], you will get the (return) value as output
- like sets, only stores unique keys (the return values can repeat)
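To illustrate the "unique keys" point, here is a small sketch with a made-up `prices` dictionary. Assigning to a new key adds an entry, while assigning to an existing key overwrites its value. The `.get()` method is an optional extra: it returns a fallback instead of an error when a key is missing.

```python
prices = {'coffee': 3, 'tea': 2}    # hypothetical menu

prices['beer'] = 5                  # new key: adds an entry
prices['tea'] = 4                   # existing key: overwrites the old value

print(prices)                       # {'coffee': 3, 'tea': 4, 'beer': 5}
print(len(prices))                  # 3

missing = prices.get('wine', 'not on the menu')   # key not found, so the fallback is returned
print(missing)
```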

In [None]:
# use indentation for multi line command
my_dictionary = {
    'apple': 'keeps doctors away',
    'bear': 'fluffy but scary animal',
    'beer': 'good',
    2:'Min. # of beers',
    42: 'the answer to life the universe and everything '
}
# notice you can use different data types in a dictionary

In [None]:
print(type(my_dictionary))
print('length',len(my_dictionary))
# to lookup a value 
my_dictionary['beer'] # beer in ' '  since it is a string 

In [None]:
my_dictionary[2] # here you are looking up a number, so no ' '

In [None]:
my_dictionary[42]

##### A note on how Python interprets things...

Say you have the following lines of a program:
    a=2
    b=3
    c=a+b
Python sees 'a=2', stores the value '2' in memory, and assigns it to variable name 'a'. 
It essentially points or maps 'a' to the data stored in memory, '2'!
Python then sees 'b=3' and does the same
Python then sees 'c=a+b' 
    - it first looks up the value of 'a' and fetches the value '2', 
    - then does the same for 'b', 
    - then performs the addition and assigns the result to variable 'c'
    
 You don't need to know how Python looks up values or data stored in memory.
 But remember that variable names are just mappings to that data
 
Now consider a program
    a = [1, 2, 3, 4, 5]
    b = a
 'a' here refers to a list or vector of values
 When Python runs 'b = a', it doesn't duplicate the list to assign to 'b'. 
     Instead, it simply points 'b' to the same data stored in memory for 'a'
     So if you then change one value of 'a' (say 1 becomes 6), 'b' shows the change too
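If you do want an independent copy, one option is the list's `.copy()` method. The sketch below contrasts the two cases; the names `original`, `alias`, and `copied` are made up, and `is` simply checks whether two names point to the very same object in memory.

```python
original = [1, 2, 3]
alias = original            # a second name for the SAME list
copied = original.copy()    # a separate list with the same values

original[0] = 99            # change the list through the first name

print(alias)                # [99, 2, 3] -> follows the change
print(copied)               # [1, 2, 3]  -> unaffected
print(alias is original)    # True
print(copied is original)   # False
```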
         
Let's test this out!
 

In [None]:
a = [1,2,3,4,5] # python stores these numbers to variable 'a' as a list or vector 
# each list or vector has index values (like excel) starting with 0,1,2....
# Thus a[0], the first value in list a is 1
# a[1], the second value in list a is 2, and so on
# let's print a and see! - you dont need to say print, you can simply just use the variable name
a
# Now let's print the second value in list a
# a[1]

In [None]:
# now let's assign variable b and make it equal to a
b=a
#check value of b - same as a right?
b

In [None]:
# now let's change the first index value in list a
a[0] = 6 # look up index value [0] within a and assign value 6

In [None]:
# now check value of a
a 
# see how the first index value changed to 6?

In [None]:
# now let's see value of b
b
# notice value of b has changed and it remains the same as 'a' even though we only changed value of a 
# remember when we assigned b=a, we simply pointed variable b to the data stored in memory for a

#### Chapter 3: Operators and Logic Operations


##### 3.1 Arithmetic Operators
- common math or formulaic operators: +, -, *, /
- ** for exponents (not ^)
- careful: sqrt(-1) is denoted by j
- BODMAS rules apply
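A short sketch of why the `**` versus `^` warning matters, plus a BODMAS example worked one step at a time. In Python, `^` is a different operator (bitwise XOR), so it runs without an error but gives a surprising answer. The variable names are made up for illustration.

```python
squared = 5 ** 2       # exponent: 5 to the power 2
not_a_power = 5 ^ 2    # bitwise XOR, NOT an exponent

print(squared)         # 25
print(not_a_power)     # 7

# BODMAS, step by step: 3 + 2*(4*5**2) - 9/3
inner = 4 * 5 ** 2             # exponent first (25), then multiply: 100
result = 3 + 2 * inner - 9 / 3 # 3 + 200 - 3.0
print(result)                  # 200.0 (division with / always gives a float)
```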

In [None]:
1+1

In [None]:
5**2

In [None]:
5**(.5) # sqrt

In [None]:
3 + 2*(4*5**2) -9/3 # BODMAS Rules

##### 3.2 Logic operations: 
typically boolean operations

Comparison Operators
       1. '==' is a comparison operator (is the left hand side (lhs) equal to the right hand side (rhs)?)
       2. '>' or '>='
       3. '<' or '<='
       and so on....
Logical Operators
    1. and - both sides need to be True for True
    2. or - at least one side needs to be true
    3. not - flip boolean
    
Membership Operators
    1. in
    2. not in 
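One detail worth knowing about membership: with a dictionary, `in` looks at the keys, not the return values. A minimal sketch with a made-up `menu` dictionary (`.values()` is the dictionary method that gives you the return values):

```python
menu = {'beer': 'good', 'coffee': 'also good'}   # hypothetical dictionary

key_check = 'beer' in menu                 # True: 'beer' is a key
value_as_key = 'good' in menu              # False: 'good' is a value, not a key
value_check = 'good' in menu.values()      # True: now we search the values
substring_check = 'cat' in 'my pet cat'    # True: for strings, 'in' looks for a substring

print(key_check, value_as_key, value_check, substring_check)
```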

In [None]:
# Comparison Operators
x1 = 5
x2 = 6
print(x1==5)
print(x1+1==x2)
print(x2<=x1)
print(x2>=x1)
# please play around to build intuition
# you can do this with lists, sets, tuples....

In [None]:
# Logical Operators
print(True and True)
print(True and False)
print(False or False)
print(False or True)
print(not False)

In [None]:
# membership - useful to check if something belongs in an object
x= [1,2,3,4]
print(1 in x)
print(10 not in x)
print('cat' not in 'my pet cat')
print('beer' in my_dictionary)

#### Chapter 4: Control Flow (if-else, loops)

##### 4.1 If-else statements

   1. if [check]: 
           [return/do this]
   2. if [check]:
           [return/do this] 
      else: 
           [return/do this instead]
           
Remember - Indentation matters!
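Beyond the two patterns above, Python also has `elif` ("else if") for checking several conditions in order. This goes slightly past the outline above, so take it as an optional sketch; the `score` and `grade` names and the cut-offs are made up.

```python
score = 72   # made-up value

if score >= 90:
    grade = 'A'
elif score >= 70:    # only checked if the first check failed
    grade = 'B'
else:                # runs if none of the checks above were True
    grade = 'C'

print(grade)   # B
```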

In [None]:
a=True
# simple if statement
if a: print('it is true!')

In [None]:
# if statement with multiple operations
x=1
y=1

if x==y: # this is check
    print('it is true!') #operation 1
    print('also print this') # operation 2

# now change y to 2 and see what happens?
# it doesn't run coz the check failed!

In [None]:
# if else statement
# note the indentation!

x=1
y=2

if x==y:
    print('it is true!')
    print('also print this')
else:
    print(' It is false')       
    
# now change y to 1

In [None]:
# complex if else - a second if-else within the first

x=1
y=1
z = 'beer'

if x==y:
    print('it is true!')
    print('also print this')
    if (z in my_dictionary):
        print('What if i get beer after class?',my_dictionary['beer'])
        print(my_dictionary[2],'2')
else:
    print(' It is false')       
    
# now change y to 2
# now change z to 'coffee'

##### 4.2 For and While Loops 

Used to repeat operations without writing the same code over and over

Remember, indentation matters

Also be careful: avoid infinite loops


for [item] in [collection]:
    do [this]
    
while [condition]:
    do [this]
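A small sketch of both loop types side by side. It uses the built-in `range()` function, which produces a sequence of numbers (here 1 through 5; the end value 6 is not included). The variable names are made up for illustration.

```python
# for loop: step through a known sequence
total = 0
for number in range(1, 6):    # 1, 2, 3, 4, 5
    total = total + number
print(total)                  # 15

# while loop: keep going as long as the condition is True
countdown = 3
steps = []
while countdown > 0:
    steps.append(countdown)
    countdown = countdown - 1   # without this line the loop would never end
print(steps)                    # [3, 2, 1]
```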
   

In [None]:
# for loops
a = [1,2,3,4,5]

for number in a: 
    print(number**2) 
# for each number in a:
    # print square of the number

# the for line here is stepping through the elements of object a, and NOT doing a check

In [None]:
# while loops

a=0 
while a < 5:
    print(a)
    a=a+1  
    
# what happens if you forget the last line? (a=a+1)
    # you get an infinite loop!
    # DONT DO THAT!

#### Chapter 5: Functions

Functions are essentially reusable, named blocks of operations
Think of the print() function, etc....

structure: 
    def <function_name>(input):
        do [this]
        return [this]  # if needed

sometimes functions don't need to return anything (e.g., a function to append a value into a list)
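To see the difference between returning and not returning, here is a minimal sketch with two made-up functions, `add_tax` and `log_sale`. A function without a `return` still hands something back when called: the special value `None`.

```python
def add_tax(price, rate):
    # returns a new value; nothing outside the function is changed
    return price * (1 + rate)

def log_sale(sales, amount):
    # changes the list it was given; no return statement
    sales.append(amount)

total = add_tax(100, 0.25)
print(total)                      # 125.0

sales = []
outcome = log_sale(sales, total)  # the list is updated inside the function
print(sales)                      # [125.0]
print(outcome)                    # None
```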

In [None]:
# simple function with 1 input
def fn_1(val):
    return val ** 3

# define function called fn_1 with input (val):
    # return val to the power 3
    

In [None]:
fn_1(21111)

In [None]:
# function with 2 inputs
def fn_2(val1,val2):
    return (val1 ** 2) + (val2 ** 2)

In [None]:
fn_2(1,2)
# you can use this function on any 2 inputs

In [None]:
# function with no return
def fn_3(num, lst):  # named 'lst' rather than 'list' so we don't hide Python's built-in list
    lst.append(num)
    
# define function fn_3 with 2 inputs, a num, and a list:
    #append the num to the list

In [None]:
a=[1,2,3,4] # example list
b=5 # example num

fn_3(b,a) # function to append b into a
print(a) # magic!

A package is essentially a collection of functions

When we import pandas, for example, we are simply making all the functions that are part of the pandas package available to use
pandas deals with managing data
for plotting, we use other packages, like seaborn

packages are why Python becomes so useful
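A minimal sketch of what importing looks like. To keep it runnable on any Python install, it uses two modules that ship with Python (`math` and `statistics`) as stand-ins; pandas is imported the same way, usually as `import pandas as pd`. The alias `stats` is just a choice made for this example.

```python
import math                   # import a module by name
import statistics as stats    # import with a shorter alias

root = math.sqrt(16)               # call a function from the module: <module>.<function>
average = stats.mean([2, 4, 6])    # same idea, through the alias

print(root)      # 4.0
print(average)   # 4
```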

### END OF CODE

There's a lot more to Python programming, but this introduction is a good starting point for class.
I'll provide a curated list of learning resources if you want to get deeper into it.
