## QUAAF Python Crash Course
### Part 1 Data Types & Structures

#### Numerics

In [19]:
#integers (whole #'s only)
x = 7
print(type(x))

#floats (any # with a decimal)
y = 7.0
print(type(y))

<class 'int'>
<class 'float'>


#### Basic Math

In [93]:
#addition/subtraction
x = 7
y = 7
w = x+y
z = x-y
print(w,z)

#multiplication/division (note that dividing an int/int will give you a float)
a = x*y
b = x/y
print(a,b)

#exponents
c = x**2
d = x**3
print(c,d)

14 0
49 1.0
49 343
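
Since `/` always returns a float, here is a minimal sketch of the two related operators, floor division (`//`) and modulo (`%`), for when you want a whole-number result. The values and variable names are made up for illustration.

```python
#floor division and modulo (illustrative values)
x = 7
y = 2

true_div = x / y    #true division always returns a float
floor_div = x // y  #floor division of two ints returns an int
remainder = x % y   #modulo returns the remainder of the division

print(true_div, floor_div, remainder)
print(type(true_div), type(floor_div))
```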


#### Strings

Strings are objects that can contain any text and/or numerals. They are also **immutable**, i.e. once a string is created it cannot be changed without creating a new string. Strings are largely used for labelling data, printing out text, labelling plots, etc. **Any numbers contained in strings have to be converted into ints or floats (e.g. with int() or float()) before they can be used for arithmetic.**

In [94]:
#strings (can contain any text/numerics enclosed with "" or '')
s = "this is a string object"
s_1 = "65648" 
print(s)
print(type(s),type(s_1))

this is a string object
<class 'str'> <class 'str'>


In [47]:
#basic string operations (each returns a new object; s itself is unchanged)
print(s.capitalize()) #capitalize the 1st character of the string
print(s.split()) #split the string on whitespace into a list of words
print(s.find('string')) #index where the substring first occurs (-1 if not found)
print(s.replace('this','...stonks only go up...')) #replace every occurrence of a substring

This is a string object
['this', 'is', 'a', 'string', 'object']
10
...stonks only go up... is a string object
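
The two points made about strings (numbers inside them must be converted, and strings are immutable) can be sketched as follows. This is only an illustration; the string `"3.14"` and the names `n`, `f` and `s_upper` are made up for the example.

```python
#converting numeric strings into numbers (illustrative values)
s_1 = "65648"
n = int(s_1)       #str -> int
f = float("3.14")  #str -> float
print(n + 1, f * 2)

#strings are immutable: methods return a new string, the original is untouched
s = "this is a string object"
s_upper = s.upper()
print(s)
print(s_upper)
```

Note that `int()` raises a `ValueError` if the string is not a valid whole number (for example `int("7.0")`), so convert with `float()` first when a decimal point may be present.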


### Basic Data Structures

#### Lists 

A list is a collection of **objects that is ordered and changeable**. Lists can contain numerics, strings, other lists, etc. Each object in a list has its position represented by an index value. In Python, indices start at 0 and count up from the front ([0,1,2,3,...]); negative indices count back from the end, so -1 is the last element. Lists are also **mutable, giving you the ability to change, delete and add elements to an existing list without having to make a new one**. For most things you'll be doing with Python for finance you'll want to store your data in lists rather than tuples (see below) unless you have a good reason not to.

In [102]:
#lists can be created with []
l = ["Titanic",1997,"Leo","DiCaprio","Leo"]
print(l)

#other data structures (like tuples) can also be converted into lists using list()
t = ("Titanic",1997,"Leo","DiCaprio","Leo") #a tuple (covered in the next section)
l = list(t)
print(type(l))

#lists are indexed and searched the same way as tuples
print(l[0]) #selecting the 1st item
print(l[-1]) #selecting the last item
print(l.count('Leo')) #counting the # of Leo's in the list
print(l.index(1997)) #finding the position of 1997 in the list

['Titanic', 1997, 'Leo', 'DiCaprio', 'Leo']
<class 'list'>
Titanic
Leo
2
1
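
To make the "mutable" part concrete, here is a small sketch of adding, changing and deleting elements of an existing list in place. The added and replaced values are made up for illustration.

```python
#lists are mutable: the same list object is modified in place
l = ["Titanic", 1997, "Leo", "DiCaprio", "Leo"]

l.append("Kate")   #add an element to the end
l[2] = "Leonardo"  #change the element at index 2
l.remove("Leo")    #delete the first element equal to 'Leo'
print(l)

#slicing returns a new list containing the selected range
first_two = l[:2]
print(first_two)
```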


#### Tuples

Tuples are very similar to lists, with the key difference being that they are **immutable, i.e. once a tuple is created it cannot be changed without creating a new tuple**. Tuples are typically used for grouping together related info, and are used in lieu of lists in certain situations, for example to reduce memory usage or to guarantee that the contents stay fixed. These are scenarios you likely won't need to worry much about. **In almost all cases you should be using a list, not a tuple (unless you have a good reason otherwise).** Think of tuples as the read-only companion to lists.

In [103]:
#tuples can be created with ()
t = ("Titanic",1997,"Leo","DiCaprio","Leo")
print(type(t))

#elements within tuples & lists are selected by their positions
#python list/tuple indices go [0,1,2,...] from the front, and -1 is the last element

#working with tuples
print(t[0]) #selecting the 1st item
print(t[-1]) #selecting the last item
print(t.count('Leo')) #counting the # of Leo's in the tuple
print(t.index(1997)) #finding the position of 1997 in the tuple

<class 'tuple'>
Titanic
Leo
2
1
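
A short sketch of what "immutable" means in practice: assigning to an element of a tuple fails, so a change requires building a new tuple. The replacement value and the names `t_new`, `title` and `year` are made up for illustration.

```python
t = ("Titanic", 1997, "Leo", "DiCaprio", "Leo")

#item assignment is not allowed on a tuple
try:
    t[1] = 1998
except TypeError as err:
    print("TypeError:", err)

#to "change" a tuple you build a new one from pieces of the old one
t_new = t[:1] + (1998,) + t[2:]
print(t_new)

#tuples are handy for grouping related values and unpacking them
title, year = t[0], t[1]
print(title, year)
```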


#### Dictionaries

Dictionaries are mutable collections of objects that are **changeable and indexed by key** rather than by position. They are made with a key/value structure where each unique key has a corresponding value. Since Python 3.7 dictionaries also preserve the order in which keys were inserted. Looking up a value by its key is very fast, and dictionaries can be rapidly expanded, which makes them very useful for large amounts of paired data like employee ID #s & names, positions & salaries, employees & supervisors, etc.

In [110]:
#dictionaries can be created with {}
d = {
     'Jim' : 'EmployeeID #69',
     'Salary' : 67900,
     'Supervisor' : 'Bob'
     }
print(type(d))

#working with dictionaries
print(d["Jim"],d["Salary"]) #using the 'Jim' & 'Salary' keys to obtain the ID # and salary values
print(d.keys()) #obtain a view of all keys in the dictionary
print(d.values()) #obtain a view of all values in the dictionary
print(d.items()) #obtain a view of all the (key, value) pairs in the dictionary

<class 'dict'>
EmployeeID #69 67900
dict_keys(['Jim', 'Salary', 'Supervisor'])
dict_values(['EmployeeID #69', 67900, 'Bob'])
dict_items([('Jim', 'EmployeeID #69'), ('Salary', 67900), ('Supervisor', 'Bob')])
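
Because dictionaries are mutable, keys can be added, updated and deleted on an existing dictionary. The sketch below uses a hypothetical `employee` record (not the `d` from the cell above) to show this, along with looping over key/value pairs.

```python
#hypothetical employee record, for illustration only
employee = {'Name': 'Jim', 'Salary': 67900, 'Supervisor': 'Bob'}

employee['Title'] = 'Analyst'  #add a new key/value pair
employee['Salary'] = 70000     #update the value of an existing key
del employee['Supervisor']     #delete a key/value pair

#.get() returns a default instead of raising KeyError for a missing key
supervisor = employee.get('Supervisor', 'not found')
print(supervisor)

#looping over key/value pairs (in insertion order)
for key, value in employee.items():
    print(key, value)
```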


### NumPy Data Structures

#### Creating NumPy Arrays

NumPy arrays are very similar to lists in how they are indexed and how they are iterated through, although all elements of an array normally share a single data type. The key difference is the types of operations you can perform on them: arithmetic and mathematical functions are applied to every element at once. **You can perform just about any operation you would want on a NumPy array to transform it into what you need it to be**. For this reason NumPy arrays are probably **the most important data structure to familiarize yourself with**, as they are likely the data structure that you will find yourself working with most often.

In [125]:
#numpy arrays can be created with np.array([])
import numpy as np
a = np.array([0,0.5,1.0,1.5,2.0])
print(type(a))

#working with arrays
a[:2] #slicing works just like with lists (not printed, so no output is shown)
print(a.sum()) #sum of all elements
print(a.std()) #standard deviation of the array (population std, ddof=0)

#manipulating arrays (the kinds of things you can't do with lists)
print(a*2) #multiplying every element in array by 2
print(a**2) #squaring every element in array
print(np.sqrt(a)) #taking the square root of every element in array

<class 'numpy.ndarray'>
5.0
0.7071067811865476
[0. 1. 2. 3. 4.]
[0.   0.25 1.   2.25 4.  ]
[0.         0.70710678 1.         1.22474487 1.41421356]
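
To see why these operations are "the kinds of things you can't do with lists", the sketch below applies the same operator to a plain list and to an array built from it. The names `values`, `doubled_list`, `doubled_array`, `mask` and `selected` are made up for illustration.

```python
import numpy as np

values = [0, 0.5, 1.0, 1.5, 2.0]
a = np.array(values)

doubled_list = values * 2  #on a list, * repeats the list (10 elements)
doubled_array = a * 2      #on an array, * multiplies every element
print(doubled_list)
print(doubled_array)

#boolean masks select the elements that satisfy a condition
mask = a > 0.75
selected = a[mask]
print(selected)
```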


In [132]:
#creating array of arrays
b = np.array([a,a*2,a**3])
print(b)

#working with array of arrays
print(b[0]) #selecting first row
print(b[0,3]) #selecting 4th element of first row

#array operations
print(b.sum()) #sum of all elements
print(b.sum(axis=0)) #sum along 0-axis (column-wise sum)
print(b.sum(axis=1)) #sum along 1-axis (row-wise sum)

[[0.    0.5   1.    1.5   2.   ]
 [0.    1.    2.    3.    4.   ]
 [0.    0.125 1.    3.375 8.   ]]
[0.  0.5 1.  1.5 2. ]
1.5
27.5
[ 0.     1.625  4.     7.875 14.   ]
[ 5.  10.  12.5]
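
The `axis` argument is easier to follow once you look at the array's shape. The sketch below rebuilds the same `b` and shows that `axis=0` collapses the rows (one result per column) while `axis=1` collapses the columns (one result per row). The names `col_means`, `row_means` and `second_col` are made up for illustration.

```python
import numpy as np

a = np.array([0, 0.5, 1.0, 1.5, 2.0])
b = np.array([a, a*2, a**3])

print(b.shape)  #(rows, columns)

col_means = b.mean(axis=0)  #one mean per column (5 values)
row_means = b.mean(axis=1)  #one mean per row (3 values)
print(col_means)
print(row_means)

second_col = b[:, 1]  #every row, column index 1
print(second_col)
```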


### Pandas DataFrames

#### The DataFrame

Probably the most useful structure of them all, a Pandas DataFrame is a 2-D, mutable, tabular data structure. In other words, it's **basically a spreadsheet**: columns, rows, data. Simple enough. DataFrames are more or less made up of NumPy arrays and behave in much the same way (they can be operated on more or less however you'd like).

#### Creating a DataFrame

In [164]:
#creating a dataframe
import pandas as pd

df = pd.DataFrame([1,2,3,4],columns = ['numbers'],
                 index = ['a','b','c','d'])
df

| | numbers |
|---|---|
| a | 1 |
| b | 2 |
| c | 3 |
| d | 4 |


In [165]:
#working with the dataframe
print(df.index) #index (row) labels
print(df.columns) #column names
print(df.loc['c']) #selecting a row by its index label
df ** 2 #squaring every value in the df (just like with np arrays)

Index(['a', 'b', 'c', 'd'], dtype='object')
Index(['numbers'], dtype='object')
numbers    3
Name: c, dtype: int64


| | numbers |
|---|---|
| a | 1 |
| b | 4 |
| c | 9 |
| d | 16 |


In [166]:
#manipulating dataframe

df['floats'] = (1.0,2.0,3.0,4.0) #adding an additional col

df['names'] = pd.Series(['John','Jim','Mark','Mike'],
                        index = ['d','a','b','c']) #adding an additional col (aligned on index)

#adding a row (df.append was removed in pandas 2.0, so use pd.concat)
df = pd.concat([df, pd.DataFrame({'numbers': 100, 'floats': 5.75,
                                  'names': 'Henry'}, index=['z'])])
df

| | numbers | floats | names |
|---|---|---|---|
| a | 1 | 1.0 | Jim |
| b | 2 | 2.0 | Mark |
| c | 3 | 3.0 | Mike |
| d | 4 | 4.0 | John |
| z | 100 | 5.75 | Henry |
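
Notice that the names were supplied in the order d, a, b, c but ended up on the matching rows. That is index alignment: when an indexed column is assigned, pandas matches values by index label, not by position. A minimal sketch, where the `partial` column is made up to show what happens with missing and extra labels:

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 4], columns=['numbers'],
                  index=['a', 'b', 'c', 'd'])

#values are matched to rows by index label, not by position
names = pd.Series(['John', 'Jim', 'Mark', 'Mike'],
                  index=['d', 'a', 'b', 'c'])
df['names'] = names

#labels missing from the Series become NaN; extra labels ('z') are dropped
partial = pd.Series(['x', 'y'], index=['a', 'z'])
df['partial'] = partial

print(df)
```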


In [169]:
#df basic analytics
print(df.sum()) #calculates the sum of each col (strings are concatenated)
print(df.mean(numeric_only=True)) #calculates the mean of each numeric col

numbers                     110
floats                    15.75
names      JimMarkMikeJohnHenry
dtype: object
numbers    22.00
floats      3.15
dtype: float64
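
A few more spreadsheet-style operations, sketched on a DataFrame rebuilt with the same values as above so the block runs on its own: selecting a column, filtering rows by a condition, and picking out a single cell. The names `big`, `means` and `cell` are made up for illustration.

```python
import pandas as pd

#rebuilding the same table from a dict of columns
df = pd.DataFrame({'numbers': [1, 2, 3, 4, 100],
                   'floats': [1.0, 2.0, 3.0, 4.0, 5.75],
                   'names': ['Jim', 'Mark', 'Mike', 'John', 'Henry']},
                  index=['a', 'b', 'c', 'd', 'z'])

print(df['numbers'])  #selecting a single column (returns a Series)

big = df[df['numbers'] > 2]  #keeping only rows where numbers > 2
print(big)

means = df[['numbers', 'floats']].mean()  #mean of selected columns
print(means)

cell = df.loc['z', 'names']  #single value by row label and column name
print(cell)
```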
