# Python for Data Science
#### See 'Python Library.docx' for additional info

In [3]:
# import math module as m
import math as m

# import copy module
import copy

## Functions
- `int(data)` converts data to 'int' (truncates floats)
- `float(data)` converts data to 'float'
- `str(data)` converts data to string
- `max(data)` returns the maximum value
    - data could be values separated by commas or a list (see numpy for arrays)
- `round(num[, digits])` rounds num to `digits` decimal places, or to the nearest 'int' when `digits` is omitted
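
A quick sketch of the built-ins listed above. The variable names and sample values are made up for illustration only.

```python
# Conversions
whole = int(7.9)           # truncates toward zero, does not round
negative = int(-7.9)       # also truncates toward zero
as_float = float("3.5")    # parses a numeric string
as_text = str(42)          # number to string

# max() accepts separate arguments or a single list
largest_args = max(3, 9, 4)
largest_list = max([3, 9, 4])

# round() with and without the digits argument
two_places = round(3.14159, 2)   # two decimal places
no_digits = round(2.7)           # returns an int

print(whole, negative, as_float, as_text)
print(largest_args, largest_list, two_places, no_digits)
```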

#### Math Module
- `import math as m`
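- `m.pow(num, pow)` raises num to pow and always returns a 'float' (compare `num ** pow`)
- `m.sqrt(num)` returns the square root of num as a 'float'
- `m.ceil(num)` rounds num up to the nearest integer
- `m.floor(num)` rounds num down to the nearest integer
- `m.pi` is a constant holding pi to 15 decimal places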
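
The math functions above, sketched with arbitrary sample values:

```python
import math as m

power = m.pow(2, 10)          # float result, even for integer inputs
root = m.sqrt(81)             # float result
up = m.ceil(4.1)              # next integer up
down = m.floor(4.9)           # next integer down
circle_area = m.pi * 3 ** 2   # area of a circle with radius 3

print(power, root, up, down, circle_area)
```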

## Lists
#### List Comprehensions
- Build a new list in a single expression; usually shorter and often faster than an equivalent for loop with `.append()`
- Work on any iterable, not just lists
- `new_list = [var_statement for var in old_list]`
    - `var_statement` is your calculation or code to execute for each list item
    - `var_statement` should reference `var` which is assigned by you and represents each element
    - `old_list` is the original list
- `new_list = [statement_with_var1_var2 for var1 in list1 for var2 in list2]`
    - nested syntax: loops over every combination of `var1` and `var2`
- `var_statement` is the expression evaluated for each element
    - wrapping it in () makes each element a tuple
    - wrapping it in [] makes each element a list
    - can also be arithmetic or a function call
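
A minimal sketch of the three forms described above. The lists `prices`, `sizes`, and `colors` and the 8% tax rate are hypothetical sample data.

```python
prices = [9.99, 1.99, 4.50]

# one output element per input element
with_tax = [round(p * 1.08, 2) for p in prices]

# each element can itself be a tuple
flagged = [(p, p > 2) for p in prices]

# nested form: every combination of size and color
sizes = ["S", "M"]
colors = ["red", "blue"]
combos = [size + "-" + color for size in sizes for color in colors]

print(with_tax)
print(flagged)
print(combos)
```

In the nested form the first `for` is the outer loop, so all colors are produced for "S" before moving on to "M".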

#### List Operations to Remember
- Slicing
    - `list[start:end:step]` just like with numpy arrays (`end` is not included)
    - if start or end is blank, starts at the beginning or goes to the end respectively
    - default step is 1
- Creating a Deep Copy
    - `copy.deepcopy(list)` returns an independent copy of a list (including any nested lists), not just another reference to it
- `len(list)`, `min(list)`, and `max(list)` all perform as expected
- Sorting
    - `sorted(list[, key=function])`
        - **returns a new list** sorted, with an optional function applied to each item to produce its sort key
            - a key function like `key=str.lower` does not affect values, just sort behavior
            - see the `.sort()` method for sorting a list in place
    - `list.sort([key=function])`
        - **modifies the current list** in place and returns `None`, sorting by the optional key function
        - default sort order for strings in both cases is 0-9, A-Z, a-z
- `list.append(item)` adds item to the end of a list
- `list.remove(item)` removes the first occurrence of item from the list (`ValueError` if not found)
- `list.index(item)` returns the index of the first occurrence of item (`ValueError` if not found)
- `list.pop([index])` removes and returns item at specified index (or last item by default)
- `list.count(item)` returns the number of occurrences of item in list or 0 if none
- `list.reverse()` reverses the order of items in the list
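
The list operations above, sketched on made-up sample lists (`letters` and `nested` are illustrative names):

```python
import copy

letters = ["b", "A", "c", "a", "B"]

# slicing
first_two = letters[:2]        # start left blank
every_other = letters[::2]     # step of 2
backwards = letters[::-1]      # negative step walks the list in reverse

# sorted() returns a new list and leaves the original alone
default_order = sorted(letters)
ignore_case = sorted(letters, key=str.lower)

# reference vs deep copy
nested = [[1, 2], [3, 4]]
alias = nested                  # another reference to the same list
deep = copy.deepcopy(nested)    # independent copy, inner lists included
nested[0].append(99)            # visible through alias, not through deep

# .sort() changes the list itself
letters.sort()
last = letters.pop()            # removes and returns the last item

print(default_order, ignore_case)
print(alias[0], deep[0])
print(letters, last)
```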

## Dictionaries
- Checking if a key exists
    - `if key in dictionary:`
- Deleting items
    - `del dictionary['key']`
    - `dictionary.pop('key'[, default_value])`
        - `pop` removes the item and returns its value
        - both raise a `KeyError` if 'key' doesn't exist, unless `pop` is given a 'default_value', which it returns instead
    - `dictionary.clear()` deletes all dictionary items
- `dictionary.get(key[, default_value])`
    - returns the value associated with key; if key doesn't exist, returns default_value (or `None` when no default is given)
    - works like `dictionary[key]` except a missing key never raises a `KeyError`
- `dictionary.keys()`
    - returns a view object of all the keys (**default iterator for a dictionary**)
- `dictionary.items()`
    - returns a view object containing a tuple of each key/value pair
- `dictionary.values()`
    - returns a view object containing all of the values
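
A short sketch of the dictionary operations above, using a hypothetical `stock` dictionary:

```python
stock = {"hammer": 11, "nails": 24}

has_nails = "nails" in stock        # membership test checks keys
missing = stock.get("saw")          # no error for a missing key
fallback = stock.get("saw", 0)      # default value supplied

removed = stock.pop("nails")        # removes the key, returns its value
safe = stock.pop("saw", None)       # default avoids a KeyError
del stock["hammer"]                 # would raise KeyError if the key were absent

remaining = list(stock.keys())      # view objects convert to lists

print(has_nails, missing, fallback, removed, safe, remaining)
```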
    
#### Looping Through Dictionaries
- Iterating a dictionary directly yields its keys; use the `.keys()`, `.items()`, or `.values()` methods to choose what to loop over
- `for key in dictionary.keys():`
      `statements with dictionary[key]`
- `for key, value in dictionary.items():`
      `statements using key and value`
- `for value in dictionary.values():`
      `statements using value`
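
The three loop forms above, sketched with the same hypothetical `stock` dictionary:

```python
stock = {"hammer": 11, "nails": 24}

# keys (looping over the dictionary itself does the same thing)
names = []
for key in stock.keys():
    names.append(key)

# key/value pairs unpacked from each tuple
lines = []
for key, value in stock.items():
    lines.append(f"{key}: {value}")

# values only
total = 0
for value in stock.values():
    total += value

print(names, lines, total)
```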

## Formatting Numbers as Strings
- Used for display or converting numbers to strings
- `"{:format_specification}".format(data)` formats supplied `data` to `:format_specification`
- It is possible to pass multiple format specifications and multiple data entries/types
    - `"{:form_spec1}{:form_spec2}{:form_spec3}".format("str", float, int)`
    - can print grid if doing multiple times using consistent `field_width` in each position
- format_specification is composed of:
    - `[field_width][,][.decimal_places][type_code]`
    - `field_width` is the minimum width in characters, given as an 'int'
        - by default, strings are justified left and numbers justified right
        - add `>` or `<` just before the width to specify 'right' or 'left' justified
    - `,` adds commas as thousands separators in large numbers
    - `.decimal_places` dot with number of decimal places to include
    - `type_code` specifies data type
        - `d` integer (decimals can't be specified)
        - `f` float (will round to specified decimal places)
        - `%` percent (supply decimal, it multiplies by 100 and adds '%')
        - `e` converted to scientific notation

In [4]:
# format grid example
heads = ["item", "price", "qty"]
x = ["hammer", 9.99, 11]
y = ["nails", 1.99, 24]
print("{:15}{:>8}{:>8}".format(heads[0], heads[1], heads[2]))
print("{:15}{:8.2f}{:8d}".format(x[0], x[1], x[2]))
print("{:15}{:8.2f}{:8d}".format(y[0], y[1], y[2]))

item              price     qty
hammer             9.99      11
nails              1.99      24
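
The grid example above covers field widths with `f` and `d`. The remaining pieces of the specification (`,`, `%`, `e`, and explicit justification) can be sketched the same way; the sample numbers are arbitrary.

```python
big = "{:,}".format(1234567)             # thousands separators
money = "{:>12,.2f}".format(1234.5)      # width 12, commas, 2 decimal places
share = "{:.1%}".format(0.256)           # multiplied by 100, '%' appended
sci = "{:.2e}".format(12345.678)         # scientific notation
left_num = "{:<6d}|".format(42)          # number forced to left justify

for text in (big, money, share, sci, left_num):
    print(text)
```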


## Program Structure
#### Shebang
- `#!/usr/bin/env python3`
    - first line of a .py file
    - tells Unix-like systems which interpreter to use; Windows itself ignores it, although the `py` launcher can read it

#### Three-Tier Approach
- Good practice is to structure a program in three tiers
- Main function
    - starts an application, but is defined last
    - `def main():` with code to execute the app
    - `if __name__ == '__main__':`
          `main()`
- UI Tier
    - User interface
    - Contains the `main()` function
    - Console app is usually procedural code vs GUI which is object-oriented
- Business or Object Tier
    - Processing tier
    - Uses OOP (classes/objects) to process data from the database tier
- Database Tier
    - Provide database access
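
A minimal sketch of how the tiers might fit together in one file. Everything here is hypothetical: an in-memory dict stands in for a real database, plain functions stand in for business classes, and the names `fetch_price` and `order_total` are invented for illustration.

```python
#!/usr/bin/env python3

# --- Database tier (assumption: a dict stands in for a real database) ---
_PRICES = {"hammer": 9.99, "nails": 1.99}


def fetch_price(item):
    """Return the stored price for item, or None if it is unknown."""
    return _PRICES.get(item)


# --- Business tier ---
def order_total(item, qty):
    """Compute the cost of qty units of item."""
    price = fetch_price(item)
    if price is None:
        raise ValueError(f"unknown item: {item}")
    return round(price * qty, 2)


# --- UI tier: main() is defined last and started at the bottom ---
def main():
    print("{:15}{:>8}".format("item", "total"))
    print("{:15}{:8.2f}".format("hammer", order_total("hammer", 2)))


if __name__ == "__main__":
    main()
```

Each tier only calls the one below it, so the console UI could later be swapped for a GUI without touching the other two.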