
# Intro to Python

This tutorial will cover the following topics:

* basics of Jupyter notebooks
* variables
* operators
* lists
* dictionaries

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](http://colab.research.google.com/github/jayunruh/python_introDS_course/blob/master/python_class_01.ipynb)

# Jupyter Notebooks
## Markdown blocks
* A Jupyter notebook is an _interactive_ document used to run code (mostly Python, but other languages are supported)
* A 'cell' is a section of markdown or code that can be run

  * This is a markdown cell

* You'll notice that the formatting changes inside the cell when you are editing, and after you 'run' the cell

* You can create nicely formatted blocks of text with markdown [here's a great cheatsheet!](https://www.ibm.com/docs/en/watson-studio-local/1.2.3?topic=notebooks-markdown-jupyter-cheatsheet)

* Markdown sections can be used to split up your analysis, and help define the 'story' of your analysis.


### Objects

* Python is an object-oriented programming language, meaning almost everything you interact with is an _object_
  * An object is basically a fancy way to describe a collection of data (a _variable_) and associated methods (_functions_)

### Variables

* Variables are the staple of any programming language.  They are simply names that act as placeholders for values.  Those values could be numbers or text strings (or objects, like a table!).  
  * There are few rules on naming them, but they cannot contain spaces or operators (e.g. + or -), and they cannot be Python keywords.  To create and assign a variable, simply enter its name followed by = and the value to assign.

   * Tip: It's also good practice to avoid using spaces or operators in your file names!

* Not all variables are created equal - in fact, there are several different types of variables, but the underlying principles are the same!
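
As a small illustration of the naming rules above, the sketch below uses the standard-library `keyword` module and the string method `isidentifier` to check a few candidate names. The names `sample_count` and `sample_name` are made up for this example.

```python
import keyword

# Valid names: letters, digits and underscores, not starting with a digit
sample_count = 3
sample_name = "control"

# 'class' is a reserved Python keyword, so it cannot be used as a variable name
print(keyword.iskeyword("class"))         # True
print(keyword.iskeyword("sample_count"))  # False

# isidentifier() checks whether a piece of text follows the naming rules
print("sample count".isidentifier())      # False, because of the space
print("sample_count".isidentifier())      # True
```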

### Functions

* Don't worry - we won't learn too much about functions today. For our purposes, just keep in mind that every time you 'print' something, or find the max value, or do anything that involves enclosing a variable in parentheses or putting a '.' after a variable name, you're calling a function!
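
To make the two calling styles concrete, here is a minimal sketch. The variable `greeting` is a made-up example; `len`, `max`, and `upper` are standard Python.

```python
greeting = "hello"

# Calling built-in functions: the name is followed by parentheses
print(len(greeting))      # number of characters: 5
print(max(3, 7, 5))       # largest value: 7

# Calling a method: a '.' after the variable, then the method name and parentheses
shouted = greeting.upper()
print(shouted)            # HELLO
```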


In [None]:
print("This is a code block!")
print("When you hit shift+enter, or click the 'play' button - the code in this block will be executed!")
print("You can rerun a cell at any time, but you must be careful not to overwrite existing variables")


### Creating, viewing and manipulating variables

1. Assign a variable using the '=' operator
2. View the contents of a variable using the `print` function
3. View the type of a variable using the `type` function

In [None]:
# This is a comment. Comments are identified by using the '#' symbol, and are not run as code.
my_variable = 2 
my_string = "This is a string" 
print(my_variable)
print(type(my_variable))
print(my_string)
print(type(my_string))

* Different `types` of variables have different methods (read: built-in functions) associated with them. For instance, you can use any mathematical operation on any variable that's an int, like `my_variable`
  * Variables don't always share the same methods. Check out what happens when you try to divide `my_string` by 2

In [None]:
#create a new variable "half_variable" and set it to my_variable/2
half_variable=my_variable/2
#now report that value and its type
print(half_variable)
print(type(half_variable))

# What happens if I try to divide a string by 2?
my_string/2
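
The division above stops with a `TypeError`. The sketch below shows which operators a string does support, and uses `try`/`except` (not covered in this tutorial, so treat it as optional) to catch the error so the rest of the cell keeps running.

```python
my_string = "This is a string"

# Strings support '+' (concatenation) and '*' (repetition)...
print(my_string + "!")
print("ab" * 2)   # abab

# ...but not division. Catching the error lets the rest of the cell keep running.
try:
    my_string / 2
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```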

* In addition to type, there are a few more helpful methods to view the contents and characteristics of a variable
  * These include useful information about what methods are available, and even documentation about the creation of the variable

* Don't worry too much about reading the code output - right now it's enough to know that the information is available to you

In [None]:
print(my_string.__doc__)

dir(my_string)
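
The output of `dir` is long because it includes many names that start with underscores. One possible way to make it easier to read is to keep only the other names, as sketched below. The variable `public_names` is introduced here for illustration.

```python
my_string = "This is a string"

# Keep only the names that do not start with an underscore
public_names = [name for name in dir(my_string) if not name.startswith("_")]
print(public_names[:5])

# Each method carries its own documentation too
print(my_string.upper.__doc__)
```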

* Sometimes, performing an operation on a variable is enough to change its `type` and therefore the methods available to it (and therefore the things you can do to it!)

In [None]:
my_num = 1
my_string_num = '1'

print(my_num/2)
print(my_string_num.upper()) 

my_num == my_string_num
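
A short sketch of what is happening in the cell above: dividing an int with `/` gives back a float, and an int is never equal to a string even when the two print the same way. The name `halved` is introduced here for illustration.

```python
my_num = 1
my_string_num = '1'

# Dividing an int with '/' always gives a float
halved = my_num / 2
print(type(my_num))    # <class 'int'>
print(type(halved))    # <class 'float'>

# An int and a str are never equal, even if they look alike when printed
print(my_num == my_string_num)        # False
print(str(my_num) == my_string_num)   # True
```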

* Sometimes you may want to intentionally change the type of the variable you're working with

In [None]:
my_num_str = str(my_num)
print(my_num_str)
print(type(my_num_str))
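
Conversion works in the other direction as well. The sketch below uses the built-in `int` and `float` functions. The names `as_int` and `as_float` are made up for this example, and the `try`/`except` part is optional reading.

```python
my_string_num = '1'

# int() and float() turn numeric text into numbers
as_int = int(my_string_num)
as_float = float("2.5")
print(as_int + 1)      # 2
print(as_float * 2)    # 5.0

# Text that does not look like a number cannot be converted
try:
    int("one")
except ValueError as err:
    print("ValueError:", err)
```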

* There are quite a few fun operators that you may have seen before, including `==` or `!=` and `+=` or `-=`

In [None]:
#try incrementing our half_variable
half_variable+=1
print(half_variable)
#is it equal to the original now?
print(half_variable==my_variable)
print(half_variable != my_variable)

* One of the interesting and powerful things about variables is that they can contain boolean (true or false) values:

In [None]:
isequal=half_variable==my_variable
print(isequal)
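
Boolean values have their own type, `bool`, and can be combined with `and`, `or`, and `not`. The sketch below sets `half_variable` and `my_variable` directly to the values they hold at this point in the notebook, so that it runs on its own.

```python
half_variable = 2.0
my_variable = 2

isequal = half_variable == my_variable
print(type(isequal))                 # <class 'bool'>
print(isequal and my_variable > 1)   # True
print(not isequal)                   # False
```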

### Lists
* The real power of programming is doing a large number of operations at once.  To collect variables together we use lists.  Those of course get assigned to variables.  Once you make a list you can access its values through bracketed indexes. 
  * A `list` does not take on the methods of the elements it contains; instead it has its own set of methods.


* Note that unlike R, Python indexing starts at '0'

In [None]:
my_array=[10.0,20.0,25.0]
print(type(my_array))
print(my_array[0])
print(my_array[2])

In [None]:
#len gets the length of an array
print(len(my_array))

* How can you access elements of your list?
  * The power of indexing!

In [None]:
print(my_array[-1]) #this is the last array value
print(my_array[-2]) #this is the second to last value

* You can work with list values just like any other variables.  You can combine arrays using the + operator.

In [None]:
long_array=my_array+[0,1,2]
print(long_array)

* You may have noticed that Python is ok with mixing data types in arrays.  You can add to an array with "append".

In [None]:
long_array.append(5)
print(long_array)
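
One difference between `+` and `append` is worth seeing side by side: `+` builds a new list, while `append` changes the existing list in place. The names `combined` and `result` are introduced for this sketch.

```python
my_array = [10.0, 20.0, 25.0]

# '+' builds a new list and leaves the original untouched
combined = my_array + [0, 1]
print(my_array)    # [10.0, 20.0, 25.0]
print(combined)    # [10.0, 20.0, 25.0, 0, 1]

# append changes the list in place and returns None
result = my_array.append("text")
print(my_array)    # [10.0, 20.0, 25.0, 'text']
print(result)      # None
```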

* You can get parts of an array by "slicing" with start:stop notation, where stop is one past the last index you want (end+1):

In [None]:
long_array[1:4]
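
A few more slicing patterns, sketched on a list with the same contents as `long_array` at this point in the notebook (redefined here so the example runs on its own):

```python
long_array = [10.0, 20.0, 25.0, 0, 1, 2, 5]

print(long_array[1:4])   # [20.0, 25.0, 0]      (indexes 1, 2 and 3)
print(long_array[:2])    # [10.0, 20.0]         (start defaults to 0)
print(long_array[-2:])   # [2, 5]               (last two values)
print(long_array[::2])   # [10.0, 25.0, 1, 5]   (every second value)
```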

* If you want to create an array of repeated items, you can multiply it by the number of repeats:

In [None]:
my_repeats=[5]*5
print(my_repeats)

If you want you can even make lists of lists--the repeat multiplication works for this too.  This is as close as Python gets to multidimensional arrays outside of NumPy.

In [None]:
list_list=[long_array]*3
print(list_list)
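
One caveat with repeat multiplication on a list of lists: every entry refers to the same inner list, so changing one entry changes them all. The sketch below shows this, along with one possible way to get independent inner lists. The names `inner`, `shared_rows`, and `independent_rows` are made up for this example, and the second part uses a list comprehension, which this tutorial does not cover.

```python
inner = [0, 1]
shared_rows = [inner] * 3

# All three entries are the same list object, so one change shows up everywhere
shared_rows[0].append(99)
print(shared_rows)        # [[0, 1, 99], [0, 1, 99], [0, 1, 99]]

# Building a fresh inner list for each entry keeps them independent
independent_rows = [[0, 1] for _ in range(3)]
independent_rows[0].append(99)
print(independent_rows)   # [[0, 1, 99], [0, 1], [0, 1]]
```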

### Dictionaries 
* Lists are a special type of construct called a "sequence".  Note that text strings are also sequences.  Dictionaries are a kind of construct that is not a sequence--values are assigned by name rather than index:

In [None]:
my_dictionary={'value1':10.0,'value2':20.0,'another_value':1,10:3}
print(my_dictionary)


* Accessing the value of a key is as simple as indexing the dictionary with the key of choice in square brackets

In [None]:
print(my_dictionary['value2'])
print(my_dictionary['another_value'])
print(my_dictionary[10])
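
Asking for a key that does not exist raises a `KeyError`. As a hedged sketch, two standard ways to guard against that are the `in` operator and the dictionary method `get`. The key `'missing'` is made up for this example.

```python
my_dictionary = {'value1': 10.0, 'value2': 20.0, 'another_value': 1, 10: 3}

# 'in' checks whether a key exists
print('value1' in my_dictionary)         # True
print('missing' in my_dictionary)        # False

# get() returns a default instead of raising a KeyError
print(my_dictionary.get('missing', 0))   # 0
```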

* You can also access keys and values independently of each other

In [None]:
print(my_dictionary.keys())
print(my_dictionary.values())
max(my_dictionary.values())

* Dictionaries are great ways to store information by label rather than by numerical index.  You can add values to a dictionary through simple assignment.

In [None]:
my_dictionary['a_string']='value'
print(my_dictionary)

In [None]:
max(my_dictionary.values())
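
The `max` call above fails with a `TypeError` once the dictionary holds a string next to numbers, because Python cannot compare the two. One possible workaround, sketched below, is to keep only the numeric values first. The name `numeric_values` is introduced for illustration, and the dictionary is redefined so the example runs on its own.

```python
my_dictionary = {'value1': 10.0, 'value2': 20.0, 'another_value': 1, 10: 3, 'a_string': 'value'}

# Keep only the values that are ints or floats before asking for the maximum
numeric_values = [v for v in my_dictionary.values() if isinstance(v, (int, float))]
print(numeric_values)        # [10.0, 20.0, 1, 3]
print(max(numeric_values))   # 20.0
```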

# On reading error codes and troubleshooting issues

* The number one skill in programming is learning to read error messages - most of the time they're going to be pretty informative. 
* You can use Google to search for errors - but it's most likely going to take you to StackOverflow: https://stackoverflow.com/

# On the possible pitfalls of Jupyter Notebooks
* The order in which you run cells is important - it is easy to overwrite a variable in a later chunk such that re-running an earlier chunk produces a different result

In [None]:
my_example = "pretend this is the result of a super complex dataframe manipulation"

In [None]:
my_example[0] == 'p'

In [None]:
my_example = my_example.upper()
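
The three cells above can be collapsed into one sketch that shows the effect of running them in order and then going back to re-run the middle one. The names `first_check` and `second_check` are made up for this example.

```python
my_example = "pretend this is the result of a super complex dataframe manipulation"

first_check = my_example[0] == 'p'    # True on the first run
my_example = my_example.upper()       # the later cell overwrites the variable
second_check = my_example[0] == 'p'   # False when the earlier check is re-run

print(first_check, second_check)      # True False
```

Restarting the kernel and running all cells from top to bottom is a simple way to confirm that a notebook still gives the results you expect.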

# On examples that we can do if we have time...

In [None]:
# What are the types of the 3 values in our dictionary? What methods are available to them?
my_dict = {
    'value1': 'string',
    'value2': 2,
    'value3': {"my_dict_of_dicts": 1}
}