## Workshop II&mdash;Basic Python

This is a beginner's Python workshop with a specific goal in mind: my focus is on summarizing the Python you need for data analysis, and nothing more. If you're serious about this domain, you will also need to spend a lot of time learning Python on your own!

Obligatory print statement.

In [None]:
print("Hello World!")

Print statement in a function.

In [None]:
def print_stuff():
    print("Hello World!")

Now we can execute it.

In [None]:
print_stuff()

Functions can take arguments.

In [None]:
def print_better_stuff(arg):
    print(arg)

In [None]:
print_better_stuff("Better stuff right here!")

When an argument is declared this way it is mandatory: the function call will fail otherwise.

In [None]:
print_better_stuff()

Python has a facility for optional parameters.

In [None]:
def print_even_better_stuff(stuff="Hello World!"):
    # Print the argument; it falls back to the default when none is given.
    print(stuff)

In [None]:
print_even_better_stuff()

In [None]:
print_even_better_stuff(stuff="Goodbye World!")

Print statements are great--you will use them to debug your code forever after. But let's move on to functions which actually do stuff to things, and then return them. This is the bread and butter of programming. In Python this is handled by a `return` statement.

In [None]:
def print_this_thing(stuff_to_print):
    return "There, I gave you " + stuff_to_print + ". Happy?"

In this case we are building a `string`. One of the simplest data types, a string is just that: a string of characters.

In [None]:
print_this_thing("50 bucks")

You can tell this is a string because it has single quotes (`'`) around it. You can also write strings with double quotes; the difference is only a minor technicality:

In [None]:
"Also a string."
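As a small sketch of that technicality (the variable names here are made up for illustration): both quote styles produce the same string, and the main practical difference is which quote character you can use inside the string without escaping it.

```python
# Both quote styles build exactly the same kind of object.
single = 'Also a string.'
double = "Also a string."
print(single == double)  # True

# The practical difference: which quote you can use inside without escaping.
with_apostrophe = "Elmo's list"
with_quotes = 'He said "hello"'
print(with_apostrophe)
print(with_quotes)
```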

Other data types are integers:

In [None]:
1 + 1

And floats (for "floating point").

In [None]:
3.14 + 42
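If you are ever unsure what kind of value you are holding, the built-in `type()` function will tell you. A quick sketch (the name `mixed` is just for illustration):

```python
# type() reports what kind of object a value is.
print(type(1 + 1))        # <class 'int'>
print(type(3.14 + 42))    # <class 'float'>
print(type("50 bucks"))   # <class 'str'>

# Mixing an integer and a float gives a float.
mixed = 3.14 + 42
print(isinstance(mixed, float))  # True
```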

All of these data types represent instances of `objects`. We can bind objects to names so that we can use them to do useful stuff.

In [None]:
life = 1
death = -1
life + death # 1 + -1

Functions which `return` something return objects.

In [None]:
def return_42():
    return 42

life = return_42()
death = -return_42() # Hey! We took a negative!
life + death

Python is an object-oriented language, as are many of the most widely used programming languages today, so in addition to these simple objects (strings, floats, etc.) we can also define our own, more complicated objects. Here's how to create one:

In [None]:
class Life_Universe_Everything:
    
    def __init__(self):
        self.answer = 42
    
    def answer_question(self):
        return self.answer

In [None]:
question = Life_Universe_Everything()

There's a lot going on here - but in the interest of time we'll gloss over the details, since we'll only be using pre-existing objects in this project, not defining our own. Instead I want to point out two things:

1. To run an object method, do `object.method()`.
2. To access an object attribute, do `object.attribute`.

Since our `answer_question()` method is pretty silly, we can answer this question in two ways:

In [None]:
question.answer_question()

In [None]:
question.answer

These kinds of structures are especially important in data science, where robust error handling is super critical.

There are two kinds of basic data structures that we will use right away: `lists` and `dicts`. Here's a list:

In [None]:
silly_list = [1, 2, 3, 4, "Elmo wants to count some more!"]
print(silly_list)

To get an item out of a list you use its index, which is the item's position in the list. **Note that indexes in Python start at 0, not 1**. See:

In [None]:
print(silly_list[0])
print(silly_list[1])

If you try to get an item that is outside of the bounds of the list, you get an explosion (an `IndexError`):

In [None]:
print(silly_list[5])
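To see why index 5 is out of bounds, it helps to check the length of the list. A minimal sketch, redefining the same list so the cell runs on its own (`last_index` is a name I made up for illustration):

```python
silly_list = [1, 2, 3, 4, "Elmo wants to count some more!"]

# The list has 5 items, so the valid indexes run from 0 to 4.
print(len(silly_list))            # 5
last_index = len(silly_list) - 1
print(last_index)                 # 4
print(silly_list[last_index])     # Elmo wants to count some more!
```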

The other data structure is the dict. A dict stores information by name: each value is filed under a key, and you look it up by that key.

In [None]:
silly_dict = {
    "One": 1,
    "Two": 2,
    "Three": 3,
    "Four": 4,
    "Five": 5,
}

To index these we have to call what we want by name.

In [None]:
silly_dict["One"]
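Asking for a name that isn't in the dict fails in much the same way as going out of bounds on a list. Here is a minimal sketch, using a smaller made-up dict called `small_dict`: the `in` operator checks whether a key exists, and the `.get()` method returns a fallback value instead of raising an error.

```python
small_dict = {"One": 1, "Two": 2}

print("One" in small_dict)   # True
print("Six" in small_dict)   # False

# .get() returns the fallback value (here 0) when the key is missing,
# whereas small_dict["Six"] would raise a KeyError.
fallback = small_dict.get("Six", 0)
print(fallback)              # 0
```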

We can control which code gets run using a so-called `if-else` block.

In [None]:
def count_to_four(name):
    if name == 'Elmo':
        print("One, two, three, four, I want to count some more!")
    else:
        print("Are you questioning my intelligence?")

In [None]:
count_to_four("Elmo")

In [None]:
count_to_four("You")

`==` here is a comparison operator. It checks if the `name` argument that we pass really is `Elmo`. `If` it is, do this, or `else` do that.
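A comparison is itself just an expression that evaluates to `True` or `False`, which you can check directly. A short sketch (the names `name` and `is_elmo` are only for illustration):

```python
name = "Elmo"
print(name == "Elmo")   # True
print(name == "You")    # False
print(name != "You")    # True

# Careful: a single = binds a name, a double == compares two objects.
is_elmo = (name == "Elmo")
print(is_elmo)          # True
```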

Loops let you run the same block of code many times. There are two types: `while` loops and `for` loops. A `while` loop executes as long as a condition holds; a `for` loop iterates through the items of a list (or another sequence).
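Here is a minimal sketch of both kinds of loop. It redefines the same `silly_list` as earlier so that the cell runs on its own; the `count` variable is just for illustration.

```python
# A while loop keeps running as long as its condition is True.
count = 1
while count <= 4:
    print(count)
    count += 1   # without this line the loop would never end

# A for loop visits each item of a list in turn.
silly_list = [1, 2, 3, 4, "Elmo wants to count some more!"]
for item in silly_list:
    print(item)
```

The `while` loop prints 1 through 4 and stops once `count` reaches 5. The `for` loop prints every item of the list, one per line.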

Most programming languages implement so-called `try-catch` blocks to help with handling errors. Python is no exception, though it spells them `try-except`. Here's an example in action:

In [None]:
try:
    print(1 + "Dagnabit!")
except TypeError:
    print("You can't add an integer and a string! Like what?")

You can also `raise` your own errors.

In [None]:
raise OSError("Most troubling, Master Bruce.")

Why reinvent the wheel? A lot of libraries (packages) exist out there. At a minimum they solve many of the problems that you will encounter; at best they open up whole new worlds to explore.

To get these packages yourself you need to `pip install` them in the command console. We did that a bunch of times for the stuff that we need for this project at the beginning of this session (or at home beforehand, even better!). Then once they're available, you can `import` them so that you can use them.

Here's what happens when your package is not available:

In [None]:
import pseudoscorpion

Of course when it is available, nothing happens - it just works.

In [None]:
import os

There's one more syntax detail that we need to pay attention to. Instead of importing a whole library, you can choose to import just a class or module from it. There are all sorts of constructions that you can use for this stuff, but for our purposes you'll only see three:

* `import library`. Then, if we want the `Book` object in our code we'll need `library.Book`.
* `from library import Book`. Then if we want the `Book` object in our code we just write `Book`, that's it.
* `import library as lib`. Then if we want the `Book` object we'll need `lib.Book`. Nothing major.
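The `library` and `Book` names above are placeholders. As a sketch with a real module, here are the same three forms using the standard library's `math` module and its `sqrt` function (my choice for illustration; any installed package works the same way).

```python
import math                  # use as math.sqrt
from math import sqrt        # use as sqrt
import math as m             # use as m.sqrt

# All three names point at the same function.
print(math.sqrt(16))   # 4.0
print(sqrt(16))        # 4.0
print(m.sqrt(16))      # 4.0
```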

That concludes our crash course in basic Python!

# What you need to be doing on your own

All of our future work is going to be in Python so you really need to bootcamp the language.

Though it's a core skillset, being a pro Python programmer is not a prerequisite for doing data science. It will take you a while, at first, to figure out how to do things. But what you learn will be more memorable and stick with you better in the long term if you learn it while working on a project that interests you, rather than through unrelated assignments that other people come up with.

[This is the best introductory course in data science-targeted Python available](https://www.codecademy.com/). The course is ~13 hours long, which is more than enough time to learn everything that you need to start working on your own projects.

# What we will be doing next time

Today was a code-heavy day. Our next workshop will involve no code at all. Instead, we will work through installing the data science technology stack that we used today but glossed over. Once we learn how all the bits and pieces of a project fit together next week, you'll have all of the tools you need to start working on projects on your own!