![ADSA Logo](http://i.imgur.com/BV0CdHZ.png?2 "ADSA Logo")

# ADSA Workshop 1 - Introduction to Big Data & Python
> Workshop content adapted from https://github.com/ADSA-UIUC/PythonWorkshop_1 by Srujun Gupta and http://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/149190142X

***

Welcome to the first workshop in the ADSA Python series. In this Python tutorial, we will learn how to:
* Declare variables
* Work with strings of text in Python
* Perform arithmetic operations in Python
* Use if-else conditionals
* Create loops
* Work with basic data structures
* Declare functions

To run any block of code, press `Ctrl + ENTER`. This will execute the code and report any errors it encountered.
***

Import the testing helper code for simple tests of your code. Run this block of code so that the tests that we have provided will be able to run.

In [None]:
from test_helper import Test

## Hello world!

Let's start with a classic "Hello world!" program. To print text to the screen in Python, we can use `print` followed by the text in quotation marks. Run the code block below to see its output.

In [None]:
print "Hello world!"

Now try printing `"I am learning Python!"` onto the screen by typing your code into the block below.

***
## Comments
If any line starts with a `#`, then the Python interpreter ignores that line when running your code. You can use these lines, called comments, to annotate and explain your code.

In [None]:
# This is a comment.
# This line is not executed by the interpreter.
# print "This won't print" because it has a #
# The line below this does not start with #, it will be executed.

print "This is not a comment."

***
## Variables and Data Types

A variable is used to hold a value. Variables and values have an associated datatype. Common datatypes are:
* `string` (text)
* `int` (integer)
* `float` (decimal value)
* `boolean` (true or false value)
* Many more

In many programming languages, you must specify the datatype of each variable. Python, however, infers the type from the value you assign. When you declare a variable, you give it a name and specify its value.
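
As a small illustration of this, the built-in `type` function reports the datatype Python inferred for a value. The variable names below are made up for this sketch, and `print(...)` is written with parentheses so that the block should run under both Python 2 and Python 3.

```python
# Hypothetical variables, one for each datatype listed above
course = "Data Science"   # string
students = 40             # int
average = 3.5             # float
is_fun = True             # boolean

# type() returns the datatype Python inferred from each value,
# so each comparison below evaluates to True
print(type(course) is str)
print(type(students) is int)
print(type(average) is float)
print(type(is_fun) is bool)
```

Notice that none of the assignments mention a type; Python works it out from the value on the right-hand side.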

### Strings
Strings are sequences of characters. They must always be enclosed in either single or double quotes. This is how you declare a string variable with the name `my_string` and the value `"Data Science"`.

In [None]:
my_string = "Data Science"

Now try printing `my_string` using the print command like so: `print my_string`.

To create multiline strings, we can enclose the text in triple double quotes (`"""`) like this:

In [None]:
multi_line_string = """This is the first line of our string
and this is the second,
and woah! A third line!
"""

print multi_line_string

To determine the length of a string (the number of characters in it), we can use the built-in Python function `len()` like this:

In [None]:
print len('You can enter a string like this')

new_string = 'Or you can enter a variable like this!'

print len(new_string)

#### Concatenating Strings
You can combine multiple strings into a single string using the `+` operator. In the code below, we concatenate the strings in the variables `first_name` and `last_name` to create the variable `full_name`, and then print it. Replace the values of `first_name` and `last_name` with your own name, and run the program.

In [None]:
first_name = 'Joe '
last_name = 'Python'

full_name = first_name + last_name 
print full_name

Now create two strings with names `word1` and `word2` and concatenate them to form the string `BigData`. Store this in the variable `my_word`.

In [None]:
word1 = 
word2 = 

my_word = 

In [None]:
Test.assertEquals(my_word, "BigData", "incorrect string value for variable my_word")

You can add a space between the two concatenated strings like so:

In [None]:
my_word = word1 + " " + word2
print my_word

### Numbers

You can do all of the basic operations with integers. Addition and subtraction use the standard plus and minus symbols. Multiplication uses the asterisk, and division uses a forward slash. Exponents use two asterisks.

In [None]:
print 3+2

In [None]:
print 3-2

In [None]:
print 3*2

In [None]:
print 3/2 # Dividing two integers returns the quotient rounded down to an integer

In [None]:
# If we want integer division to return the real value, we can import the division feature from the __future__ module
from __future__ import division 

print 3/2
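
A brief sketch of two related forms, which behave the same with or without the import above: dividing by a `float` keeps the fractional part, and the `//` operator always rounds the quotient down. The variable names are only for illustration, and `print(...)` is written with parentheses so that the block should run under both Python 2 and Python 3.

```python
# A float operand makes the division keep its fractional part
real_quotient = 3 / 2.0
print(real_quotient)

# The // operator always rounds the quotient down to an integer
floored_quotient = 3 // 2
print(floored_quotient)
```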

In [None]:
print 3**2 # This is 3 raised to the power of 2

With decimal numbers (called `float`), you can do the same operations.

In [None]:
print 1.7*4.4

In [None]:
print 33.3/11.1

And like strings, you can store them in variables and refer to them by the variable names.

In [None]:
x = 5
y = 12

# The + sign below does not concatenate the two variables x and y because they are not strings.
# The + sign in this case will compute the sum of the two variables.
print x+y

Now try storing the value of `x-squared plus y-squared` in the variable `z`.

In [None]:
z = 

In [None]:
Test.assertEquals(z, 169, "incorrect value of variable z, should be 169.")

***
## If-Else Conditional
Has an application ever asked you a question? Maybe it asked you if you really want to quit because unsaved changes might be lost, or if you want to leave a webpage. If you answer OK, one thing happens. But if you answer No or Cancel, something else happens. In all those cases there is a special piece of code that is being run somewhere. It is an if condition.

Like all languages, Python allows us to conditionally run code.

To have an if condition we need the idea of something being true and something being false. Remember, we call numbers "integers" and "floating point", and text "strings". We call true or false "boolean" values. True would represent OK, whereas false would represent No or Cancel in the example above.

The literal values in Python for true and false are `True` and `False`.

Try running the blocks of code below to get a sense of how boolean conditions work.

In [None]:
1 > 2

In [None]:
"Cool".startswith("C")

In [None]:
"Cool".endswith("C")

In [None]:
"oo" in "Cool"

In [None]:
42 == 1 # note the double equals sign for equality

To write an "if" statement, we need code that spans multiple lines:

    if condition:
        print("Condition is True")
    else:
        print("Condition is False")

A few things to notice: the if line ends in a colon (":"), and so does the else line ("else:"). In Python, a colon introduces a block of code, and the lines that belong to the block are grouped by their indentation (white space). Let's try changing the condition and see what happens.

In [None]:
# condition holds either a True or False value
condition = 1 > 2
if condition:
    print("Condition is True")
else:
    print("Condition is False")

About that white space: consider the code below. Since the last print statement isn't indented, it is not part of the if block or the else block, so it runs after whichever of them executes. Also note that the comparison can be used directly in the if statement and does not have to be stored in a variable first.

In [None]:
if 35 >= 17:
    print("Condition is True")
else:
    print("Condition is False")
print("Condition is True or False, either way this is outputted")

You can also use boolean logic inside if conditions using the keywords `and`, `or`, and `not`.
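
Here is a minimal sketch of how these keywords combine conditions. The variables `age` and `has_ticket` are hypothetical, chosen only for this example, and `print(...)` is written with parentheses so that the block should run under both Python 2 and Python 3.

```python
# Hypothetical values for this sketch
age = 20
has_ticket = True

# `and` is True only when both sides are True
can_enter = age >= 18 and has_ticket

# `or` is True when at least one side is True
gets_discount = age < 12 or age >= 65

# `not` flips True to False and False to True
pays_full_price = not gets_discount

if can_enter and pays_full_price:
    message = "Welcome, that will be full price."
else:
    message = "Let's check your ticket or discount."
print(message)
```

Try changing `age` or `has_ticket` and rerunning the block to see which branch runs.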

***
## Data Structures

While declaring simple variables to hold data is very useful, once you begin working with a large set of related data or values, it becomes necessary to group this collection of data in some sort of structure to work with.

Below, we will discuss a few fundamental data structures in Python and computer science in general.

### Lists

Lists are perhaps the most fundamental data structure in Python. A Python `List` is very similar to what other languages call an `Array`, but it has some added functionality.

A list in Python is just like a shopping list or a list of numbers. It has a defined order, and you can add items to it or remove items from it.

Let's take a look at some simple lists.

In [None]:
# The empty list
[]

In [None]:
["Milk", "Eggs", "Bacon"]

In [None]:
[1, 2, 3]

List literals are all about square brackets ("[ ]") and commas (","). You can create a list of literals by wrapping them in square brackets and separating them with commas.

You can even mix different types of things in the same list: numbers, strings, booleans.

In [None]:
[True, 0, "Awesome"]

We can put variables into a list and set a variable to a list.

In [None]:
your_name = "Albert O'Connor"
awesome_people = ["Eric Idle", your_name]
print awesome_people

You can append to a list. The following code lets you add an item to the end of a list.

In [None]:
awesome_people.append("John Cleese")
print awesome_people

Lists in Python, as in many other languages, are 0-indexed. This means that to access the first element in the list, you use the index 0 like so:

In [None]:
awesome_people[0]

And the second element like this:

In [None]:
awesome_people[1]

There are many Python functions that can be used on lists. Here is an example using Python's `len()` and `sum()` functions.

In [None]:
my_list = [1,2,3,4,5]

print len(my_list) # Prints the number of items in the list

print sum(my_list) # Sums up the total value of all integers in the list

Another useful list operation is `in`, which returns `True` if a value is in the list and `False` if it is not.

In [1]:
my_list = [1,2,3,4,5]

print 3 in my_list # Checks if the value 3 appears in our list

print 6 in my_list # Checks if the value 6 appears in our list

True
False
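
The `in` check pairs naturally with an if-else conditional. The sketch below uses a hypothetical `shopping_list` and only adds an item when it is not already there; `print(...)` is written with parentheses so that the block should run under both Python 2 and Python 3.

```python
# Hypothetical list and item for this sketch
shopping_list = ["Milk", "Eggs", "Bacon"]
item = "Eggs"

if item in shopping_list:
    # The item is already present, so the list is left unchanged
    result = item + " is already on the list"
else:
    # The item is missing, so add it to the end of the list
    shopping_list.append(item)
    result = item + " was added to the list"

print(result)
```

Change `item` to something that is not on the list, such as `"Bread"`, to see the else branch run.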


***
## Loops

Indexes are useful, but lists really shine when you start looping. Loops let you do something for each item in a list. They are kind of like if statements because they have an indented block.

They look like this:

    for item in list:
        print(item) # Do any action per item in the list

"for" and "in" are required. "list" can be any variable or literal which is like a list. "item" is the name you want to give each item of the list in the indented block as you iterate through. We call each step where item has a new value an iteration.

Let's see it in action with our list:

In [None]:
# This is what our list of awesome people looks like right now:
print awesome_people

for person in awesome_people:
    print(person)

This is basically the same as writing:

In [None]:
person = awesome_people[0]
print(person)
person = awesome_people[1]
print(person)
person = awesome_people[2]
print(person)

But that is a lot more code than:

    for person in awesome_people:
        print(person)

The difference matters even more when you consider that our list of awesome people could be very long!

You can use the built-in function "range" to create lists of numbers easily.

In [None]:
range(0, 10)

And then we can use that with a loop to print a list of squares. Note that we use special string formatting here in which `{0}` represents the first variable inside the format parentheses, and `{1}` represents the second variable.

In [None]:
for number in range(0, 10):
    print "{0} squared is {1}".format(number, number**2)
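
Loops and lists also work well together. As a small sketch, the loop below collects the squares into a list with `append` instead of printing them one by one, then totals them with `sum`. The name `squares` is just for illustration, and `print(...)` is written with parentheses so that the block should run under both Python 2 and Python 3.

```python
# Start with an empty list and add one square per iteration
squares = []
for number in range(0, 10):
    squares.append(number ** 2)

print(squares)       # the squares of 0 through 9
print(sum(squares))  # the total of all the squares
```

Keeping the results in a list means you can reuse them later, for example with `len()`, `sum()`, or another loop.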