# Python and Jupyter Introduction

One of the most popular languages for data science today is Python.  It is easy to pick up for anyone who has some previous coding experience.  It is also used in the early UTG courses with PixelPad, so it is likely familiar to many of you!

Python was named after the comedy troupe and show Monty Python's Flying Circus.  

We're using Python inside Jupyter Notebook.  Jupyter Notebook is an open-source web application that we'll use to create and share documents that contain live code, equations, visualizations, and text.

Let's get started with a review of Python on Jupyter Notebook.

In [7]:
spam_amount = 0
print(spam_amount)

# Ordering Spam, egg, Spam, Spam, bacon and Spam (4 more servings of Spam)
spam_amount = spam_amount + 4

if spam_amount > 0:
    print("But I don't want ANY spam!")

viking_song = "Spam " * spam_amount
print(viking_song)

0
But I don't want ANY spam!
Spam Spam Spam Spam 


In this code you can see a variable assigned an integer value, some arithmetic, an `if` statement, and a string repeated by multiplying it.

Some things to remember about Python's syntax:
* There are no semicolons at the end of statements, as there are in Java.
* There are no braces (`{` and `}`) around code blocks.  Indentation marks a block instead, so it is very important.  The colon at the end of the `if` line signals that an indented block follows.  The same applies to function definitions.
* We don't declare the type of a variable.  Even though no type is declared, the value a variable holds always has a type.

In [9]:
print(type(spam_amount))
print(type(19.5))


<class 'int'>
<class 'float'>
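
Because the type belongs to the value rather than to the name, the same variable can refer to values of different types over time. A minimal sketch of that idea (the name `quantity` is made up for this illustration and is not part of the notebook above):

```python
# `quantity` is a hypothetical name used only for this illustration.
quantity = 4
first_type = type(quantity)     # int

quantity = quantity / 2         # the / operator produces a float here
second_type = type(quantity)    # float

quantity = "two"                # the same name can now refer to a string
third_type = type(quantity)     # str

print(first_type, second_type, third_type)
```

Rebinding a name like this is legal, but reusing one name for unrelated types tends to make analysis code harder to follow.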


## Getting help

In data science we use a lot of libraries and built-in Python functions.  If you don't remember what something does, Python can tell you.  Let's say you can't remember what the `abs` function does.  You can use the `help` function like this:

In [15]:
help(abs)

Help on built-in function abs in module builtins:

abs(x, /)
    Return the absolute value of the argument.



Note that you pass `help` the function itself, not the result of calling it.  So this would be wrong, because `round(2.3)` is evaluated first and `help` receives the integer `2`:

```
help(round(2.3))
```
Whereas this is right:
```
help(round)
```
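
To see why the first form goes wrong, it can help to look at what each expression actually is before `help` receives it. A small sketch (the variable name `result` is just for illustration):

```python
result = round(2.3)        # calling round gives back a plain integer
print(result)              # 2
print(type(result))        # <class 'int'>

print(callable(result))    # False: the integer 2 is not a function
print(callable(round))     # True: round itself is the function object
```

So `help(round(2.3))` would describe integers in general, while `help(round)` describes the rounding function.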

Also, you should not hesitate to Google your questions or search Stack Overflow.  The Python and Pandas communities are large, and you can most likely find the answer to any of your questions there.

## Defining functions

We often write reusable functions ourselves.  These can be used to do the same thing repeatedly, and can be passed as arguments to other functions, as shown later.  When doing data analysis, we may need to write a function and pass it to another function to aggregate data in some way.

Let's create a function that finds the least difference between some numbers.


In [17]:
def least_difference(a, b, c):
    diff1 = abs(a - b)
    diff2 = abs(b - c)
    diff3 = abs(a - c)
    return min(diff1, diff2, diff3)

We can then call it like this to test it out and understand better what it does.

In [19]:
print(
    least_difference(1, 10, 100),
    least_difference(1, 10, 10),
    least_difference(5, 6, 7),
)

9 0 1


We can start to get an idea of what it does, but what if we forget?  Let's use `help`!

In [20]:
help(least_difference)

Help on function least_difference in module __main__:

least_difference(a, b, c)



Nothing to show here.  That's because we need to write the help text ourselves!  Computers aren't mind readers.

Here's how you document a function with a docstring so that `help` works.

In [21]:
def least_difference(a, b, c):
    """Return the smallest difference between any two numbers
    among a, b and c.
    
    >>> least_difference(1, 5, -5)
    4
    """
    diff1 = abs(a - b)
    diff2 = abs(b - c)
    diff3 = abs(a - c)
    return min(diff1, diff2, diff3)

If you're going to reuse a function, it's good to document it so you can get help!

In [22]:
help(least_difference)

Help on function least_difference in module __main__:

least_difference(a, b, c)
    Return the smallest difference between any two numbers
    among a, b and c.
    
    >>> least_difference(1, 5, -5)
    4
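
The `>>>` line in the docstring is more than decoration: Python's standard `doctest` module can run such examples and compare the result with the expected value written underneath. The sketch below is one possible way to do that for a single function. The function is repeated so the block is self-contained, and passing `module=False` is a choice made here so the example behaves the same in a notebook or a script.

```python
import doctest

def least_difference(a, b, c):
    """Return the smallest difference between any two numbers
    among a, b and c.

    >>> least_difference(1, 5, -5)
    4
    """
    diff1 = abs(a - b)
    diff2 = abs(b - c)
    diff3 = abs(a - c)
    return min(diff1, diff2, diff3)

# Find the examples in the docstring and run them, comparing the actual
# output with the expected output written under each `>>>` line.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(least_difference, name="least_difference",
                        module=False,
                        globs={"least_difference": least_difference}):
    runner.run(test)

examples_run = runner.tries
examples_failed = runner.failures
print(f"{examples_run} example(s) run, {examples_failed} failed")
```

If the expected value in the docstring were wrong, the runner would print a report showing what was expected and what the function actually returned.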



## Default arguments in functions

When we get help for `print` we can see that it has some optional arguments.

In [23]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



Overriding a default is easy: pass the argument by keyword.  Adding defaults to our own functions is just as straightforward:

In [24]:
print(1,2,3,sep=" < ")

1 < 2 < 3


In [27]:
def greet(who="world",punct="!"):
    print("hello ",who,punct)

greet()
greet("Bob",".")
greet(who="Everyone")

hello  world !
hello  Bob .
hello  Everyone !
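
The doubled space after `hello` and the space before the punctuation appear because `print` inserts its `sep` (a space by default) between every argument. One possible way to avoid that is sketched below with a hypothetical variant named `greeting`, which builds the text with an f-string and returns it instead of printing it:

```python
def greeting(who="world", punct="!"):
    """Build a greeting; both arguments fall back to their defaults."""
    return f"hello {who}{punct}"

print(greeting())              # hello world!
print(greeting("Bob", "."))    # hello Bob.
print(greeting(punct="?"))     # hello world?
```

Passing `punct` by keyword in the last call lets us override the second default while leaving the first one alone.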


## Functions used in Functions

One very powerful idea is passing functions as arguments to other functions.  A function that takes another function as an argument is called a higher-order function.

For example:

In [28]:
def mult_by_five(x):
    return 5 * x

def call(fn, arg):
    """Call fn on arg"""
    return fn(arg)

def squared_call(fn, arg):
    """Call fn on the result of calling fn on arg"""
    return fn(fn(arg))

print(
    call(mult_by_five, 1),
    squared_call(mult_by_five, 1), 
    sep='\n', # '\n' is the newline character - it starts a new line
)

5
25


By default, `max` returns the largest of its arguments. But if we pass in a function using the optional `key` argument, it returns the argument `x` that maximizes `key(x)` (aka the 'argmax').

In [30]:
def mod_5(x):
    """Return the remainder of x after dividing by 5"""
    return x % 5

help(max)

print(
    'Which number is biggest?',
    max(100, 51, 14),
    'Which number is the biggest modulo 5?',
    max(100, 51, 14, key=mod_5),
    sep='\n',
)

Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value
    
    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.

Which number is biggest?
100
Which number is the biggest modulo 5?
14
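
Other built-ins such as `min` and `sorted` accept a `key` function too. A small sketch, repeating `mod_5` so the block runs on its own and using the built-in `len` as a second key (the `words` list is made-up sample data):

```python
def mod_5(x):
    """Return the remainder of x after dividing by 5"""
    return x % 5

values = [100, 51, 14]
words = ["spam", "egg", "bacon"]   # hypothetical sample data

smallest_mod = min(values, key=mod_5)     # 100, because 100 % 5 == 0
ordered_mod = sorted(values, key=mod_5)   # remainders are 0, 1, 4
longest_word = max(words, key=len)        # the word with the most letters
ordered_len = sorted(words, key=len)      # shortest word first

print(smallest_mod)    # 100
print(ordered_mod)     # [100, 51, 14]
print(longest_word)    # bacon
print(ordered_len)     # ['egg', 'spam', 'bacon']
```

In each call the original items are returned; the `key` function only decides how they are compared.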


## Booleans and Conditionals

In Python, boolean values can be `True` or `False`:

In [31]:
x = True
print(x)
print(type(x))

True
<class 'bool'>


Comparison operators such as `==`, `>` and `<` work much as they do in other languages and produce boolean values.  Here's an example of a function that prints info about an argument:

In [32]:
def inspect(x):
    if x == 0:
        print(x, "is zero")
    elif x > 0:
        print(x, "is positive")
    elif x < 0:
        print(x, "is negative")
    else:
        print(x, "is unlike anything I've ever seen...")

inspect(0)
inspect(-15)

0 is zero
-15 is negative


Note the colon at the end of each `if`, `elif` and `else` line, and the indentation of the block beneath it.
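
Comparisons like the ones above can be combined with the keywords `and`, `or` and `not`, where some other languages use `&&`, `||` and `!`. A minimal sketch, using a hypothetical helper named `is_small_positive`:

```python
def is_small_positive(x, limit=10):
    """Return True when x is positive and no larger than limit."""
    return x > 0 and x <= limit

print(is_small_positive(5))                 # True
print(is_small_positive(-15))               # False
print(not is_small_positive(50))            # True
print(is_small_positive(50) or 50 > 40)     # True
```

Python also allows the chained form `0 < x <= limit`, which means the same thing as the `and` expression in the helper.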

## Lists

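Lists represent ordered sequences of values.  They play the role that arrays play in many other languages.  The examples below cover:

* Creating lists
* Accessing elements
* Nested (2D) lists
* Slicing
* Sorting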

In [43]:
numbers = [1,2,3,4,5,6,7]
numbers[0]

1

In [44]:
numbers[1:3]

[2, 3]

In [45]:
numbers[::-1]

[7, 6, 5, 4, 3, 2, 1]
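
Slices follow the pattern `start:stop:step`, where `start` is included, `stop` is excluded, and any part left out takes a default. A few more variations on the same list, as a sketch:

```python
numbers = [1, 2, 3, 4, 5, 6, 7]

first_three = numbers[:3]     # from the beginning up to, not including, index 3
last_two = numbers[-2:]       # negative indexes count from the end
every_second = numbers[::2]   # a step of 2 skips every other element
last_item = numbers[-1]       # a single negative index, not a slice

print(first_three)    # [1, 2, 3]
print(last_two)       # [6, 7]
print(every_second)   # [1, 3, 5, 7]
print(last_item)      # 7
```

A slice always produces a new list, so `numbers` itself is unchanged by any of these lines.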

In [52]:
squared = map(lambda x:x**2,numbers)
list(squared)

[1, 4, 9, 16, 25, 36, 49]
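
The `lambda` in the cell above is an anonymous, single-expression function, and `map` applies it to each element lazily, which is why `list(...)` is needed to see the values. As a sketch, here is the same computation with a named function (`square` is a name chosen for this illustration):

```python
numbers = [1, 2, 3, 4, 5, 6, 7]

def square(x):
    """Return x squared."""
    return x ** 2

squared = map(square, numbers)    # a lazy map object; nothing is computed yet
squared_list = list(squared)      # consuming the map produces the values
second_pass = list(squared)       # the map object is now exhausted

print(squared_list)   # [1, 4, 9, 16, 25, 36, 49]
print(second_pass)    # []
```

The empty second result is a common surprise: a `map` object can only be consumed once, so keep the list if you need the values again.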

In [55]:
hands = [
    ['J', 'Q', 'K'],
    ['2', '2', '2'],
    ['6', 'A', 'K'], # (Comma after the last element is optional)
]
hands[2][1]

'A'

In [57]:
planets = ['Mercury', 'Venus', 'Earth', 'Malacandra', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
planets

['Mercury',
 'Venus',
 'Earth',
 'Malacandra',
 'Jupiter',
 'Saturn',
 'Uranus',
 'Neptune']

In [60]:
planets[:3] = ['Mer','Ven','Ear']
planets

['Mer', 'Ven', 'Ear', 'Malacandra', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

In [64]:
planets[:4] = ['Mercury', 'Venus', 'Earth', 'Mars',]
sorted(planets)

['Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus']
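
Note that `sorted` returns a new list and leaves `planets` in its original order. The sketch below contrasts it with the list method `sort`, which reorders the list in place; the copy named `in_place` is introduced here only so the original list stays intact:

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars',
           'Jupiter', 'Saturn', 'Uranus', 'Neptune']

alphabetical = sorted(planets)                 # new list; planets is untouched
reverse_order = sorted(planets, reverse=True)  # descending alphabetical order

in_place = list(planets)   # hypothetical copy, so the original is kept
in_place.sort()            # sorts the copy in place and returns None

print(planets[0])          # Mercury
print(alphabetical[0])     # Earth
print(reverse_order[0])    # Venus
print(in_place[0])         # Earth
```

Because `sort` returns `None`, writing `planets = planets.sort()` would throw the list away, which is an easy mistake to make.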

In [65]:
len(planets)

8

In [66]:
sum(numbers)

28

In [67]:
max(numbers)

7

In [68]:
planets.append('Pluto')
planets

['Mercury',
 'Venus',
 'Earth',
 'Mars',
 'Jupiter',
 'Saturn',
 'Uranus',
 'Neptune',
 'Pluto']

In [70]:
planets.index('Earth')

2

In [71]:
planets.index('PlanetX')

ValueError: 'PlanetX' is not in list

In [72]:
'Pluto' in planets

True
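
Since `index` raises a `ValueError` for a missing item, the `in` check is handy for guarding a lookup. A minimal sketch, using a hypothetical helper named `find_position`:

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter',
           'Saturn', 'Uranus', 'Neptune', 'Pluto']

def find_position(items, target):
    """Return the index of target in items, or None when it is absent."""
    if target in items:
        return items.index(target)
    return None

print(find_position(planets, 'Earth'))     # 2
print(find_position(planets, 'PlanetX'))   # None
```

Returning `None` is just one design choice; catching the `ValueError` with `try`/`except` would work as well.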

In [73]:
help(planets)

Help on list object:

class list(object)
 |  list(iterable=(), /)
 |  
 |  Built-in mutable sequence.
 |  
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate sign

### Tuples

Tuples are like lists, but they are immutable: once created, their contents cannot be changed.  The syntax uses parentheses instead of square brackets.
They are often used to return multiple values from functions.

In [74]:
x = (1,2,3)
x[1] = 20

TypeError: 'tuple' object does not support item assignment

In [75]:
value = 0.125
value.as_integer_ratio()

(1, 8)
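
Returning a tuple pairs naturally with unpacking, where each element is assigned to its own name in one statement. A short sketch (the function `min_and_max` is made up for this illustration):

```python
def min_and_max(values):
    """Return the smallest and largest items as a (min, max) tuple."""
    return min(values), max(values)   # the comma is what builds the tuple

lowest, highest = min_and_max([3, 1, 4, 1, 5])   # unpack into two names
print(lowest, highest)                           # 1 5

numerator, denominator = (0.125).as_integer_ratio()
print(numerator, denominator)                    # 1 8
```

The number of names on the left must match the number of elements in the tuple, otherwise Python raises a `ValueError`.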

## Loops and List Comprehensions

## Strings and Dictionaries

## Using External Libraries