## Introduction to Python I
**Nick Kern**
<br>
**Astro 9: Python Programming in Astronomy**
<br>
**UC Berkeley**

---
1. [Introduction](#Introduction)
2. [Hello World!](#Hello-World!)
3. [Linearity of a Program](#Linearity-of-a-Program)
4. [Data Types](#Data-Types)
5. [Python Operators](#Python-Operators)
6. [Variable Assignment](#Variable-Assignment)
7. [String Formatting](#String-Formatting)

### Introduction

<img src="imgs/python-logo.png" width=500px/>

In this notebook, we will cover the very basics of the Python programming language. For this course, we will be using Python 3. 

At their core, programming languages are methods for manipulating data. Here, we will discuss the ways in which the Python language can be used to manipulate data for different purposes. Before we begin, let's think about why Python has become so widely used not only amongst scientists, but also amongst programmers in general. 

> "Python is an interpreted, object-oriented, high-level programming language with dynamic semantics."
<br>
> -- https://www.python.org/doc/essays/blurb/

Python being an "interpreted language" means there is no separate, formal step for compiling one's code. Instead, this is done automatically for us: the code is translated to lower-level code (although not machine code). The term "object-oriented" refers to the fact that Python is designed to support the construction of complex data structures that can act as "containers" for other functions and data structures, known as methods and attributes. The term "high-level" means that the language forms abstractions around machine logic to be more readable and similar to human language. The term "dynamic" means that variables can be defined "on-the-fly" or dynamically, and the Python interpreter handles the details for us automatically. 

These aspects make Python a very readable, powerful and easy-to-use language. Compared to languages that are traditionally used for scientific programming, like C, C++ and Fortran, Python is easier to read, debug and interact with, makes it easier to visualize data, and can be used to do just about anything with a shallower learning curve. Other high-level languages with similar features, like IDL and Matlab, are proprietary (and therefore expensive) and don't have the same versatility. Because Python is open source, the Python community has built an incredibly large number of modules and libraries that enable Python to be used for almost any task. Python does have its limitations, however, and the most evident one is its speed. Because Python is interpreted and is a higher-level language, it tends to have larger computational overheads, and in some cases is over 100x slower than languages like Fortran. There are ways to optimize Python to mitigate this to some degree, which we may explore later in the course. However, in many contexts, *code development* tends to be a larger time overhead than *code runtime*, making Python the more time-efficient choice in the long run. In addition, being able to program effectively in Python will not only help you write code faster and more efficiently, it is also a very attractive skill for jobs in the private sector and industry!

Hopefully we have convinced you that learning Python is worth your time, particularly if you are interested in doing research in the physical sciences. Now, let's start learning Python with a hands-on approach!

### Hello World!

One of the simplest scripts we can write is to have the Python interpreter return a statement with the `print` function. If we put the following text into a `hello.py` script and run it via `python hello.py` we should get a greeting. 

In [None]:
%%file hello.py
#!/usr/bin/env python

# this is a hello world!
print("Hello World!")

Now we can run this from the command line like:
```bash
python hello.py
```

or just run it straight from the Jupyter Notebook as

In [None]:
%run hello.py

Note that in Python 3, `print` is a function, so its arguments **need** to be enclosed in parentheses; otherwise it will raise an error. The example above highlights the two major ways we can run Python code: dynamically in a Python interpreter (see below) or by running scripts in the command line (see above).

From here on out, all of the code we are going to see needs to be run within a Python interpreter, or needs to be put into a `filename.py` script. If you are following along in the Jupyter Notebook, then you can run the code directly inline as-is, like what I am doing for lecture!

### Linearity of a Program

When we create and run a script, the script is evaluated in a linear order from top to bottom. The same is true when we run a cell in the Jupyter Notebook. When writing code we should bear this in mind. This means that the following script, for example, will fail:

In [28]:
%%file hello.py
#!/usr/bin/env python

# Hello World 2.0
# printing before assignment fails: greeting does not exist yet
print(greeting)
greeting = "What a wonderful day!"

Overwriting hello.py


In [None]:
%run hello.py

Here, the error is that the code is trying to `print()` the variable `greeting` before we have assigned it a value. We will discuss variable assignment further down, but the point is that code is evaluated top-to-bottom in a linear fashion.

### Data Types

Let's explore the different formats data can take in the Python environment. Like most languages, this includes
* integers
* floats
* complex
* booleans
* strings
* None types

For a single unit of data, we can learn its **data type** with the `type` function.

In [16]:
print("This is a ", type(1))
print("This is a ", type(1.0))
print("This is a ", type(1j))
print("This is a ", type(True))
print("This is a ", type('hi'))
print("This is a ", type(None))

This is a  <class 'int'>
This is a  <class 'float'>
This is a  <class 'complex'>
This is a  <class 'bool'>
This is a  <class 'str'>
This is a  <class 'NoneType'>


### Integer and Decimal Arithmetic

Let's experiment by doing some integer & floating point arithmetic with these data types.

In [19]:
# integer times integer yields an integer
2 * 2

4

In [20]:
# float times a float yields a float
2.0 * 2.0

4.0

In [21]:
# float divided by a float yields a float
4.0 / 2.0

2.0

In [22]:
# integer divided by an integer yields a float!
5 / 2

2.5

In [23]:
# floor division (//) of an integer by an integer yields an int!
5 // 2

2

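The behavior of the `/` operator changed from Python 2 to Python 3: in Python 2, dividing an integer by an integer with `/` yields an integer, whereas in Python 3 it yields a float. To get an integer result in Python 3, use the floor-division operator `//`.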
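As a small sketch of the difference between the two division operators (the variable names here are illustrative and not part of the lecture):

```python
# true division (/) always returns a float in Python 3
true_quotient = 7 / 2

# floor division (//) rounds down to the nearest whole number
floor_quotient = 7 // 2

# "rounds down" means toward negative infinity, not toward zero
negative_floor = -7 // 2

# if either operand is a float, floor division returns a float
float_floor = 7.0 // 2

print(true_quotient, floor_quotient, negative_floor, float_floor)
```

Note the negative case in particular: `-7 // 2` is `-4`, not `-3`.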

We can also represent decimals with scientific notation! Recall:
\begin{align}
1\rm{e}2 &= 10^{2} = 100.0\\
1\rm{e}1 &= 10^{1} = 10.0\\
1\rm{e}0 &= 10^{0} = 1.0\\
1\rm{e}\text{-}1 &= 10^{-1} = 0.1\\
1\rm{e}\text{-}2 &= 10^{-2} = 0.01\\
\end{align}

In [24]:
# scientific notation
1e-2 * 1e4

100.0

### Boolean Arithmetic

Let's explore arithmetic with booleans.

In [25]:
# Boolean arithmetic
print(True * True)
print(True + True)

1
2


In [26]:
print(False * True)
print(False + True)

0
1


In [27]:
print(False * False)
print(False + False)

0
0


Can you guess what integer form the boolean types True and False take when having arithmetic performed on them?

In [30]:
# there are also the logical "and" "or" statements as well
print(True or False)
print(True and False)

True
False


### Complex Arithmetic

Complex numbers can be specified by including a `j` next to a number (with no multiplication sign `*`). They also multiply as you would expect them to.

In [31]:
# addition
(2 + 2j) + (-4 + 4j)

(-2+6j)

In [32]:
# multiplication
(1 + 3j) * (2 + 5j)

(-13+11j)
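Beyond addition and multiplication, a complex number carries a few built-in attributes that are handy to know. The sketch below uses an arbitrary example value `z`:

```python
z = 3 + 4j

# real and imaginary parts are stored as floats
real_part = z.real
imag_part = z.imag

# abs() gives the magnitude, sqrt(real**2 + imag**2)
magnitude = abs(z)

# the complex conjugate flips the sign of the imaginary part
conjugate = z.conjugate()

# a number times its conjugate is the squared magnitude
product = z * conjugate

print(real_part, imag_part, magnitude, conjugate, product)
```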

### String Arithmetic?

Strings are sequences of characters enclosed by either apostrophes or quotation marks. Strings don't support most arithmetic, but two operators do work: `+` joins (concatenates) two strings, and `*` repeats a string an integer number of times. 

In [None]:
# '' and "" are considered strings
print("hi there!")
print('hi there!')

In [37]:
# note that \n is a special character! (as is \r and \a and \t)
print("well hello,\nhow are you?")

well hello,
how are you?


In [42]:
# string - integer arithmetic
print("hi" * 10)
print("hi" + " " + "goodbye")

hihihihihihihihihihi
hi goodbye


In [43]:
# try some other string - integer arithmetic
print("32"*"23")

TypeError: can't multiply sequence by non-int of type 'str'

### Converting Data Types

Data types can be converted from one to another, assuming it doesn't raise any errors. The easiest way to do this is with the
* `int`
* `float`
* `complex`
* `bool`
* `str`

built-in functions. We will talk more about functions later!

In [44]:
int(100.0)

100

In [45]:
float(100)

100.0

In [46]:
complex(100.0)

(100+0j)

In [54]:
bool(0)

False

In [63]:
print(str(2500.0))

2500.0


In [61]:
float("3.2")

3.2

In [62]:
int("3")

3

In [66]:
int(9.9)

9

In [69]:
round(9.8)

10
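The last two cells hint at a subtle difference: `int()` simply drops everything after the decimal point, whereas `round()` goes to the nearest whole number. A short sketch with made-up values:

```python
# int() truncates toward zero, for positive and negative numbers alike
truncated_positive = int(9.9)
truncated_negative = int(-9.9)

# round() goes to the nearest integer
rounded = round(9.8)

# an optional second argument sets the number of decimal places
rounded_two_places = round(3.14159, 2)

# exact halves are rounded to the nearest EVEN integer in Python 3
half_low = round(2.5)
half_high = round(3.5)

print(truncated_positive, truncated_negative, rounded,
      rounded_two_places, half_low, half_high)
```

The "round half to even" rule is why `round(2.5)` gives 2 while `round(3.5)` gives 4.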

### More Operators

Along with addition, subtraction, multiplication and division operators, Python also supports the arithmetic operators
* exponential `**`
* modulus `%` 

and the comparison operators
* equal to `==`
* not equal `!=`
* greater than `>`
* less than `<`
* greater or equal `>=`
* less or equal `<=`


In [70]:
# Take exponential (whitespace doesn't matter!)
2**2

4

In [72]:
# notice order of operations P-E-MD-AS
(2 ** 2) + (4 * 2)

12

In [73]:
# Modulus operator gives the remainder after division
11 % 5

1

In [74]:
# What if it is a multiple of the divisor?
15 % 5

0

In [75]:
# What do you expect here?
4 % 11

4

In [76]:
5 == 5

True

In [77]:
# comparison also works w/ strings
"hello" == "hello"

True

In [81]:
(5 > 2) and (45 < 46)

True

In [80]:
(5 > 2) or (100 < 50)

True

In [83]:
# confirms our intuition from before?
0 == False

True

When computers store information, they can only do so to a finite precision. Decimal (floating point) numbers are good to roughly 16 significant figures, meaning that, perhaps contrary to your intuition:

In [84]:
print(1.0 == 0.99999999999999999)

True
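Finite precision also shows up in ordinary arithmetic. The sketch below is one common way to handle it, using `math.isclose` from the standard library to compare floats within a tolerance instead of testing exact equality:

```python
import math

# 0.1 and 0.2 cannot be stored exactly in binary floating point
total = 0.1 + 0.2

# exact comparison fails because of a tiny rounding error
exact_match = (total == 0.3)

# comparing within a small relative tolerance succeeds
close_match = math.isclose(total, 0.3)

print(repr(total), exact_match, close_match)
```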


### Breakout

1. Try converting between different data types:
    1. float <-> int
    2. int <-> complex
    3. bool <-> float
    4. string <-> float
    5. string <-> int
<br><br>
2. Most of these conversions should have worked (and by worked, I mean not errored). One thing that shouldn't have worked is converting a string representation of a float into an integer (i.e., `int("3.5")`). How might you get around this error?
<br><br>
3. You should notice a hierarchy of data types in terms of their information content. When you convert a float into an integer, for example, you lose information about significant figures *after the decimal*. If you convert an integer to a float, for example, you can still perfectly express the integer in decimal form (no loss of information). Can you come up with a rough hierarchy of data types in terms of their information content?

### Variable Assignment

Often we want to assign data both a name and place to live in our interpreter environment, so we can perform multiple operations on it, track it, visualize it, print it, and save it. We can assign variables with the assignment operator `=`.

In [99]:
a_number =        \
10.0
print(a_number)

10.0


There is a special kind of syntax when we want to perform arithmetic on an already existing variable utilizing its current value.

In [100]:
# this is a way to add to an existing variable
a_number = 10
a_number = a_number + 1
print(a_number)

11


In [101]:
# this is the exact same thing, and is similar to the bash a++ command
a_number += 1
print(a_number)

12


In [102]:
# other arithmetic operations are also possible
a_number /= 2
print(a_number)

6.0


Recall that the `=` sign does not imply equality! It is an assignment operation, telling the variable on the left to take on the value of what's on the right. In this sense, you can think of computer code as being read *right-to-left*, meaning that the line `a_number = a_number + 1` is read: 1) take the integer `1`, 2) add it to the current value of `a_number` and 3) assign that value to the variable `a_number`.

Note that there are limitations as to what characters we can use for variable names in Python. The general rule is that you can use any sequence of alphabetical characters (although be careful, because there are built-in names we don't want to overwrite, like the `str` and `int` functions!), as well as the underscore `_` and digits, so long as the name doesn't start with a digit. Things that won't work in variable names are characters like (`.`, `#`, `%`, `{`, `:`, `@`, etc.).

In [103]:
# these all work!
int1 = 10
string2 = "hello"
boolean3 = False
_c0mpl3x = (1 + 2j)

In [104]:
# will this give us an error?
5float = 2.0

SyntaxError: invalid syntax (<ipython-input-104-e25e52f1145c>, line 2)
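If you are unsure whether a name is allowed, you can ask Python directly. This is a sketch using the string method `isidentifier()` and the standard-library `keyword` module; the candidate names are just examples:

```python
import keyword

# names made of letters, digits and underscores are fine
letters_and_digits = "int1".isidentifier()
leading_underscore = "_c0mpl3x".isidentifier()

# a leading digit or a special character is not allowed
leading_digit = "5float".isidentifier()
contains_dot = "my.var".isidentifier()

# reserved words look like valid names but cannot be assigned to
looks_valid = "for".isidentifier()
is_reserved = keyword.iskeyword("for")

print(letters_and_digits, leading_underscore, leading_digit,
      contains_dot, looks_valid, is_reserved)
```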

**Getting User Input**

We can ask for user input and assign the input to a variable with the built-in `input()` function like

In [106]:
# print someone's name
response = input("What is your name? ")
print("Your name:", response)

What is your name? 10.0
Your name: 10.0
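Notice that the "name" typed above was `10.0`. Whatever is typed, `input()` hands it back as text, so a number must be converted before doing arithmetic with it. The sketch below uses a stand-in variable `raw_text` in place of a real `input()` call so that it runs without a user present:

```python
# stand-in for what input() would return if the user typed 42
raw_text = "42"

# convert the text into numeric types
as_int = int(raw_text)
as_float = float(raw_text)

# without conversion, + joins the text instead of adding numbers
concatenated = raw_text + raw_text

# after conversion, + adds as expected
added = as_int + as_int

print(concatenated, added, as_float)
```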


In [108]:
# regardless of input, the data type is always a string type
number = input("list something: ")
print( type(number) )

list something: False
<class 'str'>


### String Formatting

One useful functionality is the ability to format variables into a string. The variables can be, but need not be, strings themselves.

In [109]:
# this is the old way of string formatting
name = "Nick"
print("Hi my name is %s! Nice to meet you!" % name)

Hi my name is Nick! Nice to meet you!


In [110]:
pizza_for_me = 2
pizza_for_you = 1
print("I'd like %s pizzas for me, and %s pizza for my friend."%(pizza_for_me, pizza_for_you))

I'd like 2 pizzas for me, and 1 pizza for my friend.


In [111]:
a_number = 2.23425
print("The number %f is a little complicated, how about %d instead?" % (a_number, a_number))

The number 2.234250 is a little complicated, how about 2 instead?


In [114]:
a_number = 2.23485
print("How about we round the number %f to three decimal places like %.3f" % (a_number, a_number))

How about we round the number 2.234850 to three decimal places like 2.235


In [117]:
# this is the new style of string formatting
print("This style of formatting the number {:f} gives us the same thing {:.3f}".format(a_number, a_number))

This style of formatting the number 2.234850 gives us the same thing 2.235


In [118]:
# how could I format this with new-style formatting?
code1 = 'python'
code2 = 'fortran'
print("It is {} that I know {}, but it is {} that I know {}.".format(True, code1, False, code2) )

It is True that I know python, but it is False that I know fortran.


The following are common placefillers for string formatting:

* s : string
* d : integer
* f : float

Note that for the new method of string formatting, the "s" placefiller is rarely needed: "{}" with no format code inserts the default string representation of whatever is passed in.
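As a sketch of how the format codes carry over to the new style, the same letters go after a colon inside the curly brackets. The values here are made up for illustration:

```python
# explicit format codes: s = string, d = integer, f = float
explicit = "{:s} has {:d} letters".format("python", 6)

# empty brackets fall back to the default representation
default = "{} {} {}".format("text", 10, 2.5)

# .2f keeps two decimal places
two_places = "{:.2f}".format(3.14159)

# e gives scientific notation
scientific = "{:.2e}".format(12345.678)

print(explicit)
print(default)
print(two_places, scientific)
```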

### Breakout:

Write a snippet of code that asks the user for a floating-point number, and prints out that number as an `int`, `float`, and `complex` data type, while informing the user which one is which. Use string formatting to do the print statements.

### Extra: Data Structures Intro

Often we want to organize and containerize data into structures (called data structures!). The built-in data structures we will study in this class are 

* lists
* tuples
* sets
* dictionaries

Here we will look at just lists and sets, and will cover the basics on how to construct and manipulate these data structures. Later, we will look at all of these structures in more detail.

### Lists

A list is constructed with square brackets and comma-separated elements.

In [None]:
my_list = [1, 2, 3, 4]
print(my_list)

In [None]:
# confirm its type
type(my_list)

You can store any kind of data in a list: they need not have the same type.

In [None]:
my_other_list = [1, 10.0, True, 'hello']
print(my_other_list)

You can access a single element of a list using index notation. Remember, **Python is zero-indexed**: the first element has index 0!

In [None]:
# get the zeroth element
print(my_other_list[0])

# get the second element
print(my_other_list[2])

You can reassign individual elements of a list after it has been created

In [None]:
my_other_list[0] = 'a new datum!'
print(my_other_list)

You can add (or concatenate) lists together with a few different methods

In [None]:
# method1 use the + operator
[1, 2, 3] + ['four', 'five', 'six']

In [2]:
# use the [].append function for individual elements
my_list = [1, 2, 3]
my_list.append( 'four' )
print(my_list)

[1, 2, 3, 'four']


In [3]:
# use the [].append function for other lists!
my_list = [1, 2, 3]
my_list.append( ['four', 'five', 'six'] )
print(my_list)

[1, 2, 3, ['four', 'five', 'six']]


In [4]:
# remove an element with the [].pop function
print(my_list)
my_list.pop(2)
print(my_list)
# this removed the 2nd element of my_list for us

[1, 2, 3, ['four', 'five', 'six']]
[1, 2, ['four', 'five', 'six']]


Instead of manually creating each element of a list, you can use the `range` function to auto-generate one. In Python 3, however, the `range` function returns an iterable (which we will cover later), so to turn it into a list you need to use the `list()` function, as such:

In [5]:
# auto-generate a big list
a_big_list = list( range(0, 100) )
print(a_big_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]


Note that the first argument of `range` is the starting point (inclusive) and the second argument of `range` is the ending point (exclusive).
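`range` also accepts a single argument (just the stopping point) or an optional third argument giving the step size. A brief sketch with illustrative variable names:

```python
# one argument: start defaults to 0
first_five = list(range(5))

# two arguments: start (inclusive) and stop (exclusive)
two_to_six = list(range(2, 7))

# three arguments: start, stop and step
evens = list(range(0, 10, 2))

# a negative step counts downward
countdown = list(range(5, 0, -1))

print(first_five, two_to_six, evens, countdown)
```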

You can also slice lists, which means to take a subset of the list elements. We will cover this in more detail when we cover data structures next time, but briefly, list slicing looks like:
```
my_list[<start>:<stop>:<step>]
```
where I specify the starting index with `<start>`, the stopping index with `<stop>`, and whether I want to skip elements with `<step>`. If you leave these blank, they default to `<beginning>:<end>:1`.

In [6]:
my_list = [1, 2, 3, 4, 5]
print(my_list[0:3])

[1, 2, 3]


where "0" is the starting point and "3" is the ending point. This means I took the elements with indices from [0, 3) where "[" means inclusive and ")" means exclusive. Try changing them and see what happens!

Note that I can also take every-other element by giving the slice one more number such as this:

In [7]:
print(my_list[::2])

[1, 3, 5]


where the "2" means to take every second element. I can change this to 3 to take every third element. Fun fact: I can also change this to "-1" to reverse the order of the list (P.S. **this also works for strings!**)

In [8]:
print(my_list[::-1])

[5, 4, 3, 2, 1]
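Since slicing works on strings too, here is a quick sketch of the same three slices applied to an arbitrary example word:

```python
word = "astronomy"

# characters with indices 0 through 4
first_five = word[0:5]

# every second character
every_other = word[::2]

# the whole string in reverse order
reversed_word = word[::-1]

print(first_five, every_other, reversed_word)
```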


### Sets

Sets are another data structure and are similar to lists in a few ways, but differ greatly in others. One way that they are similar is that they can hold data of different types. One way that they are different is that **they cannot be indexed**. Another way they are different is that they can only hold one copy of each unique value. Let's see what I mean first-hand.

In [9]:
# sets are constructed using curly brackets and comma-separated values
my_set = {1, 2, 3}
print(my_set)

{1, 2, 3}


In [10]:
# confirm its a set
type(my_set)

set

In [11]:
# another way to make a set is to use the set() function on a list
my_other_set = set( [1, 2, 3, 'four', 'five', 'six'])
print(my_other_set)

{1, 2, 3, 'five', 'four', 'six'}


In [12]:
# lets try to add data to the already existing my_set with {}.add
my_set.add(7)
print(my_set)

{1, 2, 3, 7}


In [13]:
# lets try to add it again to my_set
my_set.add(7)
print(my_set)

{1, 2, 3, 7}


In [14]:
# see if a number is in the set
2 in my_set

True
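One practical use of the uniqueness property is removing duplicates from a list. The sketch below uses a made-up list of target names; `sorted()` is a built-in function that returns the elements as an ordered list:

```python
# a list with repeated entries (example values)
targets = ["M31", "M42", "M31", "M13", "M42"]

# converting to a set keeps one copy of each value
unique_targets = set(targets)
n_unique = len(unique_targets)

# sets have no fixed order, so sort to get a predictable list back
ordered = sorted(unique_targets)

print(n_unique, ordered)
```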

One thing we can do with sets that we cannot do with lists is perform set operations on multiple sets. See the Venn Diagram and example below.

<img src='imgs/venn.jpg' width=400px>
<center> A venn diagram, showing the overlap of two sets </center>

In [15]:
# Create overlapping sets
set1 = set([1,2,3,4,5])
set2 = set([4,5,6,7,8])

In [16]:
# set union
print(set1 | set2)

{1, 2, 3, 4, 5, 6, 7, 8}


In [17]:
# set intersection
print(set1 & set2)

{4, 5}


In [18]:
# set difference
print(set1 - set2)

{1, 2, 3}


In [19]:
# set symmetric difference
print(set1 ^ set2)

{1, 2, 3, 6, 7, 8}


### Conditionals (aka. `if-elif-else` statements)

Conditionals are blocks of "True-or-False" statements constructed with comparison operators. The statements are evaluated and, if True, execute one piece of code and, if False, execute a different piece of code. The basic syntax can be seen as follows

In [20]:
# a simple if statement
my_var = 10

if my_var > 5:
    print("this is a big number!")

this is a big number!


Note that the indent of the `print` statement after `if` is required: this is a fundamental syntactical element of Python. Indents matter! In a nutshell, indents represent separated blocks of code. In this case the indent specifies which lines of code are executed if the statement is True. Let's make this a little more complex.

In [21]:
# a simple if-else statement
my_var = 2

if my_var > 5:
    print("this is a big number!")
else:
    print("this is a small number!")

this is a small number!


We can see that the `if` statement was rejected, because 2 is not larger than 5. In this case, we provided an `else` statement which is activated when the `if` statement returns False. 

Next is an example with `elif`, which stands for *else-if*.

In [22]:
temperature = float( input('What is the temperature in F? ') )

if temperature > 100:
    print("Don't go outside!")
elif temperature > 70:
    print("Wear shorts.")
else:
    print('Wear pants.')


You will notice that if we feed in a temperature of 105, we only get the `print` statement of the first condition, even though the second condition is also met. This is because of the `elif` statement. What does the `elif` statement do? It is connected to the statement before it, and is only activated if the statement before it returns `False` ***and*** its own condition is met. What do you think would happen if we changed the `elif` to another `if`?
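To experiment without typing a value each time, one option is to set the temperature directly. This sketch mirrors the cell above but stores the message in a variable (`advice`, a name introduced here) rather than printing inside each branch:

```python
# hard-coded stand-in for the user's input
temperature = 105.0

if temperature > 100:
    advice = "Don't go outside!"
elif temperature > 70:
    # skipped for 105, because the first condition already matched
    advice = "Wear shorts."
else:
    advice = "Wear pants."

print(advice)
```

Try setting `temperature` to 85 or 50 and re-running to see the other branches.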

### Loops

A loop is when we take a block of code and repeatedly run it. There are two kinds of loops, a **`for`** loop and a **`while`** loop. A `for` loop is an iteration through the elements of a list (or other iterable), whereas a `while` loop repeats an indefinite number of times until some condition is no longer met. Careful: with a `while` loop you can in principle make your computer loop forever, which is not desirable. 

The different parts of a loop are separated by indentation, so pay close attention to how indentation is used to separate blocks of code.

**`for` loop**

In [23]:
# the most basic for loop we can construct
for i in [1, 2, 3, 4, 5]:
    print(i)

1
2
3
4
5
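A `for` loop is not limited to hand-written lists. As a sketch, the loops below iterate over a string and over a `range`, collecting results into lists (all names are illustrative):

```python
# looping over a string visits one character at a time
letters = []
for character in "sky":
    letters.append(character)

# looping over a range visits each integer in turn
squares = []
for n in range(4):
    squares.append(n ** 2)

print(letters, squares)
```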


You can also nest loops together to get complex iteration schemes. The following, for example, is a method for finding all the different ways we can combine two letters together from a set of three letters.

In [24]:
# A nested FOR loop

# create an empty list
perm = []

# enter first FOR loop
for i in ['a', 'b', 'c']:
    
    # enter second FOR loop
    for j in ['a', 'b', 'c']:
        
        # append to the list
        perm.append(i+j)
        
print(perm)

['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']


**`while` loop**

A `while` loop is a loop that iterates indefinitely until some condition is no longer met. Every time the loop starts, it evaluates the condition, continuing if it is `True` and exiting if it is `False`.

In [25]:
# A while loop

# set counter equal to zero
counter = 0

# iterate until counter is >= 5
while counter < 5:
    
    print("the counter =", counter)
    
    # increase counter by 1
    counter += 1
    
print("the loop finished!")

the counter = 0
the counter = 1
the counter = 2
the counter = 3
the counter = 4
the loop finished!


### Manually Skipping and Exiting Loops

Sometimes you want to be able to skip one iteration in a loop, but continue the rest of iterations in the loop. Other times you want to exit the loop outright, if some condition is met. You can do this with the `continue` and `break` statements. Let's see some examples.

In [26]:
# skip an iteration if counter == 5
for counter in range(10):
    if counter == 5:
        continue
    
    print("counter = %s" % counter)

counter = 0
counter = 1
counter = 2
counter = 3
counter = 4
counter = 6
counter = 7
counter = 8
counter = 9


You can see that once the `if` statement is executed (when counter == 5), the rest of the loop is ignored! The loop simply restarts having moved on to the next element in `range(10)`. Note that because of the indentation, the `print` statement *is not* within the `if` statement, only the `continue` command is in the `if` statement.

We can do the same thing but use `break` to exit the loop entirely.

In [27]:
# exit loop if counter == 5
for counter in range(10):
    if counter == 5:
        break
    
    print("counter = %s" % counter)

counter = 0
counter = 1
counter = 2
counter = 3
counter = 4


In this case, the entire loop stops once it executes the `break` command. If that wasn't there, it would have happily gone off to print numbers up to 9.
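A typical use of `break` is to stop searching as soon as you have found what you are looking for. The sketch below, with an arbitrary search condition, finds the first number divisible by both 5 and 7:

```python
# placeholder in case nothing is found
first_match = None

for number in range(1, 100):
    # check divisibility with the modulus operator
    if number % 5 == 0 and number % 7 == 0:
        first_match = number
        # no need to look at the remaining numbers
        break

print("first match =", first_match)
```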

### Breakout

1. Write a code that takes as *input* someone's score out of 100 and determines a letter grade based on a 10-point scale (i.e. >= 90 is A, >= 80 is B, >= 70 is C, etc.), and then print out their letter grade.
<br><br>
2. Write a code (using a `for` loop) that calculates the sum of all odd numbers from 1 - 99. You might find the sum() function useful, as well as the fact that the expression `i % 2` equals either 0 or 1 depending on whether `i` is even or odd. 