# Tutorial 1: Introduction to Python and Jupyter

#### GEOL 455/855, Day 2 Exercise
#### Dr. Lynne Elkins

Welcome to your first Python coding tutorial! This is an introduction to writing basic code, which you will work through with classmates during class today.

This code has been written as what is called a "Jupyter notebook," which is designed to display and run code as discrete, modular, embedded cells in a web browser window. Although writing a more standard script code file is still a great approach for a lot of computational and numerical modeling, there are a few advantages to working with a Jupyter notebook:
* Usable at all skill levels
* Approachable for non-experts, with an intuitive sequential layout
* Easy to edit and test small pieces of code within a larger script
* Also approachable for experts, who may not need this format but can use it!
* Convenient user-facing format that can import and "call" other code packages and functions
* Can be run natively on a computer, or in a browser using a cloud service that hosts the Jupyter environment
* Professional look with formatted text

### Getting started

So, how does it work?

The next cell provides a useful example. It contains a snippet of code that can be run independently of anything else. It is a simple command, written in the language Python, that defines a variable by assigning it a value, and then prints that value to the screen.

To run the code, select the cell with the mouse, and then type Shift + Return and observe the results.

In [1]:
testvalue = 3.0

print(testvalue)

3.0


Depending on what you saw above when you opened this notebook file, you might discover that someone had previously run that script cell and saved the notebook, so it remained printed to the screen. Because it was a relatively simple operation, you probably just overwrote that old result with a fresh one. Often, though, it helps to start fresh. Sometimes you will even want to pause and restart the code after you have used it a while, to test all of the cells fresh.

To do so, use the menus at the top of the page and go to Kernel --> Restart and Clear Output, and then confirm by clicking Restart. Now you can rerun the cell above, and not worry about any prior saved results.

#### Exercise 1

Now try a little practice! Go back to the cell above, try to edit the cell, and rerun it.

Your goal: print a new, different numerical value to the screen.

### Data types

Now, depending on how you typed your new number, you may have unintentionally given Python some additional information above. Variables can be saved with several different types of data, including:
* Text (called a "string")
* Exact, whole numerical values (integers)
* Decimal numerical values (called "floats")

The difference between integers and floats is quite important for computation. Computers cannot store infinitely many decimal digits (because storage is finite), so decimal numbers are saved with a fixed amount of precision. On essentially all modern computers, a Python float is stored as a 64-bit binary number, which gives roughly 15 to 17 significant decimal digits. The name "float" refers to the "floating" decimal point, which can sit anywhere among those digits so that both very large and very small numbers can be stored. Because most decimal fractions (such as 0.1) cannot be written exactly in binary, the computer stores the nearest value it can represent, and the tiny rounding error shows up in the last digits. This usually works well, but in rare cases can cause problems in computations. It helps to be aware of what a program is really doing in case you need to work around a problem!
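As a small illustration of the kind of problem mentioned above, the sketch below adds two simple decimals and compares the result with the value you might expect. The variable names are made up for this example; `math.isclose` is a standard-library function that compares two floats within a small tolerance.

```python
import math

# Neither 0.1 nor 0.2 can be stored exactly, so their sum is slightly off
total = 0.1 + 0.2
print(total)            # 0.30000000000000004

# An exact comparison fails because of the tiny error in the last digits
print(total == 0.3)     # False

# Comparing within a tolerance is the usual workaround
close_enough = math.isclose(total, 0.3)
print(close_enough)     # True
```

For most calculations this tiny difference does not matter, but it is a good reason to avoid testing floats for exact equality.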

It is also the case that only certain kinds of values can be used for certain things. If you want to count the number of apples in a basket (or iterations in a repeating calculation), you want a precise integer number, like 5, not 5.000000000000001. Python will not accept a float where it expects a whole-number count, such as the number of times to repeat a loop. On the other hand, complex calculations rarely rely on exact integers, and usually require floats.

And then there are strings! Often, it helps to use simple text for saving values, communicating information, printing to the screen, making exported tables and figures, and more. A Python program will treat a string as simple characters, without reading any numerical value from it, even if it contains numbers. Sometimes it is necessary to change a numerical data type into a text string (or even vice versa), so you can work with it in different ways.
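Here is a minimal sketch of how a string that contains digits behaves differently from a number, and how to convert between the two. The variable names and values are invented for illustration.

```python
# A string that happens to contain digits is still just text
depth_text = "2.5"
doubled_text = depth_text + depth_text    # joins the characters together
print(doubled_text)                       # 2.52.5

# Converting the string to a float lets you do arithmetic with it
depth_value = float(depth_text)
doubled_value = depth_value + depth_value
print(doubled_value)                      # 5.0

# Converting a number to a string lets you combine it with other text
label = "Depth (km): " + str(depth_value)
print(label)                              # Depth (km): 2.5
```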

#### Exercise 2
Thus, it is important to be able to name, identify, and even switch between these data types! To get started, run the cells below to see what kind of data type you created above:

In [2]:
type(testvalue)

float

How can you be clear with Python about what type of variable you want a numerical value to be? Well, for whole numbers, the simplest way is just to use or omit a decimal:

In [3]:
value_int = 1
value_float = 1.

In [4]:
type(value_int)

int

In [5]:
type(value_float)

float

What about changing these? Well, it might work to simply assign a new value to the existing variable, but overwriting like that is not considered a great coding practice... and sometimes in Python that actually can create problems. One of the idiosyncrasies of this language is that variables "point to" their values (or to each other), and when redefining them there can occasionally be unexpected results.
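The "pointing" behavior is easiest to see with a list, a type of data collection introduced later in this tutorial. The sketch below is only an illustration with made-up variable names; it uses the list method `append`, which is also explained further down.

```python
original = [1, 2, 3]

# This does NOT make a copy: both names now point to the same list
alias = original
alias.append(4)
print(original)       # [1, 2, 3, 4]  (changed through the other name!)

# Building a new list from the old one gives an independent copy
independent = list(original)
independent.append(5)
print(original)       # [1, 2, 3, 4]
print(independent)    # [1, 2, 3, 4, 5]
```

Simple values like integers and floats do not surprise you this way, but the habit of writing a new variable keeps the behavior predictable for every data type.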

A more reliable, and generally cleaner coding practice is simply to write a new variable that draws information from the old one:

In [6]:
convert_int_float = float(value_int)
convert_int_float

1.0

In [7]:
type(convert_int_float)

float

Can you figure out a way to do the reverse **(change the float value into an integer),** and then **display the results to the screen?** Use the blank cells below to try to do this: write test code in the cell and run it, and see what happens.

NOTE that you can add more blank cells using the "+" button in the menus above, and remove them using the "cut" button with scissors. (If you then **save** this notebook, your work will still be here later!)

#### Tips and tricks: Markdown, comments, etc.

What about all this text? It turns out that what you are reading as embedded text on the page is actually also a cell, just not a code cell! Select this cell with your mouse, and note that the drop-down menu in the taskbar above now says "Markdown" instead of "Code." Markdown is a lightweight markup language for formatting plain text, which allows you to insert and edit text in Jupyter notebooks. (Jupyter's Markdown cells can also display equations written in LaTeX notation.) This is great for writing everything from user-facing code, to professional manuscripts, to just a nice web page with embedded code.

#### Exercise 3
To do some experimenting with Markdown, double-click this cell to access the basic Markdown text editor. Edit the text below by typing in the requested information:

**Code user name: (Type your name here)**

**Date: (Type today's date here)**

Note how those lines are in **bold** because of the double asterisks; this is one of several easy shortcuts for text editing. The number signs above create headers. Feel free to experiment a little!

When you are finished, simply run this cell (Shift+Enter while it is selected) to render the formatted text. Double-click again to edit if you are not satisfied with the results, or if you want to experiment some more.

#### Commenting

Finally, one more basic formatting tip: within a code cell (or in a larger Python script saved as a separate code file), non-code information must be indicated clearly as such, so that the program doesn't try to run your notes and comments as though they were code.

Annotating and documenting your code with useful tips, notes, status, and use information is critically important, and you should develop the habit of documenting what you are doing as you write and test your code, instead of saving it for later. This way, you and others can always figure out what each line of code is doing, whether or not it works yet, what you might want to change in the future, etc. If you *don't* document what you are doing as you go along, and then your work gets interrupted, you very likely will forget the details and won't be able to tell what's happening when you come back to it later! This is even more of a problem when working in teams, as is so often the case in coding work.

Inserting notes and documentation in your code is called "commenting." In Python, comments are indicated using the number sign/hashtag symbol '#' at the beginning of a line (or even in the middle of it!).

In [8]:
# This is an example of a comment. It does not run as code.

# This is also a great way to test your code line-by-line:
# simply "comment-out" the lines you aren't using and want to "mute" from running!

# What happens if you comment in the middle of a line?
variable1 = 1.0  #This part of the line remains as a comment, and won't affect the rest of the code.

# The following line will not run unless you delete the number sign. Try it!
#variable2 = 1.0


### Data collections

For our next exercise, let's look at another type of data that Python can save and work with: data collections. There are actually four types of data collections in Python:
* List
* Dictionary
* Set
* Tuple

For each, it helps to know 1) what kind of data it saves and how it can be used, and 2) the correct syntax (how to write and edit them based on formatting).

*Lists* can be edited and rearranged. They are also indexed: the first item can be identified based on its position or order in the list, as well as the second, third, etc.

*Dictionaries* are a bit different. They store items in pairs, where one item is the "key" and the other is the "value." Keys act as labels for looking up values, so each key can appear only once (values may repeat). The values can be strings, integers, etc. Dictionaries can be edited, and since Python 3.7 they keep their items in the order in which they were added.

*Sets* are similar to lists, but they have no set item order (which means they are not indexed), the items cannot be edited once defined (though you can add and delete items), and there cannot be duplicate entries.

*Tuples* are basically special lists that cannot be changed once they are defined. Permanence is sometimes useful.

The code tutorial homework will help you practice how to write, index, and edit some of these data collections, so we won't do a lot of that today, but some examples are below.

Two very important tips for working with data collections:
* Variables and names are always *case-sensitive* in Python
* Indexing (finding an item in a list, tuple, etc.) always starts counting from 0 (not 1)! This is true for most computer languages, not just Python. Note, however, that this primarily applies to *indexing* and not necessarily to all things you could count.
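Both tips can be seen in a short sketch. The list and its contents are invented for this example.

```python
rocks = ["basalt", "granite", "shale"]

# Indexing starts at 0, so the first item is at position 0
first_rock = rocks[0]
print(first_rock)      # basalt

# The third (last) item is at position 2, not 3
last_rock = rocks[2]
print(last_rock)       # shale

# Counting the items still gives 3: zero-based numbering applies to indexing only
count = len(rocks)
print(count)           # 3

# Names are case-sensitive: "Rocks" was never defined, only "rocks"
name_is_defined = True
try:
    Rocks
except NameError:
    name_is_defined = False
print(name_is_defined)  # False
```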

In [9]:
# Lists have square brackets:
example_list = ["apple", "banana", "cherry"]
print(example_list)

['apple', 'banana', 'cherry']


In [10]:
# Lists are not limited to strings:
float_list = [0.01,0.02,0.05,0.01]
print(float_list)

[0.01, 0.02, 0.05, 0.01]


In [11]:
# Dictionaries have curly brackets and pairs of items:
example_dict = {
    "firstname": "Lynne",
    "lastname": "Elkins",
    "office": 304,
}
print(example_dict)

{'firstname': 'Lynne', 'lastname': 'Elkins', 'office': 304}


Likewise, dictionaries can have integers, floats, or strings. This one has a combination of strings and integers.

Note that in this case, a decision was made to split up the elements of the dictionary on different lines. This helps keep everything visible on the screen together and is neat and tidy for longer collections. Doing this is always an option, but it requires correct indenting as above, and you do still need to include all of the required syntax elements to define your collection (colons, commas, curly start and end brackets for dictionaries, etc.). Normally Python will indent lines for you when you type "Return," though, so this is not too tricky!
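Once a dictionary exists, values are usually looked up by their key rather than by position. A minimal sketch, using a made-up sample record:

```python
sample = {
    "name": "KS-01",
    "rock": "basalt",
    "depth_m": 12,
}

# Look up a value using its key in square brackets
rock_type = sample["rock"]
print(rock_type)             # basalt

# Assigning to an existing key replaces its value
sample["depth_m"] = 15

# Assigning to a new key adds a new key/value pair at the end
sample["collector"] = "student"
print(sample)
# {'name': 'KS-01', 'rock': 'basalt', 'depth_m': 15, 'collector': 'student'}
```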

In [12]:
# Sets also have curly brackets, a simpler list of items with no repeats or particular order:
example_set = {"cat", "dog", "bird",5,1.2}
print(example_set)

{1.2, 5, 'bird', 'dog', 'cat'}
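
The "no repeats" rule means a set quietly drops duplicates. Here is a short sketch with invented contents; `add` and `discard` are built-in set methods.

```python
# "quartz" is typed twice, but the set keeps only one copy
minerals = {"quartz", "olivine", "quartz"}
print(len(minerals))            # 2

# Items can be added and removed, even though there is no index
minerals.add("garnet")
minerals.discard("olivine")
print("garnet" in minerals)     # True

# Sorting produces a list with a predictable order for display
print(sorted(minerals))         # ['garnet', 'quartz']
```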


In [13]:
# Tuples have non-editable elements, fixed indexed positions, and round brackets/parentheses:
example_tuple = ("one", 2.0, "three")
print(example_tuple)

('one', 2.0, 'three')
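
To see what "non-editable" means in practice, the sketch below reads an item from a tuple and then tries to change it. The tuple contents are made up; the `try`/`except` lines simply catch the error so the cell keeps running.

```python
coordinates = (39.0, -96.5)

# Reading by index works just like a list
latitude = coordinates[0]
print(latitude)        # 39.0

# Trying to change an item raises a TypeError
was_changed = True
try:
    coordinates[0] = 40.0
except TypeError as error:
    was_changed = False
    print("TypeError:", error)

print(coordinates)     # (39.0, -96.5)
```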


### (Brief) introduction to loops

Now that we have defined some basic data types in Python, let's do a little preview of functional code! In numerical calculations, it is often useful to repeat a calculation or operation several times. This is really just iteration, something computers are particularly good at doing. A common application of iterative operations is changing an independent variable and then testing it for new results in a calculation. Writing this from scratch requires breaking down your operation into a few pieces, though:
* What is the calculation?
* How many times will you do the calculation?
* What are the input values you will use each time you do the calculation?
* How will you save and use the results from each iteration?


#### Exercise 4
For our purposes of just introducing this technique and practicing it, let's keep the calculation simple. Your overall task is to calculate and save the results for the following mathematical formula:

$y = 2x + 5$

given a randomly-generated series of 5 values for the independent variable, $x$. The results will be saved to a list.

To do this, work through the instructions below, and then write your own code in the blank cells, run it, and see what you get! If your results don't make sense or have some kind of error (only reporting 4 values instead of 5, or all 5 values are the same... etc.), edit the code and try it again, until you get the expected outcomes! This troubleshooting approach is a standard way to test and update code, and it helps you learn to do it all better the next time.

First, how do you generate random values? It turns out there is a useful program package called "random." Run the cell below to import this package. This just means that from now on, within this notebook you will be able to access functions saved within that package.

In [14]:
import random

The cell below is an example of how to use the random package to generate a list of values. Study it, run it, test it, and then write your own lines of code in the blank cells below to define your variable "x" as a data collection. Based on the examples above, this syntax will generate a *list.*

In [15]:
# Generate a list of 3 different random whole numbers from 0 to 99
# (range(0, 100) stops before 100, and sample() never picks the same item twice)
randomlist = random.sample(range(0, 100), 3)
print(randomlist)

[55, 5, 61]


Next, you need to write your actual calculation function, and tell your program how many times to run it.

It turns out you can do these things at the same time, because Python is smart enough to simply go through all the items in a list. If you are curious, it is also possible to identify how many items are present in your list using the length function:

In [16]:
# How many items are in the list "randomlist"?
# Save that number to a variable called "n" in case you want to use it later:
n = len(randomlist)
print(n)

3


Performing an operation like a calculation iteratively requires setting up a *loop*. For this kind of operation, you want to run the loop once (that is, perform the calculation once) *for* every value of your variable x, which means you are using a "for" loop. Here is the syntax for a very simple "for" loop:

In [17]:
# A simple 'for' loop:
for i in randomlist:
    j = i + 1
j

62

Now, this worked, but note how it only printed the final value. That's because each time the loop ran through the calculation, it actually overwrote the variable 'j' and saved the latest result. Once the whole loop was finished, only the final round was actually saved, and you have lost the data from the previous iterations.
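One way to watch the overwriting happen is to print inside the loop. This sketch uses a fixed list with made-up values so the results are the same every time.

```python
values = [3, 7, 11]

for i in values:
    j = i + 1
    print(j)           # indented, so it runs on every pass: 4, then 8, then 12

# Not indented, so it runs once after the loop has finished
print("final j:", j)   # final j: 12
```

Printing shows every result on the screen, but `j` itself still holds only the last one.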

To fix this, you need to do a little bit of additional work telling the program to save the result somewhere else that won't be overwritten. It's like keeping a log of results.

There are several ways to save values iteratively, but one thing that does *not* work is manually naming each variable, because you might want to loop this calculation 10, 20, or 1000 times. You might even want to run it as many times as needed, based on some other piece of information, so you may not actually know how many times to do it.

In fact, the best code is flexible and not fixed: it can consult another piece of information (say, the "length" of another variable) and decide from that information how many times to run the loop, and then save all the outputs for you. You therefore need an automated method of saving values instead of assigning each one to its own variable.
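As a rough sketch of that flexibility, the loop below asks the list how long it is instead of using a hard-coded number. The list `depths` and the counter `number_of_passes` are invented for this example.

```python
depths = [10.5, 20.0, 35.2, 50.1]

number_of_passes = 0
# len(depths) is 4, so range(len(depths)) gives the positions 0, 1, 2, 3
for k in range(len(depths)):
    print(k, depths[k])
    number_of_passes = number_of_passes + 1

print(number_of_passes)   # 4
```

If you later add or remove values in `depths`, the loop adjusts on its own without any other edits.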

This is one of the things data collections are meant for! But how do you add on a new value to a list, without overwriting the old ones? It turns out there is a handy function called "append" that does exactly this:

In [18]:
# Example of a for loop with appended values saved to a list:
numbers=[10,20,30,40]

for i in range(5,11):
    numbers.append(10*i)

print(numbers)

[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]


**NOTE 1**: This example relies on a piece of syntax that deserves a closer look, namely the "dot operator" (i.e., the period). You already saw it once in `random.sample` above, and it is common to many programming languages, not just Python. Python is a language that uses objects; in fact, almost *everything* is an object, with its own attributes and methods: every constant, variable (like "numbers" above), or function is, in fact, an object and you can define and then use all kinds of things for them. The connection between any object and its attributes or methods is indicated by a dot (".") written between them.

An attribute is a *property* of the object that you can retrieve (to, say, print or use in a calculation). You can define and use attributes using the syntax: `<object_name>.<attribute_name>`.

Similarly, a method is a *function* that the object provides, and is set up using similar syntax (usually with parentheses so you can add specific commands and instructions for how to run the function; if you aren't going to define any of these, the parentheses are just left blank).

So for example, you could define a "class" of functions and variables that is overall called "dog," which contains individual variable objects like "Fido." The dog class might include some methods you have written, like eats(), runs(), sleeps(). You could then write Fido.eats(), Fido.runs(), Fido.sleeps(), and something would happen: some calculations or printing or other commands will run, using the Fido variable and whatever methods were previously defined. You can also define attributes for Fido, such as Fido.size = "tall", or Fido.hair_color = "brown", and these attributes will then be saved to the object. It takes a little practice to get used to the notation, but it works well!
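The dog example can be written out as a short, runnable sketch. Everything here is hypothetical and only meant to illustrate the dot notation: the class is named `Dog` (class names are capitalized by convention), and its methods simply return a line of text.

```python
class Dog:
    """Hypothetical class used only to illustrate dot notation."""

    def __init__(self, name):
        # Save the name as an attribute of the new object
        self.name = name

    def eats(self):
        return self.name + " is eating."

    def sleeps(self):
        return self.name + " is sleeping."


# Create one object of the Dog class
Fido = Dog("Fido")

# Call methods with a dot and parentheses
print(Fido.eats())        # Fido is eating.
print(Fido.sleeps())      # Fido is sleeping.

# Define and retrieve attributes with a dot and no parentheses
Fido.size = "tall"
Fido.hair_color = "brown"
print(Fido.hair_color)    # brown
```

You will not need to write your own classes in this tutorial; the point is only to recognize what the dot means when you see it.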

With this in mind, what do you think the line `numbers.append(10*i)` is actually doing? Look carefully at the code. `numbers` is a variable that you defined as a list *before* you started the loop; it starts out with just the values [10,20,30,40]. But for each pass through the loop, the `append` function is adding an element to the end of `numbers` that is equal to a value ($i$) times 10.

**NOTE 2:** The example above is actually showing a slightly different application of for loops that may or may not be what you want to do! This is often the case when looking up examples of how to write code: they were written for some other purpose, but there is still something useful in how it is set up if you can figure it out! You can always choose elements of the syntax that are useful, and try some other approaches to see what works.

In this case, there are two differences between this example and the prior one:
1) Instead of iterating over all the values in an existing list, the loop variable '$i$' takes its values from a sequence generated by the 'range' function. You may or may not want to use this yourself, but it's good to know the option exists! This simply tells the script to run the defined operation (that is, to append a value equivalent to $10i$ to the list) for the values i = 5, 6, 7, 8, 9, 10 (the end value, 11, is not included).

2) You may not actually need an existing list like the one called "numbers" for your own solution, because you may not have any initial or preexisting values for 'y' to start with. But in order to append your results for 'y' to a list, you *do* need to first create an empty list where your code can save values (that is, give the code a destination for saving items):

In [19]:
# Example of how ranges work:
for i in range(5, 11):
    print(i)

5
6
7
8
9
10


In [20]:
# How to create a blank list, where you can add values later by appending them:
example_list_2 = []
print(example_list_2)

[]


One final tip: the syntax used in the 'for' loop above is important! If you forget, say, the colon ':', you will get an error. Sometimes the error messages are clear, but other times they refer to functions from other code packages that you cannot see and may not be familiar with, so you may need to do some hunting to figure out what you did wrong. Trial and error helps a lot!

You now have some examples and options for how to do your calculation. See if you can make it work, and good luck!!

### Next steps

Either at the end of class or whenever you finish this exercise, be sure to **save a copy of the notebook** and **upload the saved file to today's tutorial assignment for participation credit!** This is a group exercise and will be graded purely for completing the work. Note that you do not need to take this tutorial home to finish outside of class, so if you don't finish just save it and upload what you have done; but there may be some future tutorials that we ask you to finish on your own before uploading for credit.

Did you finish early, so you have some remaining time in class after finishing this guided tutorial? Get a head start on the assigned Python tutorials due next week! Those tutorials will review a lot of what we did today, but in further detail. Please note that the assignment is individual work, not group work, but feel free to check in with a classmate if you have questions or are uncertain about something as you get started and try to figure things out!