# Intro to Python: The Basics

A Reproducible Research Workshop

(A Collaboration between Dartmouth Library and Research Computing)

[_Click here to view or register for our current list of workshops_](http://dartgo.org/RRADworkshops)

_Created by_:

- Version 1.0: Jeremy Mikecz, Research Data Services (Dartmouth Library)
- Version 2.0: revised, rewritten, and updated by Mansa Krishna (Earth Science) - (\*much thanks Mansa!)
- Some of the inspiration for the code and information in this notebook was taken from https://www.w3schools.com/python/python_intro.asp -- This is a great resource if you want to learn more about Python!

We will be learning how to code with Python by using Jupyter Notebooks hosted on Dartmouth College's Jupyter Hub. Confused?

- **Python** is the programming language we will be using. It is a general-purpose, high-level programming language, meaning it can be used for many different tasks and is designed to be readable by humans. You can run Python in many ways: from a command terminal, from a script called in the command terminal, from an _Integrated Development Environment_ (IDE) like PyCharm or Spyder, or from interactive notebooks like Jupyter Notebooks. Speaking of which...
- **Jupyter Notebooks** allow you to mix code, instructions and explanations meant for humans, plots/graphs, and the output of your code all in the same document. Thus, Jupyter Notebooks are ideal for teaching and collaboration. However, running Jupyter Notebooks on your own computer requires installing Python and all the Python libraries/packages your code needs. Fortunately, there are no-install options for running Python in cloud environments:
- Dartmouth's **Jupyter Hub** provides a cloud-based environment where we can run Python code, thus saving us the hassle of installing a full Python environment on our own computers. \*Much thanks to Simon Stone and the folks at Research Computing for setting this up for us.

You can access Dartmouth's Jupyter Hub (and, if you are reading this you probably already have) by going to jhub.dartmouth.edu. This course's notebooks are found in RR-workshops/infrastructure/intro-to-python.


## I. Saving your Notebook

1. Before you make any changes to this notebook, please save your own instance of this notebook by selecting File --> Save As and then save it by adding your name at the end of your filename like this:

```
"path/to/folder/Basics_standalone_your-name.ipynb"
```


## II. Working with Jupyter Notebooks

Jupyter Notebooks contain two types of cells:

- **Text / Markdown Cells**, where you add information intended for your fellow humans. This can be instruction, explanation, or just background information about the code you are running or data you are analyzing.
- **Code cells**, where you write your code to be read and run by your computer.


### IIa. Text Cells

You can format the style of text cells in two main ways:

- through markdown, such as **(double-click on this text cell to see the full markdown)**:

# This is a first-level header

## Header 2

### Header 3

**Bold text is wrapped by two asterisks**, and _italics by one_.

`Inline code is found between grave accents / back quotation marks (backticks)`

For more on using Markdown with Jupyter see the [Markdown/Jupyter Cheatsheet](https://www.ibm.com/docs/en/watson-studio-local/1.2.3?topic=notebooks-markdown-jupyter-cheatsheet) {_<-- and this is a link by the way, view this cell in edit mode - by clicking on it - to see how to insert links_}

- You can also format text cells using traditional html, such as:

<h2 style="color:purple; font-size: 100; background-color:pink">Alt Header</h2>

<p style = "color:gray; font-size: 10px">paragraph</p>

1. Create a text cell below. You can do so by pressing the "+ Markdown" button at the top of this notebook or by using the keyboard shortcut ESC + B. _For more keyboard shortcuts, see **????**._ Then practice typing different styles of text, including headers, bold text, italics, hyperlinks, and, if you feel comfortable, perhaps even some HTML.


### IIb. Code Cells

2. Run the code cell below by pressing the play button to the left side of the cell or by placing your cursor in the cell and pressing CTRL + ENTER. Feel free to modify the text inside the print command and examine the results.


In [None]:
print("Hello World!")

### IIc. Python Comments

3. Comments start with `#` and Python will ignore them in a code cell. Comments make it easier for us to understand what our own code is doing! Insert your own comment below.


In [None]:
# This is a comment!
print("Hello World!")

## III. Operations

You can perform some basic calculations directly in Python, without assigning any variables or other information to memory!

4. Run the following code cells (click them and press Ctrl+Enter) and see what the different operators do (+, -, \*, \*\*, /, //, %). _Experiment with replacing some of the values used in these operations and examine the results._


In [None]:
# Multiplication between two numbers
7 * 3

In [None]:
# Division between two numbers
7 / 3

In [None]:
# Integer division between two numbers
7 // 3

In [None]:
# Modulo operator: finds the remainder of the division between two numbers
7 % 3
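
Integer division and modulo fit together: multiplying the quotient by the divisor and adding the remainder gives back the original number. The cell below is a small sketch of that relationship using the same values as above. It also uses the built-in `divmod()` function, which is not covered elsewhere in this notebook, to return both results at once.

```python
# Quotient and remainder of 7 divided by 3
print(7 // 3)                      # 2
print(7 % 3)                       # 1

# quotient * divisor + remainder recovers the original number
print((7 // 3) * 3 + (7 % 3))      # 7

# divmod() returns the quotient and the remainder together
print(divmod(7, 3))                # (2, 1)

# A remainder of 0 after dividing by 2 means the number is even
print(10 % 2)                      # 0
print(11 % 2)                      # 1
```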

In [None]:
# Addition between two numbers
7 + 3

In [None]:
# Subtraction between two numbers
7 - 3

In [None]:
# raise the first number to the power of the second number
7**3

In [None]:
# These operators don't just work for numbers; we can add/join two strings together, like so:
# String syntax - text between quotation marks
"Good" + " Morning!"

In [None]:
# Even though these are 7 and 3, they are in quotes so
# the interpreter will consider them as strings!
# The addition of these strings will simply join them together!
"7" + "3"

In [None]:
# In programming, you often need to know whether a statement
# or expression is True or False.
# When comparing two values, Python will return a 'Boolean' answer
7 > 3

In [None]:
7 < 3

To check if two items are equal you will need to use 2 equal signs ("==") to distinguish from assignments ("=").

For example:

```
x = 7
```

assigns the value 7 to the variable x, whereas

`x == 7`

checks to see if x equals 7.


In [None]:
3 == 3

In [None]:
# Notice the order of operations!
# Note that '>=' stands for greater than or equal to
# but '>' is strictly greater than.
3 * 8 >= 24

In [None]:
3 * 8 <= 24

In [None]:
# We can also use the 'and', 'or' and 'not' operators!
(3 * 8) == 23 and (4 * 6) == 23

In [None]:
(3 * 8) == 23 or (4 * 6) == 23

In [None]:
"eo" not in "hello"

For more information on Python operators: https://www.w3schools.com/python/python_operators.asp
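
The cells above show `and`, `or`, and `not in`. As a small supplementary sketch, the cell below uses `not` on its own and contrasts `in` with `not in`. The values are arbitrary examples.

```python
# 'not' flips a Boolean value
print(not True)                   # False
print(not (7 < 3))                # True

# 'not' can be combined with 'and' / 'or'
print((7 > 3) and not (7 < 3))    # True

# 'in' checks whether one string appears inside another
print("ell" in "hello")           # True
print("eo" in "hello")            # False ('e' and 'o' are not adjacent)
print("eo" not in "hello")        # True
```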


## IV. Variables

5. More commonly, you will want to store values (whether numbers, text, or more complex data structures) in variables. The general syntax for doing so is:

```
[variable_name] = [value]
```

Simply put, variables are created when you assign values to them!

Some general rules for variable names in Python:

- variable names should begin with letters (not numbers or other special characters) or - in special cases - underscores ("\_")
- do not include whitespace in variable names, instead use underscores (e.g. "variable_name")
- do not use [Python's **reserved words**](https://realpython.com/lessons/reserved-keywords/) as variable names (e.g. "True", "False", "class", etc.)
- variable names are case sensitive, so the variable "Age" is distinct from "age"
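
The rules above can be tried out with a short sketch. The variable names below are made up for illustration. The `keyword` module from Python's standard library is one way to check whether a word is reserved (importing modules is covered later in this notebook).

```python
import keyword

# Names are case sensitive: these are two separate variables
score = 10
Score = 20
print(score, Score)                       # 10 20

# Underscores stand in for whitespace in longer names
course_title = "Intro to Python"
print(course_title)

# keyword.iskeyword() reports whether a word is reserved by Python
print(keyword.iskeyword("class"))         # True
print(keyword.iskeyword("course_title"))  # False
```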


In [None]:
# create a variable called 'num' and assign it a value of 99
num = 99

6. Notice, above, that when we assign a variable, nothing is output. Python simply stores the value 99 under the variable name "num". To output the value(s) of "num" we can use the **print()** function.


In [None]:
print(num)

7. We can then perform mathematical calculations using variables (assuming variables are numbers).


In [None]:
# let's multiply num by 3!
print(num * 3)

In [None]:
# Another example!
first_name = "Jeremy"
age = 43

In [None]:
print(first_name, "is", age, "years old.")

8. Variables must be created before they are used.


In [None]:
# print(last_name)   #to get this to work, we first must define the variable "last_name"

Type in your own last name below, run the cell, and then try re-running the cell above. Has the error message disappeared?


In [None]:
last_name = "Mikecz"  # replace w/ your own last name

## V. Data Types

It is important to know the type of data stored in a variable. Variables store different types of data, and each data type can do different things! Python has several default or built-in data types! The following are just a few examples. You can find more information here: https://www.w3schools.com/python/python_datatypes.asp

- Text type : `str` or strings
- Numeric type : `int` or integer, `float` or decimal
- Sequence type : `list`, `tuple`, `range`
- Mapping type : `dict` or dictionary
- Set type : `set`
- Boolean type : `bool`
- Binary type : `bytes`
- None type : `NoneType`

9. We can use the **type()** function to identify the data type of a variable. For example:


In [None]:
x = 23.4
type(x)

10. Note: when you store a numerical value within quotes, Python recognizes it as a string ("str") rather than a number (either an "int" or a "float")


In [None]:
y = "23.4"
type(y)
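
As a rough illustration, **type()** can also be applied to one literal of each data type listed at the top of this section. The specific values are arbitrary examples.

```python
# type() applied to one example value of each kind
print(type("hello"))      # <class 'str'>
print(type(7))            # <class 'int'>
print(type(7.0))          # <class 'float'>
print(type([1, 2, 3]))    # <class 'list'>
print(type((1, 2, 3)))    # <class 'tuple'>
print(type(range(3)))     # <class 'range'>
print(type({"a": 1}))     # <class 'dict'>
print(type({1, 2, 3}))    # <class 'set'>
print(type(True))         # <class 'bool'>
print(type(b"hi"))        # <class 'bytes'>
print(type(None))         # <class 'NoneType'>
```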

11. One of the following two cells will produce an error. Can you guess which one? Then run both and see if you were correct.


In [None]:
x + 3

In [None]:
# try running the code below:
# y + 3

12. What about the following two cells. Which do you expect to produce an error?


In [None]:
# "The temperature is " + x + " degrees today."

In [None]:
"The temperature is " + y + " degrees today."

13. So, you may have noticed we can add ("+") numbers (floats or integers) to other numbers. We can also add strings to other strings ("+" concatenates strings). Thus, we can avoid the TypeError above by converting x into a string so that we are concatenating strings to strings.


In [None]:
x = str(x)
"The temperature is " + x + " degrees today."

14. In the above code, applying the **str()** function to x (which was assigned the float 23.4) converts x to a string ("23.4"). Functions for converting data from one type to another include:

- **str()** converts a value into a string
- **int()** converts a numerical value into an integer
- **float()** converts a numerical value into a float (a number with decimal points)

Note: you cannot convert alphabetical characters to integers and floats, but you can convert ints and floats to strings.


In [None]:
print(float(28))
print(int(24.63))
print(str(x))
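
The sketch below illustrates the note above about what can and cannot be converted. It uses `try` / `except`, which this notebook does not otherwise cover, only so that the failed conversion prints a message instead of stopping the cell. The input values are arbitrary examples.

```python
# A string containing only digits converts cleanly
print(int("42"))               # 42

# int() drops the decimal part; it does not round
print(int(24.63))              # 24
print(int(-24.63))             # -24

# A string of letters cannot be converted: Python raises a ValueError
try:
    int("abc")
except ValueError as err:
    print("Conversion failed:", err)

# A decimal string must go through float() before int()
print(int(float("23.4")))      # 23
```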

<div class="alert alert-block alert-info">

**Objects in Python:**

Before we dive deeper into more complex elements of Python, we should briefly mention the important concept of _objects_ in programming.

An _object_ in this context is some sort of structure that has one or more specific _properties_, which define its _state_. An object also often defines special functions that can interact with these properties called _methods_.

Which properties a particular _object_ has is defined by its _class_, which you can think of as a blueprint for the object. Every object of the same class will have the same methods and properties, but the properties might have different values.

Every class defines an initialization method that can be used to create a new object of this class. This is called _instantiation_ of an object: The object is an instance of this class.

Almost everything in Python is (also) an object! For example, when we type the following code:

```{python}
x = 3
```

We are instantiating an object of the class `int`! This is not immediately obvious, because we are using a kind of shorthand by typing in the literal value of this integer. The result is the same kind of object we would get by calling the class explicitly:

```{python}
x = int(3)
```

Here, we are calling the initialization function of the `int` class, passing the value `3` to it, and receiving back an object of the class (a.k.a. type) `int`!

This may seem awfully abstract for now! We will encounter many examples of objects and their methods in this workshop series that will hopefully illustrate the usefulness of these concepts.

</div>
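
To make the idea of objects and methods a little more concrete, here is a minimal sketch. The variable names `n` and `word` are chosen only for this illustration. It uses the built-in `isinstance()` function along with a few methods that strings and integers provide.

```python
n = 3
word = "Dartmouth"

# Every object has a class (type)
print(type(n))               # <class 'int'>
print(type(word))            # <class 'str'>

# isinstance() checks whether an object is an instance of a class
print(isinstance(n, int))    # True

# Methods are functions attached to an object, called with dot notation
print(word.upper())          # DARTMOUTH
print(word.lower())          # dartmouth
print(n.bit_length())        # 2 (3 is 11 in binary, which takes two bits)
```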


## VI. Built-In Functions

15. The basic syntax to use a built-in or default function is as follows:

```
function_name(input, arguments)
```


In [None]:
# The print function is an example of a built in function
print("Hello", "World", sep=";")

In [None]:
# The round function is another example of a built in function
round(58.29831, 2)
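
A few more built-in functions follow the same calling pattern. The cell below is a short sketch with arbitrary example values. Note that `round()` returns a whole number when the second argument is left out.

```python
print(len("Hello World!"))   # 12 (number of characters, including the space)
print(max(3, 9, 4))          # 9
print(min(3, 9, 4))          # 3
print(abs(-7.5))             # 7.5
print(round(58.29831))       # 58
```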

For more information on built-in functions: https://www.w3schools.com/python/python_ref_functions.asp


## VII. Conditionals

Python supports the usual logical conditions from mathematics. Suppose we have variables/values `x` and `y`. The following are some examples of logical statements.

- Equal: `x == y`
- Not equal: `x != y`
- Strictly less than: `x < y`
- Strictly greater than: `x > y`
- Less than or equal: `x <= y`
- Greater than or equal: `x >= y`

These conditions may be used in several ways; a common use is in '`if` statements'. See the following example.


In [None]:
x = 20  # assigning variable x a particular value

In [None]:
if x > 15:
    print("This number is greater than 15")
else:
    print("This number is less than or equal to fifteen")

In [None]:
if x > 15:
    print("This number is greater than 15")
elif x >= 10:  # elif == else if
    print("This number is between 10 and 15")
else:
    print("This number is less than 10")

## VIII. Let's try combining what we've learned so far


In [None]:
print("Enter your name:")
name = input()


print("Hello, " + name)

In [None]:
print("What is your age?")
age = input()  # this takes in a string type!
if int(age) < 21:
    print(
        f"Sorry we cannot serve someone who is {age} years old."
    )  # adding the f before a string creates a formatted string, which allows us to add variables between {}
else:
    print(f"Hey, you look young for {age}. What would you like?")
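
The age example above stops with a `ValueError` if the user types something that is not a whole number. One possible way to guard against that is sketched below. The variable `age_text` is a stand-in for the value returned by **input()**, and the string method **.isdigit()** is used as a rough check that covers ordinary input such as "25" or "twenty". It is not an exhaustive validation.

```python
age_text = "twenty"   # stand-in for the string returned by input()

if age_text.isdigit():
    # Safe to convert: the text contains only digits
    age = int(age_text)
    message = f"You entered {age}."
else:
    message = f"'{age_text}' is not a whole number; please enter digits only."

print(message)
```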

## IX. A Note on Working with Files and Libraries

### IXa. Libraries, Modules, and Functions

With Python alone, a programmer can perform some basic operations using simple functions like **print()**, **len()**, **max()**, **min()**, **sorted()** as well as some methods applied directly to particular data types (like **.lower()** and **.upper()** for strings).

However, to do more advanced or specialized things we need to install and import Python **Libraries** also known as **packages**.

A **Python library** is a collection of files (known as **modules**) that each contain **functions** to complete a set of related tasks.

_Confused?_

_This can get confusing as some large libraries have multiple sub-packages each with many different modules. In other cases a library consists of a single module._ **_The important thing to know is that you need to import each library or module you want to use._**

Some commonly used modules are found in [Python's Standard Library](https://docs.python.org/3/library/).

### IXb. Working with Libraries

For more on functions, we can refer to a more detailed lesson provided by [**Constellate's** Python 4 lesson](https://lab.constellate.org/perfusion-stearns-eliot/notebooks/tdm-notebooks-2023-04-03T23%3A17%3A07.601Z/python-basics-4.ipynb):

```
**Functions**

You can identify a function by the fact that it ends with a set of parentheses () where arguments can be passed into the function. Depending on the function (and your goals for using it), a function may accept no arguments, a single argument, or many arguments. For example, when we use the print() function, a string (or a variable containing a string) is passed as an argument.

Functions are a convenient shorthand, like a mini-program, that makes our code more modular. We don't need to know all the details of how the print() function works in order to use it. Functions are sometimes called "black boxes", in that we can put an argument into the box and a return value comes out. We don't need to know the inner details of the "black box" to use it. (Of course, as you advance your programming skills, you may become curious about how certain functions work. And if you work with sensitive data, you may need to peer in the black box to ensure the security and accuracy of the output.)

**Libraries & Modules**

While Python comes with many functions, there are thousands more that others have written. Adding them all to Python would create mass confusion, since many people could use the same name for functions that do different things. The solution then is that functions are stored in modules that can be imported for use. A module is a Python file (extension ".py") that contains the definitions for the functions written in Python. These modules (individual Python files) can then be collected into even larger groups called packages and libraries. Depending on how many functions you need for the program you are writing, you may import a single module, a package of modules, or a whole library.
```

First, we will import the [**math module**](https://docs.python.org/3/library/math.html).

The syntax for importing a module is:

```
import module_name
```


In [None]:
import math

By importing the math module, for example, we have access to a wide range of mathematical functions (see the [math module documentation here](https://docs.python.org/3/library/math.html)).

For example, we can calculate the square root of a large number (by using the **math** module's **sqrt** function) or identify the value of pi (as well as more advanced mathematical operations).


In [None]:
print(math.pi)
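
The square root calculation mentioned above can be sketched as follows, along with two other functions from the **math** module. The input values are arbitrary examples.

```python
import math

print(math.sqrt(1522756))   # 1234.0 (sqrt always returns a float)
print(math.floor(7.9))      # 7 (round down)
print(math.ceil(7.1))       # 8 (round up)
print(round(math.pi, 2))    # 3.14
```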

Use the **help()** function to learn more about the math module.


In [None]:
help(math)

## X. Working with Files

An essential skill in Python is to be able to navigate through files on your computer to either read in existing files into Python or to output new files.

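To enable navigating through files on your computer, we will use the **pathlib** library. (The **os** library offers similar tools; its import is left commented out below because we do not need it here.)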

Let's import it now.


In [None]:
# import os
from pathlib import Path

In [None]:
# Examine what the following calls do. Hint: cwd() means "current working directory."
print(Path.cwd())
print(Path.cwd().parent)
print(Path.cwd().parent.parent)

In [None]:
# Open dataset of texts
sotudir = Path("~/shared/RR-workshop-data/state-of-the-union-dataset/txt").expanduser()
print(sotudir)

# We can then print out all files ending in the ".txt" extension using:
pathlist = sorted(sotudir.glob("*.txt"))
print(pathlist)

For each path in the pathlist, we can extract only the name of the file (rather than the whole path) using the .name attribute. For more on pathlib functions and methods see [pathlib documentation](https://docs.python.org/3/library/pathlib.html).
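
A minimal sketch of extracting file names is shown below. The two paths are hypothetical stand-ins for entries of `pathlist`, so the cell runs even without access to the shared dataset; the real file names may differ. It also shows the related `.stem` and `.suffix` attributes.

```python
from pathlib import Path

# Hypothetical paths standing in for entries of pathlist
examples = [Path("txt/speech_one.txt"), Path("txt/speech_two.txt")]

first = examples[0]
print(first.name)      # speech_one.txt (file name with extension)
print(first.stem)      # speech_one (file name without extension)
print(first.suffix)    # .txt (extension only)

# Collect the name of every path in the list
names = [p.name for p in examples]
print(names)           # ['speech_one.txt', 'speech_two.txt']
```

With the real dataset, replacing `examples` with `pathlist` should give the list of State of the Union file names.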
