## Running code cells

This is a *markdown* cell that allows you to provide explanations for your code. Read more about Markdown [here](https://markdownguide.org).

In [1]:
2 + 2

4

In [2]:
# This is a comment
# Code cells can contain multiple lines of code
2 * 2

4

## Variables and Data Types

One of the most basic things we can do in Python is assign values to **variables**:

In [3]:
text = "Botany 2021"  # An example of a string
number = 42  # An example of an integer
pi_value = 3.1415  # An example of a float

Here we’ve assigned data to the variables *text*, *number* and *pi_value*, using the assignment operator =. To see the value of a variable, we can type its name on the last line of a code cell and run the cell:

In [4]:
text

'Botany 2021'

Everything in Python has a type. To get the type of something, we can pass it to the built-in **function** type:

In [5]:
type(text)

str

In [6]:
type(number)

int

In [7]:
type(pi_value)

float
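Values can often be converted from one type to another with the built-in functions *int*, *float* and *str*. The short sketch below illustrates this; the variable names (*year_text*, *year*, *half*, *label*) are made up for this example and are not used elsewhere in the notebook.

```python
# Hypothetical variable names, chosen only for this illustration
year_text = "2021"              # a string that happens to contain digits
year = int(year_text)           # convert the string to an integer
half = float(year) / 2          # convert to a float, then divide
label = "Botany " + str(year)   # convert back to a string to join text

print(type(year), type(half), type(label))
print(label)
```

Converting with *str* is needed in the last step because Python will not join a string and an integer with + directly.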

We can also use comparison operators (<, >, ==, !=, <=, >=) and logical operators (and, or, not). The result of a comparison has the data type **boolean**, which is either True or False.

In [8]:
3 > 4

False

In [9]:
result = pi_value > number
type(result)

bool
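The logical operators combine booleans into a single result. A minimal sketch, reusing the values of *pi_value* and *number* from above (the names *in_range*, *either* and *not_equal* are made up for this example):

```python
pi_value = 3.1415
number = 42

# Each comparison yields a boolean; and/or/not combine booleans
in_range = pi_value > 3 and pi_value < 4   # both sides must be True
either = number < 0 or number == 42        # at least one side must be True
not_equal = not (number == pi_value)       # not flips True and False

print(in_range, either, not_equal)
```

All three results are True here: *pi_value* lies between 3 and 4, *number* equals 42, and the two values are not equal.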

### Collections: Lists and Dictionaries

A **list** is a common data structure to hold an ordered sequence of elements. Each element can be accessed by an index. Note that Python indexes start with 0 instead of 1:

In [10]:
numbers = [1, 2, 3]
numbers[0]

1

To add elements to the end of a list, we can use the append **method**. Methods are a way to interact with an object (a list, for example). We can invoke a method using the dot . followed by the method name and a list of arguments in parentheses. Let’s look at an example using append:

In [11]:
numbers.append(4)
print(numbers)

[1, 2, 3, 4]
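A few other common list operations, sketched here on the same four-element list (the names *count*, *last* and *first_two* are illustrative only):

```python
numbers = [1, 2, 3, 4]

count = len(numbers)       # number of elements in the list
last = numbers[-1]         # negative indexes count from the end
first_two = numbers[0:2]   # a slice: start at index 0, stop before index 2

print(count, last, first_two)
```

Because indexes start at 0, the last valid index is always one less than the length of the list, which is why *numbers[-1]* is a convenient way to reach the final element.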


A **dictionary** is a container that holds pairs of objects - **keys** and **values**.

In [12]:
prunus_common = {'cerasus': 'sour cherry', 'armeniaca': 'apricot'}
prunus_common['armeniaca']

'apricot'

Dictionaries work a lot like lists - except that you index them with keys. You can think about a key as a name or unique identifier for the value it corresponds to.

To add an item to the dictionary we assign a value to a new key:

In [13]:
prunus_common['dulcis'] = 'almond'
print(prunus_common)

{'cerasus': 'sour cherry', 'armeniaca': 'apricot', 'dulcis': 'almond'}
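Asking a dictionary for a key it does not contain raises a KeyError. One way to avoid that, sketched below, is to test for the key with *in* or to use the *get* method with a fallback value. The key 'persica' and the fallback 'unknown' are just example values.

```python
prunus_common = {'cerasus': 'sour cherry', 'armeniaca': 'apricot', 'dulcis': 'almond'}

has_almond = 'dulcis' in prunus_common    # True: the key exists
has_peach = 'persica' in prunus_common    # False: this key has not been added
# get returns the fallback instead of raising a KeyError
peach_name = prunus_common.get('persica', 'unknown')

print(has_almond, has_peach, peach_name)
```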


## Loops

A **for loop** can be used to access the elements in a list or other Python data structure one at a time.

Indentation is very important in Python. Note that the second line in the example below is indented. Colab should indent it for you.

In [14]:
for num in numbers:
    print(num)

1
2
3
4


In [15]:
for letter in text:
    print(letter)

B
o
t
a
n
y
 
2
0
2
1


Looping over a dictionary is a bit more involved, since each entry has a key and a value.

Here is how to loop through all of the keys.

In [16]:
for species in prunus_common.keys():
    print(species)

cerasus
armeniaca
dulcis


You can also loop through key/value pairs:

In [17]:
for species, common in prunus_common.items():
    print('The common name for Prunus', species, 'is', common)

The common name for Prunus cerasus is sour cherry
The common name for Prunus armeniaca is apricot
The common name for Prunus dulcis is almond
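Loops are often used to build up a new list from existing data. The sketch below is one way to do that; the name *scientific_names* is made up for this example. It also uses the built-in *enumerate* function, which pairs each element with its position.

```python
prunus_common = {'cerasus': 'sour cherry', 'armeniaca': 'apricot', 'dulcis': 'almond'}

scientific_names = []            # start with an empty list
for species in prunus_common:    # looping over a dict directly yields its keys
    scientific_names.append('Prunus ' + species)

# enumerate yields (position, element) pairs, counting from 0
for position, name in enumerate(scientific_names):
    print(position, name)
```

Looping over the dictionary itself gives the same keys as looping over *prunus_common.keys()*.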


## Functions

We've seen a few built-in functions so far (like *type()* or *print()*), but we can create our own functions.

Defining a section of code as a function in Python is done using the *def* keyword. For example, a function that takes two arguments and returns their sum can be defined as:

In [18]:
def add_function(a, b):
    result = a + b
    return result

z = add_function(20, 22)
print(z)

42
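Functions can also give an argument a default value and carry a short description called a docstring. The function *full_name* below is a hypothetical example written for this illustration, not part of any library.

```python
def full_name(species, genus='Prunus'):
    """Return the two-part scientific name for a species epithet."""
    # genus falls back to 'Prunus' when the caller does not supply it
    return genus + ' ' + species

print(full_name('cerasus'))                 # uses the default genus
print(full_name('alba', genus='Quercus'))   # overrides the default
```

Default values keep the common case short while still letting a caller override them by name.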


## Python packages

A Python package (or library) is a collection of custom functions and data types for use by other programs.

You can import a Python package with the *import* keyword.

In [19]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
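
The *import* keyword can bring in a whole module or, combined with *from*, a single name. A small sketch using two modules that ship with Python, *math* and *statistics* (the variable names are illustrative):

```python
import math                    # import the whole module
from statistics import mean    # import a single function from a module

radius = 2
area = math.pi * radius ** 2   # names inside the module are reached with a dot
average = mean([1, 2, 3, 4])   # imported by name, so no prefix is needed

print(area, average)
```

Here *average* is 2.5. Importing the whole module keeps it clear where a name such as *pi* comes from.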


A very commonly used Python package for working with data is called *pandas*. Let's import pandas to create a custom dataframe.

In [20]:
import pandas as pd

Let's make a list of dictionaries to feed into pandas.

In [21]:
records = [{'voucher_number':'123', 'collector':'Mike', 'scientific_name':'Prunus cerasus'},
           {'voucher_number':'124', 'collector':'Richie', 'scientific_name':'Prunus armeniaca'}]
df = pd.DataFrame(records)

In [22]:
type(df)

pandas.core.frame.DataFrame

In [23]:
df

   voucher_number  collector   scientific_name
0             123       Mike    Prunus cerasus
1             124     Richie  Prunus armeniaca


In [24]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 3 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   voucher_number   2 non-null      object
 1   collector        2 non-null      object
 2   scientific_name  2 non-null      object
dtypes: object(3)
memory usage: 176.0+ bytes


In [25]:
df.to_csv('specimen_records.csv')
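
By default, *to_csv* also writes the row index as an unnamed first column, which appears as *Unnamed: 0* when the file is read back with *read_csv*. Passing *index=False* leaves it out. The sketch below assumes an in-memory buffer in place of a file on disk so that nothing is written; *buffer* and *df_again* are illustrative names.

```python
import io
import pandas as pd

records = [{'voucher_number': '123', 'collector': 'Mike', 'scientific_name': 'Prunus cerasus'},
           {'voucher_number': '124', 'collector': 'Richie', 'scientific_name': 'Prunus armeniaca'}]
df = pd.DataFrame(records)

buffer = io.StringIO()           # in-memory stand-in for a file (an assumption)
df.to_csv(buffer, index=False)   # index=False leaves out the row-index column
buffer.seek(0)                   # rewind before reading the text back

df_again = pd.read_csv(buffer)
print(list(df_again.columns))
```

Replacing *buffer* with a file name such as 'specimen_records.csv' in both calls would do the same with a real file.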