# Get Started with Python
Earth Analytics Bootcamp Demo Notebook Week 2

## Variables
Python uses dynamic typing (often mentioned together with "duck typing"), which makes it easy to get started because you don't have to specify the type of every variable. Sometimes you will still want to know what type Python decided on. For that, you can use the `type()` function.

*Define a variable for the conversion factor between inches and millimeters*

*Define a variable for the number of months in a year*

*Define a variable for the 3-letter abbreviation for the month of November*
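
One possible set of answers to the three prompts above, as a minimal sketch. The variable names `mm_per_inch`, `months_per_year`, and `november_abbrev` are illustrative choices, not names the notebook prescribes. The `type()` calls show which type Python inferred for each value.

```python
# Conversion factor: one inch is exactly 25.4 millimeters
mm_per_inch = 25.4

# Number of months in a year
months_per_year = 12

# 3-letter abbreviation for the month of November
november_abbrev = 'Nov'

# type() reports the type Python chose for each value
print(type(mm_per_inch))      # <class 'float'>
print(type(months_per_year))  # <class 'int'>
print(type(november_abbrev))  # <class 'str'>
```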

## Operators

*Convert 1 mm of rain to inches*

*Define a variable, precip, as 2.36 inches, and convert **the same variable** to mm*

*Define two variables, `april_precip_mm` (43.2 mm) and `may_precip_mm` (15.6 mm). Check if April precipitation was more than May precipitation.*
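
The three prompts above could be approached roughly as follows. This is a sketch, assuming a conversion factor stored under the hypothetical name `mm_per_inch`; the names `one_mm_in_inches` and `april_was_wetter` are also only illustrative.

```python
mm_per_inch = 25.4  # assumed name for the inches-to-millimeters factor

# Convert 1 mm of rain to inches by dividing by the factor
one_mm_in_inches = 1 / mm_per_inch

# Reassign the same variable: precip starts in inches and ends in mm
precip = 2.36
precip = precip * mm_per_inch  # could also be written: precip *= mm_per_inch

# Comparison operators return a boolean (True or False)
april_precip_mm = 43.2
may_precip_mm = 15.6
april_was_wetter = april_precip_mm > may_precip_mm

print(one_mm_in_inches, precip, april_was_wetter)
```

Reusing `precip` for both units works, but it also means the variable name no longer tells you which unit it holds, which is one reason the later examples put the unit in the name.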

#### Some less intuitive operators
There are a couple of operators in Python that don't do what most people expect.

*Compute the area of a 500 m raster pixel*

In [1]:
# Try the caret operator


In [2]:
# Now the correct operator
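
A short sketch of the pixel-area exercise, assuming a square pixel with 500 m sides. The names `pixel_size_m`, `xor_result`, and `pixel_area_m2` are made up for illustration. In Python the caret is the bitwise XOR operator, so it does not square a number; exponentiation uses a double asterisk.

```python
pixel_size_m = 500  # side length of a square raster pixel, in meters

# The caret is bitwise XOR in Python, not exponentiation
xor_result = pixel_size_m ^ 2
print(xor_result)     # 502

# The double asterisk raises a number to a power
pixel_area_m2 = pixel_size_m ** 2
print(pixel_area_m2)  # 250000
```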


*Check if April and May precipitation are equal*

In [3]:
# We already know what the equal sign does...

In [4]:
# The correct operator:
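
As a sketch of the difference between the two operators: a single equal sign assigns a value, while a double equal sign compares two values. The name `precip_is_equal` is only illustrative.

```python
april_precip_mm = 43.2
may_precip_mm = 15.6

# A single equal sign assigns, so the line below (left commented out)
# would overwrite April's value with May's instead of comparing them:
# april_precip_mm = may_precip_mm

# A double equal sign compares and returns a boolean
precip_is_equal = april_precip_mm == may_precip_mm
print(precip_is_equal)  # False
```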

#### Underneath the hood - Equality

*Define two variables with values 2.0 and 2. Test if they are equal*

*Check if they are the same type*

*Now test if they are the same*

*Finally, convert the float to an integer and test if they are the same*
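
One way to work through the four prompts above. This sketch assumes that "the same" refers to object identity, which Python tests with the `is` operator; the variable names are hypothetical.

```python
float_two = 2.0
int_two = 2

# == compares values, so a float and an int can be equal
print(float_two == int_two)              # True

# The two values have different types
print(type(float_two) == type(int_two))  # False

# `is` compares identity: are these the very same object?
print(float_two is int_two)              # False

# After converting, both the value and the type match
converted_two = int(float_two)
print(converted_two == int_two)              # True
print(type(converted_two) == type(int_two))  # True

# Whether two equal integers are the same object is an implementation
# detail (CPython caches small integers), so do not rely on this result
print(converted_two is int_two)
```

In practice, `==` is the right choice for comparing numbers; `is` is mostly used for checks such as `value is None`.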

## Lists
*Create a list of the following (invented) monthly maximum Fahrenheit temperature values:*
  1. 3.4
  2. 10.5
  3. 20.1
  4. 45.3
  5. 57.8
  6. 75.8
  7. 85.3
  8. 90.2
  9. 73.2
  10. 52.9
  11. 35.2
  12. 25.1

In [5]:
monthly_max_temp_lst = [
    3.4,  10.5, 20.1, 45.3, 57.8, 75.8, 
    85.3, 90.2, 73.2, 52.9, 35.2, 25.1]
monthly_max_temp_lst, type(monthly_max_temp_lst)

([3.4, 10.5, 20.1, 45.3, 57.8, 75.8, 85.3, 90.2, 73.2, 52.9, 35.2, 25.1], list)

What about this method of storing temperatures might be considered inconvenient or error-prone?

### Slicing and Indexing lists
Python indices start at zero!

*Pull out the March temperature, 20.1 deg F, from the list*

In [6]:
monthly_max_temp_lst[2]

20.1

*Pull out the last element of the `monthly_max_temp_lst` list*

In [7]:
monthly_max_temp_lst[11]

25.1

*Pull out the second-to-last element of the list. Use the Python shortcut for getting elements counting from the end!*

In [8]:
monthly_max_temp_lst[-2]

35.2

*Get a list of the temperatures for the spring months (March, April, May)*

In [9]:
monthly_max_temp_lst[2:5]

[20.1, 45.3, 57.8]

### Changing and combining lists

*Make a list of month names for each season (we'll define seasons as DJF, MAM, JJA, and SON)*

In [10]:
winter_month_names = ['December', 'January', 'February']
spring_month_names = ['March', 'April', 'May']
summer_month_names = ['June', 'July', 'August']
fall_month_names = ['September', 'October', 'November']
winter_month_names, spring_month_names, summer_month_names, fall_month_names

(['December', 'January', 'February'],
 ['March', 'April', 'May'],
 ['June', 'July', 'August'],
 ['September', 'October', 'November'])

*Combine the lists to make a single list of all months in the year*

In [13]:
month_names = (
    winter_month_names 
    + spring_month_names 
    + summer_month_names 
    + fall_month_names)
month_names

['December',
 'January',
 'February',
 'March',
 'April',
 'May',
 'June',
 'July',
 'August',
 'September',
 'October',
 'November']

*We'd like the list to start with January. Move December to the end*

In [15]:
month_names = month_names[1:] + [month_names[0]]
month_names

['February',
 'March',
 'April',
 'May',
 'June',
 'July',
 'August',
 'September',
 'October',
 'November',
 'December',
 'January']

What happens if you run the cell above multiple times?

*Modify the code above so that it only runs if the list does not already start with January*

In [39]:
if month_names[0] != 'January':
    month_names = month_names[1:] + [month_names[0]]
    
month_names

['January',
 'February',
 'March',
 'April',
 'May',
 'June',
 'July',
 'August',
 'September',
 'October',
 'November',
 'December']

*Define a variable `month_num` (4). Use it with your list and the `.format()` method of strings to print out the phrase "April is month 4". Try other months to make sure it works.*

In [38]:
month_num = 4
# Look up the name in the list (list indices start at 0, month numbers at 1)
"{} is month {}".format(month_names[month_num - 1], month_num)

'April is month 4'

*Print out the following phrase about each month: "January begins with Jan"*

In [48]:
for month in month_names:
    print('{} begins with {}'.format(month, month[:3]))

January begins with Jan
February begins with Feb
March begins with Mar
April begins with Apr
May begins with May
June begins with Jun
July begins with Jul
August begins with Aug
September begins with Sep
October begins with Oct
November begins with Nov
December begins with Dec


*Use the `enumerate()` function to print the following phrase about each month: "January is month 1"*

In [54]:
for i, month_name in enumerate(month_names):
    print("{} is month {}".format(month_name, i+1))

January is month 1
February is month 2
March is month 3
April is month 4
May is month 5
June is month 6
July is month 7
August is month 8
September is month 9
October is month 10
November is month 11
December is month 12


*Using the `zip()` function, print out the following phrase for each month: "In January the maximum temperature was 3.4 degrees F".*

In [56]:
for month_name, temperature_f in zip(month_names, monthly_max_temp_lst):
    print((
        "In {} the maximum temperature was {} degrees F"
        .format(month_name, temperature_f)))

In January the maximum temperature was 3.4 degrees F
In February the maximum temperature was 10.5 degrees F
In March the maximum temperature was 20.1 degrees F
In April the maximum temperature was 45.3 degrees F
In May the maximum temperature was 57.8 degrees F
In June the maximum temperature was 75.8 degrees F
In July the maximum temperature was 85.3 degrees F
In August the maximum temperature was 90.2 degrees F
In September the maximum temperature was 73.2 degrees F
In October the maximum temperature was 52.9 degrees F
In November the maximum temperature was 35.2 degrees F
In December the maximum temperature was 25.1 degrees F


## Dictionaries
Sometimes we'd like to refer to objects by name in Python, instead of by their order alone. Dictionaries let us do that.

A note: historically, dictionaries did not have a guaranteed order. However, since Python 3.7 the language guarantees that dictionary keys remain in the order they were added (in CPython 3.6 this was a side effect of a more memory-efficient dictionary implementation). There is also an object called an `OrderedDict` in the `collections` module that gives you more explicit control over the order, such as moving a key to the end.
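
The ordering behavior can be checked with a small sketch like the one below. The dictionary `precip_dict` and its values are invented purely for illustration and are not part of the notebook's data.

```python
from collections import OrderedDict

# Plain dicts remember insertion order (guaranteed since Python 3.7)
precip_dict = {}
precip_dict['March'] = 1.0     # invented values, for illustration only
precip_dict['January'] = 2.0
precip_dict['February'] = 3.0
print(list(precip_dict))  # ['March', 'January', 'February']

# OrderedDict adds explicit reordering tools, e.g. move_to_end()
ordered = OrderedDict(precip_dict)
ordered.move_to_end('March')
print(list(ordered))      # ['January', 'February', 'March']
```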

*Create a dictionary where the keys are the month names and the values are the Fahrenheit maximum monthly temperatures.*

In [64]:
max_temp_dict = {}
for month_name, temperature_f in zip(month_names, monthly_max_temp_lst):
    max_temp_dict[month_name] = temperature_f
max_temp_dict

{'January': 3.4,
 'February': 10.5,
 'March': 20.1,
 'April': 45.3,
 'May': 57.8,
 'June': 75.8,
 'July': 85.3,
 'August': 90.2,
 'September': 73.2,
 'October': 52.9,
 'November': 35.2,
 'December': 25.1}

*What was the maximum temperature in March?*

In [60]:
max_temp_dict['March']

20.1

*Suppose we noticed an error in the October temperature - it should actually be 52.0. Correct the error*

In [65]:
max_temp_dict['October'] = 52.0
max_temp_dict

{'January': 3.4,
 'February': 10.5,
 'March': 20.1,
 'April': 45.3,
 'May': 57.8,
 'June': 75.8,
 'July': 85.3,
 'August': 90.2,
 'September': 73.2,
 'October': 52.0,
 'November': 35.2,
 'December': 25.1}

*Try removing the value for June and adding it back again. What happens to the order?*

In [66]:
del max_temp_dict['June']
max_temp_dict['June'] = 75.8
max_temp_dict

{'January': 3.4,
 'February': 10.5,
 'March': 20.1,
 'April': 45.3,
 'May': 57.8,
 'July': 85.3,
 'August': 90.2,
 'September': 73.2,
 'October': 52.0,
 'November': 35.2,
 'December': 25.1,
 'June': 75.8}

*That was a lot of typing to insert the dictionary! More typing means more errors. Try to create the dictionary one month at a time using a for loop instead.*

*What was the maximum temperature in month 8?*

In [67]:
max_temp_dict[month_names[7]]

90.2

*Print out "January begins with Jan" for each month using the dictionary*

In [72]:
for month_name in max_temp_dict:
    print("{} begins with {}".format(month_name, month_name[:3]))

January begins with Jan
February begins with Feb
March begins with Mar
April begins with Apr
May begins with May
July begins with Jul
August begins with Aug
September begins with Sep
October begins with Oct
November begins with Nov
December begins with Dec
June begins with Jun


*Print out the following phrase for each month: "In January the maximum temperature was 3.4 degrees F"*

In [73]:
for month_name, temperature_f in max_temp_dict.items():
    print((
        "In {} the maximum temperature was {} degrees F"
        .format(month_name, temperature_f)))

In January the maximum temperature was 3.4 degrees F
In February the maximum temperature was 10.5 degrees F
In March the maximum temperature was 20.1 degrees F
In April the maximum temperature was 45.3 degrees F
In May the maximum temperature was 57.8 degrees F
In July the maximum temperature was 85.3 degrees F
In August the maximum temperature was 90.2 degrees F
In September the maximum temperature was 73.2 degrees F
In October the maximum temperature was 52.0 degrees F
In November the maximum temperature was 35.2 degrees F
In December the maximum temperature was 25.1 degrees F
In June the maximum temperature was 75.8 degrees F


*Print out the following phrase for each month using the `.items()` method of dictionaries: "In January the maximum temperature was 3.4 degrees F"*

## Pandas DataFrames
Right now, our temperature list doesn't have any labels, which makes it hard to work with. The `pandas` library offers a solution for this problem. DataFrames are a lot like databases or spreadsheets. If you use dataframes or tibbles in R, there are a lot of similarities there as well (although beware that Python does not support the fancy quoting/quosure syntax that R does).

*Make a DataFrame with your two lists, `month_names` and `monthly_max_temp_lst`, as columns*

In [84]:
import pandas as pd
temp_df = pd.DataFrame({
    'month_name': month_names,
    'max_temp_f': monthly_max_temp_lst
})
temp_df

    month_name  max_temp_f
0      January         3.4
1     February        10.5
2        March        20.1
3        April        45.3
4          May        57.8
5         June        75.8
6         July        85.3
7       August        90.2
8    September        73.2
9      October        52.9
10    November        35.2
11    December        25.1


*Take a look at the month column*

In [85]:
temp_df.month_name

0       January
1      February
2         March
3         April
4           May
5          June
6          July
7        August
8     September
9       October
10     November
11     December
Name: month_name, dtype: object

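*Take a look at row 5 using the `.loc` attribute of DataFrames, which accesses rows by label (for example, `temp_df.loc[5]`). Then try to look up a row by month name. What happens?*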

In [86]:
temp_df.loc['September']

KeyError: 'September'

*We'd like to be able to access temperature values by month name. The easiest way to do this in pandas is to set an Index, or a set of row names, using the `.set_index()` method of DataFrames. What was the maximum temperature in October?*

In [80]:
temp_df.set_index('month_name', inplace=True)
temp_df

            max_temp_f
month_name
January            3.4
February          10.5
March             20.1
April             45.3
May               57.8
June              75.8
July              85.3
August            90.2
September         73.2
October           52.9
November          35.2
December          25.1


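*Take a look at your DataFrame again. Is this what you would expect? Without the `inplace=True` argument, `.set_index()` returns a new DataFrame and leaves `temp_df` unchanged, so set the Index using `inplace=True` (or assign the result back to `temp_df`) so that it persists.*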

In [81]:
temp_df.index

Index(['January', 'February', 'March', 'April', 'May', 'June', 'July',
       'August', 'September', 'October', 'November', 'December'],
      dtype='object', name='month_name')

In [82]:
temp_df.loc['September']

max_temp_f    73.2
Name: September, dtype: float64

*Write a **function** to convert a temperature from Fahrenheit to Celsius (subtract 32 then multiply by 5/9). Test it on the value 57.8 degrees F (roughly 14.3 degrees C)*
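
A minimal sketch of such a function, following the formula in the prompt. The function name `fahrenheit_to_celsius` and the variable `may_max_temp_c` are illustrative choices.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9

# Test on the May maximum temperature from the list
may_max_temp_c = fahrenheit_to_celsius(57.8)
print(round(may_max_temp_c, 1))  # 14.3
```

Checking the function against known reference points, such as 32 degrees F (freezing) and 212 degrees F (boiling), is a quick way to catch a mistake in the formula.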

*Convert all the Fahrenheit values in your `max_temp_f` column using the `.apply()` method of pandas Series.*
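
The conversion might look roughly like this. To keep the sketch self-contained, it rebuilds a three-month stand-in for `temp_df` and redefines a conversion function under the hypothetical name `fahrenheit_to_celsius`; the new column name `max_temp_c` is also an assumption.

```python
import pandas as pd

def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9

# A small stand-in for the notebook's temp_df (first three months only)
temp_df = pd.DataFrame({
    'month_name': ['January', 'February', 'March'],
    'max_temp_f': [3.4, 10.5, 20.1],
}).set_index('month_name')

# .apply() calls the function once for each element of the Series
temp_df['max_temp_c'] = temp_df['max_temp_f'].apply(fahrenheit_to_celsius)
print(temp_df.round(1))
```

Note that the function is passed by name, without parentheses, so that pandas can call it on each value.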

*Plot the temperatures using the `.plot.bar()` method of DataFrames*