# Activity 1 - Python Basics & NDBC Weather Data
**2020 Data Labs REU**

*Written by Sage Lichtenwalner, Rutgers University, June 9, 2020*

Welcome to Python!  In this notebook, we will demonstrate how you can quickly get started programming in Python, using Google's cool [Colaboratory](https://colab.research.google.com) platform. Colab is basically a free service that can run Python/Jupyter notebooks in the cloud.

In this notebook, we will demonstrate some of the basics of programming in Python.  If you want to learn more, there are lots of other resources and training sessions out there, including the official [Python Tutorial](https://docs.python.org/3/tutorial/).  But as an oceanographer, you don't really need to know all the ins-and-outs of programming (though it helps), especially when just starting out.

Over the next few sessions we will cover many of the basic recipes you need to:
* Quickly load some data
* Make some quick plots, and make them look good
* Calculate a few basic statistics and averages
* And save the data to a new file you can use elsewhere.

## Getting Started

Jupyter notebooks have two kinds of cells: "Markdown" cells, like this one, which contain formatted text, and "Code" cells, which contain the code you will run.

To execute the code in a cell, you can either:
* click the **Play** icon on the left
* type **Cmd (or Ctrl) + Enter** to run the cell in place 
* or type **Shift + Enter** to run the cell and move the focus to the next cell.

You can try all these options on our first very elaborate piece of code in the next cell.

After you execute the cell, the result will automatically display underneath the cell.

In [0]:
2+2

In [0]:
print("Hello, world!")

In [0]:
# This is a comment

As we go through the notebooks, you can add your own comments or text blocks to save your notes.

In [0]:
# Your Turn: Create your own print() command here with your name
print()

**A note about print()**

* By default, a Colab/Jupyter notebook will print out the output from the last line, so you don't have to specify the `print()` command.  
* However, if we want to output the results from additional lines (as we do below), we need to use `print()` on each line.
* Sometimes, you can suppress the output from the last line by adding a semi-colon `;` at the end.

In [0]:
3
4
5

In [0]:
print(3)
print(4)
print(5)

## Some Basics
Let's review a few basic features of programming.  

First, it's great for math.  You can use addition (+), subtraction (-), multiplication (\*), division (/) and exponents (\*\*).

In [0]:
# Your Turn: Try some math here
5*2

The order of operations is also important.

In [0]:
print(5 * 2 + 3)
print(5 * (2+3))
print((5 * 2) + 3)
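
The same rules extend to the other operators. As a brief illustration (the numbers are arbitrary), exponents are evaluated before multiplication and division, which in turn come before addition and subtraction. The floor division (`//`) and remainder (`%`) operators are not covered above, but they are standard Python and handy to know.

```python
# Exponents first, then multiplication, then addition
result = 2 + 3 * 4 ** 2
print(result)            # 2 + (3 * 16) = 50

# Parentheses override the default order
grouped = (2 + 3) * 4 ** 2
print(grouped)           # 5 * 16 = 80

# In Python 3, / always returns a decimal (float) value
print(7 / 2)             # 3.5

# // is floor division and % gives the remainder
print(7 // 2)            # 3
print(7 % 2)             # 1
```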

## Variables

In [0]:
# We can easily assign variables, just like in other languages
x = 4
y = 2.5

In [0]:
# And we can use them in our formulas
print(x + y)
print(x/y)

In [0]:
# What kind of objects are these?
print(type(x))
print(type(y))
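
The types matter when you combine values. The short sketch below reuses the same `x` and `y` values to show what happens when an integer and a float are mixed, and how `int()` and `float()` convert between the two.

```python
x = 4
y = 2.5

# Mixing an integer and a float gives a float
total = x + y
print(total, type(total))        # 6.5 <class 'float'>

# Division with / gives a float even when both values are integers
quotient = x / 2
print(quotient, type(quotient))  # 2.0 <class 'float'>

# int() drops the decimal part (it truncates rather than rounds)
print(int(y))                    # 2

# float() turns an integer into a decimal value
print(float(x))                  # 4.0
```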

## Strings

In [0]:
# A string needs to be in quotes (single or double)
z = 'Python is great'
z

In [0]:
# You can't concatenate (add) strings and integers, so this line raises a TypeError
print( z + x )

In [0]:
# But you can multiply them!
print( z * x )

In [0]:
# If you convert an integer into a string, you can then concatenate them
print( z + ' ' + str(x) + ' you!' )

In [0]:
# A better way: insert the value into the string with %-style formatting
print( 'Python is great %s you!' % x )
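
Another option, available in Python 3.6 and later, is the formatted string literal, or "f-string". The sketch below builds the same message; the `temperature` value is made up for illustration.

```python
z = 'Python is great'
x = 4

# Put an f before the opening quote, then place variables inside curly braces
message = f'{z} {x} you!'
print(message)   # Python is great 4 you!

# A format specifier after a colon controls how a number is displayed
temperature = 21.456   # made-up value for illustration
label = f'Temperature: {temperature:.1f} C'
print(label)     # Temperature: 21.5 C
```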

## Fun with Lists
Remember, Python uses 0-based indexes, so to grab the first element in a list you actually use "0".  The last element is n-1, or just "-1" for short. In Matlab this would be 1 to n, or 1:end.

In [0]:
my_list = [3, 4, 5, 9, 12, 13]

In [0]:
# The first item
my_list[0]

In [0]:
# The last item
my_list[-1]

In [0]:
# Extract a subset (the end index, 5, is not included)
my_list[2:5]

In [0]:
# A subset from the end
my_list[-3:]

In [0]:
# Update a value
my_list[3] = 99
my_list

In [0]:
# Warning, Python variables are object references and not copies by default
my_second_list = my_list
print( my_second_list )

my_second_list[0] = 66

print( my_second_list )
print( my_list ) # The first list changed too, because both names refer to the same list

In [0]:
# To avoid this, create a copy of the list, which keeps the original intact
my_list = [3, 4, 5, 9, 12]

my_second_list = list(my_list) # You can also use copy.copy() or my_list[:]

my_second_list[0] = 66

print( my_second_list )
print( my_list )
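
If you are ever unsure whether two names refer to the same list, the `is` operator can tell you. This is a small sketch; the variable names `alias` and `copy_of_list` are chosen only for illustration.

```python
my_list = [3, 4, 5, 9, 12]

alias = my_list               # a second name for the same list
copy_of_list = list(my_list)  # a new, separate list with the same values

# "is" checks whether two names refer to the very same object
print(alias is my_list)         # True
print(copy_of_list is my_list)  # False

# "==" only checks whether the values are equal
print(copy_of_list == my_list)  # True
```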

## Arrays
Note, a list is not an array by default.  But we can turn it into an array using the *NumPy* library.

NumPy is an essential library for working with scientific data. It provides an array object that is very similar to Matlab's array functionality, allowing you to perform mathematical calculations or run linear algebra routines.

In [0]:
# Multiplying a list by an integer repeats the list; it does not multiply each element
my_list * x

In [0]:
import numpy as np

In [0]:
# Multiplying a NumPy array by a number multiplies each element
a = np.array(my_list)
a * x
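
Arrays support more than multiplication. Here is a brief sketch of a few other common operations, using the same values as the list above.

```python
import numpy as np

a = np.array([3, 4, 5, 9, 12])

# Arithmetic is applied to every element at once
doubled = a * 2
print(doubled)     # [ 6  8 10 18 24]

# Arrays come with built-in summary statistics
print(a.mean())    # 6.6
print(a.max())     # 12

# A condition inside the brackets keeps only the matching elements
large = a[a > 4]
print(large)       # [ 5  9 12]
```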

Note, we won't be explicitly creating NumPy arrays much in this course.  But later on, when we load datasets using Pandas or Xarray, the actual arrays under the hood will be NumPy arrays.

## Dictionaries
These are a great way to store structured data of different types.  You'll often find metadata inside dictionaries.

In [0]:
my_dict = {'temperature': 21, 'salinity':35, 'sensor':'CTD 23'}
my_dict

In [0]:
# Grab a list of dictionary keys
my_dict.keys()

In [0]:
# Accessing a key/value pair
my_dict['sensor']
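
Dictionaries can also be updated and looped over. In this short sketch, the `depth` and `units` keys are made up for illustration.

```python
my_dict = {'temperature': 21, 'salinity': 35, 'sensor': 'CTD 23'}

# Add a new key/value pair ('depth' is a made-up example key)
my_dict['depth'] = 10

# Update an existing value
my_dict['temperature'] = 22

# Loop over all of the key/value pairs
for key, value in my_dict.items():
    print(key, value)

# .get() returns a default instead of raising an error when a key is missing
units = my_dict.get('units', 'unknown')
print(units)   # unknown
```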

## Functions, Conditions and Loops
If you're familiar with how to do these in Matlab or R, it's all very similar, just with a different syntax. 

Remember, Python uses indentation to group statements into blocks, rather than curly braces or end statements. You can use 2 or 4 spaces per level (the official style guide, PEP 8, recommends 4), as long as every line in a block is indented by the same amount.

In [0]:
def times_two(num):
  return num * 2

In [0]:
times_two(3)

In [0]:
def my_name(name='Sage'):
  return name

In [0]:
my_name()
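
The value `'Sage'` is only a default. As a quick sketch, you can override it by passing your own argument, either by position or by keyword (the names used here are arbitrary).

```python
def my_name(name='Sage'):
    return name

# With no argument, the default value is used
default_result = my_name()
print(default_result)      # Sage

# Passing an argument by position overrides the default
positional_result = my_name('Alex')
print(positional_result)   # Alex

# You can also pass the argument by keyword
keyword_result = my_name(name='Jordan')
print(keyword_result)      # Jordan
```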

Here is one quick example that demonstrates how to define a function, use a conditional, and iterate with a for loop all at once.

In [0]:
# A more complicated function
def my_func(number):
  print('Running my_func')
  if type(number)==int:
    for i in range(number):
      print(i)
  else:
    print("Not a number")


In [0]:
my_func('Test')

In [0]:
my_func(4)
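
A common variation on the type check is the built-in `isinstance()` function. The sketch below is one possible alternative, not a replacement for `my_func`: the name `count_up` is hypothetical, and it returns a list rather than printing so the result can be reused.

```python
def count_up(number):
    """Return the integers from 0 up to (but not including) number."""
    # isinstance() is the usual way to test a value's type.
    # True and False also count as integers in Python, so exclude them explicitly.
    if isinstance(number, int) and not isinstance(number, bool):
        return list(range(number))
    else:
        # Anything that is not an integer gives None
        return None

print(count_up(4))        # [0, 1, 2, 3]
print(count_up('Test'))   # None
```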

## Fun with NDBC Data
Now that we've covered some basics, let's start having some fun with actual ocean data.

The [National Data Buoy Center (NDBC)](https://www.ndbc.noaa.gov) provides a great dataset to start with.  And for this example, we'll use my favorite buoy [Station 44025](https://www.ndbc.noaa.gov/station_page.php?station=44025).

![NDBC Mid-Atlantic Station Map](https://www.ndbc.noaa.gov/images/maps/NorthEast.gif)

To load datasets like this, there are 2 popular libraries we can use.

* Pandas
  * Great for working with "spreadsheet-like" tables that have headers and rows, like Excel or CSV files
  * Can easily load text or CSV files 
* Xarray
  * Supports multidimensional arrays (e.g. x,y,z,t)
  * Can open NetCDF files or data from Thredds servers which are common in Oceanography
  * If you're using a Thredds server, you don't have to load all the data to use it

NDBC actually makes their data available in a variety of ways.  Text files are often more intuitive.  However, the NDBC text files require jumping through a few hoops to load and use (each file is a separate year, dates are in multiple columns, etc.).
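
To give a sense of the "dates in multiple columns" hoop, here is a minimal sketch that combines separate date columns into a single timestamp. The row below is made up, and the sketch assumes the first five columns are year, month, day, hour and minute; check the header of a real file before relying on that order.

```python
from datetime import datetime

# A made-up row for illustration (not real buoy data).
# Assumed column order: year, month, day, hour, minute, then measurements.
sample_row = '2019 06 09 14 50 210 5.2 18.4'

# Split the row on whitespace into a list of strings
fields = sample_row.split()

# Convert the first five fields to integers
year, month, day, hour, minute = [int(value) for value in fields[:5]]

# Combine them into a single timestamp
timestamp = datetime(year, month, day, hour, minute)
print(timestamp)   # 2019-06-09 14:50:00
```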

Luckily, NDBC also provides a Thredds server [DODS](https://dods.ndbc.noaa.gov), which we can use to quickly load some data to play with.

In [0]:
import xarray as xr
!pip install netcdf4

In [0]:
data = xr.open_dataset('https://dods.ndbc.noaa.gov/thredds/dodsC/data/stdmet/44025/44025.ncml')

In [0]:
# The Dataset
data

In [0]:
# Let's look at one variable
data.air_temperature

In [0]:
# And one piece of metadata
data.air_temperature.long_name

In [0]:
# Now let's make a quick plot
data.air_temperature.plot();

In [0]:
# Let's subset the data in time
data2 = data.sel(time=slice('2019-01-01','2020-01-01'))

In [0]:
# Let's make that quick plot again
data2.air_temperature.plot();

In [0]:
import matplotlib.pyplot as plt

In [0]:
# We can even plot 2 variables on one graph
data2.air_temperature.plot(label="Air Temperature")
data2.sea_surface_temperature.plot(label="Sea Surface Temperature")
plt.legend();

Tomorrow, we'll delve a lot more into data visualization and many of the other plotting commands you can use.  But now, it's your turn to create your own plots. 

Try plotting different:
* Variables (see options above)
* Time ranges (you will need to reload the dataset)
* Stations (you will need to change the dataset URL). Check out the [NDBC homepage](https://www.ndbc.noaa.gov) for available stations

As you create your graphs, try to write *figure captions* that describe what you think is going on.
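
To switch stations, one option is a small helper that builds the dataset URL for you. This is only a sketch: the function name `ndbc_stdmet_url` is hypothetical, and it assumes other stations follow the same URL pattern as the 44025 address used earlier, which may not hold for every station.

```python
def ndbc_stdmet_url(station):
    """Build a dataset URL for a station ID.

    Assumes the same pattern as the 44025 URL used earlier in this notebook.
    """
    station = str(station)
    return f'https://dods.ndbc.noaa.gov/thredds/dodsC/data/stdmet/{station}/{station}.ncml'

url = ndbc_stdmet_url(44025)
print(url)

# To load the data (requires xarray and a network connection):
# data = xr.open_dataset(url)
```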

In [0]:
# Your Turn: Create some plots 

### Additional Intros and References
[2019 Data Labs Quick Intro to Python](https://github.com/ooi-data-lab/data-lab-workshops/blob/master/Summer_Examples/Python_Introduction.ipynb)

[2018 Python Basics for Matlab Wizards](https://github.com/ooi-data-review/2018-data-workshops/blob/master/chemistry/examples/extras2/Python_Basics_for_Matlab_Wizards.ipynb)

[Rowe Getting Started with Python](https://github.com/prowe12/python_resources/blob/master/Introduction_to_python3.ipynb)