# Day 2: From text files to plots
Today we'll learn how to read text files into Python, how to extract data from those files, and how to create plots.

Session outline:
1. Introduction to `numpy` (Matlab-style matrices in Python)
2. Creating plots with `matplotlib` (functionally similar to Matlab plots)
3. Loading data from text files with `numpy`
4. More plotting

## Arrays with `numpy`
Numpy is an array and matrix manipulation package for Python. Numpy arrays are similar to Matlab matrices, although there are some notable differences (for example, indexing starts at 0 rather than 1), which are outlined in:
* https://numpy.org/doc/stable/user/numpy-for-matlab-users.html

In [None]:
import numpy as np # import the numpy package

# creating one-dimensional numpy arrays

# array indexing

# numpy arrays vs. Python lists

# element-wise operations

# creating 2-dimensional arrays

# matrix transposition and multiplication
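
As a rough guide to the topics listed in the cell above, here is a minimal sketch of each one. All variable names and values are made up for illustration.

```python
import numpy as np

# creating one-dimensional numpy arrays
a = np.array([1.0, 2.0, 3.0, 4.0])   # from a Python list
b = np.arange(4)                      # the integers 0, 1, 2, 3
c = np.linspace(0.0, 1.0, 5)          # 5 evenly spaced values from 0 to 1

# array indexing (zero-based, unlike Matlab)
first = a[0]        # first element
last = a[-1]        # last element
middle = a[1:3]     # elements at index 1 and 2 (the end index is excluded)

# numpy arrays vs. Python lists
doubled_list = [1, 2, 3] * 2             # repeats the list
doubled_array = np.array([1, 2, 3]) * 2  # multiplies every element by 2

# element-wise operations
total = a + b       # element-wise addition
product = a * b     # element-wise multiplication (like .* in Matlab)

# creating 2-dimensional arrays
M = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 rows, 3 columns
Z = np.zeros((2, 3))         # 2 x 3 array filled with zeros

# matrix transposition and multiplication
Mt = M.T            # transpose, shape (3, 2)
MMt = M @ Mt        # matrix product, shape (2, 2)
```

Note that `*` is always element-wise for numpy arrays; the matrix product is written with `@`.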


## Plotting with `matplotlib`
Using `matplotlib` to plot data stored in `numpy` arrays.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# needed to use matplotlib in Jupyter notebooks
%matplotlib inline 
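
A first plot could look something like the sketch below. The sine curve is only example data and the labels are placeholders; in a notebook, the figure is displayed below the cell once `%matplotlib inline` has been run.

```python
import numpy as np
import matplotlib.pyplot as plt

# example data: one period of a sine curve
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

# create a figure with a single set of axes and draw the curve
fig, ax = plt.subplots()
line, = ax.plot(x, y, color="tab:blue", label="sin(x)")

# label the axes and add a title and legend
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("A first plot")
ax.legend()
```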

## Loading text files
First, let's print the contents of the file.

In [None]:
%%bash
cat D2/Dovre1-Snoheim.txt

Next, let's load the data into Python with the `loadtxt` function in `numpy`.
* https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html

In [None]:
# loading text files with np.loadtxt
filename = 'D2/Dovre1-Snoheim.txt'

That didn't work!

`numpy.loadtxt` doesn't know what to do with the datetime strings in the first column. By default it tries to convert every value to a float, so we need to convert the datetime strings into numbers ourselves when loading the file.

In [None]:
# converting datetime strings to and from floats
from datetime import datetime
from matplotlib.dates import num2date, date2num
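
One way to do the conversion is to pass a converter function to `np.loadtxt`. The sketch below uses a small in-memory stand-in instead of the real file, and both the datetime format (`%Y-%m-%dT%H:%M`) and the column layout are assumptions that will need adjusting to match the actual data. The helper name `str2date` is likewise made up here.

```python
from datetime import datetime
from io import StringIO

import numpy as np
from matplotlib.dates import date2num, num2date

def str2date(s):
    """Convert a datetime string to a float (matplotlib's date number, in days)."""
    # assumed format; adjust to match the first column of the real file
    return date2num(datetime.strptime(s, "%Y-%m-%dT%H:%M"))

# hypothetical stand-in for a data file: datetime, air temperature, wind speed
sample = StringIO(
    "2020-01-01T00:00 -5.2 3.1\n"
    "2020-01-01T01:00 -5.6 2.8\n"
    "2020-01-01T02:00 -6.0 4.0\n"
)

# column 0 goes through str2date; the other columns are parsed as floats
data = np.loadtxt(sample, encoding="utf-8", converters={0: str2date})

# converting a float back into a (timezone-aware) datetime object
first_time = num2date(data[0, 0])
```

The dates are stored as floats counting days, so consecutive hourly rows differ by 1/24 in the first column.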

# Plotting the data we loaded
1. Extract the data we want to plot
2. Convert floats back to dates
3. Create the plot

In [None]:
# make the columns available as variables


In [None]:
# create the plot
import matplotlib.pyplot as plt

# add y-axis label

# change date format
import matplotlib.dates as mdates
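
The three steps in the cell above might be filled in roughly like this. The data is synthetic, and the unit in the axis label and the date format string are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# hypothetical data: one temperature value per hour for two days
start = datetime(2020, 1, 1)
dates = [start + timedelta(hours=h) for h in range(48)]
airtemp = -5.0 + 3.0 * np.sin(np.linspace(0, 4 * np.pi, 48))

# create the plot
fig, ax = plt.subplots()
ax.plot(dates, airtemp, color="tab:red")

# add y-axis label (the unit is an assumption)
ax.set_ylabel("Air temperature (°C)")

# change date format: day, abbreviated month, hour and minute
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d %b %H:%M"))
fig.autofmt_xdate()  # rotate the tick labels so they do not overlap
```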


## Windspeed on the right y-axis

In [None]:
# create the plot

# add y-axis label

# change date format
import matplotlib.dates as mdates

# add windspeed on the right y-axis
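
A second y-axis that shares the same x-axis can be created with `ax.twinx()`. The sketch below uses synthetic data; the units, colors, and date format are assumptions.

```python
from datetime import datetime, timedelta

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# hypothetical data: hourly air temperature and wind speed for two days
start = datetime(2020, 1, 1)
dates = [start + timedelta(hours=h) for h in range(48)]
airtemp = -5.0 + 3.0 * np.sin(np.linspace(0, 4 * np.pi, 48))
windspeed = 4.0 + 2.0 * np.cos(np.linspace(0, 4 * np.pi, 48))

# air temperature on the left y-axis
fig, ax = plt.subplots()
ax.plot(dates, airtemp, color="tab:red")
ax.set_ylabel("Air temperature (°C)", color="tab:red")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d %b"))

# wind speed on the right y-axis, sharing the x-axis with ax
ax2 = ax.twinx()
ax2.plot(dates, windspeed, color="tab:blue")
ax2.set_ylabel("Wind speed (m/s)", color="tab:blue")
```

Coloring each label to match its line makes it easier to see which axis belongs to which curve.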


# Exercise: Airtemp vs. windspeed scatter plot
1. Create a scatter plot with air temperature (`airtemp`) on the x axis and wind speed (`windspeed`) on the y axis. Add axis labels, set axis limits, adjust colors, etc.
2. Save the figure as a PDF using the `plt.savefig` command
    * https://matplotlib.org/3.2.1/api/_as_gen/matplotlib.pyplot.savefig.html

# Exercise: recreate the following figure
* Data: `D2/rro_Bulken.txt`
* Runoff is the third column from the right
<img src="D2/bulken.png">

In [None]:
%%bash
cat D2/rro_Bulken.txt

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

from datetime import datetime
from matplotlib.dates import num2date, date2num

def str2date_rro(s):
    date = datetime.strptime(s, "%d%m%Y")
    return date2num(date)

def missing_to_NaN(istr):
    ''' Convert a string containing a number to a float, interpreting unparsable strings as NaN '''
    try:
        val = float(istr)
    except ValueError:
        val = float('NaN')
    
    return val

data = np.loadtxt("D2/rro_Bulken.txt", encoding='latin1', converters={
    0: str2date_rro,
    1: missing_to_NaN,
    2: missing_to_NaN,
})

### YOUR CODE HERE ###

# Exercise: recreate the following figure
* Data: `D2/rr24_Bulken.txt`
* You can create bar plots with ``plt.bar``. The width of the bars can be changed with the ``width=value`` keyword argument, which is given in days when the x-axis shows dates.
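
To illustrate how `width` behaves with dates, here is a small sketch with made-up daily values. It is not the exercise data, and the unit in the label is an assumption.

```python
from datetime import datetime, timedelta

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# hypothetical data: one value per day for a week
start = datetime(2020, 1, 1)
dates = [start + timedelta(days=d) for d in range(7)]
precip = np.array([0.0, 2.5, 10.1, 0.4, 0.0, 7.3, 1.2])

fig, ax = plt.subplots()
# width=0.8 means each bar covers 0.8 days, leaving a small gap between bars
bars = ax.bar(dates, precip, width=0.8)
ax.set_ylabel("Precipitation (mm)")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d.%m"))
```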
 
 <img src="D2/bulken_precip.png">

In [None]:
%%bash
cat D2/rr24_Bulken.txt

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

from datetime import datetime
from matplotlib.dates import num2date, date2num

def str2date_rr24(s):
    date = datetime.strptime(s, "%d.%m.%Y")
    return date2num(date)

def missing_to_NaN(istr):
    ''' Convert a string containing a number to a float, interpreting unparsable strings as NaN '''
    try:
        val = float(istr)
    except ValueError:
        val = float('NaN')
    
    return val

data = np.loadtxt("D2/rr24_Bulken.txt", encoding='latin1', usecols=(1, 2, 3), skiprows=21, converters={
    1: str2date_rr24,
    2: missing_to_NaN,
    3: missing_to_NaN,
})

### YOUR CODE HERE ###