Introduction to Python and Jupyter Notebooks
============================================
This notebook is a breakdown of the very basics of Python and the Jupyter Notebook environment. You will learn Python syntax, data types, functions, and more, along with how to use Jupyter Notebooks. It also includes some explanations of web APIs and data scraping.\
Work through the notebook in order, and complete the exercises as you go.\
Note that you can and should read the comments (lines starting with #) in the code cells, as they provide explanations and hints.

Run the cell below to get started so you have the necessary libraries installed.

In [None]:
# This cell must be run before any other code cells in the notebook
import piplite

await piplite.install("requests")
await piplite.install("numpy")
await piplite.install("pandas")
await piplite.install("matplotlib")
await piplite.install("seaborn")

# Integers, Floats, Strings, and Booleans
Nearly every programming language has a concept of integers, floats, strings, and booleans. They are the fundamental data types.
- Integers are whole numbers, positive or negative.
- Floats are numbers with a decimal point.
- Strings are sequences of characters enclosed within single or double quotes (either works, as long as it is the same on both ends).
- Booleans are True or False.
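
If you are ever unsure which of these types a value has, the built-in `type()` function will tell you. The short sketch below uses made-up variable names purely for illustration.

```python
# Illustrative values, one for each basic type
whole_number = 42
decimal_number = 3.14
text = "forty-two"
flag = True

# type() reports the type of any value
print(type(whole_number))    # <class 'int'>
print(type(decimal_number))  # <class 'float'>
print(type(text))            # <class 'str'>
print(type(flag))            # <class 'bool'>
```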

In [None]:
# Here is an example of an integer
x = 5

# We can perform operations on integers
y = x + 2
print(y)

In [None]:
# Floats are numbers with decimal points and are also valid for operations
x = 5.123
y = x / 2
print(y)

In [None]:
# Strings are sequences of characters enclosed in quotes
x = "Hello, World!"
y = 'Hello, again?'

# We can do a combination of single and double quotes to include quotes in a string
z = "He said, 'Hello!'"
print(z)

In [None]:
# You can also concatenate strings
x = "Hello"
y = "World"
print(x + " " + y) # NOTE: The added string with a single space in between

In [None]:
# Booleans are True or False values that can be used in logical operations
x = True
y = False

z = x or y # Try changing the 'or' to 'and' and see what happens
print(z)

In [None]:
# Comparison operators can be used to compare values and output a boolean
print(5 > 3)

In [None]:
# We can combine comparison operators with logical operators to create complex conditions
print(5 > 3 and 5 > 10)

## Exercise 1
In the cell below, try to guess the outputs of the following expressions. Add your guesses as comments in the cell. Finally, run the cell to see if you were correct.

In [None]:
print(5 > 3 or 5 > 10) # Answer: ?
print(6 <= 7) # Answer: ?
print(150 == 6) # Answer: ?
print(6 != 6) # Answer: ?

# Variables
Variables allow you to store bits of data and reference them later. Think of them as labelled boxes that hold your data. There are only a few hard rules as to what you can name a variable, but there are several conventions.
- Variables should be named descriptively. Best practice is not to use just a random letter.
- Python keywords (like 'for', 'if', 'while', etc.) cannot be used as variable names; trying to do so is a syntax error. Also avoid the names of built-in functions (like 'print' or 'len'), because assigning to them overwrites the original functionality.
- Do not use spaces in variable names. Use underscores instead.
- Variables are case-sensitive. 'my_variable' is different from 'My_Variable'.
- A variable name cannot start with a number or contain special characters other than underscores. By convention, it should also not start with a capital letter.
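
The naming rules above can be checked from code. This is a minimal sketch using the standard library `keyword` module and the string method `isidentifier()`; the variable names are only examples.

```python
import keyword

# Keywords are reserved by the language and cannot be used as names
print(keyword.iskeyword("for"))    # True
print(keyword.iskeyword("print"))  # False: print is a built-in function, not a keyword

# Names are case-sensitive, so these are two separate variables
my_variable = 1
My_Variable = 2
print(my_variable, My_Variable)

# isidentifier() checks whether a string follows the naming rules
print("my_variable".isidentifier())  # True
print("2nd_value".isidentifier())    # False: starts with a number
print("my variable".isidentifier())  # False: contains a space
```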

There are a couple of different "cases" for naming variables:
- **snake_case**: all lowercase, words separated by underscores. (e.g. my_variable)
- **camelCase**: first word is lowercase, subsequent words are capitalized. (e.g. myVariable)
<br/>

Next we will cover scope of variables. Variables can be either **local** or **global**.
- **Local variables** are defined within a function and can only be accessed within that function.
- **Global variables** are defined outside of a function and can be accessed anywhere in the code.

In [None]:
# Here I am going to define a global variable which is in the global scope (outside of any function and can be accessed from anywhere in the code)
MY_GLOBAL = 5 # Often they are written in all caps to indicate that they are global
# To demonstrate the difference between global and local variables I am going to define a global variable called x
x = 5

# I am going to define a function. Don't worry about the syntax for now; I will explain it later on
# Just pay attention to the variable names and the outputs
def myFunc():
    x = 10 # This is a local variable, it is only accessible within the function 
    print(x) 

# When you think you know what the output will be, uncomment the line (remove the #) and run the code
#myFunc() # This will call the function and print the value of x

# Lists, Tuples, Sets, and Dictionaries
These are the four main ways to store collections of data in Python.
- **Lists** are ordered, mutable (changeable), and allow duplicate elements. They are defined by square brackets [ ]. (e.g. my_list = [1, 2, 3])
- **Tuples** are ordered, immutable (unchangeable), and allow duplicate elements. They are defined by parentheses ( ). (e.g. my_tuple = (1, 2, 3))
- **Sets** are unordered, mutable, and do not allow duplicate elements. They are defined by curly braces { }. (e.g. my_set = {1, 2, 3})
- **Dictionaries** are mutable (changeable) collections of key-value pairs that are indexed by keys. Keys must be unique, and since Python 3.7 dictionaries keep the order in which items were inserted. They are defined by curly braces {} and key-value pairs. (e.g. my_dict = {'name': 'John', 'age': 25})
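
As a rough side-by-side comparison, the sketch below builds one of each collection from similar values and shows how each handles duplicates and changes. All names and values are made up for illustration.

```python
sample_list = [1, 2, 2, 3]
sample_tuple = (1, 2, 2, 3)
sample_set = {1, 2, 2, 3}
sample_dict = {"a": 1, "a": 2}  # The same key twice: the last value wins

# Lists and tuples keep duplicates, sets and dictionary keys do not
print(len(sample_list))   # 4
print(len(sample_tuple))  # 4
print(len(sample_set))    # 3
print(sample_dict)        # {'a': 2}

# Lists, sets, and dictionaries can be changed after they are created
sample_list.append(4)
sample_set.add(4)
sample_dict["b"] = 3

# Tuples cannot be changed, so this raises a TypeError
try:
    sample_tuple[0] = 10
except TypeError as error:
    print("Tuples cannot be changed:", error)
```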

In [None]:
# Lists are a collection of values that can be accessed by index (starting from 0)
x = [1, 2, 3, 4, 5]
print(x[0]) # This will output the first element in the list

# The values within a list can be of any type
y = [1, "Hello", True, 5.123]
print(y)

In [None]:
# The list can be modified by changing the values
x = [1, 2, 3, 4, 5]
print(x)
# Now we will change the value at index 2 (third element) to 10
x[2] = 10
print(x)

In [None]:
# We can also use negative indices to access elements from the end of the list
x = [1, 2, 3, 4, 5]
print(x[-1]) # This will output the last element in the list

In [None]:
# We can slice lists to get a subset of the values using the colon operator
x = [2, 7, 1, 8]
print(x[1:3]) # This will output the second and third elements (index 1 and 2). We use 3 as the end index because the end index is exclusive

In [None]:
# We can use the colon operator with only one index to get all the elements from that index to the end
x = [7, 3, 9, 1, 5, 6, 8, 7] 
print(x[2:]) # Printing all elements from index 2 (the third element) to the end

In [None]:
# You can access the length of a list using the len function
x = [100, 200, 300]
y = len(x)
print(y)

In [None]:
# If we want to add a value to the end of the list we can use something called the append method
# I will explain what methods are and how they differ from functions later on
x = [4.5, 6.7, 8.9]
print(x)

# We will now append the value of 10.15 to the end of the list
x.append(10.15)
print(x)

In [None]:
# A tuple is similar to a list but it is immutable, meaning that it cannot be changed once assigned
x = (1, 2, 3)

# This will throw an error because we are trying to change the value of a tuple
# Uncomment the line to see the error
#x[2] = 10
print(x)

In [None]:
# Sets are similar to lists but they do not allow duplicate values and are unordered
x = {35, 15, 29, 35, 15}

# Think about what the output will be
print(x)

In [None]:
# Dictionaries are a collection of key-value pairs and are assigned using curly braces as well
x = {"name": "John", "age": 25, "city": "London"}

# We can access the values using the keys
print(x["name"])

# We can also change the values
# Uncomment the line to see the change
#x["age"] = 30
print(x)

In [None]:
# We can add new key-value pairs to the dictionary
x = {"name": "John", "age": 25, "city": "London"}

x["country"] = "UK"
print(x)

In [None]:
# Dictionary keys can be of any immutable type (strings, numbers, tuples) but the values can be of any type
x = {1: "Hello", (1, 2): 5.123, "another dictionary": {"name": "John", "age": 25}}
print(x[(1, 2)])
print(x["another dictionary"]["name"])

In [None]:
# Finally we can also change between the different types
# Here I am converting a list to a set
x = [1, 2, 3, 3, 4]
y = set(x)

# Think about what the output will be and then uncomment the line to see if you are correct
#print(y) 

In [None]:
# We can also convert a list to a tuple and vice versa
x = [1, 2, 3]
y = tuple(x)

# Think about what the output will be and then uncomment the line to see if you are correct
#y[2] = 10
print(y)

## Exercise 2
In the cell below create a list of numbers and then convert it to a set. Print the set to see the difference between the two.

In [None]:
# Define your list here

# Convert your list to a set


# Syntax
Syntax is essentially the grammar of coding: the rules you have to follow for your code to work and make sense. Python is a high-level programming language that is designed to be easy to read and write. It is known for its readability and clean syntax. Here are some basic syntax rules:
- Python is case-sensitive.
- Whitespace (indentation) is important. The standard is to use 4 spaces for indentation, and most IDEs (e.g. VS Code, JetBrains IDEs, RStudio) will automatically convert tabs to spaces.
- Comments are made with a hash (#) symbol. Anything after the hash will be ignored by the interpreter.

In [None]:
# Don't worry about understanding what is happening in this code block, just pay attention to the syntax
for i in range(0, 11, 2):
    if i % 3 == 0: # The % operator is the modulo operator which gives the remainder of a division
        print(i) # Notice the indentation
    else:
        print("Not divisible by 3")

# If Else Statements
If we want to have a section of code run only if a certain condition is met, we can use an if statement. We can also add else and elif (else if) statements to run code if the condition is not met.

In [None]:
# An if statement is used to execute code based on a condition
x = 5
if x > 3:
    print(f"{x} is greater than 3")

In [None]:
# An if statement only runs its code when the condition is true. To also handle the case where it is false, we follow it with an else statement
x = 5

if x > 10:
    print(f"{x} is greater than 10") # The f in front of the string allows us to use curly braces to insert variables into the string
else:
    print(f"{x} is not greater than 10")

In [None]:
# We can also have multiple conditions using the elif statement (short for else if)
x = 5
if x > 10:
    print(f"{x} is greater than 10")
elif x > 3:
    print(f"{x} is greater than 3 but not greater than 10")
else:
    print(f"{x} is not greater than 3")

# Loops
If we want to run a section of code multiple times, we can use loops. There are two types of loops in Python: for loops and while loops.
- **For loops** are used to iterate over a sequence (list, tuple, string, etc.) and run a block of code a specific number of times.
- **While loops** are used to run a block of code as long as a condition is true.

In [None]:
# For loops iterate over a sequence of values, this can be a list, tuple, set, dictionary, or a range
# For example, we can iterate over a list
x = ["Hello", "World", "!"]
for i in x:
    print(i)

In [None]:
# We can also iterate over a range of values using the range function
for i in range(5):
    print(i)

In [None]:
# We can also iterate over a dictionary by iterating over either the keys, values, or key-value pairs.
# Here I am iterating over the keys using the .keys() method
x = {"name": "John", "age": 25, "city": "London"}
for i in x.keys():
    print(i)

In [None]:
# We can use the .values() method to get the values
x = {"name": "John", "age": 25, "city": "London"}
for i in x.values():
    print(i)

In [None]:
# And the .items() method to get the key-value pairs
for key, value in x.items():
    print(f"{key}: {value}")

In [None]:
# A while loop will continue to execute as long as the condition is true
# NOTE: If there is no way for the condition to become false, the loop will run indefinitely
x = 0

while x < 5:
    print(x)
    x += 1 # This is the same as x = x + 1

## Exercise 3
In the cell below I have a list of numbers. I want you to check each element on 3 conditions:
1. If the number is less than 5, print "low".
2. If the number is greater than or equal to 5 and less than 10, print "medium".
3. If the number is greater than or equal to 10, print "high".
<br/>
HINT: Both for loops and if else statements will be needed.

# Functions
Functions are chunks of code that you can reuse. We have already seen some functions like print() and len(). You can also create your own functions. Functions are defined using the def keyword followed by the function name and parentheses. 
<br/>
Some functions take arguments (inputs) and return a value. There are two types of arguments:
- **Positional arguments** are arguments that need to be in the correct position.
- **Keyword arguments** are arguments that are preceded by a keyword and an equals sign.
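
The difference is in how the function is called, not in how it is defined. As a small sketch, the hypothetical function `describe_pet` below is called in all three ways: positionally, by keyword, and with a mix of both.

```python
# describe_pet is a made-up function used only for this illustration
def describe_pet(name, animal):
    return f"{name} is a {animal}"

# Positional: the values are matched to the parameters by their order
print(describe_pet("Rex", "dog"))              # Rex is a dog

# Keyword: the values are matched by name, so the order does not matter
print(describe_pet(animal="cat", name="Tom"))  # Tom is a cat

# Mixed: positional arguments must come before keyword arguments
print(describe_pet("Polly", animal="parrot"))  # Polly is a parrot
```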

In [None]:
# Here we are going to define a function that takes two POSITIONAL arguments and returns their sum
def add(x, y):
    # The return statement is used to return a value from a function
    # If you do not include a return statement the function will return None (a special value in Python)
    return x + y 

# We can call the function and store the result in a variable
z = add(5, 10)
print(z)

In [None]:
# Now we will define a function and call it with KEYWORD arguments. It returns a string that
# concatenates the two arguments
def combineStrings(string1, string2):
    return string1 + " " + string2

# Because we name each argument in the call, we can pass the arguments in any order
z = combineStrings(string2="World", string1="Hello")
print(z)

In [None]:
# We can also define default values for arguments in a function
def multiply(x, y=2):
    return x * y

# If we do not pass a value for y it will default to 2
z = multiply(5)
print(z)

## Exercise 4
In the cell below I want you to create a function that takes in a list of numbers and returns the sum of all the numbers. Then call the function with a list of numbers to see if it works.
The list is predefined for you.

In [None]:
sumThis = [66.5, 23.4, 12.7, 45.6, 78.9] # Sum this list

# Define a function that takes a list of numbers and returns the sum of the numbers here!

# Objects and Classes
Python is an object-oriented programming language. It is based on the concept of "objects". An object is a collection of attributes (variables) and methods (functions) that act on the data. A class is a blueprint for an object. It defines the data and methods that the object will have. You can create multiple objects from the same class.
<br/>
<br/>
Yes, I am aware that this may not make a lot of sense right now. It's essentially a way of bundling bits of data and functions into "objects" that are similar to each other.

In [None]:
# We have already seen an example of a method when we used the append method on a list
# A method is a function that is associated with an object and is called using the dot operator
# Here I will use the .upper() method on a string to convert it to uppercase
x = "hello"
y = x.upper()
print(y)

In [None]:
# There are other types of methods that can be used on strings such as .lower(), .capitalize(), .title(), .strip(), .replace(), .split()
# I will demonstrate some of them here
x = " hello, world! " # Notice the spaces at the beginning and end of the string
y = x.strip() # This will remove the spaces from the beginning and end of the string
print(y)

splitString = "Hello, World!".split(",") # This will split the string at the comma and return a list
print(splitString)

In [None]:
# We can create our own object types using class definitions
# Here I am going to define a class called Employee
class Employee:
    # The __init__ method is a special method that is called when an object of the class is created
    # It is used to initialize the object with the values passed to it
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary
    
    # We can define other methods in the class that can be called on the object
    # Notice that the methods take self as the first argument, this is a reference to the object itself
    # We can use self to access the attributes of the object
    def giveRaise(self, amount):
        self.salary += amount

# Here we initialize an object of the Employee class
JohnSmith = Employee("John Smith", 25, 50000)

# We can access the attributes of the object using the dot operator
print(JohnSmith.salary)

# We can also call the methods on the object
JohnSmith.giveRaise(5000)
print(JohnSmith.salary)

## Exercise 5
In the cell below, create a class called "Dog". The class should have the following attributes:
- name
- age
<br/>
Then there should be a method that makes the dog bark. Finally create an object from the class and call the bark method.

In [None]:
# Define your Dog class here
# HINT: don't forget the __init__ method

# Read and Write Files
Python has built-in functions for reading and writing files. It also has something called a with statement that makes it easier to work with files, because it automatically closes the file when you are done with it. This is the preferred way to open files so you don't forget to close them.
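
To see what the with statement saves you from, here is a rough sketch of the manual approach next to it. The file name `with_demo.txt` is made up for this example, and the file is deleted again at the end.

```python
import os

# Without a with statement we have to remember to close the file ourselves
file = open("with_demo.txt", "w")
try:
    file.write("written without a with statement")
finally:
    file.close()  # Runs even if the write above fails

# With a with statement the file is closed for us at the end of the block
with open("with_demo.txt", "r") as file:
    print(file.read())

print(file.closed)  # True: the file was closed automatically

# Clean up the demo file
os.remove("with_demo.txt")
```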

In [None]:
# Here we create a file and write to it
with open("example.txt", "w") as file:
    file.write("Hello, World!")
    file.write("\n") # This will add a new line
    file.write("Hello, again!")

# Here we can read the contents of the file
with open("example.txt", "r") as file:
    # The read method reads the entire file and returns it as a string
    content = file.read()
    print(content)

# We can also read the file line by line
with open("example.txt", "r") as file:
    # The readlines method reads the file line by line and returns a list of lines
    lines = file.readlines()
    print(lines)

# Libraries and Modules
Python has a large standard library, but there are also many third-party libraries that you can use. Libraries are collections of functions, classes, and constants. You can import a library using the import keyword. You can also import specific functions or classes from a library. *Most* libraries are not built into Python, so you will need to install them using pip (or conda if you are using Anaconda). 
<br/>
You can install a library using the command prompt or terminal. 
```bash
pip install library_name
```
<br/>
You then import them into your code using the import statement.

```python
import library_name
```
However you can also import specific functions or classes from a library.
```python
from library_name import function_name
```
Finally, you can also give a library an alias when you import it.
```python
import library_name as ln
```
This is useful when you have a long library name and you don't want to type it out every time. Best practice is to use the alias that is commonly used by the community, or to import just the function you are using.
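
As a concrete illustration, the sketch below applies all three import styles to the standard library `statistics` module. The alias `st` is just an example, not an established convention.

```python
import statistics               # Import the whole module
from statistics import mean    # Import one function from it
import statistics as st        # Import the module under an alias

values = [2, 4, 6]

# All three lines call the same function
print(statistics.mean(values))  # 4
print(mean(values))             # 4
print(st.mean(values))          # 4
```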

## Built-in Libraries
Python has many built-in libraries that you can use. Some of the most common ones are:
- **math**: provides mathematical functions.
- **random**: provides functions for generating random numbers.
- **datetime**: provides classes for working with dates and times.
- **os**: provides functions for interacting with the operating system.
- **re**: provides functions for working with regular expressions.
- **json**: provides functions for working with JSON data.
- And many more!
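
Two modules from this list, json and re, are easy to try in a few lines. The sketch below uses made-up data: it converts a dictionary to JSON text and back, then pulls the numbers out of a sentence with a regular expression.

```python
import json
import re

# json.dumps turns a dictionary into JSON text, json.loads turns it back
record = {"name": "John", "age": 25}
as_text = json.dumps(record)
print(as_text)      # {"name": "John", "age": 25}
back = json.loads(as_text)
print(back["age"])  # 25

# re.findall returns every part of the string that matches the pattern
# The pattern \d+ means "one or more digits"
sentence = "Stations 101 and 202 reported data"
numbers = re.findall(r"\d+", sentence)
print(numbers)      # ['101', '202']
```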

In [None]:
# Here I will import the math module which provides mathematical functions and constants (it is part of the standard library)
import math

# We can use the functions in the math module by using the dot operator
x = math.sqrt(25)
print(x)

# Or use the constants
print(math.pi)

In [None]:
# We can also import specific functions or constants from a module
from random import randint

# This will generate a random integer between 0 and 10 (both included)
x = randint(0, 10)
print(x)

In [None]:
# Datetime is another module in the standard library that provides classes for working with dates and times, and it is very useful
import datetime as dt # This imports the datetime module under the alias dt

# We can create a datetime object using the datetime class
x = dt.datetime(2020, 5, 17, 12, 30, 0) # This will create a datetime object for the 17th of May 2020 at 12:30 PM

# We can use the strftime method to state the structure of the date and time
print(x.strftime("%A, %d %B %Y %I:%M %p")) # This will output the date and time in a human-readable format

# The letters in the strftime method are called format codes and are used to format the date and time
# A full list of them can be found in the Python documentation if you are interested
# Just search python datetime documentation in your search engine

# We can also read in strings and convert them to datetime objects using those format codes
y = "2020-05-17 12:30:00"
z = dt.datetime.strptime(y, "%Y-%m-%d %H:%M:%S") # This will convert the string to a datetime object
print(z.strftime("%B %d, %Y"))

# There is also a timedelta class in the datetime module that can be used to represent a duration of time and add or subtract it from a datetime object
# Here I will create a timedelta object representing 5 days
delta = dt.timedelta(days=5)
newDate = x + delta # This will add 5 days to the datetime object x   
print(newDate.strftime("%B %d, %Y"))

In [None]:
# Finally I will demonstrate the os module which provides functions for interacting with the operating system
import os

# We can use the os module to get information about the current working directory
print(os.getcwd())

# We can also use the os module to list the files in a directory
print(os.listdir(os.getcwd()))

# We can delete the example.txt file we created earlier IF it exists
if os.path.exists("example.txt"):
    os.remove("example.txt")

## Numpy
Numpy is a popular library for numerical computing. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. Numpy is a powerful library that is used in many scientific computing applications. I can only scratch the surface of what Numpy can do in this notebook. But I will show you some of the basics.

In [None]:
# Here we will load the NumPy library
import numpy as np

# We can create a NumPy array from a list
x = np.array([1, 2, 3, 4, 5])

# That makes a one-dimensional array, we can also create multi-dimensional arrays
y = np.array([[1, 2, 3], 
              [4, 5, 6]])
print(y)

In [None]:
# We can also create arrays of zeros, ones, or random numbers
x = np.zeros(5)
print(x)

# We can also specify the shape of the array using a tuple (rows, columns)
y = np.ones((2, 3))
print(y)

In [None]:
# We can do element-wise operations on arrays, or matrix operations if the arrays are multi-dimensional
x = np.random.randint(1, 10, (3, 4))
y = np.random.randint(1, 10, (3, 4)) # This will create a random 3x4 array of integers from 1 to 9 (the upper bound of 10 is excluded)

z = x * 5 # This will multiply each element in the array by 5

# We can also do matrix multiplication using the dot function
z = np.dot(x, y.T) # This will multiply x by the transpose of y
print(z)

In [None]:
# We can also create a linearly spaced array using the linspace function
x = np.linspace(0, 10, 5) # This will create an array of 5 evenly spaced values from 0 to 10 (both included)
print(x.tolist()) # The tolist method converts the array to a list

## Pandas
Pandas is a popular library for data manipulation and analysis. It provides data structures like Series and DataFrame that make it easy to work with structured data. Pandas is built on top of Numpy, so it is a powerful tool for data analysis. I will show you some of the basics of Pandas in this notebook. It works with dataframes, which are essentially tables of data.
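
Before loading a real dataset, it can help to see a Series and a DataFrame built by hand. This is a minimal sketch; the city names and temperature values are made up for illustration.

```python
import pandas as pd

# A Series is a single labelled column of values
temperatures = pd.Series([12.5, 15.0, 9.5], name="temperature")
print(temperatures.mean())

# A DataFrame is a table made up of columns, here built from a dictionary of lists
table = pd.DataFrame({
    "city": ["Banff", "Calgary", "Jasper"],
    "temperature": [12.5, 15.0, 9.5],
})
print(table.shape)  # (3, 2): 3 rows and 2 columns

# Rows can be filtered with a condition on a column
warm = table[table["temperature"] > 10]
print(warm)
```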

In [None]:
# Here we will load the pandas library
import pandas as pd

# We will load in the iris dataset from the seaborn-data repository on GitHub
data = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv")

# We can view the first few rows of the dataset using the head method
data.head()

In [None]:
# We can access the columns of the dataset using the dot operator OR the bracket notation
print(data.species.unique()) # This will return the unique values in the species column
print(data["sepal_length"].mean()) # This will return the mean of the sepal_length column

In [None]:
# We can also filter the dataset using conditions
setosa = data[data.species == "setosa"] # This will return only the rows where the species is setosa

# We can also group the data by a column and perform operations on the groups
grouped = data.groupby("species").mean() # This will group the data by species and return the mean of each column for each group
grouped.head()

In [None]:
# Finally we can save the dataset to a file using the to_csv method
setosa.to_csv("setosa.csv", index=False) # This will save the setosa dataset to a file called setosa.csv without the index column

## Matplotlib and Seaborn
Matplotlib is a popular library for creating static, animated, and interactive visualizations in Python. It provides a wide variety of plots and charts, from simple line plots to complex heatmaps. Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for creating attractive and informative statistical graphics. I will show you some of the basics of Matplotlib and Seaborn in this notebook.

In [None]:
import matplotlib.pyplot as plt # This is the standard way to import the matplotlib library

# Here we will load in the iris data from the seaborn-data repository again
data = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv")

# We can create a scatter plot using the scatter method
plt.scatter(data.sepal_length, data.sepal_width)
plt.xlabel("Sepal Length") # This will set the label for the x-axis
plt.ylabel("Sepal Width") # This will set the label for the y-axis
plt.title("Sepal Length vs Sepal Width") # This will set the title of the plot
plt.show() # This will display the plot
plt.close() # This will close the plot and free up memory and ensure that the next plot is not displayed on the same figure

In [None]:
# We can also create a histogram using the seaborn library which is built on top of matplotlib
import seaborn as sns

sns.histplot(data.sepal_length, kde=True) # This will create a histogram of the sepal_length column with a kernel density estimate
plt.xlabel("Sepal Length")
plt.ylabel("Frequency")
plt.title("Distribution of Sepal Length")
plt.show()
plt.close()

In [None]:
# Seaborn also provides a function to create a pairplot which will create a grid of scatter plots for each pair of columns
sns.pairplot(data, hue="species") # This will create a pairplot of the data with the species column as the hue
plt.show()
plt.close()

In [None]:
# Seaborn provides a function for creating your own palettes for plots
colors = sns.color_palette("husl", 3) # This will create a list of 3 colors from the husl palette

# We can use these colors in our plots
sns.scatterplot(
    data=data,
    x="sepal_length",
    y="sepal_width",
    hue="species",
    palette=colors
)
plt.show()
plt.close()

It is worth checking out all the different plots you can make with Matplotlib and Seaborn. They are very powerful tools for data visualization, ***AND*** they include colorblind-friendly palettes.

## Other Libraries
There are many other libraries that you can use in Python. Some of the most popular ones are:
- **requests**: for making HTTP requests, and web scraping.
- **xarray**: for working with multi-dimensional arrays such as NetCDFs.
- **geopandas**: for working with geospatial data, built on the pandas library.
- **scipy**: for scientific computing, statistics, and many other things.

# Putting it all together!
Here we will build a simple function to scrape weather data from ECCC and visualize it. We will then export it to a csv file. This will be a simple example of how you can use all the tools we have learned in this notebook to do something useful.

## Web APIs and data scraping
Web APIs (Application Programming Interfaces) allow you to interact with web servers and retrieve data. Many websites provide APIs that allow you to access their data in a structured way. You can use APIs to retrieve data from websites, interact with social media platforms, and much more.
<br/>
ECCC has its own API that allows you to access weather and hydrometric data. Documentation for the API can be found [here](https://api.weather.gc.ca/openapi?f=html).
<br/>
But it essentially works like this:
<br/>
Base URL: https://api.weather.gc.ca/\
We then specify that we are accessing their collections of data.\
New URL: https://api.weather.gc.ca/collections\
We then specify that we are accessing their climatological data, assuming we know what station we want.\
New URL: https://api.weather.gc.ca/collections/climate-daily/items (we need to specify the items if we don't want to get the metadata)\
Then we add the query parameters (i.e. the station we want, the date range, etc.)\
Final URL: https://api.weather.gc.ca/collections/climate-daily/items?STATION_NAME="insert station name here"&f="format here (json, csv, etc.)"&datetime="date range here"
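
The URL-building steps above can be sketched in code without making any request. This example assumes the `f`, `STATION_NAME`, and `LOCAL_YEAR` query parameters that the function later in this notebook uses, and it relies on `urlencode` from the standard library to join them. The variable names are only illustrative.

```python
from urllib.parse import urlencode

# The collection we want, ending in /items so we get data rather than metadata
base_url = "https://api.weather.gc.ca/collections/climate-daily/items"

# The query parameters as a dictionary (example values)
query = {
    "f": "csv",
    "STATION_NAME": "BANFF",
    "LOCAL_YEAR": 2020,
}

# urlencode joins the parameters with & and escapes any special characters
url = f"{base_url}?{urlencode(query)}"
print(url)
# https://api.weather.gc.ca/collections/climate-daily/items?f=csv&STATION_NAME=BANFF&LOCAL_YEAR=2020
```

Building the query from a dictionary keeps each parameter on its own line, which makes it easier to add or change one later.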

## Function building

In [None]:
# First we will import the required libraries we need
import requests
import io # This is an input/output module that allows us to work with streams of data (like files)
import datetime as dt
import pandas as pd

# We will start by defining the function that will get the data from the API
def retrieve_weather_data(stationName, startDate, endDate):
    # First we convert the start and end dates to datetime objects so we can use them below
    start = dt.datetime.strptime(startDate, "%Y-%m-%d")
    end = dt.datetime.strptime(endDate, "%Y-%m-%d")

    # We will now build the URL for the API
    # The URL is made up of the base URL, the station name, and the year of the start date
    # We will also include the format of the data we want (CSV)
    # NOTE: Only the year of the start date is requested, so both dates should fall in the same year
    baseURL = "https://api.weather.gc.ca/collections/climate-daily/items?"
    url = f"{baseURL}f=csv&STATION_NAME={stationName}&LOCAL_YEAR={start.year}"

    # We will now use the requests library to make a GET request to the API
    response = requests.get(url)
    response.raise_for_status() # Stop with an error if the request failed

    # The response text can be read in as a pandas DataFrame
    data = pd.read_csv(io.StringIO(response.text))

    # Now we will convert the LOCAL_DATE column to datetime objects
    data["LOCAL_DATE"] = pd.to_datetime(data["LOCAL_DATE"])

    # Finally we keep only the rows between the start and end dates
    data = data[(data["LOCAL_DATE"] >= start) & (data["LOCAL_DATE"] <= end)]

    return data

# Now we will call the function to get the data
banffData = retrieve_weather_data("BANFF", "2020-01-01", "2020-12-31")
banffData.head()

## Data visualization

In [None]:
# We will now plot the average temperature for each day over the year
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.dates as mdates

# We can use the lineplot function from seaborn to create a line plot
sns.lineplot(data=banffData, x="LOCAL_DATE", y="MEAN_TEMPERATURE")

# We can also use the matplotlib functions to customize the plot
plt.xlabel("Date")
plt.ylabel("Mean Temperature (°C)")

# Here we will edit the x-axis to show the month
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b')) # Formats the x-axis to show the abbreviated month name


plt.title("Mean Temperature in Banff in 2020")

plt.show()
plt.close()

## Exporting data

In [None]:
# Finally we will save the data to a file BUT only the columns we are interested in
banffData[["STATION_NAME", "LOCAL_DATE", "MEAN_TEMPERATURE"]].to_csv("banffWeather.csv", index=False)