<H1>SI 618 Day 01 - Introduction</H1>
Dr. Chris Teplovs, University of Michigan School of Information

Copyright &copy; 2024.  This notebook may not be shared outside of the course without permission.

This notebook is a very brief introduction to Jupyter notebooks.  We will use Jupyter notebooks for all of our work in this course.

Notebook version 2024.01.10.4.CT

## Learning Objectives
By the end of this class, you should:
* have confirmed that you have a working Jupyter environment using Visual Studio Code (VS Code)
* be able to open and edit a Jupyter notebook
* be able to run a Jupyter notebook
* have written your first code in this class
* have experimented with Copilot
* have successfully submitted an assignment to Canvas

You will be working in this notebook. When we are done, you will submit this notebook in two formats: HTML and IPYNB.

Jupyter notebooks consist of two main types of "cells" or "blocks" (I use those terms interchangeably): code and markdown.  There are other types, but we won't be using them in this course.  Code blocks contain (Python) code, whereas markdown blocks contain text.  Use markdown blocks to create richer narratives around your code.

For more information about what constitutes a good Jupyter notebook, please read Adam Rule, et al. [Ten simple rules for writing and sharing computational analyses in Jupyter Notebooks](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007007).

In this course, we also expect you to conform to the PEP 8 style guidelines: [PEP 8 Style Guide for Python Code](https://pep8.org/), which is adapted from the original Python Enhancement Proposal, [PEP 8](https://peps.python.org/pep-0008/). You will earn fewer points for your assignments if you do not follow this style guide.

**Before you start**: Make sure you have selected your `.venv` Python environment.

One of the first things we want to do in our notebooks is our `import`s.  We'll import the `csv` module here; we'll use it a bit later in the notebook:


In [1]:
import csv

### Challenge 1: Write code that prints out "Hello, world!" (don't overthink this one)

In [2]:
# insert your code here
print("Hello, world!")

Hello, world!


### Challenge 2: Sum of squares
Create a function that calculates the sum of squares of any sequence of numbers.  Test it on the following values: 1,3,5,7,9.  The answer should be 165.  We'll improve the `assert` statement during class, but for now, just use it as shown.

In [5]:
def sum_of_squares(seq):
    # return sum([x**2 for x in seq])  # suggested by Copilot, tested and accepted
    # sum_of_squares([1,3,5,7,9])
    # another implementation
    mysum = 0
    for i in seq:
        mysum += i**2
    return mysum

# testing
# print(sum_of_squares([1,3,5,7,9]))
assert sum_of_squares([1,3,5,7,9]) == 165
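
One way the `assert` statement might be improved is to attach a failure message, so that a failing test reports what was expected and what was actually returned. The sketch below assumes that this is the kind of improvement intended; the names `values`, `expected`, and `result` are illustrative only.

```python
def sum_of_squares(seq):
    """Return the sum of the squares of the numbers in seq."""
    return sum(x**2 for x in seq)


values = [1, 3, 5, 7, 9]
expected = 165
result = sum_of_squares(values)

# The text after the comma is displayed only if the assertion fails.
assert result == expected, f"Expected {expected}, but got {result}"
```

If the assertion passes, the cell produces no output, which is the behavior you want from a passing test.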

### Challenge 3: Documentation
Add documentation to the following function using docstrings and comments.  (Hint: see https://en.wikipedia.org/wiki/Fibonacci_sequence or leverage GitHub Copilot Chat.)

In [16]:
def f(x):
    '''
    This function takes a non-negative integer index and returns the Fibonacci number at that index.
    '''
    if x == 0:
        return 0
    elif x == 1:
        return 1
    else:
        return f(x-1) + f(x-2)
# testing
# f(1)
# f(2)
# f(3)
# myseq = [0,1,2,3,4]
# [f(x) for x in myseq]

[0, 1, 1, 2, 3]
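
For comparison, here is a minimal sketch of what a more fully documented version could look like. The name `fibonacci`, the type hints, and the `ValueError` check are additions made for illustration; they are not part of the original exercise.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, where fibonacci(0) == 0.

    Args:
        n: Zero-based index into the sequence; must be non-negative.

    Returns:
        The Fibonacci number at index n.

    Raises:
        ValueError: If n is negative.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Base cases: the first two values of the sequence are fixed.
    if n == 0:
        return 0
    if n == 1:
        return 1
    # Recursive case: each value is the sum of the two before it.
    return fibonacci(n - 1) + fibonacci(n - 2)


print([fibonacci(n) for n in range(5)])  # [0, 1, 1, 2, 3]
```

The docstring describes what the function does and what it expects, while the inline comments explain why each branch exists.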

### Challenge 4: String manipulation, documentation and type hints
Write a function that takes a string as input and returns the number of vowels in the string.  Test it on the following string: "The quick brown fox jumped over the lazy dog."  The answer should be 12.  Document your function.  Include [type hints (see PEP-484)](https://peps.python.org/pep-0484/).

In [19]:
def count_vowels(s: str) -> int:
    '''
    Return the number of vowels (a, e, i, o, u) in the string s.
    Matching is case-insensitive; "y" is not counted as a vowel.
    input: s: str
    output: int
    '''
    vowels = "aeiou"
    count = 0
    for char in s.lower():
        if char in vowels:
            count += 1
    return count

# testing
# print(count_vowels("cow"))
# print(count_vowels("frank"))
# print(count_vowels("ooo"))


assert count_vowels("hello") == 2, f"There are two vowels in hello, not {count_vowels('hello')}"
assert count_vowels("The quick brown fox jumped over the lazy dog") == 12, f"There are 12 vowels in the quick brown fox jumped over the lazy dog, not {count_vowels('The quick brown fox jumped over the lazy dog')}"


### Challenge 5: Find the mode, write some tests.  
The mode is the most frequently occurring value.  In the case where there are multiple modes, you only need to report one of them.  Document and type-hint your code. Test your code with an assert statement for the following values: 1, 2, 3, 3, 4, 5, 5, 5, 6, 6, 7, 8, 9, 9, 10.

Try to do this on your own without resorting to Googling for a solution.

In [29]:
# insert your code here
# def find_mode(x):
#     counts = []
#     for i in x:
#         counts.append(x.count(i))
#     return x[counts.index(max(counts))] # line suggested by copilot
# testing
# find_mode([1,2,2,3])
# find_mode(["1","1","1", "2","2"])

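# in-class solution
def calculate_mode(numbers: list) -> float:
    '''
    Return the most frequently occurring value in numbers.
    If several values tie, the one seen first is returned.
    '''
    d = {}  # maps each value to the number of times it has been seen
    for i in numbers:
        if i in d:
            d[i] += 1
        else:
            d[i] = 1
    return max(d, key=d.get)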

# testing
# calculate_mode([1,2,2,3])
assert calculate_mode([1, 2, 3, 3, 4, 5, 5, 5, 6, 6, 7, 8, 9, 9, 10]) == 5
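
As an alternative to counting by hand, the standard library's `collections.Counter` can do the tallying. The following is only a sketch: the function name `mode_with_counter` and the empty-input check are assumptions made for illustration, not part of the in-class solution.

```python
from collections import Counter


def mode_with_counter(values: list) -> object:
    """Return one most frequent value in values.

    When several values tie, the one encountered first is returned.
    """
    if not values:
        raise ValueError("values must not be empty")
    # most_common(1) returns a one-element list of (value, count) pairs.
    value, _count = Counter(values).most_common(1)[0]
    return value


data = [1, 2, 3, 3, 4, 5, 5, 5, 6, 6, 7, 8, 9, 9, 10]
assert mode_with_counter(data) == 5, "5 occurs three times and should be the mode"
```

The dictionary-based loop and `Counter` compute the same counts; `Counter` simply packages the bookkeeping.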

### Challenge 6: Calculate the mean temperature for the data in aranet4.csv
Your output should consist of the following:

```Mean temperature: X.XX```

where `X.XX` is the mean temperature rounded to two decimal places.  For example: 27.23 or 4.32 or -1.20.

**NOTE: You should use only base python and the `csv` module to solve this problem.  Do not use pandas or any other libraries.**

The data file should be located in the `data` directory that is a sibling of the directory containing this notebook.  For example, if this notebook is located in `SI_618_WN_24_Files/inclass`, then the data file should be located in `SI_618_WN_24_Files/data`.
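
The sibling-directory relationship described above can be sketched with `pathlib`. This is purely illustrative: it manipulates path strings only and does not touch the file system, and the directory names are the example ones from the paragraph above.

```python
from pathlib import PurePosixPath

# Example notebook location from the text above.
notebook_dir = PurePosixPath("SI_618_WN_24_Files/inclass")

# The data directory is a sibling, so go up one level and then into "data".
data_file = notebook_dir.parent / "data" / "aranet4.csv"

print(data_file)  # SI_618_WN_24_Files/data/aranet4.csv
```

Relative to the notebook's own directory, the same file is reached with `../data/aranet4.csv`.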

In [33]:
filename = '../data/aranet4.csv'

In [57]:
def aranet_mean_temp(filename):
    total_temp = 0
    count = 0
    with open(filename, 'r', newline='') as file:
        csv_reader = csv.reader(file)
        next(csv_reader)  # skip the header row
        for row in csv_reader:
            temperature = float(row[2])  # assumes temp is in the third column (index 2)
            total_temp += temperature
            count += 1
    mean_temp = round(total_temp / count, ndigits=2)
    return mean_temp

# aranet_mean_temp(filename)
assert aranet_mean_temp(filename) == 17.75, "The mean temperature should be 17.75"
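
A possible variation, still using only base Python and the `csv` module, is to look up the temperature column by name with `csv.DictReader` and to print the result in the requested format. This is a sketch under stated assumptions: the sample rows below are made up so the example runs without the data file, the `Time` column and the helper name `mean_of_column` are hypothetical, and only the `Temperature(°C)` column name comes from this notebook.

```python
import csv
import io

# Hypothetical sample rows standing in for aranet4.csv.
sample_csv = io.StringIO(
    "Time,Temperature(°C)\n"
    "t1,17.5\n"
    "t2,18.0\n"
    "t3,17.0\n"
)


def mean_of_column(file_obj, column: str) -> float:
    """Return the mean of the named numeric column in an open CSV file."""
    reader = csv.DictReader(file_obj)  # uses the header row for column names
    values = [float(row[column]) for row in reader]
    return sum(values) / len(values)


mean_temp = mean_of_column(sample_csv, "Temperature(°C)")
print(f"Mean temperature: {mean_temp:.2f}")
```

Looking up columns by name avoids depending on the column order, and the `:.2f` format specifier keeps trailing zeros (for example `-1.20`), which `round()` alone does not.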

Just for fun, here's how to do the above challenge using pandas:

In [58]:
import pandas as pd
aranet_df = pd.read_csv(filename)
print(f"Mean temperature: {aranet_df['Temperature(°C)'].mean():.2f}")

Mean temperature: 17.75


After we review these challenges, we'll take a look at some basic pandas DataFrame functionality.


## END OF NOTEBOOK
Remember to submit this notebook to Canvas in both HTML and IPYNB formats.  Note: if you have difficulty exporting to HTML, just submit the IPYNB file and make sure you reach out to the teaching team to get help.  You will only receive partial credit if you do not submit both formats (HTML and IPYNB) by the due date (see Canvas for due date).