# Week 3

**Topics**: Introducing functions and modules in Python. Basic introduction to pandas for data analysis, focusing on importing data and initial data exploration.

In [1]:
print('Foo')

Foo


## Functions
We use functions for a few things:
* Reduce duplication in code - use the same function in multiple places in your code.
* Simplify code - breaking down complex code into smaller, separate problems makes the entire code more manageable and maintainable.
* Readability - named functions say specifically what they're going to do, so our program is less cluttered and easier to follow. 

### Scope
This is a new concept for us: depending on where a variable is defined, it may be inaccessible from other parts of the program. Each variable has a specific scope in which it can be used.
* **Global** - variables defined outside of functions, classes, etc., in your program are accessible from everywhere. However, it's bad practice to use global variables from inside of functions, as it makes it hard to follow what data the function uses and can introduce side effects.
* **Functions** - variables defined inside of functions are not visible outside of the function.  This means we don't need to worry about accidentally using a variable from another function when we don't mean to.
* **Classes/Objects** - objects (instances of a class) have their own variables/properties and functions that aren't directly accessible from outside; we reach them through the object.
* **Modules** - modules imported like "import pandas" have their own scope inside of "pandas" that we access via the module name, like "pandas.DataFrame".  If we were to do "from pandas import *", then everything in the pandas namespace would be copied into our global namespace and we could access DataFrame directly.  This can introduce problems, e.g., if multiple modules contain things with the same name. It's better to import specific things into our global namespace if wanted: "from pandas import DataFrame" will only add the DataFrame class to our global namespace.
* And a few other places, such as inside of list comprehensions. (The name bound by "except ... as err" is also removed when the except block ends.)
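
As a minimal sketch of the function and module scopes listed above (the names `add_tax`, `rate`, and `total` are made up for illustration, and the standard-library `math` module stands in for pandas so the example runs without extra installs):

```python
import math              # names stay inside the module's namespace: math.sqrt
from math import pi      # only `pi` is added to our global namespace

def add_tax(price):
    rate = 0.08          # `rate` exists only inside add_tax
    return price * (1 + rate)

total = add_tax(100)     # `total` is a global variable

try:
    print(rate)          # `rate` is not visible out here
    rate_visible = True
except NameError:
    rate_visible = False

print(round(total, 2), rate_visible, math.sqrt(16), round(pi, 2))
```

Trying to read `rate` outside of the function raises a `NameError`, which is what keeps function internals from leaking into the rest of the program.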

What this means to us with regard to functions is that we should pass data the function needs in as arguments, create any variables in the function that we need without worrying about them polluting the namespace of our greater program, and then return the important data from the function with a return call.
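
A small sketch of that advice, using hypothetical function names and made-up values: the first version reaches out to a global variable, while the second receives everything it needs as an argument and hands its result back with `return`.

```python
readings = [12.5, 13.1, 12.9]

def average_from_global():
    # Works, but the reader has to hunt for where `readings` comes from.
    return sum(readings) / len(readings)

def average(values):
    # Everything the function uses arrives through its arguments.
    total = sum(values)  # local variable; it does not leak out of the function
    return total / len(values)

result = average(readings)
print(result)
```

Both versions compute the same number here, but only `average` can be reused on a different list without touching any global state.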

### General Format
Here's how we define a function:

    def function_name(arg1, arg2, ...):
        '''function description in triple quoted block of text.
        This is not mandatory, but is good practice.'''
        function
        code
        here
        return some_value

We can only return one object, but because that object can be a collection like a list or dictionary, we can bundle things to pass them all out.  

In [4]:
def compute_stats_on_numbers(list_of_numbers):
    sum_of_numbers = sum(list_of_numbers)
    count_of_numbers = len(list_of_numbers)
    average_of_numbers = sum_of_numbers / count_of_numbers
    return sum_of_numbers, count_of_numbers, average_of_numbers  # This is a tuple.  The () around it are implied

numbers = [1, 2, 3, 4, 5]
num_sum, num_count, num_avg = compute_stats_on_numbers(numbers)

print(f'The function says - Sum: {num_sum}, Count: {num_count}, Average: {num_avg}')

The function says - Sum: 15, Count: 5, Average: 3.0
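
The same bundling works with a dictionary, which lets the caller look results up by name instead of by position. A sketch, with `compute_stats_as_dict` as a hypothetical variant of the function above:

```python
def compute_stats_as_dict(list_of_numbers):
    '''Return the sum, count, and average bundled in one dictionary.'''
    total = sum(list_of_numbers)
    count = len(list_of_numbers)
    return {'sum': total, 'count': count, 'average': total / count}

stats = compute_stats_as_dict([1, 2, 3, 4, 5])
print(stats['average'])
```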


### Positional Arguments
When we define a function with multiple arguments like this:

    def do_the_thing(pos1, pos2, pos3, ..., posN):

We must pass the function N arguments, in positions corresponding to the function definition.
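
For illustration, here is a sketch with a hypothetical three-argument function, `describe_trip`. The values are matched to the parameters purely by position, so swapping two of them changes the result, and leaving one out raises a `TypeError`.

```python
def describe_trip(origin, destination, miles):
    '''Build a sentence from three positional arguments.'''
    return f'{origin} to {destination}: {miles} miles'

in_order = describe_trip('A', 'B', 3)
swapped = describe_trip('B', 'A', 3)   # same values, different positions
print(in_order)
print(swapped)

try:
    describe_trip('A', 'B')            # only 2 of the 3 required arguments
except TypeError as error:
    print(f'TypeError: {error}')
```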

### Optional Arguments 

### Keyword Arguments

### Arbitrary Arguments
We won't get into this, but look into *args and **kwargs.  You can make a function accept any arguments.  An example use for this is creating your own version of the print function:

    DEBUG = True

    def debug_print(*args, **kwargs):
        if DEBUG:
            print(*args, **kwargs)
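
To see what the function actually receives, here is a rough sketch with a hypothetical helper, `show_arguments`: the positional arguments arrive as a tuple in `args` and the keyword arguments arrive as a dictionary in `kwargs`. `debug_print` is repeated from above so the block runs on its own.

```python
def show_arguments(*args, **kwargs):
    # args is a tuple of the positional arguments;
    # kwargs is a dict of the keyword arguments.
    return args, kwargs

positional, keyword = show_arguments(1, 'two', sep=', ')
print(positional)
print(keyword)

DEBUG = True

def debug_print(*args, **kwargs):
    if DEBUG:
        print(*args, **kwargs)

# Everything is forwarded to print unchanged, including keyword options.
debug_print('x', 'y', sep=' = ')
```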

## Importing Data

### Reading Files

### Reading Files with Pandas

### Query Database

### Query Website with requests
The following example fetches recent observations for one buoy and builds a DataFrame of water temperature indexed by time:

    import requests
    import pandas as pd

    # URL for NDBC buoy data (e.g., Station 46042 - Monterey Bay)
    url = 'https://www.ndbc.noaa.gov/data/realtime2/46042.txt'

    # Send a GET request to fetch the data
    response = requests.get(url, timeout=30)

    # Check if the request was successful
    if response.status_code == 200:
        # Read the data into a pandas DataFrame
        data = response.text.splitlines()
        # The header line starts with '#', so strip it before splitting
        headers = data[0].lstrip('#').split()
        # Skip the second header line and split the remaining rows into fields
        rows = [row.split() for row in data[2:]]
        df = pd.DataFrame(rows, columns=headers)

        # Convert relevant columns to numeric types (non-numeric values become NaN)
        for column in ['YY', 'MM', 'DD', 'hh', 'mm', 'WTMP']:
            df[column] = pd.to_numeric(df[column], errors='coerce')

        # Convert date and time columns to a single datetime column.
        # pd.to_datetime expects columns named year, month, day, hour, minute.
        date_parts = df[['YY', 'MM', 'DD', 'hh', 'mm']].rename(
            columns={'YY': 'year', 'MM': 'month', 'DD': 'day',
                     'hh': 'hour', 'mm': 'minute'})
        df['Datetime'] = pd.to_datetime(date_parts)

        # Set the datetime column as the index
        df.set_index('Datetime', inplace=True)

        # Keep only the relevant columns
        df = df[['WTMP']]

        # Print the DataFrame
        print(df)
    else:
        print(f"Failed to retrieve data: {response.status_code}")

# Week 3 Turtle Challenge
This week, we can use functions to isolate complex operations into small chunks, so that the rest of our code can produce complex behavior while staying simple and readable.
  
#### *Exercise*:
* 