# Base R Part 2
In this lecture we will cover some essential concepts around functions and flow control (loops, apply functions, and if-else statements). We will also explore the distinction between global and local variables, and give a short introduction to simulation (*i.e.*, iteratively filling in each cell using the value of the previous cell). Along the way, we will briefly discuss error handling (*i.e.*, try-catch).

Let's pick up where we left off in Base R Part 1, with our data frame.

In [2]:
# Create the data frame



Remember how we can reference our columns, rows, and cells:

- Column:
    - Version One: `myPpl$var_name`
    - Version Two: `myPpl[, j]` where `j` is our column number
- Row: `myPpl[i, ]` where `i` is our row number
- Cell: `myPpl[i, j]`
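
As a quick refresher, here is a minimal sketch of each kind of reference. It assumes a stand-in version of `myPpl`; the column names `name` and `dist_park` and all values are invented for illustration, and the data frame from Part 1 may look different.

```r
# Hypothetical stand-in for myPpl (column names and values are assumptions)
myPpl <- data.frame(
  name      = c("Andie", "Bridger", "Scott"),
  dist_park = c(2.1, 3.0, 10.4)
)

col_by_name  <- myPpl$dist_park  # column, referenced by name
col_by_index <- myPpl[, 2]       # the same column, referenced by position
row_two      <- myPpl[2, ]       # second row (returned as a one-row data frame)
cell_2_2     <- myPpl[2, 2]      # the cell in row 2, column 2
```

Both column versions return the same vector; which one to use is mostly a matter of readability.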


# Functions

Once you have defined a function, it takes an input, performs a set of operations on it, and then gives you a return value.

Functions are helpful when you have something that you do often.

- Recent example for me
    - Wrote a function to take a date and return the season
    - Wrote a function to get kelvin and return 
- Rule of thumb: if you’re copying and pasting code 3 times or more, make it a function
- I say make a function if you are going to copy and paste at all, because even if you think it’ll only be twice, it’ll probably be more

## Pseudocode Example

my function: y = x + 3; return y 

If you gave this function x=3, what would it return for y? 
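
The pseudocode above could be written in R roughly as follows. The function name `add_three` is made up for this sketch.

```r
# R version of the pseudocode: add 3 to the input and return the result
add_three <- function(x) {
  y <- x + 3
  return(y)
}

# Call the function with x = 3 and check your answer to the question above
add_three(3)
```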

## Let's write a function
Let's say we want to write a function that models the relationship between the probability of someone visiting a national park and the temperature (F). You know that people don't visit the park when it's very cold, nor when it's very hot. You model the relationship using the following quadratic equation:
$$v = F/100 - (F/100)^2$$
where $v$ is the probability of a visit and $F$ is the temperature.

We want to give the function the temperature $F$ and have it return the predicted probability of a visit $v$.

In [3]:
# write function 


# test function
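
One way this cell could be filled in is sketched below. The function name `visit_prob` and the argument name `temp_f` are assumptions; the formula is the quadratic given above.

```r
# Predicted probability of a park visit at temperature temp_f (degrees F)
visit_prob <- function(temp_f) {
  v <- temp_f / 100 - (temp_f / 100)^2
  return(v)
}

# Test the function at a cold, a mild, and a hot temperature
visit_prob(0)    # 0
visit_prob(50)   # 0.25
visit_prob(100)  # 0
```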



So we can see that as the temperature increases from 0F to 50F, the probability of someone taking a trip increases up to 25\%. However, after 50F, the probability of taking a trip begins to decrease. 

# Global vs Local Variables

In most programming languages (including R and Python), variables can have different scopes, meaning they can be either global or local. Understanding the distinction between global and local variables is crucial for writing functions and controlling the flow of your code.

- **Global Variables**: These are variables that are defined in the main body of your script and can be accessed from anywhere in the script.
- **Local Variables**: These are variables that are defined within a function and can only be accessed from within that function.

Let's look at an example to illustrate this distinction.

In [4]:
# Global variable

# Function to demonstrate local variable

# Call the function

# Try to print the local variable outside the function (this will cause an error)
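
A minimal sketch of this cell, with made-up names (`global_var`, `local_var`, `show_scope`). The failing `print()` call is left as a comment so the block runs from top to bottom; uncomment it to see the error.

```r
# Global variable: defined in the main body of the script
global_var <- 10

# local_var is created inside the function and exists only while it runs
show_scope <- function() {
  local_var <- 5
  global_var + local_var  # the function can still read the global variable
}

# Call the function
show_scope()

# local_var was never created outside the function
exists("local_var")  # FALSE in a fresh session
# print(local_var)   # would stop with an error: the object is not found
```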


# Error Handling
Now that we have purposefully thrown an error, we can discuss error handling.

Error handling is an important aspect of programming that allows you to manage and respond to errors in a controlled way. In R, you can use functions like `try()` and `tryCatch()` to handle errors. These are very useful in loops or apply functions, when you're concerned some observations may cause an error but you'd like your code to attempt all iterations.

- **try()**: This function allows you to run a piece of code and catch any errors that occur.
    - It returns the result of the code if it runs successfully, or an error object (of class `"try-error"`) if an error occurs.
    - By returning an error object instead of stopping, you're able to keep moving forward.
- **tryCatch()**: This function provides more control over error handling by allowing you to specify different actions for different types of conditions (errors, warnings, messages).
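
A short sketch of both functions, under made-up assumptions: the `inputs` list and the helper `safe_log` are hypothetical, and the character value is included only because `log()` throws an error on it.

```r
# Hypothetical inputs: the character value makes log() throw an error
inputs <- list(10, "oops", 100)

# try(): the loop keeps going even when one iteration fails
results_try <- vector("list", length(inputs))
for (i in seq_along(inputs)) {
  results_try[[i]] <- try(log(inputs[[i]]), silent = TRUE)
}
inherits(results_try[[2]], "try-error")  # TRUE: the failure was stored, not fatal

# tryCatch(): decide what to return for each type of condition
safe_log <- function(x) {
  tryCatch(
    log(x),
    error   = function(e) NA_real_,  # on an error, return NA
    warning = function(w) NA_real_   # on a warning, also return NA
  )
}
results_catch <- sapply(inputs, safe_log)
results_catch  # the second element is NA
```

Returning `NA` is just one design choice; a handler could also print a message or record which observation failed.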

## How to handle errors for now (the less technical way)

While we won't go into detailed examples here, it's important to be aware of these functions and how they can be used to make your code more robust and error-tolerant.

As a beginner programmer, encountering errors is a common and essential part of the learning process. Here are some strategies to effectively address errors:

1. **Read Error Messages Carefully**:
    - Error messages often provide valuable information about what went wrong and where. Take the time to read and understand them.

2. **Google the Error**:
    - Copy and paste the error message into a search engine. Often, you will find forums, blog posts, or documentation that address similar issues.

3. **Use Stack Overflow**:
    - Stack Overflow is a popular platform where developers ask and answer programming questions. Search for your error or ask a new question if you can't find a solution. Be sure to provide a clear and concise description of your problem, including relevant code snippets.

4. **Consult Documentation**:
    - Official documentation for the programming language or library you are using can be very helpful. It often includes examples and explanations of common errors.

5. **Cautiously Leverage AI Tools**:
    - AI tools like ChatGPT can provide code suggestions and help identify potential issues in your code. These tools can be particularly useful for beginners who are still learning the syntax and best practices.
    - However, be **very careful** to not use these resources as a crutch. If you depend on them too much, you will miss the opportunity to learn how to code for yourself!

6. **Ask for Help**:
    - Don't hesitate to ask for help from more experienced programmers, whether they are colleagues, mentors, or members of online communities. Providing a clear explanation of your problem and what you have tried so far will increase your chances of getting useful assistance and is a good exercise in identifying your problem.

7. **Practice Debugging**:
    - Debugging is a skill that improves with practice.
    - It sometimes feels like suffering, but you will become a better programmer by struggling through your own errors.




# Loops

- for loops: iterate through a task a set number of times
- Consider these loops (pseudocode):
    - For (i in 1 through 4) { print i }
    - For (i in 1 through 4) { print i / 4}
- Can be helpful when
    - Iterating through a column of data and doing something to each row
    - Constructing a new column and building each row from scratch
    - Running a simulation model
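
The pseudocode loops above translate to R roughly as follows. This is a sketch of the idea, not necessarily the exact code used in the lecture cells.

```r
# Without a loop: one line per value
print(1)
print(2)
print(3)
print(4)

# The same output with a for loop
for (i in 1:4) {
  print(i)
}

# Any operation you can run outside a loop can also run inside one
for (i in 1:4) {
  print(i / 4)
}
```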

In [5]:
# Complicated code that is simplified by the loop


In [6]:
# the following loop does the exact same thing


In [7]:
# Operations or code run outside of a for loop can be run inside of a for loop


## Let's combine a loop with our earlier function

In [8]:
# First, let's create a vector of the max_temps we want to get visits for


# combining loop and our function
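
A sketch of how the loop and the function might be combined. The function is redefined here so the block runs on its own; the name `visit_prob` and the temperatures in `max_temps` are assumptions chosen for illustration.

```r
# Visit probability from the quadratic model (redefined so this block is self-contained)
visit_prob <- function(temp_f) {
  temp_f / 100 - (temp_f / 100)^2
}

# Hypothetical maximum temperatures (degrees F)
max_temps <- c(20, 40, 60, 80)

# Pre-allocate a vector to hold one result per temperature
visits <- numeric(length(max_temps))

# Fill in the results one element at a time
for (i in seq_along(max_temps)) {
  visits[i] <- visit_prob(max_temps[i])
}

visits  # 0.16 0.24 0.24 0.16
```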


## Iterating over a data frame
Let's recall our data frame `myPpl`. We can use a for loop to iteratively change each cell in a column.

Let's say all three of our people, Andie, Bridger and Scott, move one mile away from their nearest park (bummer). We could do this one line at a time, like this:

In [9]:
# the [i] here is indicating which row we are editing

# print df


This worked, but there is an easier way to do it with a loop. The loop will help minimize the chance of making an error and shorten the amount of code we need to write to achieve our goal.

In [10]:
# do the same thing with a loop instead of copy and pasting 
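
The sketch below shows both approaches side by side. It assumes a stand-in `myPpl` with a hypothetical distance column `dist_park`; the values are invented, and the names `myPpl_a` and `myPpl_b` are used here only to keep the two versions separate.

```r
# Hypothetical stand-in for myPpl (column names and values are assumptions)
myPpl <- data.frame(
  name      = c("Andie", "Bridger", "Scott"),
  dist_park = c(2.1, 3.0, 10.4)
)

# Version a: one line per row; the [i] indicates which row is being edited
myPpl_a <- myPpl
myPpl_a$dist_park[1] <- myPpl_a$dist_park[1] + 1
myPpl_a$dist_park[2] <- myPpl_a$dist_park[2] + 1
myPpl_a$dist_park[3] <- myPpl_a$dist_park[3] + 1

# Version b: the same edit written once inside a loop
myPpl_b <- myPpl
for (i in seq_len(nrow(myPpl_b))) {
  myPpl_b$dist_park[i] <- myPpl_b$dist_park[i] + 1
}

# Check that the two versions give the same data frame
identical(myPpl_a, myPpl_b)  # TRUE
```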




We can see that both versions (a versus b) worked the same. However, the for loop simplifies the approach. 

# Loops for Simulation 

In [11]:
# Initialize parameters

# Create a numeric vector for the 50 years we will simulate


# Create a numeric vector for the 50 years of temperatures we will simulate

# Simulate the temperature change over the n_year 
# notice this will start in year TWO

# Create a data frame to store the results

# Print the results
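
One way the simulation outlined above could look is sketched below. Only the 50-year horizon and the start in year two come from the comments; the starting temperature `start_temp` and the yearly increase `warming` are invented parameters for illustration.

```r
# Initialize parameters (start_temp and warming are assumed values)
n_year     <- 50   # number of years to simulate
start_temp <- 60   # assumed temperature in year 1 (degrees F)
warming    <- 0.1  # assumed temperature increase per year (degrees F)

# Numeric vector for the 50 years we will simulate
year <- 1:n_year

# Numeric vector for the 50 temperatures; only year 1 is known up front
temp <- numeric(n_year)
temp[1] <- start_temp

# Each year is built from the previous year's value, so the loop starts in year TWO
for (t in 2:n_year) {
  temp[t] <- temp[t - 1] + warming
}

# Store the results in a data frame and look at the first and last rows
sim <- data.frame(year = year, temp = temp)
head(sim)
tail(sim)
```

The loop starts at 2 because `temp[t - 1]` does not exist for the first year. The result could then be plotted with, for example, `plot(sim$year, sim$temp, type = "l")`.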


In [12]:
# print tail 

In [13]:
# Plot the results


# Apply functions 
Apply functions in R are powerful tools for performing operations on data structures like vectors, lists, and data frames. They are often more concise than loops, and sometimes faster. Here is an example using the `apply()` function to calculate the mean of each column in a data frame.

Say you are studying three national parks (Yellowstone, Glacier, and Yosemite). You have been collecting data at each, and you want to quickly find the average temperature at each site. 

Let's do this two ways. First, we will find the mean for each park (column) one line at a time. Second, we will use an apply function to iterate across the columns.


In [14]:
# Create a sample data frame


In [15]:
# Calculate the mean temperature for each park


# Print the results
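
A sketch of the one-line-at-a-time approach. The data frame name `park_temps` and all temperature readings are invented for illustration; they are not real measurements.

```r
# Hypothetical temperature readings (degrees F), one column per park
park_temps <- data.frame(
  Yellowstone = c(60, 65, 70),
  Glacier     = c(50, 55, 60),
  Yosemite    = c(70, 75, 80)
)

# Calculate the mean temperature for each park, one line per park
mean_yellowstone <- mean(park_temps$Yellowstone)
mean_glacier     <- mean(park_temps$Glacier)
mean_yosemite    <- mean(park_temps$Yosemite)

# Print the results
print(c(Yellowstone = mean_yellowstone,
        Glacier     = mean_glacier,
        Yosemite    = mean_yosemite))
```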


Now let's use the `apply()` function to calculate the mean temperature for each park in a more efficient way.

An apply function lets you quickly apply a function to either:

1. every row in the dataframe 
2. every column in the dataframe 

In our case, we want the mean of every column. 

In [16]:
# Calculate the mean temperature for each park using apply function

# Print the results
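
A sketch of the `apply()` version, using the same invented `park_temps` data frame (redefined here so the block runs on its own). The second argument of `apply()` is the margin: `1` means rows and `2` means columns.

```r
# Hypothetical temperature readings (degrees F), one column per park
park_temps <- data.frame(
  Yellowstone = c(60, 65, 70),
  Glacier     = c(50, 55, 60),
  Yosemite    = c(70, 75, 80)
)

# MARGIN = 2 applies mean() to every column; MARGIN = 1 would apply it to every row
park_means <- apply(park_temps, 2, mean)

# Print the results: a named vector with one mean per park
park_means
```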


# If Else Statements 

- Sometimes you want to execute a task ONLY if a certain condition is met
- Let's return to our myPpl dataset one last time (for today):
    - Our RA did not record the *original* distances from parks correctly for women and non-binary people
    - All women and non-binary people are actually 0.25 miles closer to parks than we thought
- What would the correct DF look like?
    - If statements let you fix a mistake like this
    - Also demonstrates why the Boolean (true/false or indicator) variable is so powerful

In [17]:
# goes through each row and changes distance if someone is not male
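
A sketch of how this correction might look. It assumes a stand-in `myPpl` with a hypothetical Boolean column `male` and a distance column `dist_park`; the column names, the values, and which person is coded `TRUE` or `FALSE` are all invented for illustration. The corrected distances go into a new column, `dist_corrected`, so the original values stay visible.

```r
# Hypothetical stand-in for myPpl (column names and values are assumptions)
myPpl <- data.frame(
  name      = c("Andie", "Bridger", "Scott"),
  male      = c(FALSE, TRUE, FALSE),  # Boolean indicator, invented for this sketch
  dist_park = c(2.1, 3.0, 10.4)
)

# New column to hold the corrected distances
myPpl$dist_corrected <- NA_real_

# Go through each row and change the distance if someone is not male
for (i in seq_len(nrow(myPpl))) {
  if (!myPpl$male[i]) {
    myPpl$dist_corrected[i] <- myPpl$dist_park[i] - 0.25
  } else {
    myPpl$dist_corrected[i] <- myPpl$dist_park[i]
  }
}

myPpl
```

Because `male` is already `TRUE` or `FALSE`, it can go straight into the `if` condition without any comparison.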
