
# Exercise 2: Coding Habits & Functions

This assignment gives you practice writing your own functions and applying the coding best practices discussed in the tutorial.

---

## 1. Summary statistics (4 pts)

Write a function that takes a vector of numbers `x` and returns the length, mean, and standard deviation of `x` as a new vector.

In keeping with our best practices, give the function a short but descriptive name, and use snake case if it involves multiple words.

Hint: Vectors are created in R with the `c()` function.

In [7]:
x <- c(1, 2, 3)

summarize_vector <- function(x) {
  # Return the length, mean, and standard deviation as a named vector
  output <- c(
    length = length(x),
    mean   = mean(x),
    sd     = sd(x)
  )
  return(output)
}

summarize_vector(x)

Calculate the summary statistics of vector `v1`.

*Hint*: Note the `NA` in the vector. `NA` is short for "not available" and is a placeholder for a missing value. Look up the `is.na()` function (and the not operator `!`), and check the documentation of each summary function you use for its option to remove NAs.
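
As a minimal sketch of the tools mentioned in the hint, the snippet below uses a made-up vector `demo` (an illustrative name, not part of the assignment) to show how `is.na()`, `!`, and the `na.rm` argument behave:

```r
# `demo` is a hypothetical vector with one missing value
demo <- c(2, NA, 4)

is.na(demo)          # logical vector flagging the missing entry
!is.na(demo)         # negation: TRUE where a value is present
sum(!is.na(demo))    # number of non-missing values
length(demo)         # length() still counts the NA

mean(demo)                # NA, because one value is missing
mean(demo, na.rm = TRUE)  # mean of the non-missing values only
demo[!is.na(demo)]        # the vector with the NA dropped
```

Counting with `sum(!is.na(x))` rather than `length(x)` keeps the sample size consistent with the values that actually enter the mean and standard deviation.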

In [6]:
v1 <- c(5, 11, 6, NA, 9)

summary_statistics <- function(x) {
  # NA-aware summary; `length` counts only the non-missing values
  output <- c(
    length = sum(!is.na(x)),
    mean   = mean(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE),
    sum    = sum(x, na.rm = TRUE)
  )
  return(output)
}

summary_statistics(v1)

---
## 2. T-test function (4 pts)

The formula for the one-sample t-statistic is:

$$ t = \frac{m - \mu}{s / \sqrt{n}} $$

where $m$ is the sample mean, $\mu$ (mu) is the population mean, $s$ is the sample standard deviation, and $n$ is the sample size.
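
To make the formula concrete, here is a small worked sketch that computes each term separately. The sample `sample_values` and the population mean `mu = 5` are hypothetical numbers chosen so the arithmetic is easy to follow by hand:

```r
# Hypothetical sample and population mean, not taken from the assignment
sample_values <- c(4, 6, 8)
mu <- 5

m <- mean(sample_values)    # sample mean: 6
s <- sd(sample_values)      # sample standard deviation: 2
n <- length(sample_values)  # sample size: 3

standard_error <- s / sqrt(n)        # denominator of the formula
t_stat <- (m - mu) / standard_error  # numerator divided by the standard error

t_stat  # about 0.866
```

Here the numerator is $6 - 5 = 1$ and the denominator is $2 / \sqrt{3} \approx 1.155$, so the statistic is roughly 0.866.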

Using your function above as a starting point, write a new function `ttest_fun` that compares a vector `x` to a given population mean `mu` and calculates the t-statistic. Keep the coding best practices in mind.

Hint: You will need to add another argument for mu.

In [8]:
ttest_fun <- function(x, mu) {
  # Calculate components
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  n <- sum(!is.na(x))
  
  # Calculate t-statistic
  t_stat <- (m - mu) / (s / sqrt(n))
  
  return(t_stat)
}

Use your function to compare the mean of `v1` to 10.

In [9]:
ttest_fun(v1, 10)

---
## 3. Setting default values (2 pts)

Set the default value of mu to 0. Test your modified function below by supplying only `v2` as an argument.
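
As a brief illustration of how default arguments work in R, the sketch below uses `shift_values`, a hypothetical helper that is not part of the assignment. A default is given in the function signature with `=`, and the caller can either omit the argument or override it:

```r
# `shift_values` is a hypothetical helper used only to illustrate defaults
shift_values <- function(x, offset = 0) {
  x - offset
}

shift_values(c(1, 2, 3))              # offset falls back to 0
shift_values(c(1, 2, 3), offset = 2)  # named override
shift_values(c(1, 2, 3), 2)           # positional override, same result
```

Naming the argument in the call is usually easier to read than relying on position, especially once a function has several optional arguments.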

In [10]:
ttest_fun <- function(x, mu = 0) {
  # Calculate components
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  n <- sum(!is.na(x))
  
  # Calculate t-statistic
  t_stat <- (m - mu) / (s / sqrt(n))
  
  return(t_stat)
}

In [11]:
v2 <- c(3, 7, 1, NA, 8, 12)
ttest_fun(v2)

How does your result compare to R's built-in `t.test()` function?

In [12]:
t.test(v2)


	One Sample t-test

data:  v2
t = 3.2059, df = 4, p-value = 0.03272
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
  0.8306107 11.5693893
sample estimates:
mean of x 
      6.2 
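
One way to make this comparison explicit, sketched here under the assumption that `mu` is 0, is to recompute the statistic by hand and check it against the components of the object that `t.test()` returns. The names `t_manual`, `t_builtin`, and `p_manual` are illustrative:

```r
v2 <- c(3, 7, 1, NA, 8, 12)

# Hand-computed statistic, following the formula from section 2 with mu = 0
n <- sum(!is.na(v2))
t_manual <- (mean(v2, na.rm = TRUE) - 0) / (sd(v2, na.rm = TRUE) / sqrt(n))

# t.test() returns a list; `statistic` and `p.value` are among its components
result <- t.test(v2)
t_builtin <- unname(result$statistic)  # drop the "t" name for comparison

all.equal(t_manual, t_builtin)  # TRUE if the two agree within tolerance

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)
all.equal(p_manual, result$p.value)
```

Using `all.equal()` instead of `==` avoids false mismatches caused by floating-point rounding.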


When you are finished, save the notebook as `Exercise2.ipynb`, push it to your class GitHub repository (the one you made for Exercise 1), and send the instructors a link to your notebook via Canvas. To send a message in Canvas, click "Inbox" on the left and then press the icon with a pencil inside a square.

**DUE:** 5pm EST, Jan 29, 2026

**IMPORTANT** Did you collaborate with anyone on this assignment? If so, list their names here.
> *Someone's Name*

**Generative AI Notice** Did you use GenAI tools on this assignment? If so, list each tool, what you used it for, and the prompts you gave it.
> *Insert here*