# Exercise 2: Coding Habits & Functions

This assignment will give you some practice writing your own functions and using the coding best practices discussed in the tutorial.

---

## 1. Summary statistics (4 pts)

Write a function that takes a vector of numbers `x` and returns the length, mean, and standard deviation of `x` as a new vector.

In keeping with our best practices, give the function a short but descriptive name, and use snake case if it involves multiple words.

Hint: Vectors are created in R using the `c()` function.

In [20]:
#Write your function here
#this function takes as input a vector of numbers and outputs: length, mean, st.dev as a new vector
summary_stats <- function(x) {
  #if NA is present in the vector, remove it before further analysis
  if (any(is.na(x))) {
    x <- x[!is.na(x)]
  }

  #calculate summary stats
  vec_length <- length(x)
  vec_mean <- mean(x)
  vec_sd <- sd(x)

  output_vector <- c(vec_length, vec_mean, vec_sd)
  return(output_vector)
}

# ex_vec = c(1,2,3,4)
# print(summary_stats(ex_vec))

Calculate the summary statistics of vector `v1`.

*Hint*: Note the "NA" in the vector. This is short for "not available" and is a placeholder for missing values. You'll want to look up the _is.na_ function (and the not operator _!_), as well as options for removing NA's in the functions that you will use to summarize the vector. Look at the documentation for the functions that you will use to see how to work with NA's.
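The two approaches mentioned in the hint can be sketched as follows. This is a minimal illustration on a made-up vector, `ex_vec` (an assumption for demonstration only, not part of the assignment data); it shows filtering with `is.na()` and `!`, and the `na.rm` argument that `mean()` and `sd()` accept.

```r
# Hypothetical example vector with one missing value
ex_vec <- c(2, 4, NA, 6)

# is.na() returns TRUE wherever a value is missing
is.na(ex_vec)                # FALSE FALSE TRUE FALSE

# The not operator (!) flips the logical vector, keeping only non-missing values
ex_clean <- ex_vec[!is.na(ex_vec)]
length(ex_clean)             # 3

# mean() and sd() propagate NA unless na.rm = TRUE is supplied
mean(ex_vec)                 # NA
mean(ex_vec, na.rm = TRUE)   # 4
sd(ex_vec, na.rm = TRUE)     # 2

# length() has no na.rm argument, so count non-missing values explicitly
sum(!is.na(ex_vec))          # 3
```

Filtering once at the top of a function keeps `length()`, `mean()`, and `sd()` consistent with each other, whereas `na.rm = TRUE` has to be repeated in each call and does not help with `length()`.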

In [25]:
v1  <- c(5, 11, 6, NA, 9)
summary_stats(v1)

---
## 2. T-test function (4 pts)

The formula for the one-sample t-statistic is:

$$ t = \frac{m - \mu}{\frac{s}{\sqrt{n}}} $$

Where m is the sample mean, $\mu$ (mu) is the population mean, s is the standard deviation, and n is the sample size.
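As a quick sanity check of the formula, here is the arithmetic with made-up numbers (all four values below are assumptions chosen so the result is easy to verify by hand, not values from this assignment).

```r
# Made-up summary values, chosen for easy mental arithmetic
m  <- 12   # sample mean
mu <- 10   # population mean
s  <- 4    # sample standard deviation
n  <- 16   # sample size

# Denominator: the standard error of the mean, s / sqrt(n)
std_err <- s / sqrt(n)        # 4 / 4 = 1

# t-statistic: distance of the sample mean from mu, in standard errors
t_stat <- (m - mu) / std_err  # 2 / 1 = 2
t_stat
```

A t-statistic of 2 here means the sample mean sits two standard errors above the population mean.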

Using your function above as a starting point, write a new function `ttest_fun` that compares a vector `x` to a given population mean `mu` and calculates the t-statistic. Keep the coding best practices in mind.

Hint: You will need to add another argument for mu.

In [28]:
#draft of ttest_fun (will paste below)
ttest_fun <- function(x, mu) {
  x_stats <- summary_stats(x) # reminder (for indexing): summary_stats outputs vector length, mean, st.dev in this order

  ttest_num <- x_stats[2] - mu # I separated the numerator and denominator to reduce the amount of nested operations in one line of code
  ttest_denom <- x_stats[3] / sqrt(x_stats[1])

  ttest <- ttest_num / ttest_denom
  return(ttest)
}


---
---
### If you use Generative AI (optional)
If you use code-assisting generative AI (e.g., Copilot, GPT, Claude; more than simple google searching), follow the steps below. If you **do not** use generative AI, skip this section and go directly to the coding part.

Your workflow should be
1. Write pseudocode
2. Generate code using generative AI
3. Compare and verify the code


**Definition**: Pseudocode is a short description of an algorithm written in plain, structured steps that a human (or an AI!) can easily read and translate into real code. Pseudocode usually does not run as written. It does not need to be perfect; its purpose is to help you organize your approach and to provide clear instructions to the AI.

You may write pseudocode as comments (starting with #) inside a code cell or as plain text.
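To illustrate the comment style, here is a sketch of pseudocode for an unrelated, hypothetical helper called `value_range` (the name and the task are assumptions made up for this example), followed by one possible translation into R.

```r
# Pseudocode for a hypothetical helper, value_range():
# 1. Take a numeric vector x as input
# 2. Remove any missing values (NA) from x
# 3. Find the smallest and the largest remaining value
# 4. Return the largest value minus the smallest value

# One possible translation of the steps above into R
value_range <- function(x) {
  x <- x[!is.na(x)]   # step 2
  max(x) - min(x)     # steps 3 and 4
}

value_range(c(4, NA, 10, 7))  # 6
```

Each pseudocode step maps onto a line or two of real code, which makes it easier to check later whether generated code does what was asked.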

#### Step 1: Write pseudocode

Write pseudocode for `ttest_fun` below.

In [None]:
# Write out your pseudo code here

#### Step 2: Generate code using generative AI
Using your pseudocode, prompt a generative AI to create or fix the R function so it follows the instructions. Then paste **only the generated function** below (not the full AI response).

In [None]:
# Paste generated function here

#### Step 3: Compare and verify the code

Compare your pseudocode and the AI-generated code. Answer the questions below concisely (a few bullet points are enough).

**Question 1.** What is different between your pseudocode and the generated code?
(e.g., it handled NAs differently, used a different formula arrangement, etc.)

**Question 2.** What are the input parameters and the output of the function?
(Answer to show you understand what the function takes in and returns. Be concise.)

**Question 3.** Does anything have to be changed? (Yes / No)

Write your answer here
##### Question 1.
-

##### Question 2.
-
##### Question 3.
- Yes / No

If you answered yes to Question 3, revise the code (either using GenAI or on your own). Don't forget to paste your final version in the code block below.

---
---

In [29]:
# write ttest_fun here
ttest_fun <- function(x, mu) { # inputs: x (numeric vector), mu (population mean to compare against)
  x_stats <- summary_stats(x) # reminder (for indexing): summary_stats outputs vector length, mean, st.dev in this order

  ttest_num <- x_stats[2] - mu # I separated the numerator & denominator to reduce the amount of nested operations in one line of code
  ttest_denom <- x_stats[3] / sqrt(x_stats[1])

  ttest <- ttest_num / ttest_denom
  return(ttest)
}

Use your function to compare the mean of v1 to 10.

In [32]:
ttest_fun(v1, 10)

---
## 3. Setting default values (2 pts)

Set the default value of mu to 0. Test your modified function below by supplying only `v2` as an argument.
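For reference, a default value is set in the function signature with `argument = value`. The sketch below uses a hypothetical function, `scale_by`, that is unrelated to the t-test (the name and behavior are assumptions for illustration only).

```r
# Hypothetical function: 'factor' defaults to 2 when the caller omits it
scale_by <- function(x, factor = 2) {
  x * factor
}

scale_by(c(1, 2, 3))               # uses the default: 2 4 6
scale_by(c(1, 2, 3), factor = 10)  # overrides the default: 10 20 30
```

Arguments with defaults are usually placed after the required ones, so the function can be called with just the required inputs.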

In [33]:
# Write your modified ttest_fun here
ttest_fun <- function(x, mu = 0) { # inputs: x (numeric vector), mu = 0 by default, but overridden if a 2nd argument is supplied when calling ttest_fun
  x_stats <- summary_stats(x) # reminder (for indexing): summary_stats outputs vector length, mean, st.dev in this order

  ttest_num <- x_stats[2] - mu # I separated the numerator & denominator to reduce the amount of nested operations in one line of code
  ttest_denom <- x_stats[3] / sqrt(x_stats[1])

  ttest <- ttest_num / ttest_denom
  return(ttest)
}

In [36]:
v2 <- c(3, 7, 1, NA, 8, 12)
ttest_fun(v2)

How does your result compare to R's built-in `t.test()` function?

In [37]:
t.test(v2)


	One Sample t-test

data:  v2
t = 3.2059, df = 4, p-value = 0.03272
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
  0.8306107 11.5693893
sample estimates:
mean of x 
      6.2 
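One way to make the comparison explicit, rather than reading the numbers by eye, is to pull the statistic out of the object that `t.test()` returns and check it against a hand-computed value. The sketch below recomputes the t-statistic directly from the formula so that it is self-contained; `v2_clean` and `manual_t` are names introduced here for illustration.

```r
v2 <- c(3, 7, 1, NA, 8, 12)

# t.test() returns a list-like object; the t-statistic is stored in $statistic
builtin <- t.test(v2)
builtin$statistic             # named numeric value, labelled "t"
builtin$parameter             # degrees of freedom (n - 1 after dropping the NA)

# Recompute the statistic by hand with mu = 0, after removing the NA
v2_clean <- v2[!is.na(v2)]
manual_t <- (mean(v2_clean) - 0) / (sd(v2_clean) / sqrt(length(v2_clean)))

# all.equal() compares numbers with a small tolerance for floating-point error
all.equal(manual_t, unname(builtin$statistic))
```

`unname()` strips the "t" label so that `all.equal()` compares only the numeric values.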


When you are finished, save the notebook as Exercise2.ipynb, push it to your class GitHub repository (the one you made for Exercise 1) and send the instructors a link to your notebook via Canvas. You can send messages via Canvas by clicking "Inbox" on the left and then pressing the icon with a pencil inside a square.

**DUE:** 5pm EST, Jan 29, 2026

**IMPORTANT** Did you collaborate with anyone on this assignment? If so, list their names here.
> I didn't collab with anyone, but a note on AI use: While I didn't generate pseudo code and then use AI to generate the actual code, I did use AI in 1)googling things (syntax for specific functions in R) and using the AI summary that comes up on top of Google now; 2)Colab's built-in Gemini feature to help me correct the R syntax after I wrote it. So while I didn't use genAI for actual code generation, I think I used it more than "just to google things," so I thought I'd let you know. 