## QTM 350: Data Science Computing

### Assignment 06 - AI-Assisted Programming

### Due Date: 11:59 PM on Wednesday, October 16, 2024

### Instructions

In this assignment, you will use AI tools to help you generate, refactor, and explain code. Ideally, use GitHub Copilot, but if you cannot install the software, feel free to use other available tools (free or otherwise). Please name the tools you use in your assignment.

The main idea is to use natural language as much as possible, whilst remaining attentive to any mistakes the AI tool may produce. Your code should run without issues and provide the correct results. Please feel free to include tests and screenshots.

As always, should you have any questions, please let me know.

### Tasks

1. Use the `/explain` command in Copilot to get an explanation of the following code snippet:

```python
def fibonacci(n):
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    elif n == 2:
        return [0, 1]
    else:
        fib = [0, 1]
        for i in range(2, n):
            fib.append(fib[i-1] + fib[i-2])
        return fib
```

### Summary at the End of Output:

Key Points

- The function handles the edge cases `n <= 0`, `n == 1`, and `n == 2` explicitly.
- For larger `n`, it builds the sequence in a loop, appending the sum of the previous two elements on each pass.
- This is a straightforward iterative implementation: it performs one addition per new element and returns a list of the first `n` Fibonacci numbers.
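Since the instructions invite tests, the following is a minimal sketch of how the explanation could be checked against the code. The function body is copied from the task; the specific inputs are only illustrative choices.

```python
def fibonacci(n):
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    elif n == 2:
        return [0, 1]
    else:
        fib = [0, 1]
        for i in range(2, n):
            fib.append(fib[i-1] + fib[i-2])
        return fib

# Edge cases named in the explanation
print(fibonacci(0))   # → []
print(fibonacci(1))   # → [0]
print(fibonacci(2))   # → [0, 1]

# General case: each element is the sum of the previous two
print(fibonacci(10))  # → [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```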

2. Create a new Git repository using GitHub CLI and Copilot suggestions. The suggestions should include code to initialise the repository, add a `README.md` and a `.gitignore` file for Python projects, then add, commit, and push the changes to the repository. Please include the link to your repository below.

https://github.com/gabrielischwartz/qtm350_hw6

3. Use Copilot to refactor the following code to improve its efficiency and readability:

```python
def is_prime(num):
    if num <= 1:
        return False
    for i in range(2, num):
        if num % i == 0:
            return False
    return True

primes = []
for i in range(1, 101):
    if is_prime(i):
        primes.append(i)

print(primes)
```

### New version

```python
import math

def is_prime(num):
    if num <= 1:
        return False
    if num == 2:
        return True
    if num % 2 == 0:
        return False
    for i in range(3, int(math.sqrt(num)) + 1, 2):
        if num % i == 0:
            return False
    return True

primes = [i for i in range(1, 101) if is_prime(i)]

print(primes)
```
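One way to stay attentive to AI mistakes is to confirm that a refactor behaves like the original. The sketch below compares the two versions over a range of inputs. The names `is_prime_naive` and `is_prime_sqrt` are hypothetical labels introduced here, and `math.isqrt` (Python 3.8+) is swapped in for `int(math.sqrt(num))` as an assumption, since it avoids floating-point rounding for very large integers.

```python
import math

def is_prime_naive(num):
    # Original version: trial division by every integer below num
    if num <= 1:
        return False
    for i in range(2, num):
        if num % i == 0:
            return False
    return True

def is_prime_sqrt(num):
    # Refactored version: skip even numbers and stop at the integer square root
    if num <= 1:
        return False
    if num == 2:
        return True
    if num % 2 == 0:
        return False
    for i in range(3, math.isqrt(num) + 1, 2):
        if num % i == 0:
            return False
    return True

# Any input where the two versions disagree would show up here
mismatches = [n for n in range(-5, 1001) if is_prime_naive(n) != is_prime_sqrt(n)]
primes = [n for n in range(1, 101) if is_prime_sqrt(n)]

print(mismatches)   # → []
print(len(primes))  # → 25
```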

4. Use the `@terminal` command to generate CLI commands that create a new directory named `data_analysis`, navigate into the directory, create a new Python file called `analysis.py`, and add a shebang line (`#!/usr/bin/env python`) at the top of the file.

`mkdir data_analysis && cd data_analysis && echo '#!/usr/bin/env python' > analysis.py`

5. Write a Python function that calculates the factorial of a number. Deliberately introduce an error in the function, then use Copilot's `/fix` command to identify and correct the issue.

### Version with error

```python
def factorial(num):
 if num < 0:
        raise ValueError("Factorial is not defined for negative numbers")
    elif num == 1:
        return 0
    else:
        return num * factorial(num - 1)
```

### Fixed version

```python
def factorial(num):
    if num < 0:
        raise ValueError("Factorial is not defined for negative numbers")
    elif num == 0 or num == 1:
        return 1
    else:
        return num * factorial(num - 1)
```
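A quick way to confirm the fix is to compare the function with the standard library's `math.factorial`. This is only a sketch: the range of inputs is an arbitrary choice, and the function body is copied from the fixed version above.

```python
import math

def factorial(num):
    if num < 0:
        raise ValueError("Factorial is not defined for negative numbers")
    elif num == 0 or num == 1:
        return 1
    else:
        return num * factorial(num - 1)

# Compare with the standard library for small non-negative integers
results_match = all(factorial(n) == math.factorial(n) for n in range(11))
print(results_match)  # → True

# Negative input should raise ValueError
try:
    factorial(-1)
    raised = False
except ValueError:
    raised = True
print(raised)  # → True
```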

6. Use Copilot to generate documentation comments for the following `R` factorial function:

```r
calculate_factorial <- function(n) {
  # If n is 0, return 1 as the factorial of 0 is 1
  if (n == 0) {
    return(1)
  } else {
    # Otherwise, recursively calculate the factorial
    return(n * calculate_factorial(n - 1))
  }
}
```

7. Use the `/explain` command to get an explanation for the following Python code snippet:

```python
def normalize_data(data):
    min_val = min(data)
    max_val = max(data)
    return [(x - min_val) / (max_val - min_val) for x in data]
```

The provided code defines a function named `normalize_data` that takes a single argument, `data`, which is expected to be a list of numerical values. The purpose of this function is to normalize the data, which here means rescaling the values so that they fall between 0 and 1 (min-max scaling).

The function begins by calculating the minimum value (`min_val`) and the maximum value (`max_val`) in the `data` list using Python's built-in `min` and `max` functions, respectively. These values are essential for the normalization process because they define the range of the data.

Next, the function uses a list comprehension to iterate over each element `x` in the `data` list. For each element, it applies the normalization formula `(x - min_val) / (max_val - min_val)`. This formula shifts each value by the minimum and then divides by the range of the data (`max_val - min_val`), so the smallest value maps to 0 and the largest to 1.

Finally, the function returns the new list of normalized values. This process is useful in various data processing and machine learning tasks where having data within a consistent range can improve the performance and stability of algorithms.
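The explanation does not mention two inputs on which the original function fails: an empty list makes `min` raise `ValueError`, and a list whose values are all equal makes the denominator zero. A minimal sketch of a guarded variant follows. The name `normalize_data_safe` and the choice to map constant input to `0.0` are assumptions made for illustration, not part of the assignment.

```python
def normalize_data_safe(data):
    """Min-max scale a list of numbers to [0, 1] with guards for edge cases."""
    if not data:
        # Assumption: an empty input returns an empty list instead of raising
        return []
    min_val = min(data)
    max_val = max(data)
    span = max_val - min_val
    if span == 0:
        # Assumption: when every value is equal, map them all to 0.0
        return [0.0 for _ in data]
    return [(x - min_val) / span for x in data]

print(normalize_data_safe([2, 4, 6, 10]))  # → [0.0, 0.25, 0.5, 1.0]
print(normalize_data_safe([3, 3, 3]))      # → [0.0, 0.0, 0.0]
print(normalize_data_safe([]))             # → []
```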


8. Ask Copilot to translate the following `R` code to `Python`:

```r
library(tidyverse)

# Create a sample dataframe with missing values
df <- tibble(
  A = c(1, 2, NA, 4, 5),
  B = c(NA, 2, 3, 4, 5),
  C = c(1, 2, 3, NA, 5),
  D = c(1, 2, 3, 4, 5)
)

# Define the function
count_missing_values <- function(df) {
  sapply(df, function(col) sum(is.na(col)))
}

# Test the function
result <- count_missing_values(df)
print(result)
```

```python
import pandas as pd
import numpy as np

# Create a sample dataframe with missing values
data = {
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, 4, 5],
    'C': [1, 2, 3, np.nan, 5],
    'D': [1, 2, 3, 4, 5]
}
df = pd.DataFrame(data)

# Define the function
def count_missing_values(df):
    return df.isna().sum()

# Test the function
result = count_missing_values(df)
print(result)
```

9. Write a function in both `R` and `Python` that takes a string and returns the number of vowels in the string.


```python
def count_vowels(s):
  vowels = "aeiouAEIOU"
  return sum(1 for char in s if char in vowels)

# Test the function
print(count_vowels("Hello World"))  # Output: 3
```

```r
count_vowels <- function(s) {
  vowels <- c('a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U')
  return(sum(strsplit(s, NULL)[[1]] %in% vowels))
}

# Test the function
print(count_vowels("Hello World"))  # Output: 3
```

10. Ask Copilot to translate the following binary code into text:

```verbatim
01011001 01101111 01110101 00100111 01110110 01100101 00100000 01100011 01101111 01101101 01110000 01101100 01100101 01110100 01100101 01100100 00100000 01110100 01101000 01100101 00100000 01100001 01110011 01110011 01101001 01100111 01101110 01101101 01100101 01101110 01110100 00101110 00100000 01000011 01101111 01101110 01100111 01110010 01100001 01110100 01110101 01101100 01100001 01110100 01101001 01101111 01101110 01110011 00100001
```

You've completed the assignment. Congratulations!
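The decoded sentence above can be checked without relying on the AI tool. As a small sketch, the snippet below treats each space-separated group as an 8-bit ASCII code; the variable names are illustrative.

```python
# The binary string from the task, wrapped across lines for readability
bits = """
01011001 01101111 01110101 00100111 01110110 01100101 00100000 01100011
01101111 01101101 01110000 01101100 01100101 01110100 01100101 01100100
00100000 01110100 01101000 01100101 00100000 01100001 01110011 01110011
01101001 01100111 01101110 01101101 01100101 01101110 01110100 00101110
00100000 01000011 01101111 01101110 01100111 01110010 01100001 01110100
01110101 01101100 01100001 01110100 01101001 01101111 01101110 01110011
00100001
"""

# Convert each 8-bit group to an integer, then to its ASCII character
decoded = "".join(chr(int(byte, 2)) for byte in bits.split())
print(decoded)  # → You've completed the assignment. Congratulations!
```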