# Introduction to R and Jupyter Notebooks
------------------------

Welcome to the first module on Data Analysis and Visualization in R. In this module, we'll introduce you to the basics of R programming and how to use Jupyter Notebooks for interactive coding and data analysis.

## Learning Objectives
1. [Introduction to R and Jupyter Notebooks](#introduction)
2. [Basic R Syntax](#basic-syntax)
3. [Data Structures in R](#data-structures)
4. [Data Import/Export](#data-import-export)
5. [Practical Exercises](#exercises)
6. [Quiz](#quiz)





<a id='introduction'></a>
## 1. Introduction to R and Jupyter Notebooks

### What is R?
R is a programming language and software environment designed specifically for statistical computing and graphics. It is widely used by statisticians and data analysts for analyzing data and developing statistical software.

### What are Jupyter Notebooks?
Jupyter Notebooks provide an interactive environment where you can combine code execution, rich text, mathematics, plots, and media. They are great for exploratory data analysis, sharing code, and creating interactive tutorials.

### Why Use R in Jupyter Notebooks?
- Interactive Coding: Run code snippets and see results immediately.
- Visualization: Easily create and display plots inline.
- Documentation: Combine code with explanations, making it ideal for learning and sharing.

<a id='basic-syntax'></a>
## 2. Basic R Syntax

Let's start by learning some basic R syntax. Because we are running R from within Python, each code cell needs to begin with `%%R` so that its contents run in the R environment.

### Variables and Data Types
In R, you assign values to variables using the `<-` operator. Alternatively, you can use the `=` sign, e.g. `x = 10`.

In [None]:

# Step 1: Assigning numeric values
x <- 10           # Assigning a whole number to x (stored as "numeric"; write 10L for a true integer)
y <- 5.5          # Assigning a floating-point number to variable y

# Step 2: Assigning a character string
name <- "John Doe"  # Assigning a character string to variable name

# Step 3: Assigning a logical value
is_student <- TRUE   # Assigning a logical value (TRUE or FALSE) to variable is_student

# Step 4: Accessing the variables
# You can access the value of a variable by simply typing its name
print(x)            # Prints the value of x (10)
print(y)            # Prints the value of y (5.5)
print(name)         # Prints the value of name ("John Doe")
print(is_student)   # Prints the value of is_student (TRUE)

# Step 5: Check data types of the variables
# The class() function returns the data type of the variable
print(class(x))            # Prints the class of x (should be "numeric")
print(class(y))            # Prints the class of y (should be "numeric")
print(class(name))         # Prints the class of name (should be "character")
print(class(is_student))   # Prints the class of is_student (should be "logical")
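
As a short aside on numeric types, here is a minimal sketch of how R distinguishes doubles from integers. The variable names `n_double`, `n_int`, and `as_text` are illustrative only and do not appear elsewhere in this module.

```r
# A bare number is stored as a double, which class() reports as "numeric"
n_double <- 10
print(class(n_double))        # "numeric"

# The L suffix asks R for a true integer
n_int <- 10L
print(class(n_int))           # "integer"

# Both kinds count as numeric when it comes to arithmetic
print(is.numeric(n_double))   # TRUE
print(is.numeric(n_int))      # TRUE

# Types can be converted explicitly with the as.*() family of functions
as_text <- as.character(n_int)
print(class(as_text))         # "character"
```

In everyday analysis the distinction rarely matters, which is why `class(x)` reports "numeric" for a value typed as `10`.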

### Basic Arithmetic Operations
You can perform arithmetic operations using the standard operators:

In [None]:
# Addition
sum <- x + y

# to get the value of the variable 'sum' you can just type:
sum

In [None]:
# or 
print(sum)

In [None]:
# or to add more context you can use paste to mix text with variables

print(paste("Sum:", sum))

In [None]:
# Subtraction
diff <- x - y
print(paste("Difference:", diff))

# Multiplication
prod <- x * y
print(paste("Product:", prod))

# Division
quot <- x / y
print(paste("Quotient:", quot))

# Exponentiation
power <- x ^ 2
print(paste("Power:", power))
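
Beyond the operators shown above, R also offers modulo and integer division, and it follows the usual precedence rules. The following is a small sketch under the same assumptions (`x` is 10 and `y` is 5.5); the names `remainder`, `int_quot`, and `result` are illustrative only.

```r
x <- 10
y <- 5.5

# Modulo: the remainder after division
remainder <- x %% 3
print(remainder)          # 1

# Integer division: the quotient with the remainder discarded
int_quot <- x %/% 3
print(int_quot)           # 3

# Exponentiation is evaluated before multiplication, which comes before addition
result <- 2 + 3 * x ^ 2   # 2 + (3 * 100)
print(result)             # 302

# round() keeps printed values readable when mixing text and numbers
print(paste("Quotient:", round(x / y, 2)))   # "Quotient: 1.82"
```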

<a id='data-structures'></a>
## 3. Data Structures in R

R has several data structures to store and manipulate data.

### Vectors
A vector is a sequence of data elements of the same basic type.

In [None]:
# Numeric vector
numbers <- c(1, 2, 3, 4, 5)
print(numbers)

# Character vector
fruits <- c("apple", "banana", "cherry")
print(fruits)

# Logical vector
flags <- c(TRUE, FALSE, TRUE, TRUE)
print(flags)

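# Accessing elements (R indexing starts at 1)
print(numbers[3])      # The third element: 3
print(numbers[-3])     # Everything except the third element: 1 2 4 5
print(numbers[2:4])    # Elements 2 through 4: 2 3 4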
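
Because every element of a vector shares one type, operations apply to the whole vector at once. The sketch below illustrates this with the `numbers` vector from above; `doubled`, `big`, and `mixed` are illustrative names.

```r
numbers <- c(1, 2, 3, 4, 5)

# Arithmetic is applied element by element
doubled <- numbers * 2
print(doubled)            # 2 4 6 8 10

# A comparison yields a logical vector, which can be used as an index
big <- numbers[numbers > 3]
print(big)                # 4 5

# Common summary functions work on the whole vector
print(length(numbers))    # 5
print(mean(numbers))      # 3

# Mixing types forces R to coerce everything to one common type
mixed <- c(1, "apple", TRUE)
print(class(mixed))       # "character"
```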

### Matrices
A matrix is a two-dimensional array where each element has the same data type.

In [None]:

# Creating a matrix (by default, values fill the matrix column by column)
mat <- matrix(1:9, nrow = 3, ncol = 3)
print(mat)
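
Matrix elements are addressed as `[row, column]`. The following sketch shows indexing and the effect of the `byrow` argument of `matrix()`; the name `by_row` is illustrative only.

```r
# Default fill order is column by column
mat <- matrix(1:9, nrow = 3, ncol = 3)
print(dim(mat))        # 3 3
print(mat[2, 3])       # Row 2, column 3: 8
print(mat[1, ])        # The whole first row: 1 4 7

# byrow = TRUE fills row by row instead
by_row <- matrix(1:9, nrow = 3, byrow = TRUE)
print(by_row[2, 3])    # Row 2, column 3: 6

# Transposing the column-filled matrix gives the row-filled one
print(all(t(mat) == by_row))   # TRUE
```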

### Lists
A list is a collection of objects that can be of different types.

In [None]:

# Creating a list
my_list <- list(name = "Alice", age = 25, scores = c(90, 85, 92))
print(my_list)



In [None]:

# Accessing list elements
print(my_list$name)
print(my_list$age)
print(my_list$scores)
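
Besides `$`, lists can be indexed with double or single brackets, and the two behave differently. This is a minimal sketch using the same `my_list`; the `city` element and the name `sub` are hypothetical additions for illustration.

```r
my_list <- list(name = "Alice", age = 25, scores = c(90, 85, 92))

# Double brackets extract the element itself
print(my_list[["scores"]])
print(class(my_list[["scores"]]))   # "numeric"

# Single brackets return a smaller list that still wraps the element
sub <- my_list["scores"]
print(class(sub))                   # "list"

# New elements can be added by assigning to a new name
my_list$city <- "Paris"             # hypothetical field, for illustration only
print(names(my_list))               # "name" "age" "scores" "city"

# Extracted elements can be passed straight to other functions
print(mean(my_list$scores))         # 89
```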

### Data Frames
A data frame is a table where each column can contain different types of data.

In [None]:
# Creating a data frame
df <- data.frame(
  id = 1:3,
  name = c("John", "Jane", "Doe"),
  score = c(88, 92, 95)
)
print(df)

# Accessing Data Frame Elements
print(df$name)

In [None]:
# Accessing a row of the Data Frame
print(df[2, ])

In [None]:
# Accessing a column of the Data Frame by position
print(df[,2])

In [None]:
# Accessing a column of the Data Frame by name
print(df[,'score'])
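
Two other everyday operations are filtering rows by a condition and adding a derived column. The sketch below rebuilds the same `df` so it runs on its own; `high` and the `score_pct` column are illustrative names, not part of the original data.

```r
df <- data.frame(
  id = 1:3,
  name = c("John", "Jane", "Doe"),
  score = c(88, 92, 95)
)

# Inspect the structure: column names, types, and a preview of the values
str(df)
print(nrow(df))    # 3
print(ncol(df))    # 3

# Keep only the rows where the condition is TRUE
high <- df[df$score > 90, ]
print(high)

# Assigning to a new name with $ adds a column
df$score_pct <- df$score / 100    # hypothetical derived column
print(df)
```

Note the comma in `df[df$score > 90, ]`: the condition selects rows, and the empty slot after the comma keeps all columns.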

<a id='data-import-export'></a>
## 4. Data Import/Export

### Reading Data from Files
You can read data from various file formats.

In [None]:

# Reading a CSV file
# Uncomment and modify the path as needed
# data <- read.csv("data/sample_data.csv")
# head(data)

# Reading an Excel file
# Uncomment and run these lines if you have Excel files to read
# install.packages("readxl")   # Only needed once per R installation
# library(readxl)
# data <- read_excel("data/sample_data.xlsx")
# head(data)
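
If you do not have a sample file at hand, you can still try `read.csv()` by passing the CSV content directly through its `text` argument. This is a small sketch with made-up data; `csv_text` and `sample_data` are illustrative names.

```r
# CSV content held in a string instead of a file (made-up data for illustration)
csv_text <- "id,name,score
1,John,88
2,Jane,92
3,Doe,95"

# The text argument lets read.csv() parse a string as if it were a file
sample_data <- read.csv(text = csv_text)

# head() shows the first rows; str() shows the column types R inferred
print(head(sample_data, 2))
str(sample_data)
```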

### Writing Data to Files

In [None]:
# Writing to a CSV file
# Uncomment and modify the path as needed
# write.csv(df, "data/output_data.csv", row.names = FALSE)

# Writing to an Excel file
# Uncomment and run these lines if you want to write to Excel
# install.packages("writexl")
# library(writexl)
# write_xlsx(df, "data/output_data.xlsx")
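
To see writing and reading work together without touching your project folders, you can round-trip a data frame through a temporary file. The sketch below assumes the same `df` as in the data frame section; `out_path` and `df_back` are illustrative names.

```r
df <- data.frame(
  id = 1:3,
  name = c("John", "Jane", "Doe"),
  score = c(88, 92, 95)
)

# tempfile() returns a unique path inside the session's temporary directory
out_path <- tempfile(fileext = ".csv")

# row.names = FALSE avoids writing an extra column of row numbers
write.csv(df, out_path, row.names = FALSE)
print(file.exists(out_path))

# Read the file back and compare it with the original
df_back <- read.csv(out_path)
print(nrow(df_back) == nrow(df))
print(all(df_back$score == df$score))

# Remove the temporary file when finished
unlink(out_path)
```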

<a id='exercises'></a>
## 5. Practical Exercises

Now it's time to apply what you've learned!

### Exercise 1: Basic Calculations
Task: Create two variables, a and b, and assign them the values 15 and 7, respectively. Calculate their sum, difference, product, and quotient.

In [None]:
# Your code here


### Exercise 2: Working with Vectors
Task: Create a numeric vector temps containing the following temperatures in Celsius: 23, 25, 19, 22, 20, 18, 24. Calculate the mean temperature.

In [None]:
# Your code here


### Exercise 3: Data Frame Manipulation
Task: Create a data frame students with the following data:
- Names: "Alice", "Bob", "Charlie"
- Ages: 24, 22, 23
- Scores: 90, 85, 88

Add a new column Passed which is TRUE if the score is 85 or above.

In [None]:
# Your code here

