#### Module 2: R Programming Basics  
#### PART - 1
- **Overview of R Programming**  
- **Environment Setup with R Studio**  
- **R Commands**  
- **Variables and Data Types**  
- **Control Structures**  
- **Data Structures in R:**  
  - Arrays  
  - Matrices  
  - Vectors  
  - Factors  
#### PART - 2
- **Functions in R**  
- **R Packages** 


---

## **7. Functions in R**  

### **What is a Function in R?**  
A function in R is a block of reusable code that performs a specific task. Functions help in modularizing code, improving readability, and reducing repetition.  

### **Types of Functions in R**  
1. **Built-in Functions** – Predefined in R (e.g., `sum()`, `mean()`, `print()`).  
   - **Mathematical Functions**: `mean()`, `median()`, `sd()`, `sqrt()`, `abs()`  
   - **String Functions**: `toupper()`, `tolower()`, `substr()`  
2. **User-defined Functions** – Created by users for specific tasks.  
3. **Anonymous Functions** – Functions without a name, often passed as arguments to other functions.  
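
The three kinds of function listed above can be compared side by side. This is a minimal sketch; `square` is a hypothetical name chosen only for illustration.

```r
# Built-in function: sum() ships with base R
total <- sum(c(1, 2, 3))

# User-defined function: square() is a hypothetical helper for this sketch
square <- function(x) x^2
squared <- square(4)

# Anonymous function: defined inline and passed straight to sapply()
doubled <- sapply(c(1, 2, 3), function(x) x * 2)

print(total)    # 6
print(squared)  # 16
print(doubled)  # 2 4 6
```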

### **Syntax of a Function in R**  
```r
function_name <- function(arguments) {
  # Function body
  return(value)  # Optional: without return(), the last evaluated expression is returned
}
```

### **Example of a User-defined Function**  
```r
add_numbers <- function(a, b) {
  return(a + b)
}
add_numbers(5, 10)  # Returns 15
```
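
Since `return()` is optional, a function can also rely on its last evaluated expression, and arguments can carry default values. The sketch below assumes a hypothetical helper named `power`; it is one way to write this, not the only one.

```r
# power() is a hypothetical helper; `exponent` defaults to 2
power <- function(base, exponent = 2) {
  base^exponent  # last evaluated expression is returned implicitly
}

squared <- power(3)                # uses the default exponent
raised  <- power(2, exponent = 5)  # overrides the default by name

print(squared)  # 9
print(raised)   # 32
```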

### **Popular Built-in Functions in R**  

#### **7.1. Mathematical Functions**
- `sum(x)` – Returns the sum of the elements in `x`.  
- `mean(x)` – Calculates the arithmetic mean of `x`.  
- `median(x)` – Finds the median of `x`.  
- `sd(x)` – Computes the sample standard deviation of `x`.  
- `var(x)` – Returns the sample variance of `x`.  
- `round(x, digits=n)` – Rounds `x` to `n` decimal places.  
- `exp(x)` – Exponential function (e raised to the power `x`).  
- `log(x, base=n)` – Logarithm with base `n`; the natural logarithm if `base` is omitted.  
- `abs(x)` – Absolute value of `x`.  
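
As a quick illustration of the mathematical functions above, the following sketch applies them to a small made-up vector (the values are chosen only so the results are easy to check by hand).

```r
x <- c(2, 4, 4, 6, 9)  # illustrative data, not from any real dataset

total   <- sum(x)                    # 25
average <- mean(x)                   # 5
middle  <- median(x)                 # 4
spread  <- var(x)                    # 7 (sample variance, divides by n - 1)
sd_2dp  <- round(sd(x), digits = 2)  # 2.65
log10_of_100 <- log(100, base = 10)  # 2
magnitude    <- abs(-3.5)            # 3.5
e_to_zero    <- exp(0)               # 1
```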

#### **7.2. Statistical Functions**
- `cor(x, y)` – Calculates the correlation between `x` and `y` (Pearson by default).  
- `cov(x, y)` – Returns the covariance between `x` and `y`.  
- `quantile(x, probs=p)` – Computes quantiles of `x` at the probabilities in `p` (values between 0 and 1).  
- `table(x)` – Creates a frequency table.  
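
The statistical functions above can be tried on two small vectors. In this sketch `y` is exactly twice `x`, an assumption made so that the correlation and covariance are easy to verify.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)  # y = 2 * x, so the relationship is perfectly linear

correlation <- cor(x, y)  # 1
covariance  <- cov(x, y)  # 5
quartiles   <- quantile(x, probs = c(0.25, 0.5, 0.75))  # 2, 3, 4

# table() counts how often each distinct value occurs
counts <- table(c("a", "b", "a", "c", "a"))

print(quartiles)
print(counts)
```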

#### **7.3. Data Manipulation Functions**
- `c()` – Combines values into a vector.  
- `length(x)` – Returns the number of elements in `x`.  
- `sort(x)` – Sorts `x` in ascending order (use `decreasing = TRUE` for descending).  
- `rev(x)` – Reverses the elements of `x`.  
- `unique(x)` – Removes duplicate values.  
- `append(x, values)` – Returns a new vector with `values` added to the end of `x`.  
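
A short sketch of the data manipulation functions above on a made-up vector. Note that none of these calls modifies `x` in place; each returns a new vector.

```r
x <- c(3, 1, 2, 3, 1)  # illustrative values

n          <- length(x)                 # 5
ascending  <- sort(x)                   # 1 1 2 3 3
descending <- sort(x, decreasing = TRUE)  # 3 3 2 1 1
reversed   <- rev(x)                    # 1 3 2 1 3
distinct   <- unique(x)                 # 3 1 2
extended   <- append(x, c(7, 8))        # 3 1 2 3 1 7 8
```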

#### **7.4. String Functions**
- `paste(x, y, sep=" ")` – Concatenates strings.  
- `toupper(x)` – Converts to uppercase.  
- `tolower(x)` – Converts to lowercase.  
- `substr(x, start, stop)` – Extracts a substring.  
- `nchar(x)` – Counts the characters in a string.  
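
The string functions above can be sketched on a single example string; the text used here is arbitrary.

```r
s <- "Data Science"  # arbitrary example string

joined <- paste("R", "Programming", sep = " ")  # "R Programming"
upper  <- toupper(s)                            # "DATA SCIENCE"
lower  <- tolower(s)                            # "data science"
first4 <- substr(s, 1, 4)                       # "Data"
n_char <- nchar(s)                              # 12 (the space counts)
```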

#### **7.5. Logical & Relational Functions**
- `all(x)` – Checks whether all elements are `TRUE`.  
- `any(x)` – Checks whether at least one element is `TRUE`.  
- `which(x)` – Returns the indices of the `TRUE` values.  
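
These functions are usually applied to the result of a comparison. The sketch below assumes a made-up vector of scores and a pass mark of 50.

```r
scores <- c(45, 72, 88, 39)  # illustrative scores
passed <- scores >= 50       # FALSE TRUE TRUE FALSE

everyone_passed <- all(passed)    # FALSE
someone_passed  <- any(passed)    # TRUE
passing_index   <- which(passed)  # 2 3
```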

#### **7.6. Data Frame & List Functions**
- `data.frame()` – Creates a data frame.  
- `head(x, n)` – Returns the first `n` rows (6 by default).  
- `tail(x, n)` – Returns the last `n` rows (6 by default).  
- `str(x)` – Shows the structure of `x`.  
- `summary(x)` – Gives a summary of `x`.  
- `names(x)` – Returns the column names of a data frame or the element names of a list.  
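
A minimal sketch of these inspection functions on a small data frame; the `students` data is invented for illustration.

```r
# students is a hypothetical data frame used only for this example
students <- data.frame(
  name  = c("A", "B", "C", "D"),
  score = c(80, 90, 85, 70)
)

first_two <- head(students, 2)  # rows 1 and 2
last_one  <- tail(students, 1)  # row 4
col_names <- names(students)    # "name" "score"

str(students)      # prints each column's type and first values
summary(students)  # per-column summary
```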

#### **7.7. Control Flow Functions**
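- `ifelse(condition, true_value, false_value)` – Vectorized conditional: evaluates `condition` element by element.  
- `switch(expression, case1, case2, ...)` – Selects one of several alternatives based on the value of `expression`.  
- `apply(X, MARGIN, FUN)` – Applies a function over the rows (`MARGIN = 1`) or columns (`MARGIN = 2`) of an array or matrix.  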
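
The three functions above behave quite differently, so a small sketch may help. The `describe` helper and its labels are hypothetical names used only here.

```r
# ifelse() works element by element on a vector
scores <- c(45, 72, 88)
result <- ifelse(scores >= 50, "pass", "fail")  # "fail" "pass" "pass"

# switch() picks a branch by matching a string; describe() is a hypothetical helper
describe <- function(level) {
  switch(level,
         low  = "Below target",
         high = "Above target",
         "Unknown level")  # unnamed last argument acts as the fallback
}

# apply() runs a function over rows (MARGIN = 1) or columns (MARGIN = 2)
m <- matrix(1:6, nrow = 2)        # filled column by column: rows are 1 3 5 and 2 4 6
row_totals <- apply(m, 1, sum)    # 9 12
col_totals <- apply(m, 2, sum)    # 3 7 11
```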

---

## **8. R Packages**  

### **Definition of a Package in R**  
A **package** in R is a collection of functions, datasets, and documentation bundled together for specific tasks. Packages extend R's functionality and can be installed from repositories like **CRAN (Comprehensive R Archive Network)**, **Bioconductor**, or **GitHub**.  

### **Key Features of R Packages:**  
- Contain pre-written functions and datasets.  
- Help in data manipulation, visualization, machine learning, etc.  
- Can be installed, loaded, and updated easily.  

### **Basic Package Operations in R:**  

1. **Install a Package:**  
   ```r
   install.packages("ggplot2")
   ```
2. **Load a Package:**  
   ```r
   library(ggplot2)
   ```
3. **Check Installed Packages:**  
   ```r
   installed.packages()
   ```
4. **Update a Package:**  
   ```r
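   # The first positional argument of update.packages() is lib.loc,
   # so name the package explicitly via oldPkgs
   update.packages(oldPkgs = "ggplot2")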
   ```
5. **Remove a Package:**  
   ```r
   remove.packages("ggplot2")
   ```
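
Before loading a package it can be useful to check whether it is installed. The sketch below shows one possible pattern; `is_installed` is a hypothetical helper, and the built-in `stats` package is used for the calls that must succeed so the example runs without installing anything.

```r
# is_installed() is a hypothetical helper for this sketch.
# requireNamespace() checks that a package can be loaded without attaching it.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (!is_installed("ggplot2")) {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\") first")
}

# pkg::fun calls a function without attaching the package via library()
middle <- stats::median(c(1, 3, 5))  # 3

# packageVersion() reports the installed version of a package
stats_version <- packageVersion("stats")
```

Using `pkg::fun` keeps it clear which package a function comes from, which helps when two loaded packages export the same name.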

### **Popular R Packages**  

#### **1. Data Manipulation**  
- **dplyr** – Data wrangling and transformation  
- **data.table** – High-performance data manipulation  

#### **2. Data Visualization**  
- **ggplot2** – Customizable data visualization  
- **plotly** – Interactive plots  

#### **3. Machine Learning**  
- **caret** – Machine learning workflow  
- **xgboost** – Gradient boosting for ML  

#### **4. Statistical Analysis**  
- **MASS** – Various statistical functions  
- **stats** – Built-in R package for statistical analysis  

#### **5. Time Series Analysis**  
- **forecast** – Time series forecasting  
- **tseries** – Time series analysis and modeling  

#### **6. Text Mining & NLP**  
- **tm** – Text mining and preprocessing  
- **tidytext** – Text analysis using tidy data principles  

#### **7. Web Scraping & APIs**  
- **rvest** – Web scraping from HTML pages  
- **httr** – HTTP requests for APIs  

#### **8. Spatial & GIS Analysis**  
- **sf** – Spatial data handling  
- **leaflet** – Interactive maps  

#### **9. Data Import & Export**  
- **readr** – Fast reading of CSV and other delimited files  
- **readxl** – Reading Excel files  

#### **10. Interactive Dashboards & Apps**  
- **shiny** – Build interactive web applications  
- **flexdashboard** – Create dashboards easily  

---

## **Conclusion**  
This module provided a **comprehensive introduction** to R programming, covering setup, basic commands, data types, control structures, data structures, functions, and packages. **Mastering these concepts** will form the foundation for more advanced data analysis and machine learning in R.  


---

### **What `summary()` Returns in R**  

The `summary()` function in R provides **descriptive statistics** based on the type of data.  

---

### **1. Summary of a Numeric Vector**  
```r
vec <- c(10, 20, 30, 40, 50)
summary(vec)
```
**Output:**  
```
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
     10      20      30      30      40      50 
```
✅ Includes **Min, Max, Mean, Median, 1st & 3rd Quartiles**  

---

### **2. Summary of a Data Frame**  
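```r
# stringsAsFactors = TRUE makes Name a factor, so summary() reports counts per level.
# Since R 4.0.0 strings stay character by default, and summary() would then
# show only the length, class, and mode of the Name column.
df <- data.frame(Name = c("A", "B", "C"),
                 Age = c(25, 30, 35),
                 Score = c(80, 90, 85),
                 stringsAsFactors = TRUE)
summary(df)
```
**Output:**  
```
 Name       Age           Score     
 A:1   Min.   :25.0   Min.   :80.0  
 B:1   1st Qu.:27.5   1st Qu.:82.5  
 C:1   Median :30.0   Median :85.0  
       Mean   :30.0   Mean   :85.0  
       3rd Qu.:32.5   3rd Qu.:87.5  
       Max.   :35.0   Max.   :90.0  
```
✅ Provides **min, max, mean, median, and quartiles** for numeric columns and **frequency counts** for factor columns.  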

---

### **3. Summary of a Factor (Categorical Data)**  
```r
f <- factor(c("Low", "High", "Medium", "Low", "High"))
summary(f)
```
**Output:**  
```
  High    Low Medium 
     2      2      1 
```
✅ **Counts the frequency of each category** (levels are listed alphabetically by default).  

---

### **4. Summary of a Logical Vector**  
```r
log_vec <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
summary(log_vec)
```
**Output:**  
```
   Mode   FALSE    TRUE 
logical       2       3 
```
✅ **Counts `TRUE` and `FALSE` occurrences.**  

---

**🔹 In short, `summary()` provides:**  
✔ **Numeric Data** → Min, Max, Mean, Median, Quartiles  
✔ **Factor Data** → Frequency Count  
✔ **Logical Data** → TRUE/FALSE Count  
✔ **Data Frame** → Summary for Each Column  
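
The result of `summary()` on a numeric vector is a named object, so individual statistics can be pulled out by name instead of being read off the printed output. A minimal sketch, reusing the numeric vector from the first example:

```r
vec <- c(10, 20, 30, 40, 50)
s <- summary(vec)

stat_names <- names(s)          # "Min." "1st Qu." "Median" "Mean" "3rd Qu." "Max."
mean_value <- s[["Mean"]]       # 30
third_q    <- s[["3rd Qu."]]    # 40
```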
