<a href="https://colab.research.google.com/github/zia207/r-colab/blob/main/NoteBook/R_Beginner/01-06-01-functional-programming-r-base-r.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![alt text](http://drive.google.com/uc?export=view&id=1bLQ3nhDbZrCCqy_WCxxckOne2lgVvn3l)

# 6.1  Functional Programming with R-Base - Pure and Apply Functions

This tutorial introduces functional programming (FP) in R using only base R, focusing on **pure functions** and the **apply family** of functions (`lapply`, `sapply`, `apply`, etc.). Functional programming emphasizes pure functions, immutability, and declarative coding, avoiding side effects and mutable state. By the end, you'll understand how to write FP-style code in base R with practical examples.




## What is Functional Programming?

Functional programming is a paradigm that:

-   Treats computation as evaluating mathematical functions.

-   Uses **pure functions** (same input always gives same output, no side effects).

-   Avoids mutable state (changing variables).

-   Leverages higher-order functions (functions that take or return functions).

In R, base functions like `lapply` and `sapply` support FP by enabling declarative iteration over data.
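
The idea of higher-order functions can be sketched in a few lines. The names `apply_twice`, `make_multiplier`, and `triple` are hypothetical helpers introduced only for this illustration; they are not part of base R.

```r
# Higher-order function that TAKES a function: applies f to x twice
apply_twice <- function(f, x) f(f(x))

# Higher-order function that RETURNS a function (a closure over k)
make_multiplier <- function(k) {
  force(k)              # evaluate k now so the closure captures its value
  function(x) x * k
}

triple <- make_multiplier(3)
triple(2)               # 6
apply_twice(triple, 2)  # 18
```

Both helpers are pure: their results depend only on their arguments.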

## Setup R in Python Runtime - Install {rpy2}

{rpy2} is a Python package that provides an interface to the R programming language, allowing Python users to run R code, call R functions, and manipulate R objects directly from Python. It enables seamless integration between Python and R, leveraging R's statistical and graphical capabilities while using Python's flexibility. The package supports passing data between the two languages and is widely used for statistical analysis, data visualization, and machine learning tasks that benefit from R's specialized libraries.

In [1]:
!pip uninstall rpy2 -y
! pip install rpy2==3.5.1
%load_ext rpy2.ipython

Found existing installation: rpy2 3.5.17
Uninstalling rpy2-3.5.17:
  Successfully uninstalled rpy2-3.5.17
Collecting rpy2==3.5.1
  Downloading rpy2-3.5.1.tar.gz (201 kB)
     201.7/201.7 kB 7.1 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: rpy2
  Building wheel for rpy2 (setup.py) ... done
  Created wheel for rpy2: filename=rpy2-3.5.1-cp312-cp312-linux_x86_64.whl size=316564 sha256=4cbeeb2d952c6c0f56cff0c9080259bed69568ff83ea8305ec63e587f659483c
  Stored in directory: /root/.cache/pip/wheels/00/26/d5/d5e8c0b039915e785be870270e4a9263e5058168a03513d8cc
Successfully built rpy2
Installing collected packages: rpy2
Successfully installed rpy2-3.5.1


##  Mount Google Drive

Next, create a folder named "R" in Google Drive so that installed R packages persist between sessions. Before installing any R package in the Python runtime, mount Google Drive and follow the on-screen instructions:

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Pure Functions in R

A **pure function**:

-   Always produces the same output for the same input.

-   Has no side effects (e.g., no modifying global variables, no I/O operations).

In [4]:
%%R
# Pure function: Squares a number
square <- function(x) {
  x * x
}
square(4) # Returns 16, always for input 4

[1] 16


## Impure Function (Avoid in FP)

In [5]:
%%R
# Impure: Modifies global variable
counter <- 0
impure_add <- function(x) {
  counter <<- counter + 1
  x + counter
}
impure_add(5) # Output depends on counter, not just input

[1] 6


**Tip**: Ensure functions rely only on their arguments and return results without altering external state.
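
One way to follow this tip is to pass the state in and hand the new state back, instead of using `<<-`. The sketch below is a possible pure counterpart to `impure_add`; the name `pure_add` and the returned list layout are assumptions made for illustration.

```r
# Pure alternative: the counter is an explicit input and an explicit output
pure_add <- function(x, counter) {
  new_counter <- counter + 1
  list(value = x + new_counter, counter = new_counter)
}

state <- 0
first  <- pure_add(5, state)
second <- pure_add(5, state)   # same inputs, so the same result
identical(first, second)       # TRUE
first$value                    # 6
state                          # still 0: nothing outside the function changed
```

The caller decides whether to carry `first$counter` forward, so the state change is visible at the call site rather than hidden inside the function.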

## The Apply Family for Functional Programming

Base R’s apply functions are key to FP, allowing you to apply a function to elements of a data structure declaratively, avoiding loops and mutable state.


### Key Apply Functions

The apply family consists of functions that iterate over a data structure for you, minimizing the need to write explicit loops. The family is part of base R, so no additional packages need to be installed.

-   `apply()` for the rows or columns of matrices and data frames

-   `lapply()` for lists and vectors; the output is a list

-   `sapply()` for lists and vectors; the output is simplified to a vector or matrix where possible

-   `tapply()` for vectors split into groups by a factor

-   `mapply()` for several vectors or lists processed in parallel (multivariate)

### apply

`apply()` returns a vector, array, or list of values obtained by applying a function to the margins of an array, matrix, or data frame (a data frame is first coerced to a matrix). `apply()` is not faster than an explicit loop, but it is compact and usually fits on one line.

> apply(X, MARGIN, FUN, ...)

Where:

-   **X** is the matrix, data frame, or array

-   **MARGIN** is a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns.

-   **FUN** is the function to be applied

-   **...** is for any other arguments to be passed to the function

In [6]:
%%R
# Create a matrix with three named columns (cbind() returns a matrix, not a data frame)
df <- cbind(x1 = 1:8, x2 = 2:9, x3=3:10)
# add row names
dimnames(df)[[1]] <- letters[1:8]

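Let's calculate the column means. The extra argument `trim = 0.2` is passed on to `mean()`, which drops the lowest and highest 20% of the values before averaging: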

In [7]:
%%R
apply(df, 2, mean, trim = 0.2)

 x1  x2  x3 
4.5 5.5 6.5 


Row means:

In [None]:
%%R
apply(df, 1, mean, trim = .2)

Column quantiles:

In [8]:
%%R
apply(df, 2, quantile, probs = c(0.10, 0.25, 0.50, 0.75, 0.90))

      x1   x2   x3
10% 1.70 2.70 3.70
25% 2.75 3.75 4.75
50% 4.50 5.50 6.50
75% 6.25 7.25 8.25
90% 7.30 8.30 9.30
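
The effect of `MARGIN` is easiest to see on a tiny input. The following is a minimal sketch on a hypothetical 2 x 3 matrix `m` (not the `df` object above), using anonymous functions for `FUN`.

```r
# Hypothetical 2 x 3 matrix used only for this illustration
m <- matrix(1:6, nrow = 2)

row_sums   <- apply(m, 1, sum)                          # one value per row
col_spread <- apply(m, 2, function(v) max(v) - min(v))  # one value per column
squared    <- apply(m, c(1, 2), function(v) v^2)        # one value per cell

row_sums     # 9 12
col_spread   # 1 1 1
dim(squared) # 2 3, the same shape as m
```

With `MARGIN = c(1, 2)` the function receives one cell at a time, so the result keeps the dimensions of the input.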


### lapply

`lapply()` returns a list of the same length as `X`, each element of which is the result of applying `FUN` to the corresponding element of `X`. It iterates over the elements of a list (or vector), applies the function to each one, and always returns a list (the "l" stands for list).

> lapply(X, FUN, ...)

Where:

-   **X** is the list (or vector)

-   **FUN** is the function to be applied

-   **...** is for any other arguments to be passed to the function

In [9]:
%%R
# Create a list
mylist <- list(A = matrix(1:9, nrow = 3), B = 1:5, C = c(8, 5), logic = c(TRUE, FALSE, FALSE, TRUE, TRUE))
mylist

$A
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

$B
[1] 1 2 3 4 5

$C
[1] 8 5

$logic
[1]  TRUE FALSE FALSE  TRUE  TRUE



In [10]:
%%R
lapply(mylist, mean)

$A
[1] 5

$B
[1] 3

$C
[1] 6.5

$logic
[1] 0.6



Notice that the results are returned as a list. We can flatten them into a named vector with `unlist()`:

In [11]:
%%R
unlist(lapply(mylist,mean))

    A     B     C logic 
  5.0   3.0   6.5   0.6 


### sapply

`sapply()` is a wrapper around `lapply()` that simplifies the result to a vector or matrix when possible; when it cannot simplify, it returns a list.

In [12]:
%%R
sapply(mylist, mean)

    A     B     C logic 
  5.0   3.0   6.5   0.6 
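
Because the shape of the `sapply()` result depends on what the function returns, it can help to see the three cases side by side. This is a small sketch on hypothetical lists; `vapply()` is the base R variant that makes the expected shape explicit.

```r
nums <- list(a = 1:3, b = 4:6)

# Each call returns two values, so sapply() builds a 2 x 2 matrix
ranges <- sapply(nums, range)

# Results of different lengths cannot be simplified, so a list comes back
ragged <- sapply(list(a = 1:3, b = 1:2), function(v) v * 2)

# vapply() declares the expected result (one numeric value per element)
# and raises an error if a call returns anything else
means <- vapply(nums, mean, numeric(1))

is.matrix(ranges)  # TRUE
is.list(ragged)    # TRUE
means              # a = 2, b = 5
```

In functions that other code depends on, `vapply()` is often the safer choice because its return type does not change with the input.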


### tapply

`tapply()` applies a function to subsets of a vector, where the subsets are defined by one or more grouping variables (categorical variables, also known as factors).

In [13]:
%%R
my.df <- data.frame(
  Landcover = rep(c("Forest", "Grassland", "Wetland"), each = 4),
  Site = rep(1:4, times = 3),
  pH = c(5.5, 6.0, 5.8, 6.2, 7.0, 6.8, 7.2, 7.1, 6.5, 6.7, 6.9, 7.0),
  SOC = c(3.2, 3.5, 3.1, 3.3, 2.5, 2.7, 2.6, 2.8, 4.0, 4.1, 4.2, 4.3)
)

We can use `tapply()` inside `apply()` to calculate the mean pH and SOC for each land cover class:

In [14]:
%%R
apply(my.df[3:4], 2, function(x) tapply(x, my.df$Landcover, mean))

             pH   SOC
Forest    5.875 3.275
Grassland 7.025 2.650
Wetland   6.775 4.150
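
For a single variable, `tapply()` can be called directly without wrapping it in `apply()`. The sketch below uses a smaller, hypothetical data frame `soil` so that it runs on its own.

```r
# Hypothetical data: two land cover classes with two pH readings each
soil <- data.frame(
  Landcover = rep(c("Forest", "Grassland"), each = 2),
  pH = c(5.5, 6.0, 7.0, 6.8)
)

# tapply(values, groups, function): one mean per land cover class
ph_by_cover <- tapply(soil$pH, soil$Landcover, mean)
ph_by_cover
```

The result is named by group, so an individual value can be looked up with, for example, `ph_by_cover["Forest"]`.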


### mapply

`mapply()` is a multivariate version of `sapply()`. `mapply()` applies FUN to the first elements of each ... argument, the second elements, the third elements, and so on.

In [15]:
%%R
list(rep(2, 4), rep(3, 3), rep(4, 2))

[[1]]
[1] 2 2 2 2

[[2]]
[1] 3 3 3

[[3]]
[1] 4 4



Here the same function (`rep`) is called repeatedly: the first argument (the value to repeat) varies from 2 to 4, and the second argument (`times`) varies from 4 to 2. `mapply()` expresses this in a single call:

In [16]:
%%R
mapply(rep, 2:4, 4:2)

[[1]]
[1] 2 2 2 2

[[2]]
[1] 3 3 3

[[3]]
[1] 4 4
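
When every call returns a single value, `mapply()` simplifies the result to a vector, much like `sapply()`. The following sketch also shows `MoreArgs`, which holds arguments that stay fixed across calls; the inputs are made up for illustration.

```r
# Element-wise pairing: (1, 4), (2, 5), (3, 6)
sums <- mapply(function(x, y) x + y, 1:3, 4:6)

# sep = "-" is the same for every call, so it goes in MoreArgs
labels <- mapply(paste, c("a", "b"), 1:2,
                 MoreArgs = list(sep = "-"), USE.NAMES = FALSE)

sums    # 5 7 9
labels  # "a-1" "b-2"
```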



## Writing Pure Functions with Apply

To align with FP, ensure the functions you pass to apply functions are pure. Here’s a practical example:

Process a Vector of Names:

Convert names to uppercase and sort them.

In [17]:
%%R
users <- c("alice", "bob", "charlie")
process_users <- function(users) {
  sort(sapply(users, toupper))
}
process_users(users) # Returns a named character vector: "ALICE", "BOB", "CHARLIE"

    alice       bob   charlie 
  "ALICE"     "BOB" "CHARLIE" 


-   `sapply(users, toupper)` applies the pure `toupper` function to each name.

-   `sort()` is pure, producing a new sorted vector without modifying the input.
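
Since `toupper()` is already vectorized, the same result can be obtained without `sapply()`, which also avoids the element names seen in the output above. This is a possible variant, with `process_users_vec` as a hypothetical name and the input deliberately unsorted.

```r
users <- c("charlie", "alice", "bob")

# toupper() works on the whole vector at once; sort() returns a new vector
process_users_vec <- function(users) sort(toupper(users))

sorted <- process_users_vec(users)
sorted  # "ALICE" "BOB" "CHARLIE"
users   # unchanged: "charlie" "alice" "bob"
```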

### Avoiding Side Effects

Avoid functions that print, modify global state, or perform I/O inside apply calls. Example:

### Bad: Side Effect in Function

In [18]:
%%R
bad_function <- function(x) {
  print(paste("Processing:", x)) # Side effect
  toupper(x)
}
sapply(users, bad_function) # Prints to console, not FP-friendly

[1] "Processing: alice"
[1] "Processing: bob"
[1] "Processing: charlie"
    alice       bob   charlie 
  "ALICE"     "BOB" "CHARLIE" 


### Good: Pure Function

In [19]:
%%R
good_function <- function(x) {
  toupper(x)
}
result <- sapply(users, good_function)
print(result) # Side effect handled outside

    alice       bob   charlie 
  "ALICE"     "BOB" "CHARLIE" 


## Immutability in Base R

R doesn’t enforce immutability, but you can practice it by avoiding in-place modifications. Instead of changing a vector, create a new one:

In [20]:
%%R
numbers <- c(1, 2, 3)
doubled <- sapply(numbers, function(x) x * 2) # New vector: c(2, 4, 6)
# Original 'numbers' unchanged
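
R's copy-on-modify behaviour supports this style: assigning into an argument inside a function changes only the function's local copy. A minimal sketch, with `set_first` as a hypothetical helper:

```r
set_first <- function(v, value) {
  v[1] <- value   # modifies the function's local copy only
  v               # return the new vector
}

numbers <- c(1, 2, 3)
changed <- set_first(numbers, 99)

numbers  # 1 2 3 (the caller's vector is untouched)
changed  # 99 2 3
```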

## Function Composition

Combine pure functions to build complex logic. Base R doesn’t have a composition operator, but you can nest functions:

In [22]:
%%R
add_one <- function(x) x + 1
# Note: naming this function `double` masks base R's double() in this session
double <- function(x) x * 2
composed <- function(x) add_one(double(x))
composed(5) # Returns 11 (double(5) = 10, add_one(10) = 11)

[1] 11
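
Nesting becomes hard to read with more than two or three functions. One option is a small composition helper built on base R's `Reduce()`. The helper name `compose` is an assumption for this sketch (base R does not ship one), and `times_two` is used instead of `double` to avoid masking a base function.

```r
# compose(f, g, h) returns a function that applies f, then g, then h
compose <- function(...) {
  fs <- list(...)
  function(value) Reduce(function(acc, f) f(acc), fs, value)
}

add_one   <- function(x) x + 1
times_two <- function(x) x * 2

double_then_add <- compose(times_two, add_one)
double_then_add(5)  # 11: times_two(5) = 10, then add_one(10) = 11
```

The functions are applied left to right, in the order they are passed to `compose()`.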


## Real-World Example: Data Frame Processing

Process a data frame to filter rows and transform values using FP principles in base R.

In [23]:
%%R
# Create a data frame
data <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35)
)

# Function to filter ages > 25 and uppercase names
process_data <- function(df) {
  # Filter rows (like dplyr::filter)
  filtered <- df[df$age > 25, ]
  # Uppercase names
  filtered$name <- sapply(filtered$name, toupper)
  filtered
}
process_data(data) # Returns data frame with Bob and Charlie, names in uppercase

     name age
2     BOB  30
3 CHARLIE  35


`process_data()` uses `sapply` for the transformation and subsetting for the filtering. Both operate on a local copy, so the original `data` is never mutated.
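
The same filter-then-transform step can also be written as a single expression with base R's `transform()`, which returns a new data frame. This is an alternative sketch, not the approach used above; `people` and `process_people` are hypothetical names.

```r
people <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# Subset the rows, then return a modified copy with uppercase names
process_people <- function(df) {
  transform(df[df$age > 25, ], name = toupper(name))
}

result <- process_people(people)
result$name  # "BOB" "CHARLIE"
people$name  # unchanged: "Alice" "Bob" "Charlie"
```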

##  Recursion in Base R

FP favors recursion over loops. Example: Calculate factorial recursively:

In [24]:
%%R
# Note: this definition masks base R's built-in factorial() in this session
factorial <- function(n) {
  if (n <= 1) return(1)
  n * factorial(n - 1)
}
factorial(5) # Returns 120

[1] 120


**Note**: R does not automatically optimize tail calls, so deep recursion can exhaust the call stack. For large inputs, prefer `lapply`/`sapply` or a vectorized function over hand-written recursion.
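
A recursion like this can often be replaced by a fold. The sketch below computes the same factorial with base R's `Reduce()`, so the call stack does not grow with `n`; `factorial_reduce` is a hypothetical name chosen to avoid masking the built-in `factorial()`.

```r
# Fold multiplication over 1..n, starting from the initial value 1
factorial_reduce <- function(n) {
  Reduce(`*`, seq_len(n), 1)
}

factorial_reduce(5)  # 120
factorial_reduce(0)  # 1 (empty sequence, so the initial value is returned)
```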

## Tips for FP in Base R

- **Keep Functions Pure**: Ensure functions passed to `lapply`/`sapply` depend only on inputs.

- **Use Anonymous Functions**: Pass `function(x) ...` to apply functions for one-off transformations.

    ```r
    sapply(numbers, function(x) x + 1) # Returns c(2, 3, 4)
    ```

-  **Simplify with `sapply`**: Use `sapply` when you want a vector output instead of a list from `lapply`.

-  **Combine Functions**: Nest apply calls for complex transformations, e.g., `sapply(lapply(...), ...)`.
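
As an illustration of the last tip, the following sketch doubles every value in each group with `lapply()` and then totals each group with `sapply()`. The list `groups` is made-up example data.

```r
groups <- list(a = c(1, 2, 3), b = c(10, 20))

# Inner lapply(): transform each group; outer sapply(): reduce each to one number
totals <- sapply(lapply(groups, function(v) v * 2), sum)
totals  # a = 12, b = 60
```

Splitting the pipeline into a named intermediate (for example `doubled <- lapply(...)`) is equally valid and can be easier to debug.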


## Limitations

-   **Performance**: Creating new objects for immutability can be memory-intensive.

-   **Verbosity**: Base R’s FP tools are less concise than `purrr`’s.

-   **Side Effects**: Real-world tasks (e.g., plotting) require side effects; isolate them outside pure functions.


##  Summary and Conclusion

Functional programming in base R leverages pure functions and the apply family (`lapply`, `sapply`, `apply`) to write declarative, predictable code. By avoiding side effects, practicing immutability, and using higher-order functions, you can create modular and testable programs. Start with small transformations (e.g., mapping over vectors) and scale to data frame processing for robust FP workflows.

These functions provide potent tools for applying operations across vectors, matrices, or data frames, streamlining complex operations, and enhancing code readability.

## Resources

-   **Advanced R by Hadley Wickham**: <https://adv-r.hadley.nz/functional-programming.html>

-   **R Documentation**: Search `?lapply` or `?sapply` in R for official details.