# Assessment 1 - Backtracking Algorithm

## Generic Backtracking Algorithm 


In this coursework, we want to build a generic backtracking algorithm that can be reused, unchanged, to solve two problems: finding all the partitions of an integer $n$, and finding a Gray code for an integer $n$.


In my implementation of backtracking, the algorithm requires three pieces of information (excluding the functions specific to the problem). The first required input is $n$, which specifies the end goal of the code, e.g. the integer we wish to partition.

The second input is named $Part$, meaning "partial solution". It starts out empty or all zero (an empty vector for the partitions, a single row of zeros for the Gray code) and is then extended throughout the code until a complete solution is found.

The final input required is $c$, which gives information about how the partial solution should be edited. For example, in the partitioning code, $c$ gives the integer we wish to add to the current partition. Inside the backtracking function this value is held in a loop variable $s$. After each candidate extension has been explored (whether it was rejected or not), $s$ is increased by 1. Once $s$ becomes larger than $n$, we leave the while loop of the backtracking pseudocode.


We use the following generic algorithm. Its first inputs are the functions that correspond to the specific problem you wish to solve. Every one of these functions expects the same inputs $n$, $Part$ and $c$:

 - accept: Checks whether $Part$ is a valid solution to the problem.
 - reject: Returns TRUE if the current $Part$ cannot be extended to give a valid result. For example, in the partitioning algorithm, a partial solution with a sum greater than $n$ can only get bigger, so it should be discarded completely.
 - output: Called on every accepted solution, for example to print it.
 - first: Returns the first value of the amendment parameter, i.e. how the partial solution should be edited in the next step.
 - second: Updates the current partial solution according to the value of $c$. In the Wikipedia pseudocode, this is called next, but I have renamed it because next is a reserved word in R.


In [1]:
backtrack <- function(accept,
                      reject,
                      output,
                      first,
                      second,
                      n,
                      Part,
                      c) {
    if (reject(n,Part,c) == TRUE){
        #if the current partial solution can never be valid, we want to leave the current loop  
        return()
    }
    if (accept(n,Part,c) == TRUE){
        #if the Part is a valid solution, output Part
        output(n,Part,c)
        return()
    }
    #set the first amendment parameter
    s <- first(n,Part,c) 
    while (s <= n){
        #update the partial solution using the amendment parameter
        new_part <- second(n,Part,s)
        #run backtrack again with the new partial solution
        backtrack(accept, reject, output, first, second, n, new_part,s)
        #increase amendment parameter
        s <- s+1
    }
}


## Partitioning 

For the partitioning algorithm, we want to find all possible integer partitions of a given value $n$. Below are the functions "reject, accept, output, first, second" specific to the partitioning algorithm. I will give a brief explanation of what each function aims to do:

 - reject_partition: This function checks if the current partial solution has a sum greater than $n$. If it does, the solution is invalid and adding to it will never give a valid solution, so we want to reject it and the function returns TRUE.
 - accept_partition: This function checks if the sum of the partial solution is equal to $n$. If it is, this is a valid solution and the function returns TRUE.
 - output_partition: Prints the accepted partition.
 - first_partition: Returns the first number that we try adding to the current partial solution, which is the input $c$. Inside backtrack this value is stored as $s$; the meaning is the same. Starting from $c$ rather than 1 keeps the parts in non-decreasing order, so no partition is listed twice.
 - second_partition: This function updates the current partial solution by appending the last input ($c$) to it.


In [2]:
reject_partition <- function(n, Part, c) {
  #reject if the sum of Part exceeds n
  if (sum(Part) > n) {
    return(TRUE)
  }
  return(FALSE)
}

accept_partition <- function(n, Part, c) {
  #accept if the sum of Part equals n
  if (sum(Part) == n) {
    return(TRUE)
  }
  return(FALSE)
}

output_partition <- function(n,P,s) {
  #prints any valid partition
  print(P)
}

first_partition <- function(n, P, c) {
  #the first number added to the partition is c
  return(c)
}

second_partition <- function(n, P, s) {
  #extends current partition by adding s
  return(c(P, s))
}

Below is an implementation of the backtracking code for the partitioning algorithm. The algorithm has been tested and gives correct solutions for integers up to 20. Here, the input for the first partial solution is just integer(0), i.e. an empty partition.

In [3]:
find_part <- function(n){
    backtrack(accept_partition, reject_partition,output_partition,first_partition,second_partition, n, integer(0),1)
}
find_part(5)

[1] 1 1 1 1 1
[1] 1 1 1 2
[1] 1 1 3
[1] 1 2 2
[1] 1 4
[1] 2 3
[1] 5


## Gray code

For this algorithm, we want to find a Gray code for an integer $n$. I have chosen to make the initial partial solution $Part$ a matrix with 1 row and $n$ columns whose entries are all zero. The input $c$ gives the column that we want to edit in the next partial solution. The general approach is to "flip" the $c^{th}$ entry of the last row of the partial solution and add the result as a new row. The code then runs backtrack on the new partial solution. If the flip is not valid, that row is discarded and the process is repeated with the $(c+1)^{th}$ entry.

The code below gives the reject function used for the Gray code version of the backtracking code. It ensures that the newest row of the partial solution does not repeat any earlier row, and that each row differs from the row directly above it in exactly 1 entry. This is done using the "no_repeats" and "one_change" vectors respectively.

In [4]:
reject_gray <- function(n, Part ,c){
    #Find the number of rows    
    size_mat <- nrow(Part)
    #Ensure we do not reject at the first step
    if (size_mat == 1){
        return(FALSE)
    }  

    #set up vector to check only one change at each step
    one_change <- c()
    #parentheses are needed: 1:size_mat-1 is parsed as (1:size_mat)-1 and would start at 0
    for (i in 1:(size_mat-1)){
        #checks that all but one of the row entries are the same
        if (sum(Part[i,] == Part[i+1,]) == (n-1)){
            one_change <- c(one_change, TRUE)
        }
        else{
            one_change <- c(one_change, FALSE)
        }
    }
    #set up vector to find identical rows 
    no_repeats <- c()
    for (i in 1:(size_mat-1)){
        #for loop to check the final row (gray code currently being checked) with every other row used in the gray code thus far
        if (sum(Part[size_mat,] == Part[i,]) == n){
            #the sum will be n if the rows are identical, in which case we reject
            no_repeats <- c(no_repeats, TRUE)
        }
        else{
            no_repeats <- c(no_repeats, FALSE)
        }
    }
    if ((sum(no_repeats) == 0) && (sum(one_change) == (size_mat-1))){
        #we don't want to reject if every row is different, and only one change happens after each row
        return(FALSE)
    }
    else{
        #rejects if any two rows are the same or more than one change happens between consecutive rows
        return(TRUE)
    }   
}
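
The two conditions described above can also be written in a vectorised form. The following is only a sketch of that idea and not the function used in this notebook: `is_valid_gray_prefix` is a hypothetical helper name, and it assumes the partial solution is a numeric 0/1 matrix with one code word per row.

```r
# Hypothetical helper (not part of the notebook): TRUE when a 0/1 matrix
# could still be the start of a Gray code.
is_valid_gray_prefix <- function(Part) {
  # a single row has nothing to compare against
  if (nrow(Part) < 2) return(TRUE)
  # diff() on a matrix subtracts each row from the next one, so the row sums
  # of the absolute differences count the entries changed at each step
  changes <- rowSums(abs(diff(Part)))
  # duplicated() on a matrix flags rows that repeat an earlier row
  all(changes == 1) && !any(duplicated(Part))
}

ok_prefix  <- rbind(c(0, 0, 0), c(1, 0, 0), c(1, 1, 0))
repeat_row <- rbind(c(0, 0), c(1, 0), c(0, 0))
two_flips  <- rbind(c(0, 0), c(1, 1))

is_valid_gray_prefix(ok_prefix)
is_valid_gray_prefix(repeat_row)
is_valid_gray_prefix(two_flips)
```

Unlike reject_gray, this helper returns TRUE for a valid prefix, so a reject function built on it would return its negation.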



The following code gives the accept and output functions specific to the Gray code. The accept function will return true when the partial solution has $2^n$ rows and so is a complete Gray code. 

The main job of the output function is to print the accepted Gray code. However, I found that when running, the code would continue until it had found every possible Gray code. To prevent this, I have added a break line to the function, as we only need to print one full solution. As a result, after the code outputs a (correct) Gray code for the integer $n$, it gives an error message: break is only valid inside a loop, and output_gray contains no loop. I could not find a way to remove this issue.

In [5]:
accept_gray <- function(n, Part ,s){
    #checks the size of the matrix, will only accept a complete gray code  
    mat_size <- nrow(Part)
    if (mat_size == 2^n){
        return(TRUE)
    }
    else{
        return(FALSE)
    }
}

output_gray <- function(n,Part,s){
    #prints the completed gray code
    print(Part)
    #forces the code to stop after finding one complete Gray code
    break
}
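
One possible way to stop after the first solution without the error is to signal a custom condition from the output function and catch it with `tryCatch`. The sketch below is an assumption about how this could be done, not the approach used in this notebook. All names in it (`solution_found`, `backtrack_demo`, the `*_demo` callbacks and `first_gray_code`) are hypothetical, and the callbacks are simplified stand-ins for the Gray code functions so that the example runs on its own.

```r
# Hypothetical condition constructor: carries the accepted solution upwards.
solution_found <- function(value) {
  structure(class = c("solution_found", "condition"),
            list(message = "solution found", call = NULL, value = value))
}

# Compact stand-in for backtrack() with the same control flow.
backtrack_demo <- function(accept, reject, output, first, second, n, Part, c) {
  if (reject(n, Part, c)) return(invisible(NULL))
  if (accept(n, Part, c)) {
    output(n, Part, c)
    return(invisible(NULL))
  }
  s <- first(n, Part, c)
  while (s <= n) {
    backtrack_demo(accept, reject, output, first, second,
                   n, second(n, Part, s), s)
    s <- s + 1
  }
}

# Simplified Gray code callbacks (assumed equivalents, not the originals).
reject_demo <- function(n, Part, c) {
  nr <- nrow(Part)
  if (nr == 1) return(FALSE)
  earlier <- Part[-nr, , drop = FALSE]
  # reject when the newest row repeats any earlier row
  any(apply(earlier, 1, function(r) all(r == Part[nr, ])))
}
accept_demo <- function(n, Part, c) nrow(Part) == 2^n
first_demo  <- function(n, Part, c) 1
second_demo <- function(n, Part, s) {
  last_row <- Part[nrow(Part), ]
  last_row[s] <- 1 - last_row[s]          # flip entry s of the last row
  rbind(Part, last_row, deparse.level = 0)
}
# Signal the condition instead of calling break.
output_demo <- function(n, Part, c) stop(solution_found(Part))

first_gray_code <- function(n) {
  tryCatch({
    backtrack_demo(accept_demo, reject_demo, output_demo, first_demo,
                   second_demo, n, matrix(0, nrow = 1, ncol = n), 1)
    NULL  # reached only if no solution was signalled
  }, solution_found = function(cond) cond$value)
}

g <- first_gray_code(3)
print(g)
```

Because the condition unwinds every nested call to `backtrack_demo` at once, the search stops as soon as the first complete code is accepted, and the result is returned as a value rather than only printed.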


Finally, the code below gives the first and second functions specific to the Gray code. The first function always returns 1, as we always want to start by editing the first entry in the row. If this is not a valid option, the code then moves along the columns until it finds a valid flip.

The second function appends a row to the current partial matrix. It does this by making a copy of the last row, flipping the $s^{th}$ entry of that copy and adding it to the current solution. The use of $s$ here instead of $c$ is to ensure the while loop works.

In [6]:
first_gray <- function(n, Part, s) {
    return(1)
}

second_gray <- function(n,Part,s){
    #removes last row if change was invalid
    if (reject_gray(n,Part,s) == TRUE){
        #drop = FALSE keeps Part a matrix even if only one row remains
        Part <- Part[-nrow(Part), , drop = FALSE]
    }
    size_mat <- nrow(Part)
    #isolates last row
    last_row <- Part[size_mat,]
    #changes the s entry of the final row
    last_row[s] <- 1-last_row[s]
    #adds the edited row to the matrix 
    return(rbind(Part,last_row))
}


Below is an example of the algorithm being used to find a Gray code of length 3, in the same style as the partitioning code. Here, the initial $Part$ input is a one-row matrix with $n$ zeros. The output is then a matrix with $2^n$ rows and $n$ columns. The error message, as explained earlier, is caused by the break line in the output function. With the break line, this code is capable of finding a Gray code for integers up to and including 10; without the break it is only successful for integers up to 4.

In [7]:
find_gray <- function(n) {
    backtrack(accept_gray,
              reject_gray,
              output_gray,
              first_gray,
              second_gray,
              n,
              matrix(rep(0,n), nrow = 1),
              1)
}
find_gray(3)

         [,1] [,2] [,3]
            0    0    0
last_row    1    0    0
last_row    1    1    0
last_row    0    1    0
last_row    0    1    1
last_row    1    1    1
last_row    1    0    1
last_row    0    0    1


ERROR: Error in output(n, Part, c): no loop for break/next, jumping to top level


## Conclusions 

Overall, I think my implementation of the code worked well in terms of reusability, particularly the generic backtracking algorithm, as I was able to use exactly the same backtracking code for both partitioning and the Gray code. I also feel the use of $s$ works well, as it is more intuitive to have the while loop run until $s$ exceeds the input parameter $n$.

However, I think the use of the $c$ parameter may reduce the reusability of the code for two reasons. The first is that if the problem being solved using backtracking had a more complicated input than just an integer $n$, then the code would not run well. The second is that it would have been preferable for the code not to require $c$ as an input, but instead to set it up automatically within the code.

My code is also relatively slow, particularly when running the Gray code, as there are likely ways to make the code stop considering dead-end partial solutions sooner. As an example, in the partitioning code, if I were to ask it to find all the partitions of $4$, my current code would test $1, 1, 1, 2$ and $1, 1, 1, 3$. It would be preferable if the code could identify immediately that, once $1, 1, 1, 1$ has been found, no other extension of three 1's can work.
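
As a rough illustration of this kind of pruning for the partitioning problem, the sketch below bounds the loop by the amount still missing from the sum instead of by $n$. It is a standalone variant written for this example, not the generic backtrack function: `partitions_pruned` is a hypothetical name, and it returns the partitions as a list rather than printing them.

```r
# Hypothetical standalone variant: collects all partitions of n in a list.
partitions_pruned <- function(n, Part = integer(0), c = 1L) {
  remaining <- n - sum(Part)
  # nothing left to add: Part is a complete partition
  if (remaining == 0) return(list(Part))
  found <- list()
  s <- c
  # a part larger than `remaining` would overshoot n, so it is never tried
  while (s <= remaining) {
    found <- c(found, partitions_pruned(n, c(Part, s), s))
    s <- s + 1L
  }
  found
}

parts4 <- partitions_pruned(4L)
for (p in parts4) print(p)
```

With this bound, the partial solution $1, 1, 1$ for $n = 4$ is only extended by 1, so $1, 1, 1, 2$ and $1, 1, 1, 3$ are never built. Some dead ends remain (for example $1, 2$ still cannot be completed), so this is a partial improvement only.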

The final problem my code would need to address is the error in the Gray code caused by the break line, which is there to prevent overrunning. One way to address this may be to structure the backtracking more like the pseudocode on the Wikipedia page. The main difference between that pseudocode and my algorithm is the placement of the "next" function: in my code it is called before the backtracking step, which worked better with my interpretation of the variable $s$. In the time available I could not rework my code to fit the alternative pseudocode, but it is something I would have liked to complete.
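
The sketch below shows one way the Wikipedia-style layout might look in R for the partition problem. It is an illustration under stated assumptions rather than a reworking of the code above: each candidate is taken to be a whole partial solution, the "next" step is called after the recursive call and returns NULL when there are no more alternatives, and every name (`backtrack_wiki`, `next_sibling_w`, `collect` and so on) is hypothetical.

```r
# Generic driver: `cand` is a whole partial solution, first() builds its first
# extension and next_sibling() the next alternative at the same depth.
backtrack_wiki <- function(P, cand, reject, accept, output, first, next_sibling) {
  if (reject(P, cand)) return(invisible(NULL))
  if (accept(P, cand)) output(P, cand)
  s <- first(P, cand)
  while (!is.null(s)) {
    backtrack_wiki(P, s, reject, accept, output, first, next_sibling)
    s <- next_sibling(P, s)
  }
  invisible(NULL)
}

# Hypothetical callbacks for integer partitions; P is the target integer n.
reject_w <- function(P, cand) sum(cand) > P
accept_w <- function(P, cand) sum(cand) == P
first_w <- function(P, cand) {
  if (sum(cand) >= P) return(NULL)         # nothing left to add
  smallest <- if (length(cand) > 0) cand[length(cand)] else 1L
  c(cand, smallest)                        # keeps the parts non-decreasing
}
next_sibling_w <- function(P, s) {
  if (sum(s) >= P) return(NULL)            # a larger last part would overshoot
  s[length(s)] <- s[length(s)] + 1L
  s
}

# Collect accepted partitions instead of printing them.
found <- list()
collect <- function(P, cand) found[[length(found) + 1]] <<- cand

backtrack_wiki(5L, integer(0), reject_w, accept_w, collect,
               first_w, next_sibling_w)
for (p in found) print(p)
```

In this layout the loop ends when next_sibling_w returns NULL, so the loop bound no longer depends on comparing $s$ with $n$. Stopping after the first solution would still need a separate mechanism.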