By Kunal Jha

## Question 2  {-}

### Rigorous Restatement {-}
A *purchase distribution of n* is any sequence of distinct positive integers between 1 and *n* such that

$\forall$*x*,*y* $\in$ *purchase distribution of n*: x $\neq$ y $\implies$ |*m*[x] - *m*[y]| $\ge$ *k*

For example, each of (1, 3, 5, 6), (2, 7), and (10) is a purchase distribution of 10, so long as the *m* values of each of the integers in the purchase distribution differ from each other by at least k.

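*Profit on a purchase distribution* ($l_1$, $l_2$, ..., $l_r$) is $\sum_{i=1}^{r}p[l_i]$, where *r* is the number of locations in the distribution (not to be confused with the minimum distance *k*).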
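The two definitions above can be checked mechanically. The following is a minimal sketch, assuming `m` holds the positions in miles and `p` the expected profits; the helper names `is_purchase_distribution` and `profit_on` and the sample data are hypothetical and only serve as illustration.

```julia
# Hypothetical helpers illustrating the definitions above.
# m[i] is the position of location i (in miles), p[i] its expected profit.

# True if `locs` is a sequence of distinct indices in 1:n whose positions
# are pairwise at least k miles apart.
function is_purchase_distribution(locs::Vector{Int}, m::Vector{<:Real}, k::Real)
    n = length(m)
    all(l -> 1 <= l <= n, locs) || return false
    allunique(locs) || return false
    for a in 1:length(locs), b in (a + 1):length(locs)
        abs(m[locs[a]] - m[locs[b]]) >= k || return false
    end
    return true
end

# Profit on a purchase distribution: the sum of p over the chosen indices.
profit_on(locs::Vector{Int}, p::Vector{<:Real}) = sum(p[locs]; init = 0)

# Made-up data, for illustration only.
m = [6, 7, 12, 13, 14]
p = [5, 6, 5, 3, 1]
k = 5

println(is_purchase_distribution([2, 3], m, k))  # true: positions 7 and 12 are 5 apart
println(is_purchase_distribution([1, 2], m, k))  # false: positions 6 and 7 are 1 apart
println(profit_on([2, 3], p))                    # 6 + 5 = 11
```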

*Max-profit on n* is the profit on an optimal purchase distribution of *n*. The problem asks us to print an optimal purchase distribution of *n* and the max-profit on *n*.

### Clever Observation (aka Optimal Substructure) {-}
At each of the locations from 1 to *n*, you have a choice of either opening a restaurant there or not. If you choose not to open a restaurant at location *j*, then the max-profit on *j* is simply the max-profit on *j* - 1. However, if you choose to open a restaurant at location *j*, your max-profit becomes the expected profit from opening at location *j*, which is *p*[*j*], plus the max-profit on *i*, where *i* is the closest location before *j* (that is, the largest index) such that locations *j* and *i* are at least *k* miles apart. If no such location exists, take *i* = 0.

How should we decide whether to open a restaurant at a given location between 1 and *n*? We should decide based on

*max-profit*(*n*) = *max*{*max-profit*(*n*-1), *p*[*n*] + *max-profit*(*i*)}, where *i* is the closest location before *n* such that locations *n* and *i* are at least *k* miles apart (or 0 if there is none).

If *max-profit*(*n*) = *max-profit*(*n*-1), we don't open a restaurant at location *n*. Otherwise, we do and the profit is *p*[*n*] + *max-profit*(*i*). Of course, *max-profit*(0) is 0.

We also observe that $i_1 \le i_2 \le \dots \le i_n$, where $i_j$ denotes the closest compatible location before location *j*. That is, assuming the locations are listed in increasing order of position, the compatible predecessor of location *n* is at or after the compatible predecessor of location *n* - 1. This allows us to compute *c*, an array storing for each index the closest previous location at which a restaurant can still be opened, with a single forward scan.
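The monotonicity observation can be sanity-checked on a small input. The sketch below is illustrative only: the function names and the sample data are hypothetical, and it assumes the positions in `m` are strictly increasing. It compares a brute-force computation of the predecessor array with a two-pointer scan that never moves backwards.

```julia
# Brute force: the largest x < i with m[i] - m[x] >= k, or 0 if none.
function predecessor_bruteforce(m::Vector{<:Real}, k::Real)
    n = length(m)
    c = zeros(Int, n)
    for i in 1:n, x in 1:(i - 1)
        if m[i] - m[x] >= k
            c[i] = x   # x increases, so the last assignment is the largest valid x
        end
    end
    return c
end

# Two-pointer scan: because c is non-decreasing, the search for c[i]
# resumes from c[i-1] instead of starting over.
function predecessor_two_pointer(m::Vector{<:Real}, k::Real)
    n = length(m)
    c = zeros(Int, n)
    j = 0
    for i in 2:n
        while j + 1 < i && m[i] - m[j + 1] >= k
            j += 1
        end
        c[i] = j
    end
    return c
end

m = [6, 7, 12, 13, 14]   # made-up, strictly increasing positions
k = 5
c = predecessor_two_pointer(m, k)
println(c)               # [0, 0, 2, 2, 2]
println(issorted(c))     # true
```

Since `j` only ever increases, the inner loop body runs at most `n` times over the whole scan.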

This reveals the optimal substructure of the problem, i.e., how the problem is expressed in terms of sub-problems. This optimal substructure inspires the notation and the recurrence below.

### Notation {-}
Let *y*[*i*] be the max-profit on *i*. Let *R*[*i*] = 1 if an optimal purchase distribution of *i* opens a restaurant at location *i* and 0 otherwise.

### Recurrence {-}

$$
\mathrm{y}[i] = \begin{cases}
    0 & \text{if } i = 0 \\ 
    \max(y[i-1],\ p[i] + y[j])  & \text{otherwise.}
\end{cases}
$$

where *j* is the maximum value of the set {$x \mid x \in [1:i-1] \wedge m[i] - m[x] \ge k$}, or 0 if that set is empty.
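As a worked instance of the recurrence, take the made-up input *m* = (6, 7, 12, 13, 14), *p* = (5, 6, 5, 3, 1), and *k* = 5 (these numbers are illustrative assumptions, not part of the problem statement). The index *j* used for *i* = 1, ..., 5 is 0, 0, 2, 2, 2, with *j* = 0 when no earlier location is far enough away, so

$$
\begin{aligned}
y[0] &= 0 \\
y[1] &= \max(y[0],\ 5 + y[0]) = 5 \\
y[2] &= \max(y[1],\ 6 + y[0]) = 6 \\
y[3] &= \max(y[2],\ 5 + y[2]) = 11 \\
y[4] &= \max(y[3],\ 3 + y[2]) = 11 \\
y[5] &= \max(y[4],\ 1 + y[2]) = 11
\end{aligned}
$$

The max-profit on 5 is therefore 11, obtained by opening at locations 2 and 3 (positions 7 and 12).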

In [7]:
### Pseudocode
# Pre-condition: n >= 1, all elements in m and p are greater than or equal to 0
#                Each c[i] is either 0 or an index within m in the range 1 to i-1
# Post-condition: Returns y[n] and R[1...n] as defined above
function findProfandPos(m::Vector, p::Vector, c::Vector, n::Integer)
    y = Vector{eltype(p)}(undef, n)
    R = Vector{Int}(undef, n)
    for i in 1:n
        # Option 1: skip location i and keep the max-profit on i-1 (y[0] = 0)
        dontOpen = i == 1 ? 0 : y[i-1]

        # Option 2: open at location i and add the max-profit on c[i] (y[0] = 0)
        doOpen = p[i] + (c[i] == 0 ? 0 : y[c[i]])

        if dontOpen > doOpen
            y[i] = dontOpen
            R[i] = 0
        else
            y[i] = doOpen
            R[i] = 1
        end
    end
    return (y[n], R)
end

# Pre-condition: n >= 1, k >= 1, m is sorted in increasing order and all elements in m are >= 0
# Post-condition: Returns array c which contains at each index i, the max(j) for all 1 <= j <= i-1 such that
#                m[i] - m[j] >= k, or 0 if no such j exists
function findC(m::Vector, n::Integer, k::Integer)
    c = Vector{Int}(undef, n)
    c[1] = 0
    for i in 2:n
        j = c[i-1] # c is non-decreasing, so resume the scan from the previous answer
        while m[j+1] <= m[i] - k # location j+1 is still at least k miles before location i
            j = j + 1
        end
        c[i] = j
    end
    return c
end

# Pre-condition: n >= 1, loc is an array with only 1 or 0 as values,
#                All elements in m are greater than or equal to 0
# Post-condition: Returns an array with the positions (in increasing order) of every place you are opening a restaurant
function reportLoc(loc::Vector, m::Vector, c::Vector, n::Integer)
    j = n
    x = eltype(m)[]
    while j >= 1
        if loc[j] == 1
            pushfirst!(x, m[j]) # open at j, then jump to its closest compatible predecessor
            j = c[j]
        else
            j = j - 1
        end
    end
    return x
end

reportLoc (generic function with 1 method)

In [8]:
# Pre-condition: All elements in m and p are greater than or equal to 0. m is sorted in increasing order. n >= 1, k >= 1
# Post-condition: Returns the maximum profit attainable and
#                 Returns an array with the locations of every place you are opening a restaurant
function main(m::Vector, p::Vector, n::Integer, k::Integer)
    c = findC(m,n,k)
    profit, R = findProfandPos(m, p, c, n)
    pos = reportLoc(R, m, c, n)
    return (profit, pos)
end

main (generic function with 2 methods)

### Time Complexity {-}
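The *findProfandPos* and *reportLoc* methods each make a single pass over the locations, so they take *O*(*n*) time and space. *findC* also takes *O*(*n*) time and space: the pointer *j* only ever moves forward, so its inner loop runs at most *n* times in total across all iterations of the outer loop. ***main* takes *O*(*n*) time and space as well**, since it simply makes calls to all of these methods sequentially.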
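To build confidence in the linear-time approach, it can be compared against an exhaustive search on a tiny input. The following is a self-contained sketch rather than the solution's own code: the names `max_profit_sketch` and `max_profit_bruteforce` and the sample data are hypothetical, and it assumes strictly increasing positions, non-negative profits, and *k* $\ge$ 1.

```julia
# Sketch of the O(n) dynamic program; returns (max-profit, chosen indices).
function max_profit_sketch(m::Vector{<:Real}, p::Vector{<:Real}, k::Real)
    n = length(m)
    c = zeros(Int, n)             # c[i]: closest compatible predecessor of i (0 if none)
    j = 0
    for i in 2:n
        while j + 1 < i && m[i] - m[j + 1] >= k
            j += 1
        end
        c[i] = j
    end

    y = zeros(eltype(p), n + 1)   # y[i + 1] stores the max-profit on i; y[1] is the base case
    for i in 1:n
        y[i + 1] = max(y[i], p[i] + y[c[i] + 1])
    end

    chosen = Int[]                # trace back which locations were opened
    i = n
    while i >= 1
        if y[i + 1] == y[i]       # skipping location i is optimal
            i -= 1
        else                      # opening at i is strictly better
            pushfirst!(chosen, i)
            i = c[i]
        end
    end
    return y[n + 1], chosen
end

# Exhaustive check over all subsets; exponential, for tiny inputs only.
function max_profit_bruteforce(m::Vector{<:Real}, p::Vector{<:Real}, k::Real)
    n = length(m)
    best = zero(eltype(p))
    for mask in 0:(2^n - 1)
        locs = [i for i in 1:n if ((mask >> (i - 1)) & 1) == 1]
        ok = all(m[locs[b]] - m[locs[a]] >= k
                 for a in 1:length(locs) for b in (a + 1):length(locs))
        if ok
            best = max(best, sum(p[locs]; init = zero(eltype(p))))
        end
    end
    return best
end

m = [6, 7, 12, 13, 14]   # made-up data
p = [5, 6, 5, 3, 1]
k = 5
println(max_profit_sketch(m, p, k))       # (11, [2, 3])
println(max_profit_bruteforce(m, p, k))   # 11
```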

## Question 3 {-}

### Rigorous Restatement {-}

A *printed paragraph of n* is any number of lines such that all of the words 1 to *n* are printed in order across the lines and, for all lines with words 

*i*, $\dots$, *j*:  (*j* - *i*) + $\sum_{k=i}^{j}l_k \le M$ 

The *extraSpace*(*i*,*j*) of a line with words *i* to *j* is *M* - *j* + *i* - $\sum_{k=i}^{j}l_k$.

Let 
$$
\mathrm{spaceCost}(i,j) = \begin{cases}
    \infty & \text{if words i through j don't fit on a line (extraSpace(i,j) < 0)}  \\
    0 & \text{if j = n and the words fit (i.e. on last line)}  \\ 
    (extraSpace(i,j))^3  & \text{otherwise.}
\end{cases}
$$
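The two definitions can be evaluated directly on a small input. This is a minimal sketch with hypothetical helper names (`extra_space`, `space_cost`) and made-up words; it checks the "does not fit" case first, so that a last line that overflows still costs $\infty$.

```julia
# l[k] is the length of word k, M the line width, n = length(l) the number of words.

# Unused columns on a line holding words i..j with single spaces between them.
extra_space(l::Vector{Int}, M::Int, i::Int, j::Int) = M - (j - i) - sum(l[i:j])

function space_cost(l::Vector{Int}, M::Int, i::Int, j::Int)
    n = length(l)
    e = extra_space(l, M, i, j)
    e < 0 && return Inf        # words i..j do not fit on one line
    j == n && return 0.0       # the last line is free
    return float(e)^3
end

words = ["aaa", "bb", "cc", "ddddd"]   # made-up input
l = length.(words)                     # [3, 2, 2, 5]
M = 6

println(extra_space(l, M, 1, 2))   # 6 - 1 - 5 = 0
println(space_cost(l, M, 1, 1))    # (6 - 3)^3 = 27.0
println(space_cost(l, M, 1, 3))    # 7 letters + 2 spaces > 6, so Inf
println(space_cost(l, M, 4, 4))    # last line, so 0.0
```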

A *neatly printed paragraph of n* is a paragraph such that the sum of the spaceCost for each line in the paragraph is at a minimum.

### Clever Observation (Optimal Substructure) {-}

We are asked to neatly print a paragraph of words 1 through *n*. We are told that the true last line does not have a cost for extra space. We also assign a cost of $\infty$ to any line with words *i* through *j* that do not fit on the line, so an optimal solution never places more words on a line than fit (assuming each word fits on a line by itself).

Consider an optimal solution for printing words 1 through *n*. If *i* is the index of the first word on the last line, we know words 1 through *i* - 1 must be neatly printed. We can verify this through a simple proof by contradiction:

*Assume that words 1 through n are neatly printed and words 1 through i - 1 are not. If we replace the printing of words 1 through i - 1 with an optimal printing of words 1 through i - 1, then the total cost of printing words 1 through n would be lower, since the last line is unchanged. However, this contradicts the assumption that words 1 through n were printed neatly. Hence, a contradiction arises, proving the claim.*

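The same argument applies to every line of an optimal solution, not just the last one: if a line begins with word *i* and ends with word *j*, the lines before it must form a minimum-cost printing of words 1 through *i* - 1. Thus, the problem exhibits optimal substructure: an optimal printing of words 1 through *j* consists of an optimal printing of a shorter prefix followed by one more line.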

Let *c*(*j*) be the cost of printing words 1 through *j* optimally. If the last line contains words *i* through *j*, then *c*(*j*) = *c*(*i*-1) + *spaceCost*(*i*,*j*). As a base case, we set *c*(0) = 0 so that *c*(1) = *spaceCost*(1,1).

We want to find out which word begins the last line of the words 1 through *j*, so we try each word *i* in the range of 1 to *j* to determine the best solution.

We also observe that at most $\lceil M/2 \rceil$ words fit on a line (each word is at least 1 character, and consecutive words are separated by a space). Since a line from words *i* to *j* contains *j* - *i* + 1 words, if *j* - *i* + 1 > $\lceil M/2 \rceil$ then spaceCost[*i*, *j*] = $\infty$. As such, we only need to calculate spaceCost[*i*, *j*] for *j* - *i* + 1 $\le$ $\lceil M/2 \rceil$. This reduces the number of (*i*, *j*) pairs we examine from *O*($n^2$) to *O*(*nM*).
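The bound follows from a short counting argument, assuming every word has at least one character: a line holding *w* words needs at least *w* characters for the words and *w* - 1 for the spaces between them, so

$$
w + (w - 1) \le M \iff w \le \frac{M+1}{2} \iff w \le \left\lfloor \frac{M+1}{2} \right\rfloor = \left\lceil \frac{M}{2} \right\rceil ,
$$

where the last two steps use that *w* and *M* are integers. For example, with *M* = 6 at most 3 one-letter words fit, since `a b c` uses 5 of the 6 columns and a fourth word would need 7.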

We additionally observe that we can calculate the spaceCost in *O*(1) time and space if we know the extraSpace on each line. We can determine this by observing that for any line containing words *i* through *j*, iterating backwards from *j* to *i* shows that extraSpace[*i*,*j*] = extraSpace[*i*+1, *j*] - $l_i$ - 1. We initialize extraSpace[*j*,*j*] to be $M - l_j$.
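The backward update can be illustrated with a short sketch. The names `extra_space_column` and `direct` and the sample lengths are hypothetical; the sketch fills in extraSpace[*i*, *j*] for one fixed *j* with constant work per step and compares the result with the closed-form definition.

```julia
# Fill col[i] = extraSpace[i, j] for i = j, j-1, ..., 1 using the O(1) update.
function extra_space_column(l::Vector{Int}, M::Int, j::Int)
    col = Vector{Int}(undef, j)
    col[j] = M - l[j]                      # a line holding only word j
    for i in (j - 1):-1:1
        col[i] = col[i + 1] - l[i] - 1     # prepend word i plus one separating space
    end
    return col
end

# Closed-form definition, re-summing the lengths each time (O(j - i) per call).
direct(l, M, i, j) = M - (j - i) - sum(l[i:j])

l = [3, 2, 2, 5]   # made-up word lengths
M = 6
col = extra_space_column(l, M, 3)
println(col)                                              # [-3, 1, 4]
println(all(col[i] == direct(l, M, i, 3) for i in 1:3))   # true
```

A negative entry, such as `col[1]` here, means words 1 through 3 do not fit on one line.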

### Notation {-}
Let *c*[*j*] be the minimum cost of printing words 1 through *j*.
Let *p*[*j*] be the index of the first word on the last line in an optimal printing of words 1 through *j*, i.e., the word that immediately follows the final line break.

### Recurrence {-}
$$
\mathrm{c}[j] = \begin{cases}
    0 & \text{if j = 0}  \\ 
    \min_{1 \le i \le j} \left( c[i-1] + \mathrm{spaceCost}(i,j) \right)  & \text{otherwise.}
\end{cases}
$$
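As a worked instance, assume word lengths *l* = (3, 2, 2, 5) and *M* = 6, so *n* = 4 (illustrative numbers only). Dropping the terms whose words do not fit on one line (cost $\infty$), the recurrence gives

$$
\begin{aligned}
c[0] &= 0 \\
c[1] &= c[0] + 3^3 = 27 \\
c[2] &= \min(c[1] + 4^3,\ c[0] + 0^3) = 0 \\
c[3] &= \min(c[2] + 4^3,\ c[1] + 1^3) = 28 \\
c[4] &= c[3] + 0 = 28
\end{aligned}
$$

The minimum for *c*[3] is attained at *i* = 2 and the one for *c*[4] at *i* = 4, so the three lines hold word 1, words 2 to 3, and word 4, for a total cost of 28.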

In [9]:
### Pseudo-code
# Pre-condition: n >= 1, M >= 1, l is an array of integers greater than 0 and at most M
# Post-condition: Returns an array c as defined above 
#                 Returns an array p with each value being an index of the word a line starts on
function printNeatly(l::Array, n::Integer, M::Integer)
    c = Array{Float64}(undef, n)
    p = Array{Int}(undef, n) # integer indices, so they can be used to index words
    maxWords = cld(M, 2) # at most ceil(M/2) words fit on a line (integer arithmetic)
    
    for j in 1:n    # Calculate c[j] and p[j] 
        c[j] = Inf
        extraSpace = M - l[j] # base case: extraSpace[j,j] = M - l[j]
        
        for i in j:-1:max(1, j - maxWords + 1) # iterate in reverse
            if i < j # extraSpace[i,j] = extraSpace[i+1, j] - l[i] - 1
                extraSpace = extraSpace - l[i] - 1
            end
            prev = i == 1 ? 0.0 : c[i-1] # c[0] = 0 is the base case of the recurrence
            cost = prev + spaceCost(extraSpace, i, j, n)
            if cost < c[j]
                c[j] = cost
                p[j] = i
            end
        end
    end
    
    return (c, p)
end

# Pre-condition: True
# Post-condition: Returns the spaceCost for a line with words i to j as defined above
function spaceCost(extraSpace::Integer, i::Integer, j::Integer, n::Integer)
    if extraSpace < 0 # words i through j don't fit on the line
        return Inf
    elseif extraSpace >= 0 && j == n # true last line
        return 0
    else # everything else
        return extraSpace^3
    end
end

# Pre-condition: the length of words is greater than or equal to 1
#                1 <= j <= length of words
# Post-condition: Prints words 1 through j neatly and returns the number of lines printed
function printParagraph(words::Array, p::Array, j::Integer)
    i = Int(p[j]) # first word on the line that ends with word j
    if i == 1
        k = 1
    else
        k = printParagraph(words, p, i - 1) + 1
    end
    
    # words i through j will appear on the kth line, separated by single spaces
    lineK = join(words[i:j], " ")
    println(lineK) # printing line
    return k
end

# Pre-condition: n >= 1, M >= 1, every word in words has at most M characters
# Post-condition: Prints a paragraph of words neatly
function main(words::Array, n::Integer, M::Integer)
    l = Array{Int}(undef, n) # array with the length of each word in words
    for i in 1:n 
        l[i] = length(words[i]) # store the length of word i
    end
    
    c, p = printNeatly(l, n, M) # obtain the p array
    k = printParagraph(words, p, n) # print the paragraph
end

main (generic function with 2 methods)

### Time Complexity {-}
*printNeatly* takes *O*($\frac{nM}{2}$) = *O*(*nM*) time to run. Since we are only storing arrays *c* and *p* of length *n*, it takes $\Theta$(*n*) space.

*spaceCost* takes $\Theta$(1) time and space.

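Let *T*(*j*) be the time *printParagraph* takes on words 1 through *j*. If the last line holds words *i* through *j*, the call does *O*(*j* - *i* + 1) work to build and print that line (treating each word as a constant-size unit) and then recurses on words 1 through *i* - 1. The recurrence is therefore *T*(*j*) $\le$ *T*(*i* - 1) + *a*(*j* - *i* + 1) for some positive real constant *a*, with *T*(0) = 0.

The solution to this recurrence is *T*(*n*) = *O*(*n*). We verify this through substitution, assuming *T*(*i* - 1) $\le$ *a*(*i* - 1):

*T*(*j*) $\le$ *T*(*i* - 1) + *a*(*j* - *i* + 1)

$\le$ *a*(*i* - 1) + *a*(*j* - *i* + 1)

= *aj*

= *O*(*j*)

Thus, the *printParagraph* method takes *O*(*n*) time. Its recursion depth equals the number of lines printed, which is at most *n*, so it uses *O*(*n*) stack space in the worst case. $\blacksquare$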

*main* takes *O*(*n*) time and space to create an array with the length of each word. It then makes calls to *printNeatly* and *printParagraph*. Thus, **our main method takes *O*(*nM*) time and *O*(*n*) space**.
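For a quick end-to-end check, the whole procedure can be condensed into one self-contained function. This is a sketch under stated assumptions, not the code above: the name `neat_lines` and the sample words are hypothetical, the lines are returned instead of printed, the traceback is iterative, and every word is assumed to fit on a line by itself.

```julia
# Returns (minimum total cost, the lines of a neatly printed paragraph).
function neat_lines(words::Vector{String}, M::Int)
    n = length(words)
    l = length.(words)
    c = fill(Inf, n + 1)       # c[j + 1] = optimal cost of words 1..j; c[1] is the base case
    c[1] = 0.0
    p = zeros(Int, n)          # p[j] = first word on the last line of words 1..j
    maxWords = cld(M, 2)       # at most ceil(M / 2) words fit on one line

    for j in 1:n
        extra = M - l[j]
        for i in j:-1:max(1, j - maxWords + 1)
            if i < j
                extra -= l[i] + 1              # prepend word i and one space
            end
            extra < 0 && break                 # longer lines only overflow further
            cost = j == n ? 0.0 : float(extra)^3
            if c[i] + cost < c[j + 1]          # c[i] is the cost of words 1..i-1
                c[j + 1] = c[i] + cost
                p[j] = i
            end
        end
    end

    lines = String[]           # rebuild the lines from the back
    j = n
    while j >= 1
        i = p[j]
        pushfirst!(lines, join(words[i:j], " "))
        j = i - 1
    end
    return c[n + 1], lines
end

cost, lines = neat_lines(["aaa", "bb", "cc", "ddddd"], 6)
foreach(println, lines)   # prints "aaa", "bb cc", "ddddd" on separate lines
println(cost)             # 28.0
```

The greedy alternative `aaa bb` / `cc` / `ddddd` would cost 0 + 64 = 64, which is why the dynamic program prefers to leave slack on the first line.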