# Advanced Certification Program in Computational Data Science
## A program by IISc and TalentSprint
### Mini Project Solution Notebook: Linear Algebra and Calculus

**DISCLAIMER:** THIS NOTEBOOK IS PROVIDED ONLY AS A REFERENCE SOLUTION NOTEBOOK FOR THE MINI-PROJECT. THERE MAY BE OTHER POSSIBLE APPROACHES/METHODS TO ACHIEVE THE SAME RESULTS.

## Problem Statement

 The task is to advise a petroleum company on how to meet the demands of their customers for motor oil, diesel oil and gasoline.

## Learning Objectives

At the end of the Mini-project, you will be able to

* create arrays and matrices in Python
* understand the concepts of linear equations
* solve the system of linear equations

### Data

From a barrel of crude oil, in one day, factory $A$ can produce
* 20 gallons of motor oil,
* 10 gallons of diesel oil, and
* 5 gallons of gasoline

Similarly, factory $B$ can produce
* 4 gallons of motor oil,
* 14 gallons of diesel oil, and
* 5 gallons of gasoline

while factory $C$ can produce
* 4 gallons of motor oil,
* 5 gallons of diesel oil, and
* 12 gallons of gasoline

There is also waste in the form of paraffin, among other things. Factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude, factory $B$ 5 gallons, and factory $C$ 2 gallons.

**Note:** Your conclusion should include a discussion of the nature of the terms *unique*, *no solution*, *overdetermined* and *underdetermined* as they apply in the context of the oil plants.

## Grading = 10 Points

### Create an array

Create an array of size 2x3 with arbitrary values.

In [None]:
import numpy as np
a = np.array([[1, 2, 3],
              [4, 5, 6]])
a

### Create the system of Linear Equations

Suppose the current daily demand from distributors is 6600 gallons of motor oil, 5100 gallons of diesel oil and 3100 gallons of gasoline.

Set up the system of equations which describes the above situation. Please include the units as well.

Let the number of barrels used by factories $A$, $B$ and $C$ be $x$, $y$ and $z$ respectively.

Then, with both sides of each equation measured in gallons per day, the system of linear equations will be

$$\text{Motor oil:}\quad 20x + 4y + 4z = 6600$$

$$\text{Diesel oil:}\quad 10x + 14y + 5z = 5100$$

$$\text{Gasoline:}\quad 5x + 5y + 12z = 3100$$

### Solve the system of Linear Equations (2 points)

How many barrels of crude oil should each plant get in order to meet the demand as a group? Remember that we can only provide each plant with an integral number of barrels.

In [None]:
from scipy.linalg import solve
A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]])
b = np.array([6600, 5100, 3100])
X1 = solve(A,b)
X1

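The exact solution is $(x, y, z) = (287.25, 128.75, 85)$. Since each plant can only receive a whole number of barrels, rounding up gives 288, 129 and 85 barrels for factories $A$, $B$ and $C$ respectively.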
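A quick check that a rounded-up plan still covers the demand can be sketched as follows. This is only an illustration: the names `exact`, `barrels` and `surplus` are introduced here, and the `np.round(..., 9)` guard is an assumption added so that floating-point noise cannot push a whole number such as 85 up to 86.

```python
import numpy as np

A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]], dtype=float)
b = np.array([6600, 5100, 3100], dtype=float)

# Exact (fractional) number of barrels per factory
exact = np.linalg.solve(A, b)

# Round up to whole barrels, after removing floating-point noise
barrels = np.ceil(np.round(exact, 9)).astype(int)

# Gallons produced beyond the demand for each product
surplus = A @ barrels - b
print(barrels, surplus)
```

With whole barrels the demand is met with a small surplus of each product (16, 11 and 5 gallons) rather than exactly.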

Suppose the total demand for all products **doubled**. What would the solution now be? How does it compare to the original solution? Why, mathematically, should this have been expected?

In [None]:
A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]])
b = np.array([6600 * 2, 5100 * 2, 3100 * 2])
X2 = solve(A,b)
X2

From the above it can be seen that when the demand doubles, the number of barrels required also doubles. This is expected because the system is linear: if $Ax = b$, then $A(2x) = 2(Ax) = 2b$.

Suppose that the company acquires another group of distributors and that the daily demand of this group is 2000 gallons of motor oil, 4000 gallons of gasoline, and 4000 gallons of diesel oil. How would you set up production of just this supply? Are there any options (more than one way)?

In [None]:
A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]])
b = np.array([2000, 4000, 4000])
X3 = solve(A,b)
X3

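From the above it can be seen that for the demand (2000, 4000, 4000) the exact solution is (12.5, 187.5, 250), so after rounding up, factories $A$, $B$ and $C$ need 13, 188 and 250 barrels respectively. Because the coefficient matrix is invertible (its determinant is 2400, which is nonzero), the solution is *unique*, so this is the only way to fulfill the demand exactly.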

Next, calculate the needs of each factory (in barrels of crude, as usual) to meet the total demand of both groups of distributors. When you have done this, compare your answer to results already obtained. What mathematical conclusion can you draw?

In [None]:
A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]])
b = np.array([6600 + 2000, 5100 + 4000, 3100 + 4000])
X4 = solve(A,b)
X4

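In [None]:
# Compare with a tolerance; exact equality of floats can fail due to rounding error
np.allclose(X1 + X3, X4)

From the above we can conclude that the number of barrels required by the plants to meet the total demand (6600 + 2000, 5100 + 4000, 3100 + 4000) is equal to the sum of the barrels required to meet the two demands individually. This is again a consequence of linearity: if $Ax_1 = b_1$ and $Ax_2 = b_2$, then $A(x_1 + x_2) = b_1 + b_2$.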
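Both observations so far (doubling and adding demands) are instances of the same linearity property. A minimal sketch that checks them together is given below; the variable names are introduced here for illustration, and `np.linalg.solve` is used in place of `scipy.linalg.solve` only to keep the snippet self-contained.

```python
import numpy as np

A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]], dtype=float)
b1 = np.array([6600, 5100, 3100], dtype=float)   # original distributors
b2 = np.array([2000, 4000, 4000], dtype=float)   # new distributors

# Solve for both demand vectors at once by stacking them as columns
X = np.linalg.solve(A, np.column_stack([b1, b2]))
x1, x2 = X[:, 0], X[:, 1]

doubled = np.linalg.solve(A, 2 * b1)
combined = np.linalg.solve(A, b1 + b2)

print(np.allclose(doubled, 2 * x1))      # scaling the demand scales the solution
print(np.allclose(combined, x1 + x2))    # adding demands adds the solutions
```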

### Sensitivity and Robustness (1 point)

In real life applications, constants are rarely ever exactly equal to their stated value; certain amounts of uncertainty are always present. This is part of the reason for the science of statistics. In the above model, the daily productions for the plants would be averages over a period of time. Explore what effect small changes in the parameters have on the output.

To do this, pick any 3 coefficients, one at a time, and increase or decrease them by 3%. For each case, note what effect this has on the solution, as a percentage change. Can you draw any overall conclusion?

In [None]:
# Increasing the coefficient value for factory A by 3%
A = np.array([[20 * 1.03, 4, 4],
              [10 * 1.03, 14, 5],
              [5 * 1.03, 5, 12]])

b = np.array([6600, 5100, 3100])
X5 = solve(A,b)
X5

In [None]:
# Percentage change in solution
(X5 - X1) / X1 * 100

From the above it can be seen that after increasing the coefficients of factory $A$ by 3%, the barrel requirement for factory $A$ decreases by about 2.9% (it is scaled by $1/1.03$), while the requirements for factories $B$ and $C$ are unchanged.

Now, let's decrease the coefficients by 3%.

In [None]:
# Decreasing the coefficient value for factory A by 3%
A = np.array([[20 * 0.97, 4, 4],
              [10 * 0.97, 14, 5],
              [5 * 0.97, 5, 12]])

b = np.array([6600, 5100, 3100])
X6 = solve(A,b)
X6

In [None]:
# Percentage change in solution
(X6 - X1) / X1 * 100

After decreasing the coefficients by 3%, the barrel requirement for factory $A$ increases by about 3.1% (it is scaled by $1/0.97$), while the requirements for factories $B$ and $C$ still remain unchanged.

Similarly, for factory $B$ we can do the same.

In [None]:
# Increasing the coefficient value for factory B by 3%
A = np.array([[20, 4 * 1.03, 4],
              [10, 14 * 1.03, 5],
              [5, 5 * 1.03, 12]])

b = np.array([6600, 5100, 3100])
X7 = solve(A,b)
X7

In [None]:
# Percentage change in solution
(X7 - X1) / X1 * 100

In [None]:
# Decreasing the coefficient value for factory B by 3%
A = np.array([[20, 4 * 0.97, 4],
              [10, 14 * 0.97, 5],
              [5, 5 * 0.97, 12]])

b = np.array([6600, 5100, 3100])
X8 = solve(A,b)
X8

In [None]:
# Percentage change in solution
(X8 - X1) / X1 * 100

The results are similar for factory $B$ as well.

So, we conclude that if all the coefficients of one factory are increased by n%, the number of barrels that factory needs decreases by approximately n%, and vice versa, while the requirements of the other factories are unaffected.
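The cells above scale all three coefficients of one factory together. The task statement also allows changing a single coefficient at a time; a minimal sketch of that variant is shown below. The helper `percent_change` and the choice of the three diagonal entries are assumptions made here for illustration, not part of the original solution.

```python
import numpy as np

A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]], dtype=float)
b = np.array([6600, 5100, 3100], dtype=float)
baseline = np.linalg.solve(A, b)

def percent_change(row, col, factor):
    """Scale one entry of A by `factor`; return the % change in each plant's barrels."""
    perturbed = A.copy()
    perturbed[row, col] *= factor
    return (np.linalg.solve(perturbed, b) - baseline) / baseline * 100

# Three arbitrarily chosen coefficients, each increased by 3%
for row, col in [(0, 0), (1, 1), (2, 2)]:
    print((row, col), np.round(percent_change(row, col, 1.03), 2))

# Condition number of A: a rough bound on how strongly relative errors can be amplified
print(np.linalg.cond(A))
```

When only one entry changes, the other plants are affected as well. For example, raising factory $A$'s motor oil yield by 3% changes the three requirements by roughly -3.5%, +5.1% and +1.6%, so the effect stays of the same order as the perturbation.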

### A Plant Off-Line (1 point)

Suppose factory $C$ is shut down by the EPA (Environmental Protection Agency) temporarily for excessive emissions into the atmosphere. If your demand is as it was originally (6600, 5100, 3100), what would you now say about the company's ability to meet it? What do you recommend they schedule for production now?

Let the number of barrels used by factories $A$ and $B$ be $x$ and $y$ respectively.

Then the system of linear equations will be

$$20x + 4y = 6600$$

$$10x + 14y = 5100$$

$$5x + 5y = 3100$$

The above system has 3 equations and 2 unknowns ($x$ and $y$). It is *overdetermined* because there are more equations than unknowns, so in general no single pair $(x, y)$ satisfies all of them.

In [None]:
# For the first and second equations
A = np.array([[20, 4], [10, 14]])
b = np.array([6600, 5100])
X9 = solve(A,b)
X9

In [None]:
# For the first and third equations
A = np.array([[20, 4], [5, 5]])
b = np.array([6600, 3100])
X10 = solve(A,b)
X10

In [None]:
# For the second and third equations
A = np.array([[10, 14], [5, 5]])
b = np.array([5100, 3100])
X11 = solve(A,b)
X11

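There is one solution for each pair of linear equations: (300, 150) for the first and second equations, (258, 363) for the first and third (rounded up from (257.5, 362.5)), and (895, -275) for the second and third. We can discard the last one since the number of barrels can't be negative. However, there is no exact solution that satisfies all three equations simultaneously. The plan (300, 150) yields only 2250 gallons of gasoline, 850 short of the demand, whereas (257.5, 362.5) meets the motor oil and gasoline demand exactly and overproduces diesel oil by 2550 gallons, so the demand can only be covered by accepting a surplus.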
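If the company would rather balance shortfalls against surpluses than satisfy some equations exactly, one possible criterion (an assumption introduced here, not something the task prescribes) is the least-squares solution of the overdetermined system. The sketch below also confirms the inconsistency by comparing matrix ranks.

```python
import numpy as np

# Rows: motor oil, diesel oil, gasoline; columns: factories A and B
A = np.array([[20, 4],
              [10, 14],
              [5, 5]], dtype=float)
b = np.array([6600, 5100, 3100], dtype=float)

# The system is inconsistent when the augmented matrix has a higher rank than A
rank_A = np.linalg.matrix_rank(A)
rank_aug = np.linalg.matrix_rank(np.column_stack([A, b]))

# Barrels that minimise the squared mismatch between production and demand
plan, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
shortfall = b - A @ plan   # positive = unmet demand, negative = surplus

print(rank_A, rank_aug)
print(np.round(plan, 2), np.round(shortfall, 1))
```

Least squares treats one gallon of shortfall and one gallon of surplus as equally bad, which may not match the company's priorities; it is shown only as one way to pick a compromise schedule.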

### Buying another plant

#### (Note the following given information. You will see questions in continuation of this in the subsequent sections.)


This situation has caused enough concern that the CEO is considering buying another plant, identical to the third, and using it permanently. Assuming that all 4 plants are on line, what production do you recommend to meet the current demand (5000, 8500, 10000)? In general, what can you say about any increased flexibility that the 4th plant might provide?

Let the number of barrels used by factories $A$, $B$, $C$ and $D$ be $x$, $y$, $z$ and $w$ respectively.

Then the system of linear equations will be

$$20x + 4y + 4z + 4w = 5000$$

$$10x + 14y + 5z + 5w = 8500$$

$$5x + 5y + 12z + 12w = 10000$$

The above system of linear equations has fewer equations than variables, hence it is *underdetermined* and cannot have a unique solution. In this case, there are either infinitely many solutions or no exact solution. We can solve it by treating $w$ as a free parameter and using the [rref](http://linear.ups.edu/html/section-RREF.html) form to solve the system for $x$, $y$ and $z$.

For the rref implementation in Python, see [here](https://docs.sympy.org/latest/tutorial/matrices.html#rref).

In [None]:
import sympy as sy

# create symbol 'w'
w = sy.Symbol("w")
A_aug = sy.Matrix([[20, 4, 4, 5000-4*w],
                   [10, 14, 5, 8500-5*w],
                   [5, 5, 12, 10000-12*w]])
# show rref form
A_aug.rref()

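From the above result, it can be seen that $x = 195/4$, $y = 1325/4$ and $z = 675 - w$. The 4th plant therefore only shares the barrels required by the 3rd plant (any split with $z + w = 675$ and $z, w \ge 0$ works), while the requirements of the 1st and 2nd plants remain unaffected.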
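The one-parameter family of solutions can be checked numerically. The sketch below is illustrative: `plan_for` is a helper introduced here, and the sample values of `w` are arbitrary choices between 0 and 675.

```python
import numpy as np

# Columns: factories A, B, C and D (D is identical to C)
A4 = np.array([[20, 4, 4, 4],
               [10, 14, 5, 5],
               [5, 5, 12, 12]], dtype=float)
b = np.array([5000, 8500, 10000], dtype=float)

def plan_for(w):
    """Barrels (x, y, z, w) when plant D processes w barrels."""
    return np.array([195 / 4, 1325 / 4, 675 - w, w])

# Every split of the 675 barrels between C and D meets the demand exactly
for w in (0, 200, 337.5, 675):
    print(w, np.allclose(A4 @ plan_for(w), b))
```

The extra plant adds scheduling flexibility (how to split the load between $C$ and $D$) but does not change what factories $A$ and $B$ must process.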

### Calculate the amount of Paraffin supplied (1 point)

The company has just found a candle company that will buy its paraffin. Under the current conditions (i.e., after buying another plant) for demand (5000, 8500, 10000), how much can be supplied to them per day?

According to the problem statement, factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude oil, factory $B$ 5 gallons, and factory $C$ 2 gallons. Factory $D$ is identical to factory $C$, so it also produces 2 gallons per barrel.

In [None]:
import sympy as sy

# create symbol 'w'
w = sy.Symbol("w")
A_aug = sy.Matrix([[20, 4, 4, 5000-4*w],
                   [10, 14, 5, 8500-5*w],
                   [5, 5, 12, 10000-12*w]])
# show rref form
A_aug.rref()

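In [None]:
# Round A and B up to whole barrels; C and D together process 675 barrels
paraffin_from_A = np.ceil(195/4) * 3
paraffin_from_B = np.ceil(1325/4) * 5
paraffin_from_C_and_D = 675 * 2
paraffin_supplied_per_day = paraffin_from_A + paraffin_from_B + paraffin_from_C_and_D
paraffin_supplied_per_day

Therefore, about 3157 gallons of paraffin can be supplied to the candle company per day.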
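The same total can be written as a dot product of per-barrel paraffin yields with the barrel counts, which makes it easy to see that the result does not depend on how the 675 barrels are split between plants $C$ and $D$. The helper `daily_paraffin` below is a hypothetical name used only for this sketch.

```python
import numpy as np

# Gallons of paraffin per barrel for factories A, B, C and D
paraffin_per_barrel = np.array([3, 5, 2, 2])

def daily_paraffin(w):
    """Total paraffin (gallons/day) when plant D processes w of the 675 shared barrels."""
    barrels = np.array([np.ceil(195 / 4), np.ceil(1325 / 4), 675 - w, w])
    return paraffin_per_barrel @ barrels

print([daily_paraffin(w) for w in (0, 300, 675)])
```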

### Selling the first plant (1 point)

The management is also considering selling the first plant due to aging equipment and high workman's compensation costs for the state it is located in. They would like to know what this would do to their production capability. Specifically, they would like an example of a demand they could not meet with only plants 2 and 3, and also what effect having plant 4 has (recall it is identical to plant 3). They would also like an example of a demand that they could meet with just plants 2 and 3. Any general statements you could make here would be helpful.

Let the number of barrels used by factories $B$, $C$ and $D$ be $y$, $z$ and $w$ respectively.

When considering only plants 2 and 3 and the demand (5000, 8500, 10000), we have

$$4y + 4z = 5000$$

$$14y + 5z = 8500$$

$$5y + 12z = 10000$$

The system becomes *overdetermined*.

In [None]:
# For the first and second equations
A = np.array([[4, 4], [14, 5]])
b = np.array([5000, 8500])
X12 = solve(A,b)
X12

In [None]:
# For the first and third equations
A = np.array([[4, 4], [5, 12]])
b = np.array([5000, 10000])
X13 = solve(A,b)
X13

In [None]:
# For the second and third equations
A = np.array([[14, 5], [5, 12]])
b = np.array([8500, 10000])
X14 = solve(A,b)
X14

There is one solution for each pair of linear equations (rounded up to whole barrels): (250, 1000) for the first and second equations, (715, 536) for the first and third, and (364, 682) for the second and third. However, there is no exact solution that satisfies all three simultaneously.

Now take the 4th plant into consideration.
As before, let the number of barrels used by factories $B$, $C$ and $D$ be $y$, $z$ and $w$ respectively.

Then for demand (5000, 8500, 10000) the system of linear equations will be

$$4y + 4z + 4w = 5000$$

$$14y + 5z + 5w = 8500$$

$$5y + 12z + 12w = 10000$$

Solve it using rref form.

In [None]:
import sympy as sy
A_aug = sy.Matrix([[4, 4, 4, 5000],
                   [14, 5, 5, 8500],
                   [5, 12, 12, 10000]])
A_aug.rref()

From the above it can be seen that this system of equations has *no exact solution*: the last row of the rref corresponds to the contradiction $0 = 1$.

Now, change the demand to (6600, 5100, 3100) and solve the system of equations using the rref form.

In [None]:
import sympy as sy
A_aug = sy.Matrix([[4, 4, 4, 6600],
                   [14, 5, 5, 5100],
                   [5, 12, 12, 3100]])
A_aug.rref()

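Therefore, neither of these two demands can be met exactly, with or without plant 4. Because plant 4 is identical to plant 3, it adds capacity but no new product mix: a demand can be met exactly only if it is a non-negative combination of the per-barrel outputs $(4, 14, 5)$ and $(4, 5, 12)$. For example, the demand (800, 1900, 1700) is met with 100 barrels at plant 2 and 100 barrels at plant 3.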
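Whether plants 2 and 3 alone can meet a given demand exactly can be tested numerically. The sketch below is illustrative: `can_meet` is a helper introduced here, and the demand (800, 1900, 1700) is a hypothetical example constructed as 100 barrels at each plant.

```python
import numpy as np

# Columns: per-barrel output of plants 2 and 3 (rows: motor oil, diesel oil, gasoline)
P = np.array([[4, 4],
              [14, 5],
              [5, 12]], dtype=float)

def can_meet(demand):
    """True if some non-negative number of barrels meets the demand exactly."""
    demand = np.asarray(demand, dtype=float)
    barrels, _, _, _ = np.linalg.lstsq(P, demand, rcond=None)
    exact = np.allclose(P @ barrels, demand)
    return bool(exact and np.all(barrels >= -1e-9))

print(can_meet([800, 1900, 1700]))      # 100 barrels at each plant
print(can_meet([5000, 8500, 10000]))    # the demand used above
```

Adding plant 4 would only append a copy of the second column, so the set of demands that pass this test would stay the same.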

### Set rates for Products (1 point)

The company wants to set the rates of motor oil, diesel oil, and gasoline. For this purpose they have a few suggestions, given as follows:

* 100, 66, 102 Rupees per gallon,

* 104, 64, 100 Rupees per gallon,

* 102, 68, 98 Rupees per gallon, and

* 96, 68, 100 Rupees per gallon

for motor oil, diesel oil, and gasoline respectively.

Using matrix multiplication, find the rates which result in the maximum total price.

Let $M$ denote the matrix whose rows represent the different plants (A, B and C) and whose columns represent the different products (motor oil, diesel oil and gasoline), so that each entry is the amount of that product obtained from one barrel of crude oil at that plant.

$$M = \begin{bmatrix}
20 & 10 & 5 \\
4 & 14 & 5  \\
 4 & 5 & 12 \end{bmatrix}$$

Also, $R$ is a matrix having different rates as its columns.

$$R = \begin{bmatrix}
100 & 104 & 102 & 96 \\
66 & 64 & 68 & 68  \\
102 & 100 & 98 & 100 \end{bmatrix}$$

In [None]:
# Matrix multiplication
M = np.array([[20, 10, 5],
              [4, 14, 5],
              [4, 5, 12]])

R = np.array([[100, 104, 102, 96],
              [66, 64, 68, 68],
              [102, 100, 98, 100]])

M @ R

The resultant matrix contains total prices for each plant, taking different rates into consideration.

In order to get the total price we need to sum up the prices for each plant and take those rates for which the total price is maximum.

In [None]:
# Total price for each rate: sum over plants (rows) for every set of rates (columns)
total_price = (M @ R).sum(axis=0)
print("Price: ", total_price)

# Get index of highest value in array
index = np.argmax(total_price)

# Extract desired rates
rate = R[:,index]
print("Desired rate: ", rate)

Hence, the desired rates are 102, 68, 98 Rupees per gallon for motor oil, diesel oil, and gasoline respectively.
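Since only the total over all plants matters, the same ranking can be obtained by first adding up the gallons of each product over the three plants and then multiplying by the rates. The sketch below uses the four suggested rate sets from the list above; the names `gallons_per_product`, `totals` and `best` are introduced here for illustration.

```python
import numpy as np

M = np.array([[20, 10, 5],
              [4, 14, 5],
              [4, 5, 12]])

# One column per suggested set of rates (motor oil, diesel oil, gasoline)
R = np.array([[100, 104, 102, 96],
              [66, 64, 68, 68],
              [102, 100, 98, 100]])

# Gallons of each product from one barrel at every plant combined
gallons_per_product = M.sum(axis=0)

# Total price for each set of rates
totals = gallons_per_product @ R
best = R[:, np.argmax(totals)]
print(totals, best)
```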

### Marginal Cost (1 point)

The total cost $C(x)$ in Rupees, associated with the production of $x$ gallons of gasoline is given by

$$C(x) = 0.005 x^3 - 0.02 x^2 + 30x + 5000$$

Find the marginal cost when $22$ gallons are produced, where by marginal cost we
mean the instantaneous rate of change of total cost at any level of output.

**Solution:** Since marginal cost is the rate of change of total cost with respect to the
output, we have

Marginal cost (MC) given by,

$$\frac{dC}{dx} = 0.005(3 x^2) - 0.02(2 x) + 30$$

When $x = 22$,

$$MC = 0.015(22^2) - 0.04(22) + 30$$

$$= 7.26 - 0.88 + 30 = 36.38$$

Hence, the required marginal cost is 36.38 Rupees per gallon.

We can also verify this by differentiating symbolically with the `sympy` package.

In [None]:
import sympy as sy

x = sy.Symbol("x")
cost = 0.005 * x**3 - 0.02 * x**2 + 30 * x + 5000
# Differentiate symbolically, then evaluate at x = 22
sy.diff(cost, x).subs(x, 22)
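A numerical cross-check that does not depend on any particular differentiation routine is the central difference $\frac{f(x+h) - f(x-h)}{2h}$. The sketch below is a minimal illustration; the helper `central_difference` and the step size `h = 1e-4` are choices made here, not part of the original solution.

```python
def central_difference(f, x, h=1e-4):
    """Approximate f'(x) with a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

def total_cost(x):
    """Total cost in Rupees of producing x gallons of gasoline."""
    return 0.005 * x**3 - 0.02 * x**2 + 30 * x + 5000

marginal_cost = central_difference(total_cost, 22)
print(round(marginal_cost, 4))
```

A small step matters here: for a cubic, the central difference overestimates the derivative by $h^2 f'''(x)/6$, so a step as large as $h = 1$ would give about 36.385 instead of 36.38.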

### Marginal Revenue (1 point)

The total revenue in Rupees received from the sale of $x$ gallons of motor oil is given by

$$R(x) = 3x^2 + 36x + 5.$$

Find the marginal revenue, when $x = 28$, where by marginal revenue we mean the rate of change of total revenue with respect to the number of items sold at an instant.

**Solution:** Since marginal revenue is the rate of change of total revenue with respect to the number of units sold, we have

Marginal Revenue (MR) given by,

$$\frac{dR}{dx} = 6x + 36$$

When $x = 28$,

$$MR = 6(28) + 36 = 204$$

Hence, the required marginal revenue is 204 Rupees.

We can also verify this by differentiating symbolically with the `sympy` package.

In [None]:
import sympy as sy

x = sy.Symbol("x")
revenue = 3 * x**2 + 36 * x + 5
# Differentiate symbolically, then evaluate at x = 28
sy.diff(revenue, x).subs(x, 28)
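Marginal revenue is an instantaneous rate, so it only approximates the revenue from selling one more whole gallon. The short sketch below compares the two; the function name `revenue` and the variable names are introduced here for illustration.

```python
def revenue(x):
    """Total revenue in Rupees from selling x gallons of motor oil."""
    return 3 * x**2 + 36 * x + 5

# Instantaneous rate of change at x = 28, from dR/dx = 6x + 36
marginal_revenue = 6 * 28 + 36

# Actual extra revenue from selling the 29th gallon
extra_from_next_gallon = revenue(29) - revenue(28)

print(marginal_revenue, extra_from_next_gallon)
```

The derivative gives 204 while the discrete increment is 207. The difference of 3 is the $3(\Delta x)^2$ term, which vanishes as the increment shrinks.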

### Pour crude oil in tank (1 point)

Crude oil is being poured into a cylindrical tank of radius 10 meters at the rate of 314 cubic meters per hour. Find

* the rate at which the height of crude oil is increasing in the tank, and
* the height of crude oil in tank after 2 hours.

**Solution:** Let $m$ be the rate at which the height of the crude oil in the tank is increasing.

Then the height of crude oil in cylinder, increasing with time t, is given by

$$h = m \times t$$

And volume of crude oil inside cylinder is given by,

$$V = \pi r^2 h$$

Then, rate of change of volume is given by,

$$\frac{dV}{dt} = \pi r^2 m$$

In [None]:
import numpy as np
import sympy as sym

# radius
r = 10
# time symbol
t = sym.Symbol('t')
# height rate symbol
m = sym.Symbol('m')
# height of crude oil in the tank
h = m * t
# volume of crude oil in the tank
v = np.pi * r**2 * h

# differentiate the volume with respect to time
sym.Derivative(v, t).doit()

In [None]:
# Given volume flow rate = 314 cubic meters/hour, solve dV/dt = pi * r**2 * m for m
m = 314 / (np.pi * r**2)
print("Rate at which height of crude oil is increasing: ", m)

Therefore, the height of the crude oil in the tank is increasing at approximately $1$ meter/hour.

Now, let's calculate the height after two hours:

In [None]:
t = 2
h_2 = m * t
print("Height of crude oil in tank after 2 hours: ", h_2)
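The whole calculation can also be carried out symbolically, which keeps $\pi$ exact until the final conversion to a decimal. The sketch below is one possible way to do this with `sympy`; the variable names `inflow`, `rate` and `height_after_2h` are introduced here for illustration.

```python
import sympy as sym

r = 10          # tank radius in meters
inflow = 314    # cubic meters of crude oil per hour
m = sym.Symbol("m", positive=True)   # rate of change of height, meters/hour

# dV/dt = pi * r**2 * m must equal the inflow
rate = sym.solve(sym.Eq(sym.pi * r**2 * m, inflow), m)[0]

# Height grows linearly, so after 2 hours it is 2 * rate
height_after_2h = 2 * rate

print(rate, float(rate), float(height_after_2h))
```

The exact rate is $314/(100\pi)$ meters/hour, which is just under 1, so the height after 2 hours is just under 2 meters.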