<a href="https://colab.research.google.com/github/soumik-mukherjee/amplify-gatsby-webapp-template/blob/master/M0_Mini_Project_02_Linear_Algebra_and_Calculus.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced Certification Program in Computational Data Science
## A program by IISc and TalentSprint
### Mini Project Notebook 2: Linear Algebra and Calculus
(Ungraded Mini-Project)

## Problem Statement

 The task is to advise a petroleum company on how to meet the demands of their customers for motor oil, diesel oil and gasoline.

## Learning Objectives

At the end of the experiment, you will be able to

* create arrays and matrices in Python
* understand the concepts of linear equations
* solve the system of linear equations

### Data

From a barrel of crude oil, in one day, factory $A$ can produce 
* 20 gallons of motor oil, 
* 10 gallons of diesel oil, and 
* 5 gallons of gasoline 

Similarly, factory $B$ can produce 
* 4 gallons of motor oil, 
* 14 gallons of diesel oil, and 
* 5 gallons of gasoline
 
while factory $C$ can produce 
* 4 gallons of motor oil,
* 5 gallons of diesel oil, and 
* 12 gallons of gasoline 

There is also waste in the form of paraffin, among other things. Factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude, factory $B$ 5 gallons, and factory $C$ 2 gallons.

**Note:** Your conclusion should include a discussion of the nature of the terms *unique*, *no solution*, *overdetermined* and *underdetermined* as they apply in the context of the oil plants.

### Create an array 

Create an array of size 2x3 with arbitrary values.

In [None]:
import numpy as np
data = np.random.rand(2,3)
data

array([[0.99300416, 0.69038323, 0.87039337],
       [0.34831392, 0.14372821, 0.0513761 ]])

In [None]:
# All other imports
import sympy as sy
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="darkgrid")

### Create the system of Linear Equations

Suppose the current daily demand from distributors is 6600 gallons of motor oil, 5100 gallons of diesel oil and 3100 gallons of gasoline.

Set up the system of equations which describes the above situation. Please include the units as well.

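Let the number of barrels of crude oil used per day by factories $A$, $B$ and $C$ be $x$, $y$ and $z$ respectively. Each coefficient below is in gallons per barrel, and each right-hand side is in gallons per day.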

Then the system of linear equations will be

$$Motor\ oil:\ \ \ 20x + 4y + 4z = 6600$$

$$Diesel\ oil:\ \ \ 10x + 14y + 5z = 5100$$

$$Gasoline:\ \ \ 5x + 5y + 12z = 3100$$

### Solve the system of Linear Equations

How many barrels of crude oil should each plant get in order to meet the demand as a group? Remember that we can only provide each plant with an integral number of barrels.

#### Solution Prep

Let's initialize a solution vector 
$\mathbf{s} = [s_0, s_1, s_2, s_3]^\top$ 

that will store our results. We want to see all results at the same time and be able to compare them __visually__, e.g. $s_1$ against $s_3$.

<br />
The vector $\mathbf{s}$ has $4$ dimensions because this problem section has $4$ scenarios. So $s_i$ is, in turn, the solution vector for the $i^{th}$ scenario.

The scenarios being,
* $s_0$ - Solve the number of barrels problem given plant production rates & demand
* $s_1$ - Solve same when demand is doubled
* $s_2$ - Solve for new demand, treated in isolation from previous demand
* $s_3$ - Solve for additional demand

<br />

Each scenario vector $s_i$ holds two vectors: the first, $x$, contains the float values that solve the equations in that scenario, and the second, $y$, contains each of those values rounded up to the next integer (the ceiling function), i.e. $y = \lceil x \rceil$

<br />
Finally, both vectors $x$ and $y$ are $3$-dimensional, since there are $3$ plants to solve for, so

${x} \in \mathbb{R}^3, {y} \in \mathbb{N}^3$

In summary, our solution vector is $s \in \mathbb{R}^{4 \times 2 \times 3}$, laid out as below,

$
\begin{bmatrix} 
\begin{bmatrix} x_0 , x_1, x_2 \end{bmatrix}^\top, \begin{bmatrix} y_0 , y_1, y_2 \end{bmatrix}^\top \\ \begin{bmatrix} x_3 , x_4, x_5 \end{bmatrix}^\top, \begin{bmatrix} y_3 , y_4, y_5 \end{bmatrix}^\top \\ \begin{bmatrix} x_6 , x_7, x_8 \end{bmatrix}^\top, \begin{bmatrix} y_6 , y_7, y_8 \end{bmatrix}^\top \\ \begin{bmatrix} x_9 , x_{10}, x_{11} \end{bmatrix}^\top, \begin{bmatrix} y_9 , y_{10}, y_{11} \end{bmatrix}^\top \\ \end{bmatrix}
$

where, $y_i = \lceil x_i \rceil$

In the following step we initialize our solution vector $s$ with zeroes

In [None]:
solutions = np.zeros((4,2,3))
solutions

array([[[0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.]]])

In [None]:
solutions.shape

(4, 2, 3)

#### Scenario 1 Solution

The set of linear equations can be written as $Ax = B$. We start by loading the coefficient matrix $A$.

In [None]:
A = np.array([[20, 4, 4], [10, 14, 5], [5, 5, 12]])
A

array([[20,  4,  4],
       [10, 14,  5],
       [ 5,  5, 12]])

For a unique solution to exist, $\det A \neq 0$ must be satisfied

In [None]:
np.linalg.det(A) != 0

True
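
Comparing a floating-point determinant with exactly zero can be fragile, because round-off may turn the determinant of a singular matrix into a tiny nonzero number. As a minimal sketch of an alternative check, the rank and condition number can be inspected instead. The names `A_check`, `rank` and `cond` are illustrative and not part of the original notebook.

```python
import numpy as np

# Coefficient matrix from the problem statement (gallons per barrel)
A_check = np.array([[20, 4, 4], [10, 14, 5], [5, 5, 12]], dtype=float)

# Full rank (3 for a 3x3 matrix) means Ax = B has exactly one solution
rank = np.linalg.matrix_rank(A_check)

# The condition number indicates how strongly errors in A or B are amplified
cond = np.linalg.cond(A_check)

print("rank:", rank)
print("condition number:", cond)
print("unique solution exists:", rank == A_check.shape[0])
```

A rank equal to the number of unknowns plays the same role as a nonzero determinant, but it is computed with a tolerance rather than an exact comparison.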

Loading the vector of constant terms $B$ (the daily demand)

In [None]:
B = np.array([6600, 5100, 3100])
B

array([6600, 5100, 3100])

Solving for $x$ given $A$ and $B$, and storing the result as the $x$ of $s_{0}$

In [None]:
solutions[0,0] = np.linalg.solve(A, B)
solutions

array([[[287.25, 128.75,  85.  ],
        [  0.  ,   0.  ,   0.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]]])

Calculating $y = \lceil x \rceil$ and storing it as the $y$ of $s_0$

In [None]:
solutions[0,1] = np.ceil(solutions[0,0])
solutions

array([[[287.25, 128.75,  85.  ],
        [288.  , 129.  ,  85.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]]])
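
Because every coefficient in $A$ is non-negative, rounding the barrels up can only increase production, so the whole-barrel plan should still meet the demand. The sketch below checks this and reports the surplus. The variable names and the rounding to 9 decimals before `np.ceil` (a guard against floating-point noise such as `85.00000000000001`) are assumptions added for illustration.

```python
import numpy as np

# Gallons per barrel (rows: motor oil, diesel oil, gasoline; columns: plants A, B, C)
A_plan = np.array([[20, 4, 4], [10, 14, 5], [5, 5, 12]])
demand = np.array([6600, 5100, 3100])

# Exact (fractional) barrels, then rounded up to whole barrels
barrels_exact = np.linalg.solve(A_plan, demand)
barrels_whole = np.ceil(np.round(barrels_exact, 9)).astype(int)

# Production achieved with whole barrels and the resulting surplus in gallons
produced = A_plan @ barrels_whole
surplus = produced - demand

print("whole barrels per plant:", barrels_whole)
print("gallons produced:", produced)
print("surplus gallons:", surplus)
```

A non-negative surplus for every product confirms that the integer plan covers the demand, at the cost of slight overproduction.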

***

Suppose the total demand for all products **doubled**. What would the solution now be? How does it compare to the original solution? Why, mathematically, should this have been expected?

#### Scenario 2 Solution

Only the constant terms, i.e. the vector $B$, will change.
We solve for $x$ with the same $A$, then calculate $y$ and store both in $s_{1}$

In [None]:
twice_B = 2 * B
twice_B

array([13200, 10200,  6200])

In [None]:
solutions[1, 0] = np.linalg.solve(A, twice_B)
solutions[1, 1] = np.ceil(solutions[1, 0])
solutions

array([[[287.25, 128.75,  85.  ],
        [288.  , 129.  ,  85.  ]],

       [[574.5 , 257.5 , 170.  ],
        [575.  , 258.  , 170.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]]])

Let's compare the vectors $x$ of $s_0$ and $s_1$. It looks like $s_0$ has been scaled, so we calculate the element-wise ratio of $s_1$ to $s_0$

In [None]:
solutions[1, 0]/solutions[0, 0]

array([2., 2., 2.])
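
The ratio of 2 applies to the exact solutions only. Rounding up is not a linear operation, so the whole-barrel plans need not double, as the $y$ rows of $s_0$ and $s_1$ already suggest. A small sketch of this, with illustrative variable names and the same assumed rounding guard before `np.ceil`:

```python
import numpy as np

A_lin = np.array([[20, 4, 4], [10, 14, 5], [5, 5, 12]])
B_lin = np.array([6600, 5100, 3100])

x_single = np.linalg.solve(A_lin, B_lin)
x_double = np.linalg.solve(A_lin, 2 * B_lin)

# Linearity: doubling the demand doubles the exact solution
print("exact solution doubles:", np.allclose(x_double, 2 * x_single))

# Whole-barrel plans: ceil(2x) can be smaller than 2 * ceil(x)
whole_single = np.ceil(np.round(x_single, 9))
whole_double = np.ceil(np.round(x_double, 9))
print("difference from simple doubling:", whole_double - 2 * whole_single)
```

Any negative entry in the difference marks a plant where solving the doubled problem directly needs fewer whole barrels than doubling the rounded plan.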

#### Scenario 2 Conclusion

This shows that if $s$ is the unique solution of the system of linear equations $Ax = B$, then scaling the vector $B$ by a factor $k$ scales $s$ by the same factor, since $A(ks) = k(As) = kB$.
***

Suppose that the company acquires another group of distributors and that the daily demand of this group is 2000 gallons of motor oil, 4000 gallons of gasoline, and 4000 gallons of diesel oil. How would you set up production of just this supply? Are there any options (more than one way)?

#### Scenario 3 Solution

We follow the same steps, set $B_{new}$, re-solve for $x$ & $y$ and set on $s_2$

<br />
We will need $s_2$ later

In [None]:
new_B = np.array([2000, 4000, 4000])
solutions[2, 0] = np.linalg.solve(A, new_B)
solutions[2, 1] = np.ceil(solutions[2, 0])
solutions

array([[[287.25, 128.75,  85.  ],
        [288.  , 129.  ,  85.  ]],

       [[574.5 , 257.5 , 170.  ],
        [575.  , 258.  , 170.  ]],

       [[ 12.5 , 187.5 , 250.  ],
        [ 13.  , 188.  , 250.  ]],

       [[  0.  ,   0.  ,   0.  ],
        [  0.  ,   0.  ,   0.  ]]])

***

Next, calculate the needs of each factory (in barrels of crude, as usual) to meet the total demand of both groups of distributors. When you have done this, compare your answer to results already obtained. What mathematical conclusion can you draw?

#### Scenario 4 Solution

Set $B_{combined} = B + B_{new}$ and repeat the same steps, storing the result in $s_3$

In [None]:
combined_B = B + new_B
solutions[3, 0] = np.linalg.solve(A, combined_B)
solutions[3, 1] = np.ceil(solutions[3, 0])
solutions

array([[[287.25, 128.75,  85.  ],
        [288.  , 129.  ,  85.  ]],

       [[574.5 , 257.5 , 170.  ],
        [575.  , 258.  , 170.  ]],

       [[ 12.5 , 187.5 , 250.  ],
        [ 13.  , 188.  , 250.  ]],

       [[299.75, 316.25, 335.  ],
        [300.  , 317.  , 335.  ]]])

A visual comparison suggests that $s_3 = s_0 + s_2$

<br />
Let's confirm this

In [None]:
# Compare with a tolerance, since floating-point sums may differ in the last digits
np.isclose(solutions[0, 0] + solutions[2, 0], solutions[3, 0])

array([ True,  True,  True])

#### Scenario 4 Conclusion

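This shows that matrix-vector multiplication is distributive over vector addition, i.e.
$A(s_0 + s_2) = A s_0 + A s_2 = B + B_{new} = B_{combined}$, so the solution for the combined demand is the sum of the solutions for the individual demands.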
***

### Sensitivity and Robustness

In real life applications, constants are rarely ever exactly equal to their stated value; certain amounts of uncertainty are always present. This is part of the reason for the science of statistics. In the above model, the daily productions for the plants would be averages over a period of time. Explore what effect small changes in the parameters have on the output.

To do this, pick any 3 coefficients, one at a time, and increase or decrease them by 3%. For each case, note what effect this has on the solution, as a percentage change. Can you draw any overall conclusion?
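
One possible way to carry out this experiment is sketched below. The three perturbed coefficients (the diagonal entries of $A$) are an arbitrary choice, and the helper `percent_change` and all variable names are hypothetical, not part of the original notebook.

```python
import numpy as np

A_base = np.array([[20, 4, 4], [10, 14, 5], [5, 5, 12]], dtype=float)
B_base = np.array([6600, 5100, 3100], dtype=float)
x_base = np.linalg.solve(A_base, B_base)

# Arbitrary choice of three coefficients to perturb, as (row, column) indices
chosen = [(0, 0), (1, 1), (2, 2)]

def percent_change(index, relative_step):
    """Return the % change in each plant's barrels when one coefficient changes."""
    A_perturbed = A_base.copy()
    A_perturbed[index] *= 1 + relative_step
    x_perturbed = np.linalg.solve(A_perturbed, B_base)
    return 100 * (x_perturbed - x_base) / x_base

# Perturb one coefficient at a time by -3% and +3%
results = {}
for index in chosen:
    for step in (-0.03, 0.03):
        results[(index, step)] = percent_change(index, step)
        print(index, f"{step:+.0%}", np.round(results[(index, step)], 2))
```

Each printed row gives the percentage change in the barrels needed by plants $A$, $B$ and $C$, which can be compared against the 3% change in the coefficient.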

In [None]:
# YOUR CODE HERE
steps = np.arange(-0.03, 0.03, 0.001)
steps

array([-3.00000000e-02, -2.90000000e-02, -2.80000000e-02, -2.70000000e-02,
       -2.60000000e-02, -2.50000000e-02, -2.40000000e-02, -2.30000000e-02,
       -2.20000000e-02, -2.10000000e-02, -2.00000000e-02, -1.90000000e-02,
       -1.80000000e-02, -1.70000000e-02, -1.60000000e-02, -1.50000000e-02,
       -1.40000000e-02, -1.30000000e-02, -1.20000000e-02, -1.10000000e-02,
       -1.00000000e-02, -9.00000000e-03, -8.00000000e-03, -7.00000000e-03,
       -6.00000000e-03, -5.00000000e-03, -4.00000000e-03, -3.00000000e-03,
       -2.00000000e-03, -1.00000000e-03,  2.77555756e-17,  1.00000000e-03,
        2.00000000e-03,  3.00000000e-03,  4.00000000e-03,  5.00000000e-03,
        6.00000000e-03,  7.00000000e-03,  8.00000000e-03,  9.00000000e-03,
        1.00000000e-02,  1.10000000e-02,  1.20000000e-02,  1.30000000e-02,
        1.40000000e-02,  1.50000000e-02,  1.60000000e-02,  1.70000000e-02,
        1.80000000e-02,  1.90000000e-02,  2.00000000e-02,  2.10000000e-02,
        2.20000000e-02,  

In [None]:
steps.shape

(60,)

In [None]:
sensitive_coefficients_A_mo = 

#### Sensitivity of Production to coefficients

array([286., 127.,  95.])

#### Sensitivity of Input costs (volume) to coefficients

### A Plant Off-Line

Suppose factory $C$ is shut down by the EPA (Environmental Protection Agency) temporarily for excessive emissions into the atmosphere. If your demand is as it was originally (6600, 5100, 3100), what would you now say about the company's ability to meet it? What do you recommend they schedule for production now?

In [None]:
# Code here
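
With plant $C$ off-line there are 3 equations but only 2 unknowns, so the system is *overdetermined*. A minimal sketch of one way to examine it: compare the rank of the coefficient matrix with the rank of the augmented matrix to test for an exact solution, then fall back on a least-squares compromise. The least-squares choice and all names here are assumptions; a plan that must meet every demand (with surplus allowed) would need a different formulation.

```python
import numpy as np

# Columns are plants A and B only; rows are motor oil, diesel oil, gasoline
A_two = np.array([[20, 4], [10, 14], [5, 5]], dtype=float)
demand = np.array([6600, 5100, 3100], dtype=float)

# An exact solution exists only if both ranks are equal
rank_coeff = np.linalg.matrix_rank(A_two)
rank_aug = np.linalg.matrix_rank(np.column_stack([A_two, demand]))
print("exact solution exists:", rank_coeff == rank_aug)

# Least squares minimises the squared gap between production and demand
barrels_ls, residual, _, _ = np.linalg.lstsq(A_two, demand, rcond=None)
production = A_two @ barrels_ls
print("least-squares barrels (A, B):", barrels_ls)
print("production:", production)
print("shortfall (+) / surplus (-):", demand - production)
```

When the ranks differ, no barrel allocation reproduces the demand exactly, and the shortfall vector shows which products are under- or over-supplied by the compromise.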

### Buying another plant

This situation has caused enough concern that the CEO is considering buying another plant, identical to the third, and using it permanently. Assuming that all 4 plants are on line, what production do you recommend to meet the current demand (5000, 8500, 10000)? In general, what can you say about any increased flexibility that the 4th plant might provide?

Let the number of barrels used by factories $A$, $B$, $C$ and $D$ be $x$, $y$, $z$ and $w$ respectively.

Then the system of linear equations will be

$$20x + 4y + 4z + 4w = 5000$$

$$10x + 14y + 5z + 5w = 8500$$

$$5x + 5y + 12z + 12w = 10000$$

The above system of linear equations has fewer equations than variables, hence it is *underdetermined* and cannot have a unique solution. In this case, there are either infinitely many solutions or no exact solution. We can solve it by treating $w$ as a free parameter, moving its terms to the right-hand side, and using the [rref](http://linear.ups.edu/html/section-RREF.html) form to solve the remaining system.

For the rref implementation in Python, refer [here](https://docs.sympy.org/latest/tutorial/matrices.html#rref).

In [None]:
# create symbol 'w'
w = sy.Symbol("w")            
A_aug = sy.Matrix([[20, 4, 4, 5000-4*w], 
                   [10, 14, 5, 8500-5*w], 
                   [5, 5, 12, 10000-12*w]])
# show rref form
A_aug.rref()                  

From the above result, it can be seen that the 4th plant only shares the barrels required by the 3rd plant (only the total $z + w$ is fixed), while the requirements of the 1st and 2nd plants remain unaffected.

### Calculate the amount of Paraffin supplied

The company has just found a candle company that will buy its paraffin. Under the current conditions (i.e., after buying another plant) for demand (5000, 8500, 10000), how much can be supplied to them per day?

According to the problem statement, factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude oil, factory $B$ 5 gallons, and factory $C$ 2 gallons.

In [None]:
# YOUR CODE HERE
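
One way to approach this is sketched below: reuse the rref solution in terms of $w$ and weight each plant's barrels by its paraffin output. The sketch assumes plant $D$ produces 2 gallons of paraffin per barrel, like plant $C$, because the problem describes it as identical; the variable names are illustrative.

```python
import sympy as sy

# Augmented matrix with plant D's barrels w moved to the right-hand side
w = sy.Symbol("w")
A_aug = sy.Matrix([[20, 4, 4, 5000 - 4*w],
                   [10, 14, 5, 8500 - 5*w],
                   [5, 5, 12, 10000 - 12*w]])
rref_matrix, pivots = A_aug.rref()

# Barrels for plants A, B and C; the last one depends on w
x_barrels = rref_matrix[0, 3]
y_barrels = rref_matrix[1, 3]
z_barrels = rref_matrix[2, 3]

# Paraffin per barrel: A = 3, B = 5, C = 2; plant D is ASSUMED to match C (2)
paraffin = sy.simplify(3*x_barrels + 5*y_barrels + 2*z_barrels + 2*w)
print("barrels:", x_barrels, y_barrels, z_barrels)
print("paraffin (gallons per day):", paraffin)
```

Under these assumptions $w$ cancels out of the total, so the daily paraffin supply does not depend on how the load is split between plants $C$ and $D$. Rounding the barrels up to whole numbers would raise the figure slightly.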

### Selling the first plant

The management is also considering selling the first plant due to aging equipment and high workman's compensation costs for the state it is located in. They would like to know what this would do to their production capability. Specifically, they would like an example of a demand they could not meet with only plants 2 and 3, and also what effect having plant 4 has (recall it is identical to plant 3). They would also like an example of a demand that they could meet with just plants 2 and 3. Any general statements you could make here would be helpful.

Let the number of barrels used by factories $B$, $C$ and $D$ be $y$, $z$ and $w$ respectively.

Considering only plants 2 and 3, with demand (5000, 8500, 10000), we have

$$4y + 4z = 5000$$

$$14y + 5z = 8500$$

$$5y + 12z = 10000$$

In [None]:
# YOUR CODE HERE
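
A possible sketch for this cell: row-reduce the augmented matrix and look for a pivot in the demand column, which signals that the 3 equations in 2 unknowns are inconsistent. It also builds one demand that plants 2 and 3 can meet, using an arbitrary choice of 100 barrels each. Names such as `aug_two` and `feasible_demand` are hypothetical.

```python
import sympy as sy

# Augmented matrix [B C | demand] for demand (5000, 8500, 10000)
aug_two = sy.Matrix([[4, 4, 5000],
                     [14, 5, 8500],
                     [5, 12, 10000]])
rref_two, pivots_two = aug_two.rref()
print(rref_two)
print("pivot columns:", pivots_two)

# A pivot in the last (demand) column means no exact solution exists
inconsistent = (aug_two.cols - 1) in pivots_two
print("inconsistent:", inconsistent)

# Any non-negative combination of the two plants' per-barrel outputs is a
# demand they CAN meet, e.g. 100 barrels each (an illustrative choice)
feasible_demand = 100 * sy.Matrix([4, 14, 5]) + 100 * sy.Matrix([4, 5, 12])
print("a demand that can be met:", list(feasible_demand))
```

In general, two plants can only meet demands that lie in the plane spanned by their two output vectors (with non-negative barrel counts), so most three-product demands are out of reach.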

Now take the 4th plant into consideration.
As before, let the number of barrels used by factories $B$, $C$ and $D$ be $y$, $z$ and $w$ respectively.

Then for demand (5000, 8500, 10000) the system of linear equations will be

$$4y + 4z + 4w = 5000$$

$$14y + 5z + 5w = 8500$$

$$5y + 12z + 12w = 10000$$

Solve it using rref form.

In [None]:
# YOUR CODE HERE

Now, change the demand to (6600, 5100, 3100) and solve the system of equations using the rref form.

In [None]:
# YOUR CODE HERE
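
The two rref exercises above can be sketched with a single helper, shown here under the assumption that plant $D$'s column equals plant $C$'s (the problem states the plants are identical). The function name `solve_with_rref` and its return values are hypothetical.

```python
import sympy as sy

def solve_with_rref(demand):
    """Row-reduce [B C D | demand]; plant D's column is assumed equal to plant C's."""
    aug = sy.Matrix([[4, 4, 4, demand[0]],
                     [14, 5, 5, demand[1]],
                     [5, 12, 12, demand[2]]])
    reduced, pivots = aug.rref()
    # A pivot in the demand column means the system has no exact solution
    consistent = (aug.cols - 1) not in pivots
    return reduced, pivots, consistent

outcomes = {}
for demand in [(5000, 8500, 10000), (6600, 5100, 3100)]:
    reduced, pivots, consistent = solve_with_rref(demand)
    outcomes[demand] = (pivots, consistent)
    print(demand, "pivot columns:", pivots, "consistent:", consistent)
```

Because the columns for plants $C$ and $D$ are identical, the coefficient matrix has rank 2, so adding plant 4 increases capacity but does not enlarge the set of demand ratios that can be met exactly.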

### Set rates for Products

The company wants to set the rates of motor oil, diesel oil, and gasoline. For this purpose, they have a few suggestions, given as follows:

* 100, 66, 102 Rupees per gallon,

* 104, 64, 100 Rupees per gallon,

* 102, 68, 98 Rupees per gallon, and

* 96, 68, 100 Rupees per gallon 

for motor oil, diesel oil, and gasoline respectively.

Using matrix multiplication, find the rates which result in the maximum total price.

Let $M$ denote the matrix whose rows represent the plants (A, B and C) and whose columns represent the products (motor oil, diesel oil and gasoline), with each entry giving the gallons of that product obtained from one barrel of crude oil at that plant.

$$M = \begin{bmatrix}
20 & 10 & 5 \\ 
4 & 14 & 5  \\ 
 4 & 5 & 12  
\end{bmatrix}$$

Also, $R$ is a matrix having different rates as its columns.

$$R = \begin{bmatrix}
100 & 104 & 102 & 96 \\ 
66 & 64 & 68 & 68  \\ 
102 & 100 & 98 & 100  
\end{bmatrix}$$

In [None]:
# YOUR CODE HERE
m_data = [[20, 10, 5], [4, 14, 5], [4, 5, 12]]
m = np.array(m_data)
m

In [None]:
r_data = [[100, 104, 102, 96], [66, 64, 68, 68], [102, 100, 98, 100]]
r = np.array(r_data)
r

The matrix product of $M$ and $R$ gives the total price per barrel of crude oil for each plant (i.e. rows) under each rate suggestion (i.e. columns)

In [None]:
total_price = m.dot(r)
total_price

In [None]:
total_price_across_plants = np.array(total_price.sum(axis=0), dtype = np.int32)
total_price_across_plants

In [None]:
max_total_price = np.max(total_price_across_plants)
max_total_price

In [None]:
# Index (0-based) of the rate suggestion with the highest total price
scenario = int(np.argmax(total_price_across_plants))
print("The scenario which gives maximum pricing : {}".format(scenario))

### Marginal Cost

The total cost $C(x)$ in Rupees, associated with the production of $x$ gallons of gasoline is given by 

$$C(x) = 0.005 x^3 - 0.02 x^2 + 30x + 5000$$

Find the marginal cost when $22$ gallons are produced, where, marginal cost means the instantaneous rate of change of total cost at any level of output.

In [None]:
# YOUR CODE HERE
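
A minimal sketch for this cell, following the same SymPy approach used for marginal revenue: differentiate $C(x)$ and evaluate the derivative at $x = 22$. The variable names `q`, `cost` and `marginal_cost` are illustrative.

```python
import sympy as sy

# q stands for the number of gallons produced (x in the problem statement)
q = sy.Symbol("q")
cost = 0.005 * q**3 - 0.02 * q**2 + 30 * q + 5000

# Marginal cost is the derivative of total cost with respect to output
marginal_cost = sy.diff(cost, q)
marginal_cost_at_22 = marginal_cost.subs(q, 22)

print("Marginal cost function : {}".format(marginal_cost))
print("Marginal cost at 22 gallons : {}".format(marginal_cost_at_22))
```

By hand, $C'(x) = 0.015x^2 - 0.04x + 30$, so $C'(22) = 7.26 - 0.88 + 30 = 36.38$ Rupees per gallon, which the printed value should match up to floating-point rounding.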

### Marginal Revenue

The total revenue in Rupees received from the sale of $x$ gallons of motor oil is given by $$R(x) = 3x^2 + 36x + 5.$$ 

Find the marginal revenue when $x = 28$, where marginal revenue means the instantaneous rate of change of total revenue with respect to the number of items sold.

In [None]:
# YOUR CODE HERE
# MR(x) = 6x + 36 = dR/dx

from sympy import Symbol, diff
  
x = Symbol('x')
R = 3 * x**2 + 36 * x + 5
print("Revenue function : {} ".format(R))
   
MR = diff(R, x)  
      
print("Marginal Revenue function : {} ".format(MR))

# Marginal revenue when x = 28
print("Marginal revenue is : {}".format(MR.evalf(subs = {x: 28})))



### Pour crude oil in tank

Crude oil is being poured into a cylindrical tank of radius 10 meters at the rate of 314 cubic meters per hour. Find

* the rate at which the height of the oil in the tank is increasing, and
* the height of crude oil in the tank after 2 hours.

In [None]:
# YOUR CODE HERE
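
A short sketch for this cell, based on the cylinder volume $V = \pi r^2 h$, which gives $\frac{dV}{dt} = \pi r^2 \frac{dh}{dt}$ for a fixed radius. It assumes the tank starts empty and the fill rate stays constant; the variable names are illustrative.

```python
import math

radius_m = 10.0               # tank radius in meters
fill_rate_m3_per_h = 314.0    # volume of oil added per hour

# dV/dt = pi * r^2 * dh/dt, so dh/dt = (dV/dt) / (pi * r^2)
height_rate_m_per_h = fill_rate_m3_per_h / (math.pi * radius_m**2)

# Constant rate and an initially empty tank (assumption): h(t) = rate * t
height_after_2h_m = height_rate_m_per_h * 2

print("Rate of height increase (m/h):", height_rate_m_per_h)
print("Height after 2 hours (m):", height_after_2h_m)
```

With $\pi \approx 3.14$ the rate is exactly 1 meter per hour and the height after 2 hours is 2 meters; using the full value of $\pi$ gives results marginally below these.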