# Advanced Certification Program in Computational Data Science
## A program by IISc and TalentSprint
### Mini Project Notebook: Linear Algebra and Calculus

## Problem Statement

 The task is to advise a petroleum company on how to meet the demands of their customers for motor oil, diesel oil and gasoline.

## Learning Objectives

At the end of the experiment, you will be able to

* create arrays and matrices in Python
* understand the concepts of linear equations
* solve a system of linear equations

### Data

From a barrel of crude oil, in one day, factory $A$ can produce
* 20 gallons of motor oil,
* 10 gallons of diesel oil, and
* 5 gallons of gasoline

Similarly, factory $B$ can produce
* 4 gallons of motor oil,
* 14 gallons of diesel oil, and
* 5 gallons of gasoline

while factory $C$ can produce
* 4 gallons of motor oil,
* 5 gallons of diesel oil, and
* 12 gallons of gasoline

There is also waste in the form of paraffin, among other things. Factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude, factory $B$ 5 gallons, and factory $C$ 2 gallons.

**Note:** Your conclusion should include a discussion of the nature of the terms *unique*, *no solution*, *overdetermined* and *underdetermined* as they apply in the context of the oil plants.

## Grading = 10 Points

### Create an array

Create an array of size 2x3 with arbitrary values.

In [None]:
# YOUR CODE HERE
import numpy as np
Array_A = np.array([[1,2,3],[4,5,6]])
Array_A

array([[1, 2, 3],
       [4, 5, 6]])

### Create the system of Linear Equations

Suppose the current daily demand from distributors is 6600 gallons of motor oil, 5100 gallons of diesel oil and 3100 gallons of gasoline.

Set up the system of equations which describes the above situation. Please include the units as well.

Let the number of barrels used by factories $A$, $B$ and $C$ be $x$, $y$ and $z$ respectively.

Then the system of linear equations will be

$$Motor\ oil:\ \ \ 20x + 4y + 4z = 6600$$

$$Diesel\ oil:\ \ \ 10x + 14y + 5z = 5100$$

$$Gasoline:\ \ \ 5x + 5y + 12z = 3100$$

### Solve the system of Linear Equations (2 points)

How many barrels of crude oil should each plant get in order to meet the demand as a group? Remember that we can only provide each plant with an integral number of barrels.

In [None]:
# YOUR CODE HERE
import scipy.linalg  # import the submodule explicitly so scipy.linalg.solve is available
Matrix_oils = np.array([[20,4,4],[10,14,5],[5,5,12]])
Matrix_demand = np.array([6600,5100,3100])
solution = scipy.linalg.solve(Matrix_oils,Matrix_demand)
print("unique solution")
solution.astype(int)



unique solution


array([287, 128,  85])
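
The cast `astype(int)` truncates 287.25 and 128.75 downward, which would leave the demand slightly unmet. Below is a minimal sketch of one alternative, assuming a small surplus is acceptable and that each plant's allocation is simply rounded up. The names `barrels` and `produced` are illustrative and do not come from the notebook.

```python
import numpy as np

# Yield per barrel: rows = products (motor oil, diesel oil, gasoline),
# columns = factories (A, B, C)
A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]], dtype=float)
demand = np.array([6600, 5100, 3100], dtype=float)

exact = np.linalg.solve(A, demand)  # fractional barrels

# Remove floating-point noise first, then round up to whole barrels
barrels = np.ceil(np.round(exact, 6)).astype(int)
produced = A @ barrels

print("whole barrels:", barrels)
print("surplus (gallons):", produced - demand)
```

Because every yield coefficient is non-negative, rounding each allocation up can only increase output, so the rounded plan still covers the demand.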

Suppose the total demand for all products **doubled**. What would the solution now be? How does it compare to the original solution? Why, mathematically, should this have been expected?

In [None]:
# YOUR CODE HERE
Matrix_demand_double = np.array([6600*2,5100*2,3100*2])
solution_doubled = scipy.linalg.solve(Matrix_oils,Matrix_demand_double)

ratio = solution_doubled/solution
print("ratio=",ratio,"solution=",solution_doubled.astype(int))
# Scaling the demand vector by 2 scales the solution vector (barrels for A, B, C) by 2
print("production is multiplied by the same scalar as the demand")

ratio= [2. 2. 2.] solution= [574 257 170]
production is multiplied by the same scalar as the demand
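
The reason the doubling was to be expected can be written in one line. Here $A$ denotes the coefficient matrix and $\mathbf{b}$ the original demand vector (symbols introduced only for this illustration), with $A$ invertible:

$$A\mathbf{x} = \mathbf{b} \;\Rightarrow\; A(2\mathbf{x}) = 2A\mathbf{x} = 2\mathbf{b}, \qquad \text{so} \quad A^{-1}(2\mathbf{b}) = 2A^{-1}\mathbf{b} = 2\,(287.25,\ 128.75,\ 85) = (574.5,\ 257.5,\ 170)$$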


Suppose that the company acquires another group of distributors and that the daily demand of this group is 2000 gallons of motor oil, 4000 gallons of gasoline, and 4000 gallons of diesel oil. How would you set up production of just this supply? Are there any options (more than one way)?

In [None]:
# YOUR CODE HERE
Matrix_demand_dist2 = np.array([2000,4000,4000])
#Method 1
solution_dist2 = scipy.linalg.solve(Matrix_oils,Matrix_demand_dist2)
#Method 2
solution_dist2_way2 = np.linalg.solve(Matrix_oils,Matrix_demand_dist2)
solution_dist2.astype(int),solution_dist2_way2.astype(int)
#Method 3
import sympy as sy
A = sy.Matrix([[20,4,4,2000],
            [10,14,5,4000],
            [5,5,12,4000]])
B=A.rref()
c=list(np.array(B[0]))
print("solution3:",round(c[0][3]),round(c[1][3]),round(c[2][3]))

# numpy.linalg.solve() and scipy.linalg.solve() agree: the matrix is invertible, so there is only one way
print("Unique solution: barrels of crude needed by factories A, B and C:",solution_dist2_way2.astype(int))

solution3: 12 188 250
Unique solution: barrels of crude needed by factories A, B and C: [ 12 187 250]


(Matrix([
 [1, 0, 0,  25/2],
 [0, 1, 0, 375/2],
 [0, 0, 1,   250]]),
 (0, 1, 2))

Next, calculate the needs of each factory (in barrels of crude, as usual) to meet the total demand of both groups of distributors. When you have done this, compare your answer to results already obtained. What mathematical conclusion can you draw?

In [None]:
# YOUR CODE HERE
Matrix_demand_total = np.array([6600+2000,5100+4000,3100+4000])
solution_total= scipy.linalg.solve(Matrix_oils,Matrix_demand_total)
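print("solution=",solution_total.astype(int))
print("It is a unique solution. Total demand is the sum of dist1 and dist2, and the solution is also the sum")

solution= [299 316 335]
It is a unique solution. Total demand is the sum of dist1 and dist2, and the solution is also the sum


The solution for the total demand is the sum of the solution for the original distributors and the solution for the new group: (287.25, 128.75, 85) + (12.5, 187.5, 250) = (299.75, 316.25, 335). This is the linearity of the system: because the coefficient matrix is invertible, each demand vector has exactly one solution, and solutions add when demands add. The truncated integers do not add exactly (128 + 187 = 315, not 316), which is an effect of the `astype(int)` cast and not of the mathematics.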
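
The superposition claim can be checked numerically. This is a small sketch using the same coefficients and demands as above; the variable names `b1`, `b2`, `x1`, `x2` and `x_total` are illustrative.

```python
import numpy as np

A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]], dtype=float)
b1 = np.array([6600, 5100, 3100], dtype=float)  # original distributors
b2 = np.array([2000, 4000, 4000], dtype=float)  # new group of distributors

x1 = np.linalg.solve(A, b1)
x2 = np.linalg.solve(A, b2)
x_total = np.linalg.solve(A, b1 + b2)

print("sum of separate solutions:", x1 + x2)
print("solution for total demand:", x_total)
print("equal:", np.allclose(x1 + x2, x_total))
```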

### Sensitivity and Robustness (1 point)

In real life applications, constants are rarely ever exactly equal to their stated value; certain amounts of uncertainty are always present. This is part of the reason for the science of statistics. In the above model, the daily productions for the plants would be averages over a period of time. Explore what effect small changes in the parameters have on the output.

To do this, pick any 3 coefficients, one at a time, and increase or decrease them by 3%. For each case , note what effect this has on the solution, as a percentage change. Can you draw any overall conclusion?

In [None]:
# YOUR CODE HERE
def per_ch(old, new):
     """Print and return the percentage change from old to new."""
     pc = round((new - old) / abs(old) * 100, 2)
     print(f"from {old} to {new}   -> {pc}% change")
     return pc
#Factory A_coeff_3%_increase
Matrix_oils_new_A = np.array([[20*1.03,4,4],[10*1.03,14,5],[5*1.03,5,12]])
solution_new_A= scipy.linalg.solve(Matrix_oils_new_A,Matrix_demand)
print(solution_new_A.round(2))
pc_inc_forA_inc = per_ch(solution[0],solution_new_A[0])
#Factory B_coeff_3%_increase
Matrix_oils_new_B = np.array([[20,4*1.03,4],[10,14*1.03,5],[5,5*1.03,12]])
solution_new_B= scipy.linalg.solve(Matrix_oils_new_B,Matrix_demand)
print(solution_new_B.round(2))
pc_inc_forB_inc = per_ch(solution[1],solution_new_B[1])
#Factory B_coeff_3%_deccrease
Matrix_oils_new_b = np.array([[20,4*.97,4],[10,14*.97,5],[5,5*.97,12]])
solution_new_b= scipy.linalg.solve(Matrix_oils_new_b,Matrix_demand)
print(solution_new_b.round(2))
pc_inc_forB_dec = per_ch(solution[1],solution_new_b[1])
pc_inc_forA_inc,pc_inc_forB_inc,pc_inc_forB_dec
#intuition
print("A 3% increase in a factory's yield per barrel decreases the barrels it needs by about 3%, and vice versa")

[278.88 128.75  85.  ]
from 287.25 to 278.88349514563106   -> -2.91% change
[287.25 125.    85.  ]
from 128.75 to 125.0   -> -2.91% change
[287.25 132.73  85.  ]
from 128.75 to 132.7319587628866   -> 3.09% change
A 3% increase in a factory's yield per barrel decreases the barrels it needs by about 3%, and vice versa


The barrels a factory needs are inversely proportional to its yield coefficients. Scaling all of one factory's coefficients up by 3% reduces the barrels it needs by about 2.91%, and scaling them down by 3% increases the barrels by about 3.09%. The allocations of the other two factories are unchanged. Note that each case here scales a whole column (all three yields of one factory) and not a single coefficient.

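The percentages observed in the sensitivity experiment follow from reading the system as a combination of columns. In this sketch $\mathbf{a}_A$, $\mathbf{a}_B$, $\mathbf{a}_C$ stand for the yield columns of the three factories and $\varepsilon$ for the relative change (notation introduced here for illustration):

$$x\,\mathbf{a}_A + y\,\mathbf{a}_B + z\,\mathbf{a}_C = \mathbf{b} \;\Rightarrow\; \frac{x}{1+\varepsilon}\,\big[(1+\varepsilon)\,\mathbf{a}_A\big] + y\,\mathbf{a}_B + z\,\mathbf{a}_C = \mathbf{b}$$

$$\frac{x' - x}{x} = \frac{1}{1+\varepsilon} - 1 = -\frac{\varepsilon}{1+\varepsilon}: \qquad \varepsilon = 0.03 \Rightarrow -2.91\%, \qquad \varepsilon = -0.03 \Rightarrow +3.09\%$$

This clean result relies on scaling an entire column. Changing a single coefficient would, in general, shift all three allocations.
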
### A Plant Off-Line (1 point)

Suppose factory $C$ is shut down by the EPA (Environmental Protection Agency) temporarily for excessive emissions into the atmosphere. If your demand is as it was originally (6600, 5100, 3100), what would you now say about the companies ability to meet it? What do you recommend they schedule for production now?

In [None]:
# YOUR CODE HERE
import numpy as np
# With factory C offline there are 3 equations (products) but only 2 unknowns (A, B),
# so the coefficient matrix is 3x2 and a direct solve is not possible
Matrix_oils_2plants = np.array([[20,4],[10,14],[5,5]])
Matrix_demand = np.array([6600,5100,3100])
try:
  solution_2plants = np.linalg.solve(Matrix_oils_2plants,Matrix_demand)
except Exception as e:
  print(" we cannot find the solution for the reason: ",e)

 we cannot find the solution for the reason:  Last 2 dimensions of the array must be square


In [None]:
import numpy as np
from scipy.linalg import solve

Produce= np.array([[20, 4,],
              [10, 14],[5,5]])
Demand = np.array([6600,5100,3100])
A_pinv = np.linalg.pinv( Produce)
#print(A_pinv)
x = np.dot(A_pinv,Demand)
print("solution is ",x)
print("The system is overdetermined; this is the least-squares solution from the pseudoinverse")


solution is  [299.47204969 168.47826087]
The system is overdetermined; this is the least-squares solution from the pseudoinverse
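
A least-squares answer is only the closest fit, so it is worth checking what it actually produces. The sketch below uses `np.linalg.lstsq`, which returns the same minimiser as the pseudoinverse for this full-column-rank matrix; the names `A2`, `barrels` and `produced` are illustrative.

```python
import numpy as np

# Factories A and B only: rows = motor oil, diesel oil, gasoline
A2 = np.array([[20, 4],
               [10, 14],
               [5, 5]], dtype=float)
demand = np.array([6600, 5100, 3100], dtype=float)

# Least-squares fit: minimises the squared gap between production and demand
barrels, residual, rank, _ = np.linalg.lstsq(A2, demand, rcond=None)
produced = A2 @ barrels

print("barrels (A, B):", barrels)
print("produced - demand:", produced - demand)
```

With this schedule, motor oil and diesel oil are over-produced while gasoline falls short by roughly 760 gallons, so the original demand cannot be met exactly with factories $A$ and $B$ alone.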


### Buying another plant

#### (Note the following given information. You will see questions that continue from it in the subsequent sections.)

This situation has caused enough concern that the CEO is considering buying another plant, identical to the third, and using it permanently. Assuming that all 4 plants are on line, what production do you recommend to meet the current demand (5000, 8500, 10000)? In general, what can you say about any increased flexibility that the 4th plant might provide?

Let the number of barrels used by factories $A$, $B$, $C$ and $D$ be $x$, $y$, $z$ and $w$ respectively.

Then the system of linear equations will be

$$20x + 4y + 4z + 4w = 5000$$

$$10x + 14y + 5z + 5w = 8500$$

$$5x + 5y + 12z + 12w = 10000$$

The above system of linear equations has fewer equations than variables, hence it is *underdetermined* and cannot have a unique solution. In this case, there are either infinitely many solutions or no exact solution. We can solve it by treating $w$ as a free parameter and using the [rref](http://linear.ups.edu/html/section-RREF.html) form to solve for the remaining variables.

For the rref implementation in Python, see [here](https://docs.sympy.org/latest/tutorial/matrices.html#rref).

In [None]:
import sympy as sy

# create symbol 'w'
w = sy.Symbol("w")
A_aug = sy.Matrix([[20, 4, 4, 5000-4*w],
                   [10, 14, 5, 8500-5*w],
                   [5, 5, 12, 10000-12*w]])

# show rref form
A_aug.rref()

(Matrix([
 [1, 0, 0,   195/4],
 [0, 1, 0,  1325/4],
 [0, 0, 1, 675 - w]]),
 (0, 1, 2))

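From the above result, $x = 195/4 = 48.75$, $y = 1325/4 = 331.25$ and $z + w = 675$. The 4th plant only shares the barrels required by the 3rd plant, while the requirements of the 1st and 2nd plants remain unaffected. Any split with $0 \le w \le 675$ meets the demand, so the extra plant adds scheduling flexibility but no new production capability.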
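
The one-parameter family of schedules can be verified directly. This is a minimal sketch; the helper `schedule` and the sample values of $w$ are illustrative choices, not part of the original notebook.

```python
import numpy as np

# Columns = factories A, B, C, D (D identical to C)
A4 = np.array([[20, 4, 4, 4],
               [10, 14, 5, 5],
               [5, 5, 12, 12]], dtype=float)
demand = np.array([5000, 8500, 10000], dtype=float)

def schedule(w):
    """Barrels for (A, B, C, D) when plant D takes w barrels (0 <= w <= 675)."""
    return np.array([195 / 4, 1325 / 4, 675 - w, w], dtype=float)

for w in (0, 200, 337.5, 675):
    plan = schedule(w)
    # Every member of the family meets the demand exactly
    assert np.allclose(A4 @ plan, demand)
    print(w, plan)
```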

### Calculate the amount of Paraffin supplied (1 point)

The company has just found a candle company that will buy its paraffin. Under the current conditions (i.e, after buying another plant) for demand (5000, 8500, 10000), how much can be supplied to them per day?

According to the problem statement, factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude oil, factory $B$ 5 gallons, and factory $C$ 2 gallons. Factory $D$ is identical to $C$, so it also has 2 gallons per barrel.

In [None]:
# YOUR CODE HERE
# Solve with C and D combined: the third unknown is z + w, and both plants
# give 2 gallons of paraffin per barrel, so the split between them does not matter
demand=sy.Matrix([[20, 4, 4, 5000],
                   [10, 14, 5, 8500],
                   [5, 5, 12, 10000],
                  ])
B= demand.rref()
c=list(np.array(B[0]))

Supply_per_day_A = c[0][3]*3
Supply_per_day_B = c[1][3]*5
Supply_per_day_C = c[2][3]*2
total_paraffin = Supply_per_day_A+Supply_per_day_B+Supply_per_day_C
print("Total paraffin supplied is",total_paraffin.round(2))


Total paraffin supplied is 3152.50
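
The same total can be written as a dot product, which also shows that it does not depend on how barrels are split between plants $C$ and $D$. This sketch assumes plant $D$ yields 2 gallons of paraffin per barrel, as it is identical to $C$; the function name `paraffin` is illustrative.

```python
import numpy as np

# Paraffin per barrel for A, B, C, D (D assumed identical to C)
paraffin_per_barrel = np.array([3, 5, 2, 2], dtype=float)

def paraffin(w):
    """Total daily paraffin (gallons) when plant D processes w barrels."""
    barrels = np.array([195 / 4, 1325 / 4, 675 - w, w], dtype=float)
    return float(paraffin_per_barrel @ barrels)

print(paraffin(0), paraffin(300), paraffin(675))
```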


### Selling the first plant (1 point)

The management is also considering selling the first plant due to aging equipment and high workman's compensation costs for the state it is located in. They would like to know what this would do to their production capability. Specifically, they would like an example of a demand they could not meet with only plants 2 and 3, and also what effect having plant 4 has (recall it is identical to plant 3). They would also like an example of a demand that they could meet with just plants 2 and 3. Any general statements you could make here would be helpful.

Let the number of barrels used by factories $B$, $C$ and $D$ be $y$, $z$ and $w$ respectively.

When considering only plants 2 and 3, and demand (5000, 8500, 10000) then we have

$$4y + 4z = 5000$$

$$14y + 5z = 8500$$

$$5y + 12z = 10000$$

In [None]:
# YOUR CODE HERE
# rref of the augmented matrix [A | b] for plants B and C only
A_aug =sy.Matrix( [[4,4,5000],[14,5,8500],[5,12,10000]])
solution_B_C = A_aug.rref()

print("rref is ",solution_B_C)
print("A pivot in the last column means the system is inconsistent: no exact solution")


rref is  (Matrix([
[1, 0, 0],
[0, 1, 0],
[0, 0, 1]]), (0, 1, 2))
A pivot in the last column means the system is inconsistent: no exact solution


In [None]:
import numpy as np
from scipy.linalg import solve
Matrix_oils_B_C= np.array([[4,4],[14,5],[5,12]])
Matrix_demand_B_C = np.array([5000,8500,10000])
A_pinv = np.linalg.pinv( Matrix_oils_B_C)
x = np.dot(A_pinv,Matrix_demand_B_C)
print("solution is ",x)
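print("The system is overdetermined and inconsistent; this is the least-squares solution from the pseudoinverse")

solution is  [369.30178881 695.03750721]
The system is overdetermined and inconsistent; this is the least-squares solution from the pseudoinverse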


Now take the 4th plant into consideration.
Let the number of barrels used by factories $B$, $C$ and $D$ be $y$, $z$ and $w$ respectively.

Then for demand (5000, 8500, 10000) the system of linear equations will be

$$4y + 4z + 4w = 5000$$

$$14y + 5z + 5w = 8500$$

$$5y + 12z + 12w = 10000$$

Solve it using rref form.

In [None]:
#Method 1
demand_BCD =sy.Matrix([[4,4,4, 5000],
                      [14, 5,5, 8500],
                      [ 5, 12,12,10000]])
print(demand_BCD.rref())
print("Columns 2 and 3 are identical and the last column holds a pivot, so the system is inconsistent: no exact solution")

(Matrix([
[1, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 1]]), (0, 1, 3))
Columns 2 and 3 are identical and the last column holds a pivot, so the system is inconsistent: no exact solution


In [None]:
#Method 2
# scipy.linalg.solve needs an invertible matrix; here columns 2 and 3 are identical,
# so check the rank before attempting a direct solve
Matrix_oils_BDC = np.array([[4,4,4],[14,5,5],[5,12,12]])
Matrix_demand_BCD = np.array([5000,8500,10000])
rank = np.linalg.matrix_rank(Matrix_oils_BDC)
print("rank =",rank,"-> singular matrix, scipy.linalg.solve cannot give a meaningful answer")

rank = 2 -> singular matrix, scipy.linalg.solve cannot give a meaningful answer

In [None]:
#Method 3
# Minimum-norm least-squares solution via the pseudoinverse (the matrix is singular)
Matrix_oils_BDC = np.array([[4,4,4],[14,5,5],[5,12,12]])
Matrix_demand_BCD = np.array([5000,8500,10000])
A_pinv = np.linalg.pinv( Matrix_oils_BDC)
x = np.dot(A_pinv,Matrix_demand_BCD)
print(x.round(2))

[369.3  347.52 347.52]


Now change the demand to (6600, 5100, 3100) and solve the system of equations using rref form.

In [None]:
# YOUR CODE HERE
demand_BCD =sy.Matrix([[4,4,4, 6600],
                      [14, 5,5, 5100],
                      [ 5, 12,12,3100]])
print(demand_BCD.rref())
print("Columns 2 and 3 are identical and the last column holds a pivot, so the system is inconsistent: no exact solution")

(Matrix([
[1, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 1]]), (0, 1, 3))
Columns 2 and 3 are identical and the last column holds a pivot, so the system is inconsistent: no exact solution
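
Management also asked for a demand that plants 2 and 3 *can* meet. Any non-negative combination of their two yield columns qualifies. The sketch below builds one such demand and tests both it and the earlier demand; the helper `can_meet` is a hypothetical name introduced for this illustration and assumes the coefficient matrix has full column rank.

```python
import numpy as np

# Plants B and C only: rows = motor oil, diesel oil, gasoline
A_BC = np.array([[4, 4],
                 [14, 5],
                 [5, 12]], dtype=float)

def can_meet(A, demand):
    """True if demand is an exact, non-negative combination of A's columns.

    Assumes A has full column rank, so the least-squares solution is the
    only candidate for an exact solution.
    """
    barrels, *_ = np.linalg.lstsq(A, demand, rcond=None)
    exact = np.allclose(A @ barrels, demand)
    return bool(exact and (barrels >= -1e-9).all())

# Demand produced by giving 100 barrels each to plants B and C
meetable = A_BC @ np.array([100, 100])

print(meetable, can_meet(A_BC, meetable))
print(can_meet(A_BC, np.array([5000.0, 8500.0, 10000.0])))
```

Plant 4 has the same yield column as plant 3, so it does not enlarge the set of demands that can be met; it only allows the barrels of plant 3 to be split between two sites.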


### Set rates for Products (1 point)

The company wants to set the rates for motor oil, diesel oil, and gasoline. A few suggestions have been made:

* 100, 66, 102 Rupees per gallon,

* 104, 64, 100 Rupees per gallon,

* 102, 68, 98 Rupees per gallon, and

* 96, 68, 100 Rupees per gallon

for motor oil, diesel oil, and gasoline respectively.

Using matrix multiplication, find the rates which result in maximum total price.

Let $M$ denote the matrix whose rows represent the plants (A, B and C) and whose columns represent the products (motor oil, diesel oil and gasoline). Each entry is the production, in gallons, of that product from one barrel of crude oil at that plant.

$$M = \begin{bmatrix}
20 & 10 & 5 \\
4 & 14 & 5  \\
 4 & 5 & 12  
\end{bmatrix}$$

Also, $R$ is a matrix having different rates as its columns.

$$R = \begin{bmatrix}
100 & 104 & 102 & 96 \\
66 & 64 & 68 & 68  \\
102 & 100 & 98 & 100  
\end{bmatrix}$$

In [None]:
# YOUR CODE HERE
M=np.array([[20, 10, 5],
              [4, 14, 5],
              [4, 5, 12]])
R=np.array([[100,104,102,96],
           [66,64,68,68],
           [102,100,98,100]])
# Entry (i, j) of M.R is the price of one barrel's output at plant i under rate option j
Cost=np.dot(M,R)
totals=Cost.sum(axis=0)   # total over the three plants for each rate option
best=totals.argmax()      # index of the rate option with the largest total
print("Cost array:",totals)
print("rate combination which results in maximum total price is",R[:,best], "and the max cost is :",totals[best])


Cost array: [6958 6968 6984 6860]
rate combination which results in maximum total price is [102  68  98] and the max cost is : 6984


### Marginal Cost (1 point)

The total cost $C(x)$ in Rupees, associated with the production of $x$ gallons of gasoline is given by

$$C(x) = 0.005 x^3 - 0.02 x^2 + 30x + 5000$$

Find the marginal cost when $22$ gallons are produced, where marginal cost means the instantaneous rate of change of total cost at any level of output.

In [None]:
# YOUR CODE HERE
# differentiate C(x) and evaluate at x = 22
from sympy import *
x= symbols('x')
C=0.005*x**3-0.02*x**2+30*x+5000
gx = C.diff(x)
print("Rate of change of total cost at any level",gx)
hx=gx.subs(x,22)
print("marginal cost when  22  gallons are produced",hx)

Rate of change of total cost at any level 0.015*x**2 - 0.04*x + 30
marginal cost when  22  gallons are produced 36.3800000000000
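
The symbolic result can be confirmed by hand with the power rule:

$$C'(x) = 0.015x^2 - 0.04x + 30, \qquad C'(22) = 0.015(484) - 0.04(22) + 30 = 7.26 - 0.88 + 30 = 36.38$$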


### Marginal Revenue (1 point)

The total revenue in Rupees received from the sale of $x$ gallons of motor oil is given by $$R(x) = 3x^2 + 36x + 5.$$

Find the marginal revenue when $x = 28$, where marginal revenue means the rate of change of total revenue with respect to the number of items sold at an instant.

In [None]:
# YOUR CODE HERE
from sympy import *
x= symbols('x')
Rev=3*x**2+36*x+5      # total revenue R(x)
gx = Rev.diff(x)       # marginal revenue R'(x)
print("diff equation:",gx)
hx=gx.subs(x,28)
print("marginal revenue when x=28 ", hx)

diff equation: 6*x + 36
marginal revenue when x=28  204
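
The same value follows by hand:

$$R'(x) = 6x + 36, \qquad R'(28) = 6(28) + 36 = 204$$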


### Pouring crude oil in tank (1 point)

Crude oil is being poured into a cylindrical tank of radius 10 meters at the rate of 314 cubic meters per hour. Find

* the rate at which the height of crude oil is increasing in the tank, and
* the height of crude oil in tank after 2 hours.

In [None]:
# YOUR CODE HERE
from sympy import *
r,h= symbols('r,h')
V=np.pi*r**2*h              # volume of oil in the cylinder
dV_dh = V.diff(h)           # dV/dh = pi*r**2
rate_of_height = 314/dV_dh  # dh/dt = (dV/dt) / (dV/dh)
print("the rate at which the height of crude oil is increasing in the tank:",rate_of_height.subs(r,10))
# height after 2 hours, assuming the tank starts empty
print("height of crude oil in tank after 2 hours:",rate_of_height.subs(r,10)*2)

the rate at which the height of crude oil is increasing in the tank: 0.999493042617103
height of crude oil in tank after 2 hours: 1.99898608523421
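
The related-rates step behind this calculation, written out with $t$ for time in hours (a symbol introduced here for illustration):

$$V = \pi r^2 h \;\Rightarrow\; \frac{dV}{dt} = \pi r^2 \frac{dh}{dt} \;\Rightarrow\; \frac{dh}{dt} = \frac{314}{\pi (10)^2} \approx 0.9995\ \text{m/h}$$

With the approximation $\pi \approx 3.14$ the rate is exactly 1 meter per hour, so the oil stands about 2 meters high after 2 hours, assuming the tank starts empty.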
