<a href="https://colab.research.google.com/github/rajiv-ranjan/cds-mini-projects/blob/Archana/M4_NB_MiniProject_1_Linear_Algebra_and_Calculus.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced Certification Program in Computational Data Science
## A program by IISc and TalentSprint
### Mini Project Notebook: Linear Algebra and Calculus

## Problem Statement

 The task is to advise a petroleum company on how to meet the demands of their customers for motor oil, diesel oil and gasoline.

## Learning Objectives

At the end of the experiment, you will be able to

* create arrays and matrices in Python
* understand the concepts of linear equations
* solve the system of linear equations

### Data

From a barrel of crude oil, in one day, factory $A$ can produce
* 20 gallons of motor oil,
* 10 gallons of diesel oil, and
* 5 gallons of gasoline

Similarly, factory $B$ can produce
* 4 gallons of motor oil,
* 14 gallons of diesel oil, and
* 5 gallons of gasoline

while factory $C$ can produce
* 4 gallons of motor oil,
* 5 gallons of diesel oil, and
* 12 gallons of gasoline

There is also waste in the form of paraffin, among other things. Factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude, factory $B$ 5 gallons, and factory $C$ 2 gallons.

**Note:** Your conclusion should include a discussion of the nature of the terms *unique*, *no solution*, *overdetermined* and *underdetermined* as they apply in the context of the oil plants.

## Grading = 10 Points

In [1]:
import numpy as np
from scipy.linalg import solve

### Create an array

Create an array of size 2x3 with arbitrary values.

In [2]:
arr = np.array([[1,2,3],[4,5,6]])
arr

array([[1, 2, 3],
       [4, 5, 6]])

### Create the system of Linear Equations

Suppose the current daily demand from distributors is 6600 gallons of motor oil, 5100 gallons of diesel oil and 3100 gallons of gasoline.

Set up the system of equations which describes the above situation. Please include the units as well.

Let the number of barrels used by factories $A$, $B$ and $C$ be $x$, $y$ and $z$ respectively.

Then the system of linear equations will be

$$Motor\ oil:\ \ \ 20x + 4y + 4z = 6600$$

$$Diesel\ oil:\ \ \ 10x + 14y + 5z = 5100$$

$$Gasoline:\ \ \ 5x + 5y + 12z = 3100$$

### Solve the system of Linear Equations (2 points)

How many barrels of crude oil should each plant get in order to meet the demand as a group? Remember that we can only provide each plant with an integral number of barrels.

In [3]:
# YOUR CODE HERE
arr1 = np.array([[20,4,4],[10,14,5],[5,5,12]])

arr2 = np.array([6600,5100,3100])
no_of_barrels = np.round(solve(arr1, arr2)).astype(int)

print(f"Number of barrels required by factory A: {no_of_barrels[0]}")
print(f"Number of barrels required by factory B: {no_of_barrels[1]}")
print(f"Number of barrels required by factory C: {no_of_barrels[2]}")

Number of barrels required by factory A: 287
Number of barrels required by factory B: 129
Number of barrels required by factory C: 85
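Because each plant can only receive whole barrels, it is worth checking what the rounded plan actually produces. The sketch below is illustrative only: the names `yields`, `plan` and `surplus` are not part of the notebook, and the plan is simply the rounded output printed above.

```python
import numpy as np

# Gallons produced per barrel: rows are products, columns are factories A, B, C
yields = np.array([[20, 4, 4],
                   [10, 14, 5],
                   [5, 5, 12]])
demand = np.array([6600, 5100, 3100])

# Rounded allocation taken from the output above (hypothetical variable name)
plan = np.array([287, 129, 85])

production = yields @ plan
surplus = production - demand  # negative entries mean unmet demand

print("Production:", production.tolist())  # [6596, 5101, 3100]
print("Surplus:   ", surplus.tolist())     # [-4, 1, 0]
```

Under these assumptions the rounded plan falls 4 gallons short on motor oil. Rounding one allocation up instead (for example, 288 barrels for factory $A$) would cover the demand for all three products with a small surplus.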


Suppose the total demand for all products **doubled**. What would the solution now be? How does it compare to the original solution? Why, mathematically, should this have been expected?

In [4]:
# YOUR CODE HERE
double_products = 2 * arr2
double_barrels = np.round(solve(arr1, double_products)).astype(int)

print(f"Number of barrels required by factory A: {double_barrels[0]}")
print(f"Number of barrels required by factory B: {double_barrels[1]}")
print(f"Number of barrels required by factory C: {double_barrels[2]}")

Number of barrels required by factory A: 574
Number of barrels required by factory B: 258
Number of barrels required by factory C: 170


As expected, each factory's allocation also doubles. The system is linear, so if $x$ solves $Ax = b$, then $2x$ solves $A(2x) = 2b$.
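The scaling property can be checked on the unrounded solutions, which avoids any rounding effects. This is a minimal sketch; `A`, `b`, `x` and `x_doubled` are illustrative names rather than the notebook's own variables.

```python
import numpy as np
from scipy.linalg import solve

A = np.array([[20, 4, 4], [10, 14, 5], [5, 5, 12]], dtype=float)
b = np.array([6600, 5100, 3100], dtype=float)

x = solve(A, b)               # original demand
x_doubled = solve(A, 2 * b)   # doubled demand

# Homogeneity of a linear system: doubling b doubles x
print(np.allclose(x_doubled, 2 * x))  # True
```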

Suppose that the company acquires another group of distributors and that the daily demand of this group is 2000 gallons of motor oil, 4000 gallons of gasoline, and 4000 gallons of diesel oil. How would you set up production of just this supply? Are there any options (more than one way)?

In [5]:
new_arr2 = np.array([2000,4000,4000])
new_barrels = np.round(solve(arr1, new_arr2)).astype(int)

print("Number of barrels for new demand :")
print(f"Factory A: {new_barrels[0]}")
print(f"Factory B: {new_barrels[1]}")
print(f"Factory C: {new_barrels[2]}")

Number of barrels for new demand :
Factory A: 12
Factory B: 188
Factory C: 250


Next, calculate the needs of each factory (in barrels of crude, as usual) to meet the total demand of both groups of distributors. When you have done this, compare your answer to results already obtained. What mathematical conclusion can you draw?

In [6]:
# YOUR CODE HERE
tot_demand = arr2 + new_arr2
total_barrels = np.round(solve(arr1, tot_demand)).astype(int)

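# Sum of the two allocations computed separately for each group of distributors
result_obtained = no_of_barrels + new_barrels

print("Number of barrels for total demand of distributors :")
print(f"Factory A: {total_barrels[0]}")
print(f"Factory B: {total_barrels[1]}")
print(f"Factory C: {total_barrels[2]}")

print("\nSum of the two results obtained earlier :")
print(f"Factory A: {result_obtained[0]}")
print(f"Factory B: {result_obtained[1]}")
print(f"Factory C: {result_obtained[2]}")

Number of barrels for total demand of distributors :
Factory A: 300
Factory B: 316
Factory C: 335

Sum of the two results obtained earlier :
Factory A: 299
Factory B: 317
Factory C: 335


The two results agree up to rounding. Before rounding, the solutions add exactly: $(287.25, 128.75, 85) + (12.5, 187.5, 250) = (299.75, 316.25, 335)$. This is the superposition property of linear systems: if $Ax_1 = b_1$ and $Ax_2 = b_2$, then $A(x_1 + x_2) = b_1 + b_2$. The one-barrel differences for factories $A$ and $B$ arise only because each solution was rounded separately.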

### Sensitivity and Robustness (1 point)

In real life applications, constants are rarely ever exactly equal to their stated value; certain amounts of uncertainty are always present. This is part of the reason for the science of statistics. In the above model, the daily productions for the plants would be averages over a period of time. Explore what effect small changes in the parameters have on the output.

To do this, pick any 3 coefficients, one at a time, and increase or decrease them by 3%. For each case, note what effect this has on the solution, as a percentage change. Can you draw any overall conclusion?

In [7]:
A = np.array([
    [20, 4, 4],
    [10, 14, 5],
    [5, 5, 12]
])

B = np.array([6600, 5100, 3100])

# Function to compute percentage change in solution due to parameter variation
def compute_percentage_change(A, B, param_indices, variation):
    original_solution = solve(A, B)
    results = []

    for idx in param_indices:
        # Work on a float copy: scaling an integer array in place would truncate the result
        A_modified = A.astype(float)

        # Apply variation (+3% or -3%)
        A_modified.flat[idx] *= (1 + variation)

        # Solve with modified coefficients
        modified_solution = solve(A_modified, B)

        # Compute percentage change in solution
        percentage_change = ((modified_solution - original_solution) / original_solution) * 100
        results.append((idx, percentage_change))

    return results

# Indices of coefficients to vary (e.g., A[0,0], A[1,1], A[2,2] in flattened form)
param_indices = [0, 4, 8]  # Corresponds to A[0,0], A[1,1], and A[2,2]

# Increase by 3%
results_increase = compute_percentage_change(A, B, param_indices, 0.03)

# Decrease by 3%
results_decrease = compute_percentage_change(A, B, param_indices, -0.03)

# Display results (percentage change in x, y, z)
print("Percentage change when coefficients are increased by 3%:")
for idx, change in results_increase:
    print(f"Coefficient {idx}: " + ", ".join(f"{c:+.2f}%" for c in change))

print("\nPercentage change when coefficients are decreased by 3%:")
for idx, change in results_decrease:
    print(f"Coefficient {idx}: " + ", ".join(f"{c:+.2f}%" for c in change))


Percentage change when coefficients are increased by 3%:
Coefficient 0: -3.45%, +5.12%, +1.63%
Coefficient 4: +0.21%, -3.71%, +2.04%
Coefficient 8: +0.15%, +0.57%, -3.47%

Percentage change when coefficients are decreased by 3%:
Coefficient 0: +3.71%, -5.50%, -1.75%
Coefficient 4: -0.23%, +4.00%, -2.21%
Coefficient 8: -0.17%, -0.62%, +3.73%

In each case the allocation of the perturbed plant moves by roughly 3.5 to 4% in the opposite direction, and no allocation changes by more than about 5.5%. The solution is therefore not overly sensitive to small errors in these coefficients.
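One way to summarise this kind of sensitivity in a single number is the condition number of the coefficient matrix, which bounds how strongly a small relative perturbation of the inputs can be amplified in the solution. The following is a small sketch using `np.linalg.cond`; the variable name `cond_number` is illustrative.

```python
import numpy as np

# Coefficient matrix of the original three-plant system
A = np.array([[20, 4, 4],
              [10, 14, 5],
              [5, 5, 12]], dtype=float)

# 2-norm condition number (ratio of largest to smallest singular value)
cond_number = np.linalg.cond(A)
print(f"Condition number of A: {cond_number:.2f}")
```

A value close to 1 indicates a well-conditioned system, while very large values warn that small measurement errors in the plant yields could change the recommended allocation substantially.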


### A Plant Off-Line (1 point)

Suppose factory $C$ is shut down by the EPA (Environmental Protection Agency) temporarily for excessive emissions into the atmosphere. If your demand is as it was originally (6600, 5100, 3100), what would you now say about the company's ability to meet it? What do you recommend they schedule for production now?

In [12]:
Arr = np.array([
        [20, 4],
        [10, 14],
        [5, 5]])

demand_arr = np.array([6600, 5100, 3100])

# Three equations in two unknowns: use the least-squares solution
solution, residuals, rank, s = np.linalg.lstsq(Arr, demand_arr, rcond=None)
solution_rounded = np.round(solution)

# @ matrix multiplication operator
resulting_production = Arr @ solution_rounded

print("Solution (exact):", solution)
print("Rounded solution (practical barrels):", solution_rounded)
print("Resulting production with rounded solution:", resulting_production)
# Positive values are unmet demand, negative values are surplus
print("Residuals (unmet demand):", demand_arr - resulting_production)

Solution (exact): [299.47204969 168.47826087]
Rounded solution (practical barrels): [299. 168.]
Resulting production with rounded solution: [6652. 5342. 2335.]
Residuals (unmet demand): [ -52. -242.  765.]


### Buying another plant

#### (Note the following given information. You will see questions in continuation to this, in the subsequent sections)

This situation has caused enough concern that the CEO is considering buying another plant, identical to the third, and using it permanently. Assuming that all 4 plants are on line, what production do you recommend to meet the current demand (5000, 8500, 10000)? In general, what can you say about any increased flexibility that the 4th plant might provide?

Let the number of barrels used by factories $A$, $B$, $C$ and $D$ be $x$, $y$, $z$ and $w$ respectively.

Then the system of linear equations will be

$$20x + 4y + 4z + 4w = 5000$$

$$10x + 14y + 5z + 5w = 8500$$

$$5x + 5y + 12z + 12w = 10000$$

The above system of linear equations has fewer equations than variables, hence it is *underdetermined* and cannot have a unique solution: there are either infinitely many solutions or none. We can solve it by treating $w$ as a free parameter and reducing the augmented matrix to [rref](http://linear.ups.edu/html/section-RREF.html) form.

For the rref implementation in Python, see the [SymPy documentation](https://docs.sympy.org/latest/tutorial/matrices.html#rref).

In [10]:
import sympy as sy

# create symbol 'w'
w = sy.Symbol("w")
A_aug = sy.Matrix([[20, 4, 4, 5000-4*w],
                   [10, 14, 5, 8500-5*w],
                   [5, 5, 12, 10000-12*w]])
# show rref form
A_aug.rref()

(Matrix([
 [1, 0, 0,   195/4],
 [0, 1, 0,  1325/4],
 [0, 0, 1, 675 - w]]),
 (0, 1, 2))

From the above result, $x = 195/4$ and $y = 1325/4$ are fixed, while $z + w = 675$. The 4th plant therefore only shares the barrels required by the 3rd plant; the requirements of the 1st and 2nd plants remain unaffected.
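The one-parameter family of solutions read off from the rref form can be verified numerically. This is a sketch under the assumption that plant $D$ has the same yields as plant $C$; the helper `allocation` and the sample values of `w` are hypothetical.

```python
import numpy as np

# Yields per barrel for plants A, B, C, D (D assumed identical to C)
A4 = np.array([[20, 4, 4, 4],
               [10, 14, 5, 5],
               [5, 5, 12, 12]], dtype=float)
demand = np.array([5000, 8500, 10000], dtype=float)

def allocation(w):
    """Barrels for plants A, B, C, D when plant D receives w barrels (from the rref result)."""
    return np.array([195 / 4, 1325 / 4, 675 - w, w])

# Every choice of w gives an exact solution of the underdetermined system
for w in (0, 200, 337.5, 675):
    ok = np.allclose(A4 @ allocation(w), demand)
    print(f"w = {w}: demand met exactly -> {ok}")
```

Only values with $0 \le w \le 675$ give non-negative barrel counts for both plants, so the added flexibility lies entirely in how the 675 barrels are split between plants $C$ and $D$.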

### Calculate the amount of Paraffin supplied (1 point)

The company has just found a candle company that will buy its paraffin. Under the current conditions (i.e, after buying another plant) for demand (5000, 8500, 10000), how much can be supplied to them per day?

According to the problem statement, factory $A$ has 3 gallons of paraffin to dispose of per barrel of crude oil, factory $B$ 5 gallons, and factory $C$ 2 gallons.

In [15]:
A = np.array([
    [20, 4, 4, 4],
    [10, 14, 5, 5],
    [5, 5, 12, 12]])

B = np.array([5000, 8500, 10000])

# Solve for x, y, z, w
solution = np.linalg.lstsq(A, B, rcond=None)[0]

# Compute the total paraffin supplied
paraffin_per_barrel = np.array([3, 5, 2, 2])  # Paraffin gallons per barrel for each factory
total_paraffin = np.dot(paraffin_per_barrel, solution)

# Print the solution and total paraffin
print("Number of barrels allocated to each factory (x, y, z, w):", solution)
print("Total paraffin supplied:", total_paraffin)

Number of barrels allocated to each factory (x, y, z, w): [ 48.75 331.25 337.5  337.5 ]
Total paraffin supplied: 3152.500000000002
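The least-squares call above returns just one of the infinitely many valid allocations, so it is reasonable to ask whether the paraffin figure depends on that choice. The sketch below substitutes the general solution from the rref result into the paraffin total; it assumes plant $D$ produces 2 gallons of paraffin per barrel, like plant $C$.

```python
import sympy as sy

w = sy.Symbol("w")

# General solution from the rref result: x and y are fixed, z = 675 - w
x, y, z = sy.Rational(195, 4), sy.Rational(1325, 4), 675 - w

# Paraffin per barrel: A = 3, B = 5, C = 2, D = 2 (D assumed identical to C)
total_paraffin = sy.simplify(3 * x + 5 * y + 2 * z + 2 * w)

print(total_paraffin)         # 6305/2
print(float(total_paraffin))  # 3152.5
```

The parameter $w$ cancels, so under this assumption the daily paraffin supply is 3152.5 gallons regardless of how production is split between plants $C$ and $D$.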


### Selling the first plant (1 point)

The management is also considering selling the first plant due to aging equipment and high workman's compensation costs for the state it is located in. They would like to know what this would do to their production capability. Specifically, they would like an example of a demand they could not meet with only plants 2 and 3, and also what effect having plant 4 has (recall it is identical to plant 3). They would also like an example of a demand that they could meet with just plants 2 and 3. Any general statements you could make here would be helpful.

Let the number of barrels used by factories $B$ and $C$ be $y$ and $z$ respectively.

When considering only plants 2 and 3, and demand (5000, 8500, 10000) then we have

$$4y + 4z = 5000$$

$$14y + 5z = 8500$$

$$5y + 12z = 10000$$

In [17]:
A_bc = np.array([
    [4, 4],
    [14, 5],
    [5, 12]
])
B = np.array([5000, 8500, 10000])

# Attempt to solve using least squares
solution_bc, residuals, rank, s = np.linalg.lstsq(A_bc, B, rcond=None)

# Check feasibility (residuals should be close to 0)
print("Plants B and C")
print("Solution (y, z):", solution_bc)
print("Residuals:", residuals)

demand_met_bc = np.allclose(A_bc @ solution_bc, B, atol=1e-3)
print(f"Demands met with Plants B and C: {demand_met_bc}")

Plants B and C
Solution (y, z): [369.30178881 695.03750721]
Residuals: [607616.84939411]
Demands met with Plants B and C: False


Now take the 4th plant into consideration.
Let the number of barrels used by factories $B$, $C$ and $D$ be $y$, $z$ and $w$ respectively.

Then for demand (5000, 8500, 10000) the system of linear equations will be

$$4y + 4z + 4w = 5000$$

$$14y + 5z + 5w = 8500$$

$$5y + 12z + 12w = 10000$$

Solve it using rref form.

In [20]:
# Define the augmented matrix
A_aug = sy.Matrix([
    [4, 4, 4, 5000],
    [14, 5, 5, 8500],
    [5, 12, 12, 10000]
])

# Compute the RREF
A_aug.rref()
#A_rref, pivot_columns = A_aug.rref()

# Print the RREF result
#print("RREF of the augmented matrix:")
#print(A_rref)

(Matrix([
 [1, 0, 0, 0],
 [0, 1, 1, 0],
 [0, 0, 0, 1]]),
 (0, 1, 3))

Now, changing the demand to (6600, 5100, 3100) and solving the system of equations using rref form.

In [21]:
A_aug = sy.Matrix([
    [4, 4, 4, 6600],
    [14, 5, 5, 5100],
    [5, 12, 12, 3100]
])

A_aug.rref()

(Matrix([
 [1, 0, 0, 0],
 [0, 1, 1, 0],
 [0, 0, 0, 1]]),
 (0, 1, 3))

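In both cases the last row of the rref form reads $0 = 1$, which confirms that it is mathematically impossible to meet either demand exactly with plants B, C, and D. Because plant D is identical to plant C, it contributes no new direction: the three plants can meet exactly the same demands as plants B and C alone.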
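Management also asked for an example of a demand that plants 2 and 3 can meet. Any demand that is a non-negative combination of the two plants' yield columns qualifies. The sketch below builds one such demand and tests feasibility by comparing matrix ranks; the helper `is_feasible`, the tolerance, and the choice of 100 and 200 barrels are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

# Columns: gallons per barrel for plants B and C (rows: motor oil, diesel oil, gasoline)
A_bc = np.array([[4, 4],
                 [14, 5],
                 [5, 12]], dtype=float)

def is_feasible(A, demand, tol=1e-8):
    """An exact solution exists iff appending the demand column does not raise the rank."""
    augmented = np.column_stack([A, demand])
    return np.linalg.matrix_rank(augmented, tol=tol) == np.linalg.matrix_rank(A, tol=tol)

# Hypothetical demand produced by 100 barrels at B and 200 barrels at C
feasible_demand = A_bc @ np.array([100, 200])
print(feasible_demand.tolist())                # [1200.0, 2400.0, 2900.0]
print(is_feasible(A_bc, feasible_demand))      # True
print(is_feasible(A_bc, [5000, 8500, 10000]))  # False

# Plant D repeats plant C's column, so the rank (and the set of reachable demands) is unchanged
A_bcd = np.column_stack([A_bc, A_bc[:, 1]])
print(np.linalg.matrix_rank(A_bcd, tol=1e-8))  # 2
```

In general, with only two independent yield columns the company can meet only demands lying in a two-dimensional plane of the three-dimensional demand space, which is why a typical demand is *overdetermined* and has no exact solution.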

### Set rates for Products (1 point)

The company wants to set the rates of motor oil, diesel oil, and gasoline. The following rates have been suggested:

* 100, 66, 102 Rupees per gallon,

* 104, 64, 100 Rupees per gallon,

* 102, 68, 98 Rupees per gallon, and

* 96, 68, 100 Rupees per gallon

for motor oil, diesel oil, and gasoline respectively.

Using matrix multiplication, find the rates which result in maximum total price.

Let $M$ denote the matrix whose rows represent the plants (A, B and C), whose columns represent the products (motor oil, diesel oil and gasoline), and whose entries give the gallons of that product obtained from one barrel of crude oil at that plant.

$$M = \begin{bmatrix}
20 & 10 & 5 \\
4 & 14 & 5  \\
 4 & 5 & 12  
\end{bmatrix}$$

Also, $R$ is a matrix having different rates as its columns.

$$R = \begin{bmatrix}
100 & 104 & 102 & 96 \\
66 & 64 & 68 & 68  \\
102 & 100 & 98 & 100  
\end{bmatrix}$$

In [22]:
# production matrix M
M = np.array([
    [20, 10, 5],
    [4, 14, 5],
    [4, 5, 12]
])

# rate matrix R
R = np.array([
    [100, 104, 102, 96],
    [66, 64, 68, 68],
    [102, 100, 98, 100]
])

# Perform matrix multiplication to calculate total prices
total_prices = M @ R

# Find the rates that result in the maximum total price
max_total_price = np.max(total_prices, axis=1)
best_rates_indices = np.argmax(total_prices, axis=1)

# Display results
total_prices, max_total_price, best_rates_indices


(array([[3170, 3220, 3210, 3100],
        [1834, 1812, 1850, 1836],
        [1954, 1936, 1924, 1924]]),
 array([3220, 1850, 1954]),
 array([1, 2, 0]))

Best Rate Combinations for Each Plant:

Plant A prefers rates: [104, 64, 100] (2nd column of R).

Plant B prefers rates: [102, 68, 98] (3rd column of R).

Plant C prefers rates: [100, 66, 102] (1st column of R).
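The per-plant preferences differ, so a single company-wide rate needs one more step. One possible reading of "maximum total price" is the revenue summed over all three plants, assuming each plant processes one barrel; that equal weighting is an assumption made here for illustration.

```python
import numpy as np

# Production per barrel: rows are plants A, B, C; columns are motor oil, diesel oil, gasoline
M = np.array([[20, 10, 5],
              [4, 14, 5],
              [4, 5, 12]])

# Each column is one suggested set of rates (Rupees per gallon)
R = np.array([[100, 104, 102, 96],
              [66, 64, 68, 68],
              [102, 100, 98, 100]])

# Revenue per rate option, summed over the three plants (one barrel each, an assumption)
combined = (M @ R).sum(axis=0)
best = int(np.argmax(combined))

print(combined.tolist())    # [6958, 6968, 6984, 6860]
print(R[:, best].tolist())  # [102, 68, 98]
```

Under this assumption the third suggestion (102, 68, 98 Rupees per gallon) gives the highest combined total, even though it is the preferred option of plant B only.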

### Marginal Cost (1 point)

The total cost $C(x)$ in Rupees, associated with the production of $x$ gallons of gasoline is given by

$$C(x) = 0.005 x^3 - 0.02 x^2 + 30x + 5000$$

Find the marginal cost when $22$ gallons are produced, where, marginal cost means the instantaneous rate of change of total cost at any level of output.

In [25]:
# Define the cost function C(x)
x = sy.Symbol('x')
C = 0.005 * x**3 - 0.02 * x**2 + 30 * x + 5000

# Calculate the derivative of C(x), which is the marginal cost
marginal_cost = sy.diff(C, x)

# Evaluate the marginal cost when x = 22
marginal_cost_at_22 = marginal_cost.subs(x, 22)

print(f"Marginal cost function : {marginal_cost}")
print(f"Marginal cost at 22 gallons : {marginal_cost_at_22}")

Marginal cost function : 0.015*x**2 - 0.04*x + 30
Marginal cost at 22 gallons : 36.3800000000000


The marginal cost when 22 gallons are produced is 36.38 Rupees per gallon.
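As a sanity check on the symbolic derivative, the marginal cost can also be estimated numerically with a central difference. This is only a sketch; the function name `cost` and the step size `h` are choices made here.

```python
def cost(x):
    """Total cost C(x) in Rupees for x gallons of gasoline."""
    return 0.005 * x**3 - 0.02 * x**2 + 30 * x + 5000

# Central difference approximation of C'(22)
h = 1e-4
marginal_estimate = (cost(22 + h) - cost(22 - h)) / (2 * h)

print(round(marginal_estimate, 4))  # approximately 36.38
```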

### Marginal Revenue (1 point)

The total revenue in Rupees received from the sale of $x$ gallons of motor oil is given by $$R(x) = 3x^2 + 36x + 5.$$

Find the marginal revenue, when $x = 28$, where, marginal revenue means the rate of change of total revenue with respect to the number of items sold at an instant.

In [26]:
x = sy.Symbol('x')
R = 3 * x**2 + 36 * x + 5

# Differentiate R(x) to find the marginal revenue
marginal_revenue = sy.diff(R, x)

# Evaluate the marginal revenue when x = 28
marginal_revenue_at_28 = marginal_revenue.subs(x, 28)

# Print the results
print(f"Marginal Revenue Function: {marginal_revenue}")
print(f"Marginal Revenue at 28 gallons: {marginal_revenue_at_28}")

Marginal Revenue Function: 6*x + 36
Marginal Revenue at 28 gallons: 204


The marginal revenue when $x = 28$ gallons are sold is 204 Rupees per gallon.

### Pouring crude oil in tank (1 point)

In a cylindrical tank of radius 10 meters, crude oil is being poured at the rate of 314 cubic meters per hour. Find

* the rate at which the height of crude oil is increasing in the tank, and
* the height of crude oil in tank after 2 hours.

In [27]:
r = 10  #radius in meters
dV_dt = 314  #rate of oil poured in cubic meters per hour
t = 2  #time in hours

# Calculate dh/dt
dh_dt = dV_dt / (sy.pi * r**2)

# Calculate height after 2 hours
height_after_2_hours = dh_dt * t

# Display results
print("Rate of height increase (dh/dt):", dh_dt, "m/hour")
print("Height of crude oil after 2 hours:", height_after_2_hours, "meters")

Rate of height increase (dh/dt): 157/(50*pi) m/hour
Height of crude oil after 2 hours: 157/(25*pi) meters
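The symbolic results are exact but not easy to read as quantities. Since the volume of a cylinder is $V = \pi r^2 h$ and the radius is constant, $\frac{dh}{dt} = \frac{dV/dt}{\pi r^2}$. The sketch below evaluates this numerically with `math.pi`; it assumes the tank is empty at the start, and the variable names are illustrative.

```python
import math

radius = 10      # meters
fill_rate = 314  # cubic meters per hour
hours = 2

# V = pi * r^2 * h with constant r, so dh/dt = (dV/dt) / (pi * r^2)
dh_dt = fill_rate / (math.pi * radius**2)

# Height after the given time, assuming the tank starts empty
height = dh_dt * hours

print(f"dh/dt  = {dh_dt:.4f} m/hour")  # 0.9995
print(f"height = {height:.4f} m")      # 1.9990
```

So the oil level rises at just under 1 meter per hour and reaches just under 2 meters after 2 hours.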
