# Assignment 6

The Kozeny-Carman (K-C) relationship is a model that relates permeability $\kappa$ to porosity $\phi$ through a proportionality constant $m$:

$$
\kappa \propto \frac{\phi^3}{(1 - \phi)^2} = f(\phi)
$$
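
As a minimal sketch, the porosity function $f(\phi)$ above can be evaluated for a whole array of porosities at once with NumPy. The helper name `kc_factor` and the sample values are assumptions made for illustration; they are not part of the assignment skeleton or the data file.

```python
import numpy as np

def kc_factor(phi):
    """Return phi^3 / (1 - phi)^2 elementwise (hypothetical helper name)."""
    phi = np.asarray(phi, dtype=float)
    return phi**3 / (1.0 - phi)**2

# Illustrative porosity values, not taken from the data file
phi_demo = np.array([0.1, 0.2, 0.5])
f_demo = kc_factor(phi_demo)
print(f_demo)
```

Because the arithmetic operators act elementwise on arrays, no explicit loop is required.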

The file [poro_perm.csv](poro_perm.csv) contains two columns of data corresponding to porosity and permeability measurements for a reservoir. Your assignment is to write a least squares algorithm in Python, using the NumPy library, to determine the proportionality constant $m$ for the K-C relationship.

If you're not familiar with least squares fitting, see this [Wikipedia article](https://en.wikipedia.org/wiki/Least_squares) for more details. Essentially, the goal is to minimize the error between an assumed model (in this case the K-C model) and the data. After some derivation, this leads to the following matrix-vector equation

$$
\mathbf{A}^\intercal \mathbf{A} \vec{x} = \mathbf{A}^\intercal \vec{b},
$$

which has the solution

$$
\vec{x} =\left(\mathbf{A}^\intercal \mathbf{A}\right)^{-1} \mathbf{A}^\intercal \vec{b}.
$$
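
The solution above can be sketched in a few lines of NumPy. This is an illustration on synthetic data that lie exactly on a known line, not the assignment solution; the function name `solve_normal_equations` is hypothetical. Rather than forming the inverse explicitly, the sketch hands the normal equations to `numpy.linalg.solve`.

```python
import numpy as np

def solve_normal_equations(A, b):
    """Solve (A^T A) x = A^T b for x (hypothetical helper name)."""
    AtA = np.dot(A.T, A)   # 2x2 matrix for a two-column A
    Atb = np.dot(A.T, b)   # right-hand side of the normal equations
    return np.linalg.solve(AtA, Atb)

# Synthetic data lying exactly on y = 2 + 3 t
A_demo = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0],
                   [1.0, 3.0]])
y_demo = np.array([2.0, 5.0, 8.0, 11.0])

x_demo = solve_normal_equations(A_demo, y_demo)
print(x_demo)  # intercept first, then slope
```

Since the synthetic data contain no noise, the recovered intercept and slope should match the values used to generate them, up to floating-point round-off.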

For this problem, 

$$
\mathbf{A} =
\begin{bmatrix}
1 & f(\phi_1) \\
1 & f(\phi_2) \\
\vdots & \vdots \\
1 & f(\phi_N)
\end{bmatrix},
$$

and

$$
\vec{b} = \left\lbrace
\begin{matrix}
\kappa_1 \\
\kappa_2 \\
\vdots \\
\kappa_N
\end{matrix}
\right\rbrace
$$

where $\kappa_i$ is the permeability data and the subscripts $i=1, 2, \ldots, N$ indicate rows in the [poro_perm.csv](poro_perm.csv) file.
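
One way to assemble $\mathbf{A}$ and $\vec{b}$ without a loop is `numpy.column_stack`. The sketch below uses made-up numbers standing in for $f(\phi_i)$ and $\kappa_i$; the variable names are assumptions for illustration only.

```python
import numpy as np

# Made-up values standing in for f(phi_i) and kappa_i; not from the data file
f_vals = np.array([0.001, 0.010, 0.050, 0.120])
kappa_vals = np.array([12.0, 95.0, 510.0, 1190.0])

# First column of ones (intercept), second column f(phi_i) (slope)
A = np.column_stack((np.ones_like(f_vals), f_vals))
b = kappa_vals

print(A.shape, b.shape)
```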

The solution vector $\vec{x}$ has two entries, $(\kappa_0, m)$, corresponding to the intercept and slope of a line, i.e.

$$
\kappa = \kappa_0+ m f(\phi)
$$

You will notice for this data that $\kappa_0 \ne 0$, which may or may not make sense depending on the standard deviation of the data. It may be desirable to have the ability to force the fit to pass through the origin. One "trick" to force this behavior is to simply append negated values of the K-C model and permeability data to $\mathbf{A}$ and $\vec{b}$, leaving the column of ones unchanged. The augmented data are then symmetric about the origin, so the best-fit line must pass through it.
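
The following sketch illustrates one reading of this trick on made-up numbers: the mirrored rows keep the 1 in the first column and negate only $f(\phi_i)$ and $\kappa_i$. That reading is an assumption, as are the variable names. For comparison, the sketch also computes the slope of a one-parameter fit through the origin directly.

```python
import numpy as np

# Made-up values standing in for f(phi_i) and kappa_i
f_vals = np.array([1.0, 2.0, 3.0])
kappa_vals = np.array([2.5, 4.0, 6.5])

ones = np.ones_like(f_vals)
A = np.column_stack((ones, f_vals))

# Append mirrored rows: same column of ones, negated f and negated kappa
A_sym = np.vstack((A, np.column_stack((ones, -f_vals))))
b_sym = np.concatenate((kappa_vals, -kappa_vals))

x_sym = np.linalg.solve(np.dot(A_sym.T, A_sym), np.dot(A_sym.T, b_sym))

# Slope of a direct fit kappa = m * f with no intercept term
slope_direct = np.dot(f_vals, kappa_vals) / np.dot(f_vals, f_vals)

print(x_sym, slope_direct)
```

In the augmented system the off-diagonal entries of $\mathbf{A}^\intercal \mathbf{A}$ cancel, so the intercept equation decouples and yields zero, while the slope equation reduces to the direct through-origin formula.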

Complete the class below by implementing the functions `kc_model`, `least_squares`, `fit`, and `fit_through_zero`, and by completing the line in `__init__` that reads the data file. Use NumPy data structures and operations. No `for`-loops or `if`-statements are needed anywhere in the code. For solving the linear system of equations, see [`numpy.linalg.solve()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.solve.html). Matrix-vector dot products are carried out using the [`numpy.dot()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html) function.
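
One possible sanity check for a hand-written solver is to compare its output with NumPy's built-in `numpy.linalg.lstsq`. The sketch below does this on synthetic noisy data; it is a suggestion for checking your own work, not a required part of the assignment, and all names in it are illustrative.

```python
import numpy as np

# Synthetic noisy line; the seed is fixed so the run is repeatable
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
y = 1.5 + 4.0 * t + 0.05 * rng.standard_normal(t.size)
A = np.column_stack((np.ones_like(t), t))

# Normal-equations solution using numpy.dot and numpy.linalg.solve
x_normal = np.linalg.solve(np.dot(A.T, A), np.dot(A.T, y))

# Reference solution from NumPy's least squares routine
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(x_normal, x_lstsq))
```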

import numpy as np
import pandas as pd

class KozenyCarmen():
    
    def __init__(self, filename):
        
        # Complete the line of code below to read `filename` as
        # a csv file into a Pandas DataFrame.
        
        # self.df = 
        
        # Calls function to add 'KC model' column to the DataFrame
        self.kc_model()
        
        return
    
    def kc_model(self):
        # This function should add a column to `self.df` called "KC model"
        # whose values are 𝜙^3 / (1 - 𝜙)^2
        
        #self.df['KC model'] = 
        
        return 
    
    def least_squares(self, A, b):
        # This should return the least squares solution for
        # an arbitrary matrix A and right-hand side vector b
        return     
    
    def fit(self):
        # This function should extract the appropriate columns from self.df
        # to compute and return the solution to the least squares problem.
        
        # Return both the slope and intercept values
        return
    
    def fit_through_zero(self):
        # This function should extract the appropriate columns from self.df
        # to compute and return the solution to the least squares problem,
        # forcing the intercept to be zero.
        
        # Return only the slope value
        return