#Matrix and Covariance

The `mat_handler.py` module contains the `Matrix` class, which is the backbone of `pyemu`. The `Matrix` class overloads all common mathematical operators and uses an "auto-align" feature to line up matrix objects by row and column names for multiplication, addition, etc.



In [1]:
from __future__ import print_function
import os
import numpy as np
from pyemu import Matrix, Cov

Here is the most basic instantiation of the `Matrix` class:

In [2]:
m = Matrix()

Here we generate a `Matrix` object from a random ndarray, with explicit row and column names:

In [3]:
a = np.random.random((5, 5))
row_names = ["row_{0:02d}".format(i) for i in range(5)]
col_names = ["col_{0:02d}".format(i) for i in range(5)]
m = Matrix(x=a, row_names=row_names, col_names=col_names)
print(m)

row names: ['row_00', 'row_01', 'row_02', 'row_03', 'row_04']
col names: ['col_00', 'col_01', 'col_02', 'col_03', 'col_04']
[[ 0.3081309   0.79692701  0.98529707  0.31022483  0.45741098]
 [ 0.58665671  0.21098139  0.28957681  0.9345593   0.80868814]
 [ 0.0352743   0.88926795  0.82849891  0.15444307  0.71035009]
 [ 0.47402196  0.36725735  0.94682494  0.87774732  0.08436884]
 [ 0.12497742  0.38282169  0.85471852  0.65347694  0.82813489]]


#File I/O with `Matrix`
`Matrix` supports several PEST-compatible I/O routines as well as some others. The cells below round-trip `m` through an ASCII file and a binary file:

In [4]:
ascii_name = "mat_test.mat"
m.to_ascii(ascii_name)
m2 = Matrix.from_ascii(ascii_name)
print(m2)

row names: ['row_00', 'row_01', 'row_02', 'row_03', 'row_04']
col names: ['col_00', 'col_01', 'col_02', 'col_03', 'col_04']
[[ 0.3081309   0.79692701  0.98529707  0.31022483  0.45741098]
 [ 0.58665671  0.21098139  0.28957681  0.9345593   0.80868814]
 [ 0.0352743   0.88926795  0.82849891  0.15444307  0.71035009]
 [ 0.47402196  0.36725735  0.94682494  0.87774732  0.08436884]
 [ 0.12497742  0.38282169  0.85471852  0.65347694  0.82813489]]


In [5]:
bin_name = "mat_test.bin"
m.to_binary(bin_name)
m3 = Matrix.from_binary(bin_name)
print(m3)

row names: [u'row_00', u'row_01', u'row_02', u'row_03', u'row_04']
col names: [u'col_00', u'col_01', u'col_02', u'col_03', u'col_04']
[[ 0.3081309   0.79692701  0.98529707  0.31022483  0.45741098]
 [ 0.58665671  0.21098139  0.28957681  0.9345593   0.80868814]
 [ 0.0352743   0.88926795  0.82849891  0.15444307  0.71035009]
 [ 0.47402196  0.36725735  0.94682494  0.87774732  0.08436884]
 [ 0.12497742  0.38282169  0.85471852  0.65347694  0.82813489]]


`Matrix` also implements `to_dataframe()` and `to_sparse()`, which return a `pandas` `DataFrame` and a `scipy.sparse` compressed sparse row (CSR) matrix, respectively:

In [6]:
print(type(m.to_dataframe()))
print(type(m.to_sparse()))
m.to_dataframe() #looks really nice in the notebook!

<class 'pandas.core.frame.DataFrame'>
<class 'scipy.sparse.csr.csr_matrix'>


Unnamed: 0,col_00,col_01,col_02,col_03,col_04
row_00,0.308131,0.796927,0.985297,0.310225,0.457411
row_01,0.586657,0.210981,0.289577,0.934559,0.808688
row_02,0.035274,0.889268,0.828499,0.154443,0.71035
row_03,0.474022,0.367257,0.946825,0.877747,0.084369
row_04,0.124977,0.382822,0.854719,0.653477,0.828135


#Convenience methods of `Matrix`

Several cool things are implemented in `Matrix` and accessed through `@property`-decorated methods. For example, the SVD components of a `Matrix` object are accessed simply by name. The SVD routine is called on demand and the components are cast to `Matrix` objects, all behind the scenes:

In [7]:
print(m.s) #the singular values of m cast into a matrix object.  the SVD() is called on demand...
m.s.to_ascii("test_sv.mat") #save the singular values to a PEST-compatible ASCII file

row names: ['sing_val_1', 'sing_val_2', 'sing_val_3', 'sing_val_4', 'sing_val_5']
col names: ['sing_val_1', 'sing_val_2', 'sing_val_3', 'sing_val_4', 'sing_val_5']
[[ 2.8857439 ]
 [ 1.03096916]
 [ 0.73390013]
 [ 0.33988266]
 [ 0.08801381]]


In [8]:
m.v.to_ascii("test_v.mat") #the right singular vectors of m.
m.u.to_dataframe()# a data frame of the left singular vectors of m

Unnamed: 0,left_sing_vec_1,left_sing_vec_2,left_sing_vec_3,left_sing_vec_4,left_sing_vec_5
row_00,-0.467034,0.372711,-0.205052,0.37254,-0.679804
row_01,-0.408992,-0.673322,0.438243,0.432469,0.016634
row_02,-0.440049,0.569139,0.286897,0.155638,0.61311
row_03,-0.440987,-0.285314,-0.777889,-0.085838,0.334134
row_04,-0.475912,-0.048989,0.280134,-0.801619,-0.223695
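
To make the on-demand SVD a little more concrete, here is a minimal sketch using plain `numpy` rather than `pyemu`. It assumes the array printed for `m` above (rounded to 8 decimals) and uses `np.linalg.svd`, which returns the right singular vectors already transposed (`vt`). How `pyemu` orients its own `v` attribute is not shown in this notebook, so that is left open here.

```python
import numpy as np

# the 5x5 array printed for m above (rounded to 8 decimals)
a = np.array([
    [0.3081309, 0.79692701, 0.98529707, 0.31022483, 0.45741098],
    [0.58665671, 0.21098139, 0.28957681, 0.9345593, 0.80868814],
    [0.0352743, 0.88926795, 0.82849891, 0.15444307, 0.71035009],
    [0.47402196, 0.36725735, 0.94682494, 0.87774732, 0.08436884],
    [0.12497742, 0.38282169, 0.85471852, 0.65347694, 0.82813489],
])

# u: left singular vectors, s: singular values (descending), vt: V transposed
u, s, vt = np.linalg.svd(a, full_matrices=False)

# multiplying the three factors back together reproduces the original array
reconstructed = u @ np.diag(s) @ vt
print(np.allclose(reconstructed, a))
print(s)
```

The printed singular values should agree with the `m.s` output above to within the rounding of the input array.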


The `Matrix` inverse operation is accessed the same way, but requires a square matrix:

In [9]:
m.inv.to_dataframe()

Unnamed: 0,col_00,col_01,col_02,col_03,col_04
row_00,4.879749,0.951985,-3.70631,-1.898254,-0.252348
row_01,-3.005933,0.521678,4.094081,1.671013,-2.531158
row_02,2.670607,-1.007511,-2.624456,-0.778713,1.839288
row_03,-4.389092,0.315372,3.232537,2.461402,-0.907241
row_04,1.360202,0.406171,-1.17531,-1.624554,1.233262


#Manipulating `Matrix` shape
`Matrix` supports getting submatrices by row and column names:

In [10]:

print(m.get(row_names="row_00",col_names=["col_01","col_03"]))

row names: ['row_00']
col names: ['col_01', 'col_03']
[[ 0.79692701  0.31022483]]


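`extract()` calls `get()` then `drop()`, so the submatrix is returned and also removed from the object it was taken from (hence the `deepcopy` below):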

In [11]:
from copy import deepcopy
m_copy = deepcopy(m)
sub_m = m_copy.extract(row_names="row_00",col_names=["col_01","col_03"])
m_copy.to_dataframe()
sub_m.to_dataframe()

Unnamed: 0,col_01,col_03
row_00,0.796927,0.310225
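
The name-based selection behind `get()` and `extract()` can be sketched with plain `numpy` index lookups. This is only an illustration, not the `pyemu` implementation: `get_sub` and `drop_sub` are hypothetical helpers, and the sketch assumes that dropping removes the named rows and columns entirely.

```python
import numpy as np

def get_sub(x, row_names, col_names, rows, cols):
    """Return the submatrix of x selected by row and column names."""
    r = [row_names.index(name) for name in rows]
    c = [col_names.index(name) for name in cols]
    return x[np.ix_(r, c)]

def drop_sub(x, row_names, col_names, rows, cols):
    """Return x and its name lists with the named rows and columns removed."""
    r = [row_names.index(name) for name in rows]
    c = [col_names.index(name) for name in cols]
    kept = np.delete(np.delete(x, r, axis=0), c, axis=1)
    kept_rows = [name for name in row_names if name not in rows]
    kept_cols = [name for name in col_names if name not in cols]
    return kept, kept_rows, kept_cols

# first two rows and first four columns of the m printed above
row_names = ["row_00", "row_01"]
col_names = ["col_00", "col_01", "col_02", "col_03"]
x = np.array([
    [0.3081309, 0.79692701, 0.98529707, 0.31022483],
    [0.58665671, 0.21098139, 0.28957681, 0.9345593],
])

# "extract" = get the named block, then drop it from what remains
sub = get_sub(x, row_names, col_names, ["row_00"], ["col_01", "col_03"])
rest, rest_rows, rest_cols = drop_sub(
    x, row_names, col_names, ["row_00"], ["col_01", "col_03"]
)
print(sub)
print(rest_rows, rest_cols)
```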


#Operator overloading
The operator overloading uses the auto-align functionality as well as the `isdiagonal` flag for super easy linear algebra.  The "inner join" of the two objects is found and the rows and cols are aligned appropriately:

In [12]:
#a new matrix object that is not "aligned" with m
row_names = ["row_03","row_02","row_00"]
col_names = ["col_01","col_10","col_100"]
m_mix = Matrix(x=np.random.random((3,3)),row_names=row_names,col_names=col_names)
m_mix.to_dataframe()


Unnamed: 0,col_01,col_10,col_100
row_03,0.149962,0.139037,0.878849
row_02,0.639239,0.609701,0.357757
row_00,0.292393,0.104705,0.795974


In [13]:
m.to_dataframe()

Unnamed: 0,col_00,col_01,col_02,col_03,col_04
row_00,0.308131,0.796927,0.985297,0.310225,0.457411
row_01,0.586657,0.210981,0.289577,0.934559,0.808688
row_02,0.035274,0.889268,0.828499,0.154443,0.71035
row_03,0.474022,0.367257,0.946825,0.877747,0.084369
row_04,0.124977,0.382822,0.854719,0.653477,0.828135


In [14]:
prod = m * m_mix.T
prod.to_dataframe()

Unnamed: 0,row_03,row_02,row_00
row_00,0.119509,0.509427,0.233016
row_01,0.031639,0.134868,0.061689
row_02,0.133357,0.568455,0.260015
row_03,0.055075,0.234765,0.107383
row_04,0.057409,0.244715,0.111934


In [15]:
prod2 = m_mix.T * m
prod2.to_dataframe()

Unnamed: 0,col_00,col_01,col_02,col_03,col_04
col_01,0.183729,0.856545,0.95969,0.321062,0.600479
col_10,0.119676,0.676692,0.739946,0.248685,0.492725
col_100,0.674477,1.275239,1.912788,1.073591,0.692367


In [16]:
(m_mix + m).to_dataframe()

Unnamed: 0,col_01
row_03,0.517219
row_02,1.528507
row_00,1.08932
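
The products above only involve `col_01`, because that is the single name shared between the columns of `m` and the rows of `m_mix.T`. A minimal `numpy` sketch of that inner-join idea follows. The helper names are hypothetical, the real `pyemu` logic may differ, and the sketch simply keeps the name order of the first operand.

```python
import numpy as np

def common(first, second):
    """Names present in both lists, in the order of the first list."""
    return [name for name in first if name in second]

def aligned_product(a, a_cols, b, b_rows):
    """a @ b restricted to the inner-dimension names both operands share."""
    names = common(a_cols, b_rows)
    if not names:
        raise ValueError("no common names on the inner dimension")
    a_idx = [a_cols.index(name) for name in names]
    b_idx = [b_rows.index(name) for name in names]
    return a[:, a_idx] @ b[b_idx, :]

# two rows of m (columns col_00 and col_01), values from the table above
m_part = np.array([[0.308131, 0.796927],
                   [0.586657, 0.210981]])
m_part_cols = ["col_00", "col_01"]

# the transpose of m_mix, values from the table above
m_mix_t = np.array([[0.149962, 0.639239, 0.292393],
                    [0.139037, 0.609701, 0.104705],
                    [0.878849, 0.357757, 0.795974]])
m_mix_t_rows = ["col_01", "col_10", "col_100"]

# only col_01 is shared, so this is an outer product of one column and one row
prod = aligned_product(m_part, m_part_cols, m_mix_t, m_mix_t_rows)
print(prod)
```

The two rows printed here should match the `row_00` and `row_01` rows of the `m * m_mix.T` output, up to the rounding of the tabulated inputs.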


#The `Cov` derived type
The `Cov` type is designed specifically to handle covariance matrices. It makes some assumptions, such as symmetry (and accordingly that `row_names == col_names`).

In [17]:
c = Cov(m.newx,m.row_names)
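
Note that the random array behind `m` is not symmetric, so the `Cov` built from it only exercises the class mechanics. As a minimal `numpy`-only sketch (not part of `pyemu`), one way to get a symmetric, positive semi-definite array for experimenting is to multiply a random array by its own transpose. The seed is arbitrary and only there for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary fixed seed
a = rng.random((5, 5))

# a @ a.T is symmetric and positive semi-definite by construction
sym = a @ a.T

is_symmetric = np.allclose(sym, sym.T)
min_eig = np.linalg.eigvalsh(sym).min()  # eigvalsh is meant for symmetric input
print(is_symmetric, min_eig >= -1e-12)
```

An array like `sym` could then be passed to `Cov` with a single list of names, in the same way `m.newx` and `m.row_names` are passed above.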

The `Cov` class supports several additional I/O routines, including the PEST uncertainty file (.unc):

In [18]:
c.to_uncfile("test.unc")

In [19]:
c1 = Cov.from_uncfile("test.unc")
print(c1)

row names: ['row_00', 'row_01', 'row_02', 'row_03', 'row_04']
col names: ['row_00', 'row_01', 'row_02', 'row_03', 'row_04']
[[ 0.3081309   0.79692701  0.98529707  0.31022483  0.45741098]
 [ 0.58665671  0.21098139  0.28957681  0.9345593   0.80868814]
 [ 0.0352743   0.88926795  0.82849891  0.15444307  0.71035009]
 [ 0.47402196  0.36725735  0.94682494  0.87774732  0.08436884]
 [ 0.12497742  0.38282169  0.85471852  0.65347694  0.82813489]]


We can also build `Cov` objects implied by PEST control file parameter bounds or observation weights:

In [22]:
parcov = Cov.from_parbounds(os.path.join("henry","pest.pst"))
obscov = Cov.from_obsweights(os.path.join("henry","pest.pst"))

In [23]:
parcov.to_dataframe() #to_dataframe for diagonal types builds a full matrix dataframe - can be costly

Unnamed: 0,global_k,mult1,mult2,kr01c01,kr01c02,kr01c03,kr01c04,kr01c05,kr01c06,kr01c07,...,kr10c51,kr10c52,kr10c53,kr10c54,kr10c55,kr10c56,kr10c57,kr10c58,kr10c59,kr10c60
global_k,0.003076,0.000000,0.000000,0.00,0.00,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
mult1,0.000000,0.003076,0.000000,0.00,0.00,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
mult2,0.000000,0.000000,0.022655,0.00,0.00,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
kr01c01,0.000000,0.000000,0.000000,0.25,0.00,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
kr01c02,0.000000,0.000000,0.000000,0.00,0.25,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
kr01c03,0.000000,0.000000,0.000000,0.00,0.00,0.25,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
kr01c04,0.000000,0.000000,0.000000,0.00,0.00,0.00,0.25,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
kr01c05,0.000000,0.000000,0.000000,0.00,0.00,0.00,0.00,0.25,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
kr01c06,0.000000,0.000000,0.000000,0.00,0.00,0.00,0.00,0.00,0.25,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
kr01c07,0.000000,0.000000,0.000000,0.00,0.00,0.00,0.00,0.00,0.00,0.25,...,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
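
As a rough illustration of how a diagonal entry can follow from parameter bounds, the sketch below assumes that the bounds span about four standard deviations (a common "95% interval" convention) and that the parameter is log10-transformed. Both assumptions, the helper name `variance_from_bounds`, and the example bounds are made up for illustration. The actual bounds in the `henry` control file are not shown in this notebook.

```python
import math

def variance_from_bounds(lower, upper, log_transformed=True, sigma_range=4.0):
    """Prior variance implied by parameter bounds (hypothetical helper).

    Assumes the distance between the bounds equals `sigma_range`
    standard deviations, measured in log10 space if log_transformed.
    """
    if log_transformed:
        lower, upper = math.log10(lower), math.log10(upper)
    sigma = (upper - lower) / sigma_range
    return sigma ** 2

# made-up bounds two orders of magnitude apart
var = variance_from_bounds(0.1, 10.0)
print(var)
```

Under these assumptions, bounds two orders of magnitude apart give a standard deviation of 0.5 in log10 space, which is the same order as the 0.25 variances shown for the `kr` parameters.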


In [24]:
obscov.to_dataframe()  # notice the zero-weight obs have been assigned a really large uncertainty

Unnamed: 0,h_obs01_1,h_obs01_2,h_obs02_1,h_obs02_2,h_obs03_1,h_obs03_2,h_obs04_1,h_obs04_2,h_obs05_1,h_obs05_2,...,c_obs12_2,c_obs13_1,c_obs13_2,c_obs14_1,c_obs14_2,c_obs15_1,c_obs15_2,pd_one,pd_ten,pd_half
h_obs01_1,0.000043,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs01_2,0.000000,1.000000e+60,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs02_1,0.000000,0.000000e+00,0.000043,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs02_2,0.000000,0.000000e+00,0.000000,1.000000e+60,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs03_1,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000043,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs03_2,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,1.000000e+60,0.000000,0.000000e+00,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs04_1,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000043,0.000000e+00,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs04_2,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,1.000000e+60,0.000000,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs05_1,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000043,0.000000e+00,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
h_obs05_2,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000,1.000000e+60,...,0.000000e+00,0.0000,0.000000e+00,0.00000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00
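
The pattern in this table, small variances for weighted observations and `1e+60` for zero-weight ones, is consistent with treating each weight as the inverse of a standard deviation and flooring the weight at a tiny value before inverting. The sketch below illustrates that idea with `numpy`. The floor of `1e-30`, the helper name, and the example weights are assumptions chosen to mirror the output above, not values read from the control file.

```python
import numpy as np

def variances_from_weights(weights, min_weight=1.0e-30):
    """Variance = (1 / weight)**2, with zero weights floored at min_weight."""
    w = np.maximum(np.asarray(weights, dtype=float), min_weight)
    return (1.0 / w) ** 2

# made-up weights: two informative observations and one zero-weight observation
weights = [152.0, 0.0, 152.0]
var = variances_from_weights(weights)

# a diagonal covariance matrix built from those variances
obscov_sketch = np.diag(var)
print(obscov_sketch)
```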
