This notebook is a walkthrough of the basics of matrix rank over arbitrary fields.  It assumes that the reader knows about Gaussian elimination for square invertible matrices, and perhaps Cramer's rule as well.  The idea is to take the nice properties we know for invertible (i.e. full rank, square) matrices and see how we can use them to develop properties for less ideal matrices (i.e. not square and/or not full rank).  

We start by considering rank to mean the column rank, and in step **3** prove that the column rank must be equal to the row rank.  



For this entire notebook, we have  
$\mathbf A \in \mathbb F^{m \times n}$  

**0. (column) Rank Definition**  

(from page 29 of *A Terse Introduction to Linear Algebra*)  
*The column space of matrix* $\mathbf A$... *is the subspace of* $\mathbb F^{m}_c$ *spanned by the columns of* $\mathbf A$.  *The dimension of this space is called the column rank of* $\mathbf A$.  

Here $\mathbb F^{m}_c$ most likely denotes the space of column vectors of length $m$ with entries in $\mathbb F$ (the subscript $c$ standing for 'column'); this reading is an assumption about the book's notation and is still **tbc**.  
In particular we can characterize this dimension as the cardinality of a minimal set of vectors that generates the column space; such a set is necessarily linearly independent.  
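As a concrete illustration of the definition, here is a minimal sketch that computes the column rank by eliminating directly on the columns.  It assumes the field is the rationals (via Python's `fractions.Fraction`) so that the arithmetic is exact; the name `column_rank` and the example matrix are made up for this illustration.

```python
from fractions import Fraction

def column_rank(rows):
    """Column rank of a matrix stored as a list of rows, using exact rationals."""
    # One vector per column of the matrix.
    vecs = [[Fraction(x) for x in col] for col in zip(*rows)]
    rank = 0
    for pos in range(len(rows)):  # pos indexes the entries inside each column vector
        # Find a not-yet-used column with a nonzero entry in this position.
        pivot = next((i for i in range(rank, len(vecs)) if vecs[i][pos] != 0), None)
        if pivot is None:
            continue
        vecs[rank], vecs[pivot] = vecs[pivot], vecs[rank]
        # Clear this position in every other column.
        for i in range(len(vecs)):
            if i != rank and vecs[i][pos] != 0:
                factor = vecs[i][pos] / vecs[rank][pos]
                vecs[i] = [a - factor * b for a, b in zip(vecs[i], vecs[rank])]
        rank += 1
    return rank

# Third column = first column + second column, so only two columns are independent.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
print(column_rank(A))
```

The columns that survive elimination as nonzero vectors form a minimal generating set for the column space, and their count is the column rank.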

**1. Right matrix multiplication cannot increase rank**  

$\mathbf {A}  =\bigg[\begin{array}{c|c|c|c|c} \mathbf a_1 & \mathbf a_2 &\cdots & \mathbf a_{n-1} & \mathbf a_{n}\end{array}\bigg]$   


with  
$\mathbf B \in \mathbb F^{n \times r}$  

we know  
$\text{rank}\big(\mathbf A\big) = \text{size of a minimal generating set for the column space} = \text{dimension of the span of the columns of } \mathbf A$  
$\text{span}\big(\mathbf A\big) = \sum_{j=1}^n \alpha_j \mathbf a_j$  
where the $\alpha_j$ are (unrestricted) scalars in our field  

now  
$\mathbf {AB}  $  
$= \mathbf A \bigg[\begin{array}{c|c|c|c|c} 
\mathbf b_1 & \mathbf b_2 &\cdots & \mathbf b_{r-1} & \mathbf b_{r} 
\end{array}\bigg]   $  
$=  \bigg[\begin{array}{c|c|c|c|c} \mathbf A \mathbf b_1 & \mathbf A\mathbf b_2 &\cdots & \mathbf A\mathbf b_{r-1} & \mathbf A\mathbf b_{r} 
\end{array}\bigg] $  
$= \bigg[\begin{array}{c|c|c|c|c} \sum_{j=1}^n b_{j,1} \mathbf a_j & \sum_{j=1}^n b_{j,2} \mathbf a_j &\cdots & \sum_{j=1}^n b_{j,r-1} \mathbf a_j & \sum_{j=1}^n b_{j,r} \mathbf a_j
\end{array}\bigg]$  

so  
$\text{span}\big(\mathbf {AB}\big) = \beta_1\big(\sum_{j=1}^n b_{j,1} \mathbf a_j\big) +\beta_2\big(\sum_{j=1}^n b_{j,2} \mathbf a_j\big) + ... + \beta_r\big(\sum_{j=1}^n b_{j,r} \mathbf a_j\big) \subset \sum_{j=1}^n \alpha_j \mathbf a_j =\text{span}\big(\mathbf A\big)   $


(where $\subset$ denotes subset, not 'proper subset' per se.)  The inclusion follows because for any choice of the $\beta_k$, each product $\beta_k\cdot b_{j,k}$ exists in our field, so the sum 
$\sum_{k=1}^r \beta_k\cdot b_{j,k}$ exists in our field (closure) and we can select $\alpha_j :=\sum_{k=1}^r \beta_k\cdot b_{j,k}$ to write the same vector as a linear combination of the columns of $\mathbf A$.  
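The choice of $\alpha_j$ above can be checked numerically.  The following is a small sketch, assuming rational entries; the matrices, the weights `beta`, and the helper names `matvec` and `matmul` are all invented for the example.

```python
from fractions import Fraction

def matvec(M, v):
    """Matrix-vector product: the linear combination of M's columns with weights v."""
    return [sum(entry * w for entry, w in zip(row, v)) for row in M]

def matmul(X, Y):
    """Matrix product, with matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2], [3, 4], [5, 6]]          # m x n = 3 x 2
B = [[1, 0, 2], [1, 1, 0]]            # n x r = 2 x 3
beta = [Fraction(1, 2), Fraction(-1), Fraction(3)]  # arbitrary weights on the columns of AB

lhs = matvec(matmul(A, B), beta)      # a vector in span(AB)
alpha = matvec(B, beta)               # alpha_j = sum_k beta_k * b_{j,k}
rhs = matvec(A, alpha)                # the same vector, written with the columns of A

print(lhs == rhs)
```

Any other choice of `beta` works the same way, which is all the inclusion needs.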

now the column space of $\big(\mathbf {AB}\big)$ has a minimal generating set whose cardinality is the rank.  And we conclude that  
$\text{rank}\big(\mathbf {AB}\big)\leq \text{rank}\big(\mathbf {A}\big)$   
because every generator of the column space of $\big(\mathbf {AB}\big)$ (each of which lies in the span of $\mathbf A$) may be written as a linear combination of the generators of $\big(\mathbf {A}\big)$ (which generate its entire span).  For avoidance of doubt, suppose for contradiction that 
$\mathbf C:= \big(\mathbf {AB}\big)$ had more linearly independent columns / generators than $\mathbf A$, say 
$k_c \gt k_a$, where we take WLOG that the first $k_c$ columns of $\mathbf C$ and the first $k_a$ columns of $\mathbf A$ are linearly independent.  Then  

**the below ending is long and needs cleaned up**  


$\mathbf 0 = \gamma_1 \mathbf c_1 + \gamma_2 \mathbf c_2 +... + \gamma_{k_c} \mathbf c_{k_c}$   
implies $\gamma_i =0$ in all cases by the definition of linearly independent.  But each $\mathbf c_i$ lies in the span of $\mathbf a_1, ..., \mathbf a_{k_a}$, say $\mathbf c_i = \sum_{j=1}^{k_a} \alpha_{i,j} \mathbf a_j$, so 

$\mathbf 0 = \gamma_1 \big(\sum_{j=1}^{k_a} \alpha_{1,j} \mathbf a_j\big) + \gamma_2 \big(\sum_{j=1}^{k_a} \alpha_{2,j} \mathbf a_j\big) +... + \gamma_{k_c} \big(\sum_{j=1}^{k_a} \alpha_{k_c,j} \mathbf a_j\big)$   

and letting invertible $\mathbf S \in \mathbb F^{m \times m}$ be  
$\mathbf S := \bigg[\begin{array}{c|c|c|c|c|c} \mathbf a_1 & \cdots & \mathbf a_{k_a} & \mathbf s_{{k_a} +1}&\cdots & \mathbf s_{m}\end{array}\bigg]$   

i.e. $\mathbf S$ takes (any maximal sized collection of) linearly independent columns of $\mathbf A$ and extends them with additional vectors to create a basis of $\mathbb F^m_c$, so that $\mathbf S\mathbf e_j = \mathbf a_j$ and $\mathbf S^{-1}\mathbf a_j = \mathbf e_j$ for $j \leq k_a$, 

but we have  

$\mathbf 0$  
$= \mathbf S^{-1}\mathbf 0 $  
$= \gamma_1 \mathbf S^{-1}\mathbf c_1 + \gamma_2 \mathbf S^{-1}\mathbf c_2 +... + \gamma_{k_c} \mathbf S^{-1}\mathbf c_{k_c}$  
$=  \gamma_1 \mathbf S^{-1}\big(\sum_{j=1}^{k_a} \alpha_{1,j} \mathbf a_j\big) + \gamma_2 \mathbf S^{-1}\big(\sum_{j=1}^{k_a} \alpha_{2,j} \mathbf a_j\big) +... + \gamma_{k_c} \mathbf S^{-1}\big(\sum_{j=1}^{k_a} \alpha_{k_c,j} \mathbf a_j\big)$   
$=  \gamma_1 \big(\sum_{j=1}^{k_a} \alpha_{1,j} \mathbf S^{-1}\mathbf a_j\big) + \gamma_2 \big(\sum_{j=1}^{k_a} \alpha_{2,j} \mathbf S^{-1}\mathbf a_j\big) +... + \gamma_{k_c} \big(\sum_{j=1}^{k_a} \alpha_{k_c,j} \mathbf S^{-1}\mathbf a_j\big)$   
$=  \gamma_1 \big(\sum_{j=1}^{k_a} \alpha_{1,j} \mathbf e_j\big) + \gamma_2 \big(\sum_{j=1}^{k_a} \alpha_{2,j}\mathbf e_j\big) +... + \gamma_{k_c} \big(\sum_{j=1}^{k_a} \alpha_{k_c,j} \mathbf e_j\big)$   

where linear independence of the $\mathbf c_i$ (equivalently, of the $\mathbf S^{-1}\mathbf c_i$, since $\mathbf S^{-1}$ is invertible) asserts that only $\gamma_1 = ... = \gamma_{k_c} = 0$ works.  But the last line is a linear combination of $k_c$ vectors that all lie in the span of the first $k_a$ standard basis vectors (i.e. in effect living in a $k_a$ dimensional space), which is a contradiction.  The assertion is equivalent to saying that there is no $\mathbf x \neq \mathbf 0$ such that 

$\bigg[\begin{array}{c|c|c|c} \sum_{j=1}^{k_a} \alpha_{1,j} \mathbf e_j & \sum_{j=1}^{k_a} \alpha_{2,j}\mathbf e_j &\cdots & \sum_{j=1}^{k_a} \alpha_{k_c,j} \mathbf e_j\end{array}\bigg]\mathbf x = \mathbf 0$   

where we can ignore rows $\gt k_a$ because they are all zeros.  In effect the above matrix is short and fat: it has $k_c$ columns but only $k_a \lt k_c$ rows that can be nonzero, so the homogeneous system has more unknowns than equations and (e.g. by Gaussian elimination) has a nontrivial solution $\mathbf x \neq \mathbf 0$.  Hence at least $k_c - k_a \gt 0$ of its columns are linearly dependent on the others, a contradiction.   
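Since the argument works over any field, here is a hedged sanity check over the finite field GF(5) rather than the rationals.  The helpers `column_rank_mod_p` and `matmul_mod_p`, and the matrices `A`, `B_keep`, `B_kill`, are assumptions made up for this sketch; the inverse is computed with Fermat's little theorem, which requires $p$ prime.

```python
def column_rank_mod_p(rows, p):
    """Column rank over GF(p), p prime, of a matrix stored as a list of rows."""
    vecs = [[x % p for x in col] for col in zip(*rows)]   # one vector per column
    rank = 0
    for pos in range(len(rows)):                          # position inside each column
        pivot = next((i for i in range(rank, len(vecs)) if vecs[i][pos] != 0), None)
        if pivot is None:
            continue
        vecs[rank], vecs[pivot] = vecs[pivot], vecs[rank]
        inv = pow(vecs[rank][pos], p - 2, p)              # inverse via Fermat's little theorem
        vecs[rank] = [(x * inv) % p for x in vecs[rank]]
        for i in range(len(vecs)):
            if i != rank and vecs[i][pos] != 0:
                f = vecs[i][pos]
                vecs[i] = [(a - f * b) % p for a, b in zip(vecs[i], vecs[rank])]
        rank += 1
    return rank

def matmul_mod_p(X, Y, p):
    """Matrix product with entries reduced mod p."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*Y)] for row in X]

p = 5
A = [[1, 2, 3], [0, 1, 4], [1, 3, 2]]      # third row = first + second (mod 5)
B_keep = [[1, 0], [2, 1], [0, 3]]          # a generic right factor
B_kill = [[0, 0], [1, 2], [1, 2]]          # columns chosen inside the nullspace of A

for B in (B_keep, B_kill):
    AB = matmul_mod_p(A, B, p)
    print(column_rank_mod_p(AB, p), "<=", column_rank_mod_p(A, p))
```

The second factor shows that the inequality can be strict: right multiplication can lose rank, but never gain it.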


**2. corollary: right multiplication by an invertible matrix does not change rank**  

If $\mathbf B \in \mathbb F^{n \times n}$ *is invertible*, then the above tells us  

$\text{rank}\big(\mathbf {AB}\big)\leq \text{rank}\big(\mathbf {A}\big)$  
Now re-run the argument on $\mathbf C:= \mathbf {AB}$ and $\mathbf B^{-1}$.  This gives us  

$\text{rank}\big(\mathbf {CB}^{-1}\big)\leq \text{rank}\big(\mathbf {C}\big)$  
or  
$\text{rank}\big(\mathbf {A}\big) =\text{rank}\big(\mathbf {ABB}^{-1}\big)= \text{rank}\big(\mathbf {CB}^{-1}\big)\leq \text{rank}\big(\mathbf {C}\big) = \text{rank}\big(\mathbf {AB}\big)$  

and we conclude   
$\text{rank}\big(\mathbf {A}\big)\leq \text{rank}\big(\mathbf {AB}\big)\leq \text{rank}\big(\mathbf {A}\big)$    

or  
$\text{rank}\big(\mathbf {A}\big) = \text{rank}\big(\mathbf {AB}\big)$  
for invertible $\mathbf B$  
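The two-sided squeeze can be illustrated with a small example.  This is only a sketch over the rationals: the matrices are invented, `B_inv` is hard-coded and checked rather than computed, and `column_rank` is a hypothetical helper that eliminates on the columns.

```python
from fractions import Fraction

def column_rank(rows):
    """Column rank over the rationals of a matrix stored as a list of rows."""
    vecs = [[Fraction(x) for x in col] for col in zip(*rows)]
    rank = 0
    for pos in range(len(rows)):
        pivot = next((i for i in range(rank, len(vecs)) if vecs[i][pos] != 0), None)
        if pivot is None:
            continue
        vecs[rank], vecs[pivot] = vecs[pivot], vecs[rank]
        for i in range(len(vecs)):
            if i != rank and vecs[i][pos] != 0:
                f = vecs[i][pos] / vecs[rank][pos]
                vecs[i] = [a - f * b for a, b in zip(vecs[i], vecs[rank])]
        rank += 1
    return rank

def matmul(X, Y):
    """Matrix product, with matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 3], [2, 4, 6]]                       # 2 x 3, rank 1
B = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]            # invertible 3 x 3
B_inv = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]      # its inverse, verified below
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

C = matmul(A, B)
print(matmul(B, B_inv) == I3)                    # B_inv really is the inverse
print(matmul(C, B_inv) == A)                     # so A is recovered from C = AB
print(column_rank(C), column_rank(A))            # the two ranks agree
```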



**3.**  $\text{Row Rank}\big(\mathbf A\big) = \text{Column Rank}\big(\mathbf A\big) = \text{Rank}\big(\mathbf A\big)$     
note: $\text{Row Rank}\big(\mathbf A\big) = \text{Column Rank}\big(\mathbf A^T\big)$  
by obvious rules of transposition.  We also *already* know that this is true in the special case of $\mathbf A$ being square and invertible by, say, properties of determinants.  (And we know this is true in the special case of the zero matrix being the only matrix with rank zero.)  


In particular consider the case of $\text{Column Rank}\big(\mathbf A\big)  = r \leq \min(m,n)$  

consider a well chosen idempotent matrix (more on this a bit later), built from the invertible $\mathbf S \in \mathbb F^{m \times m}$ of step **1** whose first $r$ columns are linearly independent columns of $\mathbf A$  

$\mathbf P = \mathbf S \mathbf D \mathbf S^{-1} = \mathbf S \begin{bmatrix}\mathbf I_r &\mathbf {00}^T\\ \mathbf {00}^T &\mathbf {00}_{m-r}^T\end{bmatrix} \mathbf S^{-1}$  

$\mathbf P^2 = \mathbf P$  
so it is idempotent  


we assume WLOG that the first $r$ columns of $\mathbf A$ are linearly independent.  This is for notational ease: using step **2**, we can explicitly multiply on the right by a permutation matrix to effect this without changing the rank, and if that makes the reader uncomfortable, there is an obvious spot in the proof where we can multiply again by the transpose of said permutation matrix to undo it.  


using the above we have 

$\text{row rank}\Big(\mathbf A\Big)$  
$=\text{rank}\Big(\mathbf A^T\Big)$  
$=\text{rank}\Big(\mathbf A^T\big(\mathbf S^{-1}\big)^T\Big)$  
$=\text{row rank}\Big(\mathbf S^{-1}\mathbf A\Big)$  
$=\text{row rank}\Big(\bigg[\begin{array}{c|c|c|c|c|c|c} \mathbf S^{-1}\mathbf a_1 & \mathbf S^{-1}\mathbf a_2 &\cdots & \mathbf S^{-1}\mathbf a_{r}& \mathbf S^{-1}\mathbf a_{r+1}&  \cdots & \mathbf S^{-1}\mathbf a_{n}\end{array}\bigg]\Big)$   
$=\text{row rank}\Big(\bigg[\begin{array}{c|c|c|c|c|c|c} \mathbf S^{-1}\mathbf a_1 & \mathbf S^{-1}\mathbf a_2 &\cdots & \mathbf S^{-1}\mathbf a_{r}& \mathbf S^{-1}\sum_{j=1}^r \gamma_{r+1,j}\mathbf a_j&  \cdots & \mathbf S^{-1}\sum_{j=1}^r \gamma_{n,j}\mathbf a_j\end{array}\bigg]\Big)$    
$=\text{row rank}\Big(\bigg[\begin{array}{c|c|c|c|c|c|c} \mathbf e_1 & \mathbf e_2 &\cdots & \mathbf e_r & \sum_{j=1}^r\gamma_{r+1,j}\mathbf e_j&  \cdots & \sum_{j=1}^r\gamma_{n,j}\mathbf e_j\end{array}\bigg]\Big)$    
$\leq r$  
$=\text{col rank}\Big(\mathbf A\Big)$  
because the above $m \times n$ matrix has all zeros on rows $r+1, r+2,..., m$, so its linearly independent rows are among the first $r$ rows, and there are at most $r$ of them.  


now if we re-run the above argument on $\big(\mathbf A^T\big)$, which has (column) rank $r'$, and we again choose a suitable projection with invertible $\mathbf V \in \mathbb F^{n \times n}$  

$\mathbf P^{(2)} = \mathbf V\begin{bmatrix}\mathbf I_{r'} &\mathbf {00}^T\\ \mathbf {00}^T &\mathbf {00}_{n-r'}^T\end{bmatrix} \mathbf V^{-1}$  

then we recover  
$\text{row rank}\Big(\mathbf A^T\Big)$  
$=\text{rank}\Big(\mathbf A\Big)$  
$=\text{rank}\Big(\mathbf A\big(\mathbf V^{-1}\big)^T\Big)$  
$=\text{row rank}\Big(\mathbf V^{-1}\mathbf A^T\Big)$  
$\leq r'$  
$=\text{row rank}\Big(\mathbf A\Big)$  

putting these inequalities together gives  

$r'=\text{row rank}\Big(\mathbf A\Big) \leq \text{col rank}\Big(\mathbf A\Big) = r = \text{row rank}\Big(\mathbf A^T\Big) \leq \text{row rank}\Big(\mathbf A\Big)  =r'  $   

which is to say  
$r' = r$ or  
$\text{Row Rank}\big(\mathbf A\big) = \text{Column Rank}\big(\mathbf A\big) = \text{Rank}\big(\mathbf A\big)$    
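To see the equality on a non-square example over a field that is not the rationals, here is a minimal sketch over GF(2).  The function `span_dimension` is a hypothetical helper that returns the dimension of the span of a list of vectors; feeding it the rows gives the row rank and feeding it the columns gives the column rank.

```python
def span_dimension(vectors, p):
    """Dimension over GF(p), p prime, of the span of the given vectors."""
    vecs = [[x % p for x in v] for v in vectors]
    dim = 0
    for pos in range(len(vecs[0])):
        pivot = next((i for i in range(dim, len(vecs)) if vecs[i][pos] != 0), None)
        if pivot is None:
            continue
        vecs[dim], vecs[pivot] = vecs[pivot], vecs[dim]
        inv = pow(vecs[dim][pos], p - 2, p)   # inverse via Fermat's little theorem
        vecs[dim] = [(x * inv) % p for x in vecs[dim]]
        for i in range(len(vecs)):
            if i != dim and vecs[i][pos] != 0:
                f = vecs[i][pos]
                vecs[i] = [(a - f * b) % p for a, b in zip(vecs[i], vecs[dim])]
        dim += 1
    return dim

# A 3 x 4 matrix over GF(2); the third row is the sum of the first two.
A = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 1]]

rows = A
cols = [list(c) for c in zip(*A)]   # the columns of A are the rows of A^T

row_rank = span_dimension(rows, 2)
col_rank = span_dimension(cols, 2)
print(row_rank, col_rank)
```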

**explanation of the idempotent matrix**  



**4. corollary: left multiplication by an invertible matrix does not change rank**    
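again with invertible $\mathbf B$, now with $\mathbf B \in \mathbb F^{m \times m}$ so that the product $\mathbf B\mathbf A$ is defined 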

$\text{rank}\big(\mathbf B\mathbf A \big)$  
$= \text{row rank}\big(\mathbf B\mathbf A \big)$  using (3)  
$=\text{col rank}\big(\mathbf A^T \mathbf B^T\big)$     
$=\text{rank}\big(\mathbf A^T \mathbf B^T\big)$  using (3)    
$=\text{rank}\big(\mathbf A^T\big)$  using (2)    
$=\text{rank}\big(\mathbf A\big)$  using (3)    




**5. Dimension of (right) nullspace plus column rank = n = number of columns**  

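the dimension of the nullspace of $\mathbf A$ is the maximal size of a set of linearly independent $\mathbf x_k$ such that  

$\mathbf A\mathbf x_k = \mathbf 0$  


suppose that the nullspace dimension is $r$ (in this step $r$ denotes the nullspace dimension, not the rank as in step **3**), then construct invertible $\mathbf S \in \mathbb F^{n \times n}$ by extending $\mathbf x_1, ..., \mathbf x_r$ to a basis      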

$\mathbf S := \bigg[\begin{array}{c|c|c|c|c|c} \mathbf x_1 & \cdots & \mathbf x_{r} & \mathbf s_{r +1}&\cdots & \mathbf s_{n}\end{array}\bigg]$  

and again consider the idempotent matrix  

$\mathbf P = \mathbf S \mathbf D \mathbf S^{-1} = \mathbf S \begin{bmatrix}\mathbf I_r &\mathbf {00}^T\\ \mathbf {00}^T &\mathbf {00}_{n-r}^T\end{bmatrix} \mathbf S^{-1}$   

in particular, notice  

$\mathbf S \mathbf D = \bigg[\begin{array}{c|c|c|c|c|c} \mathbf x_1 & \cdots & \mathbf x_{r} & \mathbf 0 &\cdots & \mathbf 0\end{array}\bigg]$  


This implies  

$\mathbf A\mathbf P = \Big(\mathbf A\big(\mathbf {SD}\big)\Big)\mathbf S^{-1} = \mathbf 0$  

now consider the complement  
$\mathbf P^c:= \mathbf I_n - \mathbf P$  
this too is idempotent because  
$\big(\mathbf P^c\big)^2= \mathbf I_n - 2\mathbf P + \mathbf P^2 = \mathbf I_n - 2\mathbf P + \mathbf P = \mathbf I_n - \mathbf P = \mathbf P^c$   

note this also implies  

$\mathbf P^c:= \mathbf I_n - \mathbf P   = \mathbf S\mathbf I \mathbf S^{-1} - \mathbf {SDS}^{-1} = \mathbf S\big(\mathbf I - \mathbf D\big)\mathbf S^{-1} = \mathbf S \begin{bmatrix}\mathbf {00}_r &\mathbf {00}^T\\ \mathbf {00}^T&\mathbf I_{n-r}\end{bmatrix} \mathbf S^{-1}$  

we can also observe that  
$\Big(\mathbf S \begin{bmatrix}\mathbf {00}_r &\mathbf {00}^T\\ \mathbf {00}^T&\mathbf I_{n-r}\end{bmatrix} \mathbf S^{-1}\Big)^2 = \Big(\mathbf S \begin{bmatrix}\mathbf {00}_r &\mathbf {00}^T\\ \mathbf {00}^T&\mathbf I_{n-r}\end{bmatrix}^2 \mathbf S^{-1}\Big) = \Big(\mathbf S \begin{bmatrix}\mathbf {00}_r &\mathbf {00}^T\\ \mathbf {00}^T&\mathbf I_{n-r}\end{bmatrix} \mathbf S^{-1}\Big)$  

and since multiplication by an invertible matrix does not change rank, we have (i.e. by examining the diagonal matrix above)    
$\text{rank}\big(\mathbf P^c\big) = n-r$  
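The construction of $\mathbf P$ and $\mathbf P^c$ can be checked on a small integer example.  This is a sketch under stated assumptions: the matrix `A`, the nullspace vectors, the extension vector $\mathbf s_3$, and the hard-coded `S_inv` are all chosen by hand for the illustration (the inverse is verified in the code rather than computed).

```python
def matmul(X, Y):
    """Matrix product, with matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 3],
     [2, 4, 6]]                 # 2 x 3 with n = 3 and nullspace dimension r = 2

# Columns of S: nullspace vectors x_1 = (-2, 1, 0), x_2 = (-3, 0, 1), then s_3 = (1, 0, 0).
S     = [[-2, -3, 1], [1, 0, 0], [0, 1, 0]]
S_inv = [[0, 1, 0], [0, 0, 1], [1, 2, 3]]
I     = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
D     = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]      # I_r padded with zeros, r = 2

P  = matmul(matmul(S, D), S_inv)
Pc = [[I[i][j] - P[i][j] for j in range(3)] for i in range(3)]

print(matmul(S, S_inv) == I)                    # S_inv is the inverse of S
print(matmul(P, P) == P, matmul(Pc, Pc) == Pc)  # both are idempotent
print(matmul(A, P))                             # A P is the zero matrix
print(Pc)                                       # a single nonzero row, so rank n - r = 1
```

Here the rank of `Pc` can be read off by inspection, matching $n - r = 3 - 2 = 1$.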


- - - -   
*quick dimensional note* for the less pleasant case of non-square $\mathbf A$  
(this section may be skipped though the reader may review it in case of confusion about dimensions)    

$\mathbf {A}  =\bigg[\begin{array}{c|c|c|c|c} \mathbf a_1 & \mathbf a_2 &\cdots & \mathbf a_{n-1} & \mathbf a_{n}\end{array}\bigg]= \mathbf A\mathbf I_n = \bigg[\begin{array}{c|c|c|c} 
\mathbf A\mathbf e_1 & \mathbf A\mathbf e_2 &\cdots & \mathbf A \mathbf e_{n} 
\end{array}\bigg]= \bigg[\begin{array}{c|c|c|c|c} \mathbf a_1 & \mathbf a_2 &\cdots & \mathbf a_{n-1} & \mathbf a_{n}\end{array}\bigg] =\mathbf A$   
- - - -   

so we have  
$\mathbf A = \mathbf A\mathbf I_n = \mathbf A\big(\mathbf P + \mathbf P^c\big) =  \mathbf A\mathbf P + \mathbf A\mathbf P^c = \mathbf 0 + \mathbf A\mathbf P^c = \mathbf A\mathbf P^c$  

or more simply  
$\mathbf A = \mathbf A\mathbf P^c$  

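but incorporating our earlier results, we immediately have  
$\text{rank}\big(\mathbf A\big) = \text{rank}\big(\mathbf A\mathbf P^c\big)   \leq \text{rank}\big(\mathbf P^c\big) = n - r$  
where the inequality uses (1) and (3) on the transpose: $\text{rank}\big(\mathbf A\mathbf P^c\big) = \text{rank}\big((\mathbf P^c)^T\mathbf A^T\big) \leq \text{rank}\big((\mathbf P^c)^T\big) = \text{rank}\big(\mathbf P^c\big)$  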


However, we can tighten this to show that it is in fact the equality case, with  

$\mathbf Z: = \mathbf {AS}$  

$\text{rank}\Big(\mathbf Z\Big) = \text{rank}\Big(\mathbf {AS}\Big)$   
because multiplication by an invertible matrix does not change rank.  
Further, each linearly independent set of $\mathbf x_k$ in the (right) nullspace of $\mathbf A$ corresponds to a linearly independent set of the same cardinality in the nullspace of $\mathbf Z$, via  
$\mathbf y_k = \mathbf S^{-1}\mathbf x_k$  or  
$\mathbf S\mathbf y_k = \mathbf x_k$, so that $\mathbf Z\mathbf y_k = \mathbf {AS}\mathbf S^{-1}\mathbf x_k = \mathbf A\mathbf x_k = \mathbf 0$  
(where our invertible map associates one and only one vector with each $\mathbf x_k$ and, being invertible, preserves the linear independence of this set).  In particular  

$\text{dim}\Big(\text{null}\big(\mathbf Z\big)\Big)\geq  \text{dim}\Big(\text{null}\big(\mathbf A\big)\Big)$  
but an identical argument on $\mathbf A = \mathbf {ZS}^{-1}$  gives  
$\text{dim}\Big(\text{null}\big(\mathbf A\big)\Big)\geq  \text{dim}\Big(\text{null}\big(\mathbf Z\big)\Big)$   

so
$\text{dim}\Big(\text{null}\big(\mathbf Z\big)\Big) = \text{dim}\Big(\text{null}\big(\mathbf A\big)\Big)$  


- - - - -   
the main argument then is  
$\text{rank}\Big(\mathbf A\Big)  $  
$= \text{rank}\Big(\mathbf A \mathbf P^c\Big) $  
$=\text{rank}\Big(\mathbf A\mathbf S \begin{bmatrix}\mathbf {00}_r &\mathbf {00}^T\\ \mathbf {00}^T&\mathbf I_{n-r}\end{bmatrix} \mathbf S^{-1}\Big)$  
$=\text{rank}\Big(\mathbf Z \begin{bmatrix}\mathbf {00}_r &\mathbf {00}^T\\ \mathbf {00}^T&\mathbf I_{n-r}\end{bmatrix} \mathbf S^{-1}\Big)$  
$=\text{rank}\Big(\bigg[\begin{array}{c|c|c|c|c|c|c} \mathbf 0 & \mathbf 0 &\cdots & \mathbf 0 & \mathbf z_{r+1} & \cdots & \mathbf z_{n}\end{array}\bigg]\mathbf S^{-1}\Big)$   
$=\text{col rank}\Big(\bigg[\begin{array}{c|c|c|c|c|c|c} \mathbf 0 & \mathbf 0 &\cdots & \mathbf 0 & \mathbf z_{r+1} & \cdots & \mathbf z_{n}\end{array}\bigg]\mathbf S^{-1}\Big)$   
$=\text{col rank}\Big(\bigg[\begin{array}{c|c|c|c|c|c|c} \mathbf 0 & \mathbf 0 &\cdots & \mathbf 0 & \mathbf z_{r+1} & \cdots & \mathbf z_{n}\end{array}\bigg]\Big)$  
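and we claim the above matrix has exactly $n-r$ linearly independent columns.  Note that it is $\mathbf Z$ itself: for $k \leq r$ the $k$th column of $\mathbf Z = \mathbf{AS}$ is $\mathbf A\mathbf x_k = \mathbf 0$.  
*how do we know this?* We can see by inspection that the final matrix has the first $r$ columns as zero vectors, so  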

$\bigg[\begin{array}{c|c|c|c|c|c|c} \mathbf 0 & \mathbf 0 &\cdots & \mathbf 0 & \mathbf z_{r+1} & \cdots & \mathbf z_{n}\end{array}\bigg]\mathbf e_k = \mathbf 0$  

for the $k$th standard basis vector, $k\in\{1,2,..., r\}$  

so there are at least $r$ linearly independent vectors in the nullspace of the final matrix.  But we showed above that the nullspace of $\mathbf Z$ has the same dimension as that of $\mathbf A$, i.e. exactly $r$, so this implies  

$ \mathbf 0 = \sum_{k=r+1}^n \alpha_{k}\mathbf z_{k}$    
**iff** each $\alpha_{k} =0 $

otherwise the vector with entries $\alpha_k$ in positions $r+1, ..., n$ (and zeros elsewhere) would be one more vector in the nullspace of $\mathbf Z$, linearly independent of $\mathbf e_1, ..., \mathbf e_r$.  Hence  
$\big\{\mathbf z_{r+1} , \mathbf z_{r+2}, ..., \mathbf z_{n}\big\}$  
form a linearly independent set  

Thus $\mathbf Z$ has  
dim(nullspace) + rank = r + (n-r) = n = number of columns in $\mathbf Z$    
and $\mathbf A$ has the same nullspace dimension, the same rank, and the same number of columns as $\mathbf Z$, and hence $\mathbf A$ obeys this as well   
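The final identity can be checked numerically.  The sketch below does not follow the projection argument above; instead it swaps in ordinary row reduction over the rationals (justified by step **3**, since row operations preserve the row space) to read off a nullspace basis and the rank.  The names `rref` and `nullspace_and_rank` and the example matrix are assumptions for the illustration.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over the rationals; returns (matrix, pivot columns)."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        lead = M[r][col]
        M[r] = [x / lead for x in M[r]]          # scale the pivot row to a leading 1
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(col)
        r += 1
    return M, pivots

def nullspace_and_rank(rows):
    """Return (basis of the right nullspace, rank), one basis vector per free column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (j for j in range(n) if j not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            x[p] = -R[row_index][free]           # solve for each pivot variable
        basis.append(x)
    return basis, len(pivots)

A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]
basis, rank = nullspace_and_rank(A)
n = len(A[0])
print(len(basis), rank, len(basis) + rank == n)
```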

