# Run Experiments

Let $A \in \mathbb{R}^{m \times n}$, $H \in \mathbb{R}^{n \times m}$.

Consider the following properties:

$$
AHA = A \tag{P1} \\
$$

$$
HAH = H \tag{P2} \\
$$

$$
(AH)^T = AH \tag{P3} \\
$$

$$
(HA)^T = HA \tag{P4} \\
$$

From the full singular-value decomposition $A = U \Sigma V^T$ with 

$$
\Sigma =: \begin{bmatrix}\underset{\scriptscriptstyle r\times r}{D} & \underset{\scriptscriptstyle r\times (n-r)}{0}\\[1.5ex]
\underset{\scriptscriptstyle (m-r)\times r}{0} & \underset{\scriptscriptstyle (m-r)\times (n-r)}{0}\end{bmatrix},
$$

with $D$ diagonal, let $\Gamma:=V^T H U$, where

$$
\Gamma=: \begin{bmatrix}\underset{\scriptscriptstyle r\times r}{X} & \underset{\scriptscriptstyle r\times (m-r)}{Y}\\[1.5ex]
\underset{\scriptscriptstyle (n-r)\times r}{Z} & \underset{\scriptscriptstyle (n-r)\times (m-r)}{W}\end{bmatrix}.
$$

Then $H= V\Gamma U^T$. And we have the following theorem:

Theorem 1 (See [1]):

- Property (P1) is equivalent to $X = D^{-1}$.
- If property (P1) is satisfied, then (P2) is equivalent to $ZDY = W$.
- If property (P1) is satisfied, then (P3) is equivalent to $Y = 0$.
- If property (P1) is satisfied, then (P4) is equivalent to $Z = 0$.
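
Theorem 1 can be checked numerically on a small instance. The following is a minimal sketch, not part of the experiment code: the sizes, the random seed and the random block $Z$ are illustrative assumptions. It builds $\Gamma$ with $X = D^{-1}$, $Y = 0$, $W = ZDY = 0$ and a nonzero $Z$, so (P1), (P2) and (P3) should hold while (P4) should fail.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
m, n, r = 6, 4, 2               # illustrative sizes

# Random rank-r matrix A (m x n)
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Full SVD: A = U @ Sigma @ V^T, with D the r x r block of nonzero singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T
D = np.diag(s[:r])

# Gamma with X = D^{-1}, Y = 0, W = Z D Y = 0 and an arbitrary nonzero Z
Z = rng.standard_normal((n - r, r))
Gamma = np.zeros((n, m))
Gamma[:r, :r] = np.linalg.inv(D)
Gamma[r:, :r] = Z
H = V @ Gamma @ U.T

p1 = np.allclose(A @ H @ A, A)       # AHA = A
p2 = np.allclose(H @ A @ H, H)       # HAH = H
p3 = np.allclose((A @ H).T, A @ H)   # (AH)^T = AH
p4 = np.allclose((H @ A).T, H @ A)   # (HA)^T = HA, expected to fail since Z != 0
print(p1, p2, p3, p4)
```

Setting `Z` to zero in this sketch recovers the Moore-Penrose pseudoinverse, for which all four properties hold.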

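In this experiment we solve the following problems, where $\hat{A} := A^TA \in \mathbb{R}^{n \times n}$ (computed as `hatA` in the code) and $||\cdot||_1$ denotes the vector 1-norm, i.e. the sum of the absolute values of the entries: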

$$
\min_{F \in \mathbb{R}^{n \times m}, Z \in \mathbb{R}^{(n-r) \times r}}\{\sum_{i=1}^{n}\sum_{j=1}^{m} F_{ij} : F - V_2ZU_1^T \geq G, F + V_2ZU_1^T \geq -G\} \tag{$\mathcal{P}_{123}^{1}$} \\
$$

$$
\min_{\hat{H} \in \mathbb{R}^{n \times n}}\{||\hat{H}||_1 : \hat{A}\hat{H}\hat{A} = \hat{A}\} \tag{$\hat{P}_{1}^{1}$} \\
$$

$$
\min_{\hat{H} \in \mathbb{R}^{n \times n}}\{||\hat{H}||_1 : \hat{A}\hat{H}\hat{A} = \hat{A}, \hat{H} = \hat{H}^T\} \tag{$\hat{P}_{1, sym}^{1}$} \\
$$
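
As a sanity check on the two $\hat{P}$ problems, the Moore-Penrose pseudoinverse of $\hat{A}$ is always a feasible point, so its 1-norm gives an upper bound on both optimal values. The sketch below assumes $\hat{A} = A^TA$, as in the notebook code (`hatA = np.dot(A.T, A)`); the sizes, the seed and the explicit `rcond` cutoff are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)  # illustrative seed
m, n, r = 8, 5, 3               # illustrative sizes
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

hatA = A.T @ A  # n x n, symmetric, rank r

# Explicit cutoff (an assumption) so that numerically zero singular values are discarded
hatH = np.linalg.pinv(hatA, rcond=1e-10)

feasible_P1 = np.allclose(hatA @ hatH @ hatA, hatA)  # constraint of both problems
is_symmetric = np.allclose(hatH, hatH.T)             # extra constraint of the symmetric problem
upper_bound = np.abs(hatH).sum()                     # vector 1-norm of this feasible point
print(feasible_P1, is_symmetric, upper_bound)
```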

Note that if $H$ satisfies (P1), (P2) and (P3) with respect to $A$, then Theorem 1 gives $X = D^{-1}$, $Y = 0$ and $W = ZDY = 0$. Partitioning $U = \begin{bmatrix}\underset{\scriptscriptstyle m \times r}{U_1} & \underset{\scriptscriptstyle m \times (m-r)}{U_2} \end{bmatrix}$ and $V = \begin{bmatrix}\underset{\scriptscriptstyle n \times r}{V_1} & \underset{\scriptscriptstyle n \times (n-r)}{V_2} \end{bmatrix}$, and defining $G := V_1D^{-1}U_1^T$, we have $||H||_1 = ||V\begin{bmatrix}D^{-1} & 0 \\ Z & 0 \end{bmatrix}U^T||_1 = ||G + V_2ZU_1^T||_1$. Thus, $(\mathcal{P}_{123}^{1})$ is just a reformulation of $\min \{||H||_1 : P1, P2, P3 \}$ using the full singular-value decomposition of $A$ and Theorem 1, with the variable $F$ bounding the absolute values of the entries of $G + V_2ZU_1^T$.
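
The identity behind this reformulation can be verified numerically. The sketch below is illustrative only (sizes, seed and the random $Z$ are assumptions): it checks that $V\begin{bmatrix}D^{-1} & 0 \\ Z & 0\end{bmatrix}U^T$ equals $G + V_2ZU_1^T$, and that $F = |H|$ (entrywise) satisfies the two linear constraints of $(\mathcal{P}_{123}^{1})$ with objective value $||H||_1$.

```python
import numpy as np

rng = np.random.default_rng(2)  # illustrative seed
m, n, r = 6, 4, 2               # illustrative sizes
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T
U1, V1, V2 = U[:, :r], V[:, :r], V[:, r:]
D_inv = np.diag(1.0 / s[:r])

G = V1 @ D_inv @ U1.T
Z = rng.standard_normal((n - r, r))

# H built from the block form [[D^{-1}, 0], [Z, 0]]
Gamma = np.zeros((n, m))
Gamma[:r, :r] = D_inv
Gamma[r:, :r] = Z
H = V @ Gamma @ U.T

T = V2 @ Z @ U1.T
same_matrix = np.allclose(H, G + T)

# F = |H| is feasible for the linearized constraints (up to a small tolerance)
F = np.abs(H)
tol = 1e-9
constraints_hold = bool(np.all(F - T >= G - tol) and np.all(F + T >= -G - tol))
objective_matches = np.isclose(F.sum(), np.abs(G + T).sum())
print(same_matrix, constraints_hold, objective_matches)
```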

Each problem is solved with the following solvers and algorithms:

- $(\mathcal{P}_{123}^{1})$: ADMM, ADMMe and LS (Local Search).
- $(\hat{P}_{1}^{1})$: Gurobi and LS.
- $(\hat{P}_{1, sym}^{1})$: Gurobi and LS.

The details of the implementation of ADMM and ADMMe for solving $(\mathcal{P}_{123}^{1})$ are explained in [2]; LS is explained in [3].

To compare the different methods of solving the least-squares problem, we randomly generated 60 instances (10 for each configuration $(m, n, r, d)$), with varied ranks and densities, using the MATLAB function `sprand`. The function generates a random $m \times n$ matrix $A$ with approximate density $d$ and singular values given by the non-negative input vector $rc$; the number of nonzero singular values in $rc$ is the desired rank $r$. `sprand` builds the matrix by applying random plane rotations to a diagonal matrix with the given singular values.

For our experiments, we selected the $r$ nonzeros of $rc$ as the decreasing vector $M \times (\rho^1, \rho^2, \ldots, \rho^r)$, where $M = 2$ and $\rho = (1/M)^{2/(r+1)}$. The shape of this distribution is concave (as is the case for many matrices that one encounters), the entries are not extreme (always between $1/2$ and $2$), and their product is unity, so we can reasonably hope that the numerics will be well behaved.
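
The stated properties of the singular values can be checked directly. The following sketch uses a hypothetical helper name, `singular_values`, that is not part of the experiment code; it computes $M \times (\rho^1, \ldots, \rho^r)$ for the three ranks used here so that the monotonicity, the bounds and the unit product can be inspected.

```python
import numpy as np

def singular_values(r, M=2.0):
    """Returns the vector M * (rho^1, ..., rho^r) with rho = (1/M)^(2/(r+1))."""
    rho = (1.0 / M) ** (2.0 / (r + 1))
    return M * rho ** np.arange(1, r + 1)

for r in (25, 75, 125):
    rc = singular_values(r)
    # Largest value, smallest value and product of the r singular values
    print(r, rc.max(), rc.min(), np.prod(rc))
```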

The parameters we used are $m = 500$, $n = 250$, $r = 25, 75, 125$, and density $d = 0.1, 0.25$.
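
These parameters give six configurations, which at 10 instances each accounts for the 60 instances mentioned above. A short sketch of the enumeration follows; the variable names are illustrative and do not come from the experiment code.

```python
from itertools import product

m, n = 500, 250
ranks = [25, 75, 125]
densities = [0.1, 0.25]
instances_per_config = 10

# One configuration per (rank, density) pair
configs = [(m, n, r, d) for r, d in product(ranks, densities)]
total_instances = len(configs) * instances_per_config
print(len(configs), total_instances)  # 6 configurations, 60 instances
```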

We collected the following measures for each problem:

- $||\tilde{H}||_1$
- $||\tilde{H}||_0$
- Time(s) (Computational Time)

where $\tilde{H} = H \text{ or } \hat{H}$.
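
A minimal sketch of how the two norm measures could be computed is shown below. The notebook itself uses `matrix_vec_1_norm` and `matrix_vec_0_norm` from `utility`, whose implementations are not shown here; the function names and the zero tolerance `tol` in this sketch are assumptions.

```python
import numpy as np

def vec_1_norm(H):
    """Sum of the absolute values of all entries of H."""
    return float(np.abs(H).sum())

def vec_0_norm(H, tol=1e-8):
    """Number of entries of H whose magnitude exceeds tol (assumed tolerance)."""
    return int(np.count_nonzero(np.abs(H) > tol))

H = np.array([[1.5, 0.0, -2.0],
              [0.0, 1e-12, 0.5]])
print(vec_1_norm(H), vec_0_norm(H))
```

A tolerance is used because solver outputs typically contain tiny nonzero entries that should count as zeros; the appropriate threshold depends on the solver accuracy.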

References:

- [1] Adi Ben-Israel and Thomas N.E. Greville. Generalized Inverses: Theory and Applications. Springer, 2nd edition, 2003. https://doi.org/10.1007/b97366.
- [2] Gabriel Ponte, Marcia Fampa, Jon Lee, and Luze Xu. Good and fast row-sparse ah-symmetric reflexive generalized inverses, 2024. https://arxiv.org/abs/2401.17540.
- [3] Marcia Fampa, Jon Lee, Gabriel Ponte, Luze Xu. Experimental analysis of local searches for sparse reflexive generalized inverses, 2021. https://arxiv.org/abs/2001.03732.

## Imports

In [None]:
# Disables generation of pycache file
import sys
sys.dont_write_bytecode = True

# Imports libraries
import time
import os
import numpy as np  # used explicitly below (np.linalg.pinv, np.dot)
import pandas as pd

# Imports made functions
from solvers import *
from utility import *
from admm_solvers import *
from local_search import *

## Running Experiments

In [None]:
result_columns_basenames = ["||H||_1" , "||H||_0", "Time(s)"]

# problems = ["1_norm_P123_admm", "1_norm_P123_admme", "1_norm_P1_LP", "1_norm_P1_sym_LP",
#             "1_norm_P123_LS", "1_norm_P1_LS", "1_norm_P1_sym_LS"]

problems = ["1_norm_P1_LP", "1_norm_P1_sym_LP", "1_norm_P123_LS", "1_norm_P1_LS", "1_norm_P1_sym_LS"]

result_column_names = []
for problem in problems:
    for basename in result_columns_basenames:
        result_column_names.append(f"{problem}_{basename}")

column_names = ["m", "n", "r", "d", "||A||_0", "||A^+||_0", "||A^+||_1", "||hatA^+||_0", "||hatA^+||_1"] + result_column_names

experiment = "6"

csv_file_path = f"./results/results_{experiment}.csv"
temp_csv_file_path = f"./results/results_{experiment}_temp.csv"
excel_file_path = f"./results/results_{experiment}.xlsx"

In [None]:
solvers = {
    "1_norm_P123_admm": admm1_123,
    "1_norm_P123_admme": admm1e_123,
    "1_norm_P1_LP": problem_1_norm_P1_solver,
    "1_norm_P1_sym_LP": problem_1_norm_P1_sym_solver,
    "1_norm_P123_LS": local_search_procedure,
    "1_norm_P1_LS": local_search_procedure,
    "1_norm_P1_sym_LS": local_search_procedure
}

is_viable_checks = {
    "1_norm_P123_admm": problem_1_norm_P1_P2_P3_viable_solution,
    "1_norm_P123_admme": problem_1_norm_P1_P2_P3_viable_solution,
    "1_norm_P1_LP": problem_1_norm_P1_viable_solution,
    "1_norm_P1_sym_LP": problem_1_norm_P1_sym_viable_solution,
    "1_norm_P123_LS": problem_1_norm_P1_P2_P3_viable_solution,
    "1_norm_P1_LS": problem_1_norm_P1_viable_solution,
    "1_norm_P1_sym_LS": problem_1_norm_P1_sym_viable_solution
}

In [None]:
def get_data(problem, A, hatA):
    if problem in ["1_norm_P123_admm", "1_norm_P123_admme", "1_norm_P123_LS"]:
        return A
    elif problem in ["1_norm_P1_LP", "1_norm_P1_sym_LP", "1_norm_P1_LS", "1_norm_P1_sym_LS"]:
        return hatA
    else:
        raise Exception("ProblemNonexistent: Trying to solve a nonexistent problem.")

def get_func_name(problem):
    if problem == "1_norm_P123_LS":
        return "LSFI_Det_P3"
    elif problem == "1_norm_P1_LS":
        return "LSFI_Det"
    elif problem == "1_norm_P1_sym_LS":
        return "LSFI_Det_Symmetric"
    else:
        raise Exception("FunctionNameNonexistent: Trying to get a nonexistent function name.")

In [None]:
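try:
    df = pd.read_csv(csv_file_path)
except (FileNotFoundError, pd.errors.EmptyDataError):
    # Creates an empty dataframe with the specified column names
    df = pd.DataFrame(columns=column_names)

    # Ensures the results directory exists, then saves the dataframe as a csv file
    os.makedirs(os.path.dirname(csv_file_path), exist_ok=True)
    df.to_csv(csv_file_path, index=False)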

In [None]:
df = pd.read_csv(csv_file_path)

In [None]:
matrices_filepath = get_experiment_matrices_filepath(experiment)

# NOTE: the [:1] slice restricts the run to the first instance; remove it to process all instances
for matrix_filepath in matrices_filepath[:1]:
    A = read_matrix(matrix_filepath)
    m, n, r, d = get_m_n_r_d_from_matrix_filepath(matrix_filepath)
    A_MP = np.linalg.pinv(A)

    hatA = np.dot(A.T, A)
    hatA_MP = np.linalg.pinv(hatA)

    instance_results = {
        "m": m,
        "n": n,
        "r": r,
        "d": d,
        "||A||_0": matrix_vec_0_norm(A),
        "||A^+||_0": matrix_vec_0_norm(A_MP),
        "||A^+||_1": matrix_vec_1_norm(A_MP),
        "||hatA^+||_0": matrix_vec_0_norm(hatA_MP),
        "||hatA^+||_1": matrix_vec_1_norm(hatA_MP)
    }

    ls_problems = ["1_norm_P123_LS", "1_norm_P1_LS", "1_norm_P1_sym_LS"]
    for problem in problems:
        solver = solvers[problem]
        is_viable_check = is_viable_checks[problem]
        matrixA = get_data(problem=problem, A=A, hatA=hatA)

        # Local-search solvers take the matrix file path; the other solvers take the matrix itself
        start_time = time.time()
        if problem in ls_problems:
            H_star = solver(experiment=experiment, matrix_filepath=matrix_filepath, func_name=get_func_name(problem=problem))
        else:
            H_star = solver(A=matrixA)
        end_time = time.time()

        # Stops processing the remaining problems if the solution is not viable
        if not is_viable_check(A=matrixA, H=H_star, m=m, n=n):
            print(f"m: {m}")
            print(f"problem: {problem} did not find a viable solution")
            break

        # Stores the measures and the computational time for this problem
        problem_results = calculate_problem_results_5(A=matrixA, H=H_star, problem=problem)
        instance_results.update(problem_results)
        instance_results[f"{problem}_Time(s)"] = end_time - start_time

    df.loc[len(df)] = instance_results
    df.to_csv(csv_file_path, index=False)

In [None]:
# Creates a temporary csv file with the problem prefix removed from the column names
with open(csv_file_path, "r") as file:
    data = file.readlines()

for i in range(len(data)):
    for problem in problems:
        data[i] = data[i].replace(f"{problem}_", "")

with open(temp_csv_file_path, "w") as file2:
    file2.writelines(data)

import csv
from openpyxl import Workbook

# Creates a new Excel sheet
workbook = Workbook()
sheet = workbook.active

# Reads the csv file and writes the data to the Excel sheet
with open(temp_csv_file_path, mode='r', encoding='utf-8') as csv_file:
    csv_reader = csv.reader(csv_file)
    for row in csv_reader:
        sheet.append(row)

# Saves the Excel file
workbook.save(excel_file_path)

# Deletes temp csv file
# Verifies if file exists before deleting
if os.path.exists(temp_csv_file_path):
    os.remove(temp_csv_file_path)