# Orojenesis Artifact - Customization
This IPython notebook contains Orojenesis examples that customize bounds for various Einsums.
It is part of the artifact for the ISCA'24 paper *"Mind the Gap: Attainable Data Movement and Operational Intensity Bounds for Tensor Algorithms"*. Please run `install.sh` to install the software dependencies.

## 0. Set Up Software Dependencies
Please first run `install.sh` to install the software dependencies. The cell below then points the notebook at your Timeloop repository.

In [None]:
import os
import pathlib

# Ask for the Timeloop repo location only if it is not already configured.
if "TIMELOOP_BASE_PATH" not in os.environ:
    default_path = os.getcwd() + "/../"
    timeloop_path = input(f"Please specify the path to the Timeloop repo (default: {default_path}): ") or default_path
    os.environ["TIMELOOP_BASE_PATH"] = timeloop_path
    os.environ["TIMELOOP_DIR"] = timeloop_path
os.environ["TIMELOOP_ENABLE_FIRST_READ_ELISION"] = "1"
print("Path to Timeloop repo:", os.environ["TIMELOOP_BASE_PATH"])

# Import the Orojenesis helpers after the Timeloop environment variables are set.
import src.utils as utils

## 1. Customization Example 
This section demonstrates how to customize workload definitions and mapper constraints for Orojenesis bound generation.

- **Workload Definition**: The workload definition describes the tensor workload being analyzed.
    - <ins>Predefined Workload Classes</ins>: 
    We provide a base class named *Op* in `src/utils.py` that serves as an abstraction for different workload types. Currently, it supports convolution (*Conv*) and grouped batched matrix multiplication (*GBMM*). 
    - <ins>Defining New Einsum Shapes</ins>:
    To handle an Einsum shape beyond *Conv* and *GBMM*, extend the functionality by following the template provided in the *Op* class.
    - <ins>Problem Definition Output</ins>:
    The `to_yaml` function is responsible for converting the workload definition into a YAML format that adheres to the [Timeloop problem format](https://timeloop.csail.mit.edu/v4/input-formats/problem).
    
- **[Optional] Mapper**: The mapper specifies the search strategy and mapping constraints. 
    - <ins>Generic Mapper</ins>: We provide a generic mapper in `configs/single-einsum/mapper.yaml` that can work for most Einsum shapes. 
    - <ins>Workload-Specific Constraints</ins>: If you know that parts of the search space are suboptimal or irrelevant for your workload, you can define additional constraints in the `mapper_constraints` section of the mapper file. An example for *Conv* workloads is provided in `configs/single-einsum/conv_mapper.yaml`. For more details, please refer to [Timeloop mapper constraints](https://timeloop.csail.mit.edu/v4/input-formats/design/constraints).
    
The Speeder architecture is defined in `./configs/single-einsum/arch.yaml`. In most cases, you won't need to modify this file.
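
The *Defining New Einsum Shapes* item above mentions following the *Op* template. As a rough, standalone illustration of what such a workload description involves, the sketch below defines a plain matrix multiplication and renders a problem description by hand. Everything here is an assumption made for illustration: `MatMulSketch`, its fields, and its methods are hypothetical and are not the actual interface of *Op* in `src/utils.py`, and the emitted YAML is only loosely modeled on the linked Timeloop problem format, so check it against that page before use.

```python
from dataclasses import dataclass


@dataclass
class MatMulSketch:
    """Hypothetical workload (not the real Op API): Z[m, n] = sum_k A[m, k] * B[k, n]."""
    M: int
    N: int
    K: int

    def macs(self) -> int:
        # One multiply-accumulate per point of the M x N x K iteration space.
        return self.M * self.N * self.K

    def tensor_sizes(self) -> dict:
        # Element counts of the two input tensors and the output tensor.
        return {"A": self.M * self.K, "B": self.K * self.N, "Z": self.M * self.N}

    def to_yaml(self) -> str:
        # Hand-written YAML so the sketch needs no third-party packages.
        # The layout is an assumption modeled on the Timeloop problem format.
        lines = [
            "problem:",
            "  version: 0.4",
            "  shape:",
            "    name: MatMulSketch",
            "    dimensions: [M, N, K]",
            "    data_spaces:",
            "    - name: A",
            "      projection: [[[M]], [[K]]]",
            "    - name: B",
            "      projection: [[[K]], [[N]]]",
            "    - name: Z",
            "      projection: [[[M]], [[N]]]",
            "      read_write: true",
            "  instance:",
            f"    M: {self.M}",
            f"    N: {self.N}",
            f"    K: {self.K}",
        ]
        return "\n".join(lines)


# Example instance: a 64 x 16 matrix times a 16 x 32 matrix.
wl = MatMulSketch(M=64, N=32, K=16)
print(wl.to_yaml())
```

A real extension would instead subclass *Op* and follow its template, so that the result can be passed to `utils.GenerateBound` like the predefined classes.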
    
## Example: Deriving Bounds for 1x1 Convolution

Let's assume we want to derive Orojenesis bounds for a 1x1 convolution with input channel size 32 and output channel size 16. Here's how to define the problem using the *Conv* class:

In [None]:
# Define the workload shape: a 1x1 convolution with 32 input and 16 output channels.
prob = utils.Conv(R=1, S=1, C=32, K=16)

# Specify the architecture and mapper configurations.
arch_yaml = pathlib.Path('./configs/single-einsum/arch.yaml')
mapper_yaml = pathlib.Path('./configs/single-einsum/conv_mapper.yaml')

# Specify the output directory.
output_dir = pathlib.Path('./outputs/single-einsum')

# Generate the Orojenesis bound for this workload.
utils.GenerateBound(prob, output_dir, arch_yaml, mapper_yaml, keep_one_best_entry_across_buf=True)

# Locate the output CSV file.
stats_files = utils.get_stats_files(output_dir, [prob])
print(f'Output CSV file: {stats_files[0]}')

- **Interpreting the CSV output**: 
    - <ins>Column 0</ins>: the buffer size in ascending order  
    - <ins>Column 1</ins>: the corresponding achievable operational intensity (OI) 
    - <ins>Column 2</ins>: the corresponding achievable DRAM access count 
    - <ins>Column 3</ins>: the mapping shortform 
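
The column layout above can be read with Python's standard `csv` module. The following is a minimal sketch under stated assumptions: the helper names `BoundPoint`, `load_bound`, and `smallest_buffer_for_oi` are hypothetical and not part of `src/utils.py`, the first three columns are assumed to be numeric, and any header or malformed row is simply skipped. Because the mapping shortform may itself contain commas, everything from column 3 onward is rejoined into one string.

```python
import csv
from typing import List, NamedTuple, Optional


class BoundPoint(NamedTuple):
    buffer_size: float     # column 0
    oi: float              # column 1: achievable operational intensity
    dram_accesses: float   # column 2
    mapping: str           # column 3: mapping shortform


def load_bound(csv_path) -> List[BoundPoint]:
    """Parse an Orojenesis stats CSV into a list of BoundPoint rows."""
    points = []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 4:
                continue  # not a full data row
            try:
                size, oi, dram = float(row[0]), float(row[1]), float(row[2])
            except ValueError:
                continue  # header or malformed row (assumption: skip it)
            # Rejoin the remainder in case the shortform contains commas.
            points.append(BoundPoint(size, oi, dram, ",".join(row[3:]).strip()))
    return points


def smallest_buffer_for_oi(points: List[BoundPoint], target_oi: float) -> Optional[BoundPoint]:
    """Return the smallest-buffer point whose OI reaches target_oi, or None."""
    for p in sorted(points, key=lambda p: p.buffer_size):
        if p.oi >= target_oi:
            return p
    return None
```

For instance, `load_bound(stats_files[0])` would load the file produced by the cell above, and `smallest_buffer_for_oi(points, target_oi)` would then report the smallest buffer size at which a chosen operational intensity becomes attainable.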