# Assemble

This journal runs examples for the following functions:
1. **breakup_overlaps_by_intersect** : Extracts repeats in input_pattern_obj, 
    which holds the starting indices of the repeats, into the essential 
    structure components using bw_vec, which holds the lengths of each repeat.
    
2. **check_overlaps**: Compares every pair of groups, determining whether any
    repeats in any pair of groups overlap. 

3. **__num_of_parts** : Determines the number of blocks of consecutive time 
    steps in a list of time steps. A block of consecutive time steps represents 
    a distilled section of a repeat.
4. **__inds_to_rows** :  Expands a vector containing the starting indices of a 
    piece or two of a repeat into a matrix representation recording when these
    pieces occur in the song with 1's. All remaining entries are marked with 
    0's.
5. **_compare_and_cut** : Compares two rows of repeats labeled RED and BLUE, and
    determines if there are any overlaps in time between them. If there are, 
    the repeats in RED and BLUE are cut into up to 3 pieces. 
6. **_merge_based_on_length** : Merges repeats that are the same length, as set 
    by full_bandwidth, and are repeats of the same piece of structure.
7. **_merge_rows** : Merges rows that have at least one common repeat; said 
    common repeat(s) must occur at the same time step and be of common length
8. **hierarchical_structure** : Distills the repeats encoded in MATRIX_NO 
    (and KEY_NO) to the essential structure components and then builds the 
    hierarchical representation


# Need a new picture to show the relationships between the functions 


## Import Modules

In [1]:
import numpy as np
import assemble
from assemble import *
from inspect import signature 
from search import find_all_repeats
from utilities import reconstruct_full_block

[[0 0 1 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0]
 [1 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 0]]
[1]
[[0 1 1 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 0 0]
 [1 1 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 0 0 0]
 [1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 0 0 0]]
[1]


## 1. breakup_overlaps_by_intersect 

### About breakup_overlaps_by_intersect 


### Arguments

### Returns

### Example

## 2. check_overlaps
### About check_overlaps

### Arguments

### Returns

### Example

## 3. __num_of_parts

### About __num_of_parts

This function determines the number of blocks of consecutive time steps in a list of time steps. A block of consecutive time steps represents a distilled section of a repeat. This distilled section is replicated, and the starting indices of the repeats within it are returned. The function first uses the variable breakmark to check whether input_vec contains a single group of consecutive time steps or two groups of consecutive time steps. Each case is handled by its own branch of an if-else statement, which returns the shifted starting indices and the corresponding lengths.

<img src="num_of_parts.png" alt="Chart" style="width:800px;" align = "middle"/>

As the picture above shows, red and blue are two repeats, and purple is their intersection. We want to know the starting indices and lengths of the parts that remain after cutting purple out of all of the repeats in red. Cases a and b go into the if statement and case c goes into the else statement, because the number of parts in input_vec, which is visible in the picture, determines the branch.
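
The two branches described above can be sketched in plain Python lists, so the example runs without the `assemble` module. The helper names `split_into_blocks` and `replicate_blocks` are hypothetical and are not part of `assemble`. The sketch assumes that each block is shifted by the offset between its first time step and `input_start`, which agrees with the examples in this section, and it returns nested lists rather than NumPy arrays.

```python
def split_into_blocks(time_steps):
    """Group a sorted list of time steps into blocks of consecutive steps."""
    blocks = [[time_steps[0]]]
    for step in time_steps[1:]:
        if step - blocks[-1][-1] == 1:
            # Still consecutive: extend the current block.
            blocks[-1].append(step)
        else:
            # A gap larger than 1 starts a new block.
            blocks.append([step])
    return blocks


def replicate_blocks(input_vec, input_start, input_all_starts):
    """Shift every start in input_all_starts by each block's offset.

    Returns (starts, lengths) with one entry per block. This is an
    illustrative stand-in, not the library's __num_of_parts.
    """
    starts, lengths = [], []
    for block in split_into_blocks(input_vec):
        offset = block[0] - input_start
        starts.append([s + offset for s in input_all_starts])
        lengths.append(block[-1] - block[0] + 1)
    return starts, lengths


# One block of consecutive steps (the "if" case).
print(replicate_blocks([3, 4], 0, [3, 7, 10]))  # ([[6, 10, 13]], [2])
# Two blocks separated by a gap (the "else" case).
print(replicate_blocks([3, 5], 3, [3, 7, 10]))  # ([[3, 7, 10], [5, 9, 12]], [1, 1])
```

A single block yields one row of starts and one length; a gap in `input_vec` yields two rows and two lengths.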

### Arguments

- input_vec: An array containing one or two parts of a repeat that overlap in time and may need to be replicated 
            
- input_start: An array containing the starting index of the part to be replicated 
        
- input_all_starts: An array containing the starting indices for replication 

### Returns

- start_mat: An array of one or two rows, containing the starting indices of the replicated repeats 
            
- length_vec: A column vector containing the lengths of the replicated parts 

### Examples

**Input 1** which goes into the if statement: 

In [2]:
input_vec = np.array([3,4])
input_start = np.array([0])
input_all_starts = np.array([3,7,10])

**Output 1**

In [3]:
from assemble import __num_of_parts
__num_of_parts(input_vec,input_start,input_all_starts)

(array([ 6, 10, 13]), 2)

**Input 2** which goes into the else statement: 

In [4]:
input_vec = np.array([3,5])
input_start = np.array([3])
input_all_starts = np.array([3,7,10])

**Output 2**

In [5]:
__num_of_parts(input_vec,input_start,input_all_starts)

(array([[ 3,  7, 10],
        [ 5,  9, 12]]),
 array([[1],
        [1]]))

## 4. __inds_to_rows

### About __inds_to_rows

This function expands a vector containing the starting indices of a piece or two of a repeat into a matrix representation recording when these pieces occur in the song with 1's. All remaining entries are marked with 0's. 
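
A minimal sketch of this expansion, written with plain Python lists so it runs without the `assemble` module. The name `inds_to_rows_sketch` is hypothetical; it takes a list of rows of starting indices and is only an illustration of the idea, not the library's `__inds_to_rows`.

```python
def inds_to_rows_sketch(start_rows, row_length):
    """Turn rows of starting indices into binary rows of length row_length."""
    new_mat = []
    for starts in start_rows:
        row = [0] * row_length
        for idx in starts:
            row[idx] = 1  # mark the time step where a piece starts
        new_mat.append(row)
    return new_mat


print(inds_to_rows_sketch([[0, 1, 6, 7]], 10))
# [[1, 1, 0, 0, 0, 0, 1, 1, 0, 0]]
```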

### Arguments

- start_mat: A matrix of one or two rows, containing the starting indices 
            
- row_length: length of the rows, an integer

### Returns

- new_mat: A matrix of one or two rows, with 1's at the starting indices and 0's otherwise 

### Examples

#### Input 

In [6]:
start_mat = np.array([0,1,6,7])
row_length = 10

#### Output

In [7]:
from assemble import __inds_to_rows

In [8]:
__inds_to_rows(start_mat, row_length)

array([[1, 1, 0, 0, 0, 0, 1, 1, 0, 0]])

## 5. _compare_and_cut

### About _compare_and_cut

This function compares two rows of repeats labeled RED and BLUE, and determines if there are any overlaps in time between them. If there are, it cuts the repeats in RED and BLUE into up to 3 pieces. The function first determines whether the rows intersect at all; if they do, it compares one repeat in red to one repeat in blue. Using the intersection of the two repeats (purple) and the function __num_of_parts, it finds the new starting indices and their lengths. Calling __inds_to_rows then converts these new starting indices into binary matrices with ones where repeats start and zeros otherwise. Once the new matrices are available, the function calls _merge_based_on_length to merge repeats that are the same length, and if the merged results have overlapping repeats within a row, it calls _compare_and_cut again. 
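
The cutting step for a single red repeat and a single blue repeat can be illustrated with set operations on the time steps each repeat covers. This is only a sketch of the idea: the helper names `covered_steps` and `cut_pieces` are hypothetical, and the real function works on whole rows of repeats and goes on to replicate and merge the pieces.

```python
def covered_steps(start, length):
    """Time steps covered by a repeat starting at `start` with `length` steps."""
    return set(range(start, start + length))


def cut_pieces(red_start, red_len, blue_start, blue_len):
    """Split one red and one blue repeat into purple, red-only and blue-only steps."""
    red = covered_steps(red_start, red_len)
    blue = covered_steps(blue_start, blue_len)
    purple = sorted(red & blue)      # overlap in time
    red_only = sorted(red - blue)    # what is left of red
    blue_only = sorted(blue - red)   # what is left of blue
    return purple, red_only, blue_only


# A red repeat of length 5 starting at 0 and a blue repeat of length 3 starting at 1.
purple, red_only, blue_only = cut_pieces(0, 5, 1, 3)
print(purple)     # [1, 2, 3]
print(red_only)   # [0, 4]
print(blue_only)  # []
```

Here the red remainder `[0, 4]` is not consecutive, so it consists of two blocks; this is the situation in which a list of time steps has to be split into separate parts.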


# Need a picture to explain the whole process

### Arguments

- red: A binary row vector encoding a set of repeats with 1's where each repeat starts and 0's otherwise 
            
- red_len: The length of repeats encoded in red 
            
- blue: A binary row vector encoding a set of repeats with 1's where each repeat starts and 0's otherwise 
            
- blue_len: The length of repeats encoded in blue 

### Returns

- union_mat: A binary matrix representation of up to three rows encoding non-overlapping repeats cut from red and blue
- union_length: A vector containing the lengths of the repeats encoded in union_mat

### Examples

#### Input

In [9]:
red = np.array([1,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0])
red_len = np.array([5])
blue = np.array([1,1,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0])
blue_len = np.array([3])

#### Output

In [10]:
from assemble import _compare_and_cut
_compare_and_cut(red, red_len, blue, blue_len)

[[0 0 1 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0]
 [1 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 0]]
[1]
[[0 1 1 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 0 0]
 [1 1 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 0 0 0]
 [1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 0 0 0]]
[1]


(array([[1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0],
        [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]]),
 array([[1],
        [1],
        [2]]))

## 6. _merge_based_on_length

### About _merge_based_on_length

This function merges repeats that are the same length, as set by full_bandwidth, and are repeats of the same piece of structure. In the merging process, if there are rows that have at least one common repeat, the function will call _merge_rows to actually merge them.

### Arguments

- full_mat: A binary matrix with ones where repeats start and zeroes otherwise
        
- full_bw: The length of repeats encoded in full_mat
    
- target_bw: The length of repeats that we seek to merge

### Returns

- out_mat: A binary matrix with ones where repeats start and zeros otherwise with rows of full_mat merged if appropriate
        
- one_length_vec: The length of the repeats encoded in out_mat

### Examples

#### Input

In [11]:
full_mat = np.array([[0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0],[1,1,1,0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0]])
full_bw = np.array([[2],[2]])
target_bw = np.array([[2],[2]])

#### Output

In [12]:
from assemble import _merge_based_on_length
_merge_based_on_length(full_mat,full_bw,target_bw)

[[0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0]
 [1 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 0]]
[2]
[[0 1 1 1 1 0 1 1 0 0 0 1 1 0 0 1 0 0 0 0]
 [1 1 1 1 0 1 1 0 0 0 1 1 0 0 1 0 0 0 0 0]
 [1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 1 0 0 0 0]]
[1]


(array([[1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0]]),
 array([2]))

## 7. _merge_rows

### About _merge_rows

This function merges rows that have at least one common repeat; said common repeat(s) must occur at the same time step and be of common length. In a while loop, the function checks the unchecked rows one by one, finds the indices of unmerged overlapping rows, unions rows that have starting indices in common, and checks that newly merged rows do not cause overlaps within a row (if there are conflicts, it reruns _compare_and_cut). When no unchecked rows remain, the loop ends and the function returns the merged matrix.
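
The union step can be sketched with plain Python lists. The name `merge_rows_sketch` is hypothetical, and the sketch deliberately leaves out the check for overlaps within a merged row (and therefore the rerun of `_compare_and_cut`), so it only illustrates how rows with a common starting index are combined.

```python
def merge_rows_sketch(rows):
    """Union binary rows that share at least one starting index.

    Simplified illustration: no within-row overlap check is performed.
    """
    merged = []
    for row in rows:
        row = list(row)
        untouched = []
        for other in merged:
            if any(a and b for a, b in zip(row, other)):
                # Common start found: take the element-wise union.
                row = [a | b for a, b in zip(row, other)]
            else:
                untouched.append(other)
        merged = untouched + [row]
    return merged


rows = [
    [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0],
]
print(merge_rows_sketch(rows))
# [[1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0]]
```

The two rows share a start at index 2, so they collapse into a single row. Rows with no start in common are left as separate rows.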

### Arguments

- input_mat: A binary matrix with ones where repeats start and zeroes otherwise
        
- input_width: The length of repeats encoded in input_mat

### Returns

- merge_mat: A binary matrix with ones where repeats start and zeroes otherwise

### Examples

#### Input

In [13]:
input_mat = np.array([[0,0,1,1,1,0,0,0,1,0,0,0,1,1,0,0,1,0,0,0],
 [1,1,1,0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0]])
input_width = np.array([1])

In [14]:
from assemble import _merge_rows

In [15]:
_merge_rows(input_mat,input_width)

[[0 0 1 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0]
 [1 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 0]]
[1]


array([[1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0]])