# Project Assignment 4, Genomic Informatics, Academic Year 2021/2022

# Aleksandar Malovic 2021/3375

## Testing remarks

Tests for each function are grouped inside a dedicated test function, which keeps each test scope separate and self-contained. This also makes the tests easy to call from elsewhere in the code.

### Global testing helper functions

In [26]:
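def assertArraysEqual(output, expectedOutput):
    # A None output means the function under test reported an error
    assert output is not None
    # Lengths must match before comparing element by element
    assert len(output) == len(expectedOutput)
    for i in range(len(expectedOutput)):
        assert output[i] == expectedOutput[i]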

## Burrows-Wheeler transform

### Create list of all string rotations

The function appends the string to itself, which makes the rotations simple to compute (based on lesson slides). An implementation that splices each rotation together from two slices of the original string is also possible but is suboptimal from a memory standpoint.

**ASSUMPTION:** The input string already has the ending character `$` appended; otherwise the function returns None to indicate an error. The function could instead create a local copy with the ending character appended, but for large strings, copying the whole input just to add one character would be suboptimal from a memory standpoint.

In [50]:
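def isInputValid(t):
    # Valid input is a non-empty string that ends with '$' and is not just '$'
    return t is not None and len(t) > 0 and t != "$" and t.endswith('$')

def rotations(t):
    if not isInputValid(t):
        return None
    # Every window of length len(t) in the doubled string is one rotation of t
    tt = t * 2
    return [tt[i:i+len(t)] for i in range(len(t))]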
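
As a quick illustration of the doubling trick described above, the sketch below compares it with the alternative that builds each rotation from two slices. The names `rotations_by_doubling` and `rotations_by_slicing` are hypothetical, introduced only for this example, and input validation is left out for brevity.

```python
def rotations_by_doubling(t):
    # Every window of length len(t) inside t + t is one rotation of t.
    tt = t * 2
    return [tt[i:i + len(t)] for i in range(len(t))]

def rotations_by_slicing(t):
    # Alternative: move the first i characters to the end of the string.
    return [t[i:] + t[:i] for i in range(len(t))]

example = 'abc$'
print(rotations_by_doubling(example))  # ['abc$', 'bc$a', 'c$ab', '$abc']
print(rotations_by_doubling(example) == rotations_by_slicing(example))  # True
```

Both helpers produce the rotations in the same order, starting with the original string.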

#### Tests

In [51]:
def testInputValidation():
    # Test case 1: None, should return false
    assert not isInputValid(None)
    
    # Test case 2: Empty string, should return false
    assert not isInputValid('')
    
    # Test case 3: String containing only the ending character, should return false
    assert not isInputValid('$')
    
    # Test case 4: String missing ending character, should return false
    assert not isInputValid('abc')
    
    # Test case 5: Valid string, should return true
    assert isInputValid('abc$')

In [52]:
def testRotations():
    # Test case 1: None, should return None
    assert rotations(None) is None

    # Test case 2: Empty string, should return None
    assert rotations('') is None

    # Test case 3: String containing only the ending character, should return None
    assert rotations('$') is None

    # Test case 4: String missing ending character, should return None
    assert rotations('abc') is None

    # Test case 5: Input string of one character plus the ending character
    inputValue = 'a$'
    expectedOutput = ['a$', '$a']
    output = rotations(inputValue)
    assertArraysEqual(output, expectedOutput)
    
    # Test case 6: Valid input string
    inputValue = 'abcd$'
    expectedOutput = ['abcd$', 'bcd$a', 'cd$ab', 'd$abc', '$abcd']
    output = rotations(inputValue)
    assertArraysEqual(output, expectedOutput)

#### Running tests

In [53]:
testInputValidation()

testRotations()

### Sort string rotations in alphabetical order

Based on lesson slides.

In [77]:
def sortedRotations(t):
    r = rotations(t)
    # Propagate None so callers can detect invalid input
    return sorted(r) if r is not None else None
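
A minimal sketch of why plain `sorted` is enough here: Python compares strings by code point, and `$` (code point 36) is lower than any ASCII digit or letter, so the rotation that starts with `$` always comes first. The variable names below are illustrative only.

```python
# '$' has a lower code point than ASCII digits and letters
print(ord('$'), ord('0'), ord('A'), ord('a'))  # 36 48 65 97

rows = ['abcd$', 'bcd$a', 'cd$ab', 'd$abc', '$abcd']
sorted_rows = sorted(rows)
print(sorted_rows)  # ['$abcd', 'abcd$', 'bcd$a', 'cd$ab', 'd$abc']
```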

#### Tests

In [76]:
def testSortRotations():
    # Test case 1: None, should return None
    assert sortedRotations(None) is None

    # Test case 2: Empty string, should return None
    assert sortedRotations('') is None

    # Test case 3: String containing only the ending character, should return None
    assert sortedRotations('$') is None

    # Test case 4: String missing ending character, should return None
    assert sortedRotations('abc') is None

    # Test case 5: Valid input string
    inputValue = 'abcd$'
    expectedOutput = ['$abcd', 'abcd$', 'bcd$a', 'cd$ab', 'd$abc']
    output = sortedRotations(inputValue)
    assertArraysEqual(output, expectedOutput)

#### Running tests

In [78]:
testSortRotations()

### Generate final Burrows-Wheeler transform

We take the last column of the sorted rotations matrix (based on lesson slides).

In [81]:
def calculateBurrowsWheelerTransform(t):
    r = sortedRotations(t)
    # The transform is the last character of each sorted rotation, in order
    return ''.join(row[-1] for row in r) if r is not None else None
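
To make the last-column step concrete, here is a self-contained sketch that prints the sorted rotation matrix for `abaaba$` and then reads off its last column. `bwt_sketch` is a hypothetical helper written for this illustration; it assumes the input already ends with `$` and skips validation.

```python
def bwt_sketch(t):
    # Assumes t already ends with '$'; no validation is performed here.
    n = len(t)
    tt = t * 2
    matrix = sorted(tt[i:i + n] for i in range(n))
    # The transform is the last character of every row, top to bottom.
    return matrix, ''.join(row[-1] for row in matrix)

matrix, bwt = bwt_sketch('abaaba$')
for row in matrix:
    print(row)
# $abaaba
# a$abaab
# aaba$ab
# aba$aba
# abaaba$
# ba$abaa
# baaba$a
print(bwt)  # abba$aa
```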

#### Tests

In [82]:
def testBurrowsWheelerTransform():
    # Test case 1: None, should return None
    assert calculateBurrowsWheelerTransform(None) is None

    # Test case 2: Empty string, should return None
    assert calculateBurrowsWheelerTransform('') is None

    # Test case 3: String containing only the ending character, should return None
    assert calculateBurrowsWheelerTransform('$') is None

    # Test case 4: String missing ending character, should return None
    assert calculateBurrowsWheelerTransform('abc') is None

    # Test case 5: Valid input string without repeated characters
    inputValue = 'abcd$'
    expectedOutput = 'd$abc'
    output = calculateBurrowsWheelerTransform(inputValue)
    assert output is not None
    assert output == expectedOutput

    # Test case 6: Valid input string with repeated characters
    inputValue = 'abaaba$'
    expectedOutput = 'abba$aa'
    output = calculateBurrowsWheelerTransform(inputValue)
    assert output is not None
    assert output == expectedOutput

#### Running tests

In [84]:
testBurrowsWheelerTransform()