This notebook was prepared by [Donne Martin](http://donnemartin.com). Source and license info is on [GitHub](https://github.com/donnemartin/interactive-coding-challenges).

# Solution Notebook

## Problem: Compress a string such that 'AAABCCDDDD' becomes 'A3BC2D4'.  Only compress the string if it saves space.

* [Constraints](#Constraints)
* [Test Cases](#Test-Cases)
* [Algorithm](#Algorithm)
* [Code](#Code)
* [Unit Test](#Unit-Test)
* [Bonus C Algorithm](#C-Algorithm)
* [Bonus C Code](#C-Code)

## Constraints

* Can we assume the string is ASCII?
    * Yes
    * Note: Unicode strings could require special handling depending on your language
* Is this case sensitive?
    * Yes
* Can we use additional data structures?  
    * Yes
* Can we assume this fits in memory?
    * Yes
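
The note about Unicode can be illustrated with a small sketch. It assumes Python 3 and uses the standard `unicodedata` module. The variable names are illustrative only. The same visible character can be stored as one code point or as a base letter plus a combining mark, so a char-by-char comparison may see no runs until the string is normalized.

```python
import unicodedata

# 'e' followed by COMBINING ACUTE ACCENT, repeated three times.
decomposed = 'e\u0301' * 3

# NFC normalization merges each pair into the single code point U+00E9.
composed = unicodedata.normalize('NFC', decomposed)

print(len(decomposed))  # 6: no two adjacent code points are equal
print(len(composed))    # 3: one run of three equal code points
```

With the ASCII assumption stated above, none of this applies and each char can be compared directly.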

## Test Cases

* None -> None
* '' -> ''
* 'AABBCC' -> 'AABBCC'
* 'AAABCCDDDD' -> 'A3BC2D4'

## Algorithm

* For each char in string
    * If char is the same as last_char, increment count
    * Else
        * Append last_char, followed by count if count > 1, to compressed_string
        * last_char = char
        * count = 1
* Append last_char, followed by count if count > 1, to compressed_string
* If the compressed string size is < string size
    * Return compressed string
* Else
    * Return string
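
As a way to see the runs the steps above detect, here is a minimal sketch that swaps the manual last_char/count bookkeeping for the standard library's `itertools.groupby`. This is an alternative technique, not the notebook's own method, and the helper names `runs` and `encode` are hypothetical.

```python
from itertools import groupby


def runs(string):
    """Return (char, count) pairs for each run of equal adjacent chars."""
    return [(char, len(list(group))) for char, group in groupby(string)]


def encode(string):
    """Join the runs, writing the count only when it is greater than 1."""
    return ''.join(char + (str(count) if count > 1 else '')
                   for char, count in runs(string))


print(runs('AAABCCDDDD'))    # [('A', 3), ('B', 1), ('C', 2), ('D', 4)]
print(encode('AAABCCDDDD'))  # A3BC2D4
print(encode('AABBCC'))      # A2B2C2
```

'AABBCC' encodes to 'A2B2C2', which is no shorter than the input, so the final size check returns the original string.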

Complexity:
* Time: O(n)
* Space: O(n)

Complexity Note:
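* Although strings are immutable in Python, CPython optimizes appending to a string: when no other reference to the string exists, it can extend the string in place, so building the result by repeated appends runs in O(n) overall. This is a CPython implementation detail rather than a language guarantee. Refer to this [Stack Overflow post](http://stackoverflow.com/a/4435752).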
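
For code that should not rely on that optimization, one common alternative is to collect the pieces in a list and join them once at the end. The sketch below assumes the same rules as the problem statement. The function name `compress_with_join` is hypothetical.

```python
def compress_with_join(string):
    """Compress runs of repeated chars, collecting pieces in a list."""
    if not string:  # covers both None and ''
        return string
    parts = []
    prev_char = string[0]
    count = 0
    for char in string:
        if char == prev_char:
            count += 1
        else:
            # Close the previous run; write the count only if it is > 1.
            parts.append(prev_char + (str(count) if count > 1 else ''))
            prev_char = char
            count = 1
    parts.append(prev_char + (str(count) if count > 1 else ''))
    result = ''.join(parts)  # a single O(n) concatenation
    return result if len(result) < len(string) else string


print(compress_with_join('AAABCCDDDD'))  # A3BC2D4
print(compress_with_join('AABBCC'))      # AABBCC
```

List appends are amortized O(1) and `''.join` makes one pass over the pieces, so the O(n) bound does not depend on the interpreter.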

## Code

In [1]:
class CompressString(object):

    def compress(self, string):
        if string is None or not string:
            return string
        result = ''
        prev_char = string[0]
        count = 0
        for char in string:
            if char == prev_char:
                count += 1
            else:
                result += self._calc_partial_result(prev_char, count)
                prev_char = char
                count = 1
        result += self._calc_partial_result(prev_char, count)
        return result if len(result) < len(string) else string

    def _calc_partial_result(self, prev_char, count):
        return prev_char + (str(count) if count > 1 else '')

## Unit Test

In [2]:
%%writefile test_compress.py
from nose.tools import assert_equal


class TestCompress(object):

    def test_compress(self, func):
        assert_equal(func(None), None)
        assert_equal(func(''), '')
        assert_equal(func('AABBCC'), 'AABBCC')
        assert_equal(func('AAABCCDDDDE'), 'A3BC2D4E')
        assert_equal(func('BAAACCDDDD'), 'BA3C2D4')
        assert_equal(func('AAABAACCDDDD'), 'A3BA2C2D4')
        print('Success: test_compress')


def main():
    test = TestCompress()
    compress_string = CompressString()
    test.test_compress(compress_string.compress)


if __name__ == '__main__':
    main()

Overwriting test_compress.py


In [3]:
%run -i test_compress.py

Success: test_compress


## C Algorithm

This algorithm mirrors the Python algorithm above.

Allocate a 'result' buffer (the compressed string) sized from the input string's length; the compressed form is never longer than the input.

* For each char in the string
    * If the char is the same as the last char, increment the count
    * Else
        * Append the last char, followed by the count if it is greater than 1, to 'result'
        * last char = char
        * count = 1
* Append the last char, followed by the count if it is greater than 1, to 'result'
* If the 'result' (compressed string) size is < string size
    * Return 'result'
* Else
    * Return string

## C Code

In [None]:
# %load compress.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


char *compress(const char *s);
static size_t append_partial_result(char *dest, size_t size,
                                    char prev_char, size_t count);

int main(void) {
  char *result = compress("AABBCC");
  if (result != NULL) {
    printf("%s\n", result);
    free(result);
  }
  return 0;
}

/* Returns a newly allocated string (the caller frees it): the compressed
 * form of s if it is shorter than s, otherwise a copy of s.
 * Returns NULL if s is NULL or if allocation fails. */
char *compress(const char *s) {
  size_t len, i, used = 0, count = 0;
  char prev_char; // previous char
  char *result;   // result string
  if (s == NULL)
    return NULL;
  len = strlen(s);
  // A run of n chars encodes to at most n chars, so len + 1 bytes suffice.
  result = malloc(len + 1);
  if (result == NULL)
    return NULL;
  result[0] = '\0';
  if (len == 0)
    return result;
  prev_char = s[0];
  for (i = 0; i < len; i += 1) {
    if (s[i] == prev_char)
      count += 1;
    else {
      used += append_partial_result(result + used, len + 1 - used,
                                    prev_char, count);
      prev_char = s[i];
      count = 1;
    }
  }
  used += append_partial_result(result + used, len + 1 - used,
                                prev_char, count);
  if (used >= len)
    strcpy(result, s); // no space saved: return a copy of the input
  return result;
}

/* Writes prev_char, followed by count if count > 1, to dest as a
 * NUL-terminated string. Returns the number of chars written. */
static size_t append_partial_result(char *dest, size_t size,
                                    char prev_char, size_t count) {
  if (count > 1)
    return (size_t) snprintf(dest, size, "%c%zu", prev_char, count);
  return (size_t) snprintf(dest, size, "%c", prev_char);
}
