# Encode and Decode strings

The goal here is to implement functions that encode a number as a string in a given base and, conversely, decode such a string back into a number.

The motivation is educational.
After reading and thinking about the problem of shortening URLs (e.g., *bit.ly*) and the need for a base-62 encode/decode service, I felt I should work through this exercise myself.
I took help from [this stackoverflow post](http://stackoverflow.com/questions/1119722/base-62-conversion).

Base-62 is the goal, but I start with a base-2 encode/decode solution and then extend it to base *62*.

In [1]:
BASE = { 2 : '01',
       62 : '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'#-_@/.:'
       }

In [2]:
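def encode(number,base=2):
    '''
    Encode 'number' as a string using the alphabet of the given 'base'
    '''
    base = int(base)
    assert base in BASE
    # negative numbers would never reach zero in the loop below
    assert number >= 0
    alphabet = BASE[base]
    if number == 0:
        # zero is the first symbol of the alphabet, returned as a string
        return alphabet[0]
    num = number
    arr = []
    while num:
        # the remainder is the next digit, least significant first
        num,rem = divmod(num,base)
        arr.append(alphabet[rem])
    arr.reverse()
    return ''.join(arr)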

In [3]:
encode(10000,62)

'2Bi'
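
To see where `'2Bi'` comes from, the repeated `divmod` in `encode` can be traced step by step. The snippet below is a minimal sketch, assuming the same lowercase-first alphabet as `BASE[62]`; the helper name `digits_of` is hypothetical and not part of the code above. It returns the digit values before they are mapped to characters.

```python
ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

def digits_of(number, base):
    """Return the digit values of a non-negative integer, most significant first."""
    if number == 0:
        return [0]
    digits = []
    while number:
        number, rem = divmod(number, base)
        digits.append(rem)  # the least significant digit comes out first
    return digits[::-1]

values = digits_of(10000, 62)
print(values)                                # [2, 37, 18]
print(''.join(ALPHABET[v] for v in values))  # 2Bi
```

In other words, 10000 = 2·62² + 37·62 + 18, and positions 2, 37 and 18 of the alphabet hold `2`, `B` and `i`.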

In [4]:
def decode(charcode,base=2):
    '''
    Decode 'charcode' to a number after 'base' conversion
    '''
    base = int(base)
    assert base in BASE
    alphabet = list(BASE[base])
    charcode = list(charcode)
    charcode.reverse()
    num = 0
    for i,char in enumerate(charcode):
        index = alphabet.index(char)
        num += index*(base**i)
    return num

In [5]:
decode('81jbnt',62)

7348410963
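
A quick way to gain confidence in an encode/decode pair is a round-trip check. The following is a self-contained sketch rather than the notebook's own code: `to_base62` and `from_base62` are hypothetical names that re-implement the same idea with the lowercase-first alphabet, so the snippet runs on its own.

```python
ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
BASE_SIZE = len(ALPHABET)  # 62

def to_base62(number):
    """Encode a non-negative integer as a base-62 string."""
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, rem = divmod(number, BASE_SIZE)
        chars.append(ALPHABET[rem])
    return ''.join(reversed(chars))

def from_base62(text):
    """Decode a base-62 string back into an integer."""
    number = 0
    for char in text:
        # Horner's scheme: shift the digits read so far, then add the new one
        number = number * BASE_SIZE + ALPHABET.index(char)
    return number

# decoding an encoded value should give back the original number
for n in (0, 1, 61, 62, 10000, 7348410963):
    assert from_base62(to_base62(n)) == n

print(to_base62(7348410963))  # 81jbnt
```

Reading the string left to right with Horner's scheme avoids reversing it and computing `base**i` for every position; the result is the same as in `decode`.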

## Another solution

Also worth noting is the following solution given by [SO user *Wolph*](http://stackoverflow.com/a/2549514/687896):

In [6]:
import string
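# string.letters exists only in Python 2 and its order depends on the locale;
# the outputs below correspond to digits, then uppercase, then lowercase
BASE_LIST = string.digits + string.ascii_uppercase + string.ascii_lowercase #+ '-_@/.:'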
BASE_DICT = dict((c, i) for i, c in enumerate(BASE_LIST))

def base_decode(string, reverse_base=BASE_DICT):
    length = len(reverse_base)
    ret = 0
    for i, c in enumerate(string[::-1]):
        ret += (length ** i) * reverse_base[c]

    return ret

def base_encode(integer, base=BASE_LIST):
    if integer == 0:
        return base[0]

    length = len(base)
    ret = ''
    while integer != 0:
        ret = base[integer % length] + ret
        integer //= length

    return ret

In [7]:
base_encode(10000)

'2bI'

In [8]:
base_decode('2bI')

10000

In [9]:
base_decode('81jbnt')

7354709073

In [10]:
base_encode(7354709073)

'81jbnt'

In [11]:
url = 'http://stackoverflow.com/questions/1119722/base-62-conversion'
num = sum( ord(c) for c in url )
print(num)

5585


In [12]:
encode(num,62)

'1s5'

In [13]:
base_encode(num)

'1S5'

In [14]:
BASE.update({62:string.digits + string.ascii_uppercase + string.ascii_lowercase })
encode(num,62)

'1S5'
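
The differing results (`'1s5'` versus `'1S5'`, and earlier `'2Bi'` versus `'2bI'`) come only from the order of the letters in the alphabet: the digit values are identical, but they are mapped to different characters. A small sketch to illustrate this, where `encode_with` is a hypothetical helper and both alphabets are written out explicitly:

```python
import string

LOWER_FIRST = string.digits + string.ascii_lowercase + string.ascii_uppercase
UPPER_FIRST = string.digits + string.ascii_uppercase + string.ascii_lowercase

def encode_with(number, alphabet):
    """Encode a non-negative integer; the alphabet's length is the base."""
    base = len(alphabet)
    if number == 0:
        return alphabet[0]
    chars = []
    while number:
        number, rem = divmod(number, base)
        chars.append(alphabet[rem])
    return ''.join(reversed(chars))

print(encode_with(5585, LOWER_FIRST))  # 1s5
print(encode_with(5585, UPPER_FIRST))  # 1S5
```

A string encoded with one alphabet has to be decoded with the same one, which is why `'81jbnt'` decoded to two different numbers above.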

In [15]:
from random import randint as rand
num = rand(0,1000)

import hashlib
md5 = hashlib.md5()
md5.update((url + encode(num)).encode('utf-8'))
md5.hexdigest()

'40069f7abd77245dab44a99913774265'

In [16]:
barray = bytearray(md5.hexdigest().encode('ascii'))

In [17]:
''.join( encode(bc,62) for bc in barray )[:7]

'qmmsv1e'
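
As a variation on the last two cells, the digest can be treated as one large integer and converted to base-62 directly, instead of encoding each byte of the hex string separately. This is only a sketch under assumptions: `short_code`, its `salt` argument and its `length` parameter are hypothetical names, and truncating a hash does not rule out collisions.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def to_base62(number):
    """Encode a non-negative integer as a base-62 string."""
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, rem = divmod(number, len(ALPHABET))
        chars.append(ALPHABET[rem])
    return ''.join(reversed(chars))

def short_code(url, salt='', length=7):
    """Hash url + salt with MD5 and keep the first `length` base-62 characters."""
    digest = hashlib.md5((url + salt).encode('utf-8')).hexdigest()
    # the hex digest is a 128-bit number written in base 16
    return to_base62(int(digest, 16))[:length]

url = 'http://stackoverflow.com/questions/1119722/base-62-conversion'
print(short_code(url))
```

The same URL and salt always produce the same code; passing a different `salt` (for instance the random number used above) yields a different one.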