# Kodo-python Getting Started

This walkthrough shows how to import Kodo-python, create an encoder factory, build an encoder from it, and encode a block of data into packets.

## Importing kodo

Before working with Kodo-python, you need to have it installed and importable. To check that this is the case, try importing it:

In [1]:
# try importing the kodo module
try:
    import kodo
    print("Kodo imported successfully")
except ImportError:
    print("Unable to import kodo!")

Kodo imported successfully


If the import worked, you are ready to go to the next step. Otherwise please *re*visit the README for installation instructions.

## Creating an Encoder

In Kodo, both encoders and decoders are created using factories. Doing so allows efficient memory management and reuse of various computations and components.
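
To make the reuse argument concrete, here is a minimal sketch of the general factory idea in plain Python. It is not Kodo's implementation: `ToyBufferFactory`, `build` as used here, and `recycle` are hypothetical names invented for this illustration, and the only thing being reused is a byte buffer.

```python
class ToyBufferFactory:
    """Hypothetical factory that recycles buffers instead of reallocating them."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self._pool = []  # buffers that were handed back for reuse

    def build(self):
        # Reuse a returned buffer when one is available, otherwise allocate.
        if self._pool:
            return self._pool.pop()
        return bytearray(self.buffer_size)

    def recycle(self, buffer):
        # Zero the buffer in place so the next user starts from a clean state.
        buffer[:] = bytes(len(buffer))
        self._pool.append(buffer)


factory = ToyBufferFactory(buffer_size=128)
first = factory.build()
first[0] = 0xFF
factory.recycle(first)
second = factory.build()

# The second build hands back the same, now zeroed, buffer object.
print(second is first, second[0])
```

The point of the sketch is only that a factory can own expensive resources and hand them out repeatedly, so building a new coder does not have to start from scratch.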

Before creating an encoder, let's look at the encoder factories provided by the ``kodo`` module:

In [2]:
# print all members whose name contains both "Factory" and "Encoder"
print("\n".join(name for name in dir(kodo) if "Factory" in name and "Encoder" in name))

FullVectorEncoderFactoryBinary
FullVectorEncoderFactoryBinary16
FullVectorEncoderFactoryBinary4
FullVectorEncoderFactoryBinary8
NoCodeEncoderFactory
OnTheFlyEncoderFactoryBinary
OnTheFlyEncoderFactoryBinary16
OnTheFlyEncoderFactoryBinary4
OnTheFlyEncoderFactoryBinary8
PerpetualEncoderFactoryBinary
PerpetualEncoderFactoryBinary16
PerpetualEncoderFactoryBinary4
PerpetualEncoderFactoryBinary8
SlidingWindowEncoderFactoryBinary
SlidingWindowEncoderFactoryBinary16
SlidingWindowEncoderFactoryBinary4
SlidingWindowEncoderFactoryBinary8
SparseFullVectorEncoderFactoryBinary
SparseFullVectorEncoderFactoryBinary16
SparseFullVectorEncoderFactoryBinary4
SparseFullVectorEncoderFactoryBinary8


As the output shows, many different encoder factories exist. Most of these have decoder factory counterparts.
You may have noticed a pattern in the factory names: with some exceptions, each name combines the encoding algorithm with the underlying finite field.
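
The naming pattern can be checked with a few lines of plain Python. This is a sketch that works on names copied from the listing above, so it does not need Kodo installed. The helper `split_factory_name` and the regular expression are assumptions made for this illustration, not part of the ``kodo`` module.

```python
import re

# A few factory names copied from the listing above.
factory_names = [
    "FullVectorEncoderFactoryBinary",
    "FullVectorEncoderFactoryBinary16",
    "NoCodeEncoderFactory",
    "OnTheFlyEncoderFactoryBinary8",
    "SparseFullVectorEncoderFactoryBinary4",
]

# Assumed pattern: <Algorithm>EncoderFactory<Field>, where the field part
# may be missing (NoCodeEncoderFactory is one of the exceptions).
NAME_PATTERN = re.compile(r"^(?P<algorithm>\w+?)EncoderFactory(?P<field>\w*)$")


def split_factory_name(name):
    """Return (algorithm, field) for an encoder factory name (hypothetical helper)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("not an encoder factory name: {}".format(name))
    return match.group("algorithm"), match.group("field") or None


for name in factory_names:
    print(split_factory_name(name))
```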

For this walkthrough we pick the full vector factory using the binary field, i.e. the *``FullVector``*``EncoderFactory``*``Binary``* factory.

Note: The choice of encoder factory should be interchangeable in this guide, so I'll store the class as ``EncoderFactory``.

In [3]:
# store the full vector binary encoder as EncoderFactory
EncoderFactory = kodo.FullVectorEncoderFactoryBinary

Let's see what the ``EncoderFactory``'s constructor takes as arguments: 

In [4]:
# Get information about the encoder factory's __init__ function
help(EncoderFactory.__init__)

Help on method __init__:

__init__(...) unbound kodo.FullVectorEncoderFactoryBinary method
    Factory constructor.
    
            :param max_symbols: The maximum symbols the coders can expect.
            :param max_symbol_size: The maximum size of a symbol in bytes.



So, to create a factory, we need to pick the ``max_symbols`` and ``max_symbol_size``.
These parameters determine upper bounds for the encoders created by the factory.

The proper values to pick depend on the use case.

Let's create an encoder_factory object:

In [5]:
max_symbols = 4
max_symbol_size = 32

encoder_factory = EncoderFactory(
    max_symbols=max_symbols,
    max_symbol_size=max_symbol_size)

We can now use the object's ``build`` method to create encoders, but other methods are also available:

In [6]:
# Print all public members
print("\n".join([item for item in dir(encoder_factory) if not item.startswith("__")]))

build
max_block_size
max_payload_size
max_symbol_size
max_symbols
set_symbol_size
set_symbols
symbol_size
symbols


These can be used either to get information about the factory or to set values used for the encoders created by the ``build`` method.

Let's print out the maximum block size, i.e. the maximum amount of data that can be encoded during each generation.

In [7]:
max_block_size = encoder_factory.max_block_size()
print("Max block size: {}".format(max_block_size))

Max block size: 128


Note that the maximum block size is the product of the previously set ``max_symbols`` and ``max_symbol_size``:

In [8]:
calculated_max_block_size = max_symbols * max_symbol_size
print("Calculated max block size: {}".format(calculated_max_block_size))

Calculated max block size: 128
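
The same arithmetic tells us how many generations a larger piece of data would need. The sketch below assumes the data is simply cut into consecutive blocks of at most the maximum block size. `generations_needed` is a hypothetical helper written for this illustration, not a Kodo function.

```python
max_symbols = 4
max_symbol_size = 32
max_block_size = max_symbols * max_symbol_size  # 128 bytes per generation


def generations_needed(data_length, block_size):
    """Number of blocks needed to hold data_length bytes (ceiling division)."""
    return -(-data_length // block_size)


for length in (100, 128, 129, 1000):
    print("{} bytes -> {} generation(s)".format(
        length, generations_needed(length, max_block_size)))
```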


Well, let's do as they say in Monty Python:

In [9]:
from IPython.display import YouTubeVideo
YouTubeVideo('dEtm_Q2LK9g')

In [34]:
encoder = encoder_factory.build()

Fantastic, we've built our first encoder! Let's see what we can use it for:

In [35]:
# Print all public members
print("\n".join([item for item in dir(encoder) if not item.startswith("__")]))

block_size
is_systematic_on
payload_size
rank
set_symbol
set_symbols
set_systematic_off
set_systematic_on
symbol_size
symbols
trace
write_payload


Let's inspect the state of our newly created encoder.

In [36]:
def print_state(encoder):
    print(
        "block_size: {}\n"
        "is_systematic_on: {}\n"
        "payload_size: {}\n"
        "rank: {}\n"
        "symbol_size: {}\n"
        "symbols: {}".format(
            encoder.block_size(),
            encoder.is_systematic_on(),
            encoder.payload_size(),
            encoder.rank(),
            encoder.symbol_size(),
            encoder.symbols())
    )
print_state(encoder)

block_size: 128
is_systematic_on: True
payload_size: 38
rank: 0
symbol_size: 32
symbols: 4


We use the ``write_payload`` method to encode data, but since we have yet to tell the encoder what data to encode, we can't use it yet.
This can be seen from the encoder's rank, which is 0.

Let's create some data to encode:

In [37]:
data_in = (
    "The size of this data is exactly 128 bytes "
    "which means it will fit perfectly in a single generation. "
    "That is very lucky, indeed!"
)
print("Length of data string: {}".format(len(data_in)))

Length of data string: 128


Kodo uses Python strings as data objects, which means each character represents a byte. Let's set the data to encode on the encoder.

In [38]:
encoder.set_symbols(data_in)
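
Conceptually, the 128-byte block now corresponds to four symbols of 32 bytes each. The following sketch reproduces that split in plain Python without calling Kodo, assuming the block is cut into consecutive pieces. `split_into_symbols` is a hypothetical helper for this illustration.

```python
def split_into_symbols(block, symbol_size):
    """Cut a block into consecutive pieces of symbol_size characters."""
    return [block[i:i + symbol_size] for i in range(0, len(block), symbol_size)]


data_in = (
    "The size of this data is exactly 128 bytes "
    "which means it will fit perfectly in a single generation. "
    "That is very lucky, indeed!"
)

symbols = split_into_symbols(data_in, symbol_size=32)
for index, symbol in enumerate(symbols):
    print("symbol {}: {!r}".format(index, symbol))
```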

We should now be able to see how the state of the encoder has changed.

In [39]:
print_state(encoder)

block_size: 128
is_systematic_on: True
payload_size: 38
rank: 4
symbol_size: 32
symbols: 4


Notice how the rank is now equal to the number of symbols:

In [40]:
encoder.rank() == max_symbols

True

We can only encode if the rank is greater than 0.

Let's encode some packets:

In [41]:
packet1 = encoder.write_payload()
packet2 = encoder.write_payload()
packet3 = encoder.write_payload()
packet4 = encoder.write_payload()

print(
    "packet1: {}\n"
    "packet2: {}\n"
    "packet3: {}\n"
    "packet4: {}\n".format(
        packet1,
        packet2,
        packet3,
        packet4,
    )
)

packet1: �    The size of this data is exactly
packet2: �    128 bytes which means it will f
packet3: �   it perfectly in a single generat
packet4: �   ion. That is very lucky, indeed!



A few interesting things can be seen from this output. First of all, it's very easy to see the content of the string in each packet. Secondly, notice the ``�`` at the beginning of each string; that's the packet header, which contains the symbol id.
The content of the packets is readable because the encoder is systematic. Systematic means that the encoder starts by sending the original symbols uncoded, since every symbol is initially innovative, i.e. new to the receiver.

In [42]:
encoder.is_systematic_on()

True
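
To illustrate what systematic means, here is a toy encoder over the binary field written in plain Python. It is a sketch of the idea only and does not reproduce Kodo's packet format or header. `toy_systematic_encoder` and `xor_bytes` are hypothetical names, and the random coefficients come from Python's `random` module rather than from Kodo.

```python
import random


def xor_bytes(a, b):
    """Bytewise XOR, i.e. addition in the binary field."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_systematic_encoder(symbols, extra_packets, seed=0):
    """Yield (coefficients, payload) pairs: uncoded symbols first, then coded ones."""
    rng = random.Random(seed)
    count = len(symbols)
    # Systematic phase: each original symbol is sent once, unmodified.
    for index, symbol in enumerate(symbols):
        yield [1 if i == index else 0 for i in range(count)], symbol
    # Coded phase: each payload is the XOR of a random subset of symbols.
    for _ in range(extra_packets):
        coefficients = [rng.randint(0, 1) for _ in range(count)]
        payload = bytes(len(symbols[0]))
        for coefficient, symbol in zip(coefficients, symbols):
            if coefficient:
                payload = xor_bytes(payload, symbol)
        yield coefficients, payload


data = (
    b"The size of this data is exactly 128 bytes "
    b"which means it will fit perfectly in a single generation. "
    b"That is very lucky, indeed!"
)
symbols = [data[i:i + 32] for i in range(0, len(data), 32)]
packets = list(toy_systematic_encoder(symbols, extra_packets=2))

for coefficients, payload in packets:
    print(coefficients, payload)
```

The first four payloads are the readable original symbols, just like the packets printed earlier. The later payloads are XOR combinations and are generally not readable as text.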