A new structure for experimentation on decoding: communication channels #18269

sagetrac-dlucas · 2015-04-21T13:50:08Z

For now, there is no structure to easily add errors in codewords. If one wants to experiment with decoding algorithms on codes, the only possible way is to add errors "by hand", which is rather tedious.

We propose here a new structure, based on communication channels, on purpose to facilitate the experimentation process with decoding algorithms.

For now, our structure consists of:

an abstract class for channels,
a channel which adds a specific number of errors at random positions to each provided vector
a channel which adds a specific number of errors at random positions to each provided vector, and erases a specific number of random positions

With this new structure, creating n errors in a vector can be done in one line of code into Sage.
Adding a new Channel class should also be easy: all one needs to do is to inherit from the abstract class, and override and implement a method.

For better consistency, channels can only be accessed using channels. (see #15445) from the global namespace.

CC: @johanrosenkilde

Component: coding theory

Author: David Lucas

Branch/Commit: 383e67c

Reviewer: Vincent Delecroix

Issue created by migration from https://trac.sagemath.org/ticket/18269

The text was updated successfully, but these errors were encountered:

sagetrac-dlucas · 2015-04-21T13:50:58Z

Branch: u/dlucas/channels

sagetrac-dlucas · 2015-04-21T13:51:36Z

New commits:

`3f66f35`	`Abstract class for channels`
`df9fd2c`	`New channel: static error rate`
`4da3b35`	`Changed naming conventions and added new channel: ErrorErasure`
`62a2447`	`Channels are now part of the reference manual for coding. Added imports.`

sagetrac-dlucas · 2015-04-21T13:51:36Z

Commit: 62a2447

videlec · 2015-04-22T08:10:15Z

comment:3

Hello,

You might consider inheriting from SageObject (sage.structure.sage_object).
I would rather hide them under codes.channels.<name>. That way you can avoid the file channels_catalog.
Why sage.coding.channel_constructions and not sage.coding.channels?
Why not setting __call__ as an alias of transmit. That would allow

sage: channel(msg)

which is convenient.

Would be cool to have cross references with actual codes in the documentation. Right now the documentation looks like this is an isolated block.
You could add various tests to check that if you reset the seed you get the same answer. Might be useful to have this doctested and also for users to know that they can use it.

sage: set_random_seed(10)
sage: randint(0,5)
5
sage: set_random_seed(10)
sage: randint(0,5)
5
sage: set_random_seed(10)
sage: randint(0,5)
5

In _random_error_vector why not starting with

vect = [F.random_element() for _ in range(n)]

The functions _random_error_position, _random_error_vector and _tuple_to_integer will not appear in the documentation. You should either remove the _ at the begining or add some automethod sphinx directives (you have to look in the code to find how to do it)
The function _random_error_position should be moved to prandom. It is useful in its own and quite independent. Note that it is quite similar to

sage: sample(range(10), 4)
[1, 7, 6, 4]

For _random_error_vector I would rather use F._nonzero_random_element() that is useful in your case

sage: GF(2)._random_nonzero_element()
1

of course the default implementation is to call .random_element until something nonzero is found. Hence you can use

Why do you need this _tuple_to_integer? Why not using directly randint(*value)?

sage: value = [0,5]
sage: randint(*value)
3
sage: value = (0,5)
sage: randint(*value)
4

Eleven is a nice prime number. So I will stop there.

Vincent

sagetrac-dlucas · 2015-04-22T09:33:51Z

comment:4

Hello,

Thank you for the input!

I would rather hide them under codes.channels.<name>. That way you can avoid the file channels_catalog.
Why sage.coding.channel_constructions and not sage.coding.channels?

My idea here was to mimic the way codes were stored and hide: a code_constructions file for the codes, and a codes_catalog file for the list of codes. I wanted to do the same with channels for consistency. That's why I chose the name channel_constructions.
Hiding the channels under codes.channels. might be confusing (imho): one can think that channels depends on codes, which is not the case as they are completely independant objects. So they can be used on any vector, not only on codewords, or for experimentation on decoding. Which also explains why the documentation looks like an isolated block, I wrote it as a new object, independant from codes.

The functions _random_error_position, _random_error_vector and _tuple_to_integer will not appear in the documentation. You should either remove the _ at the begining or add some automethod sphinx directives (you have to look in the code to find how to do it)

Well, that was done on purpose as one should not use these methods except when implementing a new Channel object. Now, with a second thought about this, it's not that smart as one will have to browse through the code to find these methods. I think I'll remove the underscore to make them appear in the doc, while putting the statement "This is a helper function, for internal use only." in a ..NOTE:: block to highlight this.

For the rest of your comments, thanks a lot for the advice on these random issues. I'll change the code accordingly.

David

sagetrac-git · 2015-04-22T14:18:23Z

Branch pushed to git repo; I updated commit sha1. New commits:

`934f6c5`	`Moved random_error_position to prandom`
`5ee95d7`	`Changed code of random_error_vector`
`874deac`	`AbstractChannel now inherits from SageObject. Added an alias to transmit method.`

sagetrac-git · 2015-04-22T14:18:23Z

Changed commit from 62a2447 to 874deac

videlec · 2015-04-22T14:31:33Z

comment:6

random_error_position is not an appropriate name. Perhaps range_sample or sample_range. The description is not appropriate as well. Should be something like

def sample_range(n, k):
    """
    Return a subset of [0,...,n-1] of size k with uniform probability.

    INPUT:

    - ``n``, ``k`` - integers

    OUTPUT:

    An ordered list.   

    AUTHORS:

    This function is taken from codinglib (https://bitbucket.org/jsrn/codinglib/)
    written by Johan Nielsen.
"""

In the same function you should also:

check the input (just do n = int(n); k = int(k))
rename the variable error_position into something more meaningful
And for the line sage:set_random_seed(10) there must be a space sage: set_random_seed(10).

sagetrac-dlucas · 2015-04-22T14:33:53Z

comment:7

I did some changes, as stated by commit messages above.

Some remarks:

I kept tuple_to_integer because the user can pass either an integer or an interval as number_errors parameter for the channels. If the user passes an integer, then randint won't work. Of course it is possible to do the check anytime you call number_errors (if integer, return it, else do a randint and return the result) but it's rather tedious. Especially because it's something that will be used by a some channels. I think it's easier to call a single method rather than copying the if statement each time. Maybe tuple_to_integer is not really a good name though.
random_error_vector(n, F, error_positions) returns a vector of length n over F with 0s in every position except those listed in error_positions. Because of this, calling vect = [F.random_element() for _ in range(n)] in random_error_vector does not really fit, because it will create a vector with non-controlled zero positions, so we'll need to set all positions except those into error_positions afterwards. I think create a zero vector of size n and add random values at some positions is better.
I did not change the documentation, nor the catalog stuff (see comment 4). If you think that Channels are actually more linked to codes than to anything else, I can make some changes.

David

sagetrac-git · 2015-04-22T15:04:42Z

Changed commit from 874deac to 5f31c5c

sagetrac-git · 2015-04-22T15:04:42Z

Branch pushed to git repo; I updated commit sha1. New commits:

`5f31c5c`	`Changed name of random_error_position to sample_range, updated documentation, propaged naiming change. Removed some whitespaces`

sagetrac-dlucas · 2015-04-22T15:05:21Z

New commits:

`5f31c5c`	`Changed name of random_error_position to sample_range, updated documentation, propaged naiming change. Removed some whitespaces`

videlec · 2015-04-22T15:31:49Z

comment:10

Replying to @sagetrac-dlucas:

I did some changes, as stated by commit messages above.

Some remarks:

I kept tuple_to_integer because the user can pass either an integer or an
interval as number_errors parameter for the channels. If the user passes an
integer, then randint won't work. Of course it is possible to do the check
anytime you call number_errors (if integer, return it, else do a randint
and return the result) but it's rather tedious. Especially because it's
something that will be used by a some channels. I think it's easier to call a
single method rather than copying the if statement each time. Maybe
tuple_to_integer is not really a good name though.

The name is very bad. Moreover you can convert an integer a into the tuple (a,a) in the constructor. With your way of doing it you avoid the function call to randint but with your version you always add an extra function call to tuple_to_integer.

You should replace all occurrence of hasattr(X, '__iter__') by isinstance(X, (tuple, list)). Having an __iter__ method does not charaterize tuples.

random_error_vector(n, F, error_positions) returns a vector of length n
over F with 0s in every position except those listed in
error_positions. Because of this, calling vect = [F.random_element() for _ in range(n)] in random_error_vector does not really fit, because it will
create a vector with non-controlled zero positions, so we'll need to set all
positions except those into error_positions afterwards. I think create a
zero vector of size n and add random values at some positions is better.

Yes. You are right!

I did not change the documentation, nor the catalog stuff (see comment 4).
If you think that Channels are actually more linked to codes than to anything
else, I can make some changes

You should tell me ;-) Cryptogtaphers might like it but they do use strings
not vectors in a vector space over a finite field.

When you do __call__ = transmit the documentation is copied as well. Did you try

sage: Chan = channels.StaticErrorRateChannel(GF(11), 2)
sage: Chan.__call__?

I would rather implement StaticErrorRateChannel.transmit_unsafe as

def transmit_unsafe(self, msg):
    w = msg.copy()
    V = self.input_space()
    for i in sample_range(V.dim(), randint(*self.nb_errors())):
        w[i] = V.random_element()
    return w

it would be more direct. Moreover it shows that sample_range would better be an iterator instead of returning a list. By the way, this function sample_range has currently linear complexity in n, isn't it possible to make it linear in k?

In the main documentation class, it is useful to document the main methods. In other words, when reading sage: channels.StaticErrorChannel? I should be able to discvoer the method transmit (and its alias usage with __call__).
In the very top documentation, you can have links by writing class:`AbstractChannel` instead of *AbstractChannel*.

Vincent

videlec · 2015-04-22T15:40:06Z

comment:12

I would rather implement StaticErrorRateChannel.transmit_unsafe as
def transmit_unsafe(self, msg):
    w = msg.copy()
    V = self.input_space()
    for i in sample_range(V.dim(), randint(*self.nb_errors())):
        w[i] = V.random_element()
    return w 
it would be more direct. Moreover it shows that sample_range would better be an iterator instead of returning a list. By the way, this function sample_range has currently linear complexity in n, isn't it possible to make it linear in k?

Yes it is (the solution is not exactly linear though)

sage: S = Subsets(100,4)
sage: timeit("S.random_element()")
625 loops, best of 3: 54.2 µs per loop
sage: timeit("sample_range(100,4)")
625 loops, best of 3: 57.7 µs per loop
sage: S = Subsets(1000,4)
sage: timeit("S.random_element()")
625 loops, best of 3: 61.2 µs per loop
sage: timeit("sample_range(1000,4)")
625 loops, best of 3: 510 µs per loop

It is roughly the same speed for n=100 but for n=1000 yours is twice slower.

videlec · 2015-04-22T16:17:00Z

comment:13

Replying to @videlec:

Moreover it shows that sample_range would better be an iterator instead of returning a list. By the way, this function sample_range has currently linear complexity in n, isn't it possible to make it linear in k?

Yes it is (the solution is not exactly linear though)
...
It is roughly the same speed for n=100 but for n=1000 yours is twice slower.

This code for subsets is just using

sage: sample(range(n), k)

So I guess this is currently the best way to do it. And moreover, looking at Python doc

    To choose a sample in a range of integers, use xrange as an argument.
    This is especially fast and space efficient for sampling from a
    large population:   sample(xrange(10000000), 60)

Sorry for my previous remarks. You should just get rid of sample_range(n,k) and use sample(xrange(n),k).

Vincent

sagetrac-git · 2015-04-23T11:58:04Z

Branch pushed to git repo; I updated commit sha1. New commits:

`f515e4d`	Removed `__call__` = transmit
`87667eb`	`Removed sample_range method, propaged changes and changed code for transmit_unsafe in StaticErrorRateChannel`
`f9b07be`	`Removed tuple_to_integer method`
`f93852e`	`Improved documentation`

sagetrac-git · 2015-04-23T11:58:04Z

Changed commit from 5f31c5c to f93852e

sagetrac-dlucas · 2015-04-23T12:04:34Z

comment:15

Thank you again for the advice!

The only thing I did not do is to add cross-references between channels and codes. I think it's not the best time to do it, as there is no proper Encoder/Decoder structure. But when we will propose our structure for decoding, I think it will be interesting to illustrate the error correction using channels to create errors. Before that, I let it as is.

sagetrac-git · 2015-04-25T12:12:29Z

Changed commit from 2bc0590 to 723a55c

sagetrac-git · 2015-04-25T12:12:29Z

Branch pushed to git repo; I updated commit sha1. New commits:

`723a55c`	Replaced `__repr__` by _repr_

sagetrac-dlucas · 2015-04-25T12:14:29Z

comment:22

Hello,

For Sage objects, you should use _repr_ and not __repr__.

Ok, good to know! Modification done.

videlec · 2015-04-25T13:07:08Z

comment:23

In each class/function/method the documentation should be:

one line description
then more details in one or several paragraphs if needed. It is not the case in StaticErrorRateChannel and ErrorErasureChannel.

Moreover, in a class you should not write as a description Construct a X. It is rather simply X. The reason is because it is the documentation of the class itself that you will see with:

sage: Chan = channels.ErrorErasureChannel(GF(59)^40, 2, 2)
sage: Chan?<enter>

(BTW there is a nice short cut to build vector spaces: K^n just works!)

You can simplify a bit the ErrorErasureChannel._repr_. By doing things in two parts

def format_interval(t):
    return str(t[0]) if t[0] == t[1] else 'between %s and %s' %(t[0], t[1])

def _repr_():
    return "Error-and-erasure channel creating %s errors and %s erasures' %(
              format_interval(self.number_errors()),
              format_interval(self.number_erasures()))

And if you really care about pluralization

def format_interval(t, singular, plural):
    name = singular if t[1] < 2 else plural
    if t[0] == t[1]:
        return '%s %s' %(t[0], name)
    else:
        return 'between %s and %s %s' %(t[0], t[1], name)

Such format_interval can be used in all your _repr_ methods.

You can simplify ErrorErasureChannel.transmit_unsafe. Notice that the output of sample is already randomized (you can have a look at the code source which is pure python) and it is specified in its documentation

The resulting list is in selection order so that
all sub-slices will also be valid random samples

Hence, you can simply cut your errors into two parts as

errors = sample(xrange(n), number_errors + number_erasures)
error_positions   = errors[:number_errors]
erasure_positions = errors[number_errors:]

sagetrac-git · 2015-04-27T08:58:03Z

Branch pushed to git repo; I updated commit sha1. New commits:

`23dd71c`	Simplified `_repr_` and `_latex_` methods using a new helper function
`cb7b137`	`Some changes to the documentation`
`768ed9c`	`Simplified code in ErrorErasureChannel.transmit_unsafe`

sagetrac-git · 2015-04-27T08:58:03Z

Changed commit from 723a55c to 768ed9c

videlec · 2015-04-27T09:59:32Z

comment:26

In the string representation of channels the input space/output space does not appear

sage: n_err, n_era = 21, 21
sage: Chan = channels.ErrorErasureChannel(GF(59)^50, n_err, n_era)
sage: Chan
Error-and-erasure channel creating 21 errors and 21 erasures

i.e. no mention of GF(59)^50 in the above example. Is that what you want?

For the ErrorErasureChannel the output space is CartesianProduct. For mysterious reason, it is adviced to use cartesian_product from sage.categories.cartesian_product (see Meta-ticket: Cleanup cartesian products #15425).
In ErrorErasureChannel.transmit_unsafe you would better replace

diff --git a/src/sage/coding/channel_constructions.py b/src/sage/coding/channel_constructions.py
index 0d32ffc..8935db7 100644
--- a/src/sage/coding/channel_constructions.py
+++ b/src/sage/coding/channel_constructions.py
@@ -592,4 +592,5 @@ class ErrorErasureChannel(Channel):
 
+        zero = V.base_ring().zero()
         for i in erasure_positions:
-            message[i] = 0
+            message[i] = zero
         return message, erasure_vector

it is cleaner and potentially faster (not much).

You should add examples that check that the input vector is not modified, ie

    sage: v = my_funny_vector()
    sage: Chan(v)
    ... my_funny_output ....

Check that the input ``v`` is not modified::

    sage: v
    ... should be the initial vector ...

sagetrac-git · 2015-04-27T11:05:36Z

Branch pushed to git repo; I updated commit sha1. New commits:

`c3b5555`	`Replaced CartesianProduct by cartesian_product`
`6dd51f3`	`New test in transmit method. Replaced 0 by the zero of the base ring of the input space in ErrorErasureChannel.transmit_unsafe`
`383e67c`	Changed `_repr_` and `_latex_` methods

sagetrac-git · 2015-04-27T11:05:36Z

Changed commit from 768ed9c to 383e67c

sagetrac-dlucas · 2015-04-27T11:09:05Z

comment:28

For the ErrorErasureChannel the output space is CartesianProduct. For mysterious reason, it is adviced to use cartesian_product from sage.categories.cartesian_product (see #15425).

Thanks for the information.

I did the changes (and added input and output space in the representation methods for channels).

videlec · 2015-04-28T11:14:10Z

comment:29

Salut David,

All right! This ticket is good to go.

That would be nice to start reviewing other's ticket as well. You can have a look at the section "Reviewer Checklist" in the Sage's Developer Guide. If you want to start slowly, there are some tickets that are marked "beginner". Basically: one of your ticket gets reviewed -> you review one ticket.

Best,
Vincent

sagetrac-dlucas · 2015-04-28T11:45:30Z

comment:30

Hello,

Basically: one of your ticket gets reviewed -> you review one ticket.

Ok. I'll start reviewing then :)

videlec · 2015-04-28T11:48:47Z

comment:31

Replying to @sagetrac-dlucas:

Hello,

Basically: one of your ticket gets reviewed -> you review one ticket.

Ok. I'll start reviewing then :)

Cool :-) If you have doubts, hesitations, or anything do not hesitate to post a message to sage-devel@googlegroups.com

vbraun · 2015-04-29T03:13:41Z

Changed branch from u/dlucas/channels to 383e67c

sagetrac-dlucas mannequin added this to the sage-6.7 milestone Apr 21, 2015

sagetrac-dlucas mannequin added c: coding theory labels Apr 21, 2015

sagetrac-dlucas mannequin added the s: needs review label Apr 21, 2015

videlec added s: needs work and removed s: needs review labels Apr 22, 2015

sagetrac-dlucas mannequin added s: needs info and removed s: needs work labels Apr 22, 2015

videlec added s: needs work and removed s: needs info labels Apr 22, 2015

sagetrac-dlucas mannequin removed the s: needs work label Apr 23, 2015

sagetrac-dlucas mannequin added s: needs review and removed s: needs work labels Apr 25, 2015

videlec added s: needs work and removed s: needs review labels Apr 25, 2015

sagetrac-dlucas mannequin added s: needs review and removed s: needs work labels Apr 27, 2015

videlec added s: needs work and removed s: needs review labels Apr 27, 2015

sagetrac-dlucas mannequin added s: needs review and removed s: needs work labels Apr 27, 2015

videlec added s: positive review and removed s: needs review labels Apr 28, 2015

vbraun removed the s: positive review label Apr 29, 2015

vbraun closed this as completed in 41230fd Apr 29, 2015

This was referenced Jan 21, 2017

Meta-ticket: enhancement of the coding theory part of Sage #18846

Closed

q-ary symmetric channel class for coding theory #19511

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A new structure for experimentation on decoding: communication channels #18269

A new structure for experimentation on decoding: communication channels #18269

sagetrac-dlucas mannequin commented Apr 21, 2015

sagetrac-dlucas mannequin commented Apr 21, 2015

sagetrac-dlucas mannequin commented Apr 21, 2015

sagetrac-dlucas mannequin commented Apr 21, 2015

videlec commented Apr 22, 2015

sagetrac-dlucas mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

sagetrac-dlucas mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

sagetrac-dlucas mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

videlec commented Apr 22, 2015

videlec commented Apr 22, 2015

sagetrac-git mannequin commented Apr 23, 2015

sagetrac-git mannequin commented Apr 23, 2015

sagetrac-dlucas mannequin commented Apr 23, 2015

sagetrac-git mannequin commented Apr 25, 2015

sagetrac-git mannequin commented Apr 25, 2015

sagetrac-dlucas mannequin commented Apr 25, 2015

videlec commented Apr 25, 2015

sagetrac-git mannequin commented Apr 27, 2015

sagetrac-git mannequin commented Apr 27, 2015

videlec commented Apr 27, 2015

sagetrac-git mannequin commented Apr 27, 2015

sagetrac-git mannequin commented Apr 27, 2015

sagetrac-dlucas mannequin commented Apr 27, 2015

videlec commented Apr 28, 2015

sagetrac-dlucas mannequin commented Apr 28, 2015

videlec commented Apr 28, 2015

vbraun commented Apr 29, 2015

A new structure for experimentation on decoding: communication channels #18269

A new structure for experimentation on decoding: communication channels #18269

Comments

sagetrac-dlucas mannequin commented Apr 21, 2015

sagetrac-dlucas mannequin commented Apr 21, 2015

sagetrac-dlucas mannequin commented Apr 21, 2015

sagetrac-dlucas mannequin commented Apr 21, 2015

videlec commented Apr 22, 2015

sagetrac-dlucas mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

sagetrac-dlucas mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

sagetrac-git mannequin commented Apr 22, 2015

sagetrac-dlucas mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

videlec commented Apr 22, 2015

videlec commented Apr 22, 2015

sagetrac-git mannequin commented Apr 23, 2015

sagetrac-git mannequin commented Apr 23, 2015

sagetrac-dlucas mannequin commented Apr 23, 2015

sagetrac-git mannequin commented Apr 25, 2015

sagetrac-git mannequin commented Apr 25, 2015

sagetrac-dlucas mannequin commented Apr 25, 2015

videlec commented Apr 25, 2015

sagetrac-git mannequin commented Apr 27, 2015

sagetrac-git mannequin commented Apr 27, 2015

videlec commented Apr 27, 2015

sagetrac-git mannequin commented Apr 27, 2015

sagetrac-git mannequin commented Apr 27, 2015

sagetrac-dlucas mannequin commented Apr 27, 2015

videlec commented Apr 28, 2015

sagetrac-dlucas mannequin commented Apr 28, 2015

videlec commented Apr 28, 2015

vbraun commented Apr 29, 2015