
Develop #626

Merged
merged 16 commits on Jun 5, 2018
6 changes: 3 additions & 3 deletions text/en/chapters/coding-compression.md
@@ -254,8 +254,8 @@ The purpose of Huffman coding is to take a set of "symbols" (which could be char
It's normally presented as a way of compressing textual documents, and while it can do that reasonably well, it works much better in combination with Ziv-Lempel coding (see below).

But let's start with a very simple textual example.
This example language uses only 4 different characters, and yet is incredibly important to us: it's the language used to represent DNA, which is made up of sequences of four characters A, C, G and T).
For example, the 4.6 million characters representing an E.coli DNA sequence happen to start with:
This example language uses only 4 different characters, and yet is incredibly important to us: it's the language used to represent DNA, which is made up of sequences of four characters A, C, G and T.
For example, the 4.6 million characters representing an *E.coli* DNA sequence happen to start with:

```
agcttttcattct
@@ -353,7 +353,7 @@ digraph G {
}
{comment end}

To decode something using this structure (e.g. the code 0100110011110001011001 above), start at the top, and choose a branch based on each successive bit in the coded file. The first bit is a 0, so we follow the left branch, then the 1 branch, then the 0 branch, which leads us to the letter a.
To decode something using this structure (e.g. the code 0100110011110001011001 above), start at the top, and choose a branch based on each successive bit in the coded file. The first bit is a 0, so we follow the left branch, then the 1 branch, then the 0 branch, which leads us to the letter "a".
After each letter is decoded, we start again at the top.
The next few bits are 011..., and following these labels from the start takes us to "g", and so on.
The tree makes it very easy to decode any input, and there's never any confusion about which branch to follow, and therefore which letter to decode each time.
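The decoding walk described above can be sketched in Python. The codewords for "a" (010) and "g" (011) come from the text; the codewords assumed here for "c" (00) and "t" (1) are inferred so that the 22-bit example decodes cleanly, and may differ from the tree in the figure:

```python
# Prefix-code table: "a" and "g" are given in the text; "c" and "t"
# are inferred so the 22-bit example decodes with no bits left over.
CODES = {"010": "a", "011": "g", "00": "c", "1": "t"}

def huffman_decode(bits, codes):
    """Read bits left to right, emitting a symbol whenever the
    accumulated prefix matches a complete codeword, then restarting
    at the top of the tree (i.e. with an empty prefix)."""
    out, prefix = [], ""
    for bit in bits:
        prefix += bit
        if prefix in codes:        # reached a leaf: emit and restart
            out.append(codes[prefix])
            prefix = ""
    if prefix:
        raise ValueError("leftover bits: " + prefix)
    return "".join(out)

print(huffman_decode("0100110011110001011001", CODES))  # → agcttttcattct
```

Because the code is prefix-free, no codeword is the start of another, so the decoder never has to backtrack — exactly the "no confusion" property the tree guarantees. Note that the decoded output is the start of the E.coli sequence shown earlier.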
4 changes: 2 additions & 2 deletions text/en/chapters/data-representation.md
@@ -426,7 +426,7 @@ It is really useful to know roughly how many bits you will need to represent a c
{panel type="spoiler" summary="Answers for above challenge"}
1. b (actually, 3 bits is enough as it gives 8 values, but amounts that fit evenly into 8-bit bytes are easier to work with)
2. c (32 bits is slightly too small, so you will need 64 bits)
3. c (This is a challenging question, but one a database designer would have to think about. There's about 94,000 km of roads in NZ, so if the average length of a road was 1km, there would be too many roads for 16 bits. Either way, 32 bits would be a safe bet.)
3. b (This is a challenging question, but one a database designer would have to think about. There's about 94,000 km of roads in NZ, so if the average length of a road was 1km, there would be too many roads for 16 bits. Either way, 32 bits would be a safe bet.)
4. d (Even 64 bits is not enough, but 128 bits is plenty! Remember that 128 bits isn't twice the range of 64 bits.)
{panel end}
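The reasoning behind these answers — find the smallest number of bits whose range covers the number of values — can be sketched with a small helper (the function name is ours, not from the text):

```python
import math

def bits_needed(n_values):
    """Minimum number of bits that gives each of n_values items
    a distinct code: the smallest b with 2**b >= n_values."""
    return max(1, math.ceil(math.log2(n_values)))

print(bits_needed(8))              # → 3  (8 values fit exactly into 3 bits)
print(bits_needed(94_000))         # → 17 (so 16 bits is too few; use 32)
print(bits_needed(5_000_000_000))  # → 33 (so 32 bits is too few; use 64)
```

In practice the result is usually rounded up to a standard size (8, 16, 32 or 64 bits), which is why the answers above pick the next whole byte-multiple rather than the exact bit count.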

@@ -878,7 +878,7 @@ The character **$** in UTF-32 would be:
00000000 00000000 00000000 00100100
```

And the character **犬** in UTF-32 would be:
And the character **犬** (dog in Chinese) in UTF-32 would be:
```
00000000 00000000 01110010 10101100
```
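Both bit patterns can be reproduced with Python's built-in `utf-32-be` codec (big-endian, so the bytes print in the same order as shown above); the formatting loop is just for illustration:

```python
# UTF-32 stores every character as a fixed-width 4-byte code point.
for ch in "$犬":
    encoded = ch.encode("utf-32-be")   # big-endian: most significant byte first
    print(ch, " ".join(f"{byte:08b}" for byte in encoded))
# $ 00000000 00000000 00000000 00100100
# 犬 00000000 00000000 01110010 10101100
```

The four bytes are simply the character's Unicode code point ($ is U+0024, 犬 is U+72AC) written out in binary, which is what makes UTF-32 easy to index but wasteful in space.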