Cryptography has always been about message confidentiality, i.e., making sure
that a message can only be understood by the sender and its intended
recipient. It typically involves converting a message into an incomprehensible
format, with the recipient then reversing that process. A good
cryptography system should leave a potential eavesdropper with a sense of
helplessness, i.e., they should feel like they don’t have the time or energy to figure
out what the message is. Of course, the standard for “time or energy” has
changed drastically with computers: what once seemed like a lot of time or
energy no longer does if you have a computer and know how to write a
few scripts. However, computers also allow us to create even more complicated
encryption systems. Another thing to consider is that, even with computers,
there are categories of problems that are difficult for a computer
to solve. Recall how long it took to solve a Towers of Hanoi problem (from CSC
220 and CSC/CYEN 131) even though the algorithm was very simple to
write and understand. A good cryptography system might be easy to understand
(i.e., how it works), but should still leave us (and our computers) feeling like there
isn’t enough time or RAM to figure out the original message. \
\
Definitions: 
<ul>
  <li>cryptology: the study of enciphering and deciphering (cryptography and cryptanalysis together)</li>
  <li>cryptography: making a cipher system</li>
  <li>cryptanalysis: breaking a cipher system</li>
  <li>encryption: scrambling a message</li>
  <li>decryption: unscrambling a message</li>
  <li>plain text: original “readable” message</li>
  <li>cipher text: encrypted “unreadable” message</li>
  <li>cipher: algorithm for performing encryption or decryption</li>
</ul>  

##### **<ins>History:</ins>**

Classic cryptography used pen, paper, and perhaps simple mechanical aids. As we discuss each cipher, try to figure out how we would break it and how complicated that might be given current technology, i.e., current definitions of “time and energy”. The highlighted encrypted messages are for you to try to decrypt.
- Hieroglyph: 3200BC to about 400AD. Utilized images or symbols to represent messages that could only be interpreted by those who knew what the symbols meant. Of course this is no different from writing in a language. The more people know that language, the less effective it is at keeping messages secret.


- Atbash cipher: 500BC to 1300AD. It's a type of substitution cipher where the order of the alphabet is reversed, i.e., A becomes Z, B becomes Y, etc. Examples of it can actually be found in the Bible [^1].
<p style="text-align:center;"><span style="background-color: #FFFF00">rhm'g xibkgltizksb ufm?</span></p>

[^1]: (https://www.theology.ox.ac.uk/article/crack-the-code)

In [None]:
$ echo {z..a} | tr -d ' ' # to create the character set z through a.
$ echo "abcde" | tr a-z "zyxwvutsrqponmlkjihgfedcba"
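The same reversal can be sketched in Python. This is a minimal illustration, not part of the original notes; the function name `atbash` is an assumption, and non-letter characters are simply passed through unchanged.

```python
import string

# Map a..z onto z..a and A..Z onto Z..A; every other character is left alone.
ATBASH_TABLE = str.maketrans(
    string.ascii_lowercase + string.ascii_uppercase,
    string.ascii_lowercase[::-1] + string.ascii_uppercase[::-1],
)

def atbash(text: str) -> str:
    """Apply the Atbash substitution (hypothetical helper name)."""
    return text.translate(ATBASH_TABLE)

print(atbash("abcde"))          # zyxwv
print(atbash(atbash("abcde")))  # abcde (applying it twice undoes it)
```

Because the reversed alphabet is its own mirror image, the same function both encrypts and decrypts.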

- Scytale cipher: around the 7th century BC. Messages were written on a strip of parchment wrapped around a rod. When the strip was unwrapped, the message was unreadable unless it was re-wrapped around a rod of similar dimensions.
<p style="text-align:center;"><span style="background-color: #FFFF00">Sdeeeiootyrdnnmhdbwgzeareatostaithoattncimihhbhs</span></p>
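One way to model the rod is as a grid: the message is written along the rod in rows, and the unwrapped strip reads the grid column by column. The sketch below assumes that model; the function names, the `rows` parameter (how many letters fit around the rod), and the `_` padding character are all illustrative assumptions rather than anything specified in the notes.

```python
def scytale_encrypt(plain: str, rows: int) -> str:
    """Write the text in `rows` rows, then read it off column by column."""
    cols = -(-len(plain) // rows)           # ceiling division: letters per row
    padded = plain.ljust(rows * cols, "_")  # fill the grid (assumed padding char)
    grid = [padded[r * cols:(r + 1) * cols] for r in range(rows)]
    return "".join(grid[r][c] for c in range(cols) for r in range(rows))

def scytale_decrypt(cipher: str, rows: int) -> str:
    """Re-wrap the strip: every `rows`-th letter belongs to the same row."""
    return "".join(cipher[r::rows] for r in range(rows))

secret = scytale_encrypt("HELPMEIAMUNDERATTACK", 4)
print(secret)                      # HENTEIDTLAEAPMRCMUAK
print(scytale_decrypt(secret, 4))  # HELPMEIAMUNDERATTACK
```

Without knowing the rod, an attacker can simply try every plausible value of `rows`, which is why this cipher offers little protection today.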

- Caesar cipher: Also a substitution cipher, named after the first famous person to use this technique. It involves replacing each character with another character from the same character set, found by shifting the alphabet over by a specific number of positions. Caesar replaced A with D, B with E, C with F, …, X with A, Y with B, and Z with C. Because the alphabet is rotated by 3 positions, this specific version can be called a rot-3 cipher. One of the most common Caesar ciphers is rot-13: because 13 is half of 26, applying it twice returns the original text, i.e., it is its own inverse.
<p style="text-align:center;"><span style="background-color: #FFFF00">diwtgh hpxs: durdjght xi xh</span></p>

In [None]:
$ echo "abcde" | tr a-z d-za-c #rot-3
$ echo "abcde" | tr a-z n-za-m #rot-13
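A rough Python equivalent of the `tr` commands above, generalized to any shift. The function name `rot` is an assumption made for this sketch; non-letters are left untouched.

```python
import string

def rot(text: str, shift: int) -> str:
    """Rotate every letter by `shift` positions (hypothetical helper name)."""
    k = shift % 26
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)

print(rot("abcde", 3))             # defgh
print(rot("xyz", 3))               # abc (wraps around the alphabet)
print(rot(rot("abcde", 13), 13))   # abcde (rot-13 is its own inverse)
```

Decrypting a rot-n message is just `rot(cipher, -n)`, so trying all 25 non-trivial shifts takes a single loop.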

- Sliding shift cipher: The issue with the Caesar cipher is that it is easy to break these days: you only need to look at 25 different possibilities to decipher it visually. A sliding shift cipher attempts to make this more difficult by employing a different shift for each position in the plain text string. For example, the first letter is encoded using rot-1, the second letter using rot-2, the third letter using rot-3, etc. One doesn’t have to start at 1, but the increment is typically 1. However, if I know that the cipher is a sliding shift cipher, breaking it is not that hard.
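The sliding shift can be sketched as follows. This is an illustration under assumptions: the names `shift_letter` and `sliding_shift` are made up here, and the position counter advances for every character (including spaces), which is one reading of "each position in the plain text string".

```python
def shift_letter(ch: str, shift: int) -> str:
    """Rotate one ASCII letter by `shift`; return other characters unchanged."""
    if "a" <= ch <= "z":
        base = ord("a")
    elif "A" <= ch <= "Z":
        base = ord("A")
    else:
        return ch
    return chr((ord(ch) - base + shift) % 26 + base)

def sliding_shift(text: str, start: int = 1, step: int = 1, decrypt: bool = False) -> str:
    """Shift position i by start + i*step (negated when decrypting)."""
    sign = -1 if decrypt else 1
    return "".join(
        shift_letter(ch, sign * (start + i * step)) for i, ch in enumerate(text)
    )

secret = sliding_shift("aaaa")
print(secret)                               # bcde
print(sliding_shift(secret, decrypt=True))  # aaaa
```

An attacker who knows the scheme only has to guess `start` and `step`, so the search space stays small.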

- Vigenere cipher: 1553-1863. Each letter of the plain text is encoded by rotating the alphabet based on a key or passphrase. For example, if the key was “key”, then the first letter would be taken from a rot-10 alphabet, i.e., where a became k, b became l, c became m, etc. The second letter would be taken from a rot-4 alphabet, i.e., where a became e, b became f, c became g, etc. The key is repeated as many times as necessary to encode the entire plain text. A cipher disk or Vigenere table (such as the one shown below) can be used for manual encryption/decryption. For example, the plain text “how does this work” can be encrypted using the key “vigenere”.

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:18px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:18px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-baqh{text-align:center;vertical-align:top}
</style>
<table class="tg" align="center" style="margin: 0px auto;"><thead>
  <tr>
    <th class="tg-baqh">Plain text</th>
    <th class="tg-baqh">H </th>
    <th class="tg-baqh">O </th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">W</span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">D </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">O </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">E </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">S </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">T </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">H</span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">I </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">S </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">W </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">O </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">R </span></th>
    <th class="tg-baqh"><span style="font-weight:400;font-style:normal">K</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-baqh">key</td>
    <td class="tg-baqh">V</td>
    <td class="tg-baqh">I</td>
    <td class="tg-baqh">G</td>
    <td class="tg-baqh">E</td>
    <td class="tg-baqh">N</td>
    <td class="tg-baqh">E</td>
    <td class="tg-baqh">R</td>
    <td class="tg-baqh">E</td>
    <td class="tg-baqh">V</td>
    <td class="tg-baqh">I</td>
    <td class="tg-baqh">G</td>
    <td class="tg-baqh">E</td>
    <td class="tg-baqh">N</td>
    <td class="tg-baqh">E</td>
    <td class="tg-baqh">R</td>
  </tr>
  <tr>
    <td class="tg-baqh">Cipher text</td>
    <td class="tg-baqh">C</td>
    <td class="tg-baqh">W</td>
    <td class="tg-baqh">C</td>
    <td class="tg-baqh">H</td>
    <td class="tg-baqh">B</td>
    <td class="tg-baqh">I</td>
    <td class="tg-baqh">J</td>
    <td class="tg-baqh">X</td>
    <td class="tg-baqh">C</td>
    <td class="tg-baqh">Q</td>
    <td class="tg-baqh">Y</td>
    <td class="tg-baqh">A</td>
    <td class="tg-baqh">B</td>
    <td class="tg-baqh">V</td>
    <td class="tg-baqh">B</td>
  </tr>
</tbody></table>

![](Plain-key.png)

In the table above, the column headers are the plain text characters, the row headers are the key characters, and the cell where a column and row intersect is the cipher text character. To <span style="background-color: #ff9e00">encrypt</span> the plain text character l with the key character n, find where column l and row n intersect, which is y. \
\
To <span style="background-color: #fff100">decrypt</span>, identify the cipher character on the appropriate key row, and then match it to the corresponding column header. For example, the cipher character p with the key character x becomes the plain text character s.\
\
Why do this manually when we have computers? \
\
To take advantage of the computational power now available to us, it is significantly easier and faster to express the cipher as arithmetic. 


<style type="text/css">
.tg  {border-collapse:collapse;border-color:#93a1a1;border-spacing:0;}
.tg td{background-color:#fdf6e3;border-color:#93a1a1;border-style:solid;border-width:1px;color:#002b36;
  font-family:Arial, sans-serif;font-size:14px;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{background-color:#657b83;border-color:#93a1a1;border-style:solid;border-width:1px;color:#fdf6e3;
  font-family:Arial, sans-serif;font-size:14px;font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-m6gx{border-color:inherit;font-family:Verdana, Geneva, sans-serif !important;font-size:18px;text-align:center;
  vertical-align:top}
.tg .tg-hvap{border-color:inherit;font-family:Verdana, Geneva, sans-serif !important;font-size:18px;text-align:left;
  vertical-align:top}
</style>
<table class="tg" align="center" style="margin: 0px auto;"><thead>
  <tr>
    <th class="tg-m6gx">A</th>
    <th class="tg-m6gx">B</th>
    <th class="tg-m6gx">C</th>
    <th class="tg-m6gx">D</th>
    <th class="tg-m6gx">E</th>
    <th class="tg-m6gx">F</th>
    <th class="tg-m6gx">G</th>
    <th class="tg-m6gx">H</th>
    <th class="tg-m6gx">I</th>
    <th class="tg-m6gx">J</th>
    <th class="tg-m6gx">K</th>
    <th class="tg-m6gx">L</th>
    <th class="tg-m6gx">M</th>
    <th class="tg-m6gx">N</th>
    <th class="tg-m6gx">O</th>
    <th class="tg-m6gx">P</th>
    <th class="tg-hvap">Q</th>
    <th class="tg-hvap">R</th>
    <th class="tg-hvap">S</th>
    <th class="tg-hvap">T</th>
    <th class="tg-hvap">U</th>
    <th class="tg-hvap">V</th>
    <th class="tg-hvap">W</th>
    <th class="tg-hvap">X</th>
    <th class="tg-hvap">Y</th>
    <th class="tg-hvap">Z</th>
  </tr></thead>
<tbody>
  <tr>
    <td class="tg-m6gx">0</td>
    <td class="tg-m6gx">1</td>
    <td class="tg-m6gx">2</td>
    <td class="tg-m6gx">3</td>
    <td class="tg-m6gx">4</td>
    <td class="tg-m6gx">5</td>
    <td class="tg-m6gx">6</td>
    <td class="tg-m6gx">7</td>
    <td class="tg-m6gx">8</td>
    <td class="tg-m6gx">9</td>
    <td class="tg-m6gx">10</td>
    <td class="tg-m6gx">11</td>
    <td class="tg-m6gx">12</td>
    <td class="tg-m6gx">13</td>
    <td class="tg-m6gx">14</td>
    <td class="tg-m6gx">15</td>
    <td class="tg-hvap">16</td>
    <td class="tg-hvap">17</td>
    <td class="tg-hvap">18</td>
    <td class="tg-hvap">19</td>
    <td class="tg-hvap">20</td>
    <td class="tg-hvap">21</td>
    <td class="tg-hvap">22</td>
    <td class="tg-hvap">23</td>
    <td class="tg-hvap">24</td>
    <td class="tg-hvap">25</td>
  </tr>
</tbody></table>

If we map each character of the plain text and the key to a number (e.g. the number shown in the table above), then the rotation (or lookup in the Vigenere square) is essentially the same as adding those two numbers and looking up the corresponding letter.\
\
For example we see that plain text character l has a value of 11 and key character n has a value of 13. The sum of those two numbers is 24 which maps to y. This is an identical result to what we saw with the vigenere square. If the sum of the two numbers is more than 25, then we wrap around and continue from the beginning i.e. % 26.\
\
Decrypting is just a matter of doing the reverse i.e. subtract the numerical value of the key character from the numerical value of the cipher text character to get the numerical value of the plain text character. Make sure to wrap around if this difference is negative.\
\
More formally, if P is the plain text, K the key, C the cipher text, and the subscript i denotes the character in the ith position of any of those three, then:

$C_i=(P_i+K_i)\%26$ \
\
To decrypt: \
\
$P_i=(26+C_i-K_i)\%26$ \
\
The 26 is added just to make the wrap-around arithmetic consistent across programming languages, since some languages return a negative result for the modulo of a negative number. \
\
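As a quick check of the two formulas, here is a minimal per-character sketch for lowercase letters only. The helper names `encrypt_char` and `decrypt_char` are assumptions for illustration; Python's `%` already returns a non-negative result for a positive modulus, but the `26 +` is kept so the code mirrors the formula.

```python
A = ord("a")  # code point used as the zero of the 0-25 numbering

def encrypt_char(p: str, k: str) -> str:
    """C_i = (P_i + K_i) % 26, for single lowercase letters."""
    return chr(((ord(p) - A) + (ord(k) - A)) % 26 + A)

def decrypt_char(c: str, k: str) -> str:
    """P_i = (26 + C_i - K_i) % 26, for single lowercase letters."""
    return chr((26 + (ord(c) - A) - (ord(k) - A)) % 26 + A)

print(encrypt_char("l", "n"))  # y  (11 + 13 = 24)
print(decrypt_char("p", "x"))  # s  (26 + 15 - 23 = 18)
```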
As an example, if our text is “cyberstorm is going to be bussin” and our key is “seacreatures”, then:


\begin{aligned}
P_0 = c,\ K_0 = s &\rightarrow C_0 = (2+18)\%26 = 20 = u \\
P_1 = y,\ K_1 = e &\rightarrow C_1 = (24+4)\%26 = 2 = c \\
P_2 = b,\ K_2 = a &\rightarrow C_2 = (1+0)\%26 = 1 = b \\
P_3 = e,\ K_3 = c &\rightarrow C_3 = (4+2)\%26 = 6 = g \\
P_4 = r,\ K_4 = r &\rightarrow C_4 = (17+17)\%26 = 8 = i \\
P_5 = s,\ K_5 = e &\rightarrow C_5 = (18+4)\%26 = 22 = w \\
P_6 = t,\ K_6 = a &\rightarrow C_6 = (19+0)\%26 = 19 = t \\
&\ \ \vdots
\end{aligned}
\
Cipher text = **ucbgiwthld mk ysipx xo uy sykkmn** \
\
To decrypt “qis afyr ilfjwkwot lwew vlwkar. Hg'j ixmlr. Rg uep.” using the same key:

\begin{aligned}
C_0 = q,\ K_0 = s &\rightarrow P_0 = (26+16-18)\%26 = 24 = y \\
C_1 = i,\ K_1 = e &\rightarrow P_1 = (26+8-4)\%26 = 4 = e \\
C_2 = s,\ K_2 = a &\rightarrow P_2 = (26+18-0)\%26 = 18 = s \\
C_3 = a,\ K_3 = c &\rightarrow P_3 = (26+0-2)\%26 = 24 = y \\
C_4 = f,\ K_4 = r &\rightarrow P_4 = (26+5-17)\%26 = 14 = o \\
C_5 = y,\ K_5 = e &\rightarrow P_5 = (26+24-4)\%26 = 20 = u \\
C_6 = r,\ K_6 = a &\rightarrow P_6 = (26+17-0)\%26 = 17 = r \\
&\ \ \vdots
\end{aligned}


Plain text = **yes your …** \
\
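The worked examples above can be reproduced with a short whole-message sketch. The function name `vigenere` is an assumption, as are two behaviors inferred from the examples: spaces and punctuation are copied through without consuming a key letter, and letter case is preserved. The key is assumed to contain only letters.

```python
from itertools import cycle

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Encrypt (or decrypt) `text` with a repeating alphabetic `key`."""
    sign = -1 if decrypt else 1
    shifts = cycle(ord(k) - ord("a") for k in key.lower())
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            out.append(ch)  # non-letters do not use up a key character
            continue
        # Adding 26 keeps the value non-negative before the modulo.
        out.append(chr((ord(ch) - base + sign * next(shifts) + 26) % 26 + base))
    return "".join(out)

print(vigenere("cyberstorm is going to be bussin", "seacreatures"))
# ucbgiwthld mk ysipx xo uy sykkmn
print(vigenere("how does this work", "vigenere"))
# cwc hbij xcqy abvb
```

Calling `vigenere(cipher_text, "seacreatures", decrypt=True)` on the second message finishes the decryption started above.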
Since many programming languages actually store characters as integers, getting the numerical value of the character from the table above is often as simple as subtracting the integer value of ‘A’ or ‘a’.\
\
In C++, for example, ‘P’ - ‘A’ = 80 - 65 = 15. 
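Python does not let you subtract characters directly, but the built-in `ord` and `chr` give the same effect. A small sketch, with `letter_index` and `index_letter` as hypothetical helper names:

```python
def letter_index(ch: str) -> int:
    """Return the 0-25 table value of an ASCII letter."""
    base = ord("A") if ch.isupper() else ord("a")
    return ord(ch) - base

def index_letter(i: int, upper: bool = False) -> str:
    """Return the letter for a 0-25 table value (wrapping with % 26)."""
    return chr(i % 26 + (ord("A") if upper else ord("a")))

print(ord("P"), ord("A"), letter_index("P"))  # 80 65 15
print(index_letter(24))                       # y
```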

Historically speaking, cryptography was almost always concerned with hiding written language text. Computers have changed the game in three ways:
- They do things faster and so allow us to decrypt cipher text messages faster
- They also allow us to come up with more complex ciphers i.e. ciphers that would be too complicated to carry out by hand in a realistic time frame
- They allow us to encrypt any kind of data that can be represented by 1s and 0s. ANYTHING. Not just text but any kind of digital file. One could argue this is the main change that computers brought to the cryptography field.

##### **<ins>Encoding.</ins>**
Cryptography is not only about hiding information. Sometimes it's just about representing information in a manner that is convenient to work with. For example, since computers only deal with 1s and 0s, we have encoding schemes to represent information like characters in 1s and 0s. \
The most common encoding format is ASCII (American Standard Code for Information Interchange), which is based on the English alphabet. Its original format only required 7 bits to represent everything it needed to, i.e., digits, lower case and upper case letters, punctuation, and control characters (which are non-printable). \

Later, as 8-bit storage became the norm, ASCII was extended to 8 bits, which doubled the number of available encodings from 128 to 256. \
\
Common conversions are \
A = 65 = 01000001, B = 66 = 01000010, a = 97 = 01100001, b = 98 = 01100010 \
\
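These conversions are easy to confirm in Python with the built-ins `ord` and `format`. The helper name `to_bits` is an assumption for this sketch.

```python
def to_bits(ch: str) -> str:
    """Return the 8-bit binary string for a single ASCII character."""
    return format(ord(ch), "08b")

for ch in "ABab":
    print(ch, ord(ch), to_bits(ch))
# A 65 01000001
# B 66 01000010
# a 97 01100001
# b 98 01100010
```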
Another common encoding scheme is base-64. This particular scheme was born from a need to send any kind of binary information (such as attachments in an email) along a channel that only carried text. One way to identify this scheme is that the output is made up entirely of letters, digits, and two symbols, + and /. Some files will also end with one or two = signs. \
\
The process of converting any binary stream to base-64 is similar to the conversion from binary to hexadecimal you covered in CSC/CYEN 131 i.e. group the bits, and then look up the mapping for that group (or the mapping for the numerical value of that group). It should not come as a surprise that for base 64, we shall be looking at groupings of 6 bits. $(2^6 = 64)$

<style type="text/css">
.tg  {border-collapse:collapse;border-color:#93a1a1;border-spacing:0;}
.tg td{background-color:#fdf6e3;border-color:#93a1a1;border-style:solid;border-width:1px;color:#002b36;
  font-family:Arial, sans-serif;font-size:14px;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{background-color:#657b83;border-color:#93a1a1;border-style:solid;border-width:1px;color:#fdf6e3;
  font-family:Arial, sans-serif;font-size:14px;font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-8we4{background-color:#c0c0c0;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-do2v{background-color:#ffffff;border-color:#333333;font-family:"Courier New", Courier, monospace !important;font-size:18px;
  text-align:center;vertical-align:bottom}
.tg .tg-zf7o{background-color:#9b9b9b;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-dpog{background-color:#c0c0c0;border-color:#333333;font-family:"Courier New", Courier, monospace !important;font-size:18px;
  text-align:center;vertical-align:bottom}
.tg .tg-317h{background-color:#ffffff;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
</style>
<table class="tg" align="center" style="margin: 0px auto;"><thead>
  <tr>
    <th class="tg-zf7o">Value </th>
    <th class="tg-zf7o">Char </th>
    <th class="tg-zf7o">Value </th>
    <th class="tg-zf7o">Char </th>
    <th class="tg-zf7o">Value </th>
    <th class="tg-zf7o">Char </th>
    <th class="tg-zf7o">Value </th>
    <th class="tg-zf7o">Char </th>
  </tr></thead>
<tbody>
  <tr>
    <td class="tg-8we4">0 </td>
    <td class="tg-8we4">A </td>
    <td class="tg-317h">16 </td>
    <td class="tg-317h">Q </td>
    <td class="tg-8we4">32 </td>
    <td class="tg-8we4">g </td>
    <td class="tg-317h">48 </td>
    <td class="tg-317h">w </td>
  </tr>
  <tr>
    <td class="tg-8we4">1 </td>
    <td class="tg-8we4">B </td>
    <td class="tg-317h">17 </td>
    <td class="tg-317h">R </td>
    <td class="tg-8we4">33 </td>
    <td class="tg-8we4">h </td>
    <td class="tg-317h">49 </td>
    <td class="tg-317h">x </td>
  </tr>
  <tr>
    <td class="tg-8we4">2 </td>
    <td class="tg-8we4">C </td>
    <td class="tg-317h">18 </td>
    <td class="tg-317h">S </td>
    <td class="tg-8we4">34 </td>
    <td class="tg-8we4">i </td>
    <td class="tg-317h">50 </td>
    <td class="tg-317h">y </td>
  </tr>
  <tr>
    <td class="tg-8we4">3 </td>
    <td class="tg-8we4">D </td>
    <td class="tg-317h">19 </td>
    <td class="tg-317h">T </td>
    <td class="tg-8we4">35 </td>
    <td class="tg-8we4">j </td>
    <td class="tg-317h">51 </td>
    <td class="tg-317h">z </td>
  </tr>
  <tr>
    <td class="tg-8we4">4 </td>
    <td class="tg-8we4">E </td>
    <td class="tg-317h">20 </td>
    <td class="tg-317h">U </td>
    <td class="tg-8we4">36 </td>
    <td class="tg-8we4">k </td>
    <td class="tg-317h">52 </td>
    <td class="tg-317h">0 </td>
  </tr>
  <tr>
    <td class="tg-8we4">5 </td>
    <td class="tg-8we4">F </td>
    <td class="tg-317h">21 </td>
    <td class="tg-317h">V </td>
    <td class="tg-8we4">37 </td>
    <td class="tg-8we4">l </td>
    <td class="tg-317h">53 </td>
    <td class="tg-317h">1 </td>
  </tr>
  <tr>
    <td class="tg-8we4">6 </td>
    <td class="tg-8we4">G </td>
    <td class="tg-317h">22 </td>
    <td class="tg-317h">W </td>
    <td class="tg-8we4">38 </td>
    <td class="tg-8we4">m </td>
    <td class="tg-317h">54 </td>
    <td class="tg-317h">2 </td>
  </tr>
  <tr>
    <td class="tg-8we4">7 </td>
    <td class="tg-8we4">H </td>
    <td class="tg-317h">23 </td>
    <td class="tg-317h">X </td>
    <td class="tg-8we4">39 </td>
    <td class="tg-8we4">n </td>
    <td class="tg-317h">55 </td>
    <td class="tg-317h">3 </td>
  </tr>
  <tr>
    <td class="tg-dpog">8 </td>
    <td class="tg-dpog">I </td>
    <td class="tg-do2v">24 </td>
    <td class="tg-do2v">Y </td>
    <td class="tg-dpog">40 </td>
    <td class="tg-dpog">o </td>
    <td class="tg-do2v">56 </td>
    <td class="tg-do2v">4 </td>
  </tr>
  <tr>
    <td class="tg-dpog">9 </td>
    <td class="tg-dpog">J </td>
    <td class="tg-do2v">25 </td>
    <td class="tg-do2v">Z </td>
    <td class="tg-dpog">41 </td>
    <td class="tg-dpog">p </td>
    <td class="tg-do2v">57 </td>
    <td class="tg-do2v">5 </td>
  </tr>
  <tr>
    <td class="tg-dpog">10 </td>
    <td class="tg-dpog">K </td>
    <td class="tg-do2v">26 </td>
    <td class="tg-do2v">a </td>
    <td class="tg-dpog">42 </td>
    <td class="tg-dpog">q </td>
    <td class="tg-do2v">58 </td>
    <td class="tg-do2v">6 </td>
  </tr>
  <tr>
    <td class="tg-dpog">11 </td>
    <td class="tg-dpog">L </td>
    <td class="tg-do2v">27 </td>
    <td class="tg-do2v">b </td>
    <td class="tg-dpog">43 </td>
    <td class="tg-dpog">r </td>
    <td class="tg-do2v">59 </td>
    <td class="tg-do2v">7 </td>
  </tr>
  <tr>
    <td class="tg-dpog">12 </td>
    <td class="tg-dpog">M </td>
    <td class="tg-do2v">28 </td>
    <td class="tg-do2v">c </td>
    <td class="tg-dpog">44 </td>
    <td class="tg-dpog">s </td>
    <td class="tg-do2v">60 </td>
    <td class="tg-do2v">8 </td>
  </tr>
  <tr>
    <td class="tg-dpog">13 </td>
    <td class="tg-dpog">N </td>
    <td class="tg-do2v">29 </td>
    <td class="tg-do2v">d </td>
    <td class="tg-dpog">45 </td>
    <td class="tg-dpog">t </td>
    <td class="tg-do2v">61 </td>
    <td class="tg-do2v">9 </td>
  </tr>
  <tr>
    <td class="tg-dpog">14 </td>
    <td class="tg-dpog">O </td>
    <td class="tg-do2v">30 </td>
    <td class="tg-do2v">e </td>
    <td class="tg-dpog">46 </td>
    <td class="tg-dpog">u </td>
    <td class="tg-do2v">62 </td>
    <td class="tg-do2v">+ </td>
  </tr>
  <tr>
    <td class="tg-dpog">15 </td>
    <td class="tg-dpog">P </td>
    <td class="tg-do2v">31 </td>
    <td class="tg-do2v">f </td>
    <td class="tg-dpog">47 </td>
    <td class="tg-dpog">v </td>
    <td class="tg-do2v">63 </td>
    <td class="tg-do2v">/ </td>
  </tr>
</tbody></table>

To demonstrate this conversion, we shall start with text as represented using
ASCII. Even though we are using text to begin with, this technique can be used
with any digital file since all digital files are made up of 1s and 0s. \
\
Let’s say we want to convert the string “abc” from ASCII to base-64.

<style type="text/css">
.tg  {border-collapse:collapse;border-color:#93a1a1;border-spacing:0;}
.tg td{background-color:#fdf6e3;border-color:#93a1a1;border-style:solid;border-width:1px;color:#002b36;
  font-family:Arial, sans-serif;font-size:14px;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{background-color:#657b83;border-color:#93a1a1;border-style:solid;border-width:1px;color:#fdf6e3;
  font-family:Arial, sans-serif;font-size:14px;font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-b73z{background-color:#ffffff;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-hopj{background-color:#9b9b9b;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-zf7o{background-color:#9b9b9b;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-317h{background-color:#ffffff;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
</style>
<table class="tg" style="margin: 0px auto; undefined;table-layout: fixed; width: 657px" align="center"><colgroup>
<col style="width: 106.090909px">
<col style="width: 140.090909px">
<col style="width: 144.090909px">
<col style="width: 122.090909px">
<col style="width: 145.090909px">
</colgroup>
<thead>
  <tr>
    <th class="tg-zf7o">Input </th>
    <th class="tg-b73z">a </th>
    <th class="tg-b73z">b </th>
    <th class="tg-b73z" colspan="2">c </th>
  </tr></thead>
<tbody>
  <tr>
    <td class="tg-hopj">ASCII </td>
    <td class="tg-317h">97 </td>
    <td class="tg-317h">98 </td>
    <td class="tg-317h" colspan="2">99 </td>
  </tr>
  <tr>
    <td class="tg-hopj">Binary </td>
    <td class="tg-317h" colspan="4">0 1 1 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1 1 0 0 0 1 1</td>
  </tr>
  <tr>
    <td class="tg-hopj">Index </td>
    <td class="tg-317h">24 </td>
    <td class="tg-317h">22 </td>
    <td class="tg-317h">9 </td>
    <td class="tg-317h">35 </td>
  </tr>
  <tr>
    <td class="tg-hopj">Base-64 </td>
    <td class="tg-317h">Y </td>
    <td class="tg-317h">W </td>
    <td class="tg-317h"><span style="font-weight:bold">J</span></td>
    <td class="tg-317h">j </td>
  </tr>
</tbody>
</table>

This means it would be encoded as YWJj. \
\
The original bits (which in this case were gotten by encoding the characters in
8-bit ASCII) are placed next to each other, grouped in 6s, and then each of those
groups is converted back to a decimal number and looked up in the base-64
table.

In [None]:
$ echo -n "abc" | base64 # -n is to remove the newline character
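The grouping step can also be done by hand in Python. This sketch only covers the no-padding case, so it assumes the input length is a multiple of 3 bytes; the names `B64_ALPHABET` and `b64_indices` are illustrative.

```python
import string

# The 64 symbols in table order: A-Z, a-z, 0-9, then + and /.
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64_indices(data: bytes) -> list:
    """Concatenate the 8-bit values and split them into 6-bit group values."""
    bits = "".join(format(byte, "08b") for byte in data)
    return [int(bits[i:i + 6], 2) for i in range(0, len(bits), 6)]

indices = b64_indices(b"abc")
print(indices)                                     # [24, 22, 9, 35]
print("".join(B64_ALPHABET[i] for i in indices))   # YWJj
```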

Notice, however, that 8 bits for each of 3 characters adds up to 24 bits, which is a
perfect multiple of 6 (the group size), so every bit is converted accurately. That is
not always the case. If the number of bits in the original message is not a
perfect multiple of 6, then we have to pad the original message with 0 bits until we get to a number of bits that is a multiple of both 6 and 8, i.e., a multiple of 24 (8 bits because we’re using 8-bit ASCII). \
\
As an example, let’s say we wanted to convert the string “x”:

<style type="text/css">
.tg  {border-collapse:collapse;border-color:#93a1a1;border-spacing:0;}
.tg td{background-color:#fdf6e3;border-color:#93a1a1;border-style:solid;border-width:1px;color:#002b36;
  font-family:Arial, sans-serif;font-size:14px;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{background-color:#657b83;border-color:#93a1a1;border-style:solid;border-width:1px;color:#fdf6e3;
  font-family:Arial, sans-serif;font-size:14px;font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-b73z{background-color:#ffffff;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-hopj{background-color:#9b9b9b;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-zf7o{background-color:#9b9b9b;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-lapc{background-color:#ffffff;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-317h{background-color:#ffffff;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-bi1e{background-color:#34cdf9;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-xezj{background-color:#34cdf9;border-color:#002b36;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-r5w0{background-color:#34cdf9;border-color:#333333;color:#3166ff;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
</style>
<table class="tg" align="center" style="margin: 0px auto; undefined;table-layout: fixed; width: 727px"><colgroup>
<col style="width: 106.090909px">
<col style="width: 173.090909px">
<col style="width: 181.090909px">
<col style="width: 122.090909px">
<col style="width: 145.090909px">
</colgroup>
<thead>
  <tr>
    <th class="tg-zf7o">Input </th>
    <th class="tg-lapc">x</th>
    <th class="tg-b73z"></th>
    <th class="tg-b73z" colspan="2"></th>
  </tr></thead>
<tbody>
  <tr>
    <td class="tg-hopj">ASCII </td>
    <td class="tg-317h">120 </td>
    <td class="tg-bi1e">0 </td>
    <td class="tg-bi1e" colspan="2">0 </td>
  </tr>
  <tr>
    <td class="tg-hopj">Binary </td>
    <td class="tg-317h">0 1 1 1 1 0 0 0</td>
    <td class="tg-xezj">0 0 0 0 0 0 0 0</td>
    <td class="tg-bi1e" colspan="2">0 0 0 0 0 0 0 0</td>
  </tr>
  <tr>
    <td class="tg-hopj">Index </td>
    <td class="tg-317h">30</td>
    <td class="tg-317h">0</td>
    <td class="tg-bi1e">0</td>
    <td class="tg-bi1e">0</td>
  </tr>
  <tr>
    <td class="tg-hopj">Base-64 </td>
    <td class="tg-317h">e</td>
    <td class="tg-317h">A</td>
    <td class="tg-bi1e">=</td>
    <td class="tg-r5w0">=</td>
  </tr>
</tbody>
</table>

The first 6 bits make sense: they convert to 30, which maps to “e”. The next two
bits are part of the original message and require 4 padded bits to complete the group. They convert
to 0, which maps to “A”. The next 12 bits, however, are made up entirely of bits
that were added just for padding. To convert them directly to base-64 would be
erroneous since they were not part of the original message. Therefore they are
mapped to the symbol “=”, which you will notice is not part of the
base-64 character set. Seeing an “=” symbol at the end of a base-64 conversion is
a way to tell that padding was done, and therefore one would not consider
those extra bits when converting back from base-64. \
\
Let’s try another example in which we convert the string “CS” to base-64.

<style type="text/css">
.tg  {border-collapse:collapse;border-color:#93a1a1;border-spacing:0;}
.tg td{background-color:#fdf6e3;border-color:#93a1a1;border-style:solid;border-width:1px;color:#002b36;
  font-family:Arial, sans-serif;font-size:14px;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{background-color:#657b83;border-color:#93a1a1;border-style:solid;border-width:1px;color:#fdf6e3;
  font-family:Arial, sans-serif;font-size:14px;font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-b73z{background-color:#ffffff;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-hopj{background-color:#9b9b9b;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-zf7o{background-color:#9b9b9b;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-lapc{background-color:#ffffff;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-317h{background-color:#ffffff;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-bi1e{background-color:#34cdf9;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-oy5m{background-color:#ffffff;border-color:#002b36;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-r5w0{background-color:#34cdf9;border-color:#333333;color:#3166ff;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
</style>
<table class="tg" align="center" style="margin: 0px auto; undefined;table-layout: fixed; width: 727px"><colgroup>
<col style="width: 106.090909px">
<col style="width: 173.090909px">
<col style="width: 181.090909px">
<col style="width: 122.090909px">
<col style="width: 145.090909px">
</colgroup>
<thead>
  <tr>
    <th class="tg-zf7o">Input </th>
    <th class="tg-lapc">C</th>
    <th class="tg-b73z">S</th>
    <th class="tg-b73z" colspan="2"></th>
  </tr></thead>
<tbody>
  <tr>
    <td class="tg-hopj">ASCII </td>
    <td class="tg-317h">67 </td>
    <td class="tg-317h">83 </td>
    <td class="tg-bi1e" colspan="2">0 </td>
  </tr>
  <tr>
    <td class="tg-hopj">Binary </td>
    <td class="tg-317h">0 1 0 0 0 0 1 1</td>
    <td class="tg-oy5m">0 1 0 1 0 0 1 1</td>
    <td class="tg-bi1e" colspan="2">0 0 0 0 0 0 0 0</td>
  </tr>
  <tr>
    <td class="tg-hopj">Index </td>
    <td class="tg-317h">16 </td>
    <td class="tg-317h">53 </td>
    <td class="tg-317h">12 </td>
    <td class="tg-bi1e">0 </td>
  </tr>
  <tr>
    <td class="tg-hopj">Base-64 </td>
    <td class="tg-317h">Q </td>
    <td class="tg-317h">1 </td>
    <td class="tg-317h">M </td>
    <td class="tg-r5w0">= </td>
  </tr>
</tbody>
</table>

We see it converts to Q1M= in base-64. \
\
What about going backwards, i.e. from base-64 to ASCII? We shall try “RU4=”.

<style type="text/css">
.tg  {border-collapse:collapse;border-color:#93a1a1;border-spacing:0;}
.tg td{background-color:#fdf6e3;border-color:#93a1a1;border-style:solid;border-width:1px;color:#002b36;
  font-family:Arial, sans-serif;font-size:14px;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{background-color:#657b83;border-color:#93a1a1;border-style:solid;border-width:1px;color:#fdf6e3;
  font-family:Arial, sans-serif;font-size:14px;font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-hopj{background-color:#9b9b9b;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-zf7o{background-color:#9b9b9b;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;font-weight:bold;text-align:center;vertical-align:bottom}
.tg .tg-lapc{background-color:#ffffff;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-trj9{background-color:#34cdf9;border-color:#333333;color:#002b36;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-317h{background-color:#ffffff;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-bi1e{background-color:#34cdf9;border-color:#333333;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
.tg .tg-oy5m{background-color:#ffffff;border-color:#002b36;color:#333333;font-family:"Courier New", Courier, monospace !important;
  font-size:18px;text-align:center;vertical-align:bottom}
</style>
<table class="tg" align="center" style="margin: 0px auto; undefined;table-layout: fixed; width: 687px"><colgroup>
<col style="width: 106.090909px">
<col style="width: 149.090909px">
<col style="width: 156.090909px">
<col style="width: 131.090909px">
<col style="width: 145.090909px">
</colgroup>
<thead>
  <tr>
    <th class="tg-zf7o">Input </th>
    <th class="tg-lapc">R </th>
    <th class="tg-lapc">U </th>
    <th class="tg-lapc">4 </th>
    <th class="tg-trj9">= </th>
  </tr></thead>
<tbody>
  <tr>
    <td class="tg-hopj">Index </td>
    <td class="tg-317h">17 </td>
    <td class="tg-317h">20 </td>
    <td class="tg-317h">56 </td>
    <td class="tg-bi1e">0 </td>
  </tr>
  <tr>
    <td class="tg-hopj">Binary </td>
    <td class="tg-317h">0 1 0 0 0 1</td>
    <td class="tg-oy5m">0 1 0 1 0 0</td>
    <td class="tg-317h">1 1 1 0 0 0</td>
    <td class="tg-bi1e">0 0 0 0 0 0</td>
  </tr>
  <tr>
    <td class="tg-hopj">ASCII </td>
    <td class="tg-317h">69 </td>
    <td class="tg-317h">78 </td>
    <td class="tg-bi1e" colspan="2">0 </td>
  </tr>
  <tr>
    <td class="tg-hopj">Output</td>
    <td class="tg-317h">E </td>
    <td class="tg-317h">N </td>
    <td class="tg-317h" colspan="2"></td>
  </tr>
</tbody>
</table>

We see it converts to “EN”.

In [None]:
$ echo -n "RU4=" | base64 -d # -d is to decode base-64
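
The two tables above can also be traced in code. What follows is only a minimal Python sketch, assuming the standard base-64 alphabet (A-Z, a-z, 0-9, then `+` and `/`). The function names `encode_base64` and `decode_base64` are made up for this illustration, and in practice you would use the `base64` command or Python's built-in `base64` module instead.

```python
import base64

# Assumed alphabet: the standard base-64 character set, indexes 0 to 63.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(text):
    """Encode an ASCII string by hand, following the rows of the first table."""
    # ASCII row -> Binary row: write every character as 8 bits.
    bits = "".join(format(byte, "08b") for byte in text.encode("ascii"))
    # Pad with zero bits so the last group has a full 6 bits.
    bits += "0" * (-len(bits) % 6)
    # Index row -> Base-64 row: look up each 6-bit group in the alphabet.
    symbols = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Groups that would be pure padding are written as "=" instead,
    # so the output length is always a multiple of 4.
    symbols += "=" * (-len(symbols) % 4)
    return "".join(symbols)


def decode_base64(encoded):
    """Decode a base-64 string by hand, following the rows of the second table."""
    # Ignore the "=" symbols: they only mark padding.
    symbols = encoded.rstrip("=")
    # Index row -> Binary row: write every index as 6 bits.
    bits = "".join(format(ALPHABET.index(symbol), "06b") for symbol in symbols)
    # Keep only whole bytes; leftover bits were padding.
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("ascii")


print(encode_base64("CS"))    # Q1M=
print(decode_base64("RU4="))  # EN

# Cross-check the hand-written encoder against the standard library.
print(encode_base64("CS") == base64.b64encode(b"CS").decode("ascii"))  # True
```

Strings are used for the bits here purely so that each step lines up with a row of the tables. A real implementation would work on the bytes directly.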

##### **<ins>Cryptography today:</ins>**
The Internet created a need for even more secure cryptography. We now all use a network that no single entity fully controls, and on which bad actors can easily gain access to the information you are sending to or receiving from someone else. We also need ways of confirming that the person we are communicating with is in fact the person we intended to communicate with, and that the file we received is in fact the file we requested. Solutions to all of these problems rely on modern cryptography in some form. \
Symmetric-key cryptography works like most of the classic cryptography approaches we discussed earlier, i.e. both sender and receiver need to share the same key in order to encrypt and decrypt the message. Depending on the algorithm, the input can be encrypted as it arrives, a byte or letter at a time (stream ciphers), or it can be split into blocks of bits with each block encrypted as a single unit (block ciphers). \
\
Examples of algorithms in this category include AES, 3DES, Serpent, Twofish and Blowfish. \
\
This is typically a fast approach. The catch is that it assumes the two people trying to communicate were able to share a key privately beforehand. Doing so requires a private channel, which is exactly what we are trying to set up in the first place. Many times this is not possible for one reason or another, e.g. physical separation. \
\
![](2.png)
\
Asymmetric cryptography is relatively newer and uses the idea of two keys. The two keys are linked in such a way that any message encrypted with one of them requires the other to be decrypted. Another characteristic of the keys is that you cannot derive (or guess) one of them from the other. This second characteristic is the whole crux of the system. One of the more common ways of guaranteeing it is the use of large prime numbers: multiplying two large primes together is quick, but recovering those prime factors from the product takes impractically long when the number is very large. The process of encryption/decryption is therefore slightly different. The person who wants to send a message encrypts it with one key, and the person who wants to decrypt it uses the other key. The first key is called the public key because you provide that key to everyone and anyone. You do this so that anyone who wants to send you a message can be sure that you are the only person who can open it, since you are the only person who has the other key (aptly called the private key).
![](3.png)
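
To make the two-key idea concrete, here is a toy sketch in the style of RSA. Treat it as an illustration under loud assumptions: the primes 61 and 53, the exponent 17 and the helper names `apply_key` and `smallest_factor` are all chosen just for this example, and the scheme leaves out everything a real system adds on top, so it is not secure.

```python
# Toy RSA-style key pair. The primes are tiny on purpose; real keys use
# primes that are hundreds of digits long.
p, q = 61, 53
n = p * q                  # modulus that appears in both keys (3233)
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, shares no factor with phi
d = pow(e, -1, phi)        # private exponent: inverse of e modulo phi (Python 3.8+)

public_key = (e, n)        # handed out to everyone
private_key = (d, n)       # kept secret by the owner


def apply_key(numbers, key):
    """Raise each number to the key's exponent, modulo n."""
    exponent, modulus = key
    return [pow(value, exponent, modulus) for value in numbers]


message = [ord(ch) for ch in "HI"]   # 72 and 73; each value must be below n

# Encrypting with the public key requires the private key to decrypt.
secret = apply_key(message, public_key)
print(apply_key(secret, private_key) == message)   # True

# The keys are linked both ways: what the private key encrypts,
# the public key decrypts.
stamped = apply_key(message, private_key)
print(apply_key(stamped, public_key) == message)   # True


def smallest_factor(number):
    """Brute-force search for the smallest factor greater than 1."""
    for candidate in range(2, number):
        if number % candidate == 0:
            return candidate
    return number


# With primes this small, anyone can factor n and rebuild the private key.
print(smallest_factor(n))   # 53
```

The last few lines are the reason key size matters: the brute-force search finishes instantly for 3233, but the same loop is hopeless when `n` is the product of two very large primes.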

There are two major advantages of this approach. The first is that we don’t have to have met for you to communicate with me and know that our communication is private. There are servers dedicated to storing people’s public keys, people attach them to emails, etc. So if you want to communicate with me, all I need to have done is publish my public key somewhere. \
\
The second advantage is that this technique allows us to confirm someone’s identity if the encryption/decryption process is reversed. Person A uses their private key to encrypt a broadcast message. Anyone who has access to Person A’s public key (which everyone does) can now use that key to decrypt the message. It might seem weird because it is not a secret message and anyone can decrypt it, but the cool thing about it is that we all know that no one else could have created that message in the first place, since only Person A has the private key. In that regard, the public/private key approach allows us to have digital signatures which help us to determine the authenticity of a message. It allows us to answer questions like “is this message from a trusted source?” and “has this message been changed in transit by someone pretending to be the trusted source?”. \
\
Examples of public/private key systems include Diffie-Hellman, RSA, DSA (Digital Signature Algorithm) and ECDSA (Elliptic Curve DSA). You can find some of these names in the certificate details shown by your web browser. \
\
Hashing is another way cryptography is used today. Recall from CSC 220 that hashing is a central part of how hash tables work. For hash tables, hashing converted a key of any length/type into a valid index in the table where the value would be stored. Outside of hash tables, hashing is used to translate data of any size into a fixed-size string output that is, for all practical purposes, unique to that data. Any good hashing function works in such a way that this process cannot be reversed. At first glance, this seems counterproductive, i.e. why scramble up a long message into a short string of gibberish if I cannot unscramble that gibberish to get the message back? The reason is that a good hash function will scramble any two different messages into different strings of gibberish (two messages sharing the same hash, called a collision, should be extremely unlikely). I’ll give you two examples of where this functionality is actually helpful. \
\
Imagine you downloaded an ISO file for Linux Mint and it took 2 hours to download on your home internet (in fact imagine how long it would take on the slow internet we had not that long ago). It would be very painful if you took a full day installing the operating system only to realise that the ISO file was corrupted in some way during the download and you’ll have to start the whole download and install process again with the hope that it doesn’t happen again. What many servers that store large files will do is provide a hash of that file. That way once you have downloaded the file, you can run it through the hash function, a process which takes a very short time, and confirm if the hash you got is identical to the hash that the server said you should get. If so, then your file was not corrupted. If not, then you need to re-download and not waste your time installing the corrupted file. \
\
Another place that hashing is used is in the storage of passwords. Given that bad actors can sometimes get access to systems, it would be bad if they got access to every password that was ever used on that system in plain text. What most servers/systems will do is store a hash of your password instead. That way the bad actors cannot simply read your passwords. They just get the hashes, and a good hash cannot be reversed to recover the password. The only thing they can do is run a bunch of passwords they know of through the hashing function and hope that one of them hashes to the exact same thing as your hash. If that happens then they have figured out your password. There are ways to deal with this but they are beyond our discussion right now. \
\
There are multiple hashing functions being used these days. These hash functions are typically evaluated based on the number of collisions they have (which should be very low), and how difficult they are to reverse (which should be practically impossible). \
\
MD5 (Message Digest algorithm 5) converts large data into a string of 128 bits. As early as 1993 researchers found vulnerabilities in it (such as collisions) and so it isn’t used much for high security cryptography but rather for checksums to verify data.

In [None]:
$ echo -n "hello world" | md5sum # any small difference in input (here, the newline that echo adds without -n)
$ echo "hello world" | md5sum # should produce a very different hash
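
The download check described earlier, and the checksum use of MD5 mentioned above, can be sketched in a few lines of Python with the standard `hashlib` module. This is a minimal sketch rather than how any particular download site does it: the helper names `file_digest` and `is_intact` and the file name `download.iso` are hypothetical, and a tiny temporary file stands in for the large ISO.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so that a large file never has to fit in RAM."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Keep reading until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_intact(path, published_digest, algorithm="sha256"):
    """Compare the hash we computed with the one the server published."""
    return file_digest(path, algorithm) == published_digest.lower()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "download.iso")   # hypothetical file name
    with open(path, "wb") as handle:
        handle.write(b"hello world")

    # MD5 of the bytes "hello world", the same input as the -n command above.
    published = "5eb63bbbe01eeed093cb22bb8f5acdc3"
    print(is_intact(path, published, "md5"))   # True

    # Simulate corruption in transit by appending a single byte.
    with open(path, "ab") as handle:
        handle.write(b"\n")
    print(is_intact(path, published, "md5"))   # False
```

Passing `"sha256"` or `"sha512"` as the algorithm works the same way; only the length of the resulting digest changes.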

SHA – Secure Hash Algorithms – is a family of hashing functions. SHA-1 is similar
to MD5 and was largely phased out for security use starting around 2010. SHA-256
and SHA-512 are probably some of the more common ones and are named after the
length in bits of the output they produce.

In [None]:
$ echo -n "hello world" | sha1sum
$ echo -n "hello world" | sha256sum
$ echo -n "hello world" | sha512sum
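
The password storage idea discussed earlier can be sketched with SHA-256 from Python's `hashlib`. Everything specific here is made up for illustration: the user `alice`, her password, the candidate list, and the helper names `hash_password`, `check_login` and `guess_password`. It is a bare-bones sketch of the idea and leaves out the extra protections that real systems add.

```python
import hashlib


def hash_password(password):
    """Return the SHA-256 hash of a password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# The system stores only hashes, never the passwords themselves.
stored_hashes = {"alice": hash_password("sunshine1")}   # hypothetical user


def check_login(user, attempt):
    """Hash the attempt and compare it with the stored hash."""
    return hash_password(attempt) == stored_hashes.get(user)


def guess_password(target_hash, candidates):
    """What an attacker holding the hash can do: hash guesses and compare."""
    for candidate in candidates:
        if hash_password(candidate) == target_hash:
            return candidate
    return None


print(check_login("alice", "sunshine1"))   # True
print(check_login("alice", "Sunshine1"))   # False

# A stolen hash does not reveal the password directly, but a common
# password falls quickly to a list of likely guesses.
common_passwords = ["123456", "password", "qwerty", "sunshine1"]
print(guess_password(stored_hashes["alice"], common_passwords))   # sunshine1
print(guess_password(stored_hashes["alice"], ["letmein", "dragon"]))   # None
```

The stored value is always 64 hex characters (256 bits) no matter how long the password is, which matches how SHA-256 gets its name.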

This has been a lot of information but hopefully it is very intriguing information.
Some of these topics we shall cover in more detail later in the quarter with more
application. Others you’ll just have to wait till you see them in classes like CSC
444/544/CYEN 406 [^long].

[^long]: Typically offered every other year in the Spring quarter. Was last offered in
Spring 2024.