In [4]:
%%html
<script src="js/showNotes.js"></script>

<p style="font-size:9px;">Version: Discussion version 1, 
    July 18, 2023 | <b>updated Nov 11, 2023</b> [&copy; 2023 gb]</p>
<img src="images/ischool-banner.png" />
<p>&nbsp;</p>
<div style="height: 80px;color:#3B7EA1; font-size:24px; vertical-align:top; text-align:left;">
    Computing for Data Science
    <img style="width:20%; vertical-align:top; float: left; padding-right: 20px; display: block; margin-left: 0; margin-right: auto;" 
         src="images/ucb_logo.png"/>
</div>

<h2 style="background-color: #003262; color:white; border-radius: 4px; padding: 8px;">Week 11: Optional.  Python + SQL, Encoding, and the Aesthetics of Data</h2>
<p>This notebook details encoding issues and shows how to convert data from one encoding to another for data cleansing and related tasks.  It also shows an example of using Python to communicate with an RDBMS (MySQL).</p>
<p><ol>
    <li><a href="w11.ipynb">
            Main Week 11 Lesson</a>
    </li>
    <li><a href="w11_examples.ipynb" target="new">
        Real-World Examples: MySQL &amp; Python, Encoding Issues, Vector Model and 3D Plot; Vector Model &amp; NLTK + plots; Bit Signature</a></li>
    </ol>
    </p>
    <p>We look more at the aesthetics of data in the visualization part next week!</p>
    <hr />
    <p>Below are four user-facing representations of the same glyph: on the screen they look different, but in the code they are all the same character.</p>
<h2 style="background-color:coral;color:white;padding:10px;">
    <p>Encoding &amp; Display of Data Glyphs.</p>
</h2>
    <p style="font-size:72px;"><span style="font-family:Palatino">A</span>
    <span style="font-family:Baskerville">A</span>
    <span style="font-family:Helvetica">A</span>
    <span style="font-family:Futura">A</span>
</p>

<h2 style="background-color: #003262; color:white; border-radius: 4px; padding: 8px;">Encoding</h2>
<p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">Example 1.  Encoding</p>
<p>Letters and characters we see on screen are <b>glyphs</b> - they represent human interpretation of symbols.  Just as the same glyph can be drawn in many ways in our own penmanship, in print, and in computer fonts, the <b>encoding</b> of a given glyph inside the computer can have different representations, too.</p><p>A main point of this lesson is that we may need to convert data to/from different representations to make them comparable - as part of "data cleansing" - and to be sure we can integrate various data sources successfully.</p>
Encoding:
<ol><li>definitions</li>
<li>binary</li>
<li>hex (as <b>code points</b>)</li>
<li>octal</li>
<li>HTML entities</li>
<li>Alt sequences (Windows)</li>
<li>and actual name!</li>
</ol>
<p>Here&rsquo;s an example of <code>:dragon:</code><span style="font-size:36px;">&#128009;</span>
<hr /></p>

<p>Most computing languages include a library to <b>encode</b> and <b>decode</b> individual characters or entire data streams. We use these to ensure the data will be in a known, preferred format before using.</p>
<p>A <b>glyph</b> is the computing term for the human-oriented symbols of writing systems such as alphabets, abjads, and logographic characters; e.g., a, M, ल, ж, 明</p>
<p><b>ASCII</b> is a character encoding standard that assigns the numbers 0 to 127 to characters.  It was originally a 7-bit code; today each character is stored in 8 bits (1 byte).</p>

<p><b>Unicode</b> is a descriptive standard that maps each glyph to a unique code point and name.  Unicode is a superset of ASCII: the first 128 code points are the ASCII characters and fit in a single 8-bit byte in UTF-8; all other characters use a multi-byte encoding of 2, 3, or 4 bytes per character.</p>

<p><u>Input and Unicode/UTF-8</u> There is a lot of physical input hardware (keyboards), but each device offers limited options, so we need a software solution - the keyboard driver - that accepts an input and converts it to a code point. Check out the <a href="https://unicode.org/charts" target="new">Unicode Code Charts</a>.</p>

<p><b>UTF-8</b> (Unicode Transformation Format - 8 bits) is the most common computer implementation (encoding) of the Unicode standard.<br />
Unicode: Compare an 8-bit character (ASCII) with a 16-bit code point (say, for Russian): <code>a</code> in ASCII is 97 in decimal; in hex it&rsquo;s 61 (code point 0061); in binary it&rsquo;s <code>01100001</code>... in the range for Cyrillic (0400-04FF), the look-alike <code>а</code> glyph is 0430.</p>
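<p>To check these numbers yourself, here is a minimal sketch that prints the decimal, hex, binary, and UTF-8 byte values for the Latin letter and its Cyrillic look-alike.  The variable names are ours, chosen only for illustration.</p>

```python
# Compare Latin "a" (U+0061) with the look-alike Cyrillic letter (U+0430).
latin_a = "a"
cyrillic_a = "\u0430"

for ch in (latin_a, cyrillic_a):
    code_point = ord(ch)
    print(
        "decimal:", code_point,
        "hex:", format(code_point, "04x"),
        "binary:", format(code_point, "08b"),
        "UTF-8 bytes:", ch.encode("utf-8").hex(),
    )

# The two glyphs look alike on screen but are different characters.
print(latin_a == cyrillic_a)
```

<p>This matters for data cleansing: two strings that look identical on screen can fail an equality test if one contains the Cyrillic letter.</p>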

<p>Other standards include <b>ISO-8859-x</b>, which includes "Latin-1" (ISO-8859-1); people often use that term loosely to refer to ASCII, which is incorrect, although the first 128 codes of Latin-1 are the same as ASCII.</p>
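<p>A short sketch of how Latin-1 relates to ASCII: the first 128 codes decode identically, but a character such as é exists only in Latin-1.  The variable names are illustrative.</p>

```python
# ASCII covers code points 0-127; Latin-1 (ISO-8859-1) covers 0-255.
low_range = bytes(range(128))
same_low_range = low_range.decode("ascii") == low_range.decode("latin-1")
print(same_low_range)

e_acute = "\u00e9"  # the letter e with an acute accent
latin1_bytes = e_acute.encode("latin-1")
print(latin1_bytes)

# ASCII has no code for this character, so encoding raises an error.
try:
    e_acute.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as err:
    ascii_ok = False
    print("ASCII cannot encode it:", err)
```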

<p>Other computing technologies use similar encoding schemes but with different syntax.  E.g., the HTML <b>entity</b> for <i>dragon</i> &#x1f409; in hex is &amp;#x1f409;.  In UTF-8 hex it&rsquo;s f09f9089 (or equivalently 0xF0 0x9F 0x90 0x89).</p>

<p><b>ISO-639-x</b> is the standard for language codes, e.g., <code>zh</code> = zhong wen (Chinese); <code>en</code> = English.  Combined with an ISO-3166 country code we can divide further: <code>de-de</code> means "German language in Germany"; <code>de-at</code> means "German language in Austria".  Some systems extend the tag even to US states, so that en-us-tx means "English as used in Texas"!  <br />
    Note, too, that files can carry an embedded <code>xml:lang</code> attribute that identifies a language and helps ensure the data print correctly, such as &lt;p xml:lang='ja'&gt; to indicate a section of the data is in Japanese.<br />Always use these kinds of standards because network servers are configured to send/receive data in a particular encoding and user preferences are stored this way.</p>
<p>
    When data are saved, we have the "internal reflection of the data"; when data are presented on the screen for an end-user, the OS uses its own "rendering engine" that reads the data stream, adjusts the output (shown below), adds the font metrics data, and then shows the result on the screen (the "outward reflection of data").</p>
    
 <p>A cardinal rule today is the <i style="color:red;">data are separate from the presentation of data</i>.</p>

<table width="100%">
<tr><td>emoji</td>
    <td style="text-align:left;"><code>:dragon:</code></td></tr>
    <tr><td>HTML entity (decimal)</td>
        <td style="text-align:left;"><code> &amp; # 128009;</code></td></tr>
    <tr><td>HTML entity (hex)</td>
        <td style="text-align:left;"><code> &amp; # x1f409;</code></td></tr>
    <tr><td>Windows Alt</td>
        <td style="text-align:left;"><code>Alt + 1F409</code></td></tr>
    <tr><td>UTF-8 (hex)</td>
        <td style="text-align:left;"><code>0xF0 0x9F 0x90 0x89</code> or <code>f09f9089</code></td></tr>
    <tr><td>UTF-8 (binary)</td>
        <td style="text-align:left;"><code>11110000:10011111:10010000:10001001</code></td></tr>
    <tr><td>UTF-16 (hex)</td>
        <td style="text-align:left;"><code>0xD83D 0xDC09</code> or <code>d83ddc09</code></td></tr>
<tr><td>UTF-16 (decimal)</td>
    <td style="text-align:left;"><code>55,357 56,329</code></td></tr>
<tr><td>UTF-32 (hex)</td>
    <td style="text-align:left;"><code>0x0001F409</code> or <code>1f409</code></td></tr>
    <tr><td>UTF-32 (decimal)</td>
        <td style="text-align:left;"><code>128,009</code></td></tr>
    <tr><td>C/C++/Java source</td>
        <td style="text-align:left;"><code>"\uD83D\uDC09"</code></td></tr>
    <tr><td>Python source code</td>
        <td style="text-align:left;"><code>u"\U0001F409"</code></td></tr>
</table>
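<p>Several rows of the table can be reproduced from Python.  The sketch below assumes only the standard library; the big-endian codec names (<code>utf-16-be</code>, <code>utf-32-be</code>) are used so that no byte-order mark is added to the output.</p>

```python
import html
import unicodedata

dragon = "\U0001F409"

print(unicodedata.name(dragon))            # official Unicode name
print(ord(dragon), hex(ord(dragon)))       # code point in decimal and hex

# HTML entities (decimal and hex) resolve to the same character.
from_decimal_entity = html.unescape("&#128009;")
from_hex_entity = html.unescape("&#x1f409;")
print(from_decimal_entity == dragon, from_hex_entity == dragon)

# The same code point as bytes in three Unicode encodings.
utf8_hex = dragon.encode("utf-8").hex()
utf16_hex = dragon.encode("utf-16-be").hex()
utf32_hex = dragon.encode("utf-32-be").hex()
print(utf8_hex, utf16_hex, utf32_hex)
```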

<hr />
<h2><a href="https://www.unicode.org/charts/" target="new">Visit the Unicode Homepage</a> CODE POINTS and Code Charts to see the glyphs and their unique names and code points.</h2>

<img style="width:400px;" src="images/unicode-1.png">

<img style="width:500px;" src="images/unicode-2.png">

<p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">Examples: Encoding and Input</p>
<p>We rely on virtual (software) keyboards to map between our input stream (like pressing the "a" key, which on a US layout generates code point 0061, decimal 97) and the output display (like seeing the Cyrillic letter ф, code point 0444, when a Russian layout is active).</p>
<p>The standard for describing every glyph with a unique name and code point is <b>Unicode</b>.  The most common computer implementation is <b>UTF-8</b>, or <i>Unicode transformation format, 8-bits</i>, a variable-width, multi-byte encoding scheme.  The number of bytes depends on the character, not on the language of your OS: English (ASCII) letters take 1 byte (8 bits), while most Chinese characters take 3 bytes in UTF-8 (2 bytes in UTF-16). Since the left-most bits of the code point for English letters aren't needed (they're all 0s), they are not stored, making the file smaller.</p>
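<p>A quick way to see how many bytes UTF-8 uses per character is to encode a few samples and count the bytes.  This is a minimal sketch; the sample characters are our own choice.</p>

```python
# One sample each from the 1-, 2-, 3-, and 4-byte ranges of UTF-8.
samples = ["a", "\u00e9", "\u660e", "\U0001F409"]

utf8_lengths = {}
for ch in samples:
    encoded = ch.encode("utf-8")
    utf8_lengths[ch] = len(encoded)
    print(f"U+{ord(ch):04X}", len(encoded), "byte(s):", encoded.hex())
```

<p>The byte count follows the size of the code point, regardless of the operating system or its language setting.</p>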
<p>Be sure to note the grid showing the unique code for each glyph.  Note, too, the official name of each glyph.  Python reads both.  [Note, there are also historical, socio-cultural, religious, musical, and other symbols.]</p>
<p>Let's check out the <span style="background-color:silver;color:maroon;border-color:maroon;padding:4px;">Unicode Code Chart</span> pages for In-Class Discussion: <a href="https://www.unicode.org/charts/" target="new">Code Page</a>.</p>
<p>Tech tangent: the order of the byte stream may not match what we see.  For instance, in Devanagari-based languages, Hebrew/Arabic, and others ... compare <code>ihndi</code> for <code>hindi</code>: the vowel sign is displayed to the left of its consonant but stored after it.  Output: हिनदि and Input: ह ि न द ि
    </p>
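<p>The stored order can be inspected with <code>unicodedata</code>.  In this small sketch the first syllable of the example is listed code point by code point; the consonant comes first in memory even though the vowel sign is drawn to its left.</p>

```python
import unicodedata

# First syllable of the example: consonant HA followed by VOWEL SIGN I.
syllable = "\u0939\u093f"

stored_order = [unicodedata.name(ch) for ch in syllable]
for ch, name in zip(syllable, stored_order):
    print(f"U+{ord(ch):04X}", name)
```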
    <p>The trend is to use the <code>codecs</code> library more, but there are still lots of built-in functions you should recognize.  Let&rsquo;s review ASCII for an example.  Notice the following ways to convert data from one format to another:<ul>
    <li><code>ord(<i>i</i>)</code></li>
    <li><code>chr(<i>i</i>)</code></li>
    <li><code>bin(<i>i</i>)</code></li>
    <li><code>hex(<i>i</i>)</code></li>
    <li><code>ascii()</code></li>
    <li><code>bytes()</code></li>
    <li><code>oct()</code></li>
    <li><code>str()</code></li></ul>
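<p>You&rsquo;re invited to experiment with each.  Some of these functions (<code>chr</code>, <code>bin</code>, <code>hex</code>, <code>oct</code>) require that the char be converted to an int (with <code>ord</code>) beforehand.</p>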
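<p>As a starting point for experimenting, the sketch below runs each of the listed built-ins once.  The dictionary <code>conversions</code> is only a convenience for printing and is not part of the lesson code.</p>

```python
ch = "M"
code_point = ord(ch)   # character -> int

conversions = {
    "chr": chr(code_point),              # int -> character
    "bin": bin(code_point),              # int -> binary string
    "oct": oct(code_point),              # int -> octal string
    "hex": hex(code_point),              # int -> hex string
    "str": str(code_point),              # int -> decimal text
    "ascii": ascii("\u00e9"),            # escapes non-ASCII characters
    "bytes": bytes("M\u00e9", "utf-8"),  # text -> UTF-8 bytes
}

for name, value in conversions.items():
    print(f"{name:6} -> {value!r}")
```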

<p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">Demonstration 1</p>
<p>The example below demonstrates several ways to convert between characters, bytes, and encodings.  This is useful to know when cleaning data and checking it before data integration.</p>

In [1]:
import binascii
import unicodedata

# since Python3, we can use chr() for ascii and unicode chars.
print(ord('M'))
print(chr(77))

# if we shift the int value we can move to different letters:
print(chr(ord('M') + 12))

# test a unicode char.
print(ord("明"))

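# note that ord() returns the Unicode code point (the same number as ASCII
# for the first 128 characters) and raises a TypeError if given more than
# one character.  In Python 3 the u prefix is optional: all strings are Unicode.
print(ord(u'क'))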

print("-"*50)

""" single chars or entire strings """
s = "hope"
print("ascii/unicode values")
for i in s:
    print(ord(i))
    
def string2bits(s = ''):
    return [bin(ord(x))[2:].zfill(8) for x in s]

# testing our string hope
b = string2bits(s)
print(s)

# show as binary
print("-"*30)
for x in b:
    print(x)


    

77
M
Y
26126
2325
--------------------------------------------------
ascii/unicode values
104
111
112
101
hope
------------------------------
01101000
01101111
01110000
01100101


<hr /><p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">Byte Stream</p>
<p>We can convert data to a <b>byte stream</b>, too.  Think about how you might want to convert your data to a similar format.  For instance <code>b'\xe6\x88\x91\xe5\xa5\xbd'</code>.  We use <code>encode()</code> and <code>decode()</code>.</p>
    <hr />
<p>
    Note too the <b>unicodedata</b> tool that will convert an int or hex into and out of the Unicode name. Check out https://docs.python.org/3/library/unicodedata.html</p>
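<p>A minimal round-trip sketch, assuming UTF-8 as the encoding: text goes to bytes with <code>encode()</code> and back with <code>decode()</code>, and a character goes to its Unicode name with <code>unicodedata.name()</code> and back with <code>unicodedata.lookup()</code>.</p>

```python
import unicodedata

# Text -> bytes -> text.
text = "\u6211\u597d"
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")
print(encoded, decoded == text)

# Character -> official name -> character.
char = "\u00e9"
name = unicodedata.name(char)
looked_up = unicodedata.lookup(name)
print(name, looked_up == char)
```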

In [11]:
j = b'\xe6\x88\x91\xe5\xa5\xbd'
decoded_j = j.decode()
print(decoded_j)


print("_"*50)
import unicodedata

print("Names:",unicodedata.name(chr(233)))
print(unicodedata.name(chr(0x0bf2)))

# BY NAME:
print(unicodedata.lookup('MUSICAL SYMBOL G CLEF'))

# BY NAME EXAMPLE OF RANGE:
print("-"*50,"\n")
import unicodedata

# ord() takes 1 char Unicode string and returns the code point value.

u = chr(233) + chr(0x0bf2) + chr(3972) + chr(6000) + chr(13231)

for i, c in enumerate(u):
    print(i, '%04x' % ord(c), unicodedata.category(c), end=" ")
    print(unicodedata.name(c))

# get numeric value of second character
print(unicodedata.numeric(u[1]))

for i in range(233, 400):
    print(unicodedata.name(chr(i)))


我好
__________________________________________________
Names: LATIN SMALL LETTER E WITH ACUTE
TAMIL NUMBER ONE THOUSAND
𝄞
-------------------------------------------------- 

0 00e9 Ll LATIN SMALL LETTER E WITH ACUTE
1 0bf2 No TAMIL NUMBER ONE THOUSAND
2 0f84 Mn TIBETAN MARK HALANTA
3 1770 Lo TAGBANWA LETTER SA
4 33af So SQUARE RAD OVER S SQUARED
1000.0
LATIN SMALL LETTER E WITH ACUTE
LATIN SMALL LETTER E WITH CIRCUMFLEX
LATIN SMALL LETTER E WITH DIAERESIS
LATIN SMALL LETTER I WITH GRAVE
LATIN SMALL LETTER I WITH ACUTE
LATIN SMALL LETTER I WITH CIRCUMFLEX
LATIN SMALL LETTER I WITH DIAERESIS
LATIN SMALL LETTER ETH
LATIN SMALL LETTER N WITH TILDE
LATIN SMALL LETTER O WITH GRAVE
LATIN SMALL LETTER O WITH ACUTE
LATIN SMALL LETTER O WITH CIRCUMFLEX
LATIN SMALL LETTER O WITH TILDE
LATIN SMALL LETTER O WITH DIAERESIS
DIVISION SIGN
LATIN SMALL LETTER O WITH STROKE
LATIN SMALL LETTER U WITH GRAVE
LATIN SMALL LETTER U WITH ACUTE
LATIN SMALL LETTER U WITH CIRCUMFLEX
LATIN SMALL LETTER U WITH DIAER

In [12]:
# let's confirm
print("-"*30)
for x, y in zip(b, s):
    print(y,":", x)

""" demo with ranges """
print("\n\n","_"*50,"\nDemonstration with unicode ranges - most cool ... \n","_"*50,"\n")
for i in range(65, 122):
    print("Glyph:", chr(i), "\tDecimal:", i, "\tBinary:", end="")
    print(format(i, 'b'), " ",bin(i), "\tHex:",hex(i))

print("\n")
col = 0

for i in range(400,410):
    x = hex(i)
    y = chr(i)
    if col == 4:  # fifth item in the row: end the line and reset the counter
        print(" | \thex:",x,"and char:", y)
        col = 0
    else:
        print("\thex",x," and char:", y, end="")
        col += 1

------------------------------
h : 01101000
o : 01101111
p : 01110000
e : 01100101


 __________________________________________________ 
Demonstration with unicode ranges - most cool ... 
 __________________________________________________ 

Glyph: A 	Decimal: 65 	Binary:1000001   0b1000001 	Hex: 0x41
Glyph: B 	Decimal: 66 	Binary:1000010   0b1000010 	Hex: 0x42
Glyph: C 	Decimal: 67 	Binary:1000011   0b1000011 	Hex: 0x43
Glyph: D 	Decimal: 68 	Binary:1000100   0b1000100 	Hex: 0x44
Glyph: E 	Decimal: 69 	Binary:1000101   0b1000101 	Hex: 0x45
Glyph: F 	Decimal: 70 	Binary:1000110   0b1000110 	Hex: 0x46
Glyph: G 	Decimal: 71 	Binary:1000111   0b1000111 	Hex: 0x47
Glyph: H 	Decimal: 72 	Binary:1001000   0b1001000 	Hex: 0x48
Glyph: I 	Decimal: 73 	Binary:1001001   0b1001001 	Hex: 0x49
Glyph: J 	Decimal: 74 	Binary:1001010   0b1001010 	Hex: 0x4a
Glyph: K 	Decimal: 75 	Binary:1001011   0b1001011 	Hex: 0x4b
Glyph: L 	Decimal: 76 	Binary:1001100   0b1001100 	Hex: 0x4c
Glyph: M 	Decimal: 77 	Bi

<hr />
<p>Fun note about multilingual processing, code points, and more!</p>

<img src="images/bytestream1.png" style="width:500px;">

<hr /><p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">Demonstration 3: Compression</p>
<p>This leads to a complex, real-world compression question.  In this example we use the <code>codecs</code> 
    library, demonstrate different encoding standards (here, utf-8 and ascii), and 
    conclude by making a binary stream, something we encounter in neural nets, compression, full-text retrieval, and concept extraction.  A full example appears at the bottom - rather advanced and so just fyi.</p>

In [76]:
""" Practical note: in some fields, such as full-text processing, auto concept detection, 
and image compression, it's useful to use a binary representation of the data. 
In this example the word we want to convert is 'bonjour'.  """

print("Now we import python 3.x's codecs for encoding/decoding.")
import codecs
codecs.encode(b"a", "hex")

x = 'f'
codecs.encode(x, encoding='utf-8')

# using encode for bytes:
byte_object = "bonjour".encode('utf-8')
print(byte_object)

print(byte_object[0])

print("-"*50)
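print("Using python 3 encoding. Note the syntax diff. ")
b = "bonjour".encode('ascii')
# map() is lazy in Python 3; wrap it in list() to get the per-byte binary strings
list(map(bin, bytearray(b)))

import binascii
# the notebook displays only this last expression
bin(int(binascii.hexlify(b), 16))
# there's also binascii.unhexlify to reverse hexlify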

Now we import python 3.x's codecs for encoding/decoding.
b'bonjour'
98
--------------------------------------------------
Using python 3 encoding. Note the syntax diff. 


'0b1100010011011110110111001101010011011110111010101110010'
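<p>Note that <code>bin()</code> drops leading zeros, so the long string above cannot be split evenly into bytes.  One possible sketch that keeps 8 bits per byte and converts the binary stream back to text is shown here; the variable names are illustrative.</p>

```python
word = "bonjour"
raw = word.encode("ascii")

# One 8-bit group per byte keeps the leading zeros that bin() drops.
bit_string = "".join(format(byte, "08b") for byte in raw)
print(bit_string)

# Reverse the process: split into 8-bit groups and rebuild the bytes.
rebuilt = bytes(
    int(bit_string[i:i + 8], 2) for i in range(0, len(bit_string), 8)
)
print(rebuilt.decode("ascii"))
```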

<blockquote style="background-color: lightblue;">For fun with encoding, see the enigma machine code example below.  It converts only single terms now ... can you make it convert entire documents?  And back again?</blockquote>

<hr />
<p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">Demonstration 4: Pay attention to end-user display possibilities: preferences for encoding, language, time-zone, etc.</p>

In [77]:
""" Check user preferences """
import locale
# what does the user prefer?
locale.getpreferredencoding()

# On Windows (including Windows Server) you may get a different value,
# e.g., you'll likely find something like cp1252.

'UTF-8'

<hr /><p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">Demonstration 4a: Encoding mismatch errors!</p>

In [78]:
# python's open function handles most encoding/decoding issues,
# but just in case ... 

# NOTICE: the encoding='ascii' WILL CAUSE AN ERROR - 
# change to utf-8 and compare.
import unicodedata

path_to_file = '/users/gb/Documents/UCB-DataSci/Sp22/week_11_sp22/'

with open(path_to_file + 'data/unicode-file-test.txt', 
          encoding='ascii') as f:
    for line in f:
        print(repr(line))
        
# try the above again with a different encoding scheme, such as utf-8, and 
# compare: the ascii codec throws an exception on any byte above 127.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 153: ordinal not in range(128)
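<p>When you cannot control the source encoding, the <code>errors</code> argument of <code>decode()</code> (and of <code>open()</code>) decides what happens to bytes the codec cannot read.  The sketch below uses hypothetical in-memory sample bytes rather than the course data file.</p>

```python
# Hypothetical sample: UTF-8 bytes containing two non-ASCII characters.
raw = "caf\u00e8 \u660e".encode("utf-8")

# Default (strict) handling raises an exception on the first bad byte.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError as err:
    strict_failed = True
    print("strict decode failed:", err.reason)

# Alternatives: substitute U+FFFD, silently drop, or use the right codec.
replaced = raw.decode("ascii", errors="replace")
ignored = raw.decode("ascii", errors="ignore")
correct = raw.decode("utf-8")
print(repr(replaced), repr(ignored), repr(correct))
```

<p>Both <code>replace</code> and <code>ignore</code> lose information, so they are a last resort; decoding with the correct encoding is the only lossless option.</p>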

<hr />
<h2 style="background-color: #003262; color:white; border-radius: 4px; padding: 8px;">Optional: Student question about MySQL and python</h2>
<p>The &ldquo;config&rdquo; data you see usually is in a separate text file (like "config.ini").  Here just to demonstrate the usual content.</p>
<p><b>NOTE</b> This script won't run 'cause you need other libraries and an actual server, etc.  But you see the syntax.</p>

In [2]:
""" NOTE!  Different approaches (procedural versus object-oriented), 
different servers (Apache v. AWS and others), and versions may require different 
libraries.  As of Nov 2023, there are a couple of ways to integrate Python + MySQL 
(or other RDBMS/NoSQL).
For Web environments (tho cgi for python may be deprecated) ... here are 
the usual libraries:
import cgitb, cgi, sys, io, subprocess, traceback, os, stat, configparser
import datetime
from datetime import datetime
cgitb.enable() # this is optional - for showing error msgs in browser
from mysql.connector import MySQLConnection, Error
import mysql.connector
from mysql.connector import Error
from mysql.connector import errorcode
from configparser import ConfigParser # to read the config.ini file.
"""

import mysql.connector
from mysql.connector import errorcode
import datetime

""" NOTE here we have a different implementation of the SQL libraries ... 
from mysql.connector import MySQLConnection, Error
import mysql.connector
from mysql.connector import Error
from mysql.connector import errorcode
"""
from configparser import ConfigParser

config = {
    'user': 'root',
    'password': 'XXXXXXXX',
    'host': '127.0.0.1',
    'database': 'cog2',
    'raise_on_warnings': True
}

try:
    cnx = mysql.connector.connect(**config)
except mysql.connector.Error as err:
    if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
        print("Sorry, access denied.")
    elif err.errno == errorcode.ER_BAD_DB_ERROR:
        print("No such database.")
    else:
        print("Error:",err)
else:
    # start query
    cursor = cnx.cursor()
    query = ("SELECT recno, propName FROM property")
    
    cursor.execute(query)
    
    for (recno, propName) in cursor:
        print("ID {} {}".format(recno, propName))
    
    cursor.close()
    # end of query

    cnx.close()


Error: 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:3306' (61)
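<p>Since the connection settings usually live in a separate file such as "config.ini", here is a minimal sketch of reading them with <code>ConfigParser</code>.  The section name <code>[mysql]</code> and the file contents are assumptions for illustration; <code>read_string()</code> stands in for reading a real file so the example runs without one.</p>

```python
from configparser import ConfigParser

# Hypothetical contents of config.ini; section name and values are placeholders.
ini_text = """
[mysql]
user = root
password = XXXXXXXX
host = 127.0.0.1
database = cog2
"""

parser = ConfigParser()
parser.read_string(ini_text)   # with a real file: parser.read("config.ini")

# ConfigParser returns every value as text, so the boolean is added in code.
config = dict(parser["mysql"])
config["raise_on_warnings"] = True
print(config)
```

<p>The resulting dictionary has the same keys as the <code>config</code> dictionary in the script and could be passed to <code>mysql.connector.connect(**config)</code> in the same way.</p>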


<hr />
<p style="border-radius:3px;background-color:steelblue;padding:5px;color:white;">
    Note about ASCII, UTF-8 and 16 and the <code>encode</code> method.</p>
<p>A student asked about the difference between them. Let&rsquo;s compare the byte output and the command to convert.</p>

In [80]:
import unicodedata

print(unicodedata.name("T"))
print(unicodedata.name("\u0054"))

print(unicodedata.name("𓀀"))
print(unicodedata.name("\U00013000"))

LATIN CAPITAL LETTER T
LATIN CAPITAL LETTER T
EGYPTIAN HIEROGLYPH A001
EGYPTIAN HIEROGLYPH A001
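<p>To compare the byte output directly, the sketch below encodes the same two characters with <code>encode()</code> under ASCII, UTF-8, UTF-16, and UTF-32.  The big-endian codec names are an assumption made so that no byte-order mark is added; plain <code>"utf-16"</code> would prepend one.</p>

```python
encodings = ("ascii", "utf-8", "utf-16-be", "utf-32-be")

for ch in ("T", "\U00013000"):
    for encoding in encodings:
        try:
            data = ch.encode(encoding)
            print(f"U+{ord(ch):04X} {encoding:10} {len(data)} byte(s) {data.hex()}")
        except UnicodeEncodeError:
            # ASCII has no code for characters above 127.
            print(f"U+{ord(ch):04X} {encoding:10} cannot encode")

# Byte counts per encoding, collected for comparison.
t_sizes = [len("T".encode(e)) for e in encodings]
glyph_sizes = [len("\U00013000".encode(e)) for e in encodings[1:]]
print(t_sizes, glyph_sizes)
```

<p>The Latin letter grows from 1 byte to 4 as the encoding unit widens, while the hieroglyph needs 4 bytes in every Unicode encoding and cannot be written in ASCII at all.</p>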


<hr />
<p>End of this optional (but fascinating) notebook about encoding.</p>
<p>Continue, if you want, to the examples notebook.  Otherwise, off to the Activity.</p>
<hr />

In [6]:
# ----------------- Enigma Settings -----------------
rotors = ("I","II","III")
reflector = "UKW-B"
ringSettings ="ABC"
ringPositions = "DEF" 
plugboard = "AT BS DE FM IR KN LZ OW PV XY"
# ---------------------------------------------------

def caesarShift(str, amount):
	output = ""

	for i in range(0,len(str)):
		c = str[i]
		code = ord(c)
		if ((code >= 65) and (code <= 90)):
			c = chr(((code - 65 + amount) % 26) + 65)
		output = output + c	
	return output

def encode(plaintext):
	global rotors, reflector,ringSettings,ringPositions,plugboard
	#Enigma Rotors and reflectors
	rotor1 = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"
	rotor1Notch = "Q"
	rotor2 = "AJDKSIRUXBLHWTMCQGZNPYFVOE"
	rotor2Notch = "E"
	rotor3 = "BDFHJLCPRTXVZNYEIWGAKMUSQO"
	rotor3Notch = "V"
	rotor4 = "ESOVPZJAYQUIRHXLNFTGKDCMWB"
	rotor4Notch = "J"
	rotor5 = "VZBRGITYUPSDNHLXAWMJQOFECK"
	rotor5Notch = "Z" 
	
	rotorDict = {"I":rotor1,"II":rotor2,"III":rotor3,"IV":rotor4,"V":rotor5}
	rotorNotchDict = {"I":rotor1Notch,"II":rotor2Notch,"III":rotor3Notch,"IV":rotor4Notch,"V":rotor5Notch}	
	
	reflectorB = {"A":"Y","Y":"A","B":"R","R":"B","C":"U","U":"C","D":"H","H":"D","E":"Q","Q":"E","F":"S","S":"F","G":"L","L":"G","I":"P","P":"I","J":"X","X":"J","K":"N","N":"K","M":"O","O":"M","T":"Z","Z":"T","V":"W","W":"V"}
	reflectorC = {"A":"F","F":"A","B":"V","V":"B","C":"P","P":"C","D":"J","J":"D","E":"I","I":"E","G":"O","O":"G","H":"Y","Y":"H","K":"R","R":"K","L":"Z","Z":"L","M":"X","X":"M","N":"W","W":"N","Q":"T","T":"Q","S":"U","U":"S"}
	
	alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	rotorANotch = False
	rotorBNotch = False
	rotorCNotch = False
	
	if reflector=="UKW-B":
		reflectorDict = reflectorB
	else:
		reflectorDict = reflectorC
	
	#A = Left,	B = Mid,	C=Right 
	rotorA = rotorDict[rotors[0]]
	rotorB = rotorDict[rotors[1]]
	rotorC = rotorDict[rotors[2]]
	rotorANotch = rotorNotchDict[rotors[0]]
	rotorBNotch = rotorNotchDict[rotors[1]]
	rotorCNotch = rotorNotchDict[rotors[2]]
	
	rotorALetter = ringPositions[0]
	rotorBLetter = ringPositions[1]
	rotorCLetter = ringPositions[2]
	
	rotorASetting = ringSettings[0]
	offsetASetting = alphabet.index(rotorASetting)
	rotorBSetting = ringSettings[1]
	offsetBSetting = alphabet.index(rotorBSetting)
	rotorCSetting = ringSettings[2]
	offsetCSetting = alphabet.index(rotorCSetting)
	
	rotorA = caesarShift(rotorA,offsetASetting)
	rotorB = caesarShift(rotorB,offsetBSetting)
	rotorC = caesarShift(rotorC,offsetCSetting)
	
	if offsetASetting > 0:
		rotorA = rotorA[26-offsetASetting:] + rotorA[0:26-offsetASetting]
	if offsetBSetting > 0:
		rotorB = rotorB[26-offsetBSetting:] + rotorB[0:26-offsetBSetting]
	if offsetCSetting > 0:
		rotorC = rotorC[26-offsetCSetting:] + rotorC[0:26-offsetCSetting]

	ciphertext = ""
  
	#Convert plugboard settings into a dictionary
	plugboardConnections = plugboard.upper().split(" ")
	plugboardDict = {}

	for pair in plugboardConnections:
		if len(pair)==2:
			plugboardDict[pair[0]] = pair[1]
			plugboardDict[pair[1]] = pair[0]
  
	plaintext = plaintext.upper()  
	for letter in plaintext:
		encryptedLetter = letter  
	
		if letter in alphabet:
	  		#Rotate Rotors - This happens as soon as a key is pressed, before encrypting the letter!
	  		rotorTrigger = False
	  		#Third rotor rotates by 1 for every key being pressed
		if rotorCLetter == rotorCNotch:
			rotorTrigger = True 
			rotorCLetter = alphabet[(alphabet.index(rotorCLetter) + 1) % 26]
		
		#Check if rotorB needs to rotate
		if rotorTrigger:
			rotorTrigger = False
		if rotorBLetter == rotorBNotch:
			rotorTrigger = True 
			rotorBLetter = alphabet[(alphabet.index(rotorBLetter) + 1) % 26]
  
		#Check if rotorA needs to rotate
		if (rotorTrigger):
			rotorTrigger = False
			rotorALetter = alphabet[(alphabet.index(rotorALetter) + 1) % 26]
		else:
			#Check for double step sequence!
			if rotorBLetter == rotorBNotch:
				rotorBLetter = alphabet[(alphabet.index(rotorBLetter) + 1) % 26]
				rotorALetter = alphabet[(alphabet.index(rotorALetter) + 1) % 26]
	
		#Implement plugboard encryption!
		if letter in plugboardDict.keys():
			if plugboardDict[letter]!="":
				encryptedLetter = plugboardDict[letter]
				
		#Rotors & Reflector Encryption
		offsetA = alphabet.index(rotorALetter)
		offsetB = alphabet.index(rotorBLetter)
		offsetC = alphabet.index(rotorCLetter)

		# Wheel 3 Encryption
		pos = alphabet.index(encryptedLetter)
		let = rotorC[(pos + offsetC) % 26]
		pos = alphabet.index(let)
		encryptedLetter = alphabet[(pos - offsetC +26)%26]

		# Wheel 2 Encryption
		pos = alphabet.index(encryptedLetter)
		let = rotorB[(pos + offsetB)%26]
		pos = alphabet.index(let)
		encryptedLetter = alphabet[(pos - offsetB +26)%26]
	  
		# Wheel 1 Encryption
		pos = alphabet.index(encryptedLetter)
		let = rotorA[(pos + offsetA)%26]
		pos = alphabet.index(let)
		encryptedLetter = alphabet[(pos - offsetA +26)%26]
	  
		# Reflector encryption!
		if encryptedLetter in reflectorDict.keys():
			if reflectorDict[encryptedLetter] != "":
				encryptedLetter = reflectorDict[encryptedLetter]
	  
		#Back through the rotors 
		# Wheel 1 Encryption
		pos = alphabet.index(encryptedLetter)
		let = alphabet[(pos + offsetA) % 26]
		pos = rotorA.index(let)
		encryptedLetter = alphabet[(pos - offsetA +26)%26] 
	  
		# Wheel 2 Encryption
		pos = alphabet.index(encryptedLetter)
		let = alphabet[(pos + offsetB)%26]
		pos = rotorB.index(let)
		encryptedLetter = alphabet[(pos - offsetB +26)%26]
	  
		# Wheel 3 Encryption
		pos = alphabet.index(encryptedLetter)
		let = alphabet[(pos + offsetC)%26]
		pos = rotorC.index(let)
		encryptedLetter = alphabet[(pos - offsetC +26)%26]
	  
		#Implement plugboard encryption!
		if encryptedLetter in plugboardDict.keys():
			if plugboardDict[encryptedLetter] != "":
				encryptedLetter = plugboardDict[encryptedLetter]

		ciphertext = ciphertext + encryptedLetter
	return ciphertext

#Main Program Starts Here
print("  ##### Enigma Encoder #####")
print("")
plaintext = input("Enter a single word text to encode or decode:\n")
ciphertext = encode(plaintext)

print("\nEncoded text: \n " + ciphertext)

  ##### Enigma Encoder #####

Enter a single word text to encode or decode:
fish

Encoded text: 
 YXOR


<p>As of Mar 13, 2022 - GB; Updated Nov 11, 2023 GB</p>