Text and Testing
===================

This section covers exercises that work with text data, as well as exercises on
Python validation and testing.

## Encoding/Decoding

For this first exercise we are just going to get familiar with how Python encodes and decodes strings.

- Let's create a string variable named `utf_test` that contains the Unicode character Ã (\u00C3)

<Answer
utf_test = '\u00C3'
print(utf_test)
>

- Next, encode `utf_test` into bytes and assign the result to `utf_binary`

<Answer
utf_binary = utf_test.encode()
print(utf_binary)
>

- Now decode `utf_binary` back into a string and print the result

<Answer
print(utf_binary.decode())
>

- Now let's encode `utf_test` as ASCII into the variable `ascii_test` (Ã has no ASCII equivalent, so an error handler such as `'replace'` is needed)

<Answer
ascii_test = utf_test.encode('ascii', 'replace')
print(ascii_test)
>

- Finally, decode the ASCII version and print the result

<Answer
print(ascii_test.decode('ascii'))
>
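
The steps above can be collected into one round trip. This is a minimal sketch rather than the required answer; it assumes the default `encode()` codec (UTF-8) and adds an `ascii_strict_failed` flag, a name invented here, to show what happens without an error handler.

```python
utf_test = '\u00C3'

# encode() defaults to UTF-8, where this character takes two bytes.
utf_binary = utf_test.encode()
print(utf_binary)

# Decoding with the same codec restores the original string.
round_trip = utf_binary.decode()
print(round_trip == utf_test)

# ASCII cannot represent the character. The default 'strict' handler raises,
# while 'replace' substitutes a question mark.
try:
    utf_test.encode('ascii')
    ascii_strict_failed = False
except UnicodeEncodeError:
    ascii_strict_failed = True

ascii_test = utf_test.encode('ascii', 'replace')
print(ascii_test.decode('ascii'))
```

Note that the ASCII round trip is lossy: the decoded value is `?`, not the original character.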

## Working with Regex

For this next section we are going to run through a couple of regular expression tests.   

In [None]:
import re

- First, let's create a credit card number regex that captures the last 4 digits in a group (the number may be written with or without spaces)

<Answer
cc_regex = r'(\d{4}\s*){4}'
>

In [None]:
cc_regex = ''   # TODO

test_1 = '1234 5683 2343 2432'
test_1_result = re.search(cc_regex, test_1)

print(test_1_result.groups())   # Should have 2432

test_2 = '4333444455556666'
test_2_result = re.search(cc_regex, test_2)

print(test_2_result.groups())   # Should have 6666
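
A repeated capturing group keeps only its final repetition, which is why a pattern of the form `(...){4}` ends up holding the last 4 digits. The sketch below compares that approach with a hypothetical alternative, `explicit_last`, that captures the final block directly; both names are illustrative and not part of the exercise.

```python
import re

# One capturing group repeated four times: only the last repetition is kept.
repeated_group = r'(\d{4}\s*){4}'

# Hypothetical alternative: skip the first three blocks without capturing
# them, then capture the final block explicitly.
explicit_last = r'(?:\d{4}\s*){3}(\d{4})'

samples = ['1234 5683 2343 2432', '4333444455556666']
last_four = {
    sample: (
        re.search(repeated_group, sample).group(1),
        re.search(explicit_last, sample).group(1),
    )
    for sample in samples
}

for sample, (from_repeated, from_explicit) in last_four.items():
    print(sample, '->', from_repeated, from_explicit)
```

The explicit form is a little longer but makes the intent visible to the reader, which may matter more than brevity in a validation pattern.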

- Next, let's build a slightly more involved regex that matches an email address using Gmail's `+` filtering
  * Capture the user name (the part before the `+`)
  * Capture the filter (the part between the `+` and the `@`)
  * Capture the domain
  
<Answer - Easy
email_regex = r'([\w\.]+)'

test_1 = 'tom.riddle+regexone@hogwarts.com'
test_1_result = re.findall(email_regex, test_1)

print(test_1_result)
>

<Answer - Named
email_regex = r'(?P<name>[\w\.]+)\+?(?P<filter>[\w\.]*)@(?P<domain>[\w\.]+)'
>

In [None]:
email_regex = ''    # TODO

test_1 = 'mk.wright+myfilter@uvu.edu'
test_1_result = re.search(email_regex, test_1)

print(test_1_result.groups())
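
With named groups, `groupdict()` returns the captured parts keyed by name, which is often easier to read than positional `groups()`. This is a small sketch using the named pattern from the answer; the second address, without a filter, is an assumed extra test case.

```python
import re

named_email_regex = (
    r'(?P<name>[\w\.]+)\+?(?P<filter>[\w\.]*)@(?P<domain>[\w\.]+)'
)

with_filter = re.search(named_email_regex, 'mk.wright+myfilter@uvu.edu')
print(with_filter.groupdict())

# Assumed extra case: an address with no +filter part.
# The optional pieces match nothing, so 'filter' is an empty string.
without_filter = re.search(named_email_regex, 'tom.riddle@hogwarts.com')
print(without_filter.groupdict())
```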

## Binary Sample

For this sample set we are going to play around with binary data, and maybe even unpack a file.  

In [None]:
import struct

- Let's create a byte string of the ASCII word 'HELLO' from a list of numbers and store it in the variable `word_bin`

<Answer
word_bin = bytes([72, 69, 76, 76, 79])
>

- Now we will mutate `word_bin` so that it reads HLELO (`bytes` objects are immutable, so convert to a `bytearray` first)

<Answer
word_bin = bytearray(word_bin)
word_bin[1], word_bin[2] = word_bin[2], word_bin[1]
>
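
The conversion to `bytearray` is needed because `bytes` does not support item assignment. A minimal sketch of the difference follows; `assignment_failed` and `mutable_word` are names made up for this illustration.

```python
word_bin = bytes([72, 69, 76, 76, 79])
print(word_bin)

# bytes objects are immutable: item assignment raises TypeError.
try:
    word_bin[1] = 76
    assignment_failed = False
except TypeError:
    assignment_failed = True

# bytearray is the mutable counterpart, so swapping in place works.
mutable_word = bytearray(word_bin)
mutable_word[1], mutable_word[2] = mutable_word[2], mutable_word[1]
print(mutable_word)
```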

- For this exercise we are going to read a file (notes-07/sample.png) in binary mode
  * First, check the file signature to make sure it is a PNG
  * Then read the width and height from the header
  
<Answer
with open('notes-07/sample.png', 'rb') as f:
    results = f.read()
    
print(len(results))
png_header = b'\x89PNG\r\n\x1a\n'

if results[:8] == png_header:
    # PNG stores width and height as big-endian 4-byte unsigned integers
    width, height = struct.unpack('>LL', results[16:24])
    print(width, height)
else:
    print('Not a png')
>
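
The same check can be tried without the sample file by building the first 24 bytes of a PNG in memory. This is a sketch under stated assumptions: `fake_png`, `png_size`, and the 640x480 dimensions are invented for illustration. PNG stores multi-byte integers big-endian, so the format string starts with `>`, which also fixes each field at 4 bytes regardless of platform.

```python
import struct

png_header = b'\x89PNG\r\n\x1a\n'

# Hypothetical stand-in for the start of a PNG file:
# 8-byte signature, 4-byte chunk length (13), b'IHDR', width, height.
fake_png = png_header + struct.pack('>I4sII', 13, b'IHDR', 640, 480)

def png_size(data):
    """Return (width, height) if data starts like a PNG, else None."""
    if data[:8] != png_header or data[12:16] != b'IHDR':
        return None
    # Width and height sit at byte offsets 16-23 of the file.
    return struct.unpack('>II', data[16:24])

print(png_size(fake_png))
print(png_size(b'definitely not a png file'))
```

Checking for the `IHDR` chunk name in addition to the signature is an extra safeguard added here; the exercise itself only asks for the signature check.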

In [None]:
png_header = b'\x89PNG\r\n\x1a\n'