<a data-flickr-embed="true" href="https://www.flickr.com/photos/kirbyurner/51883694941/in/album-72177720296706479/" title="week2_schedule"><img src="https://live.staticflickr.com/65535/51883694941_84ef7655e9.jpg" width="359" height="500" alt="week2_schedule"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script>

# Session 4:  Clarusway Mini-Bootcamp

* Boolean Logic Expressions
* Truth Values of Logic Statements
* The Strength of the String

### Useful Links (not always unique)

* [This Notebook on Colab](https://colab.research.google.com/github/4dsolutions/bootcamp/blob/main/session2.ipynb)
* [This Notebook on nbviewer](https://nbviewer.org/github/4dsolutions/bootcamp/blob/main/session2.ipynb)
* [Common String Operations](https://docs.python.org/3/library/string.html)
* [Example StackOverflow Question](https://stackoverflow.com/questions/432842/how-do-you-get-the-logical-xor-of-two-variables-in-python)
* [StackOverflow: Basic BitFlipper](https://stackoverflow.com/questions/27958292/basic-bit-flip-algorithm)
* [json module](https://docs.python.org/3/library/json.html)
* [numpy package](https://numpy.org/)
* [pandas package](https://pandas.pydata.org/)
* [Python Data Science Handbook by Jake VanderPlas](https://jakevdp.github.io/PythonDataScienceHandbook/)
* [Free Books by Allen Downey](https://greenteapress.com/wp/)
* [scikit-learn](https://scikit-learn.org/stable/)
* [Course Album](https://flic.kr/s/aHBqjzCs82)
* [Course Repository](https://github.com/4dsolutions/bootcamp)

### Glossary of Terms (not alphabetical)

* The Turk: apparent chess playing robot that was actually a hoax (it beat Napoleon)
* Data Munging:  cleaning and regularizing data to make it usable in computations
* ML: Machine Learning, training models to predict correctly
* DL: Deep Learning: a model training technique
* numpy package: n-dimensional array object, linear algebra
* pandas package:  DataFrames containing Series containing numpy arrays
* JS: JavaScript
* Angular: JS framework from Google
* React: JS framework from Facebook
* HTML:  Hypertext Markup Language (version 5 most recent)
* CSS: Cascading Style Sheet (version 3 most recent)
* scikit-learn: Machine Learning in Python
* TensorFlow: ML for Python from Google
* PyTorch: ML for Python from Facebook

## Boolean Logic Expressions

`True` and `False` are keywords that also act as names for the integers 1 and 0 respectively.  We say the bool type is a subclass of the int type.  This is part of Python's design, not a feature shared by all languages.

In [None]:
issubclass(bool, int)
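
As a small illustration of what that subclass relationship implies, the sketch below treats `True` and `False` as the integers they are.  The name `votes` is made up for this example.

```python
# bool is a subclass of int, so True and False take part in arithmetic
total = True + True           # same as 1 + 1
same_as_one = (True == 1)     # equality follows the integer values
votes = [True, False, True, True]
yes_count = sum(votes)        # counting the Trues by adding them up

print(total, same_as_one, yes_count)
print(isinstance(True, int))
```

Summing a list of booleans is a common way to count how many conditions held.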

## Truth Values of Logic Statements

The boolean operators `and`, `or` and `not` combine or negate truth values.  Tests such as `in` and `not in` produce the True or False values these operators work on.

In [None]:
not "c" in "cat"

In [None]:
"c" not in "cat" # allowed syntax

In [None]:
("c" in "cat") and ("d" in "dog")

Remember:  special names (`__ribs__`) inside the various types control what the operators mean and do.  The `in` keyword may be considered an operator, and `a in b` is equivalent to `b.__contains__(a)`.  This second, uglier way of saying it is just a reminder that, when the time comes, you-the-programmer have the power to take control of `in` with respect to your own types.

In [None]:
"cat".__contains__("c")

The code below is a preview of the rest of Python:  defining your own types with their own methods.  The keywords `class` and `def` have not yet been formally introduced.  However, we have talked quite a bit about types versus instances.

Below, the Animal type is defined such that `in` always triggers a `True` response.  This code is for any animal instance.  We still have to create those instances, by calling the Animal type.

In [None]:
class Animal:
    """
    Preview of defining a new type: Animal
    with one method, triggered by 'in' 
    that always returns True
    """
    def __contains__(self, thing):  # <-- special name in action
        print("Yes, always")
        return True

In [None]:
a = Animal()  # calling the Animal type to get an instance
"b" in a

In [None]:
("c" in "cat") or (1/0)  # 2nd condition never visited if first is True
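
The short-circuit behavior shown with `or` applies to `and` as well.  The sketch below makes it visible by recording which operands actually get evaluated.  The helper `noisy` and the list `calls` are hypothetical names invented for this illustration, and `def` is used ahead of its formal introduction.

```python
def noisy(value, log):
    """Hypothetical helper: record the call in log, then hand value back."""
    log.append(value)
    return value

calls = []
result_or = noisy(True, calls) or noisy(False, calls)    # right side skipped
result_and = noisy(False, calls) and noisy(True, calls)  # right side skipped
fallback = "" or "default"   # 'or' returns an operand, not necessarily a bool

print(result_or, result_and, calls, fallback)
```

Only two calls are recorded, one per expression, because in both cases the left operand already settles the answer.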

In [None]:
a = "c" in "cat"
b = "d" in "dog"
(a and not b) or (b and not a)  # exclusive or

At the foundation of Boolean Logic, we have what are called bitwise operations, which operate bit-for-bit, applying the logic of and, or, xor.

In [None]:
from operator import xor

xor(a,b)
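
As a quick check that the spelled-out expression and `xor` agree, here is a minimal sketch using the stand-in names `p` and `q` (chosen so as not to disturb `a` and `b`).  For two booleans, `p ^ q` and `p != q` give the same answer too.

```python
from operator import xor

p, q = True, False
long_form = (p and not q) or (q and not p)
print(long_form, xor(p, q), p ^ q, p != q)   # the four spellings agree

p, q = True, True
long_form = (p and not q) or (q and not p)
print(long_form, xor(p, q), p ^ q, p != q)   # and agree again
```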

The Python code:

```python
f"{bitwise:04b}"
```

means:  find `bitwise` in the current namespace and substitute its value expressed in binary, padding with 0s on the left if necessary, and 4 wide.  The prefix "f" means "format".  An alternative, in older Pythons, is to explicitly feed the required names through the string type's `format` method.

```python
"{:04b}".format(bitwise)  # no placeholder name needed
```
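
The sketch below puts the spellings side by side, with an example value for `bitwise` that is assumed purely for illustration.  The built-in `format` function accepts the same format specification, and changing the width or the type letter changes the result accordingly.

```python
bitwise = 0b0101                        # example value, assumed for illustration

with_fstring = f"{bitwise:04b}"         # f-string
with_method = "{:04b}".format(bitwise)  # str.format method
with_builtin = format(bitwise, "04b")   # built-in format function
wider = f"{bitwise:08b}"                # 8 wide instead of 4
as_hex = f"{bitwise:02x}"               # hexadecimal, 2 wide

print(with_fstring, with_method, with_builtin, wider, as_hex)
```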

Here in the Jupyter Notebook context, the last row in a cell is automatically evaluated for output.  Inside a conventional program, `print` would be needed to send output to the console and/or names would be assigned so that results could have some use.

In [None]:
bitwise = 0b1000 & 0b1100  # and the bits
f"{bitwise:04b}"

In [None]:
bitwise = 0b1000 | 0b1100  # or the bits
f"{bitwise:04b}"

In [None]:
bitwise = 0b1000 ^ 0b1101  # xor the bits
f"{bitwise:04b}"

In [None]:
bin(0b1000 & 0b1100)  # another approach

In [None]:
bitwise = ~0b11100111 & 0xff  # see Basic BitFlipper in Links above
f"{bitwise:08b}"

In [None]:
f"{-bitwise:08b}"
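
Why the `& 0xff` in the bit flipping cell?  Python integers have no fixed width, so `~` on its own yields a negative number rather than an 8-bit pattern.  The sketch below, using the stand-in names `value`, `flipped`, `masked` and `via_xor`, shows the mask at work, along with xor against all ones as another way to flip eight bits.

```python
value = 0b11100111
flipped = ~value            # unbounded ints: this equals -(value + 1)
masked = flipped & 0xff     # keep only the low 8 bits
via_xor = value ^ 0xff      # xor with eight 1s flips each of the 8 bits

print(flipped)
print(f"{masked:08b}", f"{via_xor:08b}")
```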

In [None]:
"copyright" in dir(__builtins__)

## The Strength of the String

The string type is core to information processing because so much information is communicated using a text encoding, be that Unicode, ASCII or something else.

The other core data exchange type is binary, and knowing the file type in the case of binary is just as critical as knowing the encoding type in the case of text.  Is this a picture, music, or other file?

Text files have the advantage of meaning something to the eyes, i.e. they tend to be human readable to a far greater extent than random bytes in a binary file.
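
To see why knowing the encoding matters, here is a minimal sketch.  The sample string and the variable names are made up for this illustration.  The same text becomes different bytes under different encodings, and decoding with the wrong one gives garbled output instead of an error.

```python
text = "naïve Σ"                          # sample text, assumed for illustration

as_utf8 = text.encode("utf-8")            # ï and Σ take two bytes each
as_utf16 = text.encode("utf-16-le")       # two bytes per character here
round_trip = as_utf8.decode("utf-8")      # right encoding: text restored
wrong = as_utf8.decode("latin-1")         # wrong encoding: no error, but garbled

print(len(text), len(as_utf8), len(as_utf16))
print(round_trip)
print(wrong)
```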

### Strings and File I/O

This section illustrates working with persistent text files, through file type objects.  Text files typically need to be "parsed".  From Wikipedia: 

<blockquote>Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term parsing comes from Latin pars, meaning part.</blockquote>

In [None]:
file_object = open("links.txt", "r")

In [None]:
sample = file_object.readlines()[5:10]

In [None]:
file_object.close()

In [None]:
sample

In [None]:
sample[0]

In [None]:
sample[0][:-1]

In [None]:
sample[0][:-2].replace("* [","").split("](")

In [None]:
sample[1][:-2].replace("* [","").split("](")
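
The same steps can be sketched with a `with` block, which closes the file automatically.  So that the example runs anywhere, it writes its own small file first.  The file name `links_demo.txt` and the two sample lines are assumptions for the sake of the demonstration, not the course's `links.txt`.

```python
import os
import tempfile

lines = [
    "* [json module](https://docs.python.org/3/library/json.html)\n",
    "* [numpy package](https://numpy.org/)\n",
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "links_demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.writelines(lines)
    with open(path, "r") as f:       # closed automatically when the block ends
        sample_lines = f.readlines()

first = sample_lines[0].rstrip("\n")    # drop the trailing newline
label, url = first[3:-1].split("](")    # trim "* [" and ")", split in the middle
print(label)
print(url)
```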

In [None]:
file_object = open("glossary.txt", "r")

In [None]:
sample = file_object.readlines()[5:10]

In [None]:
file_object.close()

In [None]:
sample

In [None]:
sample[3][2:-1].split(": ")

In [None]:
sample[4][2:-1].split(": ")
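
One thing to watch for when splitting on `": "` is that a definition may itself contain a colon.  The sketch below uses a glossary line typed in by hand (not read from `glossary.txt`) and shows the optional second argument to `split`, which limits the number of splits.

```python
entry = "* DL: Deep Learning: a model training technique\n"

naive = entry[2:-1].split(": ")                 # splits at every ": "
term, definition = entry[2:-1].split(": ", 1)   # splits at the first ": " only

print(naive)
print(term)
print(definition)
```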

## Sandboxed Review: UTF-8, Bits and Bytes

Let's write code to accept any UTF-8 encoded character in order to:

* figure out how many bytes it requires
* show those bytes
* show payload bits only
* convert payload to hex
* compare with published hex value using requests (http)


In [None]:
input_char = "Σ" # "T" "😊"  <--- several candidates for analysis

byte_string = bytes(input_char, encoding='UTF-8')
howmany_bytes = len(byte_string)
print("Bytes:", howmany_bytes)

In [None]:
byte_string

In [None]:
f"{byte_string[0]:08b}"

In [None]:
bin(byte_string[0])

| Byte0    |   | Byte1    |   | Byte2    |   | Byte3    |   |
|----------|---|----------|---|----------|---|----------|---|
| 0xxxxxxx |   |          |   |          |   |          |   |
| 110xxxxx |   | 10xxxxxx |   |          |   |          |   |
| 1110xxxx |   | 10xxxxxx |   | 10xxxxxx |   |          |   |
| 11110xxx |   | 10xxxxxx |   | 10xxxxxx |   | 10xxxxxx |   |
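
To connect the table to real characters, the sketch below encodes one character of each length and shows the leading byte in binary.  The four sample characters are choices made for this illustration.

```python
one = "T".encode("utf-8")      # 1 byte
two = "Σ".encode("utf-8")      # 2 bytes
three = "€".encode("utf-8")    # 3 bytes
four = "😊".encode("utf-8")    # 4 bytes

print(len(one), f"{one[0]:08b}")      # leading byte starts with 0
print(len(two), f"{two[0]:08b}")      # leading byte starts with 110
print(len(three), f"{three[0]:08b}")  # leading byte starts with 1110
print(len(four), f"{four[0]:08b}")    # leading byte starts with 11110
print(f"{two[1]:08b}")                # continuation byte starts with 10
```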

The code below could be shortened quite a bit using looping constructs.  However at this point in our boot camp, those looping constructs are ahead of us.  Once you learn them, think about coming back and writing some shorter functions.

The strategy below is to tease the payload apart from the structural bits.  The Markdown table above, [from a Table Generator](https://www.tablesgenerator.com/markdown_tables#), reminds us how UTF-8 works.

In [None]:
payload = ""

if howmany_bytes == 4:
    byte0 = f"{byte_string[0]:08b}"
    byte1 = f"{byte_string[1]:08b}"
    byte2 = f"{byte_string[2]:08b}"
    byte3 = f"{byte_string[3]:08b}"
    payload += byte0[5:]
    payload += byte1[2:]
    payload += byte2[2:]
    payload += byte3[2:]
    print("Bytes", byte0, byte1, byte2, byte3, sep="\n Byte:   ")

if howmany_bytes == 3:
    byte0 = f"{byte_string[0]:08b}"
    byte1 = f"{byte_string[1]:08b}"
    byte2 = f"{byte_string[2]:08b}"
    payload += byte0[4:]
    payload += byte1[2:]
    payload += byte2[2:]
    print("Bytes", byte0, byte1, byte2, sep="\n Byte:   ")
    
if howmany_bytes == 2:
    byte0 = f"{byte_string[0]:08b}"
    byte1 = f"{byte_string[1]:08b}"
    payload += byte0[3:]  # 110xxxxx: skip only the 3 marker bits
    payload += byte1[2:]
    print("Bytes", byte0, byte1, sep="\n Byte:   ")
    
if howmany_bytes == 1:
    byte0 = f"{byte_string[0]:08b}"
    payload += byte0
    print("Bytes", byte0, sep="\n Byte:   ")

print()
print("Payload:", payload)
code_point = int(payload, 2)
hex_point = f"{code_point:04x}".upper()
print("Code Point (hex):", hex_point)
print("Code Point (dec):", code_point)
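
For readers coming back after learning loops and functions, here is one possible shorter version, offered as a sketch rather than the definitive rewrite.  The function name `utf8_payload` is made up for this example.  It relies on the pattern in the table: an n-byte character (n > 1) has n + 1 marker bits in its first byte, and every following byte has two.  The built-in `ord` gives an independent check on the code point.

```python
def utf8_payload(char):
    """Return the payload bits of one character's UTF-8 encoding."""
    raw = char.encode("utf-8")
    bits = [f"{b:08b}" for b in raw]     # each byte as an 8-character string
    if len(raw) == 1:
        return bits[0][1:]               # 0xxxxxxx: drop the single marker bit
    lead_skip = len(raw) + 1             # 110, 1110, 11110: n ones plus a zero
    payload = bits[0][lead_skip:]
    for cont in bits[1:]:
        payload += cont[2:]              # 10xxxxxx: drop the two marker bits
    return payload

for ch in "TΣ€😊":
    code_point = int(utf8_payload(ch), 2)
    print(ch, f"{code_point:04X}", code_point == ord(ch))
```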

In [None]:
import requests
# response = requests.get("https://www.compart.com/en/unicode/U+1F60A")
response = requests.get("https://www.compart.com/en/unicode/U+03A3")

In [None]:
response.status_code

In [None]:
start = response.text.index("<title ")
stop = response.text.find("</title>")
print(response.text[start:stop+8])
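
When no network is available, a similar cross-check can be sketched with the standard library's `unicodedata` module, swapped in here for `requests`.  The names `char`, `label` and `url` are assumptions for this example, and `url` only rebuilds the address pattern used above without fetching it.

```python
import unicodedata

char = "Σ"
code_point = ord(char)                    # the code point as an integer
label = f"U+{code_point:04X}"             # conventional U+XXXX spelling
name = unicodedata.name(char)             # official Unicode character name
url = f"https://www.compart.com/en/unicode/{label}"   # not fetched here

print(label, name)
print(url)
```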