## Python Fundamentals

This notebook illustrates basic Python programming functionality.

<ol>
    <li>Fundamental to any programming language is defining variables and assigning values to them.  Everything in Python is an <b>object</b>; datatypes are therefore <i>classes</i>, and the values that variables refer to are <i>instances</i> of those classes (aka objects). We illustrate working with variables that hold singular (aka unary) values, as well as various structures for holding multiple values (e.g., Sets, Tuples, Lists and Dictionaries). We also illustrate how to reference specific values, or ranges of values, within those multi-value data structures. What's more, since multi-value data structures are inherently <i>iterable</i>, we also demonstrate the syntax for <i>iterating</i> over the values contained within these structures.</li>
    <li>It is often necessary to convert (or cast) values between datatypes. Since this most commonly occurs between numerical and string datatypes, we demonstrate converting both unary values and multi-value data structures.</li>
    <li>Finally, when functionality that requires three or more lines of code may be utilized more than once within a program, it is customary to <i>encapsulate</i> that functionality within a <b>function</b> that can subsequently be <i>called</i> multiple times from arbitrary locations within that program.  We will illustrate the syntax for defining functions, and in doing so, we will also demonstrate how to validate the datatype of the values being passed into those functions, and how to raise errors when those validations fail.</li>
</ol>

### 1.0. Importing Libraries

Because Jupyter Notebooks are linear and incremental in nature, any libraries, variables or functions must first be entered into memory by executing the cell in which they first appear before they will be available for use in any subsequent cells.  It is customary to 'import' any required libraries at the top of the notebook where they can be managed in a singular location.

In [8]:
import pandas as pd
import numpy as np
import numbers

In [None]:
!pip install beautifulsoup4
# The beautifulsoup4 package is imported under the module name 'bs4'.
import bs4 as bs

### 2.0. Variables: Unary Values

Here we demonstrate how to assign values to variables that hold singular (aka unary) values.  These include the three numeric datatypes native to the Python language (integers, floats and complex numbers), as well as strings. The **string** datatype is a special kind of datatype called a **sequence**: an ordered, immutable collection of zero or more Unicode characters (code points), delimited by single quotes, double quotes, or triple quotes. Since there is no dedicated character datatype in Python, a single character is simply a **string** having a length of one (1).

In [43]:
my_integer = 42
my_float = 3.14
my_complex = 2 + 4j

print(f"The variable 'my_integer' contains the value: {my_integer} and is of datatype: '{type(my_integer)}'")
print(f"The variable 'my_float' contains the value: {my_float} and is of datatype: '{type(my_float)}'")
print(f"The variable 'my_complex' contains the value: {my_complex} and is of datatype: '{type(my_complex)}'")

The variable 'my_integer' contains the value: 42 and is of datatype: '<class 'int'>'
The variable 'my_float' contains the value: 3.14 and is of datatype: '<class 'float'>'
The variable 'my_complex' contains the value: (2+4j) and is of datatype: '<class 'complex'>'


In [47]:
my_string = 'This is an ordered collection of Unicode characters. Delimited by single-quotes'
print(f"The variable 'my_string' contains the value: '{my_string}' and is of datatype: '{type(my_string)}'")

my_string2 = "This is a string delimited by double-quotes."
print(f"The variable 'my_string2' contains the value: '{my_string2}' and is of datatype: '{type(my_string2)}'")

my_string3 = '''This illustrates how
    triple-quotes can be used to delimit
    strings where carriage returns and 
    other white-space formatting make sense.
    For example:
    
    SELECT first_name
        , last_name
        , middle_initial
        , date_of_birth
    FROM customers
    WHERE last_name LIKE 'M%';
'''
print(f"The variable 'my_string3' contains the value: '{my_string3}' and is of datatype: '{type(my_string3)}'")

The variable 'my_string' contains the value: 'This is an ordered collection of Unicode characters. Delimited by single-quotes' and is of datatype: '<class 'str'>'
The variable 'my_string2' contains the value: 'This is a string delimited by double-quotes.' and is of datatype: '<class 'str'>'
The variable 'my_string3' contains the value: 'This illustrates how
    triple-quotes can be used to delimit
    strings where carriage returns and 
    other white-space formatting make sense.
    For example:
    
    SELECT first_name
        , last_name
        , middle_initial
        , date_of_birth
    FROM customers
    WHERE last_name LIKE 'M%';
' and is of datatype: '<class 'str'>'
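
Because a string is a sequence, its characters can be indexed, sliced and iterated much like the elements of the multi-value structures covered in the next section. The following is a minimal sketch of that behavior; the variable `greeting` is an illustrative name that does not appear elsewhere in this notebook.

```python
# 'greeting' is an illustrative variable, not one defined elsewhere in this notebook.
greeting = "Python"

first_char = greeting[0]      # indexing is zero-based
last_char = greeting[-1]      # negative indexes count from the end
first_three = greeting[:3]    # slicing returns a new string

print(first_char, last_char, first_three)   # P n Pyt
print(type(first_char), len(first_char))    # <class 'str'> 1
print(len(greeting))                        # 6

# Strings are immutable: item assignment raises a TypeError.
try:
    greeting[0] = "J"
except TypeError as error:
    print(f"Strings cannot be modified in place: {error}")
```

Note that indexing a string returns another string of length one, which is consistent with Python having no separate character datatype.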


### 3.0. Multi-Value Variables: Sets, Tuples, Lists and Dictionaries

Here we demonstrate declaring and assigning values to multi-value data structures.  Although Sets, Lists and Tuples appear to be quite similar, they differ in how they behave, and how their contents are stored in system memory.  All three data structures can be used to store heterogeneous data (i.e., a mixture of numeric, string and other datatypes), although the elements of a set must be hashable.

#### 3.1. The Set
The data elements within a **Set** are unordered: a **set** does not record element position or order of insertion, and as a result sets don't support indexing, slicing, or other sequence-like behavior. There are two types of **set** native to the Python language: **set**, which is mutable, and **frozenset**, which is immutable (i.e., its contents cannot be altered).  Because the **set** is mutable, its contents can be changed using methods like **add()** and **remove()**.  What's more, sets disallow duplicate values.

<p><i>If you pay careful attention to the result of executing the cell below, you will notice that duplicate values are ignored and that some elements may appear in a different order than was specified when the variable was declared and populated.</i></p>

In [None]:
my_set = {0, 1, 2, 3, 'One', 'Two', 'Three', 0, 1, 2, 3}
print(f"The variable 'my_set' contains the value: '{my_set}' and is of datatype: '{type(my_set)}'")

for i in my_set:
    print(i)
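
The mutability described above can be sketched as follows. The names `colors` and `frozen_colors` are illustrative and are not used elsewhere in this notebook; the example also shows `discard()`, a standard set method that, unlike `remove()`, does not raise an error for a missing element.

```python
# 'colors' and 'frozen_colors' are illustrative names, not part of the original notebook.
colors = {"red", "green", "blue", "red"}   # the duplicate 'red' is discarded
print(len(colors))                          # 3

colors.add("yellow")       # add a new element
colors.add("green")        # adding an existing element has no effect
colors.remove("blue")      # remove() raises KeyError if the element is absent
colors.discard("purple")   # discard() silently ignores a missing element

print("yellow" in colors)  # True
print("blue" in colors)    # False

# A frozenset has no add() or remove() methods because it is immutable.
frozen_colors = frozenset(colors)
print(hasattr(frozen_colors, "add"))   # False

# Sets do not support indexing.
try:
    colors[0]
except TypeError as error:
    print(f"Sets cannot be indexed: {error}")
```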

#### 3.2. The Tuple
The data elements within a **Tuple** are ordered and, like a **frozenset**, a tuple is immutable. However, unlike a **set**, a **tuple** can store duplicate values if needed.

In [None]:
my_tuple = (0, 1, 2, 3, 'One', 'Two', 'Three', 0, 1, 2, 3)
print(f"The variable 'my_tuple' contains the value: '{my_tuple}' and is of datatype: '{type(my_tuple)}'")

for i in my_tuple:
    print(i)
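
As a rough illustration of these properties, the sketch below indexes a tuple, counts a duplicated value, and shows what happens when code attempts to modify an element. The tuple `point` and the variables it is unpacked into are hypothetical names chosen for this example.

```python
# 'point' is an illustrative tuple, not part of the original notebook.
point = (3, 4, 3, "label")

print(point[0])        # 3  (tuples support indexing)
print(point[1:3])      # (4, 3)
print(point.count(3))  # 2  (duplicates are kept)

# Tuples are immutable: item assignment raises a TypeError.
try:
    point[0] = 10
except TypeError as error:
    print(f"Tuples cannot be modified in place: {error}")

# Unpacking assigns each element to its own variable.
x, y, z, name = point
print(x, y, z, name)   # 3 4 3 label
```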

#### 3.3. The List
The data elements within a **List** are ordered, and they are mutable.  In CPython, a **list** is implemented as a dynamic array of references to objects rather than as a linked list: the references are stored contiguously, and extra capacity is allocated as the list grows so that appending remains efficient. Elements can be accessed by position from either end, and new data elements can be inserted at arbitrary positions within the data structure, although inserting anywhere other than the end requires shifting the elements that follow.

In [None]:
my_list = [1, 2.0, 3.14, 'One', 'Two', 'Three']
print(f"The variable 'my_list' contains the value: '{my_list}' and is of datatype: '{type(my_list)}'")

for item in my_list:
    print(f"Item: {item}")

The following cells illustrate how a **List** can be *indexed* (i.e., elements within the collection can be referenced according to their *position* within the collection).
* The first cell illustrates referencing the first *position* in the collection; note that list indexes are zero-based.
* The second and third cells illustrate referencing a *range* of positions (the first three positions, then every position after the third).
* The fourth cell illustrates referencing the second element from the end of the collection by specifying a negative index.

In [None]:
my_list[0]

In [None]:
my_list[:3]

In [None]:
my_list[3:]

In [None]:
my_list[-2]
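
Since a list is mutable, the same positional syntax can also be used to change its contents. The following sketch, which uses an illustrative list named `letters` that is not part of the cells above, inserts, replaces and removes elements, and shows that a slice is a new list.

```python
# 'letters' is an illustrative list, not part of the original notebook.
letters = ["a", "b", "d"]

letters.insert(2, "c")   # insert 'c' before the element currently at index 2
letters.append("e")      # add to the end
print(letters)           # ['a', 'b', 'c', 'd', 'e']

letters[0] = "A"         # replace an element in place
removed = letters.pop()  # remove and return the last element
print(letters, removed)  # ['A', 'b', 'c', 'd'] e

# Slices return new lists and leave the original untouched.
middle = letters[1:3]
print(middle)            # ['b', 'c']
print(letters[-2])       # c
```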

#### 3.4. The Dictionary
The data elements within a **Dictionary** are ordered and, like those of the **set** and **list**, mutable. The **dictionary** differs in that it is a *mapping* type: it *maps* each unique key to a value (for example: 0:"Zero", 1:"One", 2:"Two"). As of Python 3.7, **dictionaries** preserve insertion order; updating the value for an existing key does not affect the order, and a key that is deleted and then added again is placed at the end.

In [None]:
my_dictionary = {0 : 'John',
                 1 : 'Paul',
                 2 : 'George',
                 3 : 'Pete',
                 4 : 'Ringo'
                }

print(f"The variable 'my_dictionary' contains the value: '{my_dictionary}' and is of datatype: '{type(my_dictionary)}'")

for i in my_dictionary:
    print(f"Each Item in My Dictionary Contains a Key {i}, and a Value '{my_dictionary[i]}'")

In [None]:
my_dictionary[2]
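
The ordering behavior described above can be checked with a short sketch. The dictionary `inventory` is a hypothetical example; `items()` and `get()` are standard dictionary methods shown here as common alternatives to looking up each key by hand.

```python
# 'inventory' is an illustrative dictionary, not part of the original notebook.
inventory = {"apples": 3, "pears": 5, "plums": 7}

inventory["pears"] = 6         # updating a value keeps the key's position
print(list(inventory))         # ['apples', 'pears', 'plums']

del inventory["apples"]        # remove a key...
inventory["apples"] = 10       # ...and re-adding it places it at the end
print(list(inventory))         # ['pears', 'plums', 'apples']

# items() yields (key, value) pairs, avoiding a second lookup inside the loop.
for fruit, quantity in inventory.items():
    print(f"{fruit}: {quantity}")

# get() returns a default instead of raising KeyError for a missing key.
print(inventory.get("kiwis", 0))   # 0
```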

### 4.0. Converting Variables Between Datatypes
#### 4.1. Converting Integer Values to Strings, and Strings to Integer Values

In [None]:
my_integer = 42
print(f"The Value {my_integer} is of datatype '{type(my_integer)}'")

my_string = str(my_integer)
print(f"The Value {my_string} is of datatype '{type(my_string)}'")

my_integer = int(my_string)
print(f"The Value {my_integer} is of datatype '{type(my_integer)}'")
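
Conversions from strings succeed only when the text actually represents a value of the target type. The sketch below, using sample strings invented for this example, shows which inputs `int()` accepts and how a failed conversion can be handled.

```python
# The sample strings below are illustrative, not taken from the original notebook.
results = {}
for text in ["42", " 7 ", "3.14", "forty-two"]:
    try:
        results[text] = int(text)      # surrounding whitespace is tolerated
    except ValueError as error:
        results[text] = None           # record the failure instead of stopping
        print(f"int({text!r}) failed: {error}")

print(results)   # {'42': 42, ' 7 ': 7, '3.14': None, 'forty-two': None}

# A string holding a decimal value can be converted with float() first;
# int() then truncates toward zero.
truncated = int(float("3.14"))
print(truncated)             # 3
print(int(float("-3.99")))   # -3
```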

#### 4.2. Converting a List of String Values to a List of Floats, and a List of Floats to a List of Strings 

In [None]:
string_list = ['1.01','2.02','3.03','4.04','5.05']
float_list = list(map(float, string_list))

print(f"Converted List is: {float_list}")

for i in float_list:
    print(f"The Value {i} is of datatype '{type(i)}'")

In [None]:
string_list = list(map(str, float_list))

print(f"Converted List is: {string_list}")

for i in string_list:
    print(f"The Value {i} is of datatype '{type(i)}'")
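
A list comprehension is a common alternative to `map()` for this kind of element-by-element conversion. The following sketch assumes illustrative names (`prices_as_text`, `prices`, `formatted`, `totals`) that are not defined elsewhere in this notebook.

```python
# 'prices_as_text' is an illustrative list, not part of the original notebook.
prices_as_text = ["1.01", "2.02", "3.03"]

# Equivalent to list(map(float, prices_as_text)).
prices = [float(p) for p in prices_as_text]
print(prices)                 # [1.01, 2.02, 3.03]

# A comprehension can also format while converting back to strings.
formatted = [f"{p:.1f}" for p in prices]
print(formatted)              # ['1.0', '2.0', '3.0']

# The same pattern works for other structures, e.g. a dictionary's values.
totals = {"a": "1", "b": "2"}
totals_as_int = {key: int(value) for key, value in totals.items()}
print(totals_as_int)          # {'a': 1, 'b': 2}
```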

### 5.0. Creating (Defining) Functions

Here we *define* (def) and call two simple functions, **add** and **concatenate**, each of which expects three input parameters. Within each function we validate the datatype of every input (*add()* expects numeric data; *concatenate()* expects character *string* data).  We implement this validation using the built-in *isinstance()* function in concert with the *Number* class of the *numbers* module and the built-in *str* datatype. When any of the inputs is *not* of the expected type, a ValueError is *raised*; otherwise the expected operation is performed on the validated inputs and its result is *returned*.

In [27]:
def add(x1, x2, x3):
    # Validate every argument before doing any arithmetic.
    for x in (x1, x2, x3):
        if not isinstance(x, numbers.Number):
            raise ValueError("Input values must be of a numerical datatype.")
    return x1 + x2 + x3

In [29]:
add(3,3,3)

9

In [31]:
add("Three", "Three", "Three")

ValueError: Input values must be of a numerical datatype.

In [33]:
def concatenate(str1, str2, str3):
    # Validate every argument before building the result.
    for s in (str1, str2, str3):
        if not isinstance(s, str):
            raise ValueError("Input values must be of the 'string' datatype.")
    return f"{str1}{str2}{str3}"

In [35]:
concatenate("string1", "string2", "string3")

'string1string2string3'

In [37]:
concatenate(1,2,3)

ValueError: Input values must be of the 'string' datatype.
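
When a function raises an error, the calling code can catch it rather than letting execution stop. The sketch below illustrates this with `add_all`, a hypothetical variation on the functions above that accepts any number of arguments; it is an assumption made for this example and not a function defined in the notebook.

```python
import numbers

# 'add_all' is a hypothetical helper, not a function defined in the original notebook.
def add_all(*values):
    """Return the sum of any number of numeric arguments."""
    for value in values:
        if not isinstance(value, numbers.Number):
            raise ValueError(f"Expected a number, got {type(value).__name__}: {value!r}")
    return sum(values)

print(add_all(3, 3, 3))       # 9
print(add_all(1, 2.5))        # 3.5

# The caller can catch the error instead of letting the program stop.
try:
    add_all(1, "Two", 3)
except ValueError as error:
    print(f"Validation failed: {error}")
```

Checking every argument before computing the result means an invalid value in any position produces the same, predictable ValueError.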