# Important Python concepts
This notebook highlights the Python syntax most commonly used alongside PySpark for data engineering. For a more complete list of Python training resources, see https://dustinvannoy.com/2021/10/08/learn-python-resources

## Strings
First, we set some variables. Python does not require you to declare a data type; the type comes from the value you assign.
Then we run a few common string operations.

In [0]:
my_var_1 = "String 1"
my_var_2 = 'String 2'

print(my_var_1, my_var_2)

String 1 String 2


In [0]:
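# Concatenate with +; str() is only required when a value is not already a string
concatenated_string_1 = my_var_1 + ", " + str(my_var_2)

# An f-string substitutes the value of each expression inside {}
formatted_string = f"My variables are: {my_var_1}, {my_var_2}!"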

print(concatenated_string_1)
print(formatted_string)

String 1, String 2
My variables are: String 1, String 2!
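
A few other string methods come up often when cleaning text data. The cell below is a minimal sketch, not part of the original notebook: `raw_line` and its contents are made-up sample values standing in for one line of a delimited file.

```python
# Hypothetical sample value, e.g. one line read from a comma-delimited file
raw_line = "  Dana,22,Seattle \n"

# strip() removes leading and trailing whitespace, including the newline
clean_line = raw_line.strip()

# split() breaks the string into a list of fields on a delimiter
fields = clean_line.split(",")

# upper() returns a new string; the original value is unchanged
name_upper = fields[0].upper()

# join() combines a list of strings using the given separator
pipe_delimited = "|".join(fields)

print(fields)
print(name_upper)
print(pipe_delimited)
```

Strings are immutable, so each of these methods returns a new value rather than modifying `raw_line`.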


## Other Data Types

In [0]:
# int
my_int = 12

# float
cost = 23.12

# boolean
is_easy = True

print(type(my_int), my_int)
print(type(cost), cost)
print(type(is_easy), is_easy)

<class 'int'> 12
<class 'float'> 23.12
<class 'bool'> True


In [0]:
# bytes
my_bytes = b'Byte string'

# tuple
my_tuple = ("Dana", 22, False)

# set
my_set = {12, 22, 22, 25, 25}

print(type(my_bytes), my_bytes)
print(type(my_tuple), my_tuple)
print(type(my_set), my_set)

<class 'bytes'> b'Byte string'
<class 'tuple'> ('Dana', 22, False)
<class 'set'> {25, 12, 22}
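
The types above behave differently in ways that matter in practice: tuples are immutable, sets drop duplicates and have no guaranteed order, and conversions between types are explicit. The following sketch illustrates this with made-up values (`person`, `unique_ages`, and the other names are assumptions for the example only).

```python
# Hypothetical sample values for illustration
person = ("Dana", 22, False)

# A tuple can be unpacked into separate variables in one statement
name, age, is_admin = person

# Tuples are immutable: assigning to an element raises TypeError
tuple_is_immutable = False
try:
    person[1] = 23
except TypeError:
    tuple_is_immutable = True

# A set keeps only unique values, so the duplicates collapse
unique_ages = {12, 22, 22, 25, 25}

# Converting between types is done explicitly with int(), float(), str()
age_from_text = int("31")
cost_as_text = str(23.12)

print(name, age, is_admin)
print(tuple_is_immutable)
print(len(unique_ages), 22 in unique_ages)
print(age_from_text + 1, cost_as_text)
```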


## List
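Now let's look at a Python list, an ordered, mutable sequence. It behaves like a dynamic array rather than a linked list, so accessing an item by index is fast. A list can hold other lists, which gives you a two-dimensional structure.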

In [0]:
my_list = ["Dana", "Dave", "Diana", "Doung", "Dustin"]

my_2D_list = [["Dana", 22], ["Dave", 31], ["Diana", 19], ["Doug", 34], ["Dustin", None]]

print(my_list)
print(my_2D_list)

['Dana', 'Dave', 'Diana', 'Doung', 'Dustin']
[['Dana', 22], ['Dave', 31], ['Diana', 19], ['Doug', 34], ['Dustin', None]]
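
Common list operations include indexing, slicing, appending, and building a new list with a list comprehension. The cell below is a small sketch using its own sample data (`names` and `rows` are illustrative, not variables from the cells above).

```python
# Illustrative sample data
names = ["Dana", "Dave", "Diana"]
rows = [["Dana", 22], ["Dave", 31]]

# Indexing starts at 0; negative indexes count from the end
first_name = names[0]
last_name = names[-1]

# Slicing returns a new list; the end index is exclusive
first_two = names[:2]

# append() adds an item to the end of the list in place
names.append("Doug")

# A list comprehension builds a new list from an existing iterable
name_lengths = [len(n) for n in names]

# For a 2D list, index the row first and then the column
dave_age = rows[1][1]

print(first_name, last_name, first_two)
print(names)
print(name_lengths)
print(dave_age)
```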


## Loop
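Examples of looping through one-dimensional and two-dimensional iterable objects. Python uses indentation to mark the body of a loop, so every indented line under the `for` statement runs once per item.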

In [0]:
for name in my_list:
  print(name)
  print("yes")

Dana
yes
Dave
yes
Diana
yes
Doung
yes
Dustin
yes


In [0]:
for row in my_2D_list:
  for item in row:
    print(item)

Dana
22
Dave
31
Diana
19
Doug
34
Dustin
None
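
When each row has a known shape, it is often clearer to unpack the row in the `for` statement than to loop over its items. The sketch below also uses the built-in `enumerate()` to get a row index and skips missing values with an `is None` check. The `people` list is sample data made up for this example.

```python
# Illustrative sample data: [name, age] pairs, one with a missing age
people = [["Dana", 22], ["Dave", 31], ["Dustin", None]]

known_ages = []

# enumerate() yields (index, item); the inner parentheses unpack each row
for index, (name, age) in enumerate(people):
    if age is None:
        # Skip rows where the age is missing
        print(index, name, "age unknown")
        continue
    known_ages.append(age)
    print(index, name, age)

average_age = sum(known_ages) / len(known_ages)
print(average_age)
```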


## Dictionary
Let's look at the Python dictionary type. It works like a hash map: you look up a value by its key. Dictionaries can be nested (a value can itself be a dictionary or a list), which is why they are commonly used to represent JSON data.

In [0]:
data = {"key1": "Dana",
        "key2": "Dante",
        "key3": "Diana",
        "key4": "Doug",
        "key5": "Delia",
        "key6": "Dustin"
       }

print(data["key3"])

data["key3"] = "Bob"
print(data["key3"])

Diana
Bob


In [0]:
for key, name in data.items():
  print(key, " -> ", name)
print("t")

key1  ->  Dana
key2  ->  Dante
key3  ->  Bob
key4  ->  Doug
key5  ->  Delia
key6  ->  Dustin
t
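
Nested dictionaries map naturally to JSON. The following sketch uses the standard library `json` module to convert a dictionary to a JSON string and back, and `dict.get()` to read a key that may be missing. The `record` contents are made-up sample values.

```python
import json

# Illustrative nested dictionary, shaped like a JSON document
record = {
    "name": "Dana",
    "age": 22,
    "address": {"city": "Seattle", "zip": "98101"},
}

# Chain keys to reach a nested value
city = record["address"]["city"]

# get() returns a default instead of raising KeyError when a key is missing
phone = record.get("phone", "unknown")

# dumps() serializes to a JSON string; loads() parses it back to a dict
json_text = json.dumps(record)
round_tripped = json.loads(json_text)

print(city, phone)
print(json_text)
print(round_tripped == record)
```

Using `record["phone"]` here would raise `KeyError`, so `get()` is the safer choice when a field is optional.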
