# Grounds

This notebook and the ones that follow build on core concepts of Spark, and of PySpark in particular, drawing on various online resources.

The aim of these notebooks is to collect that information more concisely than it is presented online, and to give the reader concepts to think about and reuse in their own work.

This notebook follows the [First Steps with PySpark](https://realpython.com/pyspark-intro/) lesson from Real Python.

One thing a PySpark user has to understand is anonymous functions, written in Python with `lambda`. These functions have no name and are limited to a single expression, so they cannot contain statements such as assignments or loops. Alongside them, the built-ins `filter()` and `map()`, and `functools.reduce()`, should at the very least be seen before moving on to PySpark.
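
As a minimal sketch of what "anonymous" means here, the following compares a `lambda` with an equivalent named function. The names `square` and `square_def` are illustrative only and do not come from the lesson.

```python
# A lambda is an expression that evaluates to a function object.
# Binding it to a name (done here only for the comparison) makes it
# behave like a function defined with def.
square = lambda n: n * n

def square_def(n):
    """Named equivalent of the lambda above."""
    return n * n

print(square(4), square_def(4))   # 16 16
print(square.__name__)            # <lambda>
print(square_def.__name__)        # square_def
```

In practice a lambda is usually passed straight to another function as an argument, as in the cells that follow, rather than being assigned to a name.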

In [1]:
# Example
x = ["Adam", "nico", "Eva", "uumuu", "cherry"]
print(sorted(x))                                    # default sort is case-sensitive: uppercase letters sort before lowercase
print(sorted(x, key=lambda words: words.lower()))   # lowercasing the sort key makes the sort case-insensitive

['Adam', 'Eva', 'cherry', 'nico', 'uumuu']
['Adam', 'cherry', 'Eva', 'nico', 'uumuu']


In [2]:
# Filtering items out of a list
string_text = "One of these is ... not like the others!"
list_text   = string_text.split(" ")
print(list_text) # This list is reused in the following cells

# Drop every item that contains "..." from list_text
print(list(filter(lambda values: "..." not in values, list_text)))

# Without the list() call, filter() returns a lazy iterator (a filter object)
# rather than a list, so printing it would not show the items.

['One', 'of', 'these', 'is', '...', 'not', 'like', 'the', 'others!']
['One', 'of', 'these', 'is', 'not', 'like', 'the', 'others!']
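
The laziness of `filter()` mentioned in the comments above can be seen in a small sketch. The shortened `words` list is a stand-in chosen for illustration and is not part of the lesson.

```python
# Shortened stand-in for list_text, used only for this illustration.
words = ["One", "...", "others!"]

lazy = filter(lambda w: "..." not in w, words)
print(type(lazy).__name__)   # filter

first = next(lazy)           # items are produced one at a time, on demand
rest = list(lazy)            # list() consumes whatever is left
again = list(lazy)           # the iterator is now exhausted

print(first)                 # One
print(rest)                  # ['others!']
print(again)                 # []
```

Because the iterator can only be consumed once, wrap it in `list()` when the result has to be reused.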


In [3]:
# map() applies a function to every item of an iterable
def spongebob_it(text):
    """
    Make text look like it is in a spongebob meme.
    
    input:
        text (str)    - a simple string
        
    output:
        new_text (str) - a simple string
        
    Example:
        input: Hello
        output: hElLo
    """
    text = text.lower()
    new_text = ""
    for i, letter in enumerate(text):
        if (i+1) % 2 == 0:
            new_text += letter.upper()
        else:
            new_text += letter
    return new_text

print(list(map(lambda word: spongebob_it(word), list_text)))

['oNe', 'oF', 'tHeSe', 'iS', '...', 'nOt', 'lIkE', 'tHe', 'oThErS!']
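
When the lambda only forwards its argument to an existing function, the function can be passed to `map()` directly. The sketch below uses `str.upper` and a short illustrative `words` list instead of the notebook's own names, and compares the result with a list comprehension.

```python
# Illustrative input; not the notebook's list_text.
words = ["one", "of", "these"]

# Pass the function itself; no lambda wrapper is required.
mapped = list(map(str.upper, words))

# The same transformation written as a list comprehension.
comprehended = [word.upper() for word in words]

print(mapped)                  # ['ONE', 'OF', 'THESE']
print(mapped == comprehended)  # True
```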


In [4]:
# reduce() combines the items of an iterable into a single value,
# so list() is not needed here
from functools import reduce
print(reduce(lambda x1, x2: x1 + x2, list_text))

Oneoftheseis...notliketheothers!
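
`reduce()` also accepts an optional third argument, an initial value for the accumulator. The sketch below is an illustration with a shortened `words` list (an assumption, not the lesson's data). It shows the initial value changing the result type and making the call safe on an empty input.

```python
from functools import reduce

# Illustrative input; not the notebook's list_text.
words = ["One", "of", "these"]

# Without an initial value, the first item seeds the accumulator.
joined = reduce(lambda left, right: left + " " + right, words)

# With an initial value of 0, the accumulator is an int, not a str.
total_length = reduce(lambda acc, word: acc + len(word), words, 0)

# On an empty iterable, reduce() raises TypeError unless an initial
# value is supplied; here the initial value "" is returned as-is.
empty_result = reduce(lambda left, right: left + right, [], "")

print(joined)               # One of these
print(total_length)         # 10
print(repr(empty_result))   # ''
```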
