# Python Functions, Return Values, and Fluent Code Styles

## Functions and return values

The examples below are written in Python, but the ideas apply equally to nearly all programming languages. Some languages lean one way or the other a bit further, but these are fundamental concepts that you can apply to anything from Python to OCaml.

Let's start by creating a simple Python function and discussing what it does.

In [38]:
def function0(arg1 : int, arg2 : int) -> int:
    return arg1 + arg2

The above simple function accepts two arguments, adds them together, and returns the result.  

The return part of this function is the important bit here. When the function runs, a new piece of data is made available to the context that invoked the call.

In [39]:
a = function0(100, 200)
print(a)

300


Something to keep in mind is that the value gets returned regardless of whether I want to use it or not.  Above, the returned value was assigned the name `a`, and I was able to print it.  It's entirely valid to call the function and not name the result, but then you can't directly reference the return value.

```python
function0(100,200)
# print(???)  <- there is no name to put here
```

While I can't directly reference the returned value, it is still there, and I can indirectly reference it.

In [40]:
b = function0(100, 200) + function0(400, 500)
print(b)

1200


As another example, the return value from a function can be passed in as the input to a second function.

In [41]:
c = function0(function0(100, 100), function0(200, 200))
print(c)

600


Return values from a function exist. If you want to or need to refer to them by name further on in your computation, you need to explicitly name them, but you don't have to do so to make use of the return values. They are somewhat like literal values.
 
```python
function0(100, 200)
```

I can't ever refer to those literal values later on. They have no name, so I can't pass those same objects into any other function. When Python saw the literal `100` in the source, it created a value and gave it no name, yet the language still knows how to pass that value to the function.

Function return values work in much the same way.  You can still call functions on them (`+` and `-` are just functions, really), stick them in lists (which is also just a form of function invocation), and do anything else you could do with a named variable.
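Here is a small sketch of that idea. It redefines a function like `function0` so the snippet runs on its own; the names `values`, `bits`, and `text` are just for illustration.

```python
def function0(arg1: int, arg2: int) -> int:
    return arg1 + arg2

# Put unnamed return values straight into a list.
values = [function0(1, 2), function0(3, 4)]

# Call a method directly on an unnamed return value.
bits = function0(100, 200).bit_length()

# Pass an unnamed return value to another function.
text = str(function0(5, 5))

print(values, bits, text)
```

In each case the return value is used exactly where it is produced, and never gets a name of its own.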

Since you can't reference them by name, you can basically only refer to them at the point of invocation, where the position in the code is enough to understand what to do with them.  In a call like this:

```python
c = function0(100, 100) + function0(200, 200)
```

I can't refer to `function0(100, 100)` by name, but the syntax rules make it clear that I want whatever that unnamed value is to be passed as the first parameter to the surrounding function, which in this case is the `+` operation.

In summary then, function return values are just like any other variable, except that you can't refer to them anyplace other than where they were called.  

There are of course many places where you never need or want to refer to this value again, so it's common practice to invoke a function in such a way that its value is passed straight to some other function, to do a computation or store the value off.  I'm sure you've seen lots of this in our work:

```python
sevone.connect(getpass.getuser(), getpass.getpass("domain pwd: "))
```

This ability to call functions and pass their results around without naming them is very common, but it can lead to some confusion.

In [42]:
list(sorted(map(lambda x: x**2, range(0, 10))))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

If we dissect the above expression, we soon realize we have to read that sequence of calls basically inside out.

The above code is saying:

1) Create a range from 0 up to (but not including) 10
2) Take that and map the squaring function over each number
3) Sort the output
4) Turn it into a list

That sequence of instructions is pretty simple to understand.  But when you read the code above, it's basically presented backwards.

1) Turn the inbound stuff into a list and return it
2) Sort the inbound stuff and return it
3) Square all of the inbound stuff and return it
4) Generate the numbers 0 through 9 and return them

Personally, I, and I would guess most other people, find the second version harder to read.  I have to remember the first line as I'm reading the second line, and slowly build up a complex picture in my head of what I am doing. The first one is much easier to understand: I never need to know much more than the preceding line as I read.
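For comparison, here is a sketch of the same computation written as named steps, in the order of the first list. The intermediate names (`numbers`, `squares`, `ordered`, `result`) are made up for this illustration.

```python
numbers = range(0, 10)                   # 1) create the range 0..9
squares = map(lambda x: x**2, numbers)   # 2) square each number
ordered = sorted(squares)                # 3) sort the output
result = list(ordered)                   # 4) turn it into a list

print(result)
```

This reads top to bottom, but the price is inventing a name for every intermediate value, even though none of them is needed again.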

Some languages provide built-in tricks to do this for you.  For instance, the above could be written in F# like this:

```f#
[0 .. 9] |> List.map (fun x -> x * x) |> List.sort
```

Above, the `|>` operator is called the 'pipe' operator, and is used to 'pipe' data through a sequence of functions.

The details of F# are not terribly important, but I wanted to indicate that this idea of making function calls look like an easy to read list is so common and popular that many languages include it as a
standard or even the default way to call a list of functions.  

Python has no built-in ability to do this, but it is possible to write code that allows it.
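As a minimal sketch of what such code might look like, the hypothetical helper `pipe` below (my own invention, not part of the standard library) threads a value through a sequence of single-argument functions, left to right.

```python
from functools import reduce
from typing import Any, Callable


def pipe(value: Any, *functions: Callable[[Any], Any]) -> Any:
    """Pass value through each function in turn, left to right."""
    return reduce(lambda acc, func: func(acc), functions, value)


result = pipe(
    range(0, 10),
    lambda xs: map(lambda x: x**2, xs),  # square each number
    sorted,                              # sort the output
    list,                                # turn it into a list
)
print(result)
```

The steps now appear in the order they run, which is roughly what the `|>` operator gives you in F#.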

## Methods

In Python, in addition to functions, we can also create classes, objects, and methods.

Classes are descriptions of an entity that contains data and operations that work on that data (called methods).  When a class is instantiated, you get an object, and you can invoke methods on that object.

An example to make it easier to understand:

In [43]:
# a class describes what the objects will look like
class Example:
    def __init__(self): # initializes the object
        self.__value = 0
    
    # Note the parameter self in every method.  It is set automatically whenever you call a method on an object.
    
    def add(self, value: int): # provides a method that will add value to the internal __value
        self.__value += value
        
    def sub(self, value: int): # same as add, but subtract
        self.__value -= value
        
    def value(self):
        return self.__value
        
ex = Example() # This is instantiating the class, and we get an object.

# invoking methods
ex.add(100) # note that we don't explicitly pass in the self parameter.  self is really the ex object we created above
ex.sub(20)

print(ex.value())

# in reality, the call ex.add(100) can be thought of as Example.add(ex, 100), but you don't have to write it that way; using the . notation is enough to
# cause all of that to happen automatically.


80


An important thing to note in a class definition is that every regular method takes the object as its first parameter, which by convention is called `self`.

`self` is the name you use inside a method to get to the data and other methods attached to the object.  And that is the key thing to know about an object: each method naturally and by default has a reference to the object, so it can work on the data stored therein.
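To make the role of `self` concrete, here is a small sketch using a made-up `Counter` class (not one of the examples above). Calling a method through the object, and calling it through the class with the object passed explicitly, do the same thing.

```python
class Counter:
    def __init__(self):
        self.total = 0

    def add(self, value: int):
        # self is whichever object the method was called on
        self.total += value


c = Counter()
c.add(5)            # Python passes c as self automatically
Counter.add(c, 5)   # the same call, with self passed explicitly

print(c.total)
```

Both calls update the same object, so the total ends up as 10.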

There are many other things we could mention here, and classes and objects can get rather complex, but for our purposes all we need to know is that methods can reference their object automatically via the `self` variable.  This gives us the ability to do something pretty slick going forward.

In [44]:
class Example2:
    def __init__(self):
        self.__value = 0
    
    def add(self, value: int):
        self.__value += value
        return self

    def sub(self, value: int):
        self.__value -= value
        return self

    def value(self):
        return self.__value

I've re-created our example class, but note how the operations `add` and `sub` both return the object via the `self` variable.

Remember from above that you don't need to name a return value to use it; you can simply perform operations on it, like calling another function.

The same is true with objects, but since objects have methods on them, we can do something interesting.

In [45]:
ex = Example2()

answer = ex.add(100).sub(20).add(40).add(200).sub(200).value()
print(answer)

120


Each method we called on `ex` returned the object, allowing us to simply call more methods on those anonymous return values.
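The `return self` lines are what make the chain work. As a sketch, the hypothetical `Tally` class below has one method that returns `self` and one that forgets to. Chaining off the second one fails, because a method without a `return` hands back `None`.

```python
class Tally:
    def __init__(self):
        self.total = 0

    def add(self, value: int):
        self.total += value
        return self          # hands the object back, so calls can be chained

    def add_no_return(self, value: int):
        self.total += value  # implicitly returns None


t = Tally()
same_object = t.add(1).add(2) is t   # every link in the chain is t itself

try:
    t.add_no_return(3).add(4)        # None has no .add method
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

print(same_object, t.total, error_name)
```

Note that `add_no_return(3)` still updates the total before the chain breaks; only the following `.add(4)` never runs.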

This is very useful for a few reasons, but one thing to note is that we have achieved something close to the pipe operator we saw before. Rather than calling a set of nested functions that are hard to read and interpret, we can write the operations as a sequence that reads easily from left to right, just like English.

This is often called a 'fluent' style of writing your code.  Let's compare the results.

In [46]:
def add(a,b):
    return a+b

def sub(a,b):
    return a-b

sub(add(add(sub(add(0,100), 20), 40), 200), 200) 

120

Which is easier to make sense of?

```python
sub(add(add(sub(add(0,100), 20), 40), 200), 200) 
```
or 

```python
ex.add(100).sub(20).add(40).add(200).sub(200).value()
```

I know my answer: I immensely prefer the second one. It took me two attempts to even write the first one correctly, because I got things out of order a few times.

Think of a cooking recipe.  Are they written like this:

1) Get some flour, salt, and yeast
2) Add 2 tsp of salt and 2 tsp of yeast to the flour
3) Mix in some water
4) Let stand for a few hours
5) Bake

or are they written like this:

1) Bake the stuff
    1) The stuff has been sitting for 2 hours
        1) Get the stuff wet
            1) The dry stuff should be full of salt, flour, and yeast
                1) Get the flour, salt, and yeast from the cupboard

This is unreadable gibberish, yet it is fundamentally how most programming languages read when you use nested function calls and anonymous function returns.  

## Pandas

This fluent style of programming is quite common, and the pandas data library is a perfect example of it.

In [47]:
import pandas as pd

In [48]:
mdf = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [5, 10, 15, 20, 25]})

mdf

|   | a | b |
|---|---|---|
| 0 | 1 | 5 |
| 1 | 2 | 10 |
| 2 | 3 | 15 |
| 3 | 4 | 20 |
| 4 | 5 | 25 |


Many methods on a DataFrame, such as `assign`, are written roughly like this:

```python
class DataFrame:
  ...
  def assign(self, **kwargs):
    new_df = self.copy()
    ...  # add or replace the requested columns on new_df
    return new_df
```

This is much like Example2 above, with one difference: instead of returning `self`, each call returns a DataFrame (usually a new one), leaving the original alone.  These returned values don't need to be named; we can just keep calling other methods on them.
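One detail is easy to check for yourself. In the pandas versions I am aware of, `assign` returns a new DataFrame rather than modifying the one it was called on, so the starting frame is left untouched. A small sketch (the names `original` and `extended` are just for illustration):

```python
import pandas as pd

original = pd.DataFrame({'a': [1, 2, 3]})
extended = original.assign(c=2)

print(extended is original)       # False: a different object came back
print(list(original.columns))     # ['a']: the original has no column c
print(list(extended.columns))     # ['a', 'c']
```

Chaining works either way, since all that matters is that each call returns something with the same methods on it.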


In [49]:
newdf = (
    mdf
    .assign(c = 2)
    .assign(d = lambda df: df.b**2)
)

newdf

|   | a | b | c | d |
|---|---|---|---|---|
| 0 | 1 | 5 | 2 | 25 |
| 1 | 2 | 10 | 2 | 100 |
| 2 | 3 | 15 | 2 | 225 |
| 3 | 4 | 20 | 2 | 400 |
| 4 | 5 | 25 | 2 | 625 |


Because pandas chose to follow the fluent style of method invocation, we can issue a series of commands, each of which transforms the DataFrame a bit, and string them together in a nice, linear, readable order. We start with our initial data, then that data is passed through each method call, one at a time, in the order written, until we get to the end and have our output.

This idea of linearly taking something, passing it through an operation that modifies it, then taking the result of that step and passing it along again, is a common enough real-world thing.  And that helps explain why I keep using the term pipe without explaining it.

![Process Piping](https://i.pinimg.com/originals/2a/13/43/2a1343b4db63693e00e7097e90583ba8.jpg)

This diagram shows a factory converting raw input material, through a variety of steps, into a final product.  Each step takes in several inputs (raw ingredients, power, heat, machine settings, etc.) and transforms them into one or more outputs, which are then sent to the next step through **pipes**.  The analogy should be clear.

What we're doing above is the same thing.  Each method is a step in the transformation of raw ingredients into something else: it takes the raw ingredients, settings, and other material, does something to them, and returns something out the end. Each of these transformations is a machine in a chemical plant, and we are piping the output of one step to the input of the next.

In general, software keeps this kind of thing simple: the pipelines have a single direction and a very linear flow. But now you know where the terminology came from.

When you see 

```python
newdf = (
    mdf
    .assign(c = 2)
    .assign(d = lambda df: df.b**2)
)
```

Think of each of those `assign` calls as a bit of machinery that transforms the initial `mdf` raw input into something else. Each successive call wires the output of one machine to the input of the next, and we can keep chaining things together until we have our final output, which we can then dump into a tank for later use (we gave it a name so we could refer to it later).

Pandas goes one step further and has an explicit `.pipe` method that allows you to tie arbitrary machines of your own invention into the chain.

In [50]:
def mycustomfunction(df):
    return df.iloc[0:2]

newdf = (
    mdf
    .assign(c = 2)
    .assign(d = lambda df: df.b**2)
    .pipe(mycustomfunction)
)

newdf

|   | a | b | c | d |
|---|---|---|---|---|
| 0 | 1 | 5 | 2 | 25 |
| 1 | 2 | 10 | 2 | 100 |


Above, we 'pipe' the data from `.assign(d...` through the custom function `mycustomfunction`, which transforms the DataFrame (in this case it just returns the first two rows) and passes the result further down the chain.
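`.pipe` also forwards extra positional and keyword arguments to the function, which lets a custom step take settings of its own. A sketch, with `first_rows` and its `n` parameter invented for this example:

```python
import pandas as pd


def first_rows(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Return only the first n rows of df."""
    return df.iloc[0:n]


mdf = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [5, 10, 15, 20, 25]})

result = (
    mdf
    .assign(d=lambda df: df.b**2)
    .pipe(first_rows, n=3)      # called as first_rows(<dataframe>, n=3)
)
print(result)
```

The DataFrame coming out of the previous step is passed as the first argument, and everything else is handed along unchanged.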

## Summary

So, I hope you can see the benefits of this fluent style of programming, and how it's all done under the hood with the unnamed values that functions return.

With a fluent style of programming, I can write a sequence of operations in a linear, intuitive, easy-to-read order. They read like a simple cooking recipe.  And it's all made possible by the fact that return values from functions don't need to have a name to be used.


