<a href="https://colab.research.google.com/github/calumrussell/fpl/blob/master/FPL_tut.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Why should you care?**

Learning how to program isn't for everyone. Computers aren't for everyone.

After several hours of feeble struggling on a project, I still get frustrated. I imagine throwing my computer into the air, watching with glee as its silicon insides spill onto the ground before me, try throwing an error now...

...but computers are also magic. When you have built something, you feel accomplishment and wonder. When you are working, there is a satisfaction in watching a project come together. But, most importantly, computers solve problems.

Most people get stuck learning how to program. Why? Because programming is presented as something to learn and teach, not something which makes your life immediately better. This is often fun for the teacher but boring for the student.

I learned how to program because I had a problem that needed a solution. The first code I wrote, I believe, was for a university paper on monetary policy: I needed a statistical tool that Excel didn't have. The first "real" program calculated the weights for a portfolio of equities: again, I needed a tool that Excel didn't have. I didn't build this stuff because I love computers or maths or algorithms...I needed solutions.

My purpose is to show you that programming is useful. You will not become an expert overnight but you also don't need to spend years trudging your way through exercises or building toy applications that don't interest you. You can, hopefully, build things in a few weeks, and those things will be useful to you. FPL is the perfect place to test and build programming skills. Build skills. Have fun. Learn new things. Challenge yourself.


**What to expect?**

I am not a teacher. I did not study Computer Science. My only "qualification" is that programming changed my life, and I believe it can help other people. 

Maybe it helps you get a job, maybe it helps you at work or school, or maybe you just have fun. Hopefully, all three. 

I will do my best to give the most concise explanation possible of the fundamentals of programming, but I cannot teach you everything, and you will need to supplement this knowledge on your own. My aim is not to teach you but to help you teach yourself.

This may sound scary but if anyone attempted to foist their version of programming onto me, I would have lost interest immediately. That is why this is hopefully going to be as much about FPL as it is about programming.


**What is programming?**

Unfortunately this part is, imo, unavoidable so I will be brief: a computer program stores and operates on data.

A computer is composed of storage - RAM and hard drive - and computation - the CPU. Storage stores data until it is required for computation. And the CPU has instructions that operate on data. This sounds vague but it is basic stuff: addition, multiplication, etc.

A computer program is just built on top of this structure. This may sound ludicrously simple but that is really it.

Abstract part over. We can get onto the actual code.

In [None]:
name = "Leo Messi"
print(name)

**Variables**

Intuitively, we can understand this but the detail of what is actually happening is unclear.

The part on the right is clearly two words but why quotes? Inside the quotes is a list of letters, and the quotes are used to distinguish this as data, and not some other word used by the program. A list of letters is more commonly called a string, and strings are one of many data types. Other data types are: integers like 0 4 3 or floats (decimals) like 0.2 0.5 10.5. We are defining some data that is stored in some way.

The equals sign denotes that we want to store our string in the thing to the left of the equals sign. Symbols like this are called operators: they "operate" on the things to their left and right. The plus sign is also an operator, and can be used to add values: 3 + 4

The part on the left is a variable. This is where our string is stored in the program, and we can use that name to refer to the value contained within. We print our **name** variable somehow (don't worry, we will explain), and get the value within. It isn't clear yet why this sort of thing is useful, but it will become clear later.
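To make the data types and the plus operator a bit more concrete, here is a small sketch. The variable names and values are made up for illustration, and **type** is a built-in Python function that simply reports which data type a value has.

```python
# Three basic data types (illustrative values, not real stats).
player_name = "Leo Messi"   # string: text inside quotes
goals = 4                   # integer: a whole number
pass_accuracy = 0.85        # float: a number with a decimal point

# type() reports the data type of the value stored in a variable.
print(type(player_name))
print(type(goals))
print(type(pass_accuracy))

# The plus sign is an operator: it works on the values to its left and right.
total = 3 + 4
print(total)
```

The first three lines printed name the types (str, int and float), and the last is the result of the addition.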

In [None]:
some_var = 10
some_var = 20
print(some_var)

20


We can use the variable name more than once in a program, and we can assign to a variable with the equals sign more than once. All that happens is that we overwrite the data "inside" our variable.

What about print? This is a function. And the purpose of a function is to take some data, perform a computation using that data, and then (usually) return another value. So we call a function by using its name, and pass arguments to the function inside the brackets (note: values are called arguments when passed into a function).

**Functions**

In [None]:
def add(a, b):
    return a + b

What do we get if we call this function?

In [None]:
number_1 = 1
number_2 = 3
result = add(number_1, number_2)
print(result)

4


Again, this is all intuitive but something far more complex is clearly going on here.

We have defined our own function, the definition of which starts with the word def. At this point, it is worth noting that **def** is a keyword: keywords are special words which are used to define basic constructs in Python. If we tried to call a variable **def**, Python would get confused (thinking that you are defining a function) and report a syntax error.

So we name our function, and then define the parameters (more pedantry, when we define a function the arguments that we pass are, technically, called parameters...it is like they are trying to put people off, don't worry) that our function requires to run.

A few things are worth noting briefly:

Our function can ask for parameters and not use them

In [None]:
##Always returns 30, get baited, 30 is best
def add_baited(a, b):
    return 10+20

Our function can ask for no parameters

In [None]:
##Always returns 30, don't care bruv, 30 is still best
def add_not_interested():
    return 10+20

Finally, our function returns some result to the user but that isn't necessary either. Note that the print function didn't return anything, it just printed the value that was "inside" our variable.

In [None]:
def i_am_pointless(value):
    print(value)
    return
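To see what "not returning anything" looks like in practice, here is a small sketch. The function name **print_only** is made up for this example. Strictly speaking, a Python function that has no return value hands back a special value called None.

```python
def print_only(value):
    # Prints the value but has no return value of its own.
    print(value)

result = print_only("Leo Messi")

# Nothing was returned, so the variable holds None.
print(result)
```

So the first line printed comes from inside the function, and the second shows that there was nothing useful to store in **result**.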

It is worth briefly noting that functions are the main construct of any program, almost every programming language, and are the reason why programming is useful.

It isn't immediately obvious why but if we return to our add function: what is going on here? It looks straightforward, we are just adding...but the key point is that the knowledge about how to add is abstracted away from any particular sum.

We haven't created a function which only works for 8 + 6 but which works for any two numbers. This abstraction away from a specific sum means that we can just repeatedly call this function, and we will always get that same logic without having to write out the numbers. The logic of adding is abstracted away, we just need **a** and **b**. Once calculations become more complex, this is essential and why programming is so valuable.
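As a quick sketch of that point, the same **add** function can be called with any pair of numbers (the numbers below are arbitrary):

```python
def add(a, b):
    return a + b

# One definition, many different sums.
print(add(8, 6))      # 14
print(add(100, 250))  # 350
print(add(0.5, 1.5))  # 2.0
```

We wrote the logic once and never had to touch it again, whatever numbers we passed in.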

To tidy up the loose ends: the plus sign is an operator, we can assign numbers to variables as well as strings, and we can return values from a function and store that in a variable.

**Function Scope**

We have to take a boring detour here:

In [None]:
a = 10
def add_2(a, b):
    return a + b
result = add_2(5, 10)
print(result)

15


What is happening here? We define **a** twice, but the parameter takes precedence. Why? Couldn't this also print 20?

Every programming language has scopes which basically define how naming conflicts between variables are resolved. Unsurprisingly, the solution to this problem is: don't use the same variable name.

But this example highlights something about scope: when we call the function, it runs the code inside the definition, and within that code the name **a** points to the argument we passed in rather than the outer value. So when we refer to **a**, it means something new within the function, and the outer **a** is left untouched.
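A small sketch to check this: if we print the outer **a** after calling the function, we can see it still holds its original value.

```python
a = 10

def add_2(a, b):
    # Inside the function, a refers to the argument that was passed in.
    return a + b

result = add_2(5, 10)
print(result)  # 15
print(a)       # 10: the outer variable is unchanged
```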

In [None]:
c = 10
def add_3(a, b):
    return a + b + c

result = add_3(5, 5)
print(result)

20


But if we use a name that isn't defined inside the function, Python will still search outside the "scope" of the function into the "global scope" - the highest level, usually the main scope in which the program is running - and find **c** there.

And yes, this relates to the indentation. When we indent, we are marking a block of code that belongs to the line above it. So when we define a new function, the function's code is indented (but not the definition), and that indented block has its own scope, distinct from the scope containing it. (The indented blocks under loops and if statements, which we meet later, group code in the same way but do not create a new scope in Python.)

The rules around scope in Python are actually a bit more tricky than this because Python has relatively few restrictions on how programs run. But it is worth saying that the code above is terrible and never something you should do. Why? Because you may not define **c** and **add_3** close together and so the program would run, you would get some output, and possibly not realise that the value **c** was coming from somewhere else. Instead:

In [None]:
def add_3(a, b):
    c = 10
    return a + b + c

Here, **c** is defined where it is used. And if we try to use **c** outside this function (assuming nothing else has defined it), Python will be unable to find it and will throw an error, which is far more useful to us.

These examples are tedious but should, hopefully, give a brief insight into how computer programs run. Scope can also cause weird and baffling errors if you are new to programming.
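Here is a rough sketch of that error. The names **add_local** and **local_bonus** are made up so they don't clash with anything defined earlier, and the **try**/**except** lines are only there so the sketch keeps running after the error instead of stopping.

```python
def add_local(a, b):
    local_bonus = 10  # exists only while the function runs
    return a + b + local_bonus

print(add_local(5, 5))

try:
    # local_bonus was defined inside the function, so it is not visible here.
    print(local_bonus)
except NameError as error:
    message = str(error)
    print("NameError:", message)
```

The error message names the variable Python couldn't find, which points you straight at the problem.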

**Lists**

In [None]:
passes = [20, 40, 50]
players = ["Leo Messi", "Cristiano Ronaldo"]

Lists are useful if you are interested in FPL but something may be a bit puzzling here. We have clearly defined something that contains numbers or strings...but this isn't like any data that we have seen before. The list is not data itself but is data that stores data?

This is another example of abstraction. To see why this is useful, think about why it is important that lists exist at all. Why do we need lists? They just store data, the data is surely more important, the list is just the wrapper.

But what if we need to do some calculation that is unique to lists and not the values that they contain? So we need some abstraction of a list so that we can perform some calculations that are unique to lists. Some examples may include: getting the first member of a list or getting the length of the list.

In [None]:
print(len(passes))
print(players[0])

3
Leo Messi


This square-bracket business looks odd but it is just a special syntax, called indexing, which returns the element at a given position; here, position zero gives us the first element. Why zero? In most programming languages, counting starts at zero, not one, so the positions in a list run 0, 1, 2, 3, 4, 5 and so on.
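A short sketch of indexing beyond the first element, using the **passes** list from above. Because counting starts at zero, the last position is always one less than the length of the list.

```python
passes = [20, 40, 50]

first = passes[0]
second = passes[1]
# Three elements means the positions are 0, 1 and 2.
last = passes[len(passes) - 1]

print(first, second, last)
```

Asking for **passes[3]** here would be an error, because there is no fourth element.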

**For Loops**

In [None]:
for player in players:
    print(player)

Leo Messi
Cristiano Ronaldo


Now this is real witchcraft looking code. Again, we can roughly intuit what is going on here based on the result but exactly how is unclear.

Well, the syntax is more confusing than the operation itself. The **in** is just used to refer to a list, or something with lots of values, and we are just scrolling through the list selecting each value. And the **player** variable will refer to each value within the list in turn. This is called a for loop, and is common to most programming languages. Note that the terminology used in Python is that **players** is an iterable, and the for loop steps through it using an iterator. So an **iterator** is something that hands over the values of an **iterable** one at a time. On each loop, **player** gets redefined.
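You never need to do this by hand, but as a sketch of what the for loop is doing for us: Python's built-in **iter** function asks an iterable for an iterator, and the built-in **next** function pulls one value out of it at a time. The variable names below are made up for illustration.

```python
players = ["Leo Messi", "Cristiano Ronaldo"]

# iter() asks the iterable (our list) for an iterator.
player_iterator = iter(players)

# next() hands over one value at a time, in order.
first_player = next(player_iterator)
second_player = next(player_iterator)

print(first_player)
print(second_player)
```

The for loop wraps all of this up, and stops on its own when there are no values left.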

In [None]:
shots = [1, 0, 4]
total = 0
for shot in shots:
    total = total + shot
print(total)

5


To see why this is useful, imagine that we have a list containing the number of shots a player took in their last few matches. We can use our for loop to calculate the total number of shots. Note that we have to define our total variable before the loop, outside its indented body. Why?

In [None]:
shots = [1, 0, 4]
for shot in shots:
    total = 0
    total = total + shot
print(total)

4


Every time we run our loop, the indented code inside the loop runs again with a new value for shot. If we want to preserve some value over the length of the loop, therefore, we need to set that variable up before the loop so it doesn't get reset to zero on each pass. We still overwrite **total** on each pass, but we use the old value to build the new one.

And note, we can do something like this:

In [None]:
def shot_sum_calculator(shot_list):
    total = 0
    for shot in shot_list:
        total = total + shot
    return total

messi_shots = [1, 0, 4]
ronaldo_shots = [4, 2, 0]

messi_total = shot_sum_calculator(messi_shots)
ronaldo_total = shot_sum_calculator(ronaldo_shots)
print(messi_total)
print(ronaldo_total)

5
6


Programming is valuable because we can abstract the logic of our computation away from any particular data values. Here, we have created a function which will calculate the total number of shots for any list of shots that you give it. It is abstracted away from our data, it doesn't know about Messi, it doesn't know about Ronaldo, so we can use it for any player, any time, and we will get the right answer. This may seem simple but when you have thousands of players with shots for thousands of games, that abstraction is important. A computer will calculate the total for each player in nanoseconds, and we can (unlike Excel) make any additional calculations, as complex as we can imagine.
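As a sketch of how this scales, we can put several players into one list and let a loop call our function for each of them. The **all_shots** and **totals** names are made up for this example, and **append** is a built-in list operation that adds a value to the end of a list.

```python
def shot_sum_calculator(shot_list):
    total = 0
    for shot in shot_list:
        total = total + shot
    return total

# Each inner list holds a player's name and their list of shots.
all_shots = [
    ["Leo Messi", [1, 0, 4]],
    ["Cristiano Ronaldo", [4, 2, 0]],
]

totals = []
for entry in all_shots:
    name = entry[0]
    shots = entry[1]
    # Same function, different data on every pass of the loop.
    totals.append([name, shot_sum_calculator(shots)])

print(totals)
```

Adding a third player, or a thousandth, means adding a row to **all_shots**; the logic doesn't change at all.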

**If/Else Statement**

The only other essential programming construct that I have to tell you about is this:

In [None]:
def add_4(a, b):
    if a < 0:
        return a
    elif a < 10:
        return b
    else:
        return a + b
    
print(add_4(11, 2))
print(add_4(9, 2))
print(add_4(-1, 3))

13
2
-1


Clearly, the program is looking at some conditions and deciding what code to run but how?

The **if**, **elif** and **else** keywords are control-flow statements: they control the flow of the program, and decide which code will run. The conditions written after **if** and **elif** evaluate to True or False - booleans - and if a condition is True, then the code indented beneath it runs, with no other branch running.

So the first statement a < 0 is either True or False, if False it goes to the next statement a < 10, and if False it goes to the final statement. Note there is only one **if** and one **else** - which runs if every other statement is False - but there can be multiple **elif** keywords for each if statement. The only necessary component of an if statement is the **if**, **else**/**elif** do not need to be included.
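A small sketch of both points: a condition really is just a value that is True or False, and an **if** can stand on its own. The function name **describe_shots** is made up for this example.

```python
a = 9

# A condition is just a value: True or False.
print(a < 0)   # False
print(a < 10)  # True

def describe_shots(shots):
    # An if statement on its own: no elif or else required.
    if shots < 1:
        return "no shots"
    return "at least one shot"

print(describe_shots(0))
print(describe_shots(3))
```

When the condition is False, Python simply skips the indented code and carries on with whatever comes next.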

If statements are necessary components for all programming but I would suggest that people build some intuition with them. Although they seem simple, they are a fairly consistent source of bugs, and it is really only with practice that you learn how not to use them.

**Summary**

The above is 95% of the most basic knowledge required for programming. This may seem surprising because few people could read the stuff above and then start to program. It is like suggesting that someone could read a few sentences in Italian, and have all the basic knowledge required to disassemble a Ferrari using the Italian instructions.

But learning a programming language is slightly different to learning a human language. Yes, there is a lot of complexity. How do I read from a CSV file? How do I run a program on my computer? Where does the print function come from? But a lot of this is just the implementation details of the language that you are using. Some languages have more complex constructs on top of the things we mention above, and I will briefly outline the most important one below. But the most fundamental constructs, the ones that do 95% of the work of storing and operating on data and that give intuition about what programming is, have been covered above.

So I will outline one last construct, and go through an example application that demonstrates how to use this knowledge. Don't get disheartened. You will make it.

**Objects**

The last construct that we are missing is one that is often hard for those new to programming to understand because, frankly, it makes very little sense outside of programming.

When I tell you about a for loop, you can imagine a box and pulling things out of the box, doing something with each thing until the box is empty.

Objects are abstract...but that is actually the point. An object is a data type, like a number or a string, but one that you define. That object can store data but it can also define operations on that data too. For example, we can imagine a number object which represents a specific number - like 9 or 5 - but also has an add operation, a minus operation, etc. which are natural to all numbers. That is an object.

And to define an object, we need to write something called a class.

In [None]:
class Player:
    
    def __init__(self, name):
        self.name = name
        return
    
    def get_name(self):
        return self.name
    
messi = Player("Leo Messi")
ronaldo = Player("Cristiano Ronaldo")

print(messi.get_name())

Leo Messi


Many things about this are going to be unclear. Don't worry. It will make sense to you with time and practice. But I will note a few things.

First, the functions here are defined within the scope of our class. This means we can't use them outside the class, they only make sense when you are referring to a Player object.

Second, we define a class but we create objects. A class is just a definition of the internals but an object refers to an actual player: Messi or Ronaldo, in this case. This distinction is important, again, because we are seeking to abstract the behaviour away. A class refers to a general non-specific definition that can be used if we provide a name, an object is specific.

Third, the weird-looking **__init__** function is just something internal to the programming language that gets called whenever we try to create the object. This is where we pass the name of our player into our object. The weird-sounding **self** argument is actually the object itself...so we pass the object to the object? Yes, basically. Don't worry, just think...we need to define some properties on our player. We have defined their name, we could add their age, place of birth, whatever other properties are unique and interesting to players. But how do we store those properties? We have to store them on the object, so when we create the object, we pass in the name, and store that on the object itself so we can use it later.

Fourth, classes are named with capital letters. This is a convention that keeps them distinct from functions, because when we create an object, we do it using brackets like a function call...so the names need to look different. And when we create an object, the return value is the object itself. We can call functions on that object with a **.function_name()**. The dot is just a way to refer to properties on the object (we used it in **__init__** too).

Finally, whilst most programming languages use objects to create useful abstractions, it is not 100% necessary and is less likely to be used in the kind of data-oriented code that is common to FPL. This does not mean that it isn't useful, esp. when you write complex programs, but it is more common with applications in other areas. Some programming paradigms/languages have no objects at all (and some programming languages, including Python and Java, use objects almost everywhere).
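Going back to the point about storing properties on the object, here is a sketch of a Player with a second property. The **age** parameter, the **get_age** function and the age value are all made up for illustration.

```python
class Player:

    def __init__(self, name, age):
        # Store both properties on the object itself so we can use them later.
        self.name = name
        self.age = age

    def get_name(self):
        return self.name

    def get_age(self):
        return self.age

# The age here is an example value, not a checked fact.
messi = Player("Leo Messi", 33)

print(messi.get_name())
print(messi.get_age())
```

Each object carries its own copy of these properties, so a second Player would have its own name and age.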

In [None]:
##Import code from a package into our scope
##A package refers to a Python package, which is code usually written by someone else
##with some functionality that you require.
##You install the packages from the internet using pip, a Python package manager, which
##installs the packages onto your computer.
##When you type import, Python searches the folder with those packages.
import pandas as pd

##reading a csv file from a github repo
players = pd.read_csv("https://raw.githubusercontent.com/vaastav/Fantasy-Premier-League/master/data/2019-20/cleaned_players.csv")
with_points = players[['second_name', 'total_points']]
##Understanding this requires understanding about pandas which is beyond the scope of this 
##tutorial, basically we are turning this data into a list
to_list = with_points.values.tolist()

In [None]:
##We have a list of lists, each list contains a player's second name
##and the number of fantasy points they scored in the 19-20 season
to_list[0]

['Mustafi', 43]

In [None]:
##So let's see if we can find who the top five points scorers were that season?

##We need to give sorted a function which defines what value we want to sort on
##Without this function, it wouldn't know whether to sort on the player name,
##or the total number of points
def sort_func(value):
    return value[1]

##This is a built-in sorting function in Python. We pass in a key function which returns
##the value to sort on, we pass our list, and we reverse the order so that our list
##is sorted in descending order, as the default order is ascending
top = sorted(to_list, key=sort_func, reverse=True)

In [None]:
##We use the indexing syntax on our list to return the first 5 values
##i.e. the top 5 points scorers for the 19/20 season.
top[0:5]

[['De Bruyne', 251],
 ['Salah', 233],
 ['Mané', 221],
 ['Vardy', 210],
 ['Alexander-Arnold', 210]]
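If you want to play with the sorting step without downloading anything, here is a small sketch using a handful of rows copied from the outputs above, deliberately put in the wrong order. The **sample** name is made up for this example.

```python
# A few rows taken from the outputs above, shuffled on purpose.
sample = [
    ["Mustafi", 43],
    ["Salah", 233],
    ["Vardy", 210],
    ["De Bruyne", 251],
    ["Mané", 221],
]

def sort_func(value):
    # value is one inner list; position 1 holds the points.
    return value[1]

# Highest points first, then keep only the first three.
top = sorted(sample, key=sort_func, reverse=True)
print(top[0:3])
```

Try changing **sort_func** to return **value[0]** instead, and the list will be sorted by name rather than by points.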