# Models

In machine learning, a model "models" the real world, or at least a part of it, using mathematical concepts and language. Humans do something similar: over time, we build an understanding of how the world works, which then affects how we behave and make decisions. When we are little kids, we have a certain model of the world. This helps us know what is edible and what isn't, for instance. As we grow up, we refine this model and become better at distinguishing what is food from what isn't. We might not be able to explain how we make this decision, but we know that given some input, let's say a photo of something on a plate, we are able to predict whether it is food or not.

Of course, my understanding of the world might differ from yours because we have had different lives and experiences, or because we simply take things in differently. My own understanding of the world might even have been different if my life had gone differently. The same is true of models. Depending on how they are built and trained, they might make decisions in wildly different ways.

The fact that we do not know exactly how we humans make decisions is one of the key reasons why machine learning is so challenging and fascinating. It forces us to try to understand our own decision-making. Of course, early attempts at flying machines tried to mimic birds, yet the aircraft that eventually worked do not flap their wings. In the same way, imitating human reasoning might not be the key to creating intelligence, and there are therefore different schools of thought on how machine learning models should be built.

Models have all sorts of properties that we could delve into, but here we'll focus on the high-level components of any model.

## What information is stored

When a model is created, one must define what its state will be composed of. For instance, let's take a simplistic prediction model for house prices. Such a model could contain a single parameter: the average price of a house. It could also be more complex and store the parameters defining the distribution of the price as a function of various variables.

It is crucial to understand what information a model is working with, because this can limit the type of prediction you are able to make. Indeed, if the model only stores the average price of a house, it will not be able to tell the difference between a very cheap house and a very expensive one, because _all_ it knows is the average price.
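To make the house price example concrete, here is a minimal sketch of two models whose state is a single parameter each. The class names, the data, and the choice of "price per square metre" as a second model are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical training data: (surface in square metres, price).
houses = [(50, 150_000), (80, 240_000), (120, 360_000)]


class AveragePriceModel:
    """State: one parameter, the average price seen during training."""

    def __init__(self, houses):
        self.average_price = mean(price for _, price in houses)

    def predict(self, surface):
        # The surface is ignored: nothing in the state can make use of it.
        return self.average_price


class PricePerSquareMetreModel:
    """State: one parameter, the average price per square metre."""

    def __init__(self, houses):
        self.price_per_m2 = mean(price / surface for surface, price in houses)

    def predict(self, surface):
        return self.price_per_m2 * surface


average_model = AveragePriceModel(houses)
surface_model = PricePerSquareMetreModel(houses)
for surface in (30, 300):
    print(surface, average_model.predict(surface), surface_model.predict(surface))
```

The first model returns the same price for a 30 m² flat and a 300 m² house, because its state contains nothing that depends on the input. The second one stores a different piece of information and can therefore tell them apart.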

## How that information is used

Models use the information at their disposal to make decisions, the same way humans do. If you look at a plate of food, you will use your memory to check whether or not you enjoy this particular food, and decide whether to eat it based on this. Humans are very complicated models and we can make use of loads of information very fast, in intricate ways, but like a model, we can only make use of the information that we do possess.

Models may make use of their information in various ways. Some models use a form of decision tree to come to a conclusion, some use statistical distributions, and some use distances between data points, for instance.
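As a rough illustration of two of these approaches, the sketch below classifies a house as "cheap" or "expensive" from its surface, once with a one-question decision tree and once with a distance-based rule. The data, the threshold of 80, and the function names are hypothetical.

```python
# Hypothetical labelled examples: (surface in square metres, label).
examples = [(30, "cheap"), (45, "cheap"), (110, "expensive"), (150, "expensive")]


def decision_stump(surface, threshold=80):
    # A one-question decision tree: the stored information is the threshold.
    return "expensive" if surface > threshold else "cheap"


def nearest_neighbour(surface, examples=examples):
    # A distance-based model: the stored information is the examples themselves.
    closest = min(examples, key=lambda example: abs(example[0] - surface))
    return closest[1]


for surface in (75, 79, 100):
    print(surface, decision_stump(surface), nearest_neighbour(surface))
```

Both models agree on most inputs, but for a surface of 79 the decision stump answers "cheap" while the nearest example (110) is labelled "expensive". The two models were given similar information and still reach different conclusions, because they use it differently.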

## How the model is trained

As we've explained before, models possess a state composed of multiple variables, or parameters, and use them in order to make decisions. Machine learning is also deeply focused on the _learning_ part, that is, on finding good values for these parameters. In theory, one could just try all combinations of parameter values and see which one performs best. But models may contain hundreds, thousands, if not millions of parameters, which makes an exhaustive search of that space intractable.

Each model may learn its parameters in its own way. This is what a big part of machine learning research is about: not necessarily coming up with brand new types of models, but with novel ways to learn parameters more efficiently and robustly.
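The sketch below contrasts the two ideas on a one-parameter model (a price per square metre). It first tries every candidate value, then counts how fast that grid grows with the number of parameters, and finally learns the same parameter with gradient descent. Gradient descent is used here only as one common example of a learning procedure; the data, the candidate grid, and the learning rate are assumptions.

```python
# Hypothetical data: (surface in square metres, price).
houses = [(50, 150_000), (80, 240_000), (120, 360_000)]


def mean_squared_error(price_per_m2):
    return sum((price_per_m2 * s - p) ** 2 for s, p in houses) / len(houses)


# Exhaustive search: evaluate every candidate value of the single parameter.
candidates = range(0, 10_001, 100)  # 101 candidate values
best = min(candidates, key=mean_squared_error)

# With n parameters and 101 candidates each, the grid has 101**n points.
grid_size = {n: len(candidates) ** n for n in (1, 2, 10)}


def gradient(price_per_m2):
    # Derivative of the mean squared error with respect to the parameter.
    return sum(2 * (price_per_m2 * s - p) * s for s, p in houses) / len(houses)


# Gradient descent: start anywhere and repeatedly step downhill.
learned = 0.0
learning_rate = 1e-5
for _ in range(200):
    learned -= learning_rate * gradient(learned)

print(best, round(learned, 2), grid_size)
```

With one parameter, the exhaustive search is perfectly feasible. With ten parameters, the same grid already has 101**10 points, which is why practical training procedures follow some signal, such as the gradient, instead of enumerating combinations.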

## What information is used to train the model

As with humans, models need to be trained effectively. This is why we go to school for so long. We take years to learn some concepts, and the quality of the learning materials has much to do with how well we learn. If we've been given bad maths classes, it will be harder for us to be good at maths, however smart we may be. Models work the same way. They need relevant information, or data, and enough of it to perform well at their task. Often, the available data is the biggest deciding factor in which model to use, as models may perform differently depending on how much data there is and what type it is.

## Examples of models

Now that we have talked about the theoretical aspect of models, let's give some real-world examples.

### Maps

Maps are a great example of models. They bundle our own understanding of where places are located on the planet and represent it using coordinates. If we are at a certain longitude and latitude, we will be in this country, or in the ocean, and so on. Of course, there are many different kinds of maps. Some include borders, some include city-level information, some include information about the topography of certain regions, and some might only cover certain areas of the planet.

These are all different models of the same thing, our planet. You could of course think of the first few maps of the world, which now seem ridiculous because of their inaccuracy, but they are models nonetheless. The only difference between then and now is that an extended set of tools and technology lets us make our models more representative. In essence, the models are the same, as they sum up our understanding of the world in the same fashion.

### Languages

Languages are models too. They map our thoughts into words and sounds. Some languages are better at expressing certain emotions, some are more useful in the sciences, and some are easier to speak. We have invented many different ways to model our human mind so that we could have a medium with which to communicate. The same goes for programming languages: they are models of machine-level instructions that encapsulate physical operations in a mathematical language.

### Writing

Writing is a model. We are trying to put our thoughts on paper. Of course, we might use different alphabets, spellings, or even calligraphy, but all of these different models represent the same thing, albeit in various ways.

### The Law

The Law is also an interesting model. It tries to represent whether an action should be considered acceptable or not by society. It does that by creating a legal system. Of course, laws change constantly, and vary from place to place, as customs and what may be considered acceptable also vary greatly based on time and location.

## What makes a model good or bad?

It all depends on what you consider good or bad. For a map, it might be good to be detailed, or it might be good to be easy to read. For a language, it might be good to be concise, or it might be good to be complete and detailed. Some models are considered good because they are simple, some because they are specific.

Sometimes, what makes a model good is not so much the model itself as how easy it is to understand, train, or update.

## No Free Lunch Theorem

The No Free Lunch theorem states that all models, when their performance is averaged over all possible tasks, are equivalent. This suggests that there isn't a "master model" which will perform well on all tasks. A model might be good at a certain task and horrible at another.

https://en.wikipedia.org/wiki/No_free_lunch_theorem
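The toy sketch below is not a proof of the theorem, only an illustration of its practical consequence: a model whose assumptions match one task can do badly on another. The two tasks and the two models are hypothetical.

```python
inputs = list(range(10))

# Two hypothetical tasks defined over the same inputs.
task_large = {x: x >= 5 for x in inputs}     # "is the number large?"
task_even = {x: x % 2 == 0 for x in inputs}  # "is the number even?"


def threshold_model(x):
    # Assumes the answer depends on how big the number is.
    return x >= 5


def parity_model(x):
    # Assumes the answer depends on whether the number is even.
    return x % 2 == 0


def accuracy(model, task):
    return sum(model(x) == label for x, label in task.items()) / len(task)


for model in (threshold_model, parity_model):
    print(model.__name__, accuracy(model, task_large), accuracy(model, task_even))
```

Each model is perfect on the task that matches its assumption and worse than a coin flip on the other one, so neither can be called better without first saying which task matters.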

## Occam's Razor

Occam's Razor, also called the law of parsimony, is a core principle in machine learning. In short, when two models explain the data equally well, the simpler one is preferred, so we should strive to keep models as simple as possible. This is based on the assumption that the more moving parts a model has, the more likely some of them are to be wrong. If you've ever written code, you know the feeling: it is very easy to make it "work", but it can very quickly become spaghetti code that is difficult to maintain.

## Bias-variance dilemma

Of course, a model needs to be general enough to be flexible and robust to noise, but it also needs to be precise and specific enough to be accurate. A model that is too general misses real patterns in the data (high bias), while a model that is too specific ends up fitting the noise (high variance). This is a dilemma because improving one tends to hurt the other, and one therefore needs to find the right spot. The bias-variance dilemma is a key challenge in machine learning.
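The trade-off can be hinted at with three models of increasing specificity, compared on data they were trained on and on data they have never seen. This is only a sketch on one small, hypothetical dataset (prices in thousands, roughly 3 per square metre plus some noise), not a formal decomposition of the error into bias and variance.

```python
# Hypothetical data: (surface in square metres, price in thousands).
train = [(40, 140), (60, 160), (80, 260), (100, 280), (120, 380)]
test = [(45, 125), (75, 235), (95, 275), (115, 355)]


def fit_constant(data):
    # Too general: always predicts the average training price.
    average = sum(y for _, y in data) / len(data)
    return lambda x: average


def fit_linear(data):
    # One parameter: least-squares price per square metre (line through the origin).
    slope = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return lambda x: slope * x


def fit_nearest_neighbour(data):
    # Too specific: memorises every training point, noise included.
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]


def mean_squared_error(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


models = {
    "constant": fit_constant(train),
    "linear": fit_linear(train),
    "nearest neighbour": fit_nearest_neighbour(train),
}
errors = {
    name: (mean_squared_error(model, train), mean_squared_error(model, test))
    for name, model in models.items()
}
for name, (train_error, test_error) in errors.items():
    print(f"{name}: train={train_error:.1f} test={test_error:.1f}")
```

On this data, the constant model has a large error on both sets, and the nearest-neighbour model reproduces the training set perfectly but does worse than the linear model on the unseen houses. The "right spot" sits between the two extremes, and where exactly it sits depends on the data.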