(ch2:bigpicture)=
# Look at the big picture

Welcome to the Neurobiology Institute! A research advisor you work for has come to you with an interesting problem. All mammals, including humans, have bilateral brains. This means that the brain is separated into left and right halves, each of which has its own specialized functions. The brain, composed of billions of neurons, features a unique communication array in which individual neurons "communicate" with one another by way of long fibers, called axons, which carry signals from the cell body of the neuron to other neurons. The different patterns in which these neurons connect produce the functions your brain is able to perform: it manages passive functions like breathing, your heartbeat, and vascular tone (the ability of your blood vessels to contract and relax to keep blood pumping through your body), as well as more active functions like moving, hearing, seeing, and higher-level thought, such as your ability to read this book or remember its content. Due to their proximity in the brain, neurons in the left hemisphere tend to be much better connected to other neurons in the left hemisphere, and likewise for neurons in the right hemisphere. This pattern is known as within-hemisphere *affinity*.

When neurobiologists look at the brains of *living* organisms, state-of-the-art neuroimaging techniques do not allow them to look at individual neurons. Instead, they acquire MRI images, which group neurons into voxels at approximately 1 mm resolution (voxels are like the pixels of a camera image, but in 3D). A single voxel of the brain typically includes a few million neurons, and a single brain is typically composed of thousands of voxels. Even with thousands of voxels, computation on so much data gets intensive very quickly. For this reason, voxels are usually grouped further with other voxels in their surrounding area, giving neurobiologists a coarser, more tractable granularity to work with. The whole process looks something like this:


```{figure} ../../Images/brain.png
---
scale: 80%
align: center
name: dwi_brain
---
A schematic of the fiber tracts and regions of the brain. These fiber pathways are used to estimate how many connections link different regions of the brain.
```

However, this raises an important question, which is why your colleague has come to you: how do you choose which voxels to group together for your analysis? Can you use what you know about brain function to programmatically determine a good grouping, one that makes your analysis more computationally feasible?

If these are the types of questions you have when you see new network data, then this is the right book for you. 

## Framing the problem

The first question to ask your colleague is: what exactly is the objective here? How will science (or a company) use and benefit from the knowledge we hope to gain? In network machine learning, the choice of model is *everything*. The model determines what sorts of questions we are capable of asking, and what sorts of *answers* we are capable of learning. Asking about the objectives will directly shape which models and approaches you use.

Your colleague replies that he will give you a network. The network will be the brain network of a single human being, where the nodes are the individual voxels from an MRI scan. Each edge weight in the network will represent the total number of fibers that connect one voxel to another voxel in the brain. Your colleague wants to know whether the voxels of the brain can be grouped together with other voxels for further analysis.
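To make this description concrete, here is a minimal sketch of how such a network could be stored as a weighted adjacency matrix. The function name `fiber_count_network`, the input format (one pair of voxel indices per traced fiber), and the toy data are all assumptions made for illustration; the data your colleague provides may be organized differently.

```python
import numpy as np

def fiber_count_network(n_voxels, fibers):
    """Build a symmetric weighted adjacency matrix from fiber endpoints.

    Hypothetical input format: `fibers` is a list of (voxel_i, voxel_j)
    pairs, with one pair for every fiber traced between two voxels.
    """
    A = np.zeros((n_voxels, n_voxels), dtype=int)
    for i, j in fibers:
        if i == j:
            continue  # skip fibers that start and end in the same voxel
        A[i, j] += 1
        A[j, i] += 1  # treat fibers as undirected connections
    return A

# Made-up toy data: 4 voxels and 5 fibers
toy_fibers = [(0, 1), (0, 1), (1, 2), (2, 3), (0, 1)]
A = fiber_count_network(4, toy_fibers)
print(A)
```

Entry `A[i, j]` holds the number of fibers between voxel `i` and voxel `j`, so each row of the matrix summarizes how one voxel connects to every other voxel.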

The next question to ask is what the current solution looks like. This will help you understand where to start approaching the problem, and give you a reference for the performance of your techniques. Your colleague answers that, at present, voxels are grouped together by studying brains which have been donated to science and manually tracing out axonal fiber tracts under a microscope in a research lab. This approach is manual and labor-intensive, and has no performance metrics of note. So you've got a totally novel problem to approach!

Next, you need to determine what type of network machine learning problem you have. What type of data do you have? Do you have any covariates associated with that data? What type of question do you want to answer? Do you want to test a hypothesis, or make predictions? What characteristics will your model need to reflect to be able to answer the question appropriately? Before you progress further, you should try to think through and answer these questions for yourself.

Thinking back to [types of network machine learning problems](ch1:types), you immediately conclude that this is a single-network learning problem. Your network is non-attributed, since you only know the nodes and edges of the network. The question asks about groups of nodes and edges, and you hope to use network modelling approaches to study your problem. You are going to need to come up with a definition of what it means for pairs of voxels to be similar, and you are going to want to group voxels in a way that is meaningful for your colleague.
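As a rough illustration of these two ingredients, the sketch below defines one possible notion of similarity (the cosine similarity between the connectivity profiles of two voxels) and one simple grouping heuristic (splitting voxels by the sign of an eigenvector of the adjacency matrix). Both choices, the helper name `cosine_similarity`, and the toy network are assumptions made for illustration only; they are not the definition your colleague has asked for.

```python
import numpy as np

# Made-up toy network: voxels 0-2 and voxels 3-5 share many fibers within
# each set (weight 5) and few fibers across the two sets (weight 1).
groups_true = np.array([0, 0, 0, 1, 1, 1])
same_group = groups_true[:, None] == groups_true[None, :]
A = np.where(same_group, 5.0, 1.0)
np.fill_diagonal(A, 0.0)  # no self-loops

def cosine_similarity(A):
    """Cosine similarity between the connectivity profiles (rows) of A."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    unit_rows = A / np.where(norms == 0, 1.0, norms)  # avoid dividing by 0
    return unit_rows @ unit_rows.T

S = cosine_similarity(A)

# One simple grouping heuristic (an assumption, not the only option):
# split the voxels by the sign of the eigenvector that belongs to the
# second-largest eigenvalue of A.
eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
labels = (eigvecs[:, -2] > 0).astype(int)

print(np.round(S, 2))
print(labels)
```

In this toy network, voxels in the same set have more similar connectivity profiles than voxels in different sets, and the sign split separates voxels 0-2 from voxels 3-5. Which group is labeled 0 and which is labeled 1 is arbitrary, because an eigenvector is only defined up to its sign.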

## Check the assumptions

Throughout this book, we will try to keep the assumptions made by each technique in clear view. You want to choose the simplest set of assumptions that can reasonably reflect the data. This means that you want to use the simplest statistical model that can answer the question you want to address. In this case, we don't care about individual neuron-to-neuron connections at all: we only care about how groups of neurons behave in relation to other groups of neurons. This means that we want to choose models which allow us to learn about pairs of neuron groups, which is a very different problem from learning about individual neurons themselves. You don't want to develop an analysis which produces results on pairs of neuron groups, only to find out that your colleague actually wanted you to compare individual neurons!

After talking over your understanding of the problem with your research colleague, you are confident that he wants a way to group voxels together based on how similar they are, and he gives you the freedom to define similarity however you choose. You now have the green light to get coding!