# Bayesian network creation
The package is centered on the `BayesianNetwork` class. There are two main ways of creating a network: using the constructor, if you already have a graph structure or parameters, and learning the network from data with the `fit` method.

## Using the constructor
If you already have a graph structure and the network parameters (or joint probability distribution) in the right formats, it is possible to build the network with the constructor.

The graph structure is represented using a `DiGraph` object from the `networkx` package.

In [None]:
from networkx import DiGraph

graph = DiGraph()
graph.add_nodes_from([1, 2])
graph.add_edges_from([(1, 2)])

### Gaussian case
The network parameters are represented with a dictionary where the keys are the identifiers of the nodes (they must be the same as in the `DiGraph` object) and the values are `GaussianNode` objects. `GaussianNode` is just a named tuple with four elements: `uncond_mean`, `cond_var`, `parents` and `parents_coeffs`. For each node, these elements represent the unconditional mean, conditional variance, parents and coefficients in the regression of the node on its parents.

In [None]:
from neurogenpy import GaussianNode

parameters = {1: GaussianNode(0, 1, [], []), 2: GaussianNode(0, 1, [1], [0.8])}
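To make the meaning of the four fields concrete, the following sketch computes the joint covariance implied by the two-node example. It does not use `neurogenpy`: `Node` is a hypothetical stand-in named tuple with the same four fields, and `joint_covariance` is an illustrative helper. The sketch assumes each node equals its unconditional mean plus the coefficient-weighted deviations of its parents from their own means, plus independent noise with variance `cond_var`, and that nodes are listed with parents before children.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the four fields described above.
Node = namedtuple("Node", ["uncond_mean", "cond_var", "parents", "parents_coeffs"])

params = {1: Node(0, 1, [], []), 2: Node(0, 1, [1], [0.8])}


def joint_covariance(params, order):
    """Covariance (dict of dicts) implied by a linear Gaussian network.

    `order` must list every parent before its children.
    """
    cov = {i: {} for i in order}
    for pos, node in enumerate(order):
        spec = params[node]
        weights = dict(zip(spec.parents, spec.parents_coeffs))
        # Covariance with each earlier node: sum_p b_p * Cov(p, earlier)
        for earlier in order[:pos]:
            c = sum(b * cov[p][earlier] for p, b in weights.items())
            cov[node][earlier] = cov[earlier][node] = c
        # Variance: conditional variance + sum_p b_p * Cov(p, node)
        cov[node][node] = spec.cond_var + sum(
            b * cov[p][node] for p, b in weights.items()
        )
    return cov


cov = joint_covariance(params, order=[1, 2])
print(cov[1][1], cov[1][2], cov[2][2])
```

Under these assumptions, node 2 has covariance 0.8 with node 1 and a marginal variance of about 1.64 (its conditional variance of 1 plus 0.8² times the variance of node 1).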

### Discrete case

In the discrete case, we use `pgmpy` as the core package, and the parameters of the network are `pgmpy` `TabularCPD` objects. Suppose node 1 has three possible categories (0, 1 and 2) and its CPD table is

|     |     |
|-----|-----|
|1(0) |0.3  |
|1(1) |0.3  |
|1(2) |0.4  |

and node 2 also has three possible categories and its CPD table, conditioned on node 1, is

|1    |1(0) |  1(1)| 1(2)  |
|-----|-----|------|-------|
|2(0) |0.1  |  0.1 |  0.1  |
|2(1) |0.1  |  0.1 |  0.1  |
|2(2) |0.8  |  0.8 |  0.8  |

In [None]:
from pgmpy.factors.discrete.CPD import TabularCPD

cpd1 = TabularCPD(1, 3, [[0.3], [0.3], [0.4]])
cpd2 = TabularCPD(2, 3, [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.8, 0.8, 0.8]], evidence=[1], evidence_card=[3])

parameters = {1: cpd1, 2: cpd2}
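As a quick sanity check on the tables above, this plain-Python sketch (no `pgmpy` involved) verifies that every column of a CPD sums to 1 and computes the marginal distribution of node 2 by summing over the categories of node 1. The names `p1`, `p2_given_1` and `marginal_2` are illustrative only.

```python
import math

# P(1): one probability per category of node 1.
p1 = [0.3, 0.3, 0.4]

# P(2 | 1): rows are categories of node 2, columns are categories of node 1,
# the same layout as the values passed to TabularCPD above.
p2_given_1 = [
    [0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1],
    [0.8, 0.8, 0.8],
]

# Each column is a probability distribution, so it must sum to 1.
assert math.isclose(sum(p1), 1.0)
for col in zip(*p2_given_1):
    assert math.isclose(sum(col), 1.0)

# Marginalise node 1 out: P(2 = k) = sum_j P(2 = k | 1 = j) * P(1 = j).
marginal_2 = [sum(p * q for p, q in zip(row, p1)) for row in p2_given_1]
print(marginal_2)
```

Because every column of the second table is identical, node 2 is in fact independent of node 1 in this example, and its marginal is approximately equal to any of the columns (0.1, 0.1, 0.8).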

Once you have both the `graph` and the `parameters`, the network can be instantiated in the usual way. Pass `data_type='discrete'` in the discrete case and `data_type='continuous'` in the continuous case.

In [None]:
from neurogenpy import BayesianNetwork

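# `parameters` must match `data_type`: GaussianNode objects for 'continuous',
# TabularCPD objects for 'discrete'.
bn = BayesianNetwork(graph=graph, parameters=parameters, data_type='continuous')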

## Learning the full network from data
As mentioned above, it is possible to learn the structure and parameters of a Bayesian network from data. First, create a pandas `DataFrame` from your data with the following structure:

| Instances  |    Feature 1 |    Feature 2 | ... |    Feature N |
|------------|:------------:|:------------:|:---:|:------------:|
| Instance 1 | $Value_{11}$ | $Value_{12}$ | ... | $Value_{1N}$ |
| Instance 2 | $Value_{21}$ | $Value_{22}$ | ... | $Value_{2N}$ |
| ...        |     ...      |       ...    | ... |       ...    |
| Instance n | $Value_{n1}$ | $Value_{n2}$ | ... | $Value_{nN}$ |

In our example, we create it by reading a CSV file.

In [None]:
import pandas as pd

df = pd.read_csv('data.csv')
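If no dataset is at hand, a small synthetic one with the same layout can be generated using only the standard library. The sketch below is an illustration, not part of the package: the column names `x1` and `x2`, the sample size and the toy model (`x2 = 0.8 * x1 + noise`) are all assumptions chosen for the example.

```python
import csv
import io
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Draw n instances from a toy model: x2 = 0.8 * x1 + noise.
n = 5
rows = []
for _ in range(n):
    x1 = random.gauss(0, 1)
    x2 = 0.8 * x1 + random.gauss(0, 1)
    rows.append({"x1": x1, "x2": x2})

# Write one header line plus one line per instance.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["x1", "x2"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Saving `csv_text` to a file gives a CSV that `pd.read_csv` reads into a `DataFrame` with one row per instance and one column per feature.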

Once the data is in the correct format, there are two ways of learning the network: passing arguments to the `fit` method, or instantiating particular `LearnStructure` and `LearnParameters` subclasses. The two are equivalent; we recommend the first one because it is simpler.

### Set the structure and parameter learning methods using arguments
Once you have read the file, you can learn the network with the `fit` method, setting the structure learning algorithm and the parameter estimation method through its arguments.

In [None]:
bn = BayesianNetwork().fit(df, data_type='continuous', estimation='mle', algorithm='pc')

Additional parameters for the structure learning or parameter estimation algorithm can also be provided as keyword arguments.

In [None]:
bn = BayesianNetwork().fit(df, data_type='continuous', estimation='mle', algorithm='pc', penalty=0.01)

### Instantiate a particular `LearnStructure` or `LearnParameters` subclass
Another option is to instantiate the desired subclasses of `LearnStructure` and `LearnParameters` yourself and pass the resulting objects to `fit`.

In [None]:
from neurogenpy import PC, GaussianMLE

pc = PC(df, data_type='continuous')

mle = GaussianMLE(df)

bn = BayesianNetwork().fit(algorithm=pc, estimation=mle)

## Combinations
You can use combinations of the above methods to build your network.

If you are only interested in the graph structure, you can learn the structure without the parameters by not providing any value for the `estimation` argument.

In [None]:
bn = BayesianNetwork().fit(df, data_type='continuous', algorithm='pc')

On the other hand, if you already have a graph structure and want to learn the parameters, you can provide it in the constructor or load it before calling `fit` with `skip_structure` set to `True`.

In [None]:
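# Option 1: provide the graph in the constructor.
bn = BayesianNetwork(graph=graph)
bn.fit(df, data_type='continuous', estimation='mle', skip_structure=True)

# Option 2: load the graph from a file before fitting.
bn = BayesianNetwork().load('adjacency_matrix.csv')
bn.fit(df, data_type='continuous', estimation='mle', skip_structure=True)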