Here is a common Bayesian network example about a sidewalk.  It might help a package-delivery agent choose the right speed or tires for its delivery route!

We have five variables:
* Season, which takes values 0 to 3 (Winter, Spring, Summer, Fall)
* Sprinkler (the sprinkler is on), Rain, Wet (the sidewalk is wet), and Slippery (the sidewalk is slippery), all of which are binary

The network tells us that the variable **Season** directly influences both **Sprinkler** and **Rain** -- we discovered  that the sprinkler system is on a preset schedule that depends only on the season, and not on whether or not it is raining.  **Sprinkler** and **Rain**  both in turn influence **Wet**, which in turn influences **Slippery**.


![Slippery](SlipperyPicture.GIF)
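
Reading the arrows off the diagram, each variable is conditioned only on its parents, so the joint distribution factorizes as shown below. This is the standard Bayesian network factorization applied to the structure described above, not something taken from the data; the variable names are spelled out in full for readability.

$$
P(\text{Season}, \text{Sprinkler}, \text{Rain}, \text{Wet}, \text{Slippery}) = P(\text{Season})\, P(\text{Sprinkler} \mid \text{Season})\, P(\text{Rain} \mid \text{Season})\, P(\text{Wet} \mid \text{Sprinkler}, \text{Rain})\, P(\text{Slippery} \mid \text{Wet})
$$

Each factor on the right-hand side is one probability table that the network needs.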

We have a data set with historical observations about the variables.  In this case every row is a complete sample from the joint distribution, with a value recorded for every variable.

The data set is in the file `slippery.csv`.  In this file, Season is coded as 0 to 3 (Winter, Spring, Summer, Fall) and the other variables are binary (0 for false, 1 for true).

If we were diligent data scientists, we would verify that the conditional independence assumptions implicit in the model are actually reflected in our sample.  For example, the network embodies the assumption that **Wet** is independent of **Season** conditioned on **Sprinkler** and **Rain**.  This is either (approximately) true or false in the data set.

Instead, we will simply use the sample to estimate the probability parameters we need to build our network.

In [None]:
# Read the file into a data frame and look at the first few rows
import pandas as pd
df = pd.read_csv("slippery.csv", sep=",")

In [None]:
type(df)

In [None]:
df.head()

In [None]:
df.shape

In [None]:
# The columns came from the csv file
df.columns

In [None]:
len(df['Season'])

In [None]:
type(df.Season)

In [None]:
df.Season.value_counts()

In [None]:
# This is the marginal probability distribution of Season
df.Season.value_counts() / df.shape[0]

In [None]:
(df.Season.value_counts() / df.shape[0]).sort_index()

In [None]:
# P(Sprinkler | Season)
pd.crosstab(df.Sprinkler, df.Season, normalize='columns')

In [None]:
pss = pd.crosstab(df.Sprinkler, df.Season, normalize='columns')
print(f"Distribution conditioned on Season=2: {list(pss[2])}")
print(f"P(Sprinkler = 1 | Season=3): {pss[3][1]}")

In [None]:
# P(Wet | Sprinkler, Rain)
pd.crosstab(df.Wet, [df.Sprinkler, df.Rain], normalize='columns')

In [None]:
pwsr = pd.crosstab(df.Wet, [df.Sprinkler, df.Rain], normalize='columns')
print(f"P(Wet = 1 | Sprinkler=1, Rain=0): {pwsr[1][0][1]}")
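
One independence the network structure implies is that **Wet** does not depend on **Season** once **Sprinkler** and **Rain** are known. A rough way to eyeball this in a sample is to estimate P(Wet = 1 | Sprinkler, Rain, Season) and see how much it varies across seasons. The sketch below is only an informal check, not a statistical test; the helper names are made up for illustration, and `toy` is a tiny invented data frame standing in for `slippery.csv`.

```python
import pandas as pd

def wet_given_parents_by_season(df):
    """Estimate P(Wet = 1 | Sprinkler, Rain, Season).

    Rows are (Sprinkler, Rain) pairs, columns are Season values.
    Wet is coded 0/1, so the group mean is the fraction of rows with Wet = 1.
    """
    return df.groupby(["Sprinkler", "Rain", "Season"])["Wet"].mean().unstack("Season")

def max_seasonal_spread(df):
    """Largest gap across seasons in P(Wet = 1 | Sprinkler, Rain, Season).

    A value near 0 is consistent with Wet being independent of Season
    given Sprinkler and Rain; a large value suggests it is not.
    """
    table = wet_given_parents_by_season(df)
    return (table.max(axis=1) - table.min(axis=1)).max()

# Invented toy data (NOT slippery.csv): Wet is simply "Sprinkler or Rain",
# so Season cannot matter once the two parents are known.
toy = pd.DataFrame({
    "Season":    [0, 0, 0, 0, 1, 1, 1, 1],
    "Sprinkler": [0, 0, 1, 1, 0, 0, 1, 1],
    "Rain":      [0, 1, 0, 1, 0, 1, 0, 1],
})
toy["Wet"] = (toy["Sprinkler"] | toy["Rain"]).astype(int)

print(wet_given_parents_by_season(toy))
print(max_seasonal_spread(toy))
```

With the real data, the same call would be `max_seasonal_spread(df)`. Some spread is expected from sampling noise alone, so a small nonzero value does not by itself contradict the model.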

In [None]:
# P(Slippery | Wet)
pd.crosstab(df.Slippery, df.Wet, normalize='columns')
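
The tables computed so far cover Season, Sprinkler, Wet, and Slippery; the network also needs P(Rain | Season), which can be estimated the same way. As a sketch of how the pieces fit together, the code below gathers all five tables and multiplies the matching entries to get the probability of one complete assignment. The function names are hypothetical, and `toy` is an invented uniform data set used only so the example runs without `slippery.csv`.

```python
from itertools import product

import pandas as pd

def estimate_cpts(df):
    """Estimate each conditional probability table from normalized counts."""
    return {
        "season": df["Season"].value_counts(normalize=True).sort_index(),
        "sprinkler": pd.crosstab(df["Sprinkler"], df["Season"], normalize="columns"),
        "rain": pd.crosstab(df["Rain"], df["Season"], normalize="columns"),
        "wet": pd.crosstab(df["Wet"], [df["Sprinkler"], df["Rain"]], normalize="columns"),
        "slippery": pd.crosstab(df["Slippery"], df["Wet"], normalize="columns"),
    }

def joint_probability(cpts, season, sprinkler, rain, wet, slippery):
    """Multiply one entry from each table, following the network structure.

    Raises KeyError if a requested value never appears in the sample.
    """
    return (
        cpts["season"].loc[season]
        * cpts["sprinkler"].loc[sprinkler, season]
        * cpts["rain"].loc[rain, season]
        * cpts["wet"].loc[wet, (sprinkler, rain)]
        * cpts["slippery"].loc[slippery, wet]
    )

# Invented toy data (NOT slippery.csv): every combination of five binary
# values appears exactly once, so every table entry is 0.5.
toy = pd.DataFrame(
    list(product([0, 1], repeat=5)),
    columns=["Season", "Sprinkler", "Rain", "Wet", "Slippery"],
)

cpts = estimate_cpts(toy)
p = joint_probability(cpts, season=0, sprinkler=1, rain=0, wet=1, slippery=1)
print(p)
```

On the real data, `estimate_cpts(df)` would return the same tables shown in the cells above plus the one for Rain.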