## Probability Mass Function
In probability and statistics, a probability mass function (PMF) is a function that gives the probability that a discrete random variable is exactly equal to some value. It is sometimes also called the discrete density function. The PMF is often the primary means of defining a discrete probability distribution, and such functions exist for both scalar and multivariate random variables whose domain is discrete.

The value of the random variable having the largest probability mass is called the mode.
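As a small illustration of the definition, the sketch below writes out the PMF of a hypothetical loaded six-sided die (the weights are made up for this example and do not come from any data set), checks that the probabilities sum to 1, and picks out the mode.

```python
from fractions import Fraction

# Hypothetical loaded die: face 6 is three times as likely as any other face.
weights = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 3}
total = sum(weights.values())

# PMF: p(x) = P(X = x), stored as exact fractions.
pmf = {face: Fraction(w, total) for face, w in weights.items()}

# A valid PMF is non-negative and sums to exactly 1.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

# The mode is the value with the largest probability mass.
mode = max(pmf, key=pmf.get)

print(pmf[6])  # 3/8
print(mode)    # 6
```

Exact fractions are used here only so that the "sums to 1" check is not affected by floating-point rounding.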

## Example 1

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
%matplotlib inline

In [None]:
m = np.random.randint(2, 10, 40)
m

In [None]:
# Changing to DataFrame
df = pd.DataFrame(m)

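# Count how many times each value occurs, ordered by value.
# rename(0) keeps the column label 0 regardless of pandas version.
df = df[0].value_counts().sort_index().rename(0).to_frame()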
df

In [None]:
length = len(m)
length

In [None]:
data = pd.DataFrame(df[0])
data

In [None]:
data.columns = ["Counts"]
data

In [None]:
# Calculating Probability Mass Function
data["Prob"] = data["Counts"] / length
data

In [None]:
# x-axis: the observed values (the index); y-axis: their probabilities
plt.bar(data.index, data["Prob"])

In [None]:
sns.barplot(x=data.index, y=data["Prob"])

## Example 2

In [None]:
data = {
    "Candy": ["Blue", "Orange", "Green", "Purple"],
    "Total": [30000, 18000, 20000, 12000],
}
df = pd.DataFrame(data)
df

In [None]:
df["pmf"] = df["Total"] / df["Total"].sum()
df

In [None]:
plt.bar(df["Candy"], df["pmf"])

In [None]:
sns.barplot(x="Candy", y="pmf", data=df)

## Probability Density Function
A probability mass function differs from a probability density function (PDF) in that the latter is associated with continuous rather than discrete random variables. A PDF must be integrated over an interval to yield a probability.
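To make the "must be integrated" point concrete, here is a minimal sketch that uses only the standard library. The standard normal distribution, the interval [-1, 1], and the helper `integrate_pdf` are choices made for this illustration. It shows that the PDF evaluated at a point is a density, while the probability of an interval comes from integrating the PDF, which agrees with the difference of CDF values.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)  # standard normal, chosen for illustration


def integrate_pdf(a, b, steps=10_000):
    """Approximate the integral of dist.pdf over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width


density_at_zero = dist.pdf(0.0)              # a density value, not a probability
prob_numeric = integrate_pdf(-1.0, 1.0)      # P(-1 <= X <= 1) by integration
prob_exact = dist.cdf(1.0) - dist.cdf(-1.0)  # same probability from the CDF

print(round(density_at_zero, 4))  # 0.3989
print(round(prob_numeric, 4))     # 0.6827
print(round(prob_exact, 4))       # 0.6827
```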

In [None]:
data = np.random.normal(size=100)
data = np.append(data, [1.2, 1.2, 1.2, 1.2, 1.2])
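# distplot is deprecated in recent seaborn; histplot with a KDE overlay is the replacement
sns.histplot(data, kde=True, stat="density")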

In [None]:
import scipy.stats as stats

In [None]:
mu = 20
sigma = 2
h = sorted(np.random.normal(mu, sigma, 100))
plt.figure(figsize=(10, 5))
fit = stats.norm.pdf(h, np.mean(h), np.std(h))
plt.plot(h, fit, "-o")
plt.hist(h, density=True)  # `normed` was removed from Matplotlib; `density` replaces it

# Blue line is the PDF

## Cumulative Distribution Function
In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at y, is the probability that X will take a value less than or equal to y.
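For a discrete random variable, this definition amounts to a running sum of the PMF. The sketch below assumes a fair six-sided die (an example chosen here, not part of the notebook's data) and builds the CDF by accumulating the probabilities.

```python
from fractions import Fraction
from itertools import accumulate

# Fair six-sided die, used here only as an illustrative discrete distribution.
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# F(y) = P(X <= y) is the running sum of the PMF up to and including y.
cdf = dict(zip(values, accumulate(pmf)))

print(cdf[3])  # 1/2, i.e. P(X <= 3)
print(cdf[6])  # 1, i.e. P(X <= 6)
```

The resulting CDF is non-decreasing and reaches 1 at the largest possible value, as any CDF must.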

In [None]:
import scipy.stats as ss

In [None]:
x = np.linspace(-5, 5, 5000)
mu = 0
sigma = 1

y_pdf = ss.norm.pdf(x, mu, sigma)
y_cdf = ss.norm.cdf(x, mu, sigma)

plt.plot(x, y_pdf, label="PDF")
plt.plot(x, y_cdf, label="CDF")
plt.legend()  # show which curve is which instead of relying on default colors

In [None]:
mu = 20
sigma = 2
h = sorted(np.random.normal(mu, sigma, 100))
plt.figure(figsize=(10, 5))
fit = stats.norm.cdf(h, np.mean(h), np.std(h))
plt.plot(h, fit, "-o")
# Cumulative, normalized histogram so the bars are comparable to the CDF curve
plt.hist(h, density=True, cumulative=True)

# Blue line is the CDF