# Descriptive vs Predictive vs Prescriptive Analytics

Analytics can be broadly categorized into three types: **Descriptive**, **Predictive**, and **Prescriptive**.

- **Descriptive analytics** summarizes past data.
- **Predictive analytics** forecasts future outcomes.
- **Prescriptive analytics** recommends actions to achieve specific goals.

---

## Descriptive Analytics

- **Focus**: Summarizes past data to reveal patterns and trends.
- **Purpose**: Helps understand what has happened in the past or is currently happening.
- **Examples**:
  - Calculating averages, percentages, and frequency counts
  - Creating charts and tables
- **Tools**:
  - Data aggregation
  - Data mining
  - Statistical analysis
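
As a minimal sketch of the averages, percentages, and frequency counts listed above, the snippet below aggregates a small table with pandas. The `sales` table, its column names, and its values are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration only.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South", "West"],
    "revenue": [120.0, 80.0, 200.0, 150.0, 100.0, 90.0],
})

# Data aggregation: average revenue and number of orders per region.
summary = sales.groupby("region")["revenue"].agg(["mean", "count"])

# Percentage of total revenue contributed by each region.
share = sales.groupby("region")["revenue"].sum() / sales["revenue"].sum() * 100

print(summary)
print(share.round(1))
```

The output only describes what is already in the table; nothing is forecast or recommended.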

---

## Predictive Analytics

- **Focus**: Uses past data to forecast future outcomes.
- **Purpose**: Helps understand what might happen in the future.
- **Examples**:
  - Predicting sales
  - Identifying customer churn
  - Detecting fraud
- **Tools**:
  - Statistical modeling
  - Machine learning
  - Forecasting techniques
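
To make the sales-prediction example concrete, here is a rough sketch of one very simple forecasting technique: fitting a straight-line trend by least squares with `np.polyfit` and extending it one period ahead. The monthly figures are hypothetical, and a linear trend is an assumption chosen for brevity, not a recommended model.

```python
import numpy as np

# Hypothetical monthly sales (months 1..6), invented for illustration only.
months = np.arange(1, 7)
sales = np.array([100.0, 108.0, 123.0, 130.0, 138.0, 152.0])

# Fit sales = slope * month + intercept by ordinary least squares.
slope, intercept = np.polyfit(months, sales, deg=1)

# Extrapolate the fitted trend to month 7 (assumes the trend continues).
next_month = 7
forecast = slope * next_month + intercept

print(f"Estimated growth per month: {slope:.2f}")
print(f"Forecast for month {next_month}: {forecast:.2f}")
```

The forecast is only as good as the assumption that past behavior continues, which is why predictive results are usually reported with some measure of uncertainty.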

---

## Prescriptive Analytics

- **Focus**: Recommends actions to achieve specific goals or overcome challenges.
- **Purpose**: Helps decide what should be done to achieve desired outcomes.
- **Examples**:
  - Optimizing inventory management
  - Recommending personalized marketing campaigns
  - Suggesting treatment plans
- **Tools**:
  - Optimization algorithms
  - Simulation models
  - Decision support systems
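
The sketch below illustrates one optimization approach, linear programming with `scipy.optimize.linprog`, on a toy production-planning problem. The two products, their profits, and the capacity limits are all hypothetical; the point is only that the output is a recommended action (how much to produce) rather than a summary or a forecast.

```python
from scipy.optimize import linprog

# Hypothetical problem: choose quantities x and y of two products.
# Profit per unit: 3 for x, 5 for y. linprog minimizes, so negate to maximize.
cost = [-3, -5]

# Hypothetical capacity constraints, written as A_ub @ [x, y] <= b_ub:
#   x        <= 4
#   2y       <= 12
#   3x + 2y  <= 18
A_ub = [[1, 0], [0, 2], [3, 2]]
b_ub = [4, 12, 18]

# Quantities cannot be negative.
result = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])

best_x, best_y = result.x
max_profit = -result.fun  # undo the sign flip used for maximization

print("Recommended quantities:", best_x, best_y)
print("Maximum profit:", max_profit)
```
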

---

## Descriptive Statistics

- Descriptive statistics summarize and describe the features of a dataset.
- They deal with actual, observed data and give insight into it without making predictions or generalizations beyond it.

| Measure                    | Description                                             |
| -------------------------- | ------------------------------------------------------- |
| **Mean**                   | The average value.                                      |
| **Median**                 | The middle value when data is ordered.                  |
| **Mode**                   | The most frequent value.                                |
| **Range**                  | Difference between the maximum and minimum.             |
| **Standard Deviation (σ)** | Measures how spread out the values are around the mean. |
| **Variance (σ²)**          | The square of standard deviation; measures dispersion.  |
| **Skewness**               | Describes the asymmetry of the data distribution.       |
| **Kurtosis**               | Measures the "tailedness" of the distribution.          |

```python
import numpy as np
import pandas as pd

data = [10, 12, 14, 14, 15, 18, 20, 22, 24, 24, 30]

print("Mean:", np.mean(data))
print("Median:", np.median(data))
# This data has two modes (14 and 24); mode() returns them sorted, so [0] is the smaller one.
print("Mode:", pd.Series(data).mode()[0])
# np.std and np.var default to ddof=0, i.e. the population formulas (σ and σ²).
print("Standard Deviation:", np.std(data))
print("Variance:", np.var(data))
```

Output:

```
Mean: 18.454545454545453
Median: 18.0
Mode: 14
Standard Deviation: 5.8366185160998265
Variance: 34.066115702479344
```
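
The remaining measures from the table (range, skewness, kurtosis) can be computed on the same `data` list. This is a minimal sketch using `scipy.stats`; the variable names are chosen here for illustration. It also lists every mode and shows the sample variance (`ddof=1`), which differs from the population variance that `np.var` returns by default.

```python
import numpy as np
import pandas as pd
from scipy import stats

data = [10, 12, 14, 14, 15, 18, 20, 22, 24, 24, 30]

# Range: difference between the maximum and the minimum.
value_range = max(data) - min(data)

# All modes, not just the first one (this data is bimodal).
modes = pd.Series(data).mode().tolist()

# Skewness: positive values indicate a longer right tail.
skewness = stats.skew(data)

# Kurtosis: scipy reports excess kurtosis by default (a normal distribution gives 0).
excess_kurtosis = stats.kurtosis(data)

# Sample variance divides by n - 1 instead of n.
sample_variance = np.var(data, ddof=1)

print("Range:", value_range)
print("Modes:", modes)
print("Skewness:", skewness)
print("Excess kurtosis:", excess_kurtosis)
print("Sample variance:", sample_variance)
```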


## Inferential Statistics

- Inferential statistics uses a sample of data to make predictions or inferences about the larger population the sample was drawn from.
- Because only a sample is observed, every conclusion carries some uncertainty, which is quantified with tools such as confidence intervals and p-values.

| Concept                  | Description                                                                                         |
| ------------------------ | --------------------------------------------------------------------------------------------------- |
| **Population vs Sample** | Population: the entire group; Sample: the subset used for analysis                                  |
| **Hypothesis Testing**   | Testing assumptions about a population using statistical evidence                                   |
| **Confidence Interval**  | A range expected to contain the true population parameter at a stated confidence level (e.g. 95%)   |
| **p-value**              | Probability, assuming the null hypothesis is true, of results at least as extreme as those observed |
| **T-tests / Z-tests**    | Compare group means; the choice depends on sample size and whether the population variance is known |
| **Chi-Square Test**      | Tests whether two categorical variables are independent                                             |
| **Regression Analysis**  | Models the relationship between variables                                                           |
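
As a small illustration of the chi-square test of independence from the table, the sketch below runs `scipy.stats.chi2_contingency` on a 2x2 contingency table. The counts and the meaning given to the rows and columns are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical counts: rows = two customer groups, columns = bought / did not buy.
observed = np.array([
    [30, 10],
    [20, 20],
])

# Returns the test statistic, the p-value, the degrees of freedom, and the
# counts expected if the two variables were independent.
# For 2x2 tables SciPy applies Yates' continuity correction by default.
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

print("Chi-square statistic:", chi2)
print("P-Value:", p_value)
print("Degrees of freedom:", dof)
print("Expected counts:\n", expected)
```

A small p-value suggests the row and column variables are not independent in the population; a large one means the sample gives no evidence against independence.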

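```python
from scipy import stats

sample_data = [12, 14, 15, 16, 17, 18, 20]
population_mean = 15  # hypothesized population mean (the null hypothesis)

# One-sample, two-sided t-test: does the sample mean differ from population_mean?
t_statistic, p_value = stats.ttest_1samp(sample_data, population_mean)

print("T-Statistic:", t_statistic)
print("P-Value:", p_value)
```

Output:

```
T-Statistic: 1.0
P-Value: 0.35591768374958205
```

Because the p-value (about 0.36) is well above the common 0.05 threshold, the test does not reject the hypothesis that the population mean is 15.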
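
A confidence interval gives a complementary view of the same sample. The following is a minimal sketch, assuming a 95% confidence level and a t-distribution with n - 1 degrees of freedom; the variable names are chosen here for illustration.

```python
import numpy as np
from scipy import stats

sample_data = [12, 14, 15, 16, 17, 18, 20]

sample_mean = np.mean(sample_data)
standard_error = stats.sem(sample_data)  # sample standard deviation / sqrt(n)
degrees_of_freedom = len(sample_data) - 1

# 95% confidence interval for the population mean, based on the t-distribution.
low, high = stats.t.interval(
    0.95, degrees_of_freedom, loc=sample_mean, scale=standard_error
)

print(f"Sample mean: {sample_mean:.2f}")
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

If a hypothesized mean such as 15 falls inside this interval, the matching two-sided t-test at the 5% level does not reject it.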


## Descriptive vs Inferential Statistics

| Feature       | Descriptive Statistics           | Inferential Statistics              |
| ------------- | -------------------------------- | ----------------------------------- |
| Purpose       | Describe data                    | Make predictions / test hypotheses  |
| Based On      | Entire dataset                   | Sample of the data                  |
| Output        | Graphs, charts, summary values   | Probability, estimates, predictions |
| Example Tools | Mean, Median, Standard Deviation | t-test, ANOVA, Regression, p-values |
