# Amplitude

# Amplitude Infrastructure Overview

Amplitude provides a **collaborative, integrated, and self-service product suite** designed to bring data science and product analytics capabilities to both technical and non-technical users.  

Its **Digital Optimization System** combines real-time data management, behavioral insights, experimentation, and recommendations to help teams continuously improve product experiences.

---

## Digital Optimization System

![opt_system](optimisation_system.png)

### Real-Time Data Management
- **Plan, integrate, and govern** all customer and product data.  
- Create a **single, secure, high-quality view of each customer**.

### Behavioral Graph
- The **brain of the system**: a purpose-built database for complex, iterative behavioral queries.  
- Supports analysis of **historic and real-time data**, while enabling **prediction of future outcomes**.  

---

## Core Components

### 1. Amplitude Analytics
Enables all teams to **explore and understand product usage**:  
- Funnel analysis and user path exploration (Pathfinder)  
- Retention analysis  
- Feature engagement → discover what drives **Lifetime Value (LTV)**  
- Cohort definitions for segmentation  
- Dashboards and notebooks for visibility across teams  

### 2. Amplitude Recommend
Transforms insights into **personalized actions**:  
- Identify target audiences  
- Deliver tailored recommendations  

### 3. Amplitude Experiment
Empowers teams to **test and validate product decisions**:  
- Hypotheses and A/B testing frameworks  
- Configure and run user experience experiments  
- Automate testing and iteration  

---


# Setting Up an Amplitude Experiment (A/B Test)

Amplitude Experiments allow teams to **test product changes on real users** and measure their impact.  
This ensures **data-driven decisions** while minimizing risk to the overall user experience.

---

## Key Concepts

### 1. Separate User Traffic
- Each experiment must have its **own isolated user traffic**, so results are not contaminated by overlapping experiments.

### 2. Feature Flags
- Dynamically controllable **switches** that enable or disable features for specific user segments.  
- Useful for rolling out new features gradually or targeting subsets of users.
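
To make the idea of a flag as a controllable switch concrete, here is a minimal sketch in plain Python. The `FeatureFlag` class, its fields, and the flag key are hypothetical illustrations, not Amplitude's API; in practice flags are configured in Amplitude and evaluated through its SDKs.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical in-memory flag, for illustration only."""
    key: str
    enabled: bool = False
    # User property values required for targeting; empty means all users.
    target_segment: dict = field(default_factory=dict)

    def is_on_for(self, user_properties: dict) -> bool:
        # A disabled flag is off for everyone, regardless of segment.
        if not self.enabled:
            return False
        # Every targeted property must match the user's value.
        return all(
            user_properties.get(name) == value
            for name, value in self.target_segment.items()
        )


flag = FeatureFlag(key="new-checkout", enabled=True,
                   target_segment={"plan": "premium"})
print(flag.is_on_for({"plan": "premium", "country": "DE"}))  # True
print(flag.is_on_for({"plan": "free"}))                      # False
```

Setting `enabled=False` turns the feature off for all users without a code deployment, which is the property that makes flags useful for gradual rollouts.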

---

## Creating an Experiment

1. **Click `Create Experiment` → Feature**
2. **Set up details:**  
   - Name, description, relevant tags
   - Optionally, specify a related product or feature

3. **Design Experiment**
   - **Define Goal (Hypothesis):**  
     - Choose a primary metric to measure success (e.g., conversion rate, retention, engagement).  
     - Specify **success criteria**: does the metric increase or decrease?  
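     - Set the **Minimum Detectable Effect (MDE)**, the smallest change in the primary metric the experiment should be able to detect; a smaller MDE requires a larger sample to keep statistical power.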
   - **Additional Metrics:**  
     - Track secondary metrics to monitor unintended consequences.  
     - Only **one primary goal metric** is allowed.

4. **Exposure Event**  
   - Marks the point at which a user is counted as part of the experiment. It usually defaults to the key user action that triggers inclusion.

5. **Variants**
   - **Control:** Baseline experience, often the current product behavior.  
   - **Treatment:** New experience or change being tested.  
   - Optional: multiple treatment groups, which makes the experiment an A/B/n test.

6. **Targeting**
   - **Audience:**  
     - All users, or a **specific cohort** defined by user properties or pre-existing segments.  
   - **Distribution:**  
     - Percent of audience in each variant. Typically 50/50 for simple A/B tests.  
   - **Rollout:**  
     - Percent of the total audience to be included in the experiment.  
     - Useful for staged rollouts to minimize risk.
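
How rollout and distribution interact can be sketched with deterministic hashing. This is an illustrative assumption about how such assignment is commonly implemented, not a description of Amplitude's internal bucketing; the function names and the experiment key are hypothetical.

```python
import hashlib


def bucket(user_id: str, salt: str) -> float:
    """Map a user to a stable pseudo-random number in [0, 1)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def assign_variant(user_id, experiment_key, rollout, weights):
    """Return a variant name, or None if the user is outside the rollout."""
    # Rollout: only this fraction of the audience enters the experiment.
    if bucket(user_id, experiment_key + ":rollout") >= rollout:
        return None
    # Distribution: split the included users according to the weights.
    point = bucket(user_id, experiment_key + ":variant") * sum(weights.values())
    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return variant
    return list(weights)[-1]  # guard against floating-point edge cases


weights = {"control": 50, "treatment": 50}
assignments = [
    assign_variant(f"user-{i}", "checkout-test", rollout=0.2, weights=weights)
    for i in range(10_000)
]
included = [a for a in assignments if a is not None]
print(len(included) / len(assignments))            # close to 0.2
print(included.count("control") / len(included))   # close to 0.5
```

Because the hash depends only on the user ID and the experiment key, a user sees the same variant on every visit, and different experiments bucket users independently of each other.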

---

## Tips & Best Practices

- Always **randomize assignment** to variants to avoid bias.  
- Consider **sample size** and experiment duration before starting.  
- Track both **primary and secondary metrics** to understand broader impact.  
- Monitor for **overlapping experiments** that may affect results.  
- Document the **hypothesis, target audience, and expected outcomes** before launching.  

---

## Reference
- Amplitude Experiments Tutorial: [YouTube Video](https://www.youtube.com/watch?v=uUlJn8s5jd8)


# Amplitude Experiment & A/B Test – Technical Notes

Amplitude Experiments allow teams to **run controlled experiments on product changes** and measure impact using statistical analysis. Below are technical definitions, implementation notes, and statistical concepts used in experiments.

---

## 1. Event Tracking (Implementation)

Amplitude experiments rely on **instrumented events** to measure user behavior.  

**Example:** Tracking a completed purchase in a web or backend application:

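```python
# Example: Python backend implementation of an Amplitude event
# (assumes the official `amplitude-analytics` package)
from amplitude import Amplitude, BaseEvent

amplitude_client = Amplitude(api_key="YOUR_API_KEY")

# Track a purchase event; track() takes a BaseEvent object
amplitude_client.track(
    BaseEvent(
        event_type="Complete Purchase",
        user_id=str(12345),
        event_properties={
            "total_price": 59.99,
            "items_in_cart": 3,
            "payment_method": "credit_card",
        },
    )
)

# Send any queued events before the process exits
amplitude_client.flush()
```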


### Event Implementation

Amplitude events can be tracked in **frontend** or **backend** environments:

**Frontend:** JavaScript, React, or mobile SDKs (iOS/Android)  
**Backend:** Python, Node.js, or other server-side SDKs  

**Purpose:**  
- Events become **exposure or outcome events** for experiments.  
- Allow Amplitude to **measure metrics** such as conversion, revenue, retention, or engagement.  

**Notes:**  
- `event_type` is the name of the event; metrics such as conversion rate are computed from how often and by whom that event is triggered.  
- `event_properties` allow segmentation and cohort analysis.
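
The relationship between exposure events, outcome events, and a conversion metric can be sketched as follows. The event list, the `"Checkout Viewed"` exposure event, and the `variant` field on each event are made-up assumptions for illustration; Amplitude computes this for you from tracked events.

```python
from collections import defaultdict

# Hypothetical event log; each event carries the user's assigned variant.
events = [
    {"user_id": "u1", "event_type": "Checkout Viewed", "variant": "control"},
    {"user_id": "u1", "event_type": "Complete Purchase", "variant": "control"},
    {"user_id": "u2", "event_type": "Checkout Viewed", "variant": "control"},
    {"user_id": "u3", "event_type": "Checkout Viewed", "variant": "treatment"},
    {"user_id": "u3", "event_type": "Complete Purchase", "variant": "treatment"},
    {"user_id": "u4", "event_type": "Checkout Viewed", "variant": "treatment"},
    {"user_id": "u4", "event_type": "Complete Purchase", "variant": "treatment"},
]


def conversion_by_variant(events, exposure_event, outcome_event):
    """Share of exposed users per variant who also triggered the outcome."""
    exposed = defaultdict(set)
    converted = defaultdict(set)
    for event in events:
        if event["event_type"] == exposure_event:
            exposed[event["variant"]].add(event["user_id"])
        elif event["event_type"] == outcome_event:
            converted[event["variant"]].add(event["user_id"])
    # Only users who were exposed count toward the denominator.
    return {
        variant: len(converted[variant] & users) / len(users)
        for variant, users in exposed.items()
    }


print(conversion_by_variant(events, "Checkout Viewed", "Complete Purchase"))
# {'control': 0.5, 'treatment': 1.0}
```

Counting unique users rather than raw events keeps a user who purchases twice from inflating the conversion rate.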

## 2. Hypothesis Testing Terminology

Amplitude calculates **statistical results automatically** to evaluate experiment outcomes.

| Term | Definition | Amplitude Use |
|------|------------|---------------|
| Hypothesis | Proposed change to test (e.g., "Adding a new checkout button increases purchase rate") | Defined during experiment setup; forms the basis for success metric |
| Control Group | Users who see the current product experience | Baseline for comparison |
| Treatment Group | Users exposed to the new feature | Compared against control to test hypothesis |
| Metric | Measurable outcome (conversion, clicks, revenue) | Primary goal or additional metrics |
| p-value | Probability of observing results at least as extreme as those measured, assuming the null hypothesis is true | Determines statistical significance |
| t-score / z-score | Standardized difference between groups | Used internally to compute p-values |
| Significance Level (α) | Threshold to reject null hypothesis (e.g., 0.05) | Defined in experiment settings |
| Statistical Power (1-β) | Probability of detecting an effect if it exists | Used for sample size recommendations |
| Minimum Detectable Effect (MDE) | Smallest effect size considered practically relevant | Set when defining experiment goals; affects sample size |
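
To show how the z-score, p-value, and significance level fit together, here is a minimal sketch of a two-sided, two-proportion z-test. It is a textbook fixed-sample test under a normal approximation, not necessarily the method Amplitude applies internally, and the user counts are invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_c, n_c, conv_t, n_t):
    """Return (z, two-sided p-value) for treatment vs. control conversion."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    # Under the null hypothesis both groups share one conversion rate.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


alpha = 0.05
# Hypothetical counts: 100/1000 conversions in control, 130/1000 in treatment.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}, significant at alpha: {p < alpha}")
```

With these counts the p-value falls below 0.05, so the null hypothesis of equal conversion rates would be rejected at that level.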

## 3. Metrics & Significance in Amplitude

**Statistical Significance:**  
- Shows whether differences between control and treatment are **unlikely to be due to chance**.  
- Shown in Amplitude as **confidence intervals, p-values, probability to beat control**.  

**Practical Significance:**  
- Indicates whether the effect size is **large enough to matter** in real-world product decisions.  
- Often compared against **Minimum Detectable Effect (MDE)**.  
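
The two kinds of significance can be checked separately. The helper below is a hypothetical sketch that takes an observed difference, its confidence interval, and the MDE; the numbers in the usage lines are invented.

```python
def classify_result(diff, ci_low, ci_high, mde):
    """Return (statistically_significant, practically_significant)."""
    # Statistical: the interval for the difference excludes zero.
    statistically_significant = ci_low > 0 or ci_high < 0
    # Practical: the observed effect is at least as large as the MDE.
    practically_significant = abs(diff) >= mde
    return statistically_significant, practically_significant


# A clear and meaningful lift.
print(classify_result(diff=0.03, ci_low=0.002, ci_high=0.058, mde=0.02))
# (True, True)

# A real but tiny lift: detectable with enough data, yet below the MDE.
print(classify_result(diff=0.005, ci_low=0.001, ci_high=0.009, mde=0.02))
# (True, False)
```

The second case is common in very large experiments, where even negligible effects become statistically significant.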

**Where to see it:**  
- **Experiment Dashboard:** control vs. treatment metrics, p-values, confidence intervals, effect size.  
- **Secondary Metrics Table:** monitors additional metrics for unintended effects.

### Confidence Intervals in A/B Testing

A **confidence interval (CI)** is a range around your observed metric (e.g., conversion rate, click-through rate, revenue) that likely contains the **true metric value for the population**.

- Typically expressed at a **95% confidence level**.  
- This means: *if we ran the same experiment 100 times and computed an interval each time, about 95 of those intervals would contain the true value*.

In **Amplitude Experiments**:  
- Amplitude computes confidence intervals for **control and treatment groups**.  
- They show **uncertainty around observed metrics**, helping you decide whether the treatment really has an effect.

| Variant   | Conversion Rate | 95% CI        |
| --------- | --------------- | ------------- |
| Control   | 10.0%           | 8.0% – 12.0%  |
| Treatment | 13.0%           | 11.0% – 15.0% |

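**Interpretation:**  
- The **observed difference** between control (10%) and treatment (13%) is 3 percentage points.  
- The two intervals overlap between 11% and 12%, so the comparison should rest on the interval for the **difference**. Assuming independent groups and a normal approximation, that interval is roughly 0.2 to 5.8 percentage points, which excludes zero.  
- **Non-overlapping** confidence intervals imply statistical significance, but overlapping intervals do not rule it out.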
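
The interval for the difference between two variants can be computed directly from the per-variant intervals. This is a minimal sketch, assuming the two groups are independent and a normal approximation holds; the standard errors are recovered from the interval widths in the table, since the table does not give sample sizes.

```python
from math import sqrt
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # about 1.96


def diff_ci(rate_c, ci_c, rate_t, ci_t, z=Z_95):
    """CI for (treatment - control) from two symmetric per-variant CIs."""
    # Half the interval width equals z times the standard error.
    se_c = (ci_c[1] - ci_c[0]) / (2 * z)
    se_t = (ci_t[1] - ci_t[0]) / (2 * z)
    # Variances of independent estimates add.
    se_diff = sqrt(se_c**2 + se_t**2)
    diff = rate_t - rate_c
    return diff - z * se_diff, diff + z * se_diff


low, high = diff_ci(0.10, (0.08, 0.12), 0.13, (0.11, 0.15))
print(f"{low:.3f} to {high:.3f}")  # 0.002 to 0.058
```

The interval for the difference is narrower than the naive range obtained by subtracting interval endpoints, because both estimates are unlikely to be at their extremes in opposite directions at once.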

## 4. Where Statistics Are Applied

**During Experiment Analysis:**  
- Amplitude computes:  
  - Differences between control and treatment metrics  
  - p-values, confidence intervals, t/z scores  
  - Probability of beating control  

**Before Experiment Launch (Optional):**  
- Power Analysis to determine **required sample size** for detecting MDE  
- Avoids underpowered experiments that fail to detect real effects
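
A power analysis for a conversion metric can be sketched with the standard normal-approximation formula for comparing two proportions. The baseline rate and MDE below are assumptions chosen for the example, and the sample size Amplitude recommends may differ because it can use other test procedures.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Users needed per variant to detect an absolute lift of `mde_abs`."""
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs**2)


# Hypothetical: 10% baseline conversion, 3 percentage point MDE.
n = sample_size_per_variant(baseline=0.10, mde_abs=0.03)
print(n)  # on the order of 1,800 users per variant

# Halving the MDE roughly quadruples the required sample.
print(sample_size_per_variant(baseline=0.10, mde_abs=0.015))
```

Because the MDE enters the formula squared, small reductions in the effect you want to detect lead to large increases in experiment duration.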

## 5. Implementation Notes & Best Practices

**Events:**  
- Instrument all relevant user actions using `amplitude_client.track`.

**Experiment Setup:**  
- Define goal, variants, audience, rollout, and exposure events.

**Analysis:**  
- Check dashboard for statistical significance  
- Compare effect size to MDE for practical significance  
- Monitor secondary metrics for unintended effects

**Best Practices:**  
- Avoid overlapping experiments on the same cohort  
- Use random assignment for control/treatment  
- Monitor experiment duration to avoid temporal biases
