# <h1 style="text-align: center;" class="list-group-item list-group-item-action active" data-toggle="list" role="tab" aria-controls="home">Sampling in Python</h1>

Sampling is the cornerstone of inferential statistics and hypothesis testing. It is a powerful skill used in survey analysis and experimental design to draw conclusions without surveying an entire population. In this Sampling in Python course, you’ll discover when to use sampling and how to perform common types of sampling, from simple random sampling to more complex methods such as stratified and cluster sampling. Using real-world datasets, including coffee ratings, Spotify songs, and employee attrition, you’ll learn to estimate population statistics and quantify the uncertainty in your estimates by generating sampling distributions and bootstrap distributions.

<a id="toc"></a>

<h3 class="list-group-item list-group-item-action active" data-toggle="list" role="tab" aria-controls="home">Table of Contents</h3>
    
* [1. Introduction to Sampling](#1)
    - Sampling and point estimates
    - Convenience sampling
    - Pseudo-random number generation

* [2. Sampling Methods](#2) 
    - Simple random and systematic sampling
    - Stratified and weighted random sampling
    - Cluster sampling
    - Comparing sampling methods
    
* [3. Sampling Distributions](#3)
    - Relative error of point estimates
    - Creating a sampling distribution
    - Approximate sampling distributions
    - Standard errors and the Central Limit Theorem
    
* [4. Bootstrap Distributions](#4)
    - Introduction to bootstrapping
    - Comparing sampling and bootstrap distributions
    - Confidence intervals

## Explore Datasets

Use the DataFrames imported in the first cell to explore the data and practice your skills!

```python
# Import the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats
import scipy.interpolate
import statsmodels.formula.api as smf

# Load the course datasets as DataFrames
attrition = pd.read_feather("datasets/attrition.feather")
spotify = pd.read_feather("datasets/spotify_2000_2020.feather")
coffee = pd.read_feather("datasets/coffee_ratings_full.feather")
```

## <a id="1"></a>
<font color="lightseagreen" size=+2.5><b>1. Introduction to Sampling</b></font>

<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Table of Contents</a>

Learn what sampling is and why it is so powerful. You’ll also learn about the problems caused by convenience sampling and the differences between true randomness and pseudo-randomness.
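
As a rough illustration of these two ideas, the sketch below uses a synthetic population rather than the course datasets. The array `population`, the seeds, and the sample size of 100 are assumptions chosen for the example. It shows that a seeded generator reproduces the same "random" numbers, and that a convenience sample (here, the first rows of ordered data) can give a far worse point estimate than a simple random sample.

```python
import numpy as np

# Pseudo-random numbers are deterministic: the same seed gives the same draws.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)
draws_a = rng_a.normal(loc=0, scale=1, size=5)
draws_b = rng_b.normal(loc=0, scale=1, size=5)
same_draws = np.array_equal(draws_a, draws_b)

# Hypothetical population: 10,000 synthetic scores, stored in ascending order
# to mimic a dataset whose row order is related to the value of interest.
rng = np.random.default_rng(seed=2024)
population = np.sort(rng.normal(loc=80, scale=5, size=10_000))
population_mean = population.mean()

# Convenience sample: take whatever is easiest to reach (the first 100 rows).
convenience_mean = population[:100].mean()

# Simple random sample: every row has the same chance of being picked.
random_mean = rng.choice(population, size=100, replace=False).mean()

print(f"Same draws from same seed: {same_draws}")
print(f"Population mean:        {population_mean:.2f}")
print(f"Convenience point est.: {convenience_mean:.2f}")
print(f"Random point est.:      {random_mean:.2f}")
```

Because the first rows hold only the lowest scores, the convenience estimate lands well below the population mean, while the random estimate stays close to it.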

## <a id="2"></a>
<font color="lightseagreen" size=+2.5><b>2. Sampling Methods</b></font>

<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Table of Contents</a>

It’s time to get hands-on and perform the four random sampling methods in Python: simple, systematic, stratified, and cluster.
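
One possible way to sketch the four methods with pandas is shown below. The DataFrame `df`, its `group` and `value` columns, and the sample sizes are hypothetical stand-ins for a course dataset, not the course's own code.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
groups = list("ABCDE")

# Hypothetical dataset: 1,000 rows, each belonging to one of five groups.
df = pd.DataFrame({
    "group": rng.choice(groups, size=1000),
    "value": rng.normal(size=1000),
})

# Simple random sampling: every row is equally likely to be chosen.
simple = df.sample(n=100, random_state=1)

# Systematic sampling: take every k-th row. Shuffling first guards against
# patterns in the row order leaking into the sample.
interval = len(df) // 100
shuffled = df.sample(frac=1, random_state=1).reset_index(drop=True)
systematic = shuffled.iloc[::interval]

# Stratified sampling: draw the same fraction from every group.
stratified = df.groupby("group").sample(frac=0.1, random_state=1)

# Cluster sampling: randomly pick a few groups, then sample only within them.
chosen = rng.choice(groups, size=2, replace=False)
cluster = (
    df[df["group"].isin(chosen)]
    .groupby("group")
    .sample(n=20, random_state=1)
)

print(len(simple), len(systematic), len(stratified), len(cluster))
```

Stratified sampling keeps every group represented in proportion to its size, whereas cluster sampling trades some representativeness for the lower cost of visiting only a few groups.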

## <a id="3"></a>
<font color="lightseagreen" size=+2.5><b>3. Sampling Distributions</b></font>

<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Table of Contents</a>

Let’s test your sampling. In this chapter, you’ll discover how to quantify the accuracy of sample statistics using relative errors, and measure variation in your estimates by generating sampling distributions.
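
The following sketch illustrates both ideas on a synthetic population; the array `population`, the sample size of 200, and the 1,000 replicates are assumptions made for the example. It computes the relative error of one point estimate, builds a sampling distribution of the mean by repeated sampling, and compares its spread with the standard error suggested by the Central Limit Theorem.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Hypothetical population with a skewed distribution.
population = rng.exponential(scale=10, size=50_000)
population_mean = population.mean()
sample_size = 200

# Relative error of a single point estimate, as a percentage.
sample = rng.choice(population, size=sample_size, replace=False)
rel_error_pct = 100 * abs(population_mean - sample.mean()) / population_mean

# Sampling distribution: repeat the sampling many times, keep each mean.
sample_means = np.array([
    rng.choice(population, size=sample_size, replace=False).mean()
    for _ in range(1000)
])

# Standard error: the standard deviation of the sampling distribution,
# compared with population std / sqrt(n).
observed_se = sample_means.std(ddof=1)
expected_se = population.std(ddof=0) / np.sqrt(sample_size)

print(f"Relative error of one sample: {rel_error_pct:.2f}%")
print(f"Observed standard error:      {observed_se:.3f}")
print(f"Expected standard error:      {expected_se:.3f}")
```

Even though the population is skewed, the sample means cluster around the population mean, and their spread shrinks as the sample size grows.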

## <a id="4"></a>
<font color="lightseagreen" size=+2.5><b>4. Bootstrap Distributions</b></font>

<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Table of Contents</a>

You’ll get to grips with resampling to perform bootstrapping and estimate variation in an unknown population. You’ll learn the difference between sampling distributions and bootstrap distributions using resampling.
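
A minimal sketch of bootstrapping is given below, assuming a synthetic `sample` in place of a course dataset and 2,000 resamples (both are choices made for illustration). Each bootstrap replicate resamples the original sample with replacement at the same size; the spread of the replicates estimates the standard error, and their quantiles give a simple percentile confidence interval.

```python
import numpy as np

rng = np.random.default_rng(seed=11)

# Hypothetical sample; the population it came from is treated as unknown.
sample = rng.normal(loc=50, scale=8, size=300)

# Bootstrap distribution: resample the sample itself, with replacement,
# keeping the resample the same size as the original sample.
boot_means = np.array([
    rng.choice(sample, size=len(sample), replace=True).mean()
    for _ in range(2000)
])

# The standard deviation of the bootstrap distribution estimates
# the standard error of the sample mean.
boot_se = boot_means.std(ddof=1)

# Percentile method: the middle 95% of the bootstrap distribution.
ci_lower, ci_upper = np.quantile(boot_means, [0.025, 0.975])

print(f"Sample mean:        {sample.mean():.2f}")
print(f"Bootstrap std err:  {boot_se:.3f}")
print(f"95% CI (quantiles): ({ci_lower:.2f}, {ci_upper:.2f})")
```

Unlike a sampling distribution, which needs repeated draws from the population, the bootstrap distribution is built from a single sample, so it is centered on the sample mean rather than the population mean.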