# Point Estimates

> The sample statistic is calculated from the sample data and the population parameter is inferred (or estimated) from this sample statistic. Let me say that again: Statistics are calculated, parameters are estimated. - James Jones [(Source)](https://people.richland.edu/james/lecture/m170/ch08-int.html)

## Data

Today we will use an API to download wage and classification information for 500 randomly selected employees out of the 12,401 City of Seattle employees on record as of October 2019.

## Import Necessary Packages

In [None]:
import pandas as pd
from matplotlib import pyplot as plt
import requests
import random

## Define the Population and Sample Sizes

In [None]:
total_employees = 12401
sample_size = 500

## Call the API to Retrieve the City of Seattle Wage Data

Randomly select 500 employees of the 12,401. How would you do this?

You may find the Socrata query documentation helpful: https://dev.socrata.com/docs/queries/

### 1. Make a request to the API
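
One way to approach this step, as a minimal sketch rather than the expected solution. It assumes the endpoint URL used later in this notebook and that the API answers with a JSON list of rows. The helper name `fetch_rows` and its `get` argument are hypothetical: taking the getter as an argument lets you try the function with a stand-in before sending real requests.

```python
url = "https://data.seattle.gov/resource/2khk-5ukd.json"


def fetch_rows(get, url, params=None):
    """Send a GET request and return the decoded JSON body.

    `get` is any callable with the same interface as `requests.get`.
    """
    response = get(url, params=params)
    response.raise_for_status()  # raise on an HTTP error instead of parsing it
    return response.json()       # assumed to be a list of dicts, one per row


# In the notebook, where `requests` is already imported:
#   employees = fetch_rows(requests.get, url)
#   len(employees), employees[0]
```

Checking `len(employees)` is a quick way to see how many rows the endpoint returns when no query parameters are given.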

### 2. Limit request to obtain one single employee
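
A possible sketch for this step, assuming the same endpoint URL as the rest of the notebook. The `$limit` parameter caps the number of rows returned. The `preview_url` variable is only an illustration aid: it shows the URL such a request would target without sending anything.

```python
from urllib.parse import urlencode

url = "https://data.seattle.gov/resource/2khk-5ukd.json"

# $limit caps the number of rows returned; 1 asks for a single employee.
params = {"$limit": 1}

# Build the full request URL locally, without touching the network.
preview_url = f"{url}?{urlencode(params)}"
print(preview_url)

# In the notebook, the request itself could be:
#   response = requests.get(url, params=params)
#   employee = response.json()[0]
```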

### 3. Pick the 50th employee
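
A sketch of one way to do this, not the only one. `$offset` counts the rows to skip, so the 50th row corresponds to an offset of 49 combined with a limit of 1. The helper `nth_employee_params` is a hypothetical name introduced for illustration.

```python
def nth_employee_params(n):
    """Return query parameters selecting the n-th row (1-based)."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    # Skipping n - 1 rows means the next row returned is row n.
    return {"$limit": 1, "$offset": n - 1}


params = nth_employee_params(50)
print(params)

# In the notebook: requests.get(url, params=params).json()
```

Which employee counts as the "50th" depends on the order in which the API returns rows. If a reproducible ordering matters, the Socrata paging documentation suggests adding an `$order` clause to the query.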

### 4. Pick one single random employee

_Note: use [random.sample()](https://docs.python.org/3/library/random.html#random.sample) to create a list of unique random integers, then index that list to retrieve one integer. Pass that integer as the `$offset` parameter to retrieve a random employee._
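
The note above could be sketched as follows. The seed value is an assumption added for reproducibility, and the request itself is left as a comment so the snippet runs without network access.

```python
import random

total_employees = 12401
sample_size = 500

random.seed(2019)  # assumption: fixed seed so the draw is repeatable

# Unique integers in [0, total_employees), each one a valid row offset.
random_ints = random.sample(range(total_employees), k=sample_size)

# Index the list to get a single random offset.
random_offset = random_ints[0]
params = {"$limit": 1, "$offset": random_offset}
print(params)

# In the notebook: requests.get(url, params=params).json()
```

Because `random.sample` draws without replacement, no offset repeats, which means no employee is selected twice once the full list is used in the next step.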

### 5. Pick 500 random employees

*Note: store this list as `city_wages`*

In [None]:
random.seed(2019)

url = "https://data.seattle.gov/resource/2khk-5ukd.json"

random_ints = random.sample(population=range(total_employees), 
                            k=sample_size)

city_wages = []

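# Request one row per random offset; each response is a one-element list
for random_int in random_ints:
    params = {"$limit": 1,
              "$offset": random_int}
    response = requests.get(url=url, params=params)
    response.raise_for_status()  # stop early on an HTTP error
    city_wages.extend(response.json())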

### 6. Transform `city_wages` into a DataFrame (`city_wages_df`) and calculate the mean `hourly_rate` from your random sample

In [None]:
city_wages_df = pd.DataFrame(city_wages)
city_wages_df.head()

In [None]:
city_wages_df["hourly_rate"].astype(float).mean()
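
The value computed above is a sample statistic, and it serves as a point estimate of the mean hourly rate for all employees. As a rough illustration of that distinction, the sketch below uses a simulated population. The uniform range of 15 to 90 is an arbitrary assumption and is not Seattle data. The population mean can be computed here only because the population is simulated.

```python
import random
import statistics

random.seed(2019)

# Hypothetical population of hourly rates, invented purely for illustration.
population = [round(random.uniform(15, 90), 2) for _ in range(12401)]

# Parameter: normally unknown; available here because the data are simulated.
population_mean = statistics.mean(population)

# Statistic: calculated from a random sample of 500, as in this notebook.
sample = random.sample(population, k=500)
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (point estimate): {sample_mean:.2f}")
```

Rerunning with a different seed gives a different sample mean while the population mean of the simulated data changes only because the population itself is redrawn; with real data the parameter stays fixed and only the estimate varies from sample to sample.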

### 7. Make a visualization that shows the distribution of `hourly_rate`

*Note: please assign a title to your plot and label your axes as well*

In [None]:
fig, ax = plt.subplots()

city_wages_df["hourly_rate"].astype(float).hist(ax=ax)
ax.set_xlabel("Hourly rate")
ax.set_ylabel("Frequency")
ax.set_title("Distribution of hourly rate in a sample of City of Seattle employees")

fig.tight_layout()