# Central Tendency for Housing Data

In this project, you will find the mean, median, and mode cost of one-bedroom apartments in three of the five New York City boroughs: Brooklyn, Manhattan, and Queens.

Using your findings, you will draw conclusions about the cost of living in each of the boroughs. We will also discuss an important assumption that we make when we point out differences between the boroughs.

We worked with Streeteasy.com to collect this data. While we will only focus on the cost of one-bedroom apartments, the [dataset includes a lot more information](https://github.com/Codecademy/datasets/tree/master/streeteasy) if you’re interested in asking your own questions about the Brooklyn, Manhattan, and Queens housing market.


## Tasks

### Observing your Data

1. We’ve imported data about one-bedroom apartments in three of New York City’s boroughs: Brooklyn, Manhattan, and Queens. We saved the values to:

    - `brooklyn_one_bed`
    - `manhattan_one_bed`
    - `queens_one_bed`

    In this project, we only care about the price of apartments, so we saved the price of apartments in each borough to:

    - `brooklyn_price`
    - `manhattan_price`
    - `queens_price`

    If you want to see what these variables look like, you can use print statements to display them in the output terminal.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following print statements for any variable you’re interested in viewing:

    ```python
    print(brooklyn_one_bed)
    print(manhattan_one_bed)
    print(queens_one_bed)
    
    print(brooklyn_price)
    print(manhattan_price)
    print(queens_price)
    ```
    </details>


### Find the Mean

2. Before starting the next few steps, delete any `print()` statements you’ve added.

    Find the average value of one-bedroom apartments in Brooklyn and save the value to `brooklyn_mean`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following NumPy `.average()` function to find the average and save it to a variable:

    ```python
    mean_value = np.average(example_array)
    ```
    </details>

3. Find the average value of one-bedroom apartments in Manhattan and save the value to `manhattan_mean`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following NumPy `.average()` function to find the average and save it to a variable:

    ```python
    mean_value = np.average(example_array)
    ```
    </details>

4. Find the average value of one-bedroom apartments in Queens and save the value to `queens_mean`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following NumPy `.average()` function to find the average and save it to a variable:

    ```python
    mean_value = np.average(example_array)
    ```
    </details>
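
To make the arithmetic behind these three tasks concrete, here is a minimal sketch that computes an unweighted mean by hand in plain Python. The `toy_rents` list is made up for illustration and is not taken from the Streeteasy data. When no weights are supplied, `np.average()` performs the same sum-divided-by-count calculation.

```python
# Hypothetical rents, invented for illustration (not from the dataset)
toy_rents = [2500, 3000, 3200, 3300, 4000]

# An unweighted mean is the total divided by the number of values
toy_mean = sum(toy_rents) / len(toy_rents)

print(toy_mean)
```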


### Find the Median

5. Find the median value of one-bedroom apartments in Brooklyn and save the value to `brooklyn_median`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following NumPy `.median()` function to find the median and save it to a variable:

    ```python
    median_value = np.median(example_array)
    ```
    </details>

6. Find the median value of one-bedroom apartments in Manhattan and save the value to `manhattan_median`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following NumPy `.median()` function to find the median and save it to a variable:

    ```python
    median_value = np.median(example_array)
    ```
    </details>

7. Find the median value of one-bedroom apartments in Queens and save the value to `queens_median`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following NumPy `.median()` function to find the median and save it to a variable:

    ```python
    median_value = np.median(example_array)
    ```
    </details>
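
It can help to see how the median behaves next to the mean before comparing the boroughs. The sketch below uses Python's built-in `statistics` module rather than NumPy so that it runs without extra packages, and both lists are hypothetical values rather than rows from the dataset. It shows that one unusually expensive listing moves the mean a lot and the median only a little.

```python
import statistics

# Hypothetical rents, invented for illustration (not from the dataset)
toy_rents = [2500, 3000, 3200, 3300, 4000]
toy_rents_with_outlier = toy_rents + [15000]

# The median is the middle value of the sorted data
# (or the average of the two middle values when the count is even)
median_before = statistics.median(toy_rents)
median_after = statistics.median(toy_rents_with_outlier)

# The mean uses every value, so a single extreme rent pulls it upward
mean_before = statistics.mean(toy_rents)
mean_after = statistics.mean(toy_rents_with_outlier)

print("median:", median_before, "->", median_after)
print("mean:", mean_before, "->", round(mean_after, 2))
```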



### Find the Mode

8. Find the mode value of one-bedroom apartments in Brooklyn and save the value to `brooklyn_mode`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following stats `.mode()` function to find the mode and save it to a variable:

    ```python
    mode_value = stats.mode(example_array)
    ```
    </details>

9. Find the mode value of one-bedroom apartments in Manhattan and save the value to `manhattan_mode`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following stats `.mode()` function to find the mode and save it to a variable:

    ```python
    mode_value = stats.mode(example_array)
    ```
    </details>

10. Find the mode value of one-bedroom apartments in Queens and save the value to `queens_mode`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Use the following stats `.mode()` function to find the mode and save it to a variable:

    ```python
    mode_value = stats.mode(example_array)
    ```
    </details>
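
The output cell for this project reports both the most common price and how many times it appears. As a rough illustration of that pairing, the sketch below finds a mode and its count with `collections.Counter` from the standard library. This is a stand-in for demonstration, not the `stats.mode()` call the tasks ask for, and `toy_rents` is invented data with a single clear mode (ties may be resolved differently by different tools).

```python
from collections import Counter

# Hypothetical rents, invented for illustration (not from the dataset)
toy_rents = [2500, 3000, 3000, 3200, 3000, 4000]

# most_common(1) returns a list with one (value, count) pair
mode_value, mode_count = Counter(toy_rents).most_common(1)[0]

print(f"The mode is {mode_value} and it appears {mode_count} times out of {len(toy_rents)}")
```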



### What does our data tell us?

11. Now what?

    We don’t find the mean, median, and mode of a dataset for the sake of it.
    
    The point is to make inferences from our data. What can you say about housing prices in Brooklyn, Queens, and Manhattan, besides “It’s really expensive to live in any of them”?
    
    Take a minute to think through it. We added our thoughts to the hint.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    It looks like the average cost of a one-bedroom apartment is highest in Manhattan and lowest in Queens. This pattern holds for the median and mode values as well.

    While the mode is not the most important indicator of centrality, the fact that the mean, median, and mode are within a few hundred dollars of one another in each borough indicates that the data is centered around approximately:

    - $3,300 for Brooklyn
    - $3,900 for Manhattan
    - $2,300 for Queens


    </details>

12. Did you make any assumptions when you drew inferences in the previous task?

    If so, what assumptions did you make? We added our thoughts to the hint.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    We assumed that the data from Streeteasy is representative of housing prices for the entire borough. Given that Streeteasy is used by only a subset of property owners, this is not a fair assumption. A quick search on rentcafe.com will tell you the averages are more like:

    - $2,695 for Brooklyn one-bedroom apartments
    - $4,188 for Manhattan one-bedroom apartments
    - $2,178 for Queens one-bedroom apartments

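    This is an interesting finding. Compared with the centers we estimated from the Streeteasy data, the rentcafe.com average is roughly $290 higher in Manhattan ($4,188 versus $3,900), but roughly $600 lower in Brooklyn ($2,695 versus $3,300) and roughly $120 lower in Queens ($2,178 versus $2,300). Why might the two sources disagree in opposite directions?

    Although we don’t have the answer to this question, it’s worth thinking about the possible differences between our Streeteasy data and the sources rentcafe.com pulls its data from.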
    </details>

13. Finally, think about what the histogram for each dataset will look like.

    If you have the time, take a minute to make a rough sketch of the histograms for the cost of a one-bedroom apartment in Brooklyn, Manhattan, and Queens.
    
    Here is someone else’s attempt at a sketch of the Brooklyn histogram:
    
    ![Brooklyn Sketch](./assets/brooklyn-histogram.webp)
    
    When you’re finished, open the hint to take a look at the actual histograms for Brooklyn, Manhattan, and Queens.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    ![Histograms for Brooklyn, Manhattan, and Queens](./assets/one-bedroom-costs-nyc.webp)
    </details>
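
If you would rather check a sketch numerically than by eye, a histogram is just a count of values per price range. The following is a minimal text-only sketch under stated assumptions: `toy_rents` is invented data, and the $500 `bin_width` is an arbitrary choice rather than the binning used in the images above.

```python
from collections import Counter

# Hypothetical rents, invented for illustration (not from the dataset)
toy_rents = [2100, 2400, 2450, 2900, 3000, 3100, 3150, 3300, 3600, 4200]
bin_width = 500  # assumed bin size in dollars

# Map each rent to the lower edge of its bin, then count rents per bin
bin_counts = Counter((rent // bin_width) * bin_width for rent in toy_rents)

# Print one row of '#' marks per bin, from cheapest to most expensive
for lower_edge in sorted(bin_counts):
    bar = "#" * bin_counts[lower_edge]
    print(f"{lower_edge}-{lower_edge + bin_width - 1}: {bar}")
```

The tallest row marks the price range where listings cluster, which is where you would expect the mean, median, and mode to sit when they agree closely.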


```python
# Import packages
import numpy as np
import pandas as pd
from scipy import stats

# Read in housing data
brooklyn_one_bed = pd.read_csv('data/brooklyn-one-bed.csv')
brooklyn_price = brooklyn_one_bed['rent']

manhattan_one_bed = pd.read_csv('data/manhattan-one-bed.csv')
manhattan_price = manhattan_one_bed['rent']

queens_one_bed = pd.read_csv('data/queens-one-bed.csv')
queens_price = queens_one_bed['rent']

# Add mean calculations below



# Add median calculations below



# Add mode calculations below








##############################################
##############################################
##############################################







# Don't look below here
# Mean
try:
    print("The mean price in Brooklyn is " + str(round(brooklyn_mean, 2)))
except NameError:
    print("The mean price in Brooklyn is not yet defined.")
try:
    print("The mean price in Manhattan is " + str(round(manhattan_mean, 2)))
except NameError:
    print("The mean price in Manhattan is not yet defined.")
try:
    print("The mean price in Queens is " + str(round(queens_mean, 2)))
except NameError:
    print("The mean price in Queens is not yet defined.")


# Median
try:
    print("The median price in Brooklyn is " + str(brooklyn_median))
except NameError:
    print("The median price in Brooklyn is not yet defined.")
try:
    print("The median price in Manhattan is " + str(manhattan_median))
except NameError:
    print("The median price in Manhattan is not yet defined.")
try:
    print("The median price in Queens is " + str(queens_median))
except NameError:
    print("The median price in Queens is not yet defined.")


# Mode
# np.atleast_1d lets the indexing work whether stats.mode returns
# arrays (older SciPy) or scalars (newer SciPy)
try:
    print("The mode price in Brooklyn is " + str(np.atleast_1d(brooklyn_mode[0])[0]) + " and it appears " + str(np.atleast_1d(brooklyn_mode[1])[0]) + " times out of " + str(len(brooklyn_price)))
except NameError:
    print("The mode price in Brooklyn is not yet defined.")
try:
    print("The mode price in Manhattan is " + str(np.atleast_1d(manhattan_mode[0])[0]) + " and it appears " + str(np.atleast_1d(manhattan_mode[1])[0]) + " times out of " + str(len(manhattan_price)))
except NameError:
    print("The mode price in Manhattan is not yet defined.")
try:
    print("The mode price in Queens is " + str(np.atleast_1d(queens_mode[0])[0]) + " and it appears " + str(np.atleast_1d(queens_mode[1])[0]) + " times out of " + str(len(queens_price)))
except NameError:
    print("The mode price in Queens is not yet defined.")
```


