# Data Dictionary

This notebook inspects each data file, its features, and their data types to support building a data dictionary.

In [1]:
import pandas as pd

### reviews.csv

In [2]:
# read in reviews data
reviews = pd.read_csv('/Users/jessieowens2/Desktop/general_assembly/airbnb_data/reviews.csv')

In [3]:
reviews.columns

Index(['listing_id', 'id', 'date', 'reviewer_id', 'reviewer_name', 'comments'], dtype='object')

In [4]:
reviews.dtypes

listing_id        int64
id                int64
date             object
reviewer_id       int64
reviewer_name    object
comments         object
dtype: object
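
The `date` column loads as `object`, meaning pandas is treating it as plain text. A minimal sketch of converting it to a datetime type follows. The `sample_reviews` frame and its values are hypothetical stand-ins for `reviews.csv`, and the ISO `YYYY-MM-DD` format is an assumption about how the dates are stored.

```python
import pandas as pd

# Hypothetical sample standing in for reviews.csv; the values are made up.
sample_reviews = pd.DataFrame({
    "listing_id": [3344, 3344],
    "date": ["2019-05-01", "2019-06-15"],  # assumed ISO date strings
})

# Dates stay as text unless they are parsed explicitly.
sample_reviews["date"] = pd.to_datetime(sample_reviews["date"])

print(sample_reviews.dtypes)
```

Passing `parse_dates=["date"]` to `pd.read_csv` is another way to get the same result at load time.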

### listings.csv

In [5]:
# read in listings data
listings = pd.read_csv('/Users/jessieowens2/Desktop/general_assembly/airbnb_data/listings.csv')

In [6]:
listings.columns

Index(['id', 'name', 'host_id', 'host_name', 'neighbourhood_group',
       'neighbourhood', 'latitude', 'longitude', 'room_type', 'price',
       'minimum_nights', 'number_of_reviews', 'last_review',
       'reviews_per_month', 'calculated_host_listings_count',
       'availability_365'],
      dtype='object')

In [7]:
listings.dtypes

id                                  int64
name                               object
host_id                             int64
host_name                          object
neighbourhood_group               float64
neighbourhood                      object
latitude                          float64
longitude                         float64
room_type                          object
price                               int64
minimum_nights                      int64
number_of_reviews                   int64
last_review                        object
reviews_per_month                 float64
calculated_host_listings_count      int64
availability_365                    int64
dtype: object
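
`neighbourhood_group` shows up as `float64`, which is what pandas assigns to a column that contains only missing values. One way to confirm that a column is entirely empty is sketched below. The `sample_listings` frame is a hypothetical stand-in for `listings.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for listings.csv; the values are made up.
sample_listings = pd.DataFrame({
    "id": [1, 2, 3],
    "neighbourhood_group": [np.nan, np.nan, np.nan],
    "neighbourhood": ["A", "B", "C"],
})

# Names of the columns in which every value is missing.
empty_cols = sample_listings.columns[sample_listings.isna().all()].tolist()

print(empty_cols)
```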

### neighbourhoods.csv

In [8]:
# read in neighbourhoods data
neighbourhoods = pd.read_csv('/Users/jessieowens2/Desktop/general_assembly/airbnb_data/neighbourhoods.csv')

In [9]:
neighbourhoods.columns

Index(['neighbourhood_group', 'neighbourhood'], dtype='object')

In [10]:
neighbourhoods.dtypes

neighbourhood_group    float64
neighbourhood           object
dtype: object

### calendar.csv

In [11]:
# read in calendar data
calendar = pd.read_csv('/Users/jessieowens2/Desktop/general_assembly/airbnb_data/calendar_9_19.csv')

In [12]:
calendar.columns

Index(['listing_id', 'date', 'available', 'price', 'adjusted_price',
       'minimum_nights', 'maximum_nights'],
      dtype='object')

In [13]:
calendar.dtypes

listing_id          int64
date               object
available          object
price              object
adjusted_price     object
minimum_nights    float64
maximum_nights    float64
dtype: object
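
`price`, `adjusted_price`, and `available` all load as `object`, so they need converting before any numeric work. The sketch below assumes prices are stored as strings such as `$1,250.00` and availability as `t`/`f` flags. Neither format is shown in the output above, so check the raw values first. The `sample_calendar` frame and the `price_num` and `available_flag` column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample standing in for the calendar file; formats are assumed.
sample_calendar = pd.DataFrame({
    "price": ["$85.00", "$1,250.00"],
    "available": ["t", "f"],
})

# Strip the currency symbol and thousands separator, then cast to float.
sample_calendar["price_num"] = (
    sample_calendar["price"]
    .str.replace(r"[$,]", "", regex=True)
    .astype(float)
)

# Map the assumed 't'/'f' text flags to booleans.
sample_calendar["available_flag"] = sample_calendar["available"].map(
    {"t": True, "f": False}
)

print(sample_calendar)
```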

## Data Dictionary (Files)

| File | Description |
| --- | --- |
| Reviews | All reviews that have been left for DC listings, including reviews for listings that are no longer listed. |
| Listings | All current listings in DC and details associated with each listing (see variables for listings). |
| Calendar | All listings and their prices for each date in the year following the time the data was collected. |
| Neighbourhoods | All DC neighbourhoods. |
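
Because the reviews file covers listings that are no longer listed, some `listing_id` values in reviews may have no match in listings. A rough sketch of how to find such reviews is shown here. Both sample frames are hypothetical and hold made-up ids.

```python
import pandas as pd

# Hypothetical samples standing in for listings.csv and reviews.csv.
sample_listings = pd.DataFrame({"id": [10, 20]})
sample_reviews = pd.DataFrame({"id": [1, 2, 3], "listing_id": [10, 20, 99]})

# Reviews whose listing_id does not appear among the current listings.
orphaned = sample_reviews[
    ~sample_reviews["listing_id"].isin(sample_listings["id"])
]

print(len(orphaned))
```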

## Data Dictionary (All Variables)

| File | Variable Name | Data Type | Description |
| --- | --- | --- | --- |
| reviews | listing_id | integer | corresponds to the unique id of the listing that the review was left for |
| reviews | id | integer | unique id of the review |
| reviews | date | object | date the review was left |
| reviews | reviewer_id | integer | unique id of the reviewer |
| reviews | reviewer_name | object | username of the reviewer | 
| reviews | comments | object | comments left by reviewer |
| listings | id | integer | unique id for the listing |
| listings | name | object | name of the listing |
| listings | host_id | integer | unique id of the host |
| listings | host_name | object | username of the host |
| listings | neighbourhood_group | float | group a neighbourhood is associated with (all values are missing) |
| listings | neighbourhood | object | D.C. neighbourhood in which the listing is located |
| listings | latitude | float | approximate latitude of location of listing |
| listings | longitude | float | approximate longitude of location of listing |
| listings | room_type | object | four possible room types for each listing:<br>-entire home/apt<br>-private room<br>-shared room<br>-hotel room |
| listings | price | integer | price per night for listing at time data was collected |
| listings | minimum_nights | integer | minimum number of nights required per booking (set by the host) |
| listings | number_of_reviews | integer | total number of reviews left for unique listing |
| listings | last_review | object | date of last review left |
| listings | reviews_per_month | float | number of reviews left per month (average) |
| listings | calculated_host_listings_count | integer | number of listings associated with unique host id |
| listings | availability_365 | integer | number of days listing is available for rent in 365 day span |
| calendar | listing_id | integer | corresponds to the unique id of the listing that the date and price are associated with |
| calendar | date | object | future date (relative to when the data was pulled) to which the listing's availability and price apply |
| calendar | available | object | true or false value representing if listing was available for booking at time data was collected |
| calendar | price | object | price per night (at time data was collected) of listing for date provided |
| calendar | adjusted_price | object | an adjustment of price per night |
| calendar | minimum_nights | float | minimum number of nights listing is available to be booked for (based on availability and minimum number of nights required) |
| calendar | maximum_nights | float | maximum number of nights listing is available to be booked for (based on availability and minimum number of nights required) |
| neighbourhoods | neighbourhood_group | float | group a neighbourhood is associated with (all values are missing) |
| neighbourhoods | neighbourhood | object | D.C. neighbourhood |
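
The File, Variable Name, and Data Type columns of the table above can also be drafted straight from a DataFrame, leaving only the descriptions to write by hand. The following is a minimal sketch, not the method used to build this table. `dtype_label`, `dictionary_rows`, and the `sample` frame are hypothetical names introduced for illustration.

```python
import pandas as pd


def dtype_label(dtype):
    """Map a pandas dtype to the labels used in the table above."""
    if pd.api.types.is_integer_dtype(dtype):
        return "integer"
    if pd.api.types.is_float_dtype(dtype):
        return "float"
    return "object"


def dictionary_rows(file_name, df):
    """Return one row per column: file, variable name, and data type label."""
    return pd.DataFrame({
        "File": [file_name] * len(df.columns),
        "Variable Name": list(df.columns),
        "Data Type": [dtype_label(t) for t in df.dtypes],
    })


# Hypothetical sample; in the notebook this would be reviews, listings, etc.
sample = pd.DataFrame({
    "listing_id": [1],
    "reviews_per_month": [0.5],
    "comments": ["Great stay"],
})
rows = dictionary_rows("reviews", sample)

print(rows)
```

Running `dictionary_rows` on each loaded file and combining the results with `pd.concat` would produce a skeleton for the full table.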