# NYC Apartment Search - Group 1

### Purpose of the Project:
This project analyzes and visualizes New York City apartment rent data, 311 complaints, and tree census data to show how environmental and neighborhood conditions vary across the city. The analysis is intended to support informed apartment rental decisions.

### Sections and Key Functions:
1. **Setup**
   - Initializes the environment with necessary libraries and settings.

2. **Part 1: Data Preprocessing**
   - Functions to load and clean data from various sources (ZIP codes, 311 complaints, tree census, Zillow rent data).
   - Quality checks and basic exploration of each cleaned dataset.

3. **Part 2: Storing Data**
   - Database setup functions to create tables and indices.
   - Functions to convert geometries for database insertion and to insert cleaned data into a PostgreSQL database.
   - Data retrieval functions to fetch and display samples from each database table.

4. **Part 3: Understanding the Data**
   - Functions to execute SQL queries against the database and return the results for analysis.
   - Various SQL queries analyze the relationship between apartment prices, complaints, and tree census data.

5. **Part 4: Visualizing the Data**
   - Multiple visualizations to represent data insights graphically, including trends over time and spatial distributions.
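
The geometry conversion step mentioned under Part 2 can be illustrated with a minimal sketch. The notebook itself imports `WKTElement` from `geoalchemy2` for this purpose; the version below is a simplified stand-in that formats plain EWKT strings (a text form PostGIS accepts) so that it runs without a database. The function names `point_to_ewkt` and `row_to_record`, the row keys, and the sample coordinates are hypothetical and are not taken from the project code.

```python
from typing import Optional


def point_to_ewkt(longitude: float, latitude: float, srid: int = 4326) -> str:
    """Format a longitude/latitude pair as an EWKT point string."""
    # WKT puts x (longitude) before y (latitude).
    return f"SRID={srid};POINT({longitude} {latitude})"


def row_to_record(row: dict) -> Optional[dict]:
    """Build an insert-ready record, or return None if coordinates are missing."""
    lon, lat = row.get("longitude"), row.get("latitude")
    if lon is None or lat is None:
        return None  # rows without a location cannot be stored as geometry
    return {"id": row["id"], "geometry": point_to_ewkt(float(lon), float(lat))}


# Hypothetical sample row; real rows would come from the cleaned datasets.
record = row_to_record({"id": 1, "longitude": "-73.9857", "latitude": "40.7484"})
print(record["geometry"])  # SRID=4326;POINT(-73.9857 40.7484)
```

SRID 4326 (WGS 84 latitude/longitude) is assumed here as the default; the project may store geometries in a different coordinate reference system.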

## Setup

In [None]:
# Standard library imports
import os
import pathlib
import subprocess
from datetime import datetime, timedelta
from typing import Tuple

# Third-party imports
import geopandas as gpd
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import pandas as pd
import requests
import seaborn as sns
from sqlalchemy import create_engine
from sqlalchemy.engine.base import Engine
from shapely.geometry import Point
from geoalchemy2 import Geometry, WKTElement

In [None]:
# Path configuration
DATA_DIR = pathlib.Path("data")
ZIPCODE_DATA_FILE = DATA_DIR / "nyc_zipcodes" / "nyc_zipcodes.shp"
ZILLOW_DATA_FILE = DATA_DIR / "zillow_rent_data.csv"
QUERY_DIR = pathlib.Path("queries")  # Directory for saving DB queries

# API configuration
APP_TOKEN = "J9t5fS2TcfDISWng9WsnCdvCP"
COMPLAINTS_URL = "https://data.cityofnewyork.us/resource/erm2-nwe9.geojson"
TREES_URL = "https://data.cityofnewyork.us/resource/uvpi-gqnh.geojson"

# Database configuration
DB_NAME = "nyc_data"
DB_USER = "williamsjs"
DB_URL = f"postgresql+psycopg2://{DB_USER}@localhost/{DB_NAME}"
engine = create_engine(DB_URL)
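
The configuration cell above hardcodes the API token and database user. One possible alternative, shown here only as a sketch, is to read those values from environment variables so that credentials stay out of the notebook. The variable names `NYC_OPEN_DATA_APP_TOKEN`, `NYC_DB_USER`, and `NYC_DB_NAME`, the helper names, and the fallback user `postgres` are assumptions made for illustration; they are not part of the project.

```python
import os
from typing import Mapping


def build_db_url(user: str, db_name: str, host: str = "localhost") -> str:
    """Assemble a SQLAlchemy URL in the same shape as DB_URL above."""
    return f"postgresql+psycopg2://{user}@{host}/{db_name}"


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read settings from an environment-like mapping, with fallbacks."""
    return {
        # Hypothetical variable names; an empty token means "no token set".
        "app_token": env.get("NYC_OPEN_DATA_APP_TOKEN", ""),
        "db_url": build_db_url(
            env.get("NYC_DB_USER", "postgres"),
            env.get("NYC_DB_NAME", "nyc_data"),
        ),
    }


# Passing a plain dict makes the behavior easy to check without touching
# the real environment.
settings = load_settings({"NYC_DB_USER": "alice"})
print(settings["db_url"])  # postgresql+psycopg2://alice@localhost/nyc_data
```

The resulting `settings["db_url"]` could then be passed to `create_engine` in place of the hardcoded `DB_URL`.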

In [None]:
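def ensure_directory_exists(directory: pathlib.Path) -> None:
    """Ensure that a directory exists; if not, create it."""
    try:
        # parents=True creates missing parent directories;
        # exist_ok=True makes the call a no-op if the directory already exists.
        directory.mkdir(parents=True, exist_ok=True)
    except OSError as e:
        # mkdir failures (permissions, a file in the way) are OSError subclasses.
        print(f"Error creating directory {directory}: {e}")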

# Make sure the directories exist
ensure_directory_exists(DATA_DIR)
ensure_directory_exists(QUERY_DIR)
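
With the API settings defined in this section, the 311 and tree datasets would typically be downloaded in pages. The sketch below only builds the request arguments and performs no network calls. It assumes the Socrata conventions of `$limit` and `$offset` query parameters and an `X-App-Token` header; the function name `page_requests`, the page size, and the placeholder token are hypothetical, and the notebook's actual download code may differ.

```python
from typing import Dict, Iterator, Tuple


def page_requests(
    url: str, app_token: str, total_rows: int, page_size: int = 1000
) -> Iterator[Tuple[str, Dict[str, int], Dict[str, str]]]:
    """Yield (url, params, headers) for each page needed to cover total_rows."""
    headers = {"X-App-Token": app_token}
    for offset in range(0, total_rows, page_size):
        # The final page may be shorter than page_size.
        params = {"$limit": min(page_size, total_rows - offset), "$offset": offset}
        yield url, params, headers


pages = list(
    page_requests(
        "https://data.cityofnewyork.us/resource/erm2-nwe9.geojson",
        "YOUR_APP_TOKEN",  # placeholder, not a real token
        total_rows=2500,
    )
)
print(len(pages))    # 3
print(pages[-1][1])  # {'$limit': 500, '$offset': 2000}
```

Each tuple could then be passed on as `requests.get(url, params=params, headers=headers)`, with the responses saved under `DATA_DIR`.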