# Finding the Best Pittsburgh Neighborhood to Survive a Zombie Apocalypse

This project aims to identify the best Pittsburgh neighborhood to survive a zombie apocalypse based on proximity to hospitals and primary care access. 

## Datasets

We will use the following datasets for our analysis:

1. PGHSNAP - Neighborhood Census Data
2. Allegheny County Hospitals
3. Allegheny County Primary Care Access

## Data Analysis Process

### 1. Importing Libraries and Loading Datasets

First, we import the necessary libraries (pandas, NumPy, geopy) and load the datasets using pandas' `read_csv()` function.

### 2. Data Cleaning and Preprocessing

Before analyzing the data, we clean and preprocess it. Depending on the dataset, this can mean removing rows with missing values, dropping columns the analysis does not use, and extracting the latitude and longitude of each location.
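As a rough illustration of this step, the sketch below keeps a name column plus coordinates and drops rows whose coordinates are missing. It assumes the coordinate columns are called `Y` (latitude) and `X` (longitude), as in the hospital data; the helper name `clean_locations`, the renamed columns, and the third sample row are hypothetical, and other datasets may need different column names.

```python
import pandas as pd


def clean_locations(df: pd.DataFrame, name_col: str) -> pd.DataFrame:
    """Keep a name plus coordinates and drop rows without usable coordinates."""
    # Select only the columns the distance step needs and give them clearer names
    out = df[[name_col, "Y", "X"]].rename(
        columns={name_col: "name", "Y": "latitude", "X": "longitude"}
    )
    # Coerce coordinates to numbers; values that cannot be parsed become NaN
    out["latitude"] = pd.to_numeric(out["latitude"], errors="coerce")
    out["longitude"] = pd.to_numeric(out["longitude"], errors="coerce")
    # Remove rows that lack either coordinate
    return out.dropna(subset=["latitude", "longitude"]).reset_index(drop=True)


# Small sample: two rows in the shape of the hospital data, one hypothetical bad row
sample = pd.DataFrame(
    {
        "Facility": ["UPMC Children's", "UPMC Magee", "Hypothetical Clinic"],
        "Address": [
            "4401 Penn Avenue Pittsburgh, PA 15224",
            "300 Halkett Street Pittsburgh, PA 15213",
            "123 Hypothetical Street",
        ],
        "Y": [40.467315, 40.436889, None],
        "X": [-79.953590, -79.960700, -79.9],
    }
)
cleaned = clean_locations(sample, "Facility")
print(cleaned)
```

Renaming `Y` and `X` early makes the later distance code harder to get backwards, since latitude and longitude are easy to swap.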

### 3. Calculating Distances to Hospitals and Primary Care Facilities

Using the `geopy.distance.great_circle()` function, we calculate the distance from each neighborhood to its nearest hospital and to its nearest primary care facility.
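To show what a great-circle distance involves, here is a minimal sketch that swaps geopy for a hand-written haversine formula so it runs with the standard library alone. It is not the notebook's own code: the function names, the 6371 km mean Earth radius, and the neighborhood coordinate are assumptions made for illustration. The hospital coordinates are the `Y`/`X` values from the hospital data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_distance_km(point, facilities):
    """Distance from one (lat, lon) point to the closest facility in a list."""
    return min(haversine_km(point[0], point[1], lat, lon) for lat, lon in facilities)


# (latitude, longitude) pairs for UPMC Children's, UPMC Magee, and UPMC Mercy
hospitals = [
    (40.467315, -79.953590),
    (40.436889, -79.960700),
    (40.436137, -79.985285),
]
# Hypothetical neighborhood center, chosen only for illustration
neighborhood = (40.44, -79.96)
nearest = nearest_distance_km(neighborhood, hospitals)
print(f"Nearest hospital is about {nearest:.2f} km away")
```

With geopy installed, `great_circle(point_a, point_b).km` could take the place of `haversine_km`; both expect coordinates in (latitude, longitude) order.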

### 4. Scoring System

We create a scoring system that assigns higher scores to neighborhoods with shorter distances to hospitals and primary care facilities.
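One possible way to turn distances into scores is sketched below. The exact formula is an assumption, not necessarily the one used in this project: each distance column is min-max normalized, inverted so that shorter distances score higher, and the two results are averaged. The column names `hospital_km` and `primary_care_km`, the labels A to C, and the distances themselves are hypothetical.

```python
import pandas as pd


def survival_score(distances: pd.DataFrame) -> pd.Series:
    """Score in [0, 1] per row; shorter distances give higher scores."""
    d = distances[["hospital_km", "primary_care_km"]]
    # Range of each column; replace 0 with 1 to avoid dividing by zero
    # when every neighborhood is equally far away
    span = (d.max() - d.min()).replace(0, 1)
    # 1 means closest in that column, 0 means farthest
    closeness = 1 - (d - d.min()) / span
    # Weight hospitals and primary care equally
    return closeness.mean(axis=1)


# Hypothetical distances for three placeholder neighborhoods
distances = pd.DataFrame(
    {"hospital_km": [0.5, 2.0, 3.5], "primary_care_km": [1.0, 0.5, 2.0]},
    index=["A", "B", "C"],
)
scores = survival_score(distances)
print(scores)
```

Equal weighting is a design choice; if hospital access matters more, a weighted average would be a natural variation.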

### 5. Finding the Best Neighborhood

By sorting the neighborhoods based on their survival scores, we identify the best neighborhood for surviving a zombie apocalypse.
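The ranking step can be sketched with `DataFrame.sort_values`. The column names, neighborhood labels, and scores below are placeholders, not results from this analysis.

```python
import pandas as pd

# Hypothetical scores; real values would come from the scoring step
scores = pd.DataFrame(
    {
        "neighborhood": ["Neighborhood A", "Neighborhood B", "Neighborhood C"],
        "survival_score": [0.42, 0.91, 0.67],
    }
)

# Highest score first
ranked = scores.sort_values("survival_score", ascending=False).reset_index(drop=True)
best = ranked.iloc[0]
print(f"Best neighborhood: {best['neighborhood']} ({best['survival_score']:.2f})")
```

Calling `ranked.head(10)` on the sorted frame would give a top 10 list.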

## Data Visualization

We use matplotlib to create the following visualizations:

1. **Bar Plot**: The top 10 neighborhoods with the highest survival scores
2. **Scatter Plot**: The neighborhoods, hospitals, and primary care facilities in Pittsburgh
3. **Heatmap**: The survival scores of Pittsburgh neighborhoods
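As an illustration of the first plot in the list above, the following sketch draws a horizontal bar chart of the highest-scoring neighborhoods. The function name `plot_top_scores`, the labels, and the scores are hypothetical, and the styling is only one reasonable choice.

```python
import matplotlib.pyplot as plt


def plot_top_scores(names, scores, top_n=10):
    """Draw a horizontal bar chart of the top_n highest scores and return the Axes."""
    # Pair names with scores and keep the highest top_n
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)[:top_n]
    labels = [name for name, _ in ranked]
    values = [score for _, score in ranked]

    fig, ax = plt.subplots(figsize=(8, 4))
    # Reverse the order so the best neighborhood appears at the top of the chart
    ax.barh(labels[::-1], values[::-1])
    ax.set_xlabel("Survival score")
    ax.set_title(f"Top {len(ranked)} neighborhoods by survival score")
    fig.tight_layout()
    return ax


# Hypothetical data for illustration
ax = plot_top_scores(
    ["Neighborhood A", "Neighborhood B", "Neighborhood C"],
    [0.42, 0.91, 0.67],
    top_n=2,
)
```

In a notebook the figure renders automatically at the end of the cell; in a script, call `plt.show()` afterwards.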

## Conclusion

Through this analysis, we have identified the best Pittsburgh neighborhood to survive a zombie apocalypse based on proximity to hospitals and primary care access. The visualizations help us better understand the analysis results and present our findings.


Import necessary libraries:

In [2]:
import pandas as pd
import numpy as np
import geopandas

Load the datasets:

In [18]:
hospitals_data = pd.read_csv('data-hospitallocations.csv')
primary_care_data = pd.read_csv('data-primary-care-access-facilities.csv')

In [19]:
print(hospitals_data.head())
print(primary_care_data.head())

           Facility                                      Address          Y  \
0  UPMC Children's         4401 Penn Avenue Pittsburgh, PA 15224  40.467315   
1        UPMC Magee      300 Halkett Street Pittsburgh, PA 15213  40.436889   
2   UPMC McKeesport         1500 5th Avenue McKeesport, PA 15132  40.351343   
3        UPMC Mercy      1400 Locust Street Pittsburgh, PA 15219  40.436137   
4    UPMC Passavant  9100 Babcock Boulevard Pittsburgh, PA 15237  40.573319   

           X  
0 -79.953590  
1 -79.960700  
2 -79.849457  
3 -79.985285  
4 -80.014525  
                                 GROUP_NAME              PRACTICE_ADDR_1  \
0                  Picciotti, Isabella M MD           1 Allegheny Square   
1               UPMC Emergency Medicine Inc   1 Childrens Hospital Drive   
2              Hoover Medical Associates PC             100 Delafield Rd   
3  Partners in Nephrology and Endocrinology           100 Delafield Road   
4                   Bahl and Bahl Med Assoc  100 Delafi