## Introduction to GeoPandas

### Part 3: Spatial Joins

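Spatial joins combine two datasets based on the spatial relationship between their geometries (for example, which neighborhood polygon contains each library point) rather than on a shared key column.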
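To make the idea concrete before loading real data, here is a minimal sketch using made-up geometries. The names ```zones``` and ```places``` and their coordinates are hypothetical and only serve the illustration.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Two hypothetical square zones sitting side by side
zones = gpd.GeoDataFrame(
    {"zone": ["west", "east"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)

# Three hypothetical points: one per zone, plus one outside both
places = gpd.GeoDataFrame(
    {"place": ["a", "b", "c"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# Each point picks up the attributes of the zone it falls in;
# with how="inner", the point outside every zone is dropped
matched = gpd.sjoin(places, zones, how="inner")
print(matched[["place", "zone"]])
```

No key column links the two frames; the match comes entirely from the geometry.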

Import the libraries:

```python
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

# Jupyter magic: render plots inline in the notebook
%matplotlib inline
```

Create a GeoDataFrame called ```hoods``` from a GeoJSON file representing neighborhoods in San Francisco.

We will import this file from a URL: https://data.sfgov.org/resource/aivd-8yrg.geojson

```hoods = gpd.read_file('https://data.sfgov.org/resource/aivd-8yrg.geojson')```

Rename the ```neighborho``` column to ```Neighborhood```

```hoods.rename(columns={'neighborho':'Neighborhood'}, inplace=True)```

Explore the data
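A few attributes are usually enough for a first look at a GeoDataFrame. The sketch below uses a small hypothetical frame called ```sample``` in place of ```hoods``` so that it runs without network access; the same calls should apply to ```hoods```.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical stand-in for `hoods`: two rectangular polygons
sample = gpd.GeoDataFrame(
    {"Neighborhood": ["Alpha", "Beta"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (3, 0), (3, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)

print(sample.head())               # first rows, including the geometry column
print(sample.shape)                # (rows, columns)
print(sample.crs)                  # coordinate reference system
print(sample.geom_type.unique())   # kinds of geometry present
print(sample.total_bounds)         # [minx, miny, maxx, maxy]
```

In a notebook, calling ```sample.plot()``` draws the geometries, which is often the quickest sanity check.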

Create a GeoDataFrame called ```libraries``` from the libraries.geojson file created in Part 1

```libraries = gpd.read_file('data/libraries.geojson')```

Select only the name, address, square footage, and geometry columns, then rename them

```python
# Keep only the columns of interest (geometry must stay for the spatial join)
libraries = libraries[['common_name', 'address', 'gross_sq_ft', 'geometry']]
# Assign the result instead of renaming in place on a column subset
libraries = libraries.rename(columns={'common_name': 'Name', 'address': 'Address', 'gross_sq_ft': 'Square Ft'})
```

Join the ```libraries``` and ```hoods``` frames using ```sjoin```

```python
joined = gpd.sjoin(
    libraries,
    hoods,
    how='inner',
)
```
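A spatial join only gives meaningful results when both frames use the same coordinate reference system (CRS), and ```sjoin``` does not reproject the data for you. The sketch below shows one way to check and align the CRS first. The ```points``` and ```areas``` frames and their coordinates are hypothetical, and the mismatch is created on purpose.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical point layer in geographic coordinates (EPSG:4326)
points = gpd.GeoDataFrame(
    {"Name": ["sample library"]},
    geometry=[Point(-122.42, 37.77)],
    crs="EPSG:4326",
)

# Hypothetical polygon layer, deliberately stored in a different CRS (EPSG:3857)
areas = gpd.GeoDataFrame(
    {"Neighborhood": ["sample neighborhood"]},
    geometry=[box(-122.5, 37.7, -122.3, 37.8)],
    crs="EPSG:4326",
).to_crs("EPSG:3857")

# Reproject one layer so both share a CRS before joining
if points.crs != areas.crs:
    points = points.to_crs(areas.crs)

result = gpd.sjoin(points, areas, how="inner")
print(result[["Name", "Neighborhood"]])
```

For the tutorial data the same check would be ```libraries.crs != hoods.crs```; if both files are already in the same CRS, no reprojection is needed.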

Explore the data
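One thing worth checking at this point is whether any rows were lost: with ```how='inner'```, rows from the left frame that match nothing on the right are dropped silently. A left join keeps them and leaves the right-hand columns empty, which makes them easy to find. The frames below are hypothetical toy data, not the tutorial files.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical polygons and points; point "c" lies outside the polygon
zones = gpd.GeoDataFrame(
    {"zone": ["west"]},
    geometry=[Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])],
    crs="EPSG:4326",
)
places = gpd.GeoDataFrame(
    {"place": ["a", "c"]},
    geometry=[Point(0.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# how="left" keeps every row of `places`, matched or not
left = gpd.sjoin(places, zones, how="left")

# Rows with no matching polygon have missing values in the right-hand columns
unmatched = left[left["zone"].isna()]
print(left.columns.tolist())
print(unmatched[["place"]])
```

The joined frame also carries a column holding the index of the matching right-hand row, alongside the attribute columns from both inputs.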

#### Sorting and Grouping

Use ```sort_values()``` to sort by column

```joined['Neighborhood'].sort_values()```
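Sorting a single column returns only that column. To keep whole rows together, ```sort_values()``` can be called on the frame with the column name as an argument. The sketch below uses a small hypothetical table in place of ```joined```; the library names and sizes are invented.

```python
import pandas as pd

# Hypothetical stand-in for `joined` (invented values)
sample = pd.DataFrame({
    "Name": ["Main", "Park", "Bay"],
    "Neighborhood": ["Mission", "Sunset", "Marina"],
    "Square Ft": [30000, 8000, 12000],
})

# Sort whole rows alphabetically by neighborhood
by_hood = sample.sort_values("Neighborhood")

# Sort whole rows by size, largest first
by_size = sample.sort_values("Square Ft", ascending=False)

print(by_hood)
print(by_size)
```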



The ```groupby()``` method splits rows into groups by the values in a column and lets you apply aggregate functions to each group

```joined.groupby('Neighborhood')['Name'].count()```
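Several aggregates can be computed in one pass with ```agg()```. This is a sketch on a hypothetical table standing in for ```joined```; the values and the output column names ```libraries``` and ```total_sq_ft``` are invented for the example.

```python
import pandas as pd

# Hypothetical stand-in for `joined` (invented values)
sample = pd.DataFrame({
    "Name": ["Main", "Annex", "Park"],
    "Neighborhood": ["Mission", "Mission", "Sunset"],
    "Square Ft": [30000, 5000, 8000],
})

# Named aggregation: output column = (input column, aggregate function)
summary = sample.groupby("Neighborhood").agg(
    libraries=("Name", "count"),
    total_sq_ft=("Square Ft", "sum"),
)
print(summary)
```

The result is indexed by neighborhood, so a single group can be looked up with ```summary.loc['Mission']```.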

We can also find the library with the largest or smallest floor area by using ```max()``` and ```min()```

```python
largest = joined['Square Ft'].max()
joined[joined['Square Ft'] == largest]
```
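Attribute values read from GeoJSON sometimes arrive as text, in which case ```max()``` and ```min()``` compare strings rather than numbers. Whether that applies to this dataset is an assumption worth checking with ```joined.dtypes```. The sketch below, on an invented table, converts the column first and then uses ```idxmin()``` and ```nlargest()``` as alternatives to filtering by the maximum.

```python
import pandas as pd

# Hypothetical stand-in for `joined`, with sizes stored as text (invented values)
sample = pd.DataFrame({
    "Name": ["Main", "Park", "Bay"],
    "Square Ft": ["30000", "8000", "12000"],
})

# Convert text to numbers so comparisons are numeric, not alphabetical
sample["Square Ft"] = pd.to_numeric(sample["Square Ft"])

# Row of the smallest library: idxmin() returns the index label of the minimum
smallest = sample.loc[sample["Square Ft"].idxmin()]

# The two largest libraries, largest first
top2 = sample.nlargest(2, "Square Ft")

print(smallest)
print(top2)
```

Without the conversion, the text value "8000" would sort above "30000", and the "largest" library would be the wrong one.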