## Tutorial 06-02 - Creating a Map in a Notebook

Let’s go back to our GeoNinjas PythonAnalytics job.  Today, we’ve been tasked with some spatial analysis regarding public works data in San Francisco.  We’ve been given a CSV file of all the 311 calls for the month of November 2023.  A data scientist on the team noticed that this data has Latitude and Longitude columns and is wondering if we can display the data spatially for them.

### Import Data from CSV

#### 1.  Import the pandas package

You’re going to use a package called pandas to read the CSV file.  If you’re not familiar with pandas or DataFrames in general, don’t worry too much.  We’ll go into more detail on DataFrames in another chapter.  Here, you’ll use pandas to load the CSV into a DataFrame that you can then turn into a spatially enabled DataFrame.

Start by executing the following code to import the pandas package.

In [6]:
import pandas

#### 2.  Use pandas to read the CSV data.

Now you can use the pandas package to read data from the CSV file.  If your Notebook isn’t in the same folder as the CSV file, use the full path to the file instead.

In [7]:
df_311 = pandas.read_csv('./311_Cases.csv')

NOTE - The file path above is a relative path.  The dot at the beginning means that the path starts in the folder that the Notebook is in, so the path isn’t tied to any one machine.  If the CSV file is in the same folder as the Notebook you created, this format should work just fine.
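
If you want to check what a relative path resolves to on your machine, the following minimal sketch uses the standard-library pathlib module.  It assumes the Notebook’s working directory is the folder that contains the Notebook, which is the usual default but not guaranteed.

```python
from pathlib import Path

# A relative path: the leading "./" means "start in the current working directory".
relative_path = Path("./311_Cases.csv")

# resolve() turns it into the absolute path that would be opened on this machine.
absolute_path = relative_path.resolve()

print("Working directory:", Path.cwd())
print("Resolves to:      ", absolute_path)
print("File found:       ", absolute_path.exists())
```

If the last line prints False, the CSV file is not where the relative path points, and passing the full path to pandas.read_csv is the simpler fix.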

#### 3.  Explore the 311 data using pandas

Now that you’ve read the CSV, you have a DataFrame you can use.  There are three quick ways to get a good idea of what your data looks like.  First, you can find out how many records (rows) and columns there are by using this line of code.

In [8]:
df_311.shape

(47540, 48)

Next, you can look at a description of the columns and what kinds of data are in the DataFrame using the following code.

In [9]:
df_311.dtypes

CaseID                                                    int64
Opened                                                   object
Closed                                                   object
Updated                                                  object
Status                                                   object
Status Notes                                             object
Responsible Agency                                       object
Category                                                 object
Request Type                                             object
Request Details                                          object
Address                                                  object
Street                                                   object
Supervisor District                                     float64
Neighborhood                                             object
Police District                                          object
Latitude                                
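
The object dtype on Opened, Closed, and Updated means those timestamps were read as plain text.  As a rough illustration of what that implies, the sketch below parses two sample timestamps from this dataset with the standard-library datetime module.  The format string is an assumption inferred from the sample values, not something documented by the data provider.

```python
from datetime import datetime

# Assumed layout of the timestamp text, e.g. "11/30/2023 10:59:00 PM"
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"

opened_text = "11/30/2023 10:59:00 PM"
closed_text = "12/01/2023 06:52:43 AM"

# As text, the values cannot be subtracted; as datetime objects they can.
opened = datetime.strptime(opened_text, TIMESTAMP_FORMAT)
closed = datetime.strptime(closed_text, TIMESTAMP_FORMAT)

time_to_close = closed - opened
print(time_to_close)  # 7:53:43
```

pandas.to_datetime accepts a format argument with the same directives, so a similar conversion could likely be applied to a whole column if you need to do date arithmetic later.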

Finally, you can take a look at the first five rows of the DataFrame to get an idea of what your data really looks like using the following code:

In [3]:
df_311.head()

| | CaseID | Opened | Closed | Updated | Status | Status Notes | Responsible Agency | Category | Request Type | Request Details | ... | DELETE - HSOC Zones | Fix It Zones as of 2018-02-07 | CBD, BID and GBD Boundaries as of 2017 | Central Market/Tenderloin Boundary | Areas of Vulnerability, 2016 | Central Market/Tenderloin Boundary Polygon - Updated | HSOC Zones as of 2018-06-05 | OWED Public Spaces | Parks Alliance CPSI (27+TL sites) | Neighborhoods |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17610753 | 11/30/2023 10:59:00 PM | 12/01/2023 06:52:43 AM | 12/01/2023 06:52:43 AM | Closed | Case Resolved | Recology_Abandoned | Street and Sidewalk Cleaning | Bulky Items | Furniture | ... | 1.0 | 6.0 | | | 2.0 | | 1.0 | | | 32.0 |
| 1 | 17610752 | 11/30/2023 10:56:00 PM | 12/15/2023 08:08:00 AM | 12/15/2023 08:08:00 AM | Closed | Case is Invalid | 311 Supervisor Queue | Noise Report | mechanical_equipment | Noise Report - mechanical_equipment | ... | | | | | 1.0 | | | | | 15.0 |
| 2 | 17610749 | 11/30/2023 10:44:00 PM | 12/05/2023 03:23:00 PM | 12/05/2023 03:23:00 PM | Closed | Case is a Duplicate - Case is a duplicate and ... | DPH - Environmental Health - Tobacco Queue | General Request - DPH | complaint | environmental_health_tobacco - complaint | ... | | | | | | | | | | |
| 3 | 17610745 | 11/30/2023 10:41:25 PM | 12/01/2023 06:01:17 AM | 12/01/2023 06:01:17 AM | Closed | Case Resolved | Recology_Overflowing | Street and Sidewalk Cleaning | City_garbage_can_overflowing | City_garbage_can_overflowing | ... | | 5.0 | | | 2.0 | | | | | 104.0 |
| 4 | 17610743 | 11/30/2023 10:38:00 PM | 12/01/2023 01:38:17 PM | 12/01/2023 01:38:17 PM | Closed | Case Resolved | DPW Ops Queue | Street and Sidewalk Cleaning | General Cleaning | Other Loose Garbage | ... | | | | | | | | | | |


### Create a Spatially Enabled DataFrame

#### 1.  Import the ArcGIS API for Python

First you’ll need to import the ArcGIS API for Python using the following code:

In [18]:
import arcgis

#### 2.  Create a Spatially Enabled DataFrame

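Next you’ll use the spatial accessor to turn your DataFrame into a spatially enabled DataFrame.  The from_xy method builds a point for each row from the x (Longitude) and y (Latitude) columns that you name.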

In [11]:
df_311 = pandas.DataFrame.spatial.from_xy(df_311,
                                         x_column = 'Longitude',
                                         y_column = 'Latitude')

Now if you look at the DataFrame, you’ll notice a new column called “SHAPE”.  This contains geometry that can be used for analysis and display.

In [20]:
df_311[['Category','Latitude','Longitude','SHAPE']].head()

| | Category | Latitude | Longitude | SHAPE |
|---|---|---|---|---|
| 0 | Street and Sidewalk Cleaning | 37.776009 | -122.4102 | {"spatialReference": {"wkid": 4326}, "x": -122... |
| 1 | Noise Report | 37.798813 | -122.424171 | {"spatialReference": {"wkid": 4326}, "x": -122... |
| 2 | General Request - DPH | 0.0 | 0.0 | {"spatialReference": {"wkid": 4326}, "x": 0.0,... |
| 3 | Street and Sidewalk Cleaning | 37.793764 | -122.407756 | {"spatialReference": {"wkid": 4326}, "x": -122... |
| 4 | Street and Sidewalk Cleaning | 0.0 | 0.0 | {"spatialReference": {"wkid": 4326}, "x": 0.0,... |
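
Two of the five rows above have a Latitude and Longitude of 0.0, which would place them in the Atlantic Ocean off the coast of Africa rather than in San Francisco.  The sketch below is a conceptual illustration only, not the library’s implementation.  It builds point dictionaries shaped like the SHAPE values shown above and skips (0, 0) coordinates, on the assumption that they stand for "no location recorded".  The names make_point and has_location are hypothetical.

```python
import json

WGS84_WKID = 4326  # spatial reference shown in the SHAPE column above


def make_point(longitude, latitude):
    """Build a point dictionary shaped like the SHAPE values above (x = longitude, y = latitude)."""
    return {"spatialReference": {"wkid": WGS84_WKID}, "x": longitude, "y": latitude}


def has_location(longitude, latitude):
    """Assumption: (0.0, 0.0) is a placeholder meaning no location was recorded."""
    return not (longitude == 0.0 and latitude == 0.0)


# (Category, Latitude, Longitude) taken from the first rows of the table above
rows = [
    ("Street and Sidewalk Cleaning", 37.776009, -122.4102),
    ("Noise Report", 37.798813, -122.424171),
    ("General Request - DPH", 0.0, 0.0),
]

# Keep only the rows that have a usable location and attach a point to each.
located = [
    (category, make_point(lon, lat))
    for category, lat, lon in rows
    if has_location(lon, lat)
]

for category, point in located:
    print(category, json.dumps(point))
```

With pandas, a comparable filter could be applied before calling from_xy, for example with a boolean mask such as df_311[(df_311['Latitude'] != 0) & (df_311['Longitude'] != 0)].  Whether dropping those rows is appropriate depends on what the analysis needs.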


### Display Data on a Map

When you imported the ArcGIS API for Python, you also gained some extra Notebook functionality, including a “Map View” widget.  This widget lets you create a map inside your Notebook and add your data to it.

#### 1. Create a map widget object

You can start by creating a map view.  You’ll need to create a GIS object first.  Then you can use that GIS object to create a map centered on your area of interest (which is San Francisco in this case).

In [13]:
gis = arcgis.GIS()

In [14]:
map_view = gis.map("San Francisco, CA")

#### 2.  Add the Spatially Enabled DataFrame to the Map View

Now that you’ve instantiated a map view, you can add the DataFrame as a layer.  In this case, you can set two extra parameters, renderer_type and col, to request a unique values renderer (renderer_type = 'u') based on the Category column (col = 'Category').

In [15]:
df_311.spatial.plot(map_widget = map_view,
                    renderer_type = 'u',
                    col = 'Category'
                )

True
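
A unique values renderer draws every distinct value in the chosen column with its own symbol.  The sketch below illustrates that idea in plain Python.  It is not how the ArcGIS API for Python implements it, and the palette and the assign_colors helper are hypothetical.

```python
from itertools import cycle

# Hypothetical palette; a real renderer chooses its own symbols.
PALETTE = ["red", "blue", "green", "orange", "purple"]


def assign_colors(values, palette=PALETTE):
    """Map each distinct value to one color, in order of first appearance."""
    colors = cycle(palette)  # reuse colors if there are more values than palette entries
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = next(colors)
    return mapping


# Category values from the first five rows of the DataFrame
categories = [
    "Street and Sidewalk Cleaning",
    "Noise Report",
    "General Request - DPH",
    "Street and Sidewalk Cleaning",
    "Street and Sidewalk Cleaning",
]

category_colors = assign_colors(categories)
for category, color in category_colors.items():
    print(f"{category}: {color}")
```

Every "Street and Sidewalk Cleaning" record ends up with the same color, which is the behavior you should expect to see on the map for each Category value.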

Now you’ve got your layer added to your map view, but you aren’t seeing a map yet.  You’ll need to return the map view to display it and interact with it.  Executing the following code will return the map view.

In [16]:
map_view

MapView(layout=Layout(height='400px', width='100%'))

Finally, you can turn on the legend so you can tell which symbol represents each category.

In [17]:
map_view.legend = True