# Shapefiles

Shapefiles are a popular data format for geospatial data. A shapefile usually consists of several related files, such as:

- a `.shp` file that stores the geometric location and shape of each feature
- a `.prj` file that stores the map projection
- a `.dbf` file containing tabular data about each feature

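Shapefiles are especially useful for drawing the boundaries of countries, states, counties, and similar regions. Government agencies and open-data portals often publish them, so a web search for the region name plus "shapefile" will usually turn one up.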

## Preparation

Download the shapefile of the countries of the world from [this website](https://public.opendatasoft.com/explore/dataset/world-administrative-boundaries/export/?flg=en-us) as a `.zip` file. Upload this `.zip` file to the Colab runtime and unzip it by running the cell below.

In [None]:
!unzip world-administrative-boundaries.zip

Now, you should be able to see these files in the file browser. (If you can't, press the refresh button.)
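
If you would rather check the archive from code than from the file browser, the standard-library `zipfile` module can list its contents. The sketch below is only an illustration: the helper names `shapefile_parts` and `missing_parts` are made up for this example, and it assumes the archive holds a single shapefile. It also checks for a `.shx` index file, which is not listed above but is one of the three files the shapefile format requires (along with `.shp` and `.dbf`).

```python
import zipfile
from pathlib import Path

# Extensions a shapefile needs in order to be readable.
# The .prj file is optional, although most downloads include it.
REQUIRED = {".shp", ".shx", ".dbf"}


def shapefile_parts(zip_path):
    """Map each file extension in the archive to a member name.

    Hypothetical helper: if two members share an extension,
    the later one overwrites the earlier one.
    """
    with zipfile.ZipFile(zip_path) as zf:
        return {
            Path(name).suffix.lower(): name
            for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
        }


def missing_parts(zip_path):
    """Return the required extensions that are absent from the archive."""
    return REQUIRED - set(shapefile_parts(zip_path))


# Only runs if the archive from the step above is in the working directory.
zip_path = Path("world-administrative-boundaries.zip")
if zip_path.exists():
    print(shapefile_parts(zip_path))
    print("missing:", missing_parts(zip_path) or "nothing")
```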

To read in a shapefile, we can use GeoPandas.

In [None]:
import geopandas as gpd

# Pointing read_file at the .shp file is enough: the companion files (.dbf, .prj, ...)
# are picked up automatically as long as they sit in the same directory.
gdf_world = gpd.read_file("world-administrative-boundaries.shp")
gdf_world

Now, we can visualize this map by calling `.plot()` on this `GeoDataFrame`, as we saw in lecture.

In [None]:
gdf_world.plot()

We can find out which coordinate reference system (map projection) the data uses. (This information comes from the `.prj` file in the shapefile.)

In [None]:
gdf_world.crs
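
The `.prj` file is plain text, so you can also look at the projection definition directly. The sketch below assumes the file is named `world-administrative-boundaries.prj`, matching the `.shp` file; `read_prj` is a hypothetical helper written for this example, not part of GeoPandas.

```python
from pathlib import Path


def read_prj(path):
    """Return the projection definition stored in a .prj file, with whitespace stripped."""
    return Path(path).read_text(encoding="utf-8").strip()


# Assumed file name; only runs if the unzipped file is in the working directory.
prj_path = Path("world-administrative-boundaries.prj")
if prj_path.exists():
    print(read_prj(prj_path))
```

The printed text describes the same coordinate reference system that `gdf_world.crs` reports, just in its raw on-disk form.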

Can you figure out how to customize the colors of the borders and the countries?

_Hint:_ Take a look at the documentation for [`.plot()`](https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.plot.html#geopandas.GeoDataFrame.plot).

## Mystery 1

The file https://datasci112.stanford.edu/data/dotmap_world.csv contains a mystery data set consisting of locations around the world.

Make a **dot map** of this data on top of a world map. Can you figure out what this is a data set of?

_Hint:_ https://datasci112.stanford.edu/data/dotmap_world_future.csv contains additional data, if you are stuck.
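
If you are unsure how to get started, here is one possible pattern for a dot map, sketched on made-up data rather than the mystery file. The column names `lon` and `lat` are assumptions (check the actual CSV for its column names), and the base map is a single rectangle standing in for `gdf_world`. The idea is to turn the coordinate columns into point geometries, then draw them on the same axes as the base map.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Made-up points; the column names "lon" and "lat" are assumptions.
df_points = pd.DataFrame({
    "lon": [-122.2, 2.35, 139.7],
    "lat": [37.4, 48.9, 35.7],
})

# Convert the coordinate columns into point geometries.
# EPSG:4326 is ordinary longitude/latitude in degrees.
gdf_points = gpd.GeoDataFrame(
    df_points,
    geometry=gpd.points_from_xy(df_points["lon"], df_points["lat"]),
    crs="EPSG:4326",
)

# Stand-in base map: one rectangle covering the globe.
# In the notebook you would use gdf_world here instead.
gdf_base = gpd.GeoDataFrame(geometry=[box(-180, -90, 180, 90)], crs="EPSG:4326")

# Draw the base map first, then the points on the same axes.
ax = gdf_base.plot(color="lightgray", edgecolor="black")
gdf_points.to_crs(gdf_base.crs).plot(ax=ax, color="red", markersize=10)
```

Calling `.to_crs(gdf_base.crs)` before plotting keeps the points aligned with the base map even when the two layers start out in different coordinate reference systems.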

## Mystery 2

The file https://datasci112.stanford.edu/data/dotmap_us.csv contains a mystery data set consisting of locations in the United States.

Make a dot map of this data on top of a U.S. map. You can download a shapefile of the U.S. states from [the U.S. Census Bureau website](https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html).

Can you figure out what this is a data set of?