In [None]:
%matplotlib inline


Creating a GeoDataFrame from a DataFrame with coordinates
---------------------------------------------------------

This example shows how to create a ``GeoDataFrame`` from a *regular* ``DataFrame`` whose coordinates are stored as WKT (well-known text) strings.


In [None]:
import pandas as pd
import geopandas
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [15, 8]

We use the ``states_21basic`` shapefile set to map the US states and get their geometries. Let's load the data into a ``GeoDataFrame``:

In [None]:
usa = geopandas.read_file("states_21basic/states.shp")

Check out the ``head`` of the dataframe:

In [None]:
usa.head()

The geometry column contains POLYGON shapes. Each polygon is a sequence of longitude/latitude points that trace the border of a US state. That is already enough information to make a basic plot:

In [None]:
usa.plot(color='white', edgecolor='black')

You can check out individual states:

In [None]:
usa[usa.STATE_ABBR == 'CA'].plot(color='white', edgecolor='black')

A ``GeoDataFrame`` needs its geometry column to hold ``shapely`` objects, so we import the ``shapely.wkt`` parser:

In [None]:
from shapely import wkt

We use the ``geo_sparql_query`` module to retrieve the collection of WKT geometries we would like to plot.

In [None]:
from geo_sparql_query import get_local_gid_df, get_osm_df

In [None]:
INSPECTED_GID = 72

In [None]:
wkt_df = get_local_gid_df(INSPECTED_GID)

We use the ``shapely.wkt`` sub-module to parse the WKT strings into ``shapely`` geometries:

In [None]:
wkt_df['Coordinates'] = wkt_df['Coordinates'].apply(wkt.loads)
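
The parsing step above can be tried without the SPARQL endpoint. The following is a minimal sketch, assuming a frame with a ``Coordinates`` column of WKT strings; ``sample_df`` and its rows are hypothetical stand-ins for whatever ``get_local_gid_df`` returns.

```python
import pandas as pd
from shapely import wkt

# Hypothetical stand-in for the frame returned by get_local_gid_df
sample_df = pd.DataFrame({
    "Instance": ["example:a", "example:b"],
    "Coordinates": [
        "POINT (-122.42 37.77)",
        "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))",
    ],
})

# Replace each WKT string with the shapely geometry it describes
sample_df["Coordinates"] = sample_df["Coordinates"].apply(wkt.loads)

# Every cell is now a shapely object with the usual geometry attributes
geom_types = sample_df["Coordinates"].apply(lambda g: g.geom_type).tolist()
unit_square_area = sample_df["Coordinates"].iloc[1].area
print(geom_types, unit_square_area)
```

Note that ``wkt.loads`` raises an exception on malformed input, so a frame with dirty WKT may need cleaning before this step.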

Let's build the ``GeoDataFrame``, tag its rows with ``OSM = 0`` to mark them as local (non-OSM) data, and inspect it:

In [None]:
gdf = geopandas.GeoDataFrame(wkt_df, geometry='Coordinates')
gdf.insert(0, 'OSM', 0)

gdf.head()
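
The constructor call above does not set a coordinate reference system. The sketch below shows one way to attach one at construction time. It assumes the WKT coordinates are longitude/latitude in WGS 84 (EPSG:4326), which the source data does not confirm, and ``sample_df`` is again a hypothetical stand-in.

```python
import pandas as pd
import geopandas
from shapely import wkt

# Hypothetical sample rows; the notebook gets the real ones from get_local_gid_df
sample_df = pd.DataFrame({
    "Instance": ["example:a", "example:b"],
    "Coordinates": ["POINT (-122.42 37.77)", "POINT (-118.24 34.05)"],
})
sample_df["Coordinates"] = sample_df["Coordinates"].apply(wkt.loads)

# Assumption: coordinates are lon/lat in WGS 84, hence EPSG:4326
sample_gdf = geopandas.GeoDataFrame(
    sample_df, geometry="Coordinates", crs="EPSG:4326"
)

# The active geometry column keeps the name passed to the constructor
print(sample_gdf.geometry.name)
print(sample_gdf.crs)
```

Setting the CRS matters mainly when layers from different sources are plotted together or reprojected with ``to_crs``.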

We can plot our ``GeoDataFrame`` on top of a state:

In [None]:
#ax = usa[usa.STATE_ABBR == 'CA'].plot(color='white', edgecolor='black')
#gdf.plot(ax=ax, color='blue')
#plt.show()
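
The overlay cell above is commented out in this notebook. As a rough illustration of the pattern it uses (pass the axes returned by the first ``plot`` call to the second one), here is a sketch with synthetic geometries; ``base_gdf`` and ``points_gdf`` are made-up names and data, not the state or query results.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line inside a notebook
import matplotlib.pyplot as plt
import geopandas
from shapely.geometry import Point, Polygon

# Hypothetical base layer: one rectangle standing in for a state outline
base_gdf = geopandas.GeoDataFrame(
    {"name": ["region"]},
    geometry=[Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])],
)

# Hypothetical overlay layer: two points inside the rectangle
points_gdf = geopandas.GeoDataFrame(
    {"name": ["p1", "p2"]},
    geometry=[Point(1, 1), Point(3, 2)],
)

# The first call creates the axes; the second draws onto the same axes
ax = base_gdf.plot(color="white", edgecolor="black")
points_gdf.plot(ax=ax, color="blue")
```

Both layers should share a coordinate reference system; otherwise the overlay may land far from the base layer.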

Or as a standalone plot (no state borders):

In [None]:
gdf.plot(color='blue')

In [None]:
osm_df = get_osm_df(INSPECTED_GID)

In [None]:
osm_df['Coordinates'] = osm_df['Coordinates'].apply(wkt.loads)
osm_gdf = geopandas.GeoDataFrame(osm_df, geometry='Coordinates')
osm_gdf.insert(0, 'OSM', 1)
osm_gdf.head()

In [None]:
osm_gdf.plot(color='red')

In [None]:
frames = [gdf, osm_gdf]
result = pd.concat(frames)
result
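
``pd.concat`` keeps the original row labels of each frame, so the combined frame above can contain repeated index values. A small sketch of the same stacking step with ``ignore_index=True`` follows; the two frames are hypothetical stand-ins for ``gdf`` and ``osm_gdf``.

```python
import pandas as pd
import geopandas
from shapely.geometry import Point

# Hypothetical stand-ins for the local (OSM = 0) and OSM (OSM = 1) frames
local_gdf = geopandas.GeoDataFrame(
    {"OSM": [0, 0]}, geometry=[Point(0, 0), Point(1, 1)]
)
osm_like_gdf = geopandas.GeoDataFrame(
    {"OSM": [1]}, geometry=[Point(2, 2)]
)

# Stack the rows and renumber the index from 0
combined = pd.concat([local_gdf, osm_like_gdf], ignore_index=True)
print(type(combined).__name__)
print(combined)
```

Because both inputs are ``GeoDataFrame`` objects with the same geometry column, the result is still a ``GeoDataFrame`` and can be plotted directly.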

In [None]:
result.plot(column='OSM', cmap='bwr')

In [None]:
osm_gdf.to_excel("geolinking_results_g.xlsx") 
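
How geometry objects end up in a spreadsheet depends on the writer in use. One conservative option, sketched here under the assumption of geopandas 0.9 or later, is to convert the geometries back to WKT strings with ``to_wkt`` before exporting. The sketch writes CSV to an in-memory buffer to stay self-contained, and ``sample_gdf`` is hypothetical data.

```python
import io
import geopandas
from shapely import wkt
from shapely.geometry import Point

# Hypothetical frame standing in for osm_gdf
sample_gdf = geopandas.GeoDataFrame(
    {"Instance": ["example:a"], "OSM": [1]},
    geometry=[Point(1, 2)],
)

# to_wkt returns a plain DataFrame whose geometry column holds WKT strings
export_df = sample_gdf.to_wkt()

# Write to an in-memory CSV; a file path or to_excel could be used the same way
buffer = io.StringIO()
export_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# The WKT text parses back to the original geometry
restored = wkt.loads(export_df["geometry"].iloc[0])
print(csv_text)
```

Storing WKT text keeps the export readable and lets the file be loaded again with the ``wkt.loads`` step used earlier in this notebook.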

In [None]:
# Draw one figure per distinct OSM instance: local geometries (OSM = 0)
# together with the geometries of that single OSM instance (OSM = 1)
for osm_inst_uri in osm_gdf['Instance'].unique():
    sub_osm_gdf = osm_gdf[osm_gdf.Instance == osm_inst_uri]
    result = pd.concat([gdf, sub_osm_gdf])
    #print(sub_osm_gdf.iloc[0]['Types'])
    result.plot(column='OSM', cmap='bwr')