# Overview

1. [Introduction to GeoPandas](#1)
2. [Datetime in pandas](#2)


# Introduction to GeoPandas<a class="anchor" id=1></a>

GeoPandas is a Python library for working with geospatial data in pandas DataFrames.

    import pandas as pd
    import geopandas as gpd
    
    df  = pd.DataFrame()
    gdf = gpd.GeoDataFrame()

A GeoDataFrame can be regarded as a DataFrame plus an extra column: a GeoSeries.
A GeoSeries can contain the following geometry types:
 
    POINTS, MULTIPOINTS     # e.g. for an address
    LINES, MULTILINES       # e.g. for a street
    POLYGONS, MULTIPOLYGONS # e.g. for city borders
    
Additionally, the GeoSeries has an attribute for the [Coordinate Reference System](https://en.wikibooks.org/wiki/Coordinate_Reference_Systems_and_Positioning) (Mercator projection etc.), `GeoSeries.crs`. For instance, `crs="EPSG:4326"` refers to the World Geodetic System (WGS84) with longitude/latitude coordinates, the reference system typically used for GPS data.
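
As a minimal sketch of the geometry types and the `crs` attribute, the following builds a small GeoSeries by hand with shapely, the library geopandas uses for its geometry objects. The coordinates and variable names are made-up illustrative values, not taken from the dataset used below.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point, Polygon

# Illustrative coordinates in (longitude, latitude) order; not real data
address = Point(2.17, 41.39)
street = LineString([(2.16, 41.38), (2.17, 41.39), (2.18, 41.40)])
block = Polygon([(2.0, 41.0), (3.0, 41.0), (3.0, 42.0), (2.0, 42.0)])

# One GeoSeries can mix geometry types; the CRS applies to all of them
geoms = gpd.GeoSeries([address, street, block], crs="EPSG:4326")

print(geoms.geom_type.tolist())  # geometry type of each entry
print(geoms.crs.to_epsg())       # EPSG code of the reference system
```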

A GeoDataFrame can actually contain more than one GeoSeries column, but only one of them is the active `geometry` at any time.
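
A small sketch of how the active geometry can be chosen and switched. The column names `location` and `station` and the points are hypothetical, chosen only for illustration.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical frame with two geometry columns
gdf = gpd.GeoDataFrame(
    {
        "name": ["A", "B"],
        "location": gpd.GeoSeries([Point(0, 0), Point(1, 1)]),
        "station": gpd.GeoSeries([Point(0, 1), Point(1, 2)]),
    },
    geometry="location",  # column that acts as the active geometry
)
print(gdf.geometry.name)

# set_geometry returns a new GeoDataFrame with another column active;
# the original frame keeps its own active geometry
by_station = gdf.set_geometry("station")
print(by_station.geometry.name)
```

Methods such as `plot()` operate on whichever column is active.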

Geometries of geographical data are typically stored in shapefiles. A shapefile is really a set of files: the .shp file holds only the geometries themselves and must be accompanied by other files such as .shx (an index into the geometries), .dbf (the attribute table, e.g. the city names corresponding to the shapes) and .prj (the coordinate reference system). Another common file type is GeoJSON, which contains all of this information in *one* file.
    
See also: [geopandas User Guide](https://geopandas.org/en/stable/docs/user_guide/data_structures.html).


In [None]:
import geopandas as gpd
import pandas as pd
from matplotlib import pyplot as plt

#### Data directory

In [None]:
data_dir = '../data/'

## (Download and) load the shape of Spain
Downloaded from http://centrodedescargas.cnig.es/CentroDescargas/index.jsp > Mapas vectoriales y Bases Cartográficas y Topográficas:

    BCN500
    Description: Base Cartográfica Nacional (National Cartographic Base) at scale 1:500,000.
    CRS: ETRS89. Geographic coordinates (longitude and latitude).
    Download unit: all of Spain, split into thematic layers.
    Format: shapefile (.shp)

In [None]:
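# Load shape file into GeoDataFrame.
# read_file has no `crs` argument: the CRS is read from the accompanying .prj file.
# If that file is missing, declare the CRS afterwards with spain = spain.set_crs(...).
spain = gpd.read_file(data_dir + "carto/BCN500_0101S_LIMITE_ADM.shp")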
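
Declaring a CRS and converting between CRSs are two different operations in geopandas. The sketch below illustrates the difference on two made-up points. Web Mercator (`EPSG:3857`) is picked here only as an example target and is not something the BCN500 data requires.

```python
import geopandas as gpd
from shapely.geometry import Point

# Made-up points without CRS information, in (longitude, latitude) order
points = gpd.GeoSeries([Point(0.0, 0.0), Point(-3.7, 40.4)])
print(points.crs)  # None

# set_crs only declares which CRS the existing numbers are expressed in
lonlat = points.set_crs("EPSG:4326")

# to_crs recomputes the coordinates, here into Web Mercator (metres)
mercator = lonlat.to_crs("EPSG:3857")
print(mercator.crs.to_epsg())
```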


In [None]:
# Filter out the neighbouring countries with a boolean mask, just as you would for a pandas DataFrame
# CCAA stands for "comunidades autónomas", the autonomous communities of Spain
spain = spain[~spain["CCAA"].isin(["SAHARA OCCIDENTAL", "ALGERIA", "PORTUGAL", 'MARRUECOS', 'MAURITANIA','FRANCIA', 'ANDORRA'])]


In [None]:
# Plot Spain
spain.plot()

In [None]:
plt.clf()
# separate the Balearic and Canary Islands, to plot them independently
spain_main_land = spain[~spain["CCAA"].isin(['Illes Balears', 'Canarias'])]
spain_canarias  = spain[spain["CCAA"].isin(['Canarias'])]
spain_baleares  = spain[spain["CCAA"].isin(['Illes Balears'])]

# create a canvas where the Balearic and Canary Islands are plotted in their own inset axes
f, ax = plt.subplots(figsize=(10,10))
axin1 = ax.inset_axes([-0.05, -0.05, 0.3, 0.3])
axin1.xaxis.tick_top()
axin1.yaxis.tick_right()

axin2 = ax.inset_axes([0.8, 0.4, 0.25, 0.25])
axin2.yaxis.tick_right()

spain_main_land.plot(ax=ax, color="lightgray")
spain_canarias.plot(ax = axin1, color="lightgray")
spain_baleares.plot(ax = axin2, color="lightgray")

### Writing files

Just as you can load geodata with `read_file`, you can save geodata to files with `to_file`.

In [None]:
# as a single GeoJSON file
spain_baleares.to_file(data_dir + "carto/spain_baleares_gj.geojson", driver="GeoJSON")

# as a shapefile (writes several files, including the .shp file)
spain_baleares.to_file(data_dir + "carto/spain_baleares.shp")

# Datetime in pandas<a class="anchor" id=2></a>

In [None]:
#!/usr/bin/env python

# make sure to install these packages before running:
# pip install pandas
# pip install sodapy

import pandas as pd
from sodapy import Socrata

# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("analisi.transparenciacatalunya.cat", None)

# Example authenticated client (needed for non-public datasets):
# client = Socrata("analisi.transparenciacatalunya.cat",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")

# First 10000 results, returned as JSON from the API and converted to a Python
# list of dictionaries by sodapy.
results = client.get("pvrz-iijx", limit=10000)

# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)

In [None]:
# drop rows where the birth date (data_naixement_infant) is missing
results_df = results_df[results_df.data_naixement_infant.notna()]
results_df

In [None]:
# add a column with the birth dates parsed into pandas datetimes (datetime64)
results_df["data_naixement_infant_datetime"] = pd.to_datetime(results_df.data_naixement_infant)

In [None]:
results_df
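
To make the conversion concrete, here is a small sketch on made-up strings. The column name `birth` and its values are hypothetical and not taken from the dataset. `errors="coerce"` turns entries that cannot be parsed into `NaT` instead of raising an error, and the `.dt` accessor exposes the parts of each datetime.

```python
import pandas as pd

# Hypothetical raw data: date strings plus one unparseable entry
raw = pd.DataFrame(
    {"birth": ["2001-01-01T00:00:00.000", "2001-01-02T13:30:00.000", "not a date"]}
)

# Parse the strings; values that cannot be parsed become NaT
raw["birth_dt"] = pd.to_datetime(raw["birth"], errors="coerce")

# The column now has a datetime dtype, so the .dt accessor is available
print(raw["birth_dt"].dtype)
print(raw["birth_dt"].dt.day_name())
print(raw["birth_dt"].isna().sum())  # number of entries that failed to parse
```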

### Time zones

In [None]:
from datetime import datetime
import pytz
datetime_in_Madrid = datetime.now(pytz.timezone('Europe/Madrid'))
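
Within pandas, time zones are handled through the `.dt` accessor. The following is a minimal sketch on made-up timestamps without time zone information, which are assumed here to be Madrid local time. `tz_localize` attaches a time zone and `tz_convert` translates to another one.

```python
import pandas as pd

# Made-up timestamps without time zone information
naive = pd.Series(pd.to_datetime(["2001-01-01 12:00", "2001-07-01 12:00"]))

# Attach a time zone: the wall-clock time stays the same
madrid = naive.dt.tz_localize("Europe/Madrid")

# Convert to UTC: the wall-clock time changes, the instant does not
utc = madrid.dt.tz_convert("UTC")

print(madrid)
print(utc)
```

The winter and summer timestamps end up with different UTC offsets because of daylight saving time.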

### Filter for time ranges

In [None]:
start = datetime.strptime('01/01/2001', '%d/%m/%Y')
end = datetime.strptime('02/01/2001', '%d/%m/%Y')

results_df_range = results_df[results_df.data_naixement_infant_datetime < end]
results_df_range = results_df_range[results_df_range.data_naixement_infant_datetime>=start]
results_df_range
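
The two filtering steps can also be written as one boolean mask. This sketch uses a made-up column `when` to show two equivalent ways of selecting the half-open interval `[start, end)`. `Series.between` with `inclusive="left"` assumes a reasonably recent pandas version (1.3 or later).

```python
import pandas as pd

# Made-up timestamps around the range boundaries
df = pd.DataFrame(
    {
        "when": pd.to_datetime(
            ["2000-12-31 23:00", "2001-01-01 00:00", "2001-01-01 18:30", "2001-01-02 00:00"]
        )
    }
)
start = pd.Timestamp("2001-01-01")
end = pd.Timestamp("2001-01-02")

# Option 1: combine both conditions with & (each one needs parentheses)
in_range = df[(df["when"] >= start) & (df["when"] < end)]

# Option 2: Series.between, including the left bound only
in_range_between = df[df["when"].between(start, end, inclusive="left")]

print(in_range)
```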

### Filter for times of day

If your data contains not only the dates but also the times of day, you can filter for a certain time window like this (`between_time` requires the datetime column to be the index):

    results_df_with_daytimes.set_index("daytimes").between_time("00:05", "00:10")
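
Since the dataset above does not come with such a column, here is a runnable sketch on made-up data. The frame and its `value` column are hypothetical. `between_time` looks only at the time of day, so rows from different dates can match.

```python
import pandas as pd

# Made-up frame with full timestamps in a "daytimes" column
df = pd.DataFrame(
    {
        "daytimes": pd.to_datetime(
            [
                "2001-01-01 00:03",
                "2001-01-01 00:05",
                "2001-01-02 00:07",
                "2001-01-02 00:10",
                "2001-01-02 12:00",
            ]
        ),
        "value": [1, 2, 3, 4, 5],
    }
)

# between_time needs a DatetimeIndex; both end points are included by default
window = df.set_index("daytimes").between_time("00:05", "00:10")
print(window)
```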