# from GSoC import splot

**A package providing lightweight plotting and mapping to facilitate spatial analysis with PySAL.** 

By [Stefanie Lumnitz](https://github.com/slumnitz), [Levi John Wolf](https://github.com/ljwolf), [Dani Arribas-Bel](https://github.com/darribas), [Sergio Rey](https://github.com/sjsrey), [Taylor Oshan](https://github.com/TaylorOshan) and [Joris Van den Bossche](https://github.com/jorisvandenbossche).

The primary goal of this ['Google Summer of Code 2018' project](https://github.com/pysal/pysal/wiki/GSoC-2018---Geovisualization-Module-by-Stefanie-Lumnitz) was to design and implement the visualization package `splot`. [`splot`](https://github.com/pysal/splot) connects spatial analysis done in the [Python Spatial Analysis Library](https://github.com/pysal), `PySAL`, to visualization toolkits like [`matplotlib`](https://matplotlib.org). It gives users quick access to visualizations of popular PySAL objects, offering different views on their spatial analysis workflows. The `splot` package facilitates the creation of both static plots ready for publication and interactive visualizations for quick iteration and spatial data exploration. The project achieved its primary goal, and `splot` was first released as part of [PySAL 2.0rc2](https://github.com/pysal/pysal/releases) on [July 19, 2018](https://pypi.org/project/PySAL/2.0rc2/#files).

While developing `splot`, we explored the potential integration of different popular visualization packages like [`bokeh`](https://bokeh.pydata.org/en/latest/). Based on the results, we made API decisions, leveraged a cooperation with [`geopandas`](http://geopandas.org) and slightly changed the outline of the project. The project now allows `splot` to potentially grow with interactive toolkits like `bokeh` in the future, but focuses the current workflow on the creation of views with a `matplotlib` and `geopandas` backend. Additionally, developing new tools helped to assess PySAL's current code base and provided the opportunity to add or tweak functionality alongside `splot`'s visualizations to create an even more user-friendly library.

This notebook provides a summary of API decisions made, functionality developed and next steps planned towards a thriving `PySAL` and `splot` community in the context of the GSoC project.


## Contents

1. API Decisions
2. New `splot` functionality for:
    * esda
    * giddy
    * libpysal
    * mapping
3. Code beyond splot
4. Remaining Work & Next Steps
5. Community Outreach

### GitHub and project links

* [GSoC GitHub project](https://github.com/pysal/splot/projects/1)
* [merged pull requests](https://github.com/pysal/splot/pulls?q=is%3Apr+author%3Aslumnitz+is%3Aclosed)
* [unmerged pull requests](https://github.com/pysal/splot/pulls/slumnitz)
* [Gitter channel](https://gitter.im/pysal-gsoc18/Lobby)
* [Project Blog](https://blogs.python-gsoc.org/stefanie-lumnitz/author/stefanie-lumnitz/)

## API Decisions

**Challenges and Project Plan Changes**

During the first phase of GSoC we created different visualisations in both a static version using `matplotlib` and an interactive version based on `bokeh`, using `esda.moran` objects as our first example. The original GSoC project plan proposed to design a common API for easy access to both versions, ensuring that users would be able to switch between `bokeh` and `matplotlib` backends without the need to change their code. However, after creating the `splot.esda.plot_moran` composite view, it was clear that we would have to cut back on the advantages each backend offers in order to provide a function signature that is identical for both versions. A common API would, for example, restrict full access to figure design for views in `matplotlib` on the one hand, and on the other hand limit interactive tools in the `bokeh` backend, like the `hover` tool and the information it displays. Hence, we decided early on in GSoC to focus on `matplotlib` as the main backend.

Since the existing user base of `PySAL` consists to a large extent of scientific researchers, it was important to us to offer a maximum of control over the design of the visualization. Customization is important for creating visualisations that can also be used in scientific publications. Other reasons why we chose `matplotlib` over a `bokeh` backend include:
* `matplotlib` is already backed by a much larger user and developer community. More example visualizations and good documentation already exist on diverse platforms. Furthermore, the majority of the `PySAL` user base is more familiar with `matplotlib` due to its popularity.
* Novel extensions like `IPyWidgets` or `matplotlib`'s `notebook magic` already allow a certain degree of interactivity. Additionally, quick iteration over different or slightly changed views is another way of interactively exploring data.
* Lastly, because of `matplotlib`'s maturity it is easier and quicker to develop new views and build on top of other packages leveraging `matplotlib`, including `geopandas` or `seaborn`.

The decision to focus on a backend using `matplotlib` and `geopandas` gave us the advantage of being able to build a user-friendly, flexible API and package structure.

**splot's structure and API**

Building on these experiences, `splot`'s functionality can now be accessed in two main ways: basic `splot` visualisations are exposed as `.plot` methods on PySAL objects, and all visualisations can be found and called using a `splot.'submodule'` namespace named after the corresponding `PySAL` submodule.

The `.plot` methods call `splot` under the hood. Exposing simple plots in other `PySAL` submodules ensures that users have the quickest possible access to visualisations connected to the `PySAL` object they created. This is especially useful for an instantaneous sanity check of whether the spatial analysis done in PySAL is correct, or whether there are any errors present in the data used. A conventional workflow could look like this:
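The delegation behind such a `.plot` method can be sketched in a few lines. The class `AnalysisResult` and the function `scatter_view` below are hypothetical stand-ins, not actual PySAL or `splot` code, and the sketch only assumes `numpy` and `matplotlib`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def scatter_view(result):
    """Hypothetical plotting function living in the visualization package."""
    fig, ax = plt.subplots()
    ax.scatter(result.values, result.lag)
    ax.set_xlabel("attribute values")
    ax.set_ylabel("spatial lag")
    return fig, ax


class AnalysisResult:
    """Hypothetical analysis object holding values and their spatial lag."""

    def __init__(self, values, lag):
        self.values = np.asarray(values, dtype=float)
        self.lag = np.asarray(lag, dtype=float)

    def plot(self):
        # The method only delegates, so all drawing logic stays in one place.
        return scatter_view(self)


result = AnalysisResult([0.1, 0.5, 0.9], [0.2, 0.4, 0.8])
fig, ax = result.plot()  # quick sanity check directly on the object
```

Keeping the drawing code in a single function means the method on the object and the stand-alone function cannot drift apart.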

Furthermore, all visualisations connected to analysis done in a specific PySAL submodule can be called using the `splot.'submodule'` namespace. This is especially helpful in complex spatial analysis workflows when multiple PySAL objects, submodules and visualizations need to be integrated, or if the user needs full access to and control over the different visualisations. A simple example of how to access visualisations through this option is given here:

Lastly, `splot.utils` and `splot.mapping` provide additional utility and mapping functionality that can greatly enhance geospatial analysis and information, but is not directly tied to one specific `PySAL` submodule. The `.utils` and `.mapping` namespaces therefore universally enhance workflows specific to all other `PySAL` submodules and are accessed separately.

**References and additional information**
* [API Discussion](https://github.com/pysal/splot/issues/9)
* [GSoC Blogpost](https://blogs.python-gsoc.org/stefanie-lumnitz/2018/06/11/designing-the-splot-api/)

## New splot functionality

We followed three main guidelines in developing `splot`'s functionality:
1. To provide users with simple visualisations and multi-views that allow for a more in-depth spatial analysis workflow where possible. Based on our own experience in spatial analysis, we aim to shine a light on different angles of the analysed problem using simple multi-plots.
2. To choose sensible design settings and analysis defaults to enhance the ease and speed of use.
3. To allow customization of all visualisations using keyword dictionaries, so that scientific users can leverage `splot` for scientific presentation.

All visualisation functionality takes a `matplotlib` Axes instance as an input argument, which defaults to `ax=None`. `ax=` is needed since users may already have a Figure instance with one or more `matplotlib` Axes instances defined (often in a custom layout), and want to use an `splot` function to draw a plot inside one of those already defined Axes. Furthermore, all `splot` functionality commonly returns a `matplotlib` Figure instance and Axes instance. A `fig` instance may be needed for high-level operations like displaying, closing or saving a figure (e.g. `fig.savefig('moran_scatterplot.png')`). An `ax` instance, in turn, may be needed to modify the plot itself, e.g. to customize a title or to plot a scatter plot on top. The following sections give a detailed overview of the advantages of the chosen API design and of the functionality that was added during the Summer of Code. The idea collection and discussion on what functionality to support in `splot` can be found [here](https://github.com/pysal/splot/issues/10).
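The `ax=None` convention can be sketched with plain `matplotlib`. The function `histogram_view` is a hypothetical example written for this illustration, not part of `splot`; it only shows how a view can either create its own Figure or draw into an Axes the caller already owns:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def histogram_view(values, ax=None):
    """Hypothetical view following the `ax=None` convention."""
    if ax is None:
        # No Axes supplied: create a fresh Figure and Axes.
        fig, ax = plt.subplots()
    else:
        # Draw into the caller's Axes and report the Figure that owns it.
        fig = ax.figure
    ax.hist(values, bins=10)
    return fig, ax


values = np.random.default_rng(0).normal(size=200)

# Stand-alone use: the function creates the figure itself.
fig_single, ax_single = histogram_view(values)
ax_single.set_title("Stand-alone view")  # Axes-level customization

# Embedded use: draw into the right panel of an existing custom layout.
fig_grid, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))
ax_left.plot(np.sort(values))
fig_embedded, ax_embedded = histogram_view(values, ax=ax_right)

# Figure-level operations such as saving go through the returned Figure.
buffer = io.BytesIO()
fig_embedded.savefig(buffer, format="png")
```

Returning both objects in either case means calling code does not have to care whether the figure was created inside or outside the function.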

### splot.esda

**Enhancing Exploratory Spatial Data Analysis (esda) with `splot` views**

[`esda`]() is one of the most widely used subpackages of PySAL and provides tools for exploratory spatial data analysis that consider the role of space in a distribution of attribute values. It was therefore the first `PySAL` subpackage for which we developed functionality.

`splot` primarily supports Moran analytics done with `esda.moran` and offers a range of single visualisations and combined views for more complex analysis. So far, we have worked towards `splot` supporting the following objects:
* `esda.moran.Moran`
* `esda.moran.Moran_BV`
* `esda.moran.Moran_Local`
* `esda.moran.Moran_Local_BV`
* `esda.moran.Moran_BV_matrix`

**moran_scatterplot()**

A simple `moran_scatterplot()` call will create a classic Moran Scatterplot for any of the four Moran analytics objects (Moran, Moran_BV, Moran_Local, Moran_Local_BV):

```python
import matplotlib.pyplot as plt
from splot.esda import moran_scatterplot

# used with the same data introduced in the first chapter
moran_scatterplot(moran)
plt.show()
```

In addition, users can choose to display their original or standardised data, color points by significance if `Moran_Local` or `Moran_Local_BV` are the input objects, and of course customize the plot using `scatter_kwds` and `fitline_kwds`.
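One common way to implement customization through keyword dictionaries is to fill in defaults only where the user did not specify a value. The sketch below illustrates that idea with plain `matplotlib`; the function `scatter_with_fit` and its default colors are assumptions made for this example and do not reproduce `splot`'s internals:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def scatter_with_fit(x, y, scatter_kwds=None, fitline_kwds=None):
    """Hypothetical scatter plot with a fitted line and overridable styling."""
    # Copy so the caller's dictionaries are never mutated.
    scatter_kwds = dict(scatter_kwds or {})
    fitline_kwds = dict(fitline_kwds or {})
    # Defaults apply only to keys the user left unset.
    scatter_kwds.setdefault("alpha", 0.6)
    scatter_kwds.setdefault("color", "#bababa")
    fitline_kwds.setdefault("color", "#d6604d")

    fig, ax = plt.subplots()
    ax.scatter(x, y, **scatter_kwds)
    slope, intercept = np.polyfit(x, y, 1)  # ordinary least-squares line
    ax.plot(x, slope * x + intercept, **fitline_kwds)
    return fig, ax


x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 0.5
# Override the line style but keep the default scatter styling.
fig, ax = scatter_with_fit(x, y, fitline_kwds={"color": "black", "linewidth": 2})
```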

**plot_moran_simulation()**

`plot_moran_simulation()` and `plot_moran_bv_simulation()` offer users a view to compare the calculated Moran value to a reference distribution and check whether the calculated Moran or bivariate Moran statistics are significant.

```python
import matplotlib.pyplot as plt
from splot.esda import plot_moran_simulation

# used with the same data introduced in the first chapter
plot_moran_simulation(moran)
plt.show()
```
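To illustrate what such a reference distribution represents, the following plain `numpy` sketch computes Moran's I on a small synthetic lattice and compares it with values obtained after randomly permuting the observations. The helper names, the synthetic data and the histogram styling are assumptions made for this example; it is not `esda`'s or `splot`'s implementation:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def rook_weights(k):
    """Binary rook-contiguity matrix for a k x k lattice."""
    n = k * k
    w = np.zeros((n, n))
    for r in range(k):
        for c in range(k):
            i = r * k + c
            if c + 1 < k:  # neighbor to the right
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < k:  # neighbor below
                w[i, i + k] = w[i + k, i] = 1.0
    return w


def morans_i(y, w):
    """Moran's I = (n / S0) * (z' W z) / (z' z) with z = y - mean(y)."""
    z = y - y.mean()
    return (len(y) / w.sum()) * (z @ w @ z) / (z @ z)


k = 6
w = rook_weights(k)
rng = np.random.default_rng(42)
# A gradient across rows plus a little noise: clearly clustered values.
y = np.repeat(np.arange(k, dtype=float), k) + rng.normal(scale=0.1, size=k * k)

observed = morans_i(y, w)
permutations = 999
simulated = np.array([morans_i(rng.permutation(y), w) for _ in range(permutations)])

# One-sided pseudo p-value: share of permuted values at least as large as
# the observed one, with the usual +1 correction.
p_value = (np.sum(simulated >= observed) + 1) / (permutations + 1)

fig, ax = plt.subplots()
ax.hist(simulated, bins=30, color="lightgrey")          # reference distribution
ax.axvline(observed, color="red")                       # observed statistic
ax.axvline(simulated.mean(), color="black", linestyle="--")
ax.set_xlabel("Moran's I")
```

An observed value far out in the tail of the permuted values indicates that the spatial pattern is unlikely to have arisen from a random arrangement of the same data.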

**plot_moran()**

`plot_moran()` provides a combined view of `plot_moran_simulation()` or `plot_moran_bv_simulation()` and `moran_scatterplot()`. The name `plot_moran()` was chosen to direct the user to call this function first, since a combined view significantly enhances the understanding of the calculated Moran statistics. The idea is to call an individual view of `plot_moran_simulation()` or `moran_scatterplot()` only for specific purposes, e.g. to include this visualisation in another customized visualization using the `ax=` argument.

**lisa_cluster()**

**plot_local_autocorrelation()**

**moran_facet()**

**References and Pull Requests**

### splot.giddy

**Visualising space–time analytics that consider the role of space in the evolution of distributions over time**

**dynamic_lisa_heatmap()**

**dynamic_lisa_rose()**

**dynamic_lisa_vectors()**

**dynamic_lisa_composite()**

**dynamic_lisa_composite_explore()**

### splot.libpysal

**Visualisations for all core components of Python Spatial Analysis Library in `libpysal`**

As demonstrated by previous examples, `libpysal`, and especially its `weights` functionality, is frequently used in many different spatial analysis workflows across all `PySAL` submodules. Due to the new functionality `libpysal.weights.util.nonplanar_neighbors`, added by Levi during the refactoring of PySAL, we decided to support `weights` objects next.

(`nonplanar_neighbors` offers a tool to calculate spatial weights in case a digitization error led to neighboring polygons incorrectly not sharing edges and nodes.)

`libpysal` functionality supported by splot includes:
* `libpysal.weights`

**plot_spatial_weights()**

This function simply takes a `weights` object and a `geopandas` GeoDataFrame as input parameters and maps the spatial weights network on top of the underlying polygons:

```python
import matplotlib.pyplot as plt
from splot.libpysal import plot_spatial_weights

# used with the same data introduced in the first chapter
plot_spatial_weights(w, gdf)
plt.show()
```

If `nonplanar_neighbors` were calculated beforehand, the function automatically detects this in the weights object and plots all new weights in a different default color. Of course, as in other cases, dictionaries can be used to implement personal design choices.
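The idea of drawing a weights network, with additional joins in a second color, can be sketched with plain `matplotlib`. The centroid coordinates, the `neighbors` and `extra_joins` dictionaries and the helper `draw_network` are all hypothetical values and names chosen for this illustration; the sketch does not reproduce how `plot_spatial_weights()` works internally:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Hypothetical inputs: polygon centroids and neighbor lists keyed by unit id
centroids = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
extra_joins = {0: [3], 3: [0]}  # joins assumed to be added in a later step


def draw_network(centroids, joins, ax, **edge_kws):
    """Draw each undirected join once as a line between two centroids."""
    drawn = set()
    for i, js in joins.items():
        for j in js:
            edge = frozenset((i, j))
            if edge in drawn:
                continue  # skip the mirrored entry (j, i)
            drawn.add(edge)
            (x0, y0), (x1, y1) = centroids[i], centroids[j]
            ax.plot([x0, x1], [y0, y1], **edge_kws)
    return len(drawn)


fig, ax = plt.subplots()
n_regular = draw_network(centroids, neighbors, ax, color="k", linewidth=1)
n_extra = draw_network(centroids, extra_joins, ax, color="#d6604d", linewidth=1)

# Nodes are drawn last so they sit on top of the edges.
xs, ys = zip(*centroids.values())
ax.scatter(xs, ys, color="k", zorder=3)
```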

**Related links and PRs**

* [merged PR: creating `plot_spatial_weights()`](https://github.com/pysal/splot/pull/14)
* [Blog Post](https://blogs.python-gsoc.org/stefanie-lumnitz/2018/07/09/milestone-2-sprinting-towards-ansplot-release/)

### splot.mapping

**Universal Choropleth visualizations and mapping utilities**

**value_by_alpha_cmap()**

**vba_choropleth()**

**vba_legend()**

**mapclassify_bin()**

**shift_colormap()**

**truncate_colormap()**

## Code beyond splot

## Remaining work & next steps

## Community Outreach

During the Summer of Code and within the scope of developing `splot`, the whole development team had the chance to collaborate and communicate with other package developers in the Scientific Python Open Source Software Community. 

We would like to give many thanks to [Rebecca Bilbro](https://github.com/rebeccabilbro) and [Benjamin Bengfort](https://github.com/bbengfort) for sharing with us their experiences and knowledge from developing the [`yellowbrick`](http://www.scikit-yb.org/en/latest/) package. We made good use of their advice that it is good to just get started with one API and that it is always possible to create another one later on, for example by leveraging other packages as a backend. Thank you also to [Joris van den Bossche](https://github.com/jorisvandenbossche) and the geopandas development team, without whose well-timed release of geopandas 0.4.0 the first `splot` release would not have been possible.

Furthermore, [Sergio Rey](https://github.com/sjsrey), [Dani Arribas-Bel](https://github.com/darribas), [Levi John Wolf](https://github.com/ljwolf) and [I](https://github.com/slumnitz) were able to meet for a joint `PySAL` and `splot` sprint at [SciPy 2018](https://scipy2018.scipy.org/ehome/index.php?eventid=299527&). This provided the basis for the first release, and I was also able to present `splot` in a [lightning talk](https://youtu.be/kriQOJMycIQ?t=2381) to the broader Scientific Python community. Lastly, many thanks to my mentors and the whole PySAL development team for loads of fun coding and Google Hangouts sessions. I had a great time this summer and am looking forward to being part of this welcoming Python community way beyond the Google Summer of Code!