# 00. Introduction and Setup

>### Code and course materials, including this notebook, are available at [bit.ly/ndgeoworkshop](http://bit.ly/ndgeoworkshop)


## Overview

Sessions are held in 339 O'Shaughnessy Hall. See the [campus map](https://map.nd.edu/#/placemarks/1052/zoom/16/lat/41.700346/lon/-86.238899) for directions.

* **Tuesday morning:** Introduction and setup. Computing basics, plotting, working with text.
* **Tuesday afternoon:** Textual geography, NLP, mapping (static and interactive).

* **Wednesday morning:** Geographic data assessment and curation, advanced topics.
* **Wednesday afternoon:** Working with QGIS (Matt Sisk).


## Setup

Materials for the course are [on GitHub](https://github.com/wilkens/geospatial-workshop).

To work with live code during our sessions, you'll need two things:

* A **working Python environment**. Python 3 is required; Continuum's [Anaconda](https://www.continuum.io/downloads) environment is strongly recommended. Make sure you get the right package for your operating system.
* A **copy of the course materials** on your own machine. Download or, if you know what you're doing, clone these from [GitHub](https://github.com/wilkens/geospatial-workshop).
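
If you're not sure which Python you have, a quick check like the one below can confirm that it meets the Python 3 requirement. This is only a sketch; the helper name `python_is_supported` is made up for illustration and isn't part of the course materials.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 0)):
    """Return True if the given version tuple is at least `minimum`.

    `python_is_supported` is a hypothetical helper, not part of the course code.
    """
    # Compare only the (major, minor) part of the version.
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", sys.version.split()[0])
print("Meets the Python 3 requirement:", python_is_supported())
```

Run it with whichever `python` you plan to use for the workshop. If it reports a 2.x version, you're probably picking up a system Python rather than the Anaconda one.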

## Getting Python

Anaconda (see link above) includes almost all of the packages that you'll need for the course (and, indeed, that you'll _ever_ need to do data science in Python). But there are four packages that we'll add:

* `seaborn`, a set of improvements and extensions to Python's `matplotlib` for statistical data visualization.
* `cartopy`, which helps with map making.
* `folium`, for creating interactive maps.
* `googlemaps`, an interface to Google's mapping APIs.

The easiest way to add packages to Anaconda is either from the command line with `conda install` or via the graphical installer that's part of the Anaconda Navigator application.

To install `seaborn` from the command line using conda, type:

```
conda install seaborn
```

You'll likely be prompted to install some additional packages along with `seaborn`. That's expected: resolving dependencies is what `conda` is designed to do. Type `yes` when prompted.

The others are only marginally more complicated, because they aren't in the main Anaconda repository. `cartopy` and `folium` are available from the community-maintained `conda-forge` channel, so you can install them by naming that channel with the `-c` option:

```
conda install -c conda-forge folium cartopy
```

`googlemaps` isn't available as a conda package at all, so it's installed with `pip`, the general-purpose Python package installer, like so:

```
pip install googlemaps
```
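
Once the installs finish, it can be reassuring to confirm that Python can actually find all four packages. The snippet below is a minimal sketch of such a check using the standard library's `importlib`; the names `WORKSHOP_PACKAGES` and `check_packages` are illustrative, not part of the course code.

```python
from importlib.util import find_spec

# The four add-on packages named in this section.
WORKSHOP_PACKAGES = ["seaborn", "cartopy", "folium", "googlemaps"]


def check_packages(names):
    """Map each top-level package name to True if Python can locate it.

    `check_packages` is a hypothetical helper written for this example.
    """
    # find_spec() looks a package up without importing it.
    return {name: find_spec(name) is not None for name in names}


for name, found in check_packages(WORKSHOP_PACKAGES).items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```

This only checks that each package can be located, not that it imports cleanly. If something is reported missing, rerun the matching `conda` or `pip` command above from the same environment you use to run Python.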

## Running Python notebooks

Once you've installed Python and downloaded (and unzipped) the course materials, you're almost ready to go. The last step is to start what's called a Jupyter notebook server, which will allow you to load, read, modify, and run course code.

> An aside: If you've done some programming before, you may know that notebooks are a relatively recent invention, one used more for teaching and research than for writing production code. That's true. Notebooks are valuable because they make it easy to integrate code, documentation, notes, and visualizations in a single document, as well as to run small pieces of a larger program individually.

I generally start a notebook server from the command line with `jupyter notebook`. You can also use the Anaconda Navigator app to launch a notebook server. The Navigator app is located in your Anaconda install directory and might also be linked from your desktop and from your Applications or Programs directory. Its icon looks like a scaly green 'O':

![Navigator logo](Images/Navigator.png)

When you start it, you'll see a screen that looks like the one below. Click the 'Launch' button under the Jupyter notebook logo.

![Navigator start screen](Images/NavigatorScreenshot.png)

From there, a browser window will open, displaying the files in your home directory. Use it to navigate to the folder containing the course materials and click on any notebook to start it in a new window. You're up and running!

## Python docs and info

For more information on Anaconda, see [Continuum's documentation site](https://docs.continuum.io). For a user's guide to Jupyter notebooks, see either the built-in help from the 'Help' menu at the top of any running notebook or the [Jupyter project's documentation](https://jupyter.readthedocs.io/en/latest/index.html).

As noted elsewhere, this workshop is nowhere near a proper introduction to programming in general or to Python in particular. When I teach programming material to my own students, I use John Guttag's [_Introduction to Computation and Programming Using Python_](https://mitpress.mit.edu/books/introduction-computation-and-programming-using-python-1). Guttag is also currently offering a [free EdX version of his MIT Intro to CS course](https://www.edx.org/course/introduction-computer-science-mitx-6-00-1x-8#.U4x_iSiJKEk) if you can't get a proper intro CS course at your home university.

## Java

We'll work entirely in Python, but many of the best production-quality Natural Language Processing (NLP) packages are written in Java (which long had much better multilingual text support than did other programming languages). I'll show a bit of [Stanford's Java-based NLP tools](http://nlp.stanford.edu/software/); if you want to try them on your own, you'll need to download and install a Java Development Kit (JDK). I recommend [Oracle's official version](http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html).
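
If you'd like to see whether Java is already available before downloading anything, a rough check from Python might look like the sketch below. It assumes the Java launcher is called `java` and lives on your `PATH`; the helper name `find_executable` is made up for this example.

```python
import shutil


def find_executable(name):
    """Return the full path to `name` if it is on the PATH, else None.

    `find_executable` is a hypothetical helper written for this example.
    """
    return shutil.which(name)


# Assumption: the Java launcher is named `java` on your system.
java_path = find_executable("java")
if java_path is None:
    print("No `java` command found on your PATH.")
else:
    print("Found java at:", java_path)
```

Finding the command doesn't guarantee a working JDK (some systems ship a placeholder `java`), so treat a positive result as a hint rather than proof.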

## QGIS

On the second day of the workshop, we'll use a graphical GIS tool called [QGIS](http://www.qgis.org/en/site/). You may want to download it now.

> Note that the latest version of QGIS for Mac isn't working for me. The older, long-term release (v. 2.14) seems to be fine.
