<img align="right" width="300" src="libraries_short_color.png" alt="NYU Libraries Logo">

# Humanities Data That Stays Human
## Strategies for Building Critical Quantitative Data Literacy in the Classroom

**Nicholas Wolf**<br/>
[ORCID 0000-0001-5512-6151](https://orcid.org/0000-0001-5512-6151)

This lesson is licensed under a [Creative Commons Attribution-NonCommercial 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).

**Overview**

Visualization of data can be a powerful way to understand subjects of study from a different point of view, eliciting patterns or areas of analysis that might not otherwise be detected through reading a source alone.

There is also a need for humanists to diversify the approaches that they use to map humanities data. The tendency to select points as a representation of a source is strong. But there are other ways to represent spatial relationships. We'll work with one in particular, line segments, which can be a powerful way to model itineraries, journeys, and other forms of movement across landscape.

Finally, in this session we'll consider a new "container" for how we can hold data, JSON (and specifically, GeoJSON, a container for geographic data).

**Materials**

 - A good text editor, Google Maps, and a [GeoJSON validator/viewer](http://geojson.io).
 - [Rand McNally Guide for New York City, 1904](https://catalog.hathitrust.org/Record/100631149); if HathiTrust access is a problem, the guide is also available at [this link to the full guide](https://drive.google.com/file/d/1kQWEenTqgjESGcdGHvZjujAoOcRcKWse/view?usp=sharing).
 

### 1. Reflection

1. We will be deriving our data from a tourist's guide to a city from the early 20th century. What would be the expected biases, rhetorical techniques, mediated views, or assumptions that we would expect in a source like this? What critical stance would we need to adopt to profitably interpret this source?

2. For this session, we will concentrate on two sections of the guide entitled "A Tour of the City" and "A Ramble at Night." Let's talk a little bit about what early 20th-century conceptions of cities during the day and night might entail.

### 2. Geocoding (aka Geolocating)

Before we can visualize the itineraries contained in this tourist guide, we'll need to become proficient at finding geographic coordinates for the places mentioned in the source. 

Most mapping visualization systems rely on geographic coordinates expressed in decimal degrees -- in other words, the number of degrees north or south of the equator (latitude) and the number of degrees east or west of the Prime Meridian (longitude), converted from degrees/minutes/seconds to a decimal number. Positive values indicate north (latitude) and east (longitude) of 0 degrees, and negative values indicate south and west.
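
The conversion from degrees/minutes/seconds to decimal degrees can be sketched in a few lines of Python. This is a minimal illustration rather than part of the lesson's workflow: the function name `dms_to_decimal` is hypothetical, and the sample values are illustrative numbers chosen to land in lower Manhattan, not readings taken from the guide.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    # One degree has 60 minutes, and one minute has 60 seconds.
    decimal = degrees + minutes / 60 + seconds / 3600
    # South of the equator and west of the Prime Meridian are negative.
    return -decimal if hemisphere in ("S", "W") else decimal


# Illustrative values only (hypothetical, roughly lower Manhattan).
latitude = dms_to_decimal(40, 43, 30, "N")
longitude = dms_to_decimal(73, 59, 42, "W")
print(round(latitude, 6), round(longitude, 6))
```

Note that the western longitude comes out negative, which is what we should expect for any location in New York City.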

A fast way to find geographic coordinates for a location is to use Google Maps. Let's practice:

1. Enter the street address or cross street description into https://maps.google.com.

2. Right-click (or on a single-button mouse, control + click) on the map at the spot where the subject stands (or stood) and select “What’s Here?” <br clear="both">

<img align="left" width="500" height="500" src="screencaptures/whats-here-gmaps.png" alt="Google Maps What's Here Function"><br clear="both"><br/>

3. Google will supply a small pop-up window with the decimal coordinates of the location given with a hyperlink. Click on that hyperlink.<br clear="both">

<img align="left" width="500" height="500" src="screencaptures/decimal-degrees.png" alt="Google Maps Finding Decimal Degrees"><br clear="both"><br/>

4. This will bring the latitude/longitude up into an info window on the left. These coordinates are what you will need for making maps.

<img align="left" width="500" height="500" src="screencaptures/lat-long.png" alt="Google Maps Finding Decimal Degrees"><br clear="both"><br/>

We are now ready to visualize!

### 3. Mapping McNally NYC Day/Night Tourist Itineraries

We now have everything we need to produce an interesting comparison of day and night itineraries as laid out in our 1904 guide. Let's divide into two groups, a day group and a night group.

Complete the following steps:

 1. Introduce yourselves and say hello.
 
 2. Make sure everyone understands what the task at hand is.
 
 3. Each group will take one of the two itineraries listed in the guide. Our aim is to compile a list of locations that can serve as segments of a full itinerary, building a table of coordinates that sketches out the travel described by the guide. Use the main (large) headers as locations, and only as many of the bold subheaders as are needed to make a complete and unbroken line.

The **day group** should work through the itinerary laid out in the chapter "A Tour of the City" that starts on page 82 and continues to page 125 ([HathiTrust link to the start of this chapter](https://babel.hathitrust.org/cgi/pt?id=uiug.30112089222134&view=1up&seq=92)).

The **night group** should work on the itinerary laid out in "A Ramble at Night" that starts on page 147 and continues to page 156 ([HathiTrust link to the start of this chapter](https://babel.hathitrust.org/cgi/pt?id=uiug.30112089222134&view=1up&seq=157)).

Use your discretion on what inflection points you select -- we don't need to be hyper detailed, but we do want to produce a good itinerary.

 4. Collectively compile the locations in your group's shared Google Sheet using Google Maps to locate the start and end point of each leg of the itinerary. Use the first column so that each group member can claim a row to be working on.

<table>
    <tr><th>GroupMember_Initials</th><th>Start_Point_Location</th><th>Coordinates_Start</th><th>Coordinates_End</th></tr>
    <tr><td>NW</td><td>Police Headquarters</td><td>[-73.994868, 40.724906]</td><td>[-74.000415, 40.714432]</td></tr>
    <tr><td>LC</td><td>Five Points</td><td>[-74.000415, 40.714432]</td><td>{{Add next coordinates}}</td></tr>
</table>
<br/>

...where the next set of coordinates to add here will be the location of the next stop in the itinerary.

[Google Sheet for Day Group](https://docs.google.com/spreadsheets/d/1mljTCR6HfMFxpKPIvxvQJU6FMzmnPwVM0_Jyjl70uwA)

[Google Sheet for Night Group](https://docs.google.com/spreadsheets/d/12JCoBf3GoxmS8l_C7_RMtDD1T-cPfxZ8vLnAAevU5JM)

 5. Decide as a group on one "variable" to collect for each itinerary leg from the prose description and add it in Column E. For example, you might consider a categorical variable such as the type of city feature predominantly described. A variable with values such as "monument," "city administration," or "culinary," recording the type of feature included in the itinerary, could lead to interesting questions about what "uses" were envisioned for certain neighborhoods in the city.
 
 
 6. Elect a DJ to put on some nice background music while you work ([example on YouTube](https://www.youtube.com/watch?v=dfeQ7witu7s)).

**NOTE!**

The end point of each location row is the starting point for the next row. Keep them in the order that the itinerary would take you!
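
The chaining rule in this note can be checked mechanically once the sheet is filled in. The sketch below assumes the rows have been copied into a Python list of dictionaries; the key names (`place`, `start`, `end`) and the function `find_breaks` are hypothetical, and the last end point is a made-up placeholder rather than a location from the guide.

```python
def find_breaks(legs):
    """Return the row indexes whose start point differs from the previous row's end point."""
    return [
        i
        for i in range(1, len(legs))
        if legs[i]["start"] != legs[i - 1]["end"]
    ]


# Coordinates are [longitude, latitude]. The first row and the start of the
# second come from the sample table; the final end point is a hypothetical
# placeholder, not a location taken from the guide.
legs = [
    {"place": "Police Headquarters", "start": [-73.994868, 40.724906], "end": [-74.000415, 40.714432]},
    {"place": "Five Points", "start": [-74.000415, 40.714432], "end": [-74.0, 40.71]},
]

print(find_breaks(legs))
```

An empty list means every leg begins where the previous one ended; any index that is returned points to a row that breaks the line.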

### 4. Introduction to JSON and to GeoJSON

We have a variety of options open to us as to how we can structure representations of our data so that the data can be written to a computer file for storage. For the most part, we seek forms that are easy to read for humans who view the characters that together make up a representation of that data, and yet are highly structured (i.e., predictable) enough that a computer can interpret their contents consistently.

Often, when working with tabular data, we work in terms of character-separated rows of records. CSV (comma-separated values) files are a very common example.

But another useful system is called JavaScript Object Notation, or JSON, and it involves moving our tabular records/observations to a series (array) of objects, each made up of variable-value pairs. Think of this as repeatedly linking a row value to its column (variable) label. A drawback to this is that the resulting object contains a lot more characters (our column/variable names are repeated a lot), but among the advantages are that the order in which we list out our "cell values" no longer matters -- every value is linked to its variable as well as to its accompanying values. It is also more extensible. Since the contents of any JSON value can itself be rendered in JSON, we can have a highly complex, multidimensional data structure described using a relatively minimal container.

**Example**

Suppose we have a table like this (examples from NYU's [Arabic Collections Online](http://dlib.nyu.edu/aco/)):

<table>
    <tr><th>Year</th><th>Title</th><th>Holding Library</th></tr>
    <tr><td>1962</td><td>al- Murshid ilá mawāṭin al-āthār wa-al-ḥaḍārah</td><td>NYU</td></tr>
    <tr><td>1946</td><td>ʻAwdaẗ al-rūḥ</td><td>NYU</td></tr>
</table>

Our JSON equivalent would look like this:
<pre>
[
  {"Year":"1962",
   "Title":"al- Murshid ilá mawāṭin al-āthār wa-al-ḥaḍārah",
   "Holding Library":"NYU"
   },
   
  {"Year":"1946",
   "Title":"ʻAwdaẗ al-rūḥ",
   "Holding Library":"NYU"
   }
]
</pre>
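
The move from table to JSON shown above can also be done programmatically. As a minimal sketch, assuming the table has been saved as comma-separated text (the `csv_text` string below is a stand-in for such a file), Python's standard `csv` and `json` modules can link each cell value to its column label:

```python
import csv
import io
import json

# Stand-in for a CSV file holding the sample table above.
csv_text = """Year,Title,Holding Library
1962,al- Murshid ilá mawāṭin al-āthār wa-al-ḥaḍārah,NYU
1946,ʻAwdaẗ al-rūḥ,NYU
"""

# DictReader pairs every value in a row with its column header.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# ensure_ascii=False keeps the transliterated characters readable.
json_text = json.dumps(rows, ensure_ascii=False, indent=2)
print(json_text)
```
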
Now suppose we have more than one author for each work in our table. Using tables, this gets a little awkward to convey and still maintain a "tidy data" approach:

<table>
    <tr><th>Year</th><th>Title</th><th>Authors</th><th>Holding Library</th></tr>
    <tr><td>1962</td><td>al- Murshid ilá mawāṭin al-āthār wa-al-ḥaḍārah</td><td>Bāqir, Ṭāhā. Safar, Fuʼād.</td><td>NYU</td></tr>
    <tr><td>1946</td><td>ʻAwdaẗ al-rūḥ</td><td>Ḥakīm, Tawfīq.</td><td>NYU</td></tr>
</table>
<br/>
But using JSON, we can render this quite easily:

<pre>
[
  {"Year":"1962",
   "Title":"al- Murshid ilá mawāṭin al-āthār wa-al-ḥaḍārah",
   "Authors":["Bāqir, Ṭāhā","Safar, Fuʼād"],
   "Holding Library":"NYU"
   },
   
  {"Year":"1946",
   "Title":"ʻAwdaẗ al-rūḥ",
   "Authors":["Ḥakīm, Tawfīq"],
   "Holding Library":"NYU"
   }
]
</pre>
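
To see why the nested form is convenient, here is a small sketch that reads the JSON above back into Python and produces one record per title-author pair. The variable names are hypothetical, and this is just one possible way to flatten the structure.

```python
import json

records = json.loads("""
[
  {"Year": "1962",
   "Title": "al- Murshid ilá mawāṭin al-āthār wa-al-ḥaḍārah",
   "Authors": ["Bāqir, Ṭāhā", "Safar, Fuʼād"],
   "Holding Library": "NYU"},
  {"Year": "1946",
   "Title": "ʻAwdaẗ al-rūḥ",
   "Authors": ["Ḥakīm, Tawfīq"],
   "Holding Library": "NYU"}
]
""")

# "Authors" is itself an array, so each work can carry any number of authors.
author_rows = [
    {"Title": record["Title"], "Author": author}
    for record in records
    for author in record["Authors"]
]

for row in author_rows:
    print(row["Author"], "|", row["Title"])
```

Each author remains a separate value rather than a single delimited string, so nothing has to be split apart later.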


**GeoJSON and Spatial Data**

You may have encountered spatial data in a variety of other formats such as ESRI's shapefiles or KML. Another way to store and compactly convey spatial data is via JSON that follows a [GeoJSON formatting standard](http://wiki.geojson.org/Main_Page). Consider this approach a way of storing all of the points needed to make a point, line, or polygon on a map, along with any associated data that goes with the spatial feature.

Here, for example, is how you store a point in GeoJSON:
<pre>
{
   "type": "Point",
   "coordinates": [100.0, 0.0]
}
</pre>

Those coordinates are our decimal degrees once again, though in a specific order: x, y, **or longitude, latitude (take note!)**

And here is how we store a line string (two points plus the line that connects them):
<pre>
{
   "type": "LineString",
   "coordinates": [
       [100.0, 0.0], [101.0, 1.0]
   ]
}
</pre>

Note that we need at least two point groupings for a linestring, one for the start of the line and one for the end. Here is how the linestring above looks when rendered in [geojson.io](https://geojson.io):

<img align="left" width="300" src="screencaptures/sample-linestring.png" alt="GeoJSON Linestring Example"><br clear="both"><br/>
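
The same linestring can be assembled in code rather than typed by hand. The helper below is a hypothetical sketch (the name `make_linestring` is not part of any library) that enforces the two-point minimum and prints JSON that could be pasted into a viewer.

```python
import json


def make_linestring(*positions):
    """Build a GeoJSON LineString geometry from two or more [longitude, latitude] positions."""
    if len(positions) < 2:
        raise ValueError("a LineString needs at least two positions")
    return {"type": "LineString", "coordinates": [list(p) for p in positions]}


# Same sample coordinates as the linestring example above.
line = make_linestring([100.0, 0.0], [101.0, 1.0])
print(json.dumps(line, indent=3))
```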

And we can even add a little bit of styling to give a color attribute to the feature we are creating once displayed on a map:

<pre>
{
   "type": "LineString",
   "coordinates": [
       [100.0, 0.0], [101.0, 1.0], [102.0, 2.0], [103.0, 3.0]
   ],
   "style": {
        "fill":"red",
        "stroke-width":"3",
        "fill-opacity":1.0
    }
}
</pre>


**NOTE!**
1. You have to reverse the order of the decimal degree locations given by Google Maps. Maps gives you latitude-longitude, but the order for GeoJSON is longitude, latitude!!
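
The swap described in this note is easy to get wrong by hand, so here is a minimal sketch of doing it in Python. It assumes the coordinates were copied from Google Maps as a single "latitude, longitude" string; the function name `google_to_geojson` is hypothetical.

```python
def google_to_geojson(lat_lon_text):
    """Turn a 'latitude, longitude' string into a GeoJSON [longitude, latitude] position."""
    lat_text, lon_text = lat_lon_text.split(",")
    lat, lon = float(lat_text), float(lon_text)
    # A quick range check catches values that were pasted in the wrong order.
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return [lon, lat]


# Police Headquarters from the sample table, in the order Google Maps gives it.
position = google_to_geojson("40.724906, -73.994868")
print(position)
```

The range check will not catch every swap (both New York values are valid latitudes), so it is still worth eyeballing the result on a map.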

### 5. Building our GeoJSON and Adding to a Map

There are ways to automate pulling data from a table and making it into GeoJSON. But because we need specific formatting for GeoJSON anyway, it is worthwhile for us to perform our final step in a text editor. Using this template for each line segment, collectively add a JSON segment for each row in your table that looks something like this:

<pre>

{
   "type": "Feature",
   "id": "0",
   "geometry": {
       "type": "LineString",
       "coordinates": [
           [102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]
       ]
   },
   "properties": {
               "Start_Point_Location": "Police Station"
           },
   "style": {
        "fill": "red",
        "stroke-width": "3"
   }
}

</pre>
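
For anyone curious about the automated route, the template above can be filled in from a table row with a few lines of Python. This is a sketch under stated assumptions: the function `row_to_feature` and its parameter names are hypothetical, the styling values simply mirror the template, and the coordinates are already in longitude, latitude order.

```python
import json


def row_to_feature(feature_id, start_name, start, end):
    """Build one GeoJSON Feature, shaped like the template, from a single table row."""
    return {
        "type": "Feature",
        "id": str(feature_id),
        "geometry": {
            "type": "LineString",
            "coordinates": [start, end],
        },
        "properties": {"Start_Point_Location": start_name},
        # Mirrors the styling block in the template above.
        "style": {"fill": "red", "stroke-width": "3"},
    }


# First row of the sample table from section 3.
feature = row_to_feature(
    0,
    "Police Headquarters",
    [-73.994868, 40.724906],
    [-74.000415, 40.714432],
)
print(json.dumps(feature, indent=3))
```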

**Day Group**: enter your GeoJSON here: [https://code.etherpad.com/p/mcnally-day](https://code.etherpad.com/p/mcnally-day)

**Night Group**: enter your GeoJSON here: [https://code.etherpad.com/p/mcnally-night](https://code.etherpad.com/p/mcnally-night)

Once we have all of these compiled, we can test them out on [geojson.io](https://geojson.io).
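
When testing, the individual segments need to travel together as one JSON object. A common way to do this in GeoJSON is to wrap the features in a `FeatureCollection`. The sketch below assumes that wrapper is what we want for our viewer; the function name `to_feature_collection` is hypothetical, and the coordinates are the placeholder values from the template, not real itinerary stops.

```python
import json


def to_feature_collection(features):
    """Wrap a list of GeoJSON Feature objects in a single FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


# Two placeholder segments using the template's sample coordinates.
segments = [
    {
        "type": "Feature",
        "id": "0",
        "geometry": {"type": "LineString", "coordinates": [[102.0, 0.0], [103.0, 1.0]]},
        "properties": {"Start_Point_Location": "Police Station"},
    },
    {
        "type": "Feature",
        "id": "1",
        "geometry": {"type": "LineString", "coordinates": [[103.0, 1.0], [104.0, 0.0]]},
        "properties": {"Start_Point_Location": "Second stop (placeholder)"},
    },
]

collection = to_feature_collection(segments)
print(json.dumps(collection, indent=3))
```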