# BTAA-GIN Technology Orientation

<small> 2025</small>

## 🎯 Goals for Today 

- Identify resources for the BTAA Geoportal  
- Learn how to submit metadata  
- Practice fixing common metadata issues  

**Format:** Short explanations + hands-on exercises  

##  The BTAA Geoportal 

![geoportal](geoportal.png)
- A discovery tool for geospatial resources
- Built with [GeoBlacklight](https://geoblacklight.org), an open-source software application
- A metadata catalog (not the data itself!)
- (Most) resources are public domain: free & open data

## Geoportal Special Features

- Searching with the map (requires coordinates in the metadata)
- Resource previews (if available from provider):
	- Maps: IIIF ([example](https://geo.btaa.org/catalog/c5b43fd8-0030-4089-80df-b692e6bda8b2))
	- Datasets: web services ([example](https://geo.btaa.org/catalog/6f3a1ad4-21f2-494e-a382-dee1255e29e3))


# 1. Geoportal Content

## Formats

**Always in scope**
- Shapefile, GeoJSON, other spatial formats
- Geodatabase, geopackage
- TIFF, JPEG, JPEG2000, PNG
- GeoTIFF

**Added by request**
- Static websites, Interactive databases, Storymaps
- Tabular data, PDFs, Text files

## Sources

**Always in scope**

- Local government: state, county, city, regional
- Nongovernment orgs: nonprofits, historical societies  
- Academic: libraries, departments, research institutes

**Added by request**

- Federal data
- Licensed data
- Maps in copyright

## Geography

**Always in scope**

- Data covering an area within the Big Ten
- Data covering any area, but created by a Big Ten researcher
- Maps of any area held at a Big Ten university

**Added by request**

- Data from another state or nation 
- Maps held at a non-Big Ten university

**Your Primary Role:** Keep looking for new resources!  

## 🛠️ Activity

### Find GIS Data Sources

🔍 **Task:** Look up a county in your state & check if it has a GIS data portal  

🤔 **Ask Yourself:**  
- What data is available?  
- Would it be a good fit for the Geoportal?  

💬 **Discussion:** What did you find?  

# 2. How to submit resources

This is an informal process. Every collection has a slightly different workflow. Our goal is to obtain the metadata. Here are some things to look for.

## GIS Data

- Is it hosted on a standard platform, such as an ArcGIS Hub or CKAN portal? Try adding `/data.json` to the end of the base URL. If a JSON catalog loads, the site exposes its metadata through an API, and it can be harvested programmatically.


- If not, are there individual metadata documents anywhere, such as a catalog of XMLs?
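
The `/data.json` check can be sketched in Python. This is a minimal sketch, assuming the portal returns a DCAT-US style catalog with a top-level `dataset` list. The function names are hypothetical and are not part of any BTAA-GIN tool.

```python
import json
import urllib.request


def catalog_url(base_url):
    """Append /data.json to a portal's base URL, avoiding a double slash."""
    return base_url.rstrip("/") + "/data.json"


def list_titles(catalog):
    """Return dataset titles from a DCAT-US style catalog (assumed layout)."""
    return [d.get("title", "") for d in catalog.get("dataset", [])]


def fetch_catalog(base_url, timeout=30):
    """Download and parse the catalog; raises an error if it is unavailable."""
    with urllib.request.urlopen(catalog_url(base_url), timeout=timeout) as resp:
        return json.load(resp)
```

If `fetch_catalog` raises an error or the result has no `dataset` list, the portal probably needs a different harvest approach.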


## Scanned Library Maps

- Ask your IT department for a full export of metadata
- Make sure the export includes IDs and links

## Websites

- Get the URL and publisher
- Create a title and description

## 💬 Discussion

How will we get the metadata from scanned maps held at your library?


# 3. Metadata Profile

## GeoBTAA Metadata Profile

Includes:

1. [OpenGeoMetadata Aardvark](https://opengeometadata.org/ogm-aardvark/)
2. [Custom elements](https://gin.btaa.org/metadata/b1g-custom-elements/)

## OpenGeoMetadata (OGM) Aardvark

- designed for GeoBlacklight
- intended for discoverability
- often generated from more complete geospatial metadata, such as ISO 19139 or FGDC
- mainly a mix of Dublin Core and GeoBlacklight application-specific fields

## Custom GeoBTAA Elements

- augment OGM Aardvark
- intended to serve as standalone metadata
- include geospatial technical fields, like projection & scale
- include multiple fields for life cycle tracking

## 🔍 Template

Let's take a look at the Primary Metadata Template: [z.umn.edu/b1g-template](https://z.umn.edu/b1g-template)

Key things to know about the template:
- Make a copy or request a customized template
- Separate multiple values with a pipe (|)
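
When processing a spreadsheet export with a script, pipe-separated cells can be split into lists and joined back again. A minimal sketch; the helper names are hypothetical:

```python
def split_values(cell):
    """Split a pipe-separated cell into a list, dropping empty entries."""
    return [value.strip() for value in cell.split("|") if value.strip()]


def join_values(values):
    """Join a list of values back into a single pipe-separated cell."""
    return "|".join(values)


print(split_values("Oregon--Portland|Oregon"))
```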


# 4. Tricky Fields

## Bounding Boxes

- Format as decimal degrees (instead of degrees-minutes-seconds)
- Use the order West,South,East,North
- If coordinates are missing, consider adding them in batches for records with identical coverage, or assigning the task to student workers
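
A quick format check can catch many bounding box problems before upload. The sketch below is an illustration only, not the Geoportal's own validation; the function name is hypothetical. It does not flag west greater than east, since that can be legitimate for areas crossing the antimeridian, and it cannot catch every ordering mistake.

```python
def parse_bbox(text):
    """Parse 'W,S,E,N' in decimal degrees; raise ValueError if malformed."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated values")
    # float() rejects degree symbols and hemisphere letters such as "W"
    west, south, east, north = (float(part) for part in parts)
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitude out of range")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitude out of range")
    if south > north:
        raise ValueError("south is greater than north")
    return west, south, east, north
```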

## Klokan Bounding Box Demo

- [Klokan Bounding Box](https://boundingbox.klokantech.com) is a tool for generating extents in various formats
- Select the "CSV" output option
- More detailed instructions: [https://gin.btaa.org/metadata/recipes/add-bbox](https://gin.btaa.org/metadata/recipes/add-bbox/)

## 🛠️ Activity

### Troubleshoot a Bounding Box format

Find 3 things to fix about the following bounding box for Chicago:

`W87°56', -87°31', 42°01', 41°38'`

### Answer

1. It needs to be in decimal degrees
2. The western coordinate has a "W" instead of a negative sign
3. The coordinates are in the wrong order (These are `W,E,N,S`; we need `W,S,E,N`)

Format for the BTAA Geoportal:

`-87.9,41.6,-87.5,42.0`
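
The conversion from degrees and minutes to decimal degrees can be worked through in Python. This is a minimal sketch using the Chicago values from the exercise; the function name is hypothetical, and rounding to one decimal place is chosen only to match the answer above.

```python
def dms_to_decimal(degrees, minutes=0, seconds=0, hemisphere="N"):
    """Convert degrees-minutes-seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # West longitudes and south latitudes are negative
    return -value if hemisphere in ("W", "S") else value


west = dms_to_decimal(87, 56, hemisphere="W")
south = dms_to_decimal(41, 38, hemisphere="N")
east = dms_to_decimal(87, 31, hemisphere="W")
north = dms_to_decimal(42, 1, hemisphere="N")

# Assemble in the required order: West,South,East,North
bbox = ",".join(f"{value:.1f}" for value in (west, south, east, north))
print(bbox)  # -87.9,41.6,-87.5,42.0
```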

## Place Names

- Format all place names as [FAST subject headings](http://fast.oclc.org/searchfast/)
- For local US data, the format looks like:
	- `state--county` 
	- `state--city`
	- `state`

Example:

`Illinois--Chicago`

## 🛠️ Activity

### Reformat place names

What is the FAST format for the following place names?

- Portland (Or.)
- Seattle (Wash.)


### Answer

- For "Portland (Or.)" → `Oregon--Portland`

- For "Seattle (Wash.)" → `Washington (State)--Seattle`

To improve user searches, we add the state as a separate entry (separated by a pipe), like this:

`Oregon--Portland|Oregon`
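
Adding the state entry can be scripted when a spreadsheet has many place names. A minimal sketch, assuming each value is already in the FAST `state--place` form; the function name is hypothetical:

```python
def with_state(place):
    """Append the state as a separate pipe-delimited entry."""
    state = place.split("--")[0]
    if state == place:
        # The value is already just a state, so there is nothing to add
        return place
    return f"{place}|{state}"


print(with_state("Oregon--Portland"))  # Oregon--Portland|Oregon
```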


# 5. Distributions

The BTAA Geoportal does not host data or maps, so we need **links** with the metadata.

The Geoportal has over two dozen types of links. Common types:

- Landing page
- Download (can be multiple)
- IIIF Image API or Presentation (manifest) API 
- OpenIndexMap (GeoJSON)
- Geospatial web services from ArcGIS or GeoServer
- Supplemental metadata file

## 🔍 Template

Let's take a look at the Distributions Metadata Template: [z.umn.edu/b1g-template](https://z.umn.edu/b1g-template)

Key things to know about the Distributions template:
- One link per line
- Use the ID from the Primary template
- The same record may have multiple rows of links
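
The one-link-per-row layout can be illustrated with a short script that groups rows by record ID. The column names (`ID`, `Type`, `URL`), the sample rows, and the function name are all hypothetical; check the actual template for the real headers.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: one link per row, with the record ID repeated
SAMPLE = """ID,Type,URL
abc-123,landing page,https://example.org/item/abc-123
abc-123,download,https://example.org/files/abc-123.zip
def-456,landing page,https://example.org/item/def-456
"""


def links_by_id(csv_text):
    """Group (type, url) pairs under the record ID they belong to."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["ID"]].append((row["Type"], row["URL"]))
    return dict(grouped)


for record_id, links in links_by_id(SAMPLE).items():
    print(record_id, len(links))
```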

# 6. Full Resource Lifecycle

![ResourceLifecycle](resourceLifecycle.png)

## 1. Identify

Team members seek out new content for the Geoportal.


## 2. Obtain and Process Metadata

We harvest the metadata, convert it to the GeoBTAA Schema, edit, and validate it.

### a. Harvest

Here are the most common ways that we harvest the metadata:
1. a BTAA-GIN Team Member sends us the metadata as files or a CSV
2. we query an API
3. we scrape an HTML page with Python
4. we manually copy and paste the metadata into a spreadsheet
5. a combination of two or more of the above
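
Scraping an HTML page often starts with collecting link text and URLs. The sketch below uses only Python's standard library and is an illustration, not the script the team runs; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

Each pair can then become a draft title and link in a spreadsheet row.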

### b. Crosswalk

We "crosswalk" or convert the metadata into the schema needed for the Geoportal. Our goal is to end up with a spreadsheet containing columns matching our [metadata template](https://z.umn.edu/b1g-template).
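
At its simplest, a crosswalk is a mapping from source field names to template column names. In this sketch the source keys follow the DCAT-US style catalog layout and the target column names are illustrative assumptions; consult the template for the actual headers.

```python
# Hypothetical mapping: source field -> template column
FIELD_MAP = {
    "identifier": "ID",
    "title": "Title",
    "description": "Description",
}


def crosswalk(source_record):
    """Build a template row, leaving a column blank if the source lacks it."""
    return {
        target: source_record.get(source, "")
        for source, target in FIELD_MAP.items()
    }


row = crosswalk({"identifier": "abc-123", "title": "Parcels"})
print(row)
```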

### c. Edit

Manually fix, improve, and augment the metadata as needed.

### d. Validate

Run a validation and cleaning script to ensure the records conform to the required elements of our schema. 
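
The idea behind a required-field check can be sketched as follows. This is not the team's validation script, and the list of required columns here is a hypothetical placeholder.

```python
# Hypothetical list of required columns, for illustration only
REQUIRED = ["ID", "Title"]


def missing_required(row):
    """Return the required columns that are absent or blank in a row."""
    return [
        column
        for column in REQUIRED
        if not str(row.get(column) or "").strip()
    ]


print(missing_required({"ID": "abc-123", "Title": " "}))  # ['Title']
```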

## 3. Index Metadata

### a. Ingest to GBL Admin

We upload the completed spreadsheet to GBL Admin, which serves as the administrative interface for the Geoportal. If GBL Admin detects any formatting errors, it will issue a warning and may reject the upload.


### b. Publish new records to the Geoportal

Once the metadata is successfully uploaded to GBL Admin, we can publish the records to the Geoportal. The technology that actually stores the records and enables searching is called [Solr](https://solr.apache.org). 

### c. Unpublish

Periodically, we need to remove records from the Geoportal. To do this, we use GBL Admin to either delete them or change their status to "unpublished."



## 4. Maintenance

### a. Monitor sources

We monitor our sources to check for new and retired content.

### b. Monitor Geoportal

We regularly assess the currency of the content in the Geoportal and check for broken links.
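
A basic broken-link check can be sketched with Python's standard library. This is an assumption about how such a check might look, not the team's monitoring tool, and the function name is hypothetical.

```python
import urllib.request


def link_is_alive(url, timeout=10):
    """Return True if a HEAD request succeeds with a status below 400."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and timeouts;
        # ValueError covers malformed URLs
        return False
```

Some servers reject HEAD requests, so a link reported as dead is worth confirming in a browser before the record is unpublished.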


### c. Schedule re-harvests

We schedule re-harvests from sources based on how frequently they update their content. See the [Collections Dashboard](https://github.com/orgs/geobtaa/projects/4) for this schedule.

# ❓ Questions

# 💬 Discussion