corelogic-on-greatlakes

Documentation, how-tos, and example code for using Great Lakes to process U-M CoreLogic data.

Overview

This repository demonstrates a workflow for processing CoreLogic data on the Great Lakes (GL) cluster at the University of Michigan.

The repository is organized into several directories, each demonstrating one step of the workflow:

  • [intro-to-corelogic-data]: describes the CoreLogic data and how to get access to it at the University of Michigan
  • [running-jupyter-spark-gl-ondemand]: describes how to start a Jupyter + Spark notebook in an Open OnDemand session on the Great Lakes (GL) cluster
  • [processing-corelogic-using-pyspark]: demonstrates how the CoreLogic data can be read, explored, filtered, and written out using PySpark (see the sketch after this list)
  • [github-and-greatlakes]: explains how to clone from and commit to a GitHub repository from a home directory on the Great Lakes (GL) cluster
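
The PySpark step follows the usual read → explore → filter → write pattern. Below is a minimal sketch of that pattern; the input path, delimiter, and column name (`SITUS_STATE`) are illustrative placeholders, not the actual CoreLogic file layout — see [processing-corelogic-using-pyspark] for the real details.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("corelogic-demo").getOrCreate()

# Read a delimited extract into a DataFrame
# (pipe delimiter and path are assumptions for illustration).
df = (
    spark.read
    .option("header", True)
    .option("delimiter", "|")
    .csv("/path/to/corelogic/extract")  # placeholder path
)

# Explore: inspect the schema and a small sample of rows.
df.printSchema()
df.show(5)

# Filter: keep rows for one state (hypothetical column name).
mi = df.filter(df["SITUS_STATE"] == "MI")

# Write the filtered subset out as Parquet for faster reuse.
mi.write.mode("overwrite").parquet("/path/to/output/corelogic_mi.parquet")
```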
