The Home Mortgage Disclosure Act (HMDA) Platform is a regulatory technology application that financial institutions use to submit mortgage information as described in the Filing Instruction Guide (FIG). The HMDA Platform parses the data submitted by mortgage lending institutions and validates it against the edits defined in the FIG (Syntactical, Validity, Quality, and Macro) before the data is submitted. The HMDA Platform supports quarterly and yearly filing periods. For detailed information on the Home Mortgage Disclosure Act (HMDA), check out the About HMDA page on the CFPB website.
Please watch this short video to see how the HMDA Platform transforms the data upload, validation, and submission process.
| Project | Repo Link | Description |
|---|---|---|
| Frontend | https://github.com/cfpb/hmda-frontend | ReactJS Front-end repository powering the HMDA Platform |
| HMDA-Help | https://github.com/cfpb/hmda-help | ReactJS Front-end repository powering HMDA Help - used to resolve and troubleshoot issues in filing |
| LARFT | https://github.com/cfpb/hmda-platform-larft | Repo for the Public Facing LAR formatting tool |
| HMDA Test Files | https://github.com/cfpb/hmda-test-files | Repo for automatically generating various different test files for HMDA Data |
| HMDA Census | https://github.com/cfpb/hmda-census | ETL for geographic and Census data used by the HMDA Platform |
| HMDA Data Science | https://github.com/cfpb/HMDA_Data_Science_Kit | Repo for HMDA Data science work as well as Spark codebase for Public Facing A&D Reports |
- TS and LAR File Specs
- End-to-End filing GIF
- Technical Overview
- HMDA Platform Technical Architecture
- HMDA Data Browser Technical Architecture
- Running with sbt
- One-line Cloud Deployment to Dev/Prod
- Docker Hub
- One-line Local Development Environment (No Auth)
- Automated Testing
- Postman Collection
- API Documentation
- Sprint Cadence
- Code Formatting
- Development Process
- Open source licensing info
- Credits and references
TS and LAR File Specs
The data is submitted in a flat, pipe (|) delimited TXT file. The text file is split into two parts: the Transmittal Sheet (TS), which is the first line of the file, and the Loan Application Register (LAR), which is all remaining lines of the file. Below are the links to the file specifications for data collected in years 2018 - current.
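As a rough illustration of the layout described above, the following Python sketch separates a submission into its TS line and its LAR lines. The helper name `split_submission` and the sample values are hypothetical; the real field layouts are defined in the file specifications, and this is not the platform's own parser.

```python
from typing import Iterable, Iterator, List, Tuple


def split_submission(lines: Iterable[str]) -> Tuple[List[str], Iterator[List[str]]]:
    """Split a pipe-delimited submission into TS fields and a lazy stream of LAR rows.

    Hypothetical helper for illustration only.
    """
    it = iter(lines)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("empty submission: expected a TS line") from None
    # The first line of the file is the TS record.
    ts_fields = first.rstrip("\r\n").split("|")
    # Every remaining non-blank line is a LAR row; rows are produced lazily,
    # so a large file does not have to be held in memory at once.
    lar_rows = (line.rstrip("\r\n").split("|") for line in it if line.strip())
    return ts_fields, lar_rows


# Made-up sample data; real files carry many more fields per line.
sample = [
    "1|Example Bank|2018\n",
    "2|EXAMPLELEI0000000001|loan-a\n",
    "2|EXAMPLELEI0000000001|loan-b\n",
]
ts, lar_iter = split_submission(sample)
lar = list(lar_iter)
print(ts[1], len(lar))
```

Because `split_submission` accepts any iterable of lines, an open file handle can be passed in directly.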
End-to-End filing GIF
The hmda-frontend uses Cypress to test the end-to-end filing process from the end user's perspective. The GIF below shows the automated filing process via Cypress, with no human intervention.
This repository contains the code for the entire public-facing HMDA Platform backend. The platform has been designed to accommodate the needs of the HMDA filing process by financial institutions, as well as the management, publication, aggregation, reporting, analysis, visualization, and download of the HMDA data set.
The HMDA Platform follows a loosely coupled, event-driven microservices architecture with API-first (API Documentation) design principles. The entire platform is built on open source frameworks and remains cloud-vendor agnostic.
The code base contained in this repository includes the following microservices that work together in support of the HMDA Platform.
HMDA Platform: The entire backend API for the public-facing filing platform. It processes the uploaded TXT files and validates them using non-blocking, streaming I/O. The APIs are built to process text files of various sizes, from small (a few lines) to large (1.5M+ lines), simultaneously without impeding the scalability or availability of the platform. The platform contains code for customizable data edits, a Domain Specific Language (DSL) for coding the data edits, and for submitting events to Kafka topics.
Check Digit: The entire backend API for the public-facing check digit tool. The Check Digit tool is used to (1) generate a two-character check digit based on a Legal Entity Identifier (LEI) and (2) validate that the check digit is calculated correctly for any complete Universal Loan Identifier (ULI). These APIs are built to process multi-row CSV files as well as one-time requests.
Institutions API: Read-only API for fetching details about an LEI. This microservice also listens to events put on the institutions-api Kafka topic for creating, updating, and deleting institution data in PostgreSQL.
Data Publisher: This microservice runs on a scheduled basis to make internal / external data available for research purposes via object stores such as S3. The schedule for the job is configurable via a Kubernetes ConfigMap.
Ratespread: Public facing API for the ratespread calculator. This calculator provides rate spreads for HMDA reportable loans with a final action date on or after January 1st, 2018. This API supports streaming CSV uploads as well as one-time calculations.
Modified LAR: Event-driven service that generates modified-lar reports. Each time a filer successfully submits data, the modified-lar microservice generates a modified-lar report and puts it in the public object store (e.g. S3). Any resubmission automatically regenerates the modified-lar report.
HMDA Reporting: Real-time, public-facing API for getting information (LEI number, institution name, and year) on institutions that have successfully submitted their data.
HMDA Analytics: Event-driven service that inserts, deletes, and updates information in PostgreSQL each time there is a successful submission. The inserted data is mapped to Census data to provide information for MSA/MDs. The service also adds race, sex, and ethnicity categorization to the data.
HMDA Dashboard: Authenticated APIs to view real-time analytics for the filings happening on the platform. The dashboard includes summarized statistics and data trends, and supports data visualizations via the frontend.
Email Service: Event-driven service that sends an automated email to the filer on each successful submission.
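To make the Check Digit description above more concrete, here is a minimal Python sketch of generating and validating a check digit. It assumes the ISO 7064 mod 97-10 procedure that Regulation C describes for ULIs (letters mapped to 10 through 35, two zeros appended, and the remainder mod 97 subtracted from 98), and that the input is the LEI followed by the institution's loan identifier. The function names are hypothetical and this is not the service's actual code.

```python
def _to_digits(value: str) -> str:
    """Map each character to decimal digits: 0-9 stay as-is, A/a=10 ... Z/z=35."""
    if not (value.isascii() and value.isalnum()):
        raise ValueError("only ASCII letters and digits are allowed")
    return "".join(str(int(ch, 36)) for ch in value)


def generate_check_digit(lei_and_loan_id: str) -> str:
    """Return the two-character check digit for an LEI plus loan identifier."""
    number = int(_to_digits(lei_and_loan_id) + "00")
    return f"{98 - number % 97:02d}"


def validate_uli(uli: str) -> bool:
    """A complete ULI is valid when its digit form leaves remainder 1 mod 97."""
    return int(_to_digits(uli)) % 97 == 1


# Sample identifier taken from the Regulation C worked example (assumed here).
base = "10Bx939c5543TqA1144M999143X"
check = generate_check_digit(base)
print(check, validate_uli(base + check))
```

Python's arbitrary-precision integers keep the sketch short; a streaming implementation could instead fold the digits into a running remainder.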
HMDA Platform Technical Architecture
The image below shows the cloud vendor agnostic technical architecture for the HMDA Platform.
HMDA Data Browser Technical Architecture
Running with sbt
git clone https://github.com/cfpb/hmda-platform.git
cd hmda-platform
export CASSANDRA_CLUSTER_HOSTS=localhost
export APP_PORT=2551
sbt
[...]
sbt:hmda-root> project hmda-platform
sbt:hmda-platform> reStart
Access the locally built platform
Build hmda-platform Docker image
The Docker image is built via the Docker plugin of sbt-native-packager:
sbt -batch clean hmda-platform/docker:publishLocal
The image can be built without running tests using:
sbt "project hmda-platform" dockerPublishLocalSkipTests
One-line Cloud Deployment to Dev/Prod
The platform and all of the related microservices explained above are deployed on Kubernetes using Helm. Each deployment is a single Helm command. Below is an example for the deployment of hmda-platform:
helm upgrade --install --force \
  --namespace=default \
  --values=kubernetes/hmda-platform/values.yaml \
  --set image.repository=hmda/hmda-platform \
  --set image.tag=<tag name> \
  --set image.pullPolicy=Always \
  hmda-platform \
  kubernetes/hmda-platform
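Since each deployment is a single Helm command, the same invocation can be parameterized per microservice. The Python sketch below only assembles the argument list; it assumes that every service follows the same layout as the example above (a chart under `kubernetes/<service>` and an image named `hmda/<service>`), which may not hold for every service. The function name and the tag value are hypothetical.

```python
import shlex
from typing import List


def helm_deploy_command(service: str, tag: str, namespace: str = "default") -> List[str]:
    """Build the helm argument list for one service (hypothetical helper)."""
    chart_dir = f"kubernetes/{service}"  # assumed chart location
    return [
        "helm", "upgrade", "--install", "--force",
        f"--namespace={namespace}",
        f"--values={chart_dir}/values.yaml",
        "--set", f"image.repository=hmda/{service}",  # assumed image naming
        "--set", f"image.tag={tag}",
        "--set", "image.pullPolicy=Always",
        service,
        chart_dir,
    ]


cmd = helm_deploy_command("hmda-platform", "example-tag")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where Helm is installed and configured for the target cluster.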
All of the containers built by the HMDA Platform are released publicly via Docker Hub: https://hub.docker.com/u/hmda
One-line Local Development Environment (No Auth)
The platform and its dependency services (Kafka, Cassandra, and PostgreSQL) can run locally using Docker Compose.
# Bring up hmda-platform, hmda-analytics, institutions-api
docker-compose up
The entire filing platform can be spun up using a one-line command. Using this locally running instance of Platform One, no authentication is needed.
# Bring up the hmda-platform
docker-compose up hmda-platform
Additionally, there are several environment variables that can be configured. The platform uses sensible defaults for each one, but they can be overridden if required:
CASSANDRA_CLUSTER_HOSTS
APP_PORT
HMDA_HTTP_PORT
HMDA_HTTP_ADMIN_PORT
HMDA_HTTP_PUBLIC_PORT
MANAGEMENT_PORT
HMDA_CASSANDRA_LOCAL_PORT
HMDA_LOCAL_KAFKA_PORT
HMDA_LOCAL_ZK_PORT
WS_PORT
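One possible way to apply such overrides from a script is sketched below in Python. The helper `compose_env` is hypothetical: it merges overrides onto a base environment and rejects names that are not in the list above, so a typo fails loudly instead of being silently ignored. The platform's default values are not reproduced here; the `APP_PORT` value is the one used in the sbt instructions.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names copied from the list above.
KNOWN_VARIABLES = (
    "CASSANDRA_CLUSTER_HOSTS", "APP_PORT", "HMDA_HTTP_PORT",
    "HMDA_HTTP_ADMIN_PORT", "HMDA_HTTP_PUBLIC_PORT", "MANAGEMENT_PORT",
    "HMDA_CASSANDRA_LOCAL_PORT", "HMDA_LOCAL_KAFKA_PORT",
    "HMDA_LOCAL_ZK_PORT", "WS_PORT",
)


def compose_env(overrides: Mapping[str, str],
                base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the base environment with the given overrides applied."""
    unknown = sorted(set(overrides) - set(KNOWN_VARIABLES))
    if unknown:
        raise KeyError(f"unrecognized variables: {', '.join(unknown)}")
    env = dict(os.environ if base is None else base)
    env.update({name: str(value) for name, value in overrides.items()})
    return env


env = compose_env({"APP_PORT": "2551"}, base={"PATH": "/usr/bin"})
print(env["APP_PORT"])
```

The returned mapping can be passed as `env=` to `subprocess.run(["docker-compose", "up", "hmda-platform"], env=env)`; variables that are not overridden keep whatever defaults the platform defines.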
The HMDA Platform takes a rigorous automated testing approach. In addition to Travis and CodeCov, we've prepared a suite of Newman test scripts that perform end-to-end testing of the APIs on a recurring basis. The Newman testing process is containerized and runs as a Kubernetes CronJob to act as a monitoring and alerting system. The platform and microservices are also load tested using Locust.
In addition to using Newman for our internal testing, we've created a HMDA Postman collection that makes it easier for users to perform an end-to-end filing of HMDA data, including upload, parsing data, flagging edits, resolving edits, and submitting data once the syntactical and validity (S/V) edits are resolved.
Our team works in two-week sprints. The sprints are managed as Project Boards. Backlog grooming happens every two weeks as part of Sprint Planning and Sprint Retrospectives.
Our team uses Scalafmt to format our codebase.
Below are the steps the development team follows to fix issues, develop new features, etc.
- Create a fork of this repository
- Work in a branch of the fork
- Create a PR to merge into master
- The PR is automatically built, tested, and linted using: Travis, Snyk, and CodeCov
- Manual review is performed, and the automatic scans above must pass
- The PR is deployed to development servers to be checked using Newman
- The PR is merged only by a different member of the dev team
CFPB is developing the HMDA Platform in the open to maximize transparency and encourage third-party contributions. If you want to contribute, please read and abide by the terms of the License for this project. Pull Requests are always welcome.
We use GitHub issues in this repository to track features, bugs, and enhancements to the software.
Open source licensing info
Credits and references
- https://github.com/cfpb/hmda-frontend - ReactJS Front-end repository powering the HMDA Platform
- https://github.com/cfpb/hmda-help - ReactJS Front-end repository powering HMDA Help - used to resolve and troubleshoot issues in filing
- https://github.com/cfpb/hmda-platform-larft - Repo for the Public Facing LAR formatting tool
- https://github.com/cfpb/hmda-test-files - Repo for automatically generating various different test files for HMDA Data
- https://github.com/cfpb/hmda-census - ETL for geographic and Census data used by the HMDA Platform
- https://github.com/cfpb/HMDA_Data_Science_Kit - Repo for HMDA Data science work as well as Spark codebase for Public Facing A&D Reports