Our main goal is to help BLDUP create a machine learning platform to analyze housing and construction project data.
Our Client: BLDUP
- BLDUP is a SaaS platform that provides valuable data for real estate investment and development.
- Background: some permit data sources lack permit classifications that BLDUP depends on to automate status updates site-wide
- Goal: to architect, train, and deploy a machine learning model as a microservice integrated into the BLDUP platform. The model predicts permit classifications, allowing BLDUP to achieve a greater level of automation
- Enhance real estate data for BLDUP clients
- Reduce BLDUP's costs by automating data labeling
We will implement and deploy a trained model that accurately predicts the classification of a permit from its description, exposed as an API endpoint that can be called repeatedly within the BLDUP infrastructure.
Goals:
- Identify the main shortcomings of BLDUP's data analysis
- Create and train the best identified machine learning model
- Create a web API to expose the model's functionality
- Deploy all models as microservices that can be used by BLDUP to improve their platform
BLDUP is a B2B platform for users who want to trade properties directly with the business owner or look up recent information about specific properties or construction projects.
It targets:
- Individuals looking for new opportunities to trade/buy properties
- Companies looking for new opportunities to trade/buy properties
- Individuals who are employed by companies within the construction industry
- Individuals looking to understand housing and construction data
This project will be scoped to the following:
- Identifying machine learning classification and estimation use cases in BLDUP's currently acquired data.
- Training, testing and deploying the identified machine learning models.
- Designing and developing an API to serve predictions from the trained model.
- Developing a microservice framework that can be used to deploy and maintain the machine learning model.
High-level outline of the solution: The current BLDUP service has two main shortcomings that we will address this semester. First, the current data processing pipeline does not classify different types of construction work well. Second, blank images are used as placeholders whenever the service isn’t able to find suitable images of the construction project. Machine learning can be used to mitigate both shortcomings. For the project categorization problem, we plan to explore state-of-the-art classification models (regression, CNN, MLP). For the image generation goal, we plan to explore generative adversarial networks (GANs) to generate images related to each type of real estate construction project.
Research and train a machine learning model to enhance the BLDUP data processing pipeline. The model will run as a microservice that interacts with the current BLDUP framework. It will enhance the pipeline by classifying public construction permit data by permit type, and it will be exposed through an API and integrated into the BLDUP platform. Currently, some property information is entered manually, and we want to train models that reduce the labor involved in adding or updating property information on the website.
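To make the classification task concrete, here is a minimal sketch of a description-to-permit-type classifier. It is a stand-in baseline (multinomial naive Bayes over word counts), not the BERT model the project trains, and the class name, the training rows, and the four labels used as targets are illustrative assumptions rather than BLDUP data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a permit description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesPermitClassifier:
    """Multinomial naive Bayes with add-one smoothing over word counts."""

    def fit(self, descriptions, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(descriptions, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, description):
        tokens = tokenize(description)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(doc_count / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training rows; real permit data will look different.
TRAIN = [
    ("Replace kitchen sink and water heater piping", "Plumbing"),
    ("Install new gas furnace and ductwork", "Mechanical"),
    ("Upgrade electrical panel and rewire outlets", "Electrical"),
    ("Erect new three story residential building", "Building"),
]

model = NaiveBayesPermitClassifier().fit(*zip(*TRAIN))
```

A baseline like this is mainly useful as a sanity check: a fine-tuned model that cannot beat simple word counts on held-out permits probably has a data or labeling problem.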
- Learn the demands and requirements from our mentor
- Explore BLDUP's platform to familiarize ourselves with the scope and implications of the project
- Understand the data acquired by BLDUP
- Investigate and research classification algorithms
- Identify at least 2 ML algorithms to implement
- Initial model exploration and paper discussion: https://www.ashrae.org/file%20library/conferences/specialty%20conferences/2020%20building%20performance/papers/d-bsc20-c083.pdf
- The paper tackles a problem similar to the one we are trying to solve
- Provides methods we can use to devise a solution for our project
- Paper results serve as a baseline for our work
- Initial analysis of the BERT machine learning model
- Demo2 Link: https://drive.google.com/file/d/1wfLV-DiJDwQIxkG1ziCiuqr1edmJ4VoB/view?usp=sharing
- Train the first BERT model on Boston data without permit IDs
- Tune BERT model parameters
- Formalize API implementation and deploy
- Demo3 Link: https://www.youtube.com/watch?v=P2m0KtebSH0
- Make the API route more robust
- Endpoint error checking for invalid inputs
- Containerize the endpoint and its resources with Docker
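The input checks mentioned above could look roughly like the sketch below. It is framework-agnostic and only illustrates the idea: the `description` field name, the length limit, and the chosen status codes are assumptions, not the deployed API's contract.

```python
import json

MAX_DESCRIPTION_CHARS = 2000  # hypothetical limit, not taken from the project


def parse_predict_request(raw_body):
    """Validate a request body and return (status_code, payload).

    On success the payload carries the cleaned description; on failure it
    carries an error message the endpoint can send back to the caller.
    """
    try:
        data = json.loads(raw_body)
    except (TypeError, ValueError):
        return 400, {"error": "body must be valid JSON"}
    if not isinstance(data, dict):
        return 400, {"error": "body must be a JSON object"}

    description = data.get("description")  # assumed field name
    if not isinstance(description, str):
        return 422, {"error": "'description' must be a string"}

    description = description.strip()
    if not description:
        return 422, {"error": "'description' must not be empty"}
    if len(description) > MAX_DESCRIPTION_CHARS:
        return 413, {"error": "'description' is too long"}
    return 200, {"description": description}
```

Keeping validation in a plain function like this lets it be unit tested without starting the web server or loading the model.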
- Investigate implementation for DevOps/CI framework
- Regroup the output categories of the construction permit types into 4 general types (Building, Mechanical, Plumbing, Electrical)
- Retrain model on heterogeneous data to make a generalized model
- Divide data into pre-construction and post-construction sets
- Finalize deployment automation through bash script
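The regrouping of permit types into the four general types can be sketched as a lookup table. The four target names come from the list above; the raw labels on the left are made-up examples, since the source datasets' actual label vocabularies are not listed here.

```python
GENERAL_TYPES = ("Building", "Mechanical", "Plumbing", "Electrical")

# Hypothetical raw labels; replace with the labels found in each data source.
RAW_TO_GENERAL = {
    "new construction": "Building",
    "interior renovation": "Building",
    "hvac installation": "Mechanical",
    "boiler replacement": "Mechanical",
    "plumbing permit": "Plumbing",
    "electrical permit": "Electrical",
    "electrical low voltage": "Electrical",
}


def regroup_permit_type(raw_type):
    """Map a raw permit type onto one of the four general types.

    Unknown labels raise ValueError instead of being guessed, so a new data
    source cannot silently pollute the training labels.
    """
    key = " ".join(raw_type.lower().split())  # normalize case and spacing
    try:
        return RAW_TO_GENERAL[key]
    except KeyError:
        raise ValueError(f"no general type configured for {raw_type!r}") from None
```

Failing loudly on unmapped labels is a design choice for this sketch; a pipeline that must not stop could instead route them to a review queue.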
Andrey Turovsky: andrey@bldup.com
Professor Orran Krieger: okrieg@bu.edu
Professor Peter Desnoyers: pjd-nu or pjd@ccs.neu.edu
Anqi Guo: anqianqi1
- From within the directory containing the Dockerfile, run the following to build the Docker image:
docker build -t webapp-build:latest .
- You should see the image listed when running:
docker image ls
- Finally, in order to run the app:
docker run -d -p 5000:5000 webapp-build
- Now, the app should be available at the following address (use Postman or a browser to verify):
http://127.0.0.1:5000/
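As an alternative to Postman or a browser, reachability can be checked from a script. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and it only confirms that something answers on the port, not that predictions are correct.

```python
import urllib.error
import urllib.request


def app_is_up(url="http://127.0.0.1:5000/", timeout=3.0):
    """Return True if a server answers at url, even with an HTTP error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 404 or 405), so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: nothing is listening.
        return False
```

Calling `app_is_up()` right after `docker run` may return False for a moment while the container starts, so a deployment script would typically retry a few times before giving up.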