pj-simpson/VDR-Integration

Virtual Data Room Platform to Cloud Storage Integration

This web app has been created as a proof of concept for a common integration use-case in the LegalTech world. Virtual Data Rooms commonly facilitate M&A Transactions and VDR Platforms can allow Law Firms to run many transactions at once, all managed from a single SaaS Platform.

Imagined use case

This app assumes a remote VDR service that contains many sites (i.e. Data Rooms!). It provides a basic dashboard for searching and viewing the metadata associated with these sites. Individual sites can be viewed in detail, and from there the remote system's REST API can be used to download a site's file and folder structure onto the server this app is running on, as well as into an object storage repository such as AWS S3.
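
As a rough illustration of the download step, here is a minimal sketch of mirroring a folder tree onto local disk. It is not the code in this repository: the nested-dict shape (`name`, `files`, `folders`, `content`) and the function name `mirror_folder` are assumptions made for the example, and in the real app the file contents would come from the remote REST API rather than from memory.

```python
from pathlib import Path


def mirror_folder(folder: dict, dest: Path) -> list:
    """Recreate `folder` (a hypothetical nested dict) under `dest`.

    Returns the paths of the files that were written, parents first.
    """
    target = dest / folder["name"]
    target.mkdir(parents=True, exist_ok=True)

    written = []
    # Write the files that live directly in this folder.
    for file_info in folder.get("files", []):
        path = target / file_info["name"]
        path.write_bytes(file_info["content"])
        written.append(path)

    # Recurse into sub-folders so the local layout matches the remote one.
    for sub_folder in folder.get("folders", []):
        written.extend(mirror_folder(sub_folder, target))
    return written
```

The returned list of paths could then be handed to an upload step, for example one call to Boto3's `upload_file` per path, to copy the same structure into S3.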

VDR Platforms generally charge by the amount of data stored, so an integration that transfers data out of the system efficiently is a valuable tool for system administrators.

The individual data rooms can also be 'soft' deleted (i.e. moved to a recycle bin) or even 'hard' deleted (irretrievably removed).

Technologies Used

  • Django
  • Django AllAuth
    • Allows us to use the remote VDR system to provide authentication.
  • Pydantic
    • Gives a much cleaner interface for the data transfer objects. Also provides validation.
  • Pytest
    • Easy monkey-patching and mocking are well suited to testing an app reliant on network calls to a 3rd party service.
  • Celery
    • We expect some 'jobs' to be long-running. This allows us to define them as 'tasks' and push to a background worker.
  • Django-Solo
    • Settings can be defined by users in an admin interface. This allows us to do so as a 'singleton' object.
    • Memcached
      • Caches the settings for the fastest possible access.
  • Boto3
    • For interacting with the S3 functionality.
  • Bottle
    • API Proxy between the Django App and remote system.
    • PyPy
    • Gunicorn
      • PyPy and Gunicorn Async workers to improve throughput for the API Proxy.

Screenshots

  • Data Room List
  • Data Room Detail
  • Task Buttons
  • Polling the Celery Progress
  • Reporting on the Background Jobs
  • Report Details
  • Settings

Warning about the endpoints

This repository is for demo purposes only! I would advise against trying to run it, as the requests to the remote system URLs are 'sort of' fake. I've deliberately obfuscated them, as I don't want to reveal the endpoints of the proprietary API. When running locally, I have an API proxy written in Bottle forwarding the requests to the 'real' service (not checked into version control). Post about it HERE. Having said that, if the purpose of each endpoint is clear to you and could easily 'map' onto a system you are familiar with, feel free to write your own proxy service!
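
If you do write your own proxy, its core is a translation from the placeholder paths to the real ones. The sketch below shows one way that translation might look. Everything in it is hypothetical: `UPSTREAM_BASE`, the entries in `PREFIX_MAP`, and `build_upstream_url` are invented for illustration and do not correspond to the obfuscated endpoints in this repository or to any real VDR API.

```python
from urllib.parse import urlencode

# Hypothetical upstream host; replace with the system you are mapping onto.
UPSTREAM_BASE = "https://vdr.example.com"

# Hypothetical mapping from placeholder path prefixes to real ones.
PREFIX_MAP = {
    "fake-sites": "rooms",
    "fake-folders": "directories",
}


def build_upstream_url(path, query=None):
    """Translate a placeholder request path into an upstream URL."""
    segments = [segment for segment in path.split("/") if segment]
    if not segments or segments[0] not in PREFIX_MAP:
        raise KeyError(f"no upstream mapping for {path!r}")

    # Only the first segment is rewritten; the rest is passed through.
    segments[0] = PREFIX_MAP[segments[0]]
    url = f"{UPSTREAM_BASE}/" + "/".join(segments)
    if query:
        url += "?" + urlencode(query)
    return url
```

A proxy written in Bottle (or any other framework) could call a function like this from a catch-all route handler and then forward the request to the resulting URL.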

ToDo

  • Only really considers the 'happy path': improve error handling.
  • Research more efficient algorithms for file-deletion (reverse level order traversal?).
  • More tests, including more integration tests.
  • Get Memcached to play more nicely with Django-Solo.
  • Fix the bug where the 'Site Copy' task returns its success message too early.
  • Improve the interface (add more filtering and ordering options).

About

Example Django App for VDR platform Integration
