Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabytes of data and make it accessible for data processing and analytical purposes by any cloud compute platform.
Herd-MDL is a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and understand the contents of your Herd managed data lake.
A Spark-based tool for comparing data at scale. It lets software development engineers compare many pair combinations of possible data sources. Multiple execution modes in multiple environments let the user generate a diff report as a Java/Scala-friendly DataFrame or as a file for future use. Comes with out of the box Spa…
Gatekeeper is a self-service web application that lets users request temporary access to EC2 and RDS instances running in AWS and gain that access instantly.
MLiy (pronounced “Emily”) is a machine-learning platform that allows data scientists to provision and manage processing power in the cloud. It provides an easy-to-use website to install customizable sets of machine learning software for use in data analysis and exploration. This allows data scientists to focus on data analysis rather than how to…
Extensions for WebDriver is an enhancement to the powerful WebDriver API, with robust features that keep your browser automation running smoothly. It includes a widget library, improved session management and extended functions over the existing WebDriver API.
Aphelion is a web application that captures and visualizes your AWS service usage limits. It continuously collects data in the background and presents it in easy-to-read graphs and charts.
MSL (pronounced 'Missile') stands for Mock Service Layer. Our tools enable quick local deployment of your UI code on Node and mocking of your service layer for fast, targeted testing.
FINRA open source projects landing page.
yum-nginx-api is a Go API for uploading RPMs to yum repositories, along with configurations for running NGINX to serve them. It is a deployable solution with Docker or a single 16MB dynamically linked Linux binary. yum-nginx-api enables CI tools to be used for uploading RPMs and managing yum repositories.
CTGrazer is code you can use to create an AWS Lambda Function that will collect all of your AWS CloudTrail logs and efficiently send them to your Splunk HEC (HTTP Event Collector) server.
DataGenerator is a Java library for systematically producing large volumes of data. DataGenerator frames data production as a modeling problem, with a user providing a model of dependencies among variables and the library traversing the model to produce relevant data sets.
Test your Hive scripts inside your favorite IDE with HiveQLUnit! Increase your developers' productivity by testing on all operating systems, including Windows, Linux and Mac OS X. Build continuous integration and delivery tests to control the releases of your big data products.
XCore is a framework for defining and executing automated tests. It supports automation code development in Java, test script development in XML via a domain-specific language, and execution and reporting via JUnit.
Plugin for Karma Test Runner to integrate MSL (Mock Service Layer)
Elastic Discovery helps applications that don't quite work in the cloud better handle autoscaling and other cloud events.
UMD bitcamp challenge solutions.