Welcome to the Carnegie Mellon Database Application Catalog (CMDBAC) wiki.
What is CMDBAC?
The CMDBAC is a collection of open-source database applications that you can run locally for benchmarking and experimentation. We have created an online repository that lets you search for applications whose workload properties are relevant to your research.
How Does CMDBAC Accomplish Its Goals?
The first component is a crawler that finds database applications hosted on open-source repositories (e.g., GitHub). The crawler uses heuristics that allow it to identify whether a project uses a database for storage. We target web-based applications that use well-known web frameworks, so we can treat a project as relevant if its source code references libraries from one of these frameworks.
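The heuristic described above can be sketched as a table of file-name and content patterns checked against a repository's files. This is only an illustration: the `MARKERS` table, the `detect_frameworks` function, and the specific patterns are assumptions made for this example, not the crawler's actual rules.

```python
import re

# Hypothetical marker table: (framework, file-path pattern, content pattern).
# These rules are illustrative assumptions, not CMDBAC's actual heuristics.
MARKERS = [
    ("Django", r"(^|/)requirements\.txt$", r"(?im)^django\b"),
    ("Django", r"\.py$", r"(?m)^\s*(from|import)\s+django\b"),
    ("Ruby on Rails", r"(^|/)Gemfile$", r"""gem\s+['"]rails['"]"""),
]


def detect_frameworks(files):
    """Return the set of frameworks whose markers appear in `files`.

    `files` maps a repository-relative path to that file's text.
    """
    found = set()
    for framework, path_pattern, content_pattern in MARKERS:
        for path, text in files.items():
            # A marker fires only when both the path and the content match.
            if re.search(path_pattern, path) and re.search(content_pattern, text):
                found.add(framework)
                break
    return found


repo = {
    "requirements.txt": "Django==1.8\npsycopg2\n",
    "app/views.py": "from django.shortcuts import render\n",
}
frameworks = detect_frameworks(repo)
```

A project with an empty result would be skipped. A real crawler would need more markers per framework and would have to cope with false positives, such as a dependency that is listed but never used.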
The second component is a tool that automatically deploys an application in a VM sandbox. Targeting applications that use the common web frameworks we support makes this step easier because each framework provides an object-relational mapping library that does not depend on a particular DBMS. Their configurations are also likely to follow the same pattern (e.g., setting the DBMS credentials in a common configuration file).
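As one example of such a common configuration file, a Django project reads its DBMS credentials from the `DATABASES` setting in its settings module. The sketch below builds that setting for a sandbox database. The `django_databases` helper and the short DBMS keys are hypothetical names chosen for this example and are not part of the CMDBAC deployment tool; the engine paths are the backends that ship with Django.

```python
# Engine paths shipped with Django; the short DBMS keys are this sketch's own naming.
DJANGO_ENGINES = {
    "postgresql": "django.db.backends.postgresql",
    "mysql": "django.db.backends.mysql",
    "sqlite": "django.db.backends.sqlite3",
}


def django_databases(dbms, name, user="", password="", host="", port=""):
    """Build the value of Django's DATABASES setting for one sandbox DBMS."""
    if dbms not in DJANGO_ENGINES:
        raise ValueError(f"unsupported DBMS: {dbms}")
    return {
        "default": {
            "ENGINE": DJANGO_ENGINES[dbms],
            "NAME": name,
            "USER": user,
            "PASSWORD": password,
            "HOST": host,
            "PORT": port,
        }
    }


# Placeholder credentials for a sandbox database.
settings_value = django_databases(
    "mysql", "sandbox_db", user="sandbox", password="secret", host="127.0.0.1", port="3306"
)
```

Because the application reaches the database through the ORM, switching the sandbox to another DBMS only requires a different `dbms` argument, provided the application does not use DBMS-specific SQL.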
The CMDBAC currently contains over 1,000 applications of varying complexity. We target web applications based on popular programming frameworks because (1) they are easier to find and (2) we can automate the deployment process. We support applications that use the Django, Ruby on Rails, Drupal, Node.js, and Grails frameworks.