In this project, my team and I implemented the Lineage-Based Data Storage (L-Store) architecture in Python. The database is based on, and heavily influenced by, the Industrial and Applications paper "L-Store: A Real-time OLTP and OLAP System" by Mohammad Sadoghi, Souvik Bhattacherjee, Bishwaranjan Bhattacharjee, and Mustafa Canim. It supports both OLTP (transactional processing) and OLAP (analytical processing), enabling a wide range of applications within a single system. One of the primary advantages of this system is its fast query performance, which comes from the lineage-based approach: the database maintains a detailed record of the data as it is processed and stored. By leveraging this lineage, we have created a system that features contention-free, lazy staging of columnar data. This process transitions data from a write-optimized format, which is ideal for OLTP, to a read-optimized format, which is ideal for OLAP, all while maintaining transactional consistency. As a result, the database supports both querying and retention of current and historical data.

The design balances in-memory processing with disk storage to keep reads and writes fast, which makes the database well suited to handling complex workloads efficiently. In addition, the lineage-based architecture uses a lock management system to minimize conflicts between threads, which helps maintain data integrity and improves overall performance. The test class for this project evaluates functionality and reports execution times for each of the fundamental operations: insertion, selection, updating, and deletion of records, as well as aggregation over data sets.
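The lineage idea described above can be illustrated with a small sketch. It is not the project's actual code: the class name `LineageTable`, its methods, and the `versions_back` parameter are hypothetical and chosen only for illustration. It assumes that original values live in read-optimized base records, that every update is appended as a tail record holding only the changed columns, and that an indirection pointer links each record to its newest tail record. Paging, disk storage, merging of tail records back into base records, and locking are deliberately left out.

```python
from typing import Dict, List, Optional, Tuple


class LineageTable:
    """Toy in-memory model: base records plus append-only tail records.

    Hypothetical names throughout; this is a sketch, not the project's API.
    """

    def __init__(self, num_columns: int) -> None:
        self.num_columns = num_columns
        # key -> original column values (never overwritten after insert)
        self.base: Dict[int, List[int]] = {}
        # key -> index of the newest tail record, or None if never updated
        self.indirection: Dict[int, Optional[int]] = {}
        # Each tail record: (index of previous tail record or None, changes),
        # where changes[i] is None if column i was not touched by that update.
        self.tail: List[Tuple[Optional[int], List[Optional[int]]]] = []

    def insert(self, key: int, values: List[int]) -> None:
        if len(values) != self.num_columns:
            raise ValueError("wrong number of columns")
        if key in self.base:
            raise KeyError(f"duplicate key {key}")
        self.base[key] = list(values)
        self.indirection[key] = None

    def update(self, key: int, changes: List[Optional[int]]) -> None:
        if len(changes) != self.num_columns:
            raise ValueError("wrong number of columns")
        # Append a tail record that points back at the previous version.
        previous = self.indirection[key]
        self.tail.append((previous, list(changes)))
        self.indirection[key] = len(self.tail) - 1

    def select(self, key: int, versions_back: int = 0) -> List[int]:
        """Return the record as of `versions_back` updates ago (0 = latest)."""
        tail_id = self.indirection[key]
        # Skip the newest `versions_back` updates to reach a historical version.
        for _ in range(versions_back):
            if tail_id is None:
                break
            tail_id = self.tail[tail_id][0]
        # Walk the lineage chain newest to oldest; the first value found for
        # a column is the most recent one visible at this version.
        result: List[Optional[int]] = [None] * self.num_columns
        while tail_id is not None:
            previous, changes = self.tail[tail_id]
            for i, value in enumerate(changes):
                if result[i] is None and value is not None:
                    result[i] = value
            tail_id = previous
        # Columns never updated fall back to the base record.
        base_values = self.base[key]
        return [base_values[i] if result[i] is None else result[i]
                for i in range(self.num_columns)]


table = LineageTable(num_columns=3)
table.insert(1, [10, 20, 30])
table.update(1, [None, 21, None])   # change only the second column
table.update(1, [None, None, 32])   # change only the third column
latest = table.select(1)                      # [10, 21, 32]
previous = table.select(1, versions_back=1)   # [10, 21, 30]
original = table.select(1, versions_back=2)   # [10, 20, 30]
```

Because updates only append, readers of the base record are never blocked by a writer overwriting it in place, and older versions stay reachable by following the chain. This is the property that lets a single store serve both current and historical queries in this sketch.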
This project demonstrates the potential of a lineage-based storage architecture and what it can achieve in terms of improved data management and performance. The approach showcases the system's flexibility and opens the door to future advancements in data storage and processing technologies.