marp | theme | class | backgroundColor |
---|---|---|---|
true | gaia | invert | black |
A laughably-minimalist, integer-only, read-only Relational Database Management System that makes the author question why they ever bothered to write it up!
19th August, 2020 Wednesday
- Relational Algebra Operators
- Integers Only
- No update operations
- No aggregate operations
- No nested queries
- No transaction management
- Single thread programming only
- No identifiers should have spaces in them
There are 2 kinds of commands in this database.
- Assignment statements
- Non-assignment statements
Note: Not all operators have been implemented; some have been omitted for you to implement in later phases
Non-assignment statements do not create a new table in the process (except LOAD, which just loads an existing table)
- LOAD
- LIST
- RENAME
- EXPORT
- CLEAR
- QUIT
The following haven't been implemented
- INDEX
Syntax:
LOAD <table_name>
- To successfully load a table, there should be a CSV file named <table_name>.csv consisting of comma-separated integers in the data folder
- None of the columns in the data file should have the same name
- Every cell in the table should have a value
Run: LOAD A
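The LOAD requirements can be checked before a table is accepted. The following is a minimal sketch, not SimpleRA's own loader: it assumes the first line of the CSV is a header row of column names, and the helper names `splitCsvLine`, `isInteger` and `isLoadableCsv` are hypothetical.

```cpp
#include <cctype>
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split one CSV line on commas, keeping empty cells (including a trailing one).
std::vector<std::string> splitCsvLine(const std::string &line) {
    std::vector<std::string> cells;
    std::string cell;
    for (char ch : line) {
        if (ch == ',') {
            cells.push_back(cell);
            cell.clear();
        } else if (ch != '\r') {
            cell.push_back(ch);
        }
    }
    cells.push_back(cell);
    return cells;
}

// True if the text is an optionally signed, non-empty run of digits.
bool isInteger(const std::string &text) {
    std::size_t start = (!text.empty() && (text[0] == '-' || text[0] == '+')) ? 1 : 0;
    if (start == text.size()) return false;
    for (std::size_t i = start; i < text.size(); ++i)
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) return false;
    return true;
}

// Hypothetical check (an assumption, not SimpleRA code): column names are
// unique and non-empty, and every data row has one integer per column.
bool isLoadableCsv(std::istream &in) {
    std::string line;
    if (!std::getline(in, line)) return false;  // no header row at all
    std::vector<std::string> header = splitCsvLine(line);
    std::set<std::string> names(header.begin(), header.end());
    if (names.size() != header.size() || names.count("") > 0) return false;
    while (std::getline(in, line)) {
        std::vector<std::string> cells = splitCsvLine(line);
        if (cells.size() != header.size()) return false;  // missing or extra cell
        for (const std::string &cell : cells)
            if (!isInteger(cell)) return false;  // empty or non-integer cell
    }
    return true;
}
```

Passing a `std::ifstream` opened on the CSV file works the same way as the string streams used for quick checks.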
Syntax
LIST TABLES
This command lists all tables that have been loaded or created using assignment statements
Run: LIST TABLES
Run: LOAD B, LIST TABLES
Syntax
PRINT <table_name>
- Displays the first PRINT_COUNT (global variable) rows of the table.
- Fewer rows are printed if the table has fewer than PRINT_COUNT rows
Run: PRINT B
Syntax
RENAME <fromColumnName> TO <toColumnName> FROM <table_name>
- Naturally, <table_name> should be a loaded table in the system and <fromColumnName> should be an existing column in the table
- <toColumnName> should not be another column in the table
Run: RENAME b TO c FROM B
Syntax
EXPORT <table_name>
- All changes made and all new tables created exist only within the system (temp file) and will be deleted once execution ends
- To keep changes made (RENAME and new tables), you have to export the table (data)
Run: EXPORT B
Syntax
CLEAR <table_name>
- Removes table from system
- The table has to have previously existed in the system to remove it
- If you want to keep any of the changes you've made to an old table, or want to keep a new table, make sure to export it first!
Run: CLEAR B
Syntax
QUIT
- Clears all tables present in the system (WITHOUT EXPORTING THEM) (temp file - empty)
Run: QUIT
Syntax:
INDEX ON <columnName> FROM <table_name> USING <indexing_strategy>
Where <indexing_strategy> could be
- BTREE: BTree indexing on column
- HASH: Index via a hashmap
- NOTHING: Removes index if present
- All assignment statements lead to the creation of a new table.
- Every statement is of the form <new_table_name> <- <assignment_statement>
- Naturally, in all cases <new_table_name> shouldn't already exist in the system
- CROSS
- PROJECTION
- SELECTION
The following haven't been implemented
- DISTINCT
- JOIN
- SORT
Syntax
<new_table_name> <- CROSS <table_name1> <table_name2>
- Both the tables being crossed should exist in the system
- If there are columns with the same name in the two tables, those columns are prefixed with the table name. If both tables being crossed are the same table, the table name in the prefix is suffixed with '1' and '2'
Run: cross_AA <- CROSS A A
A(A, B) x A(A, B) -> cross_AA(A1_A, A1_B, A2_A, A2_B)
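The naming rule can be sketched as a small function. This is an illustration under assumptions, not the SimpleRA implementation: the `_` separator is inferred from the example above, columns that do not clash are assumed to keep their original names, and `crossColumnNames` is a hypothetical helper.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: compute the column names of "CROSS left right".
// Assumptions: clashing columns become <prefix>_<column>; other columns
// are left unchanged; a self-cross uses <table>1 and <table>2 as prefixes.
std::vector<std::string> crossColumnNames(
    const std::string &leftTable, const std::vector<std::string> &leftColumns,
    const std::string &rightTable, const std::vector<std::string> &rightColumns) {
    const bool sameTable = (leftTable == rightTable);
    const std::string leftPrefix = sameTable ? leftTable + "1" : leftTable;
    const std::string rightPrefix = sameTable ? rightTable + "2" : rightTable;

    auto contains = [](const std::vector<std::string> &names, const std::string &name) {
        return std::find(names.begin(), names.end(), name) != names.end();
    };

    std::vector<std::string> result;
    for (const std::string &column : leftColumns)
        result.push_back(contains(rightColumns, column) ? leftPrefix + "_" + column : column);
    for (const std::string &column : rightColumns)
        result.push_back(contains(leftColumns, column) ? rightPrefix + "_" + column : column);
    return result;
}
```

With table A(A, B) crossed with itself, this yields A1_A, A1_B, A2_A, A2_B, matching the example.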
Syntax
<new_table_name> <- SELECT <condition> FROM <table_name>
Where <condition> is of either form
<first_column_name> <bin_op> <second_column_name>
<first_column_name> <bin_op> <int_literal>
Where <bin_op> can be any operator among {>, <, >=, <=, =>, =<, ==, !=}
- The selection command only takes one condition at a time
Run: R <- SELECT a >= 1 FROM A
S <- SELECT a > b FROM A
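A selection with a single condition can be sketched as a filter over integer rows. This is a sketch under assumptions: `=>` and `=<` are treated as alternative spellings of `>=` and `<=`, columns are addressed by position, and the names `Condition`, `evaluateBinOp` and `selectRows` are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: apply one of the listed operators to two integers.
// Assumption: "=>" and "=<" mean the same as ">=" and "<=".
bool evaluateBinOp(int left, const std::string &op, int right) {
    if (op == ">") return left > right;
    if (op == "<") return left < right;
    if (op == ">=" || op == "=>") return left >= right;
    if (op == "<=" || op == "=<") return left <= right;
    if (op == "==") return left == right;
    if (op == "!=") return left != right;
    throw std::invalid_argument("unknown operator: " + op);
}

// One condition: <column> <bin_op> <column>  or  <column> <bin_op> <int_literal>.
struct Condition {
    std::size_t leftColumn;   // position of <first_column_name>
    std::string op;           // the <bin_op>
    bool rightIsColumn;       // true for the column-vs-column form
    std::size_t rightColumn;  // used when rightIsColumn is true
    int literal;              // used when rightIsColumn is false
};

// Keep only the rows that satisfy the condition.
std::vector<std::vector<int>> selectRows(const std::vector<std::vector<int>> &rows,
                                         const Condition &condition) {
    std::vector<std::vector<int>> kept;
    for (const std::vector<int> &row : rows) {
        const int right = condition.rightIsColumn ? row[condition.rightColumn]
                                                  : condition.literal;
        if (evaluateBinOp(row[condition.leftColumn], condition.op, right))
            kept.push_back(row);
    }
    return kept;
}
```

For a table with columns (a, b), `Condition{0, ">=", false, 0, 1}` corresponds to `a >= 1` and `Condition{0, ">", true, 1, 0}` corresponds to `a > b`.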
Syntax
<new_table_name> <- PROJECT <column1>(,<columnN>)* FROM <table_name>
- Naturally, all columns should be present in the original table
Run: C <- PROJECT c FROM A
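Projection can be pictured as looking up each requested column's position and copying only those cells. The sketch below works on an in-memory table for illustration only; `SimpleTable` and `project` are hypothetical names, and the real system reads rows block by block rather than holding a whole table in memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory table used only for this illustration.
struct SimpleTable {
    std::vector<std::string> columns;
    std::vector<std::vector<int>> rows;
};

// Copy the requested columns, in the requested order, into a new table.
// Throws if a requested column is not present in the original table.
SimpleTable project(const SimpleTable &table, const std::vector<std::string> &wanted) {
    std::vector<std::size_t> indices;
    for (const std::string &name : wanted) {
        auto it = std::find(table.columns.begin(), table.columns.end(), name);
        if (it == table.columns.end())
            throw std::invalid_argument("no such column: " + name);
        indices.push_back(static_cast<std::size_t>(it - table.columns.begin()));
    }
    SimpleTable result;
    result.columns = wanted;
    for (const std::vector<int> &row : table.rows) {
        std::vector<int> projected;
        for (std::size_t index : indices) projected.push_back(row[index]);
        result.rows.push_back(projected);
    }
    return result;
}
```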
Syntax
<new_table_name> <- DISTINCT <table_name>
- Naturally, the table should exist
Example: D <- DISTINCT A
Syntax
<new_relation_name> <- JOIN <table1>, <table2> ON <column1> <bin_op> <column2>
Where <bin_op> means the same as it does in the SELECT operator
- Implicitly assumes <column1> is from <table1> and <column2> is from <table2>
Example: J <- JOIN A, B ON a == a
Syntax
<new_table_name> <- SORT <table_name> BY <column_name> IN <sorting_order>
Where <sorting_order> can be ASC or DESC
Example: S <- SORT A BY b IN ASC
Syntax
SOURCE <query_name>
- Special command that takes in a script file from the data directory
- The file name should end in ".ra", indicating it's a query file
- Used in the last phase of the project
- Buffer Manager
- Cursors
- Tables
- Executors
Run: LOAD A with debugger
see: load.cpp
- Splits the query into query units
see: syntacticParser.h syntacticParser.cpp
- Makes sure your query makes semantic sense
see: semanticParser.h semanticParser.cpp
Every command (COMMAND) has a file in the executors directory. Within that file you'll find 3 functions:
syntacticParseCOMMAND
semanticParseCOMMAND
executeCOMMAND
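The three stages can be sketched for one command. The function names follow the pattern above with COMMAND = CLEAR, but everything else here is an assumption for illustration: the signatures, the `ParsedClear` struct, the `tokenize` and `runClear` helpers, and the use of a plain set of names as the catalogue are all hypothetical and may differ from the actual code.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result of parsing "CLEAR <table_name>".
struct ParsedClear {
    std::string tableName;
};

// Split a query into whitespace-separated query units.
std::vector<std::string> tokenize(const std::string &query) {
    std::istringstream stream(query);
    std::vector<std::string> tokens;
    std::string token;
    while (stream >> token) tokens.push_back(token);
    return tokens;
}

// Stage 1: is the query shaped like "CLEAR <table_name>"?
bool syntacticParseCLEAR(const std::vector<std::string> &tokens, ParsedClear &parsed) {
    if (tokens.size() != 2 || tokens[0] != "CLEAR") return false;
    parsed.tableName = tokens[1];
    return true;
}

// Stage 2: does the query make sense, i.e. is the table in the system?
bool semanticParseCLEAR(const ParsedClear &parsed, const std::set<std::string> &catalogue) {
    return catalogue.count(parsed.tableName) > 0;
}

// Stage 3: carry out the command.
void executeCLEAR(const ParsedClear &parsed, std::set<std::string> &catalogue) {
    catalogue.erase(parsed.tableName);
}

// Run the stages in order; stop at the first stage that rejects the query.
bool runClear(const std::string &query, std::set<std::string> &catalogue) {
    ParsedClear parsed;
    if (!syntacticParseCLEAR(tokenize(query), parsed)) return false;
    if (!semanticParseCLEAR(parsed, catalogue)) return false;
    executeCLEAR(parsed, catalogue);
    return true;
}
```

Keeping the syntactic and semantic checks separate means a malformed query is rejected before the catalogue is ever consulted.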
- Load splits the table into blocks and stores them. For this we utilise the Buffer Manager
- The Buffer Manager follows a FIFO paradigm; it is essentially a queue
- The table catalogue is an index of tables currently loaded into the system
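The FIFO behaviour of the buffer can be sketched as a bounded queue of pages. This is a minimal sketch under assumptions, not the actual Buffer Manager: `Page`, `FifoBufferPool` and `getPage` are hypothetical names, a page is reduced to its name, and the disk read on a miss is left out.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical page: only the name is kept; a real page would hold rows.
struct Page {
    std::string name;
};

// Bounded pool that evicts the page that entered first (FIFO).
class FifoBufferPool {
public:
    // Assumption: a capacity of 0 is treated as 1 so the pool can hold a page.
    explicit FifoBufferPool(std::size_t capacity)
        : capacity_(capacity == 0 ? 1 : capacity) {}

    // Returns true on a hit. On a miss the page is added at the back,
    // evicting the oldest page first if the pool is full.
    // A hit does NOT move the page: eviction order depends only on arrival.
    bool getPage(const std::string &pageName) {
        for (const Page &page : pages_)
            if (page.name == pageName) return true;
        if (pages_.size() >= capacity_) pages_.pop_front();
        pages_.push_back(Page{pageName});
        return false;
    }

    std::size_t size() const { return pages_.size(); }

private:
    std::size_t capacity_;
    std::deque<Page> pages_;
};
```

Unlike an LRU policy, a page that was just read can still be the next one evicted if it was the first to arrive.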
A cursor is an object that acts like a pointer in a table. To read from a table, you need to declare a cursor.
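A cursor can be pictured as a pair of positions, one for the current block and one for the row within it. The sketch below is an illustration under assumptions: the table's blocks are already in memory, an empty row signals the end of the table, and the `Row`, `Block` and `getNext` names are hypothetical rather than the project's actual interface.

```cpp
#include <cstddef>
#include <vector>

using Row = std::vector<int>;
using Block = std::vector<Row>;

// Hypothetical cursor over a table stored as a sequence of blocks.
// The blocks must outlive the cursor, which only keeps a reference to them.
class Cursor {
public:
    explicit Cursor(const std::vector<Block> &blocks) : blocks_(blocks) {}

    // Returns the next row, moving to the next block when the current one
    // is exhausted. Returns an empty row once the whole table has been read.
    Row getNext() {
        while (blockIndex_ < blocks_.size()) {
            const Block &block = blocks_[blockIndex_];
            if (rowIndex_ < block.size()) return block[rowIndex_++];
            ++blockIndex_;
            rowIndex_ = 0;
        }
        return Row{};
    }

private:
    const std::vector<Block> &blocks_;
    std::size_t blockIndex_ = 0;
    std::size_t rowIndex_ = 0;
};
```

Reading a table then becomes a loop that calls `getNext()` until it returns an empty row.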
Run: R <- SELECT a == 1 FROM A with debugger
Every function call is logged in a file named "log"
- Phase 1: Code Familiarity (to be released today/tomorrow max)
- Phase 2: 2 Phase Merge Sort
- Phase 3: Indexing
- Phase 4: Indexing Optimized Operators
- Phase 5: Optimisation (SOURCE)
Note: may include duplicate elimination
- Tentative
- Plagiarism: F
- Not sticking to submission guidelines will lead to penalties and, at times, to a score of 0
- Project phases build on top of each other; failing to do one phase may hinder the rest
- If for any reason you fail to complete the project on time, please mail the Prof directly for extensions and not the TAs. The TAs have no jurisdiction in these cases
- No informal contact with the TAs. You may post on Moodle regarding any doubts; a forum will be created for this
- TA Hours have been updated on Moodle
- If you need to contact the TAs for matters that don't concern the whole class you may mail us here - datasystems_tas_m20@IIITAPhyd.onmicrosoft.com
- GitHub Repo - SimpleRA
- Build and run instructions will be provided later