This repository has been archived by the owner on May 12, 2024. It is now read-only.

Plan #5

Open
4 tasks done
Samyak2 opened this issue Jun 13, 2022 · 10 comments

Comments

@Samyak2
Collaborator

Samyak2 commented Jun 13, 2022

Summer of Code Weekly Plans

Week 1 Plan

@tyt2y3
Member

tyt2y3 commented Jun 14, 2022

Can you first write the data structures for the data store?
i.e. Database -> Schema -> Table ?

@Samyak2
Collaborator Author

Samyak2 commented Jun 14, 2022

Good idea at this stage. I have updated the plan.

@Samyak2
Collaborator Author

Samyak2 commented Jun 19, 2022

Week 1 update

Progress made this week

  • Signed copyright doc
  • Opened issues for week 1 plans (Plan #5)
  • Selected a parser for SQL - sqlparser-rs (Select an SQL parser #3)
  • Set up a Rust project with a temporary directory structure and sqlparser as a dependency. The files had empty structs and enums (feat: setup Rust project and directory structure #6)
  • Designed data structures for data storage (wip: data structures #8)
    • Designed the hierarchy of Database -> Schema -> Table
    • Columns are represented by a name and a sqlparser::ast::DataType -> avoids re-writing all possible types again.
    • A row is a Vec of ColumnKind, an enum whose variants each pair a type with the data stored for it.
      • This means that several sqlparser::ast::DataTypes can map to the same ColumnKind variant, and the storage for them will be the same.
      • The storage could be optimized by using highly specific variants for each type, but that would trade added complexity for performance. Since performance is a non-goal, keeping fewer variants is a good trade-off.
    • The VirtualMachine keeps a HashMap of registers.
      • A register can contain a Table or a Filter applied on a Table.
      • The VM executes an IntermediateCode which is just a vec of Instructions.
      • The design of the instructions is yet to be done.
    • Tests are placeholders for now as there's no functionality yet.
  • Since there wasn't any actual functionality implemented yet, I would consider this to be a part of the thinking or design phase - although at a code level.
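
To make the storage hierarchy above concrete, here is a minimal sketch, not the actual code from the linked PR. It assumes a local stand-in DataType enum in place of sqlparser::ast::DataType so that it compiles without dependencies, and the ColumnKind variants, the fits function and the push_row method are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

/// Stand-in for `sqlparser::ast::DataType` (assumption: only three variants shown).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int,
    BigInt,
    Text,
}

/// Stored value for one cell. Several `DataType`s share one variant.
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnKind {
    Integer(i64),
    Text(String),
}

/// A column is a name plus its declared SQL type.
#[derive(Debug, Clone, PartialEq)]
pub struct Column {
    pub name: String,
    pub data_type: DataType,
}

/// A row is a vec of stored values, one per column.
pub type Row = Vec<ColumnKind>;

#[derive(Debug, Default)]
pub struct Table {
    pub columns: Vec<Column>,
    pub rows: Vec<Row>,
}

#[derive(Debug, Default)]
pub struct Schema {
    pub tables: HashMap<String, Table>,
}

#[derive(Debug, Default)]
pub struct Database {
    pub schemas: HashMap<String, Schema>,
}

/// Hypothetical check: does the stored variant match the declared type?
/// `Int` and `BigInt` both map to `ColumnKind::Integer`.
pub fn fits(kind: &ColumnKind, data_type: &DataType) -> bool {
    matches!(
        (kind, data_type),
        (ColumnKind::Integer(_), DataType::Int | DataType::BigInt)
            | (ColumnKind::Text(_), DataType::Text)
    )
}

impl Table {
    /// Hypothetical helper: append a row after checking its length and types.
    pub fn push_row(&mut self, row: Row) -> Result<(), String> {
        if row.len() != self.columns.len() {
            return Err(format!(
                "expected {} values, got {}",
                self.columns.len(),
                row.len()
            ));
        }
        for (value, column) in row.iter().zip(&self.columns) {
            if !fits(value, &column.data_type) {
                return Err(format!("value {:?} does not fit column {}", value, column.name));
            }
        }
        self.rows.push(row);
        Ok(())
    }
}
```

The sketch covers only the Database -> Schema -> Table part. Mapping Int and BigInt to the same Integer variant shows the trade-off described above: fewer variants to handle, at the cost of wider storage than strictly needed.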

Learnings:

  • I need to provide better deadlines and scope the tasks better - one of the main tasks for this week was unbounded and did not have a concrete goal. Chris gave a better idea - designing the data structures - that I went ahead with.
  • Prefer using to_owned() instead of to_string() when converting from a &str to a String -> Thank you Sanford for this.
  • Explored some of sqlparser
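
A small illustration of the to_owned() point, with hypothetical function names. Both calls produce the same String from a &str. The preference is about intent: to_owned() says "make an owned copy of this borrowed value", while to_string() goes through the ToString trait, which is meant for anything that can be displayed.

```rust
/// Hypothetical helper: build an owned column name from a borrowed one.
pub fn column_name(raw: &str) -> String {
    // `to_owned` states the intent directly: borrowed `&str` -> owned `String`.
    raw.to_owned()
}

/// Same result, written with `to_string` for comparison.
pub fn column_name_via_to_string(raw: &str) -> String {
    // `to_string` comes from the `ToString` trait, which is implemented
    // for every type that implements `Display`.
    raw.to_string()
}
```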

Next steps:

  • Add doc strings for all the data structures (structs and enums) and methods.
  • Design the Instruction set.

tyt2y3 changed the title from "Week 1 Plan" to "Plan" on Jun 19, 2022

@tyt2y3
Member

tyt2y3 commented Jun 19, 2022

Week 2 Plan

@Samyak2
Collaborator Author

Samyak2 commented Jun 26, 2022

Week 2 Update

Progress made this week

Learnings

  • Explored quite a bit of MySQL to see what kind of queries it supports.
    • Learnt about all the different types of JOINs possible and how MySQL supports them.
    • Learnt to read MySQL's grammar - it uses a variant of EBNF
  • Explored a few other databases written in Rust to see how they implement a nullable Value, among other things.
  • Learnt how wrapping a built-in type in a new type helps when the value has a specific semantic meaning.
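
As an illustration of the last point, here is a minimal sketch of a new type wrapping a built-in one. TableName, ColumnName and qualified are hypothetical names for this example, not taken from the project.

```rust
/// Hypothetical new type: a table name is a `String` with a specific meaning.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct TableName(pub String);

/// Hypothetical new type: a column name, deliberately distinct from `TableName`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ColumnName(pub String);

/// Because the two names are different types, swapping the arguments
/// is a compile error instead of a silent bug.
pub fn qualified(table: &TableName, column: &ColumnName) -> String {
    format!("{}.{}", table.0, column.0)
}
```

With plain String parameters, qualified(column, table) would compile and produce the wrong output. With the wrappers, the compiler rejects it.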

@Samyak2
Collaborator Author

Samyak2 commented Jun 26, 2022

Week 3 Plan

@tyt2y3
Member

tyt2y3 commented Jun 26, 2022

Yeah actually the MySQL documentation is awesome. I remember reading it a lot when I first wrote SeaQuery. And reading it a second time when writing SeaSchema

@Samyak2
Collaborator Author

Samyak2 commented Jun 27, 2022

> Yeah actually the MySQL documentation is awesome. I remember reading it a lot when I first wrote SeaQuery. And reading it a second time when writing SeaSchema

The documentation is pretty good. Although, in this particular case of JOINs, I prefer the PostgreSQL version. It explains clearly what types of JOINs are possible and the difference between them. In contrast, the MySQL version reads more like a list of quirks with JOINs in MySQL. To be clear, it also provides the same information as the PostgreSQL version, but it's much more scattered and harder to read.

@Samyak2
Collaborator Author

Samyak2 commented Jul 5, 2022

Week 4 Plan

@Samyak2
Collaborator Author

Samyak2 commented Jul 19, 2022

Week 5 Plan
