Data Schema #226
Comments
I will take care of bug_count.html and line_metrics_showcase.html and Waylon will handle the other two. |
@Rubegen @waylonho I've modified the private project so you can both commit the .mwb to the private repo. Simply follow the instructions on how to commit it and send the file there. In addition, please post the .png of what you currently have as a comment here (preferably by tomorrow so I have time to look at it again). For this week, let's just consider this Notebook since the unit tests will keep you both busy too: The table of interest is towards the end. Try to also understand from the Notebook what the data is trying to tell us, and we can iterate on call. In addition, try to understand the "data granularity". Up to this point we were looking at a change to a file (a commit) as the granularity of the data. You will notice this table contains commit intervals. |
Adding for the record, since it was only discussed on call: Task for Week 4 - Sep 22 was to add the Git Log to this database schema. @waylonho did you find the schema with the table that was due for this week to post here? |
Hi Waylon, I am not sure the table for Exploring Git Log makes much sense: is it just a list of filepaths? Could you explain it to me? The social smells table connects via commit interval, but you have to design the tables around the fact that it is a commit interval rather than a single commit. In theory they would connect to your Explore Git Log table, since that's where your commit hashes are, but I am not sure what is going on in your current table. Can you paste a screenshot of the table you are using? It should have been a project_git table. |
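To make the commit-interval point concrete, here is a minimal sketch (not the actual Kaiaulu schema) of how a social smells row could reference an interval of commits instead of a single commit. Only `project_git` and `commit_hash` come from this thread; `author_datetime`, `start_commit`, `end_commit`, `smell_value`, the sample rows, and the half-open interval convention are all assumptions for illustration.

```python
import sqlite3

def build_demo():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE project_git (
        commit_hash     TEXT PRIMARY KEY,
        author_datetime TEXT NOT NULL  -- assumed ISO 8601, so text order equals time order
    );
    CREATE TABLE social_smells (
        interval_id  INTEGER PRIMARY KEY,
        start_commit TEXT NOT NULL REFERENCES project_git(commit_hash),
        end_commit   TEXT NOT NULL REFERENCES project_git(commit_hash),
        smell_value  REAL               -- placeholder for a smell metric
    );
    """)
    conn.executemany("INSERT INTO project_git VALUES (?, ?)", [
        ("a1", "2021-01-01T00:00:00"),
        ("b2", "2021-01-15T00:00:00"),
        ("c3", "2021-02-01T00:00:00"),
        ("d4", "2021-03-01T00:00:00"),
    ])
    conn.executemany("INSERT INTO social_smells VALUES (?, ?, ?, ?)", [
        (1, "a1", "c3", 0.5),
        (2, "c3", "d4", 0.2),
    ])
    return conn

def commits_in_interval(conn, interval_id):
    """List commits inside an interval, assuming [start, end) by timestamp."""
    rows = conn.execute("""
        SELECT g.commit_hash
        FROM social_smells s
        JOIN project_git s0 ON s0.commit_hash = s.start_commit
        JOIN project_git s1 ON s1.commit_hash = s.end_commit
        JOIN project_git g
          ON g.author_datetime >= s0.author_datetime
         AND g.author_datetime <  s1.author_datetime
        WHERE s.interval_id = ?
        ORDER BY g.author_datetime
    """, (interval_id,)).fetchall()
    return [row[0] for row in rows]

conn = build_demo()
print(commits_in_interval(conn, 1))
print(commits_in_interval(conn, 2))
```

The design choice worth noticing is that one social smells row maps to many `project_git` rows, so the relationship runs through the two boundary commits and not through a single `commit_hash` column.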
I used the table from http://itm0.shidler.hawaii.edu/kaiaulu/articles/gitlog_showcase.html#visualizing-the-git-log. Looking over it, it probably makes sense it was supposed to be the table with commit_hash from http://itm0.shidler.hawaii.edu/kaiaulu/articles/gitlog_entity_showcase.html. Will fix shortly. |
Hi, I see now. There is a misunderstanding: When we last discussed the Project Git Log Notebook, I mentioned the table was not available on itm0. I sent it on Zoom, and you confirmed downloading it. The table you should be using is this: https://drive.google.com/drive/u/2/folders/1XdSZ4YEZFYRTKz8EGAf2UnysKpyoLrWW Git Log Entity Table is something else we have not discussed yet. |
Hi, this is good, thank you! Since we are closing in on your milestone report, and I want to make sure you can devote time to finish the git sample fake data, I expanded the diagram for you both. A few caveats that may be incomplete or missing on what I put together:
In essence, I am taking away the task of reverse engineering the relevant tables from the Notebooks, and reducing the effort of seeing how they connect in this .mwb file. Your goal is now to improve this diagram rather than create one from scratch. There are also some tables that are stubs and need thinking on how to connect. You can think of these as the last steps. If you can work those out on your own and with minimal Q&A towards the end, I'd say you both got the deliverable and the mindset to do this in the future.
I also want you to check how I used the layers from the left pane, and how I established the foreign keys. On the .mwb, double click any table, and at the bottom there is a list of tabs. Pick a table that has a connection, and select the foreign key tab. One of the two connected tables will have information there. Use that pane instead of drag and drop so it doesn't generate additional columns. If the left pane and bottom tabs don't make sense to you, please ask me on call.
For the tables where you have no idea where to begin, you should know the drill by now: ask questions here. As you can see, Kaiaulu's scope of analysis is fairly large, so it is vital you continue to ask where things are if they are not clear. That, in itself, much like your experience report, is an indication that documentation is lacking on the Notebooks too. That being said, it is important you at least try to locate or guess where the information is.
The relationships on the expanded diagram, I hope, should help you see the forest for the trees for what we last talked about: "Source Code", "Git Log", "Issue Tracker". I also included table stubs for the ones you made, and layers so you can more easily see how some tables relate. Please try to understand what the data is capturing too, or you will be unable to explain what you are working on in your presentations. I promise there is logic to the madness.
Moving forward, let's try to pick one layer at a time and refine it as we go, comparing against the Notebooks. You can find the editable .mwb here, where some sample files that are not visible on itm0 are stored: https://drive.google.com/drive/u/2/folders/1XdSZ4YEZFYRTKz8EGAf2UnysKpyoLrWW Let me know if you have questions. |
Hi. I've connected some of the table stubs that weren't connected before. I see how everything is set up and connected more thoroughly now, but I still have some points of confusion: I added the project_git table and connected it to Social Smells, is this correct? Also, just to make sure: there are some tables in the previous ERD that aren't on the bigger one. Are we supposed to add every table from our old diagram that is missing from the bigger one? Just want to clarify because I am not sure. Will edit it more as we go on. |
You should not add the Don't worry about the social smells table for now. Try to fill the information on Source Code, Commit and Issue Tracker. Have a look on Google Drive too, I should have added a few more tables there. Feel free to check with me here the URL of the tables before filling information to save you time. |
Hello, I have added information from the tables on Google Drive to the diagram. Just one more question, is the table at http://itm0.shidler.hawaii.edu/kaiaulu/articles/depends_showcase.html the "files" table in the Source Code part? |
@waylonho No, the table in the notebook is the "dependencies" table. Depends, the tool the Notebook depends on, outputs a graph of dependencies between files. A graph is represented by a "nodes" table, and an "edgelist" table. The table you see on the Notebook is just the edgelist table. So it is safe to say the "files" table is complete, just fill the "dependencies" table :) |
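As a rough illustration of the nodes/edgelist split described above, the sketch below derives both tables from a list of file-to-file dependencies. The column names (`node_id`, `filepath`, `from_id`, `to_id`, `weight`) and the sample paths are hypothetical and are not taken from the actual output of Depends or from the Notebook.

```python
def edgelist_to_tables(edges):
    """Split (source_file, target_file, weight) tuples into two tables.

    Returns a "nodes" table with one row per distinct file and an
    "edgelist" table whose rows reference the nodes by id.
    """
    node_ids = {}
    for src, dst, _weight in edges:
        for name in (src, dst):
            # Assign ids in order of first appearance, starting at 1.
            node_ids.setdefault(name, len(node_ids) + 1)
    nodes = [{"node_id": i, "filepath": name} for name, i in node_ids.items()]
    edgelist = [
        {"from_id": node_ids[src], "to_id": node_ids[dst], "weight": weight}
        for src, dst, weight in edges
    ]
    return nodes, edgelist

# Hypothetical dependencies between three files.
sample_edges = [
    ("src/a.py", "src/b.py", 2),
    ("src/b.py", "src/c.py", 1),
    ("src/a.py", "src/c.py", 1),
]
nodes, edgelist = edgelist_to_tables(sample_edges)
for row in nodes:
    print(row)
for row in edgelist:
    print(row)
```

In ERD terms, `from_id` and `to_id` would both be foreign keys into the nodes table, which is why the "files" table can be complete while the "dependencies" table still needs filling in.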
Not sure if it's supposed to be, but the project_dependencies file in the src folder on drive is a json. |
Kaiaulu's architecture was partially planned and partially built as features were added. Without familiarity with the domain, it is not always easy to see where the tables connect.
The goal of this issue is to make the relationships between some of these tables more explicit and to give you a better understanding of how the issue tracker data connects to the other tables the tool can mine.
Our goal is to examine a few Notebooks on itm0 (note you do not need to compile them, just read them straight from itm0) and identify how the tables can be connected. Contrary to #225, there is a single deliverable here: the MySQL Workbench file with an entity relationship diagram of the data, so you are expected to work together on this issue. The tables in the Notebooks should be represented as tables and columns in MySQL Workbench, with related columns connected. If you see the potential for columns to be connected, even if the information is not an exact match, please note them too.
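One possible way to spot candidate connections between columns, sketched here under assumptions, is to compare the values two columns share. The table names, column names, sample values, and the 0.5 threshold below are hypothetical; this is a helper for exploration, not part of Kaiaulu.

```python
def overlap_ratio(left_values, right_values):
    """Share of distinct values in the smaller column that also appear in the other."""
    left, right = set(left_values), set(right_values)
    if not left or not right:
        return 0.0
    return len(left & right) / min(len(left), len(right))

def candidate_links(tables, threshold=0.5):
    """Compare every column pair across tables.

    tables: {table_name: {column_name: [values]}}
    Returns (left_column, right_column, score) tuples, best score first.
    """
    names = sorted(tables)
    found = []
    for i, t1 in enumerate(names):
        for t2 in names[i + 1:]:
            for c1, v1 in tables[t1].items():
                for c2, v2 in tables[t2].items():
                    score = overlap_ratio(v1, v2)
                    if score >= threshold:
                        found.append((f"{t1}.{c1}", f"{t2}.{c2}", round(score, 2)))
    return sorted(found, key=lambda item: -item[2])

# Hypothetical sample data: an issue row that mentions a commit hash.
sample_tables = {
    "project_git": {
        "commit_hash": ["a1", "b2", "c3"],
        "file_pathname": ["x.py", "y.py", "x.py"],
    },
    "issue_tracker": {
        "issue_key": ["K-1", "K-2"],
        "linked_commit": ["a1", "c3"],
    },
}
print(candidate_links(sample_tables))
```

A high score only suggests a relationship worth noting on the diagram; whether it should become a foreign key still depends on what the data actually captures.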
The Notebooks you should divide between you are (see the table at the end of each Notebook):
Please let me know in this issue how you will split the Notebooks between the two of you for the next week. For Notebook questions, post on Discussions. For questions concerning the task, post here. Please submit the .mwb as a pull request to the private repo by Friday 09/15.