Eoghan4/MachineLearningJava


OOP Assignment - Eoghan McGough

Java Prediction Application

Video Demonstration

My assigned theme was predicting whether or not a product will be in stock. However, the application can be used on any dataset supplied as a csv file in the same format.

It reads the csv file and allows the user to train the predictor on the data, add custom rows to the dataset and retrain the predictor, predict whether or not an item will be in stock based on the 4 features (the predicted label will differ if another dataset is provided), and finally test the predictor's accuracy by splitting the data into training and test data.


Frequency Table

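| DemandLevel | SellingSpeed | RestockFrequency | SupplierReliability | no | yes | Grand Total | % InStock (yes) |
|---|---|---|---|---|---|---|---|
| High | Fast | Frequent | Reliable | 6 | 7 | 13 | 54% |
| High | Fast | Frequent | Unreliable | 7 | 5 | 12 | 42% |
| High | Fast | Rare | Reliable | 5 | 4 | 9 | 44% |
| High | Fast | Rare | Unreliable | 9 | 9 | 18 | 50% |
| High | Slow | Frequent | Reliable | 7 | 8 | 15 | 53% |
| High | Slow | Frequent | Unreliable | 11 | 6 | 17 | 35% |
| High | Slow | Rare | Reliable | 4 | 5 | 9 | 56% |
| High | Slow | Rare | Unreliable | 3 | 11 | 14 | 79% |
| Low | Fast | Frequent | Reliable | 2 | 6 | 8 | 75% |
| Low | Fast | Frequent | Unreliable | 4 | 3 | 7 | 43% |
| Low | Fast | Rare | Reliable | 11 | 2 | 13 | 15% |
| Low | Fast | Rare | Unreliable | 8 | 6 | 14 | 43% |
| Low | Slow | Frequent | Reliable | 8 | 9 | 17 | 53% |
| Low | Slow | Frequent | Unreliable | 5 | 5 | 10 | 50% |
| Low | Slow | Rare | Reliable | 7 | 3 | 10 | 30% |
| Low | Slow | Rare | Unreliable | 5 | 9 | 14 | 64% |
| Total | | | | 102 | 98 | 200 | |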

Note: The frequency table for the data (generated by ChatGPT) can also be found as a Pivot Table in the ProductIsInStock(Excel).xlsx file.
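
As a rough illustration of how a frequency table like the one above can be built, the sketch below counts the no/yes labels for each combination of the four features. The class name `FrequencyTableSketch`, the method `count`, and the row layout (four features followed by a yes/no label) are assumptions made for this example, not the actual code in DataHandler.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the repository's DataHandler implementation.
final class FrequencyTableSketch {

    /**
     * Counts labels per feature combination.
     * Each row is assumed to hold four features followed by a yes/no label.
     * In the returned arrays, index 0 is the "no" count and index 1 the "yes" count.
     */
    static Map<String, int[]> count(List<String[]> rows) {
        Map<String, int[]> table = new LinkedHashMap<>();
        for (String[] row : rows) {
            // The four features joined together identify one line of the table.
            String key = String.join(",", row[0], row[1], row[2], row[3]);
            int[] counts = table.computeIfAbsent(key, k -> new int[2]);
            if (row[4].equalsIgnoreCase("yes")) {
                counts[1]++;
            } else {
                counts[0]++;
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
            new String[] {"High", "Fast", "Frequent", "Reliable", "yes"},
            new String[] {"High", "Fast", "Frequent", "Reliable", "no"},
            new String[] {"Low", "Slow", "Rare", "Unreliable", "yes"});
        count(rows).forEach((key, c) ->
            System.out.printf("%s no=%d yes=%d%n", key, c[0], c[1]));
    }
}
```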


Class Structure

My application is split into 5 classes:

  • Control
  • FileHandler
  • DataHandler
  • DataItems
  • Screen

Control

Starts the application. It contains the main method, which instantiates the Screen class.

FileHandler

Simple class for handling the reading and parsing of csv files.
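
A minimal sketch of this kind of csv reading is shown below. It assumes the file has one header row, uses plain commas as separators, and has no quoted fields. The names `CsvSketch`, `parse` and `read` are hypothetical and are not taken from the FileHandler class.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not the repository's FileHandler implementation.
final class CsvSketch {

    /** Splits every non-blank line after the header into trimmed fields. */
    static List<String[]> parse(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) { // starting at 1 skips the assumed header row
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue; // ignore blank lines, e.g. at the end of the file
            }
            String[] fields = line.split(",");
            for (int j = 0; j < fields.length; j++) {
                fields[j] = fields[j].trim();
            }
            rows.add(fields);
        }
        return rows;
    }

    /** Reads the whole file into memory and parses it. */
    static List<String[]> read(Path csvFile) throws IOException {
        return parse(Files.readAllLines(csvFile));
    }
}
```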

DataHandler

This class does a lot of the heavy lifting of the application. It contains methods for training the predictor based on the data provided by the FileHandler class, generating frequency tables for the data, and testing the accuracy of the predictor by splitting the data into 150 lines of training data and 50 lines of test data (stratified so as to keep the same ratio of yes/no in both sets).
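
One way a stratified split like this could be done is sketched below. It is only an illustration under stated assumptions: the label is the last column of each row, each label's share of the test set is proportional to its share of the data, and the last label takes whatever is left so the test set has exactly the requested size. The class `StratifiedSplitSketch` and its method are hypothetical names, and the actual DataHandler logic may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the repository's DataHandler implementation.
final class StratifiedSplitSketch {
    final List<String[]> train = new ArrayList<>();
    final List<String[]> test = new ArrayList<>();

    /** Splits rows so the test set keeps roughly the same label ratio as the full data. */
    static StratifiedSplitSketch split(List<String[]> rows, int testSize) {
        // Group rows by label (assumed to be the last column), in first-seen order.
        Map<String, List<String[]>> byLabel = new LinkedHashMap<>();
        for (String[] row : rows) {
            byLabel.computeIfAbsent(row[row.length - 1], k -> new ArrayList<>()).add(row);
        }

        StratifiedSplitSketch result = new StratifiedSplitSketch();
        int remainingTest = testSize;
        int groupsLeft = byLabel.size();
        for (List<String[]> group : byLabel.values()) {
            // Proportional share for this label; the last label takes the remainder.
            int share = (groupsLeft == 1)
                ? remainingTest
                : (int) Math.round((double) testSize * group.size() / rows.size());
            share = Math.max(0, Math.min(share, group.size()));
            result.test.addAll(group.subList(0, share));
            result.train.addAll(group.subList(share, group.size()));
            remainingTest -= share;
            groupsLeft--;
        }
        return result;
    }
}
```

With the 102 no / 98 yes rows from the frequency table and a test size of 50, this sketch puts 26 no rows and 24 yes rows in the test set when the no rows are seen first, leaving 150 rows for training.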

DataItems

Simple class for allowing the DataHandler class to instantiate data items as objects, with their four features and one label as attributes.
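
For illustration, a data item with four features and one label could look like the sketch below. The class name `DataItemSketch`, the use of a list for the features, and the getter names are assumptions for this example. The real DataItems class may store its attributes differently.

```java
import java.util.List;

// Hypothetical sketch: not the repository's DataItems implementation.
final class DataItemSketch {
    private final List<String> features; // e.g. High, Fast, Frequent, Reliable
    private final String label;          // e.g. yes or no

    DataItemSketch(List<String> features, String label) {
        if (features.size() != 4) {
            throw new IllegalArgumentException("expected exactly 4 features");
        }
        this.features = List.copyOf(features); // defensive, unmodifiable copy
        this.label = label;
    }

    List<String> getFeatures() {
        return features;
    }

    String getLabel() {
        return label;
    }
}
```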

Screen

GUI class for the application. Handles all visual elements such as buttons and text boxes, and instantiates the FileHandler and DataHandler classes to implement functionality.


Functionality

When the user opens the application, they are greeted with a screen that includes a number of buttons, text boxes and labels. At the top right of the screen is a label indicating that no file has been selected. Clicking the Select Training Data button lets the user navigate a file explorer and choose a csv file from which to load their data.

Once the data is loaded, the user is presented with a number of options. The predictor can be trained with the provided data by clicking the Train button. New rows can be added by filling in the text boxes labelled with the data's features, clicking the desired label (yes/no) and clicking the Add Row button. Once a row has been added, the user can retrain the predictor with these new rows by clicking the Train button once again.

Once the user has filled the text boxes with the desired features and trained the predictor, they can click the Predict button, which will give a prediction (yes/no) with the confidence (%).

If the predictor has been trained, the user can also click the Test Accuracy button, which will split the data into 2 sets, 150 rows of (stratified) training data and 50 rows of test data, and will test the 50 predictions. Based on this, it will display the accuracy of the model.
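
The README does not say how the prediction and its confidence are calculated. One simple possibility, shown here purely as an assumption, is to look up the no/yes counts for the entered feature combination in the frequency table, predict the more common label, and report its share of the rows as the confidence. `FrequencyPredictorSketch` and `predict` are hypothetical names.

```java
// Hypothetical sketch: the application's real prediction method is not described.
final class FrequencyPredictorSketch {

    /** Predicts the majority label for one feature combination, with its share as confidence. */
    static String predict(int noCount, int yesCount) {
        int total = noCount + yesCount;
        if (total == 0) {
            return "unknown (no matching rows)";
        }
        boolean predictYes = yesCount >= noCount; // ties go to "yes" in this sketch
        long confidence = Math.round(100.0 * Math.max(yesCount, noCount) / total);
        return (predictYes ? "yes" : "no") + " (" + confidence + "%)";
    }

    public static void main(String[] args) {
        // Counts for High, Slow, Rare, Unreliable in the frequency table: 3 no, 11 yes.
        System.out.println(predict(3, 11)); // prints "yes (79%)"
        // Counts for Low, Fast, Rare, Reliable: 11 no, 2 yes.
        System.out.println(predict(11, 2)); // prints "no (85%)"
    }
}
```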


What I Would Add

If given more time, I would improve the GUI by adding multiple screens, rather than having everything crammed into one screen. I would also improve the accuracy test by shuffling the data before the stratified split, to make the testing fairer. I would also have improved some of the functionality in the DataHandler class to reduce the amount of repeated code.
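
The shuffling improvement could be sketched roughly as follows, using `Collections.shuffle` with a seeded `Random` so that a given split can be reproduced. The class and method names are hypothetical, and the seed parameter is an assumption added for repeatable tests.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of shuffling rows before a stratified split.
final class ShuffleSketch {

    /** Returns a shuffled copy, leaving the original list untouched. */
    static <T> List<T> shuffledCopy(List<T> rows, long seed) {
        List<T> copy = new ArrayList<>(rows);
        Collections.shuffle(copy, new Random(seed)); // same seed gives the same order
        return copy;
    }
}
```

Shuffling a copy rather than the original list keeps the loaded dataset in file order, so rows added later by the user still appear where expected.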
