Predicting Hard Drive Failures tackles the problem of forecasting drive failure with several machine learning models. In this repo I use various models for classification as well as regression, which PyCaret makes easy.

Risdorn/Predicting-Drive-Failure

Predicting Drive Failure

The notebook walks through three different methodologies for predicting drive failures.

Dataset Features used

capacity_bytes = Capacity of the hard drive, in bytes
smart_5_normalized and smart_5_raw = Reallocated Sectors Count
smart_187_normalized and smart_187_raw = Reported Uncorrectable Errors
smart_188_normalized and smart_188_raw = Command Timeout
smart_197_normalized and smart_197_raw = Current Pending Sector Count
smart_198_normalized and smart_198_raw = Offline Uncorrectable Sector Count
date_diff = Number of days left until the drive fails
failure = Whether the hard drive has failed (1) or not (0)
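
The date_diff column is not part of the raw Backblaze files, so it has to be derived. The sketch below shows one way this could be done, assuming the standard Backblaze columns date, serial_number and failure. The helper name add_date_diff and the choice to drop drives that never fail are assumptions for illustration, not necessarily what the notebook does.

```python
from datetime import date


def add_date_diff(rows):
    """Attach date_diff (days until the drive's failure date) to each row.

    Assumption: drives with no recorded failure are dropped, because they
    have no failure date to count down to.
    """
    # First pass: find the failure date of every drive that failed.
    failure_day = {}
    for row in rows:
        if int(row["failure"]) == 1:
            failure_day[row["serial_number"]] = date.fromisoformat(row["date"])

    # Second pass: count the days between each snapshot and that failure date.
    labelled = []
    for row in rows:
        failed_on = failure_day.get(row["serial_number"])
        if failed_on is None:
            continue  # drive never failed in this sample
        days_left = (failed_on - date.fromisoformat(row["date"])).days
        labelled.append({**row, "date_diff": days_left})
    return labelled


# Made-up sample rows in the Backblaze daily-snapshot layout.
rows = [
    {"date": "2016-01-01", "serial_number": "A", "failure": "0", "smart_5_raw": "0"},
    {"date": "2016-01-02", "serial_number": "A", "failure": "0", "smart_5_raw": "8"},
    {"date": "2016-01-03", "serial_number": "A", "failure": "1", "smart_5_raw": "24"},
    {"date": "2016-01-01", "serial_number": "B", "failure": "0", "smart_5_raw": "0"},
]

labelled = add_date_diff(rows)
for row in labelled:
    print(row["serial_number"], row["date"], row["date_diff"])
```

With a label like this, date_diff can serve as a regression target and failure as a classification target.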

How to Run

The code can be run by opening the .ipynb file in Google Colab and running it.

To run it locally, change the data file path in the notebook to point to your local copy of the dataset.
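
One way to avoid editing the path by hand is to resolve the data directory at run time. This is only a sketch: the environment variable DRIVE_DATA_DIR, the Colab folder /content/data and the local folder data are hypothetical names, not paths taken from the notebook.

```python
import os
import sys
from pathlib import Path


def resolve_data_dir(local_default="data"):
    """Pick the folder that holds the Backblaze CSV files."""
    # Hypothetical override: lets you point at any folder without code changes.
    override = os.environ.get("DRIVE_DATA_DIR")
    if override:
        return Path(override)
    # The google.colab module is only loaded inside a Colab runtime.
    if "google.colab" in sys.modules:
        return Path("/content/data")  # hypothetical upload location in Colab
    return Path(local_default)  # hypothetical folder next to the notebook


data_dir = resolve_data_dir()
csv_files = sorted(data_dir.glob("*.csv"))
print(f"Looking for CSV files in {data_dir}: found {len(csv_files)}")
```

The same notebook then runs in both environments, provided the CSV files sit in whichever folder is resolved.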

Data

Data is taken from Backblaze. We use the data from 2016.

https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data

References

https://www.backblaze.com/blog/using-machine-learning-to-predict-hard-drive-failures/

https://en.wikipedia.org/wiki/Self-Monitoring,_Analysis_and_Reporting_Technology#ATA_S.M.A.R.T._attributes

https://medium.com/geekculture/a-complete-solution-to-the-backbaze-com-kaggle-problem-cf1fab1af529
