This repo is a collection of individual scripts that implement various data science and machine learning algorithms using Python libraries. The scripts can be used independently of one another, depending on the application, and can be adapted to different input datasets.
Each script folder represents a different week of coursework from the Data Science & Machine Learning Development Program I completed in 2021.
- Python
Coming into this project, I had previous experience using Python. The biggest challenge, however, was applying that knowledge to implement data science and machine learning algorithms that have a lot of fine details. For example, it was not enough to find a particular function in the SciPy documentation and include it in the script. Instead, I had to dig into each of the function's arguments, understand the impact each one had on the returned result, and then decide what to include. It made for a lot of additional reading, but it was a good exercise in digging deep into the documentation that is available online for many of these great libraries.
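As a small illustration of how a single argument can change a result, here is a minimal sketch. It is not taken from the scripts in this repo, and it uses Python's standard-library `statistics.quantiles` rather than a SciPy function so that it runs without extra dependencies. The sample data is made up for the example.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8]

# Same function, same data, different `method` argument.
# "exclusive" (the default) treats the data as a sample from a larger
# population; "inclusive" treats the data as the whole population.
exclusive = statistics.quantiles(data, n=4, method="exclusive")
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print("exclusive quartiles:", exclusive)
print("inclusive quartiles:", inclusive)

# The median agrees, but the first and third quartiles do not.
print("quartiles differ:", exclusive != inclusive)
```

Both calls run without error, so the only way to know which set of quartiles is the right one for a given analysis is to read what the `method` argument means in the documentation.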
This project made clear to me the importance of understanding not only the code, but also the scientific and engineering principles behind the solutions you are implementing. Even though a script may run and output a result, that result can be meaningless if the author cannot judge whether it is reasonable.