- Automated the product book-title search task by developing a proof-of-concept tool that uses machine learning and natural language processing to perform fuzzy string matching on book-title data and return probable title matches, each with a confidence score.
- Automated data cleaning by developing a proof-of-concept tool that uses machine learning to predict duplicate records in the database.
- Developed the solutions in an agile environment on the Wiley Data Science Environment, which provides JupyterLab and a cloud compute machine.
- Implemented the solutions in Python with the Dedupe and Fuzzywuzzy libraries to perform fuzzy string matching and to clean data by predicting duplicate records in a database of more than 450,000 titles.
Technologies: Python, JupyterLab, Jupyter Notebook, Spyder, Miniconda, Pandas, Dedupe, Fuzzywuzzy, MySQL, PyMySQL, AWS, AWS RDS, Git, Cmder
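
Illustrative sketch of the fuzzy title matching described above. This is not the project's code: it swaps in Python's standard-library `difflib.SequenceMatcher` for Fuzzywuzzy so that it runs without extra installs, and the catalog titles, the helper names `similarity` and `top_matches`, and the 60-point cutoff are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical catalog; the real data set is not shown in this document.
CATALOG = [
    "Introduction to Machine Learning",
    "Organic Chemistry",
    "Principles of Accounting",
]


def similarity(a: str, b: str) -> float:
    """Return a 0-100 score for two titles, ignoring case and extra whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return 100.0 * SequenceMatcher(None, norm_a, norm_b).ratio()


def top_matches(query, titles, limit=3, min_score=60.0):
    """Return up to `limit` (title, score) pairs scoring at least `min_score`, best first."""
    scored = [(title, similarity(query, title)) for title in titles]
    scored = [pair for pair in scored if pair[1] >= min_score]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    # A misspelled query should still surface the intended title first.
    for title, score in top_matches("introduction to machine lerning", CATALOG):
        print(f"{score:5.1f}  {title}")
```

The score here is a plain character-level similarity ratio scaled to 0-100; a production tool would likely tune the cutoff against labeled examples.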