• Project application implemented in Python. Analyzed a database containing uncertain and imprecise references (dirty data).
• Cleaned the dataset using transformation rules and spell checking in Python.
• Implemented Edit Distance and Jaccard Similarity to query the dataset.
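The two similarity measures named above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names (`edit_distance`, `jaccard_similarity`, `closest_matches`), the word-level tokenization, and the `max_distance` threshold are all assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word tokens (assumed tokenization)."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def closest_matches(query: str, records: list[str], max_distance: int = 2) -> list[str]:
    """Hypothetical query helper: records within max_distance edits of the query."""
    q = query.lower()
    scored = [(edit_distance(q, r.lower()), r) for r in records]
    return [r for d, r in sorted(scored) if d <= max_distance]


if __name__ == "__main__":
    names = ["Jon Smith", "John Smith", "Jane Smyth", "Bob Jones"]
    print(closest_matches("john smith", names))
    print(jaccard_similarity("data cleaning", "data integration"))
```

Edit distance is suited to catching character-level typos in a single field, while Jaccard similarity compares records as sets of tokens and is insensitive to word order, so the two are often used together when matching dirty references.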
bmahaj2/Data-cleaning-and-integration