Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Updated Nov 18, 2024 - Java
Wrangler Transform: A DMD system for transforming Big Data
Data transformation framework for ETL processing with SQL-like syntax and GIS extensions, based on Apache Spark
Preprocessing of data (e.g., filling missing values, normalization) in the field of Data Mining (Knowledge Discovery).
The project efficiently processes user data, demonstrating key components. Explore the code for a structured approach to large-scale data transformations.
🗓️ iCalendar proxy reshaping the data for your needs
Pluggable framework that can be used to spider websites and extract data.
Apache Spark based 'Dist' utility to supplement Data Cooker ETL tool
[👨‍🎓 BSc thesis] merGeo: Integration Platform For Linked Data Management Tools
API to receive IoT data from an end device
DeltaFi is a flexible, code-light data transformation and normalization platform.