Popular repositories
- data-engineering-blueprints Public
Patterns and concepts for building resilient data pipelines in Python and Scala
- kafka-local Public
Basic single-broker Kafka cluster using Docker Compose and the Confluent image
- streamsets_json_schema_validator_processor Public
A sample StreamSets Data Collector processor for validating records against a specified JSON schema
Java 1
Repositories
- ntu-ktp-data-quality Public
This repository is part of the Knowledge Transfer Partnership (KTP) between Nottingham Trent University (NTU) and Bigspark. The project aims to address data quality issues in large datasets, specifically in finance, using advanced techniques for error detection, error correction, duplicate detection, and beyond.
- dbt-airflow-dapr-docker Public
- data-engineering-blueprints Public
Patterns and concepts for building resilient data pipelines in Python and Scala
- genai-presidio Public
Repository for the PII Anonymizer code package and a sample FastAPI API that uses it when talking to an LLM
- nuxtjs-template Public template