The content of this repository supports the onboarding tutorial and guides in Soda documentation.
Follow the Take a sip of Soda tutorial, which contains the full instructions, to set up and run a simple Soda scan for data quality using the example data in this repo.
To take a first sip of Soda, use Docker to quickly build an example PostgreSQL data source against which you can run scans for data quality. The example data source contains data for AdventureWorks, a fictional e-commerce organization.
With Docker running, run the following command in a terminal to set up the prepared example data source:
docker run \
--name sip-of-soda \
-p 5432:5432 \
-e POSTGRES_PASSWORD=secret \
sodadata/soda-adventureworks
When the output reads "database system is ready to accept connections", your data source is set up and you are ready to proceed with the tutorial.
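If you start the container in the background or from a script, you can check for the readiness message programmatically instead of watching the output. The following is a minimal sketch, not part of the tutorial: it assumes the container name sip-of-soda from the command above, and the helper names logs_show_ready and wait_for_data_source are hypothetical. It relies only on the standard docker logs command and grep.

```shell
#!/bin/sh
# Sketch: wait until the example container's logs show the readiness message.
# Assumes the container was started with --name sip-of-soda as shown above.

# logs_show_ready (hypothetical helper): reads log text on stdin and
# succeeds if the PostgreSQL readiness message appears in it.
logs_show_ready() {
  grep -q "ready to accept connections"
}

# wait_for_data_source (hypothetical helper): polls the container logs.
#   $1 = container name (default: sip-of-soda)
#   $2 = maximum number of attempts (default: 30), two seconds apart
wait_for_data_source() {
  name="${1:-sip-of-soda}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # docker logs writes container output to both stdout and stderr.
    if docker logs "$name" 2>&1 | logs_show_ready; then
      echo "data source is ready"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "timed out waiting for $name" >&2
  return 1
}
```

After sourcing the snippet, call wait_for_data_source with no arguments to use the defaults. The check is approximate: the message can also appear while the container is still initializing the database, so treat a success as a signal to try connecting rather than as a guarantee.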
Soda is a platform that enables Data Engineers to test data for quality where and when they need to.
Is your data fresh? Is it complete or missing values? Are there unexpected duplicate values? Did something go wrong during transformation? Are all the data values valid? These are the questions that Soda answers for Data Engineers. Read more.
- Learn how to add Soda data quality checks to your CI/CD workflow.
- Learn how to add Soda data quality checks to your data pipeline after ingestion and transformation.
- Learn how to set up Soda to empower Data Analysts and Scientists to write their own checks for data quality.
Need help? Join the Soda community on Slack.