README: Data Quality Repo Using Great Expectations on Databricks

This repository helps developers implement data quality checks with Great Expectations on Databricks. By following the steps below, you can quickly set up and run validation flows for your data pipelines.

Getting Started

To begin using the repository, follow these steps:

Sample Data Generation: Execute the notebooks in the sample_data folder to generate two managed tables. These tables serve as the foundation for generating expectation suites and validation flows.
Expectation Suite Generation: Run the notebooks in the great_expectations/suite_generators folder to generate two expectation suites, one for customers and one for consumers, based on the sample tables.
Configuring Data Docs: If you wish to host the generated Data Docs site (index.html) in an Azure Storage Account, refer to the official Great Expectations documentation on hosting and sharing Data Docs on Azure Blob Storage.
Validation Flow Execution: After configuring the data docs, execute the notebooks in the great_expectations/validation_flows folder to generate validation results. The results show whether your data meets the defined expectations.
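
For the Data Docs step, the following is a minimal sketch of a site entry that points at an Azure Blob container. It is not taken from this repository's notebooks. The class names (SiteBuilder, TupleAzureBlobStoreBackend, DefaultSiteIndexBuilder) and the $web static-website container follow the Great Expectations guide for Azure hosting, but the exact keys can differ between Great Expectations versions. The helper name build_azure_data_docs_site, the site name az_site, and the AZURE_STORAGE_CONNECTION_STRING environment variable are assumptions made for illustration.

```python
import os


def build_azure_data_docs_site(connection_string: str, container: str = "$web") -> dict:
    """Return one data_docs_sites entry that writes Data Docs to Azure Blob Storage.

    Hypothetical helper: it only assembles a plain dict and does not call
    Great Expectations or Azure.
    """
    if not connection_string:
        raise ValueError("an Azure Storage connection string is required")
    return {
        "class_name": "SiteBuilder",
        "store_backend": {
            "class_name": "TupleAzureBlobStoreBackend",
            # "$web" is the container Azure uses for static website hosting
            "container": container,
            "connection_string": connection_string,
        },
        "site_index_builder": {"class_name": "DefaultSiteIndexBuilder"},
    }


# "az_site" is an arbitrary, user-chosen site name (assumption).
# The placeholder default keeps the sketch runnable without credentials.
data_docs_sites = {
    "az_site": build_azure_data_docs_site(
        os.environ.get("AZURE_STORAGE_CONNECTION_STRING", "<connection-string>")
    )
}
```

A dict of this shape would typically be supplied as the data_docs_sites setting of the data context configuration. On Databricks, a secret scope is a safer source for the connection string than a hard-coded value.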

Folder Structure

sample_data: Contains notebooks for generating sample data.
great_expectations/suite_generators: Includes notebooks for generating expectation suites.
great_expectations/validation_flows: Contains notebooks for executing validation flows.
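
As a rough illustration of what the suite generator notebooks produce, the sketch below assembles an expectation suite as a plain Python dict. It does not use the Great Expectations API and is not the code in this repository. The expectation type names are real Great Expectations expectations, but the suite name customers_suite, the columns customer_id and age, and the helper functions are hypothetical. The field names follow the legacy JSON layout and may differ in newer releases.

```python
import json


def make_expectation(expectation_type: str, **kwargs) -> dict:
    """Build one expectation entry in the legacy suite JSON layout."""
    return {"expectation_type": expectation_type, "kwargs": kwargs, "meta": {}}


def build_customers_suite() -> dict:
    """Assemble a small suite for a hypothetical customers table."""
    return {
        "expectation_suite_name": "customers_suite",  # hypothetical name
        "expectations": [
            # The key column must be present and unique
            make_expectation("expect_column_values_to_not_be_null", column="customer_id"),
            make_expectation("expect_column_values_to_be_unique", column="customer_id"),
            # A simple range check on a hypothetical numeric column
            make_expectation(
                "expect_column_values_to_be_between",
                column="age",
                min_value=0,
                max_value=120,
            ),
        ],
        "meta": {},
    }


suite = build_customers_suite()
print(json.dumps(suite, indent=2))
```

In the actual notebooks, a suite would be created and saved through a Great Expectations data context so that the validation flows can load it by name.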

Contributing

Contributions to enhance the repository are welcome! Feel free to open issues or submit pull requests.

License

This project is licensed under the MIT License.

Acknowledgments

Special thanks to the Great Expectations community for their valuable contributions and support.
