GitHub - ngangawairimu/Data-Validation-using-python: Agricultural dataset validated using python code for usage. Building a data pipeline that will ingest and clean data with the press of a button.

Data Validation Project

Objective:

To validate the MD_agric_df dataset against weather station data, ensuring its accuracy and reliability for agricultural insights.

Key Steps:

Data Pipeline Development:

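Built an automated data pipeline that ingests and cleans the MD_agric_df and weather datasets. The ingestion and processing logic lives in separate modules (data_ingestion.py, field_data_processor.py, weather_data_processor.py), which improves code readability and maintainability.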
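The ingest-then-clean flow described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the table name `farm_survey`, the column names `Field_ID`, `Elevation` and `Crop_type`, and both cleaning rules are assumptions made for the example, and an in-memory SQLite database stands in for the real survey file.

```python
import sqlite3

import pandas as pd


def ingest_table(conn, query):
    """Run a SQL query against an open connection and return a DataFrame."""
    return pd.read_sql_query(query, conn)


def clean_field_data(df):
    """Apply two illustrative cleaning rules (assumed, not the project's own)."""
    cleaned = df.copy()
    # Normalise crop names: strip stray whitespace and lower-case them.
    cleaned["Crop_type"] = cleaned["Crop_type"].str.strip().str.lower()
    # Assume negative elevations are sign errors and take the absolute value.
    cleaned["Elevation"] = cleaned["Elevation"].abs()
    return cleaned


# Demo: build a tiny in-memory database with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE farm_survey (Field_ID INTEGER, Elevation REAL, Crop_type TEXT)"
)
conn.executemany(
    "INSERT INTO farm_survey VALUES (?, ?, ?)",
    [(1, 786.1, "cassava "), (2, -120.5, "TEA")],
)
conn.commit()

raw_df = ingest_table(conn, "SELECT * FROM farm_survey")
demo_df = clean_field_data(raw_df)
conn.close()
```

Keeping ingestion and cleaning in separate functions means each step can be tested on its own, which is the kind of maintainability gain the pipeline aims for.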

Hypothesis Testing:

Conducted hypothesis testing to evaluate how well the MD_agric_df dataset represents actual weather conditions, considering both the means and the variances of the distributions. This involved:

Creating a null hypothesis.

Cleaning and importing the MD_agric_df dataset.

Mapping and comparing it with nearby weather station data.

Performing t-tests to interpret results and validate data reliability.

Data Quality Checks:

Implemented rigorous data validation tests using Python and pytest, checking for:

Correct DataFrame shapes.

Valid column names.

Non-negative elevation values.

Valid crop types and positive rainfall measurements.
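The t-test step from the Hypothesis Testing section can be illustrated with a small sketch. The README does not say which t-test variant was used, so this example assumes Welch's two-sample t-test, which does not require equal variances. The function name `welch_t_test` and the sample values are hypothetical, and only the test statistic and degrees of freedom are computed here.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)

    # Squared standard error contributed by each sample.
    se_a, se_b = var_a / n_a, var_b / n_b
    t_stat = (mean_a - mean_b) / math.sqrt(se_a + se_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical temperature readings: field survey vs. nearby weather station.
field_values = [12.0, 14.0, 13.0, 15.0]
station_values = [12.5, 13.5, 14.0, 14.0]
t_stat, dof = welch_t_test(field_values, station_values)
```

A t statistic close to zero gives no evidence against the null hypothesis that both sources share the same mean. To obtain a p-value, `scipy.stats.ttest_ind(field_values, station_values, equal_var=False)` performs the same test.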

Tools Used:

Python, Pandas, pytest, Jupyter Notebook for exploratory data analysis.
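The data quality checks listed earlier map naturally onto pytest test functions. The sketch below shows one possible shape for them. The column names, the set of valid crops, and the sample rows are all hypothetical placeholders, not values taken from the repository.

```python
import pandas as pd

# Hypothetical schema and crop list, used only for this illustration.
EXPECTED_COLUMNS = ["Field_ID", "Elevation", "Crop_type", "Rainfall"]
VALID_CROPS = {"cassava", "tea", "wheat"}


def make_sample_df():
    """Build a small stand-in for the processed field DataFrame."""
    return pd.DataFrame(
        {
            "Field_ID": [1, 2, 3],
            "Elevation": [786.1, 120.5, 0.0],
            "Crop_type": ["cassava", "tea", "wheat"],
            "Rainfall": [1125.2, 980.0, 640.7],
        }
    )


def test_dataframe_shape():
    # Three rows and one column per expected name.
    assert make_sample_df().shape == (3, len(EXPECTED_COLUMNS))


def test_column_names():
    assert list(make_sample_df().columns) == EXPECTED_COLUMNS


def test_non_negative_elevation():
    assert (make_sample_df()["Elevation"] >= 0).all()


def test_valid_crop_types():
    assert make_sample_df()["Crop_type"].isin(VALID_CROPS).all()


def test_positive_rainfall():
    assert (make_sample_df()["Rainfall"] > 0).all()
```

Saved in a file whose name starts with `test_`, these functions are collected automatically when `pytest` is run in that directory. In a real pipeline, `make_sample_df` would be replaced by loading the processed data.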

Folders and files (4 commits):

.ipynb_checkpoints
__pycache__
Maji_Ndogo_farm_survey_small.db
README.md
Validating_our_data.ipynb
data_ingestion.py
field_data_processor.py
validate_data.py
weather_data_processor.py
