## Data Project Workflow Checklist

### 1. Data Acquisition

* Identify data source and confirm credibility.
* Determine retrieval method (API, file, web scrape, etc.).
* Confirm data covers required time period or scope.
* Save raw data in an unaltered state for reference.
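
One way to make "unaltered" checkable is to record a checksum of the raw file when it is first saved and compare it again before analysis. The following is a minimal sketch using only the Python standard library; the function name `sha256_of_file` and the chunk size are illustrative assumptions, not part of the checklist.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large raw files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Storing the digest next to the raw file (for example in a small text file) lets a later run confirm that the raw data has not changed since acquisition.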

### 2. Data Inspection

* Review dataset structure visually (first and last rows, column names, overall shape).
* Confirm rows represent observations and columns represent variables.
* Verify data types correspond to expected meanings (numeric, categorical, date, text).
* Check indexing (unique identifiers or datetime consistency).
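
The type and index checks above can be sketched without any third-party library. The example below assumes each column has been read as a list of strings (as the `csv` module would produce) and that dates are in ISO `YYYY-MM-DD` form; the helper names `infer_type` and `is_unique` are hypothetical.

```python
from datetime import date


def infer_type(values):
    """Classify a column of strings as 'numeric', 'date', 'text', or 'empty'."""
    present = [v for v in values if v.strip() != ""]
    if not present:
        return "empty"

    def all_parse(parser):
        # True only if every non-blank value is accepted by the parser.
        for value in present:
            try:
                parser(value)
            except ValueError:
                return False
        return True

    if all_parse(float):
        return "numeric"
    if all_parse(date.fromisoformat):
        return "date"
    return "text"


def is_unique(values):
    """True when no identifier appears more than once."""
    return len(set(values)) == len(values)
```

Comparing the inferred type of each column with the type you expected is a quick way to spot numbers stored as text or dates in a mixed format.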

### 3. Data Cleaning

* **Structure validation:** confirm shape and logical organisation.
* **Completeness:** identify missing values; decide whether to drop, fill, or impute.
* **Duplicates:** detect identical records and remove the extra copies, keeping one of each.
* **Range and logic validation:** confirm values fall within realistic limits and obey logical relationships.
* **Format standardisation:** align units, date formats, and naming conventions.
* **Outlier inspection:** detect extreme values and decide if they are valid or errors.
* **Consistency check:** ensure repeated entries for the same entity agree across fields, and that grouped data matches its components.
* **Documentation:** record every modification to maintain reproducibility.
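
As a rough illustration of several of these steps together, the sketch below removes exact duplicates, rows with a missing value, and rows outside an allowed range, while logging each removal. It assumes records are dictionaries with hashable values; the function name `clean_records`, the `price` field, and the range limits are made up for the example. Dropping rows is only one option: filling or imputing may suit a given dataset better.

```python
def clean_records(records, value_key="price", low=0.0, high=1_000_000.0):
    """Drop exact duplicates and rows with missing or out-of-range values.

    Returns (cleaned_rows, log) so that every removal is recorded.
    """
    cleaned, log, seen = [], [], set()
    for position, row in enumerate(records):
        # A sorted tuple of items identifies rows with identical content.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            log.append((position, "duplicate"))
            continue
        seen.add(fingerprint)

        value = row.get(value_key)
        if value is None:
            log.append((position, "missing " + value_key))
            continue
        if not low <= value <= high:
            log.append((position, "out of range"))
            continue
        cleaned.append(row)
    return cleaned, log
```

Keeping the log alongside the cleaned data gives a record of which original rows were removed and why.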

### 4. Data Transformation

* Select relevant columns for analysis.
* Create derived metrics (returns, ratios, percentages, rolling means, etc.).
* Rename columns for clarity.
* Change data frequency (daily to monthly, etc.) if needed.
* Merge or join related datasets when multiple sources are used.
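
A few of these transformations are simple enough to show directly. The sketch below computes simple returns, a trailing rolling mean, and a daily-to-monthly reduction that keeps the last observation in each month. The function names are hypothetical, and the monthly reduction assumes the input pairs are already sorted by date.

```python
def simple_returns(prices):
    """Period-over-period returns: (p_t - p_{t-1}) / p_{t-1}."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def rolling_mean(values, window):
    """Mean of each full trailing window; output has len(values) - window + 1 items."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def month_end_values(dated_values):
    """Keep the last observation per (year, month) from date-sorted (date, value) pairs."""
    by_month = {}
    for day, value in dated_values:
        by_month[(day.year, day.month)] = value  # later days overwrite earlier ones
    return by_month
```

Whether to keep the last value, the mean, or the sum when changing frequency depends on what the variable measures, so the month-end choice here is only one possibility.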

### 5. Data Exploration

* Compute descriptive statistics (mean, median, range, variance, correlations).
* Identify relationships and trends through summary tables.
* Plot visual summaries (line charts, bar charts, histograms, boxplots, scatter plots).
* Note any unexpected behaviours or patterns for follow-up.
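
The descriptive statistics listed above can be computed with the standard `statistics` module. This is a minimal sketch: the helper names `describe` and `pearson` are assumptions for the example, and the variance shown is the sample variance (dividing by n - 1).

```python
import statistics


def describe(values):
    """Summary statistics for a list of numbers (sample variance)."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

Note that `pearson` divides by zero if either sequence is constant, which is itself worth flagging during exploration.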

### 6. Analysis and Modelling

* Define clear questions or hypotheses.
* Choose analytical or statistical methods aligned with the goal.
* Validate assumptions for chosen techniques.
* Compare alternative methods if applicable.
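
Comparing alternatives usually means scoring each method on data it was not fitted to. The sketch below is one possible illustration, not a recommended procedure: it fits a mean baseline and a least-squares line on the earlier rows and reports the mean absolute error of each on the last `n_test` rows. All function names are hypothetical, and holding out the final rows assumes the data is ordered (for example by time).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - slope * mean_x, slope


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def compare_on_holdout(xs, ys, n_test):
    """Fit on the earlier rows; score a mean baseline and a line on the last n_test rows."""
    train_x, test_x = xs[:-n_test], xs[-n_test:]
    train_y, test_y = ys[:-n_test], ys[-n_test:]

    baseline = sum(train_y) / len(train_y)
    intercept, slope = fit_line(train_x, train_y)

    return {
        "mean_baseline": mean_absolute_error(test_y, [baseline] * n_test),
        "linear": mean_absolute_error(
            test_y, [intercept + slope * x for x in test_x]
        ),
    }
```

A method that cannot beat the simple baseline on held-out rows is a signal to revisit the assumptions behind it.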

### 7. Interpretation and Insight

* Translate numerical or visual results into plain-language conclusions.
* Quantify uncertainty or confidence in findings.
* Identify limitations or data caveats.
* Relate results to the original goal or question.
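
One common way to quantify uncertainty without distributional assumptions is a percentile bootstrap. The sketch below estimates an interval for the mean by resampling with replacement; the function name `bootstrap_mean_ci`, the number of resamples, and the fixed seed are assumptions chosen for the example.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper
```

Reporting the interval alongside the point estimate makes it clearer how much the conclusion depends on the particular sample.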

### 8. Presentation and Reporting

* Create clean visualisations suitable for the intended audience.
* Annotate charts with context, labels, and explanations.
* Organise findings logically within a notebook or report.
* Export clean figures and datasets for reproducibility.

### 9. Archival and Reproducibility

* Save final cleaned datasets and scripts separately from raw data.
* Document environment, package versions, and processing steps.
* Maintain versioned storage (e.g., Git or dated local copies).
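
Documenting the environment can be partly automated. The sketch below collects the Python version, the platform string, and the installed versions of a list of packages using the standard library; the function name `environment_snapshot` is hypothetical, and the package names passed in are whatever the project actually uses.

```python
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Collect interpreter, OS, and installed versions of the named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }
```

The returned dictionary contains only strings and `None`, so it can be written to a JSON file and archived with the cleaned datasets and scripts.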

---

This checklist is intended to apply to data projects regardless of language or platform. Working through it helps cover the full workflow before, during, and after analysis.
