
feat: create config tool for pipeline setup#22

Merged
TordAreStromsnes merged 13 commits into main from
feature/#10/config-tool
Oct 30, 2025

Conversation

@TordAreStromsnes
Contributor

@TordAreStromsnes TordAreStromsnes commented Oct 20, 2025

This pull request introduces the initial documentation and implementation for the config subpackage in dataorc-utils, providing a robust configuration system for Data Lake pipelines. The changes include new documentation pages detailing configuration usage, core concepts, and validation, as well as the main package initialization and core modules for configuration management, parameter definitions, environment handling, and validation utilities.

Documentation improvements:

  • Added comprehensive documentation pages for CorePipelineConfig, PipelineParameterManager, and configuration defaults/validation, explaining usage patterns, environment integration, and configuration structure.
  • Updated mkdocs.yml to include new documentation sections and enabled advanced markdown extensions for better code highlighting and navigation.

Package initialization and public API:

  • Created __init__.py for the main package and the config subpackage, exposing commonly used configuration symbols and making submodules easily accessible.

Core configuration modules:

  • Implemented defaults.py for building and resolving environment configurations using helper functions and repository/domain overrides.
  • Defined core parameter enums and default values in enums.py, centralizing configuration keys and layer defaults.
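
As a rough illustration of what centralizing configuration keys and layer defaults in an enum module can look like, here is a minimal sketch. The names `CoreParam`, `LAYER_DEFAULTS`, and `resolve_env_config`, along with the key strings and version values, are assumptions made for this example and are not taken from the actual `enums.py` or `defaults.py`.

```python
from enum import Enum


class CoreParam(str, Enum):
    """Hypothetical configuration keys; the real enums.py may differ."""

    ENV = "env"
    DOMAIN = "domain"
    BRONZE_VERSION = "bronze_version"
    SILVER_VERSION = "silver_version"
    GOLD_VERSION = "gold_version"


# Hypothetical per-layer defaults, keyed by the enum rather than raw strings.
LAYER_DEFAULTS = {
    CoreParam.BRONZE_VERSION: "v1",
    CoreParam.SILVER_VERSION: "v1",
    CoreParam.GOLD_VERSION: "v1",
}


def resolve_env_config(env, *overrides):
    """Merge the defaults with override dicts; later overrides win."""
    resolved = {CoreParam.ENV: env, **LAYER_DEFAULTS}
    for override in overrides:
        for key, value in override.items():
            # CoreParam(key) accepts a member or its string value and
            # raises ValueError for unknown keys, so typos fail early.
            resolved[CoreParam(key)] = value
    # Return plain string keys so the result is easy to log or serialize.
    return {key.value: value for key, value in resolved.items()}
```

Under these assumptions, repository-level and domain-level overrides could be passed in order, for example `resolve_env_config("dev", repo_overrides, domain_overrides)`, so that the most specific override is applied last.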

Project setup:

  • Added a pyproject.toml for package metadata, build configuration, and development dependencies, standardizing project setup and linting rules.

#10

- Updated mkdocs.yml to include new user guide navigation for dataorc-utils package.
- Added detailed documentation for CorePipelineConfig, Defaults and validation, and PipelineParameterManager.
- Implemented comprehensive examples and usage patterns in the documentation.
- Introduced validation rules for CorePipelineConfig to ensure correct configuration.
- Enhanced the PipelineParameterManager to support environment-specific configurations and custom parameters.
- Added unit tests for configuration validation and lake path generation.
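
To make the validation rules and lake path generation mentioned above more concrete, the following is a minimal sketch of the general pattern. `PipelineConfigSketch` is a hypothetical stand-in for `CorePipelineConfig`; its field names, the `v<digits>` version format, and the path layout are all assumptions for illustration and are not taken from the package.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfigSketch:
    """Hypothetical stand-in for CorePipelineConfig; fields are assumptions."""

    env: str
    domain: str
    product: str
    table_name: str
    bronze_version: str = "v1"

    def __post_init__(self):
        # Assumed rule: every path segment is non-empty and has no slashes.
        for name in ("env", "domain", "product", "table_name"):
            value = getattr(self, name)
            if not value or "/" in value:
                raise ValueError(f"{name} must be a non-empty path segment, got {value!r}")
        # Assumed rule: versions look like "v1", "v2", and so on.
        if not re.fullmatch(r"v\d+", self.bronze_version):
            raise ValueError(f"bronze_version must match 'v<digits>', got {self.bronze_version!r}")

    def bronze_path(self):
        """Build a lake path from the validated segments (layout is assumed)."""
        return "/".join(
            ("bronze", self.domain, self.product, self.table_name, self.bronze_version)
        )
```

Validating in `__post_init__` means an invalid configuration fails at construction time, which is also what makes this kind of rule straightforward to cover with unit tests.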
@TordAreStromsnes TordAreStromsnes marked this pull request as draft October 20, 2025 11:36
@TordAreStromsnes TordAreStromsnes changed the title from "Feature/#10/config tool" to "feature: create config tool for pipeline setup" Oct 27, 2025
@TordAreStromsnes TordAreStromsnes marked this pull request as ready for review October 29, 2025 07:01
@hknutsen hknutsen changed the title from "feature: create config tool for pipeline setup" to "feat: create config tool for pipeline setup" Oct 29, 2025
Member

@hknutsen hknutsen left a comment


Good work! 🥳

I've added a few comments - consider them all suggestions to be discussed rather than requested changes.

Also:

  • Remember to add the dataorc-utils package to release-please-config.json.
  • Currently, the tests directory is placed under the src directory. From what I've seen, it's usually put at the same level as the src directory. I think the idea is that the tests live alongside your source code rather than in your source code.

- Updated CorePipelineConfig documentation for clarity and conciseness.
- Removed unnecessary fields and methods from CorePipelineConfig and related classes.
- Simplified the PipelineParameterManager to focus on environment-driven configuration.
- Enhanced validation logic for environment variables and configuration rules.
- Consolidated infrastructure variable handling into a single dictionary.
- Removed deprecated defaults and validation functions.
- Added comprehensive tests for configuration management and validation.
- Updated mkdocs configuration to include new snippets extension.
- Adjusted release-please configuration for better package management.
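
As a sketch of what environment-driven configuration with a single dictionary of variables might look like, consider the helper below. The function `load_env_vars` and the variable names used with it are hypothetical; this is not the `PipelineParameterManager` API, only an illustration of the pattern under those assumptions.

```python
import os


def load_env_vars(required, optional=(), environ=None):
    """Collect named environment variables into one dict.

    Hypothetical helper: required names must be set and non-blank,
    optional names are included only when present.
    """
    source = os.environ if environ is None else environ
    missing = [name for name in required if not source.get(name, "").strip()]
    if missing:
        raise KeyError(f"Missing required environment variables: {', '.join(missing)}")
    values = {name: source[name] for name in required}
    # Optional variables are added only when they are actually set.
    values.update({name: source[name] for name in optional if name in source})
    return values
```

Passing a mapping through `environ` keeps such a helper easy to test without touching the real process environment, for example `load_env_vars(["DATALAKE_NAME"], optional=["ENV"], environ={"DATALAKE_NAME": "lake01"})`, where `DATALAKE_NAME` is an invented variable name.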
@TordAreStromsnes
Contributor Author

Great feedback, @hknutsen. I ended up being able to remove a lot of stuff while still making the entire tool more flexible. Let me know if you agree with the changes? I ended up making quite a large change to the setup and changed the infrastructure naming to env name. Also moved the tests out of src :)

Thanks!

Member

@hknutsen hknutsen left a comment


Looks good 🥳

@TordAreStromsnes TordAreStromsnes merged commit 21a8a84 into main Oct 30, 2025
6 checks passed
@TordAreStromsnes TordAreStromsnes deleted the feature/#10/config-tool branch October 30, 2025 13:20
TordAreStromsnes pushed a commit that referenced this pull request Oct 31, 2025
🤖 I have created a release *beep* *boop*
---


## 0.1.0 (2025-10-30)


### Features

* create config tool for pipeline setup ([#22](#22)) ([21a8a84](21a8a84))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
TordAreStromsnes pushed a commit that referenced this pull request Feb 5, 2026
🤖 I have created a release *beep* *boop*
---


## [0.2.0](dataorc-v0.1.1...dataorc-v0.2.0) (2026-02-05)


### Features

* add Azure Key Vault support and documentation ([#42](#42)) ([abc42a0](abc42a0))
* create config tool for pipeline setup ([#22](#22)) ([21a8a84](21a8a84))
* introduce dictionary functionality for environment variables access ([#57](#57)) ([b6291fa](b6291fa))
* mount data lake ([#31](#31)) ([0bb3e51](0bb3e51))
* **utils:** add argument parsing helper for Databricks wheel tasks ([#43](#43)) ([393c6a2](393c6a2))
* **utils:** add retry logic and customizable parameters for get_keyvault_secret ([#63](#63)) ([acbc2b7](acbc2b7))
* **utils:** implement LakeFileSystem for data lake operations and add documentation ([#64](#64)) ([be9e738](be9e738))
* **utils:** support optional revision suffix in version format and update tests ([#59](#59)) ([8ea0b60](8ea0b60))
* **utils:** treat env as plain string and default to "dev" ([#50](#50)) ([65473a8](65473a8))


### Documentation

* add changelog tab ([#20](#20)) ([2ec4271](2ec4271))
* add CI status badge ([#9](#9)) ([8de41fe](8de41fe))
* add contributing guidelines ([#15](#15)) ([434cf31](434cf31))
* add developing instructions ([#33](#33)) ([835a35e](835a35e))
* add early development phase warning ([#39](#39)) ([406746d](406746d))
* bootstrap package ([#6](#6)) ([afbb765](afbb765))
* build docs using uv ([#36](#36)) ([15a1125](15a1125))
* initialize documentation structure ([#8](#8)) ([0adb45d](0adb45d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>