Data Engineering Experience 🚀

Flint is a minimalist, engine-agnostic Python framework designed to streamline and standardize data engineering pipelines. By embracing convention over configuration, Flint removes environment friction, hardcoded absolute paths, and manual PySpark session management.

✨ Key Features

Zero-Config File Discovery: Automatic tree-walking directory resolution anchors your data catalog using your local pyproject.toml file.
Decentralized Catalog: Declare your metadata layouts inside modular, self-contained mini-YAML files.
Elastic Processing Runtimes: Switch dynamically between Pandas and PySpark execution engines using exactly the same unified interface.
Interactive CLI Scaffolding: Spin up a new production-ready data directory structure instantly with flint init.
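
To make the file-discovery idea concrete, here is a minimal sketch of one common way to locate a project root by walking up the directory tree until a pyproject.toml is found. The function name find_project_root and its parameters are hypothetical and are not taken from Flint's source; the actual resolver may behave differently.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Optional[Path] = None, marker: str = "pyproject.toml") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper for illustration only; not Flint's actual API.
    """
    current = (start or Path.cwd()).resolve()
    # Check the starting directory first, then each ancestor up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} found in {current} or any parent directory")
```

With a root in hand, a catalog directory such as conf/catalog can be addressed as find_project_root() / "conf" / "catalog", which is what removes the need for absolute paths in notebooks and scripts.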

📦 Installation

(Once published to PyPI)

pip install flint-core

Or install it directly from the source repository using Poetry:

poetry add git+https://github.com/idperez720/data-engineering-exp.git

🏁 Quick Start

1. Initialize your workspace

Navigate to an empty directory and let the interactive wizard scaffold the workspace conventions:

flint init

2. Declare a dataset

Add a specification block inside conf/catalog/sample_dataset.yaml:

customers:
  description: "Main production customer data"
  format: "csv"
  engine: "pandas"
  storage_path: "data/sample_table.csv"
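
As a rough illustration of how an entry like this could be validated once the YAML has been parsed into a dictionary, the sketch below models it as a small dataclass. The field names come from the example above; the class name DatasetSpec, the set of accepted engine strings, and the choice of required keys are assumptions made for this sketch, not Flint's documented schema.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping

# Assumed engine identifiers; only "pandas" appears in the example above.
SUPPORTED_ENGINES = {"pandas", "pyspark"}
# Assumed to be mandatory for this sketch.
REQUIRED_KEYS = ("format", "engine", "storage_path")


@dataclass(frozen=True)
class DatasetSpec:
    """Hypothetical in-memory form of one catalog entry."""

    name: str
    format: str
    engine: str
    storage_path: str
    description: str = ""

    @classmethod
    def from_mapping(cls, name: str, raw: Mapping[str, Any]) -> "DatasetSpec":
        # Fail early with a readable message instead of a KeyError deep in a loader.
        missing = [key for key in REQUIRED_KEYS if key not in raw]
        if missing:
            raise ValueError(f"Dataset '{name}' is missing keys: {', '.join(missing)}")
        if raw["engine"] not in SUPPORTED_ENGINES:
            raise ValueError(f"Dataset '{name}' uses unknown engine '{raw['engine']}'")
        return cls(
            name=name,
            format=raw["format"],
            engine=raw["engine"],
            storage_path=raw["storage_path"],
            description=raw.get("description", ""),
        )

    def resolve_path(self, project_root: Path) -> Path:
        # Relative storage paths are interpreted against the project root.
        return project_root / self.storage_path
```

Keeping storage_path relative and resolving it against the discovered project root is what lets the same catalog file work on any machine.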

3. Load data anywhere

Create a Python script or open a Jupyter Notebook inside src/notebooks/ and fetch your data instantly:

from flint_core.core.io import DataLoader

# Auto-discovers the project root and its settings
loader = DataLoader()

# Loads the dataset as a pandas DataFrame (the catalog entry sets engine: "pandas")
df = loader.load("customers")
df.head()

📖 Complete Documentation

For comprehensive guides, testing architecture deep-dives, and complete API references, visit our documentation site: 👉 http://127.0.0.1:8000/ (Replace with your deployed docs URL, e.g., GitHub Pages)

⚖️ License

Distributed under the MIT License. Any copy or substantial portion of the software, including forks, must retain the original copyright and permission notice. See LICENSE for more information.
