A powerful yet friendly ETL (Extract, Transform, Load) tool powered by a Polars backend, aimed at the large community of data scientists who use Python.
Currently in early development. Stay tuned for updates.
[TODO]
[TODO]
[TODO]
Currently we only support x86_64 architectures and Linux distributions that use glibc (GNU C Library), because we have not yet sorted out dependency management for other targets. Building Polars from source does not look too complicated, though, so we will try again in the near future.
We started developing the proof of concept with Libadwaita, a set of building blocks for GNOME applications, so it is expected to be compatible only with the GNOME desktop environment. It is possible that the application will still look correct on other desktop environments. In any case, we plan to add support for Windows in the future, and hopefully for macOS as well.
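As a rough illustration of the constraints above, here is a minimal sketch of a platform check using only the Python standard library. The function name check_platform and the warning messages are hypothetical and not part of this project; the sketch also assumes that XDG_CURRENT_DESKTOP is set by the desktop session, which is common but not guaranteed.

```python
import os
import platform


def check_platform(machine, system, libc_name, desktop):
    """Return a list of warnings for setups outside the supported range.

    Hypothetical helper for illustration only; not part of the project.
    """
    warnings = []
    if system != "Linux":
        warnings.append(f"unsupported operating system: {system}")
    if machine != "x86_64":
        warnings.append(f"unsupported architecture: {machine}")
    if libc_name != "glibc":
        warnings.append(f"unsupported C library: {libc_name or 'unknown'}")
    # XDG_CURRENT_DESKTOP may be a colon-separated list, e.g. "ubuntu:GNOME".
    if "GNOME" not in desktop.upper().split(":"):
        warnings.append("not running under GNOME; the UI may look different")
    return warnings


if __name__ == "__main__":
    problems = check_platform(
        machine=platform.machine(),
        system=platform.system(),
        # libc_ver() returns an empty name when it cannot detect glibc.
        libc_name=platform.libc_ver()[0],
        desktop=os.environ.get("XDG_CURRENT_DESKTOP", ""),
    )
    for problem in problems:
        print("warning:", problem)
```

The detection is kept separate from the decision logic so the rules can be exercised with arbitrary values, without needing access to each kind of machine.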
The following are some of the resources used in decision making and development planning:
- The ONLY Data Cleaning Framework You Need | Ep. 3
- How to NAIL Exploratory Data Analysis (Lead Analyst Demo)
We have been researching similar applications, including:
- Excel
- Google Sheets
- Power BI
- Tableau
- Alteryx Designer
[TODO]
The recommended way to build and run this project is using GNOME Builder.
I personally use Visual Studio Code, but you can use whichever editor you prefer. To build and run with Flatpak in VS Code, consider installing the Flatpak extension. Run the following commands in the terminal to install the dependencies (on Fedora):
sudo dnf install flatpak flatpak-builder --assumeyes
flatpak remote-add --if-not-exists gnome-nightly https://nightly.gnome.org/gnome-nightly.flatpakrepo
flatpak install gnome-nightly org.gnome.Platform//master
flatpak install gnome-nightly org.gnome.Sdk//master
Type and run "Flatpak: Select or Change Active Manifest" in the command palette (F1 or Ctrl+Shift+P) and select the com.macipra.Eruo.Devel.json manifest file. Finally, type and run "Flatpak: Build and Run" in the command palette, or simply press Ctrl+Alt+B.
If you're using a Python language server, you may want to install the requirements. For better dependency management, it's recommended to create a virtual environment rather than installing packages globally:
python -m venv .pyenv
source .pyenv/bin/activate
pip install -r requirements-devel.txt
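If the language server still cannot resolve the packages, it may be running a different interpreter. A minimal sketch for checking that the active interpreter belongs to a virtual environment follows; the function name in_virtual_environment is hypothetical, and the check relies on the standard behaviour of venv, where sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtual_environment(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True when the interpreter runs inside a venv-style environment.

    Hypothetical helper for illustration only; not part of the project.
    """
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return prefix != base_prefix


print("virtual environment active:", in_virtual_environment())
```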
To add new pip dependencies to the flatpak-builder manifest JSON file, you can use flatpak-pip-generator. Then either add a reference to the generated file in the com.macipra.Eruo*.json files, or copy its content directly into the manifest files and delete the generated file. Do not forget to update the requirements*.txt files as well.
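The "add a reference" option can also be scripted. The sketch below is one possible approach, not the project's own tooling: it relies on flatpak-builder accepting a file name string as an entry in the manifest's "modules" list, and it assumes the application module is the last entry so that dependencies are built before it. The function name add_module_reference and the file name python3-requirements.json are hypothetical; use whatever name the generator actually produced.

```python
import json
from pathlib import Path


def add_module_reference(manifest_path, generated_file):
    """Add a generated module file to the manifest's "modules" list.

    Returns True if the manifest was changed, False if the reference
    was already present. Hypothetical helper for illustration only.
    """
    manifest_path = Path(manifest_path)
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    modules = manifest.setdefault("modules", [])
    if generated_file in modules:
        return False
    # Assumption: the last module is the application itself, so the
    # dependency reference is inserted just before it.
    position = max(len(modules) - 1, 0)
    modules.insert(position, generated_file)
    manifest_path.write_text(
        json.dumps(manifest, indent=4) + "\n", encoding="utf-8"
    )
    return True


# Example call (the generated file name is hypothetical):
# add_module_reference("com.macipra.Eruo.Devel.json", "python3-requirements.json")
```

Note that rewriting the file with json.dumps normalizes its indentation, so review the resulting diff before committing.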
Here are some useful references for the project development:
- Flatpak: https://docs.flatpak.org/en/latest/index.html
- Flathub: https://docs.flathub.org/docs/category/for-app-authors
- GNOME developer: https://developer.gnome.org/documentation/index.html
- GNOME Python API: https://api.pygobject.gnome.org/index.html
- GTK4: https://docs.gtk.org/gtk4/index.html
- PyGObject: https://gnome.pages.gitlab.gnome.org/pygobject/index.html
- Pycairo: https://pycairo.readthedocs.io/en/latest/index.html
- Libadwaita: https://gnome.pages.gitlab.gnome.org/libadwaita/doc/1.4/index.html
- Polars: https://docs.pola.rs/api/python/stable/reference/index.html
Please bear with us: most of the docstrings are AI-generated, though sometimes under my supervision. Your help will be greatly appreciated.
This project is distributed under the GNU General Public License Version 3. We use GTK and Libadwaita to build the user interface, which are licensed under the GNU Lesser General Public License Version 2.1. The backend for data manipulation uses Polars, which is distributed under the MIT License. For other dependencies, see the requirements.txt file.