- Switzerland
- UTC +01:00
- ssp.sh
- @ssp.sh
- in/sspaeti
- https://dedp.online
- https://subscribe.ssp.sh
dataengineering
DuckDB is an analytical in-process SQL database management system
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Modin: Scale your Pandas workflows by changing a single line of code
Database connectivity API standard and libraries for Apache Arrow
Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.
Dagster Labs' open-source data platform, built with Dagster.
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
This is a list of links to different freely available learning resources about computer programming, math, and science.
This is a repo with links to everything you'd ever want to learn about data engineering
Fastest library to load data from DB to DataFrames in Rust and Python
This is the starter workspace for HelloDATA BE.
Firefox extension that shows the Parquet schema when going over GCP Cloud Storage. Uses DuckDB WASM
Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
Free, simple, and intuitive online database diagram editor and SQL generator.
The property-based testing library for Python
The best place to learn data engineering. Built and maintained by the data engineering community.
Turning PySpark Into a Universal DataFrame API