Apache Superset is a Data Visualization and Data Exploration Platform
Updated Jun 13, 2024 - TypeScript
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
I'm an IT graduate with a passion for data, software, and cloud computing. With a knack for problem-solving and a commitment to staying updated with cutting-edge technologies, I aim to contribute to innovative projects and help organizations achieve their goals.
Public Fused UDFs. Build workflows at any scale with the Fused Python SDK and Workbench webapp, and integrate them into your stack with the Fused Hosted API.
Tracking outages for Puerto Rico's private electricity distributor with GitHub Actions.
This project dives deep into customer sales data to uncover valuable insights for business decision-making. It leverages machine learning and time-series forecasting to predict customer churn, forecast product demand, and segment customers based on their purchasing behavior.
EvalML is an AutoML library written in Python.
This is a repository for collecting papers and code in the time series domain.
The Universal Storage Engine
Repository for the end-of-course project for Data Science, a module of the integrated course (I.C.) in Computational Management of Data, academic year 2023/2024, DhDk unibo.
HTTP API for datanommer and the fedmsg bus
My Jupyter notebooks in which I practice data science.
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
An orchestration platform for the development, production, and observation of data assets.
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines.
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
A lab notebook for prompting.