A small data pipeline that collects data about billionaires from Forbes and Bloomberg
Updated May 14, 2023 - Python
Run histogram-based gradient boosted trees binary classifier on generated data and interpret results with standard metrics, SHAP, and supervised clustering
Tools for processing 96-well plates
The goal of this project is to demonstrate how to efficiently process a massive data file containing 1 billion rows (~14 GB) in Python, specifically to compute statistics, including aggregation and sorting, which are expensive operations.
@functime-org tutorial at @PythonBiellaGroup
Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.
Fork just to monitor the changes
Comparison of Python data manipulation packages Pandas and Polars.
Project for consuming public data (Open Data) from GOV.BR, aiming to simplify extraction of the data published on the organization's open data page.
DataFrame comparison done right, powered by Rust with polars (AKA the bear-agnostic 🐻 🐼 🐨 🐻‍❄️ DataFrame comparison library)