Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 44,329 public repositories matching this topic...

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.

  • Updated Jun 13, 2024
  • Jupyter Notebook

I'm an IT graduate with a passion for data, software, and cloud computing. With a knack for problem-solving and a commitment to staying updated with cutting-edge technologies, I aim to contribute to innovative projects and help organizations achieve their goals.

  • Updated Jun 13, 2024

This project dives deep into customer sales data to uncover valuable insights for business decision-making. It leverages machine learning and time-series forecasting to predict customer churn, forecast product demand, and segment customers based on their purchasing behavior.

  • Updated Jun 13, 2024
  • Jupyter Notebook

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.

  • Updated Jun 13, 2024
  • Rust