Go Data Science Tooling, Packages, Libraries, etc.

This is a curated list of tools, packages, and libraries for doing data science with Go, covering both well-maintained projects and ones still under active development.

It also includes a list of proposed packages that would fill gaps in the ecosystem or provide enhanced functionality.

Proposed

Arithmetic

Bioinformatics

Classification

Clustering

  • github.com/salkj/kmeans - A ready-to-use naive kmeans package for Go.
  • github.com/mpraski/clusters - Go implementations of several clustering algorithms (k-means++, DBSCAN, OPTICS), as well as utilities for importing data and estimating the optimal number of clusters.

CSV

Distributed Data Analysis/Pipelining

Geospatial

General data munging

General purpose machine learning

Graphs

JSON

I/O

Matrices/Arrays/Linear Algebra

Neural Networks

NLP

Non-SQL Database Interactions

Parquet

Plotting/dashboarding

Probability/statistics/experiments

Recommendation Systems

Regression

SQL-like Database Interactions

Time Series

Web Scraping

Proposed

  • Multi-dimensional slices within Go itself (Proposal).
  • A robust (and concurrent) package to handle minimizations/fits of data and histograms (gonum/optimize would provide a nice foundation for this).
  • A robust (and concurrent) package to describe statistical models (Bayesian and frequentist) with many nuisance parameters.
  • A Go native package for A/B testing.
  • An open-source database with Datalog querying, drawing inspiration from Rich Hickey's Datomic database.
  • A Datalog query system for distributed computation, similar to Cascalog for the Hadoop ecosystem but integrating with Go tools instead.