Rank-1 Dictionary Learning in PyFlink
.gitignore
LICENSE
R1DL_Flink.py
R1DL_Spark.py
README.md
functions.py

PyFlink R1DL

NOTE: Due to parallelism issues in the Flink Python API, this script uses a lot of RAM and is not production-ready (hopefully it will be soon).

Rank-1 Dictionary Learning in PyFlink, featured in "Implementing dictionary learning in Apache Flink, Or: How I learned to relax and love iterations".

If you prefer Java, or want something that is probably more stable and does not rely on the incomplete Flink Python API, a Java implementation is available at quinngroup/flink-r1dl.

Use this script with the new-iterations-with-multiops branch of GEOFBOT/flink (some binaries available here), which contains the bulk iterations implementation and the tweaks needed to run it.

Usage

Run .../pyflink2.sh R1DL_Flink.py without extra arguments for usage information.

Each input file is a plain-text matrix: one row per line, with the numbers in each row (the columns) separated by whitespace.
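To make the input format concrete, here is a minimal sketch that writes a small sample file and reads it back using only the Python standard library. The helper names (parse_matrix, load_matrix) and the file name sample_input.txt are hypothetical and are not part of this repository. Skipping blank lines and rejecting rows of unequal length are assumptions made for this illustration, not confirmed behavior of R1DL_Flink.py.

```python
import os
import tempfile


def parse_matrix(text):
    """Parse rows of whitespace-separated numbers into a list of float lists."""
    rows = []
    for line in text.splitlines():
        fields = line.split()  # splits on any run of spaces or tabs
        if not fields:
            continue  # assumption: blank lines are ignored
        rows.append([float(x) for x in fields])
    # Assumption: every row must have the same number of columns.
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError("rows have differing column counts: %s" % sorted(widths))
    return rows


def load_matrix(path):
    """Read a matrix file from disk (hypothetical helper, not from the repo)."""
    with open(path) as f:
        return parse_matrix(f.read())


# A 2 x 3 sample: the first row uses spaces, the second uses tabs.
sample = "1.0 2.0 3.0\n4.0\t5.0\t6.0\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample_input.txt")
    with open(path, "w") as f:
        f.write(sample)
    matrix = load_matrix(path)

print(matrix)
```

Because line.split() with no argument treats any run of whitespace as a single separator, files that mix spaces and tabs parse the same way in this sketch.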