fundata1 README

Karmic Social Capital Benchmark

We provide textual documents in the Markdown format. While being just plain .txt documents, they look best when rendered with markdown-aware tools -- e.g. on github this README.markdown can be seen at http://github.com/alexy/fundata1.

FunData is a functional data shootout. The current shootout, the very first one, is for processing real-world Twitter data.

The results are in!

fundata1-results.markdown

The timings were obtained on the server described in

fundata1-server.markdown

Some interesting lessons are being gathered in

fundata1-lessons.markdown

The data format, as distributed, is described in

fundata1-replier-graph-format.markdown

Getting the data is described in the aptly named

fundata1-getting-the-data.markdown

The question we're solving is computing Khrabrov and Cybenko's Karmic Social Capital (KSC) for all users communicating via Twitter as present in the data. The mathematical definition is in the file

khrabrov-mind-economy-eccs2010.pdf

A textual description of KSC is in

fundata1-khrabrov-karmic-social-capital.markdown

This git repository is in fact an umbrella for three submodules, which hold the three currently available reference functional implementations of the KSC algorithm. Each of them is also hosted on github, listed here in the order of appearance of the target language:

Each of those languages' repos contains further notes on the choices and possible improvements available in their respective implementations. Since JVM languages lack an obvious efficient general-purpose serialization, we relax the rules for them a bit.

The machine is a SunFire 4240 server with 64 GB of RAM and 8 CPUs.

The purpose of having separate repos by language is to facilitate forking and improvement of their implementations, potentially beating other languages. You're welcome to supply an implementation of the KSC conforming to the rules in other languages, not necessarily functional.

Some observations on these implementations are posted at functional.tv.

Join the Fundata Google Group to discuss the shootout and provide alternative implementations.

NOTE: I am submitting my Ph.D. in data mining to UPenn/Dartmouth and am looking for a cool job in the Valley/Seattle, hence my bandwidth in improving my own implementations will be somewhat limited for a while into 2011. If you have a self-contained implementation installable on CentOS 5 or Gentoo Prefix, or from source with clear steps, I'd be happy to run it, in the Wide Finder spirit. If you want to speed up the existing implementations, see the TODO.
