even nicer readme

SmokinCaterpillar · Feb 23, 2018 · d98880d
1 parent 4446f7f
commit d98880d
Showing 1 changed file with 7 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -4,8 +4,11 @@
 ![test](https://travis-ci.org/SmokinCaterpillar/TrufflePig.svg?branch=master)
 [![Coverage Status](https://coveralls.io/repos/github/SmokinCaterpillar/TrufflePig/badge.svg?branch=master)](https://coveralls.io/github/SmokinCaterpillar/TrufflePig?branch=master)
 
-This is a steemit curation bot based on Natural Language Processing and Machine Learning.
-The deployed bot can be found here: https://steemit.com/@trufflepig
+[Steemit](https://steemit.com) can be a tough place for minnows, as new users are often called. I had to learn this myself. Due to the sheer number of new posts published every minute, it is incredibly hard to stand out from the crowd. Even nice, well-researched, and well-crafted posts by minnows often get buried in the noise because their authors lack influential followers who could upvote them. Hence, their contributions get lost long before a whale could notice them and turn them into trending topics.
+
+However, this user-based curation also has its merits, of course. You can get lucky, and your posts gain the traction and recognition they deserve. Maybe there is a way to support the Steemit content curators such that high-quality content no longer goes unnoticed. In fact, I developed a curation bot called `TrufflePig` to do exactly this with the help of Natural Language Processing and Machine Learning. The deployed bot can be found here: https://steemit.com/@trufflepig
+
+#### The Concept
 
 The basic idea is to use well-paid posts of the past as training examples to teach a Machine Learning Regressor (MLR) what high-quality Steemit content looks like. In turn, the trained MLR can be used to identify high-quality posts that were missed by the curation community and received much less payment than they deserved. We call these posts *truffles*.
 
@@ -17,6 +20,8 @@ The general idea of this bot is the following:
 
 3. Next, we can compare the predicted payout with the actual payout of recent Steemit posts (between 24 and 48 hours old). If the Machine Learning model predicts a huge reward but the post was barely paid at all, we classify this contribution as an overlooked truffle.
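
The comparison in step 3 can be sketched as follows. This is only an illustration, not the bot's actual code: the `Post` class, the `find_truffles` function, and the thresholds `min_predicted` and `max_actual` are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical container for one recent post (24 to 48 hours old)."""
    permalink: str
    predicted_payout: float  # payout estimated by the trained regressor
    actual_payout: float     # payout the post has actually received so far


def find_truffles(posts, min_predicted=10.0, max_actual=1.0, top_n=10):
    """Return posts with a high predicted but low actual payout.

    The thresholds are assumptions for this sketch, not values from the bot.
    """
    candidates = [
        p for p in posts
        if p.predicted_payout >= min_predicted and p.actual_payout <= max_actual
    ]
    # Rank by the gap between what the model expected and what was paid.
    candidates.sort(
        key=lambda p: p.predicted_payout - p.actual_payout, reverse=True
    )
    return candidates[:top_n]


posts = [
    Post("well-paid-post", 40.0, 35.0),
    Post("hidden-gem", 25.0, 0.3),
    Post("small-gem", 12.0, 0.9),
    Post("average-post", 2.0, 0.5),
]
truffles = find_truffles(posts)
print([p.permalink for p in truffles])
```

Ranking by the payout gap is one plausible choice; ranking by the predicted payout alone would be equally simple.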
 
+### The Implementation
+
 The bot is trained on posts that are older than 7 days and, therefore, have already been paid. Features include style measures such as spelling errors, number of words, and readability scores. Moreover, a post's content is modelled as a [Latent Semantic Indexing](https://de.wikipedia.org/wiki/Latent_Semantic_Analysis) projection. The final regressor is simply a multi-output [Random Forest](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html).
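
To give a feel for the style measures mentioned above, here is a minimal sketch that computes a few of them with the Python standard library only. The function name `style_features` and the chosen measures are assumptions for illustration. Spelling errors and proper readability scores need a dictionary and syllable counts and are left out, as are the Latent Semantic Indexing projection and the Random Forest itself.

```python
import re


def style_features(text):
    """Compute a few simple style measures for a post body.

    A rough stand-in for the real features: words are runs of letters and
    apostrophes, sentences are split on '.', '!' and '?'.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    num_words = len(words)
    return {
        "num_words": num_words,
        "num_sentences": len(sentences),
        # Guard against division by zero for empty posts.
        "words_per_sentence": num_words / max(len(sentences), 1),
        "avg_word_length": (
            sum(len(w) for w in words) / num_words if num_words else 0.0
        ),
    }


features = style_features("Truffles are rare. Pigs find them!")
print(features)
```

In a full pipeline, such a dictionary would be turned into one row of the feature matrix, next to the topic projection of the post's text.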
 
 To scrape data from the Steemit blockchain and to post a daily top list of the truffles it found, the bot uses the official [Steem Python](https://github.com/steemit/steem-python) library.