Describes the architecture of persona and stores its configuration files.
bash install.sh
The following systemd services will be generated: persona-offline, persona-realtime, persona-flume and persona-backend.
You can then manage them like any other systemd service.
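As a minimal sketch of managing the generated services, the snippet below prints the systemctl commands for each one instead of running them. It assumes that install.sh registers the units under exactly the names listed above and that managing them requires sudo; the helper persona_commands is hypothetical and not part of the project.

```shell
#!/bin/sh
# Dry run: print the systemctl command for each generated service.
# Assumption: install.sh registers units under exactly these names.
SERVICES="persona-offline persona-realtime persona-flume persona-backend"

# Hypothetical helper: $1 is a systemctl action such as start, stop or status.
persona_commands() {
  action="$1"
  for svc in $SERVICES; do
    echo "sudo systemctl $action $svc"
  done
}

persona_commands start
persona_commands status
```

To actually run a command rather than print it, drop the echo, or pipe the printed lines into sh once they look right.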
user_tag_value, moc_post, moc_reply and moc_comment come from mooc MySql.
wda_mooc may come from mooc HDFS.
Spark is used for offline data processing.
Spark Streaming is used for real-time data processing.
Redis has been chosen for data caching.
How to choose among MySql, HBase and Redis?
- Redis: the fastest, but data is easily lost.
- HBase: data is not lost. Is its deployment easy?
- MySql: too slow.
- How to arrange the persona-ml module?