Database plugin for data persistence #227

Open
mcicolella opened this Issue Oct 23, 2016 · 24 comments

@mcicolella
Contributor

One of the most important features is data persistence, which lets us analyze users' behaviours and sensor data, create charts, and so on.
So we should collect and store events and commands to maintain a clear history of our home.
We already have a first prototype of a database plugin (called Harvester https://github.com/freedomotic/freedomotic/tree/master/plugins/devices/harvester / http://freedomotic.com/content/plugins/harvester) based on a relational model.
Currently it stores only object-related events.
But events and commands can have different structures (different properties), so a NoSQL approach could be a better fit, and it would also scale better. Maybe Cassandra or another solution should be adopted.
Collected data could be analyzed with machine learning algorithms and tools to give Freedomotic intelligence.
Any ideas? Please post your comments.

@mcicolella
Contributor

@P3trur0 What's your opinion?

@P3trur0
Contributor
P3trur0 commented Oct 25, 2016

@mcicolella, AFAIK it seems an ideal scenario for EventSourcing.

It could make sense to store the events in something like a journal log to have a complete dataset both for analytics and data replaying.

Excuse any terseness and typos, I am on my mobile now!

@mcicolella
Contributor

@P3trur0 In this approach are "executed commands" considered as system "events"?
For example, if a user turns a light on or off, we should have two "events":

  1. user X executed command "turn on light Y"
  2. light Y changed its behavior "power" to "on"

Is that right?

@P3trur0
Contributor
P3trur0 commented Oct 26, 2016

@mcicolella good day!

Yes indeed: with a small mental shift, I could consider "commands" as events themselves (could a command be defined as an event that asks for another event to happen?), so technically speaking both should be modeled as DTOs.

This should allow us to persist both instance types in an event-sourced permanent store.
So, if they are something like:

TurnTheLight as the command
LightTurned as event

Both seem immutable to me, in the sense that once executed they will never be updated. So an append-only approach seems reasonable for both of them.

The main advantage I see here is the potential replaying of the history of the Freedomotic instance.
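As a rough illustration of the idea (not code from the plugin), here is a minimal Python sketch of immutable command and event records stored in an append-only journal and replayed to rebuild state. The field names (`user`, `light`, `power`), the `Journal` class and the `replay` function are all hypothetical; only the names `TurnTheLight` and `LightTurned` come from the comment above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TurnTheLight:
    """Command: a request to change a light (fields are assumptions)."""
    user: str
    light: str
    power: str


@dataclass(frozen=True)
class LightTurned:
    """Event: the light actually changed state (fields are assumptions)."""
    light: str
    power: str


class Journal:
    """Append-only log: records can be added and read, never updated."""

    def __init__(self) -> None:
        self._records: List[object] = []

    def append(self, record: object) -> None:
        self._records.append(record)

    def records(self) -> Tuple[object, ...]:
        # Return an immutable snapshot so callers cannot edit the history.
        return tuple(self._records)


def replay(journal: Journal) -> Dict[str, str]:
    """Rebuild the current power state of every light from the events."""
    state: Dict[str, str] = {}
    for record in journal.records():
        if isinstance(record, LightTurned):
            state[record.light] = record.power
    return state


journal = Journal()
journal.append(TurnTheLight(user="X", light="Y", power="on"))
journal.append(LightTurned(light="Y", power="on"))
journal.append(TurnTheLight(user="X", light="Y", power="off"))
journal.append(LightTurned(light="Y", power="off"))
```

Calling `replay(journal)` walks the whole history in order, so the last `LightTurned` record for each light decides its state. Commands stay in the journal for analytics even though this particular replay only looks at events.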

I hope that's clear (excuse any oversimplification here)!

@mcicolella
Contributor

@P3trur0 Any idea about a possible plugin architecture to manage this?

@P3trur0
Contributor
P3trur0 commented Oct 28, 2016

@mcicolella I think your proposal about Cassandra should be OK. I'd prefer it over MongoDB just because it seems more robust and scalable to me.
However, I don't have much experience using either of them.
If you like I could give this task a try, but next weekend is my first available slot for this.

Is that fine?

@mcicolella
Contributor

@P3trur0 Of course! If you want and when you have time.
We are thinking about Cassandra to use it with Apache Spark for data analysis but we can consider different solutions.
In this phase it's very important to have a storage system to preserve Freedomotic "history".
It's the base to create other advanced services.
So every idea is welcome!

@mcicolella
Contributor

It's very important to track the user who performs an action (e.g. Bob turns on the TV or the living room light at 8:00 pm) so the system can learn his habits and give some recommendations. That's simple if a software client is used, but how do we track the press of a physical button?

@P3trur0
Contributor
P3trur0 commented Nov 10, 2016

@mcicolella do the physical elements have a UUID so far?
If so, the persistence operation could also store the identifier of the resource that triggers the event.

@mcicolella
Contributor

@P3trur0 I'm considering a generic scenario where the resources don't have a physical ID. But my concern is related to user's ID. How can I identify the person who has turned on the TV or pushed a "button"?

@P3trur0
Contributor
P3trur0 commented Nov 10, 2016

@mcicolella so you mean a physical event that occurred outside of Freedomotic, on a resource that is nevertheless registered in the system?

@mcicolella
Contributor

@P3trur0 Yes. For example, we have the following use case (http://freedomotic.sednet.it/casi-duso/gestione-appartamento/) to manage a flat via a relay board. The user can control his home with Freedomotic or with the physical buttons to turn the lights on and off. In the second case Freedomotic detects the event but can't know who pressed the button, so the system can't learn the user's habits as I described above. Of course, if the user uses the GUI or the web client, he is logged in and can be identified.
Hope it's clear.

@P3trur0
Contributor
P3trur0 commented Nov 10, 2016

@mcicolella Yup, I see the point.

I guess that, since this scenario happens outside the GUI, we should assume that the external event belongs to a general abstract entity (e.g. House) rather than to a specific user.

Later, in the learning mechanism, we'd give a different weight to the events that occurred outside the client.
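To make this concrete, here is a small Python sketch, assuming a hypothetical `RecordedEvent` record and a made-up weight of 0.5 for events without a known user. None of these names or values come from Freedomotic; they only illustrate attributing anonymous events to an abstract "House" actor.

```python
from dataclasses import dataclass
from typing import Optional

HOUSE = "House"  # hypothetical abstract actor for events with no known user


@dataclass(frozen=True)
class RecordedEvent:
    description: str
    actor: str
    weight: float  # how much the learning step should trust this record


def record_event(
    description: str,
    user: Optional[str] = None,
    external_weight: float = 0.5,  # assumed value, to be tuned
) -> RecordedEvent:
    """Attribute the event to the logged-in user, or to the House if unknown."""
    if user is None:
        # Physical button press: Freedomotic saw the event but not the person.
        return RecordedEvent(description, HOUSE, external_weight)
    return RecordedEvent(description, user, 1.0)


from_gui = record_event("light Y power on", user="Bob")
from_button = record_event("light Y power off")
```

Here `from_gui` is attributed to Bob with full weight, while `from_button` is attributed to the House with the lower weight, so a later learning step can treat the two differently.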

@P3trur0
Contributor
P3trur0 commented Nov 20, 2016

@mcicolella I am still working on this plugin.

Here you can see a first draft of its structure to interact with Cassandra.

I've set up an integration test environment using a Docker image of Cassandra 3.9.

So far the plugin is able to start up a cluster and define a Freedomotic keyspace if it does not exist.
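For readers unfamiliar with Cassandra, "define a keyspace if it does not exist" typically comes down to one CQL statement. The Python sketch below only builds that statement as a string. The keyspace name `freedomotic`, the `SimpleStrategy` replication class and the replication factor of 1 are assumptions suited to a single-node test setup, not values taken from the plugin.

```python
import re


def create_keyspace_cql(name: str = "freedomotic", replication_factor: int = 1) -> str:
    """Build an idempotent CREATE KEYSPACE statement (settings are assumptions)."""
    # Keyspace names cannot be bound as query parameters, so validate them.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]{0,47}", name):
        raise ValueError(f"invalid keyspace name: {name!r}")
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        "WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}}"
    )


statement = create_keyspace_cql()
```

Because of `IF NOT EXISTS`, running the statement on every plugin startup is safe: it creates the keyspace the first time and does nothing afterwards.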

Next steps will be the persistence of both events and commands.

Meanwhile, could you please give me some feedback about my approach? How does it look to you?

Thank you!

@mcicolella
Contributor

Hi @P3trur0
I'll check it more carefully.
For now just two notes:

  • the package name we adopted for all plugins is "com.freedomotic" not "it.freedomotic"
  • I saw that the Cassandra instance parameters are set in the pom.xml. Is that a temporary solution because you haven't yet created the plugin with its manifest file?

Thanks for your great work!

@P3trur0
Contributor
P3trur0 commented Nov 21, 2016

Hi @mcicolella,

thanks!

  • the package names have been fixed (they'll come in the next commit ;));
  • the parameters you see in the pom.xml are the ones used for the integration test container (however, they are not yet managed).

For the plugin I am considering using an external configuration file like persistence/persistence.xml. Is that fine?

Thank you!

@mcicolella
Contributor

The plugin requires its configuration file persistence-manifest.xml (if "persistence" is its definitive name).
You can include the parameters for the Cassandra instance there, or add a specific file as in the Harvester plugin:
https://github.com/freedomotic/freedomotic/tree/master/plugins/devices/harvester/src/main/resources

@P3trur0
Contributor
P3trur0 commented Nov 21, 2016

@mcicolella if you like "persistence" as the name, I'd stick with it :) Otherwise I am open to other proposals, of course!

@amenak77

Hi,
"persistence" is perfect; I'm with Mauro: great work!! @P3trur0 obviously adding persistence to Freedomotic is the first step (and with it there will be a great amount of data to work with), but what will be, in your opinion, the next step to add "intelligence" to the framework? It would take the whole project to a higher level. Have a nice evening, Alberto

@P3trur0
Contributor
P3trur0 commented Nov 21, 2016

@amenak77 nowadays machine learning and data analysis are surely buzzwords, so we could use the persisted events and commands to study users' behaviour and try to predict or suggest future interactions with the framework.
Thank you too for keeping me involved in this cool project!

@mcicolella
Contributor

@P3trur0 I'm adding your fork link so anyone can follow your work and contribute with ideas and suggestions
https://github.com/P3trur0/freedomotic/tree/issue_227/plugins/devices/persistence

@P3trur0
Contributor
P3trur0 commented Dec 2, 2016

@mcicolella yes for sure! Every comment is definitely welcome!

I'll have spare time to spend on freedomotic again from Thursday the 8th :)

@P3trur0
Contributor
P3trur0 commented Dec 8, 2016

Hi guys, I've finally prepared a first mergeable version of a Persistence plugin to save serialized events in Cassandra.

See PR #243

@amenak77
amenak77 commented Dec 8, 2016