Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First draft: injected buffer implementation #21

Closed
wants to merge 11 commits into from

Conversation

Dave-McCraw
Copy link

Hi,

Not really a pull request, I just wanted to see if you had any thoughts on this. First time I've ever tried to do anything with Python.

I'd like to switch away from the raspberrypi module to using the OEM gateway. However since my main issue with the raspberrypi solution is instability / data loss, I'd like to back the OEM gateway with a persistent buffer (either file based or MySQL) rather than in-memory.

As a first draft, I've just tried to refactor the existing behaviour exactly into a new class, and then inject that dependency at runtime. This should mean that it's relatively easy to drop new concrete buffer implementations into place.

cheers,

@lafrech
Copy link
Owner

lafrech commented Apr 3, 2014

Hi Dave.

I agree with the need. (It appears as a TODO in the code.)

I can't really test right now, as my setup is not running. If I can't get it to run soon as I'd like it to do, I'll just run a simple temperature only setup for the sake of the tests.

My objective was to do keep the data in memory and optionnaly synchronize it from time to time into a file, why not a DB ?

I expected to do that in the only buffer implementation, using a variable to make it optionnal. A new class is fine, especially if we intend to have several possibilities (nothing, file, DB).

You approach seems exhaustive. I mean the abstract class is populated with a lot of functions. We don't really need to provide so much right now since anyone subclassing it will be in position to modify the code anyway (not like distributing a compiled lib for external usage) but I like to do things this way myself as well.

I don't integrate it as is, as I have a few comments. I'd like to provide you line by line feedback. Sort of a peer review.

I'll try to come back to you asap.

Thank you for your interest and participation.

Please ring bell if I'm too long.

@Dave-McCraw
Copy link
Author

Thanks. My end goal is have a DB implementation with soft deletes / configurable purging.

I find that emoncms at the moment is too brittle, we change this engine to that engine and the various timestore implementations are not portable (unless you have a perfect running copy of emoncms and all relevant dependencies).

Since I only have 5000 data points per day I don't see any reason not to buffer several weeks of RF data on a rolling basis, then I'd have that long to handle any issues and potentially just replay all of the inputs by removing the soft delete flag.

It may seem a little redundant but I'm happy to think of emoncms as a post-processor being fed by a data source holding the raw inputs (the data isn't exactly going to challenge my HDD, and people running on SD cards can still use in-memory buffering).

@Dave-McCraw
Copy link
Author

OK, this [edit: the in memory buffer!] is now working nicely on my live Pi. It would be nice if the buffer implementation could be moved out to the configuration file, so I'm taking a look at Python factory options now.

@pb66
Copy link

pb66 commented Apr 3, 2014

This is a really cool idea the current in memory buffering works really well and extended persistent data buffering would be a great addition,
As a sidenote if you were to use the file option rather than sql then it could also make the "Rock-Solid" RO image virtually "bomb-proof" just by mounting a hdd RW and switching on this function (no additional sql to install)
Would I be right to assume this is all at a data content level and would do the same job regardless of how the data arrives?

@Dave-McCraw
Copy link
Author

Thanks. That's it working with the buffer implementation specified in the config file. It defaults to the InMemoryBuffer for backwards compatibility with existing config files (although installing this requires a restart, it can be done without reconfiguration)

@Dave-McCraw
Copy link
Author

@pb66 the in memory version just provides temporal decoupling of the RX transmissions and onward processing. (I say "just" as though this is not crucial functionality!)

With a persistent buffer it would be possible to upgrade the OEM gateway without losing any buffered transmissions, and expand this to other things like "replay all of last week's submissions" if you had some kind of issue with emoncms that wasn't picked up on for a while.

Personally, I like the MySQL implementation because I'd like to have a node traffic-light board telling me exactly when transmissions were last received. This isn't possible with emoncms (or at least, doesn't seem to be accurate, perhaps because I'm having redis issues).

There are other interesting queries I could do too, like finding nodes with bad RF signal by looking at the proportion of missed/irregular transmissions.

@Dave-McCraw
Copy link
Author

Hmm, I've got mysql working now but I'm not sure whether to push it or not, as I don't want to make the job of reviewing this any harder than necessary. I'll settle for pasting in some query / log output:

Detecting whether any nodes have reception issues (node 7 + 25) or are configured incorrectly (node 24 is sleeping for half as long):

mysql> select node, count(*) from data group by node;
+------+----------+
| node | count(*) |
+------+----------+
|    7 |        1 |
|   20 |       24 |
|   21 |       22 |
|   22 |       22 |
|   24 |       43 |
|   25 |        9 |
+------+----------+
6 rows in set (0.01 sec)

Knowing exactly when each node last called in:

mysql> select node, from_unixtime(time) as last_packet_time from data inner join (select max(id) as id from data group by node) last_update on last_update.id = data.id order by last_packet_time desc;
+------+---------------------+
| node | last_packet_time    |
+------+---------------------+
|   24 | 2014-04-03 17:23:39 |
|   25 | 2014-04-03 17:23:08 |
|   21 | 2014-04-03 17:23:04 |
|   20 | 2014-04-03 17:21:57 |
|   22 | 2014-04-03 17:21:54 |
|    7 | 2014-04-03 16:11:54 |
+------+---------------------+

Catching up on processing after a reboot:

2014-04-03 17:12:53,429 DEBUG MySQL buffer: 102 unprocessed items
2014-04-03 17:12:53,435 DEBUG MySQL buffer: retrieved stored input #27 from node 21 (21,29,0,97,87,106,122,122) at 1396541202
2014-04-03 17:12:53,638 DEBUG MySQL buffer: marking 27 as processed
2014-04-03 17:12:53,865 DEBUG MySQL buffer: 101 unprocessed items
2014-04-03 17:12:53,871 DEBUG MySQL buffer: retrieved stored input #28 from node 24 (24,107,0,756,24) at 1396541259
2014-04-03 17:12:54,026 DEBUG MySQL buffer: marking 28 as processed
2014-04-03 17:12:54,247 DEBUG MySQL buffer: 100 unprocessed items
2014-04-03 17:12:54,254 DEBUG MySQL buffer: retrieved stored input #29 from node 20 (20,28,619,161,185,143,167,171) at 1396541266
2014-04-03 17:12:54,464 DEBUG MySQL buffer: marking 29 as processed
2014-04-03 17:12:54,708 DEBUG MySQL buffer: 99 unprocessed items
2014-04-03 17:12:54,715 DEBUG MySQL buffer: retrieved stored input #30 from node 22 (22,25,618,125,0,0,578,560,0) at 1396541271
2014-04-03 17:12:55,045 DEBUG MySQL buffer: marking 30 as processed
2014-04-03 17:12:55,267 DEBUG MySQL buffer: 98 unprocessed items
2014-04-03 17:12:55,274 DEBUG MySQL buffer: retrieved stored input #31 from node 24 (24,106,0,745,24) at 1396541324
2014-04-03 17:12:55,556 DEBUG MySQL buffer: marking 31 as processed
2014-04-03 17:12:55,807 DEBUG MySQL buffer: 97 unprocessed items
2014-04-03 17:12:55,815 DEBUG MySQL buffer: retrieved stored input #32 from node 21 (21,29,0,97,86,107,122,122) at 1396541326
2014-04-03 17:12:56,015 DEBUG MySQL buffer: marking 32 as processed
2014-04-03 17:12:56,234 DEBUG MySQL buffer: 96 unprocessed items
2014-04-03 17:12:56,241 DEBUG MySQL buffer: retrieved stored input #33 from node 20 (20,28,615,163,186,143,167,171) at 1396541389
2014-04-03 17:12:56,407 DEBUG MySQL buffer: marking 33 as processed
2014-04-03 17:12:56,626 DEBUG MySQL buffer: 95 unprocessed items
2014-04-03 17:12:56,633 DEBUG MySQL buffer: retrieved stored input #34 from node 24 (24,106,0,746,24) at 1396541390
2014-04-03 17:12:56,795 DEBUG MySQL buffer: marking 34 as processed
2014-04-03 17:12:57,034 DEBUG MySQL buffer: 94 unprocessed items
2014-04-03 17:12:57,041 DEBUG MySQL buffer: retrieved stored input #35 from node 22 (22,25,615,126,0,0,527,540,0) at 1396541398
2014-04-03 17:12:57,203 DEBUG MySQL buffer: marking 35 as processed

@lafrech
Copy link
Owner

lafrech commented Apr 3, 2014

OK I reviewed your code and I like it.

oemgateway.conf.dist

Your forgot an API key, here.

I don't mind the active False -> True change. Users should review it anyway before using it. Or maybe we should set both to false. Anyway it probably needs better comments, a complete explanation. We'll deal with this later.

oemgatewaybuffer.py

Great.

self.buffer = getattr(ogbi, bufferImplType)() means the use must write the real class name in the config file. I'd like to allow him to write just 'memory', 'file', 'mysql', or anything. Maybe this could be done with a function in oemgatewaybufferimpl.py module.

We could also do that for the other listeners and buffers, but I don't feel such a need.

Anyway, no rush, let's keep this as it is.

oemgatewaybufferimpl.py

Fine. As I said, you went pretty far into the decomposition of the actions, to make it as generic as possible. You probably didn't have to go that far, but anyway, I like it this way as well.

Perhaps size and even isFull could be decorated @ property.

discardOldestItem is in fact discardOldestItems (plural) in case several items were added in the meantime. You probably didn't test that function, as size won't work, you need self.size(), or self.size if decorated @ property.

discardOldestItemIfFull -> discardOldestItemsIfFull (plural as well). Anyway, did we actually need to split those in two functions ? I don't mind. Just wondering.

storeItem: if we discard before appending, we might as well discard one more item, otherwise the buffer will overflow of one item. In fact, it will also overflow if we do it the other way around, but just for an instant. Maybe discardOldestItems could discard one more item ? (We don't mind here, but for the sake of it.)

removeLastRetrievedItem sounds a bit weird. Assuming the buffer is always used FIFO (strongly recommended if feeding emoncms, since some processing operations rely on sample order), we could rename functions from retrieveItem and removeLastRetrievedItem to retrieveOldestItem and removeOldestItems. Do you see a use case where we wouldn't want to retrieve the oldest item ?

No newline at end of file.

README.md

Needs updating.

General comments

I never liked OemGatewayBuffer as a name, since it was much more than a buffer, and actually not so much of a buffer. Since now we're moving the buffering out of it, it's even worse. Unfortunately, I couldn't come up with a better name. You probably feel the same.

What do you say if we rename OemGatewayBuffer with a better name, and use this one for the actual OemGatewayAbstractBuffer and OemGatewayInMemoryBuffer ? (Or we keep AbstractBuffer and InMemoryBuffer if you prefer.)

Do you have an idea for a better name ?

Of course this would break the config files. But if we're to do this at some point, the sooner the better.

I'm thinking renaming plus creating a new class with pretty much the same name in a file with the same name could be confusing, for instance in old forum threads.

Just asking for your opinion.

This makes me wonder about a better release process. Do you think I should tag current as v1 (or v0.1) and create a dev branch for what we're doing here, until we release a v2 (or 0.2) ?

@lafrech
Copy link
Owner

lafrech commented Apr 3, 2014

Oh, and maybe add some logging in there

def discardOldestItemIfFull(self):
if self.isFull():
self._log.warning("Buffer full, deleting oldest items")
self.discardOldestItem()

(Also needs logger declaration at init.)

@Dave-McCraw
Copy link
Author

OK, in the commit above I have:

  • altered configuration to fix apikey and add buffer size to buffer method
  • supported the use of a map to specify buffer method ('memory')
  • added logging support to buffer impl and now pass in the buffer name to make this more useful
  • revised the interface slightly (size is not necessary, really the client wants to know if there are items. I also changed 'removeLast' to 'discardLast' just to allow for the possibility of marking buffer entries in the DB: soft delete rather than actually removing/deleting them).
  • Improved method naming as suggested
  • algorithm change: size is checked first and if full, an item is dropped so that max size is never exceeded.

@Dave-McCraw
Copy link
Author

Readme updated.

WRT "discardLastRetrievedItem", my thinking was that it does not foreshadow implementation detail. It could be FIFO or FILO but the important thing from the perspective of the system is that the successfully submitted item is not kept in the buffer and submitted again.

@Dave-McCraw
Copy link
Author

Here is my suggestion for rebranding the architecture: the old buffers become 'dispatchers', which use the new buffers.

I've tested this and it works for me (including the full buffer emptying) but I won't commit any more in case you don't like this change, as we can then easily revert it.

It does invalidate all deployed config files so might be worth branding as 'v2' to make it clear.

Since this presumably only gets deployed on Raspberry Pis, I'd also like to debian package this repo and revise the emoncms install instructions to use this over the original module?

@Dave-McCraw
Copy link
Author

Again I reflect to myself that GitHub is all the win compared with passing around code comments and git diff links by email or static document 👍

@pb66
Copy link

pb66 commented Apr 4, 2014

Excellent work!
If oem_gateway is getting a major revamp, maybe rather than a v2 you should consider a name change to emon??? by adopting the "emon suite" theme it would avoid confusion and maybe promote adoption as the "oem_gateway" is not widely recognized as being emonCMS compatible.

emonGateway - closest to the existing name
emonGW - matches the TX TH CMS etc
emonHUB - probably the simplest descriptive name

Just a thought :-)

@Dave-McCraw
Copy link
Author

Names are very important. But in theory it doesn't just do emoncms, right?

It's a marshaller for (serial) data packets but it could easily be extended to read them from e.g. a DB table (or web request), and dispatch them over RFM12b or other HTTP endpoints than just emoncms.

I have to admit, I do find 'oem' slightly confusing (original equipment manufacturer?) but I'm not sure what exactly would be a better name....

@pb66
Copy link

pb66 commented Apr 4, 2014

Since most of the oem stuff is prefixed "emon" my perception of "emonGateway" was gateway to the "emon" branded OEM project rather than to emonCMS in particular, I think the "emon" part is quite essential to qualify it as a core part of the project rather than an optional compatible alternative.
Although emonGateway would of been my 1st choice because of the link to oem_gateway, if gateway isn't technically correct maybe emonHUB is a better choice as not only is hub a less technically specific term, but it also then suggests it is the centre piece to which all else connects., the bit you must always have regardless of emonCMS,

Will this version still work on a RO environment?

@lafrech
Copy link
Owner

lafrech commented Apr 4, 2014

Hi.

Thanks a lot for the rework !

Dispatcher is a great name.

oemgateway.conf.dist

Maybe add bufferSize = 1000 to both buffers.

oemgatewaybuffer.py

May I suggest making bufferMethodMap an attribute to class OemGatewayBuffer ? I think we could get access to it through the class, before instanciating a buffer. This way, adding a new buffer is just about modifying oemgatewaybuffer.py, the abstraction is complete.

Maybe I could do the same kind of map for the listeners and dispatchers (gotta get used to say dispatcher, now...), but I feel the class names can't be simplified, so I don't see what I'd use for names instead.

Great job, overall. Logging, doc update, everything.

I understand about discardLastRetrievedItem. I'm just thinking, who would do other than FIFO ? I used to... and it was a bug. But on the principle, you're right. I agree.

Yes, git is nice, and having something such as GitHub, Gitorious, or equivalent really simplifies the workflow.

I'll create a dev branch for this new stuff (along with @TrystanLea's #20) and probably tag current HEAD as v1. We'll have time to think of another name.

I think my idea was that OEM is OpenEnergyMonitor and emon is emonCMS but this is stupid: emonTX, emonTH, etc. I should have thought twice...

I agree with pb66

  • emonGateway is fine and close.
  • emonGW matches but may be too cryptical.
  • emonHUB sounds cool

In theory, it doesn't do just emonCMS. In practice, it has no processing, and many other web platforms expect processed data. I already thought we could add some processing to the gateway, since the RPi has processing power for that. But this is not the design choice of OEM.

@Dave-McCraw
Copy link
Author

@pb66, legacy config files result in a 1000x in-memory buffer for full back compatibility with existing installs (RO or otherwise). Obviously the MySQL option expands the requirements quite a bit - file persistance would still require an HDD. I haven't tried a flat file adapter since I was wondering if we could just use a flat-file DB engine with the same queries.

@lafrech lafrech mentioned this pull request Apr 4, 2014
@lafrech
Copy link
Owner

lafrech commented Apr 4, 2014

Let's have new name discussion here: #23

…ateway into invertBufferDependency

Conflicts:
	oemgatewaydispatchbuffer.py
@Dave-McCraw
Copy link
Author

This might be a little bit less tidy, but I want to make an initial commit while it's all working 🚀

To use this, you'll need to update your config file as appropriate to reach the following DB table:

CREATE TABLE `data` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `data` varchar(255) DEFAULT NULL,
  `time` int(10) unsigned DEFAULT '0',
  `node` int(11) NOT NULL,
  `processed` int(11) DEFAULT '0',
  `buffer` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) 

You can now run queries like

select node, from_unixtime(time) as last_packet_time from data 
inner join (select max(id) as id from data group by node) last_update on last_update.id = data.id
order by last_packet_time desc;

@lafrech lafrech added this to the v1.1 milestone Apr 5, 2014
@lafrech
Copy link
Owner

lafrech commented Apr 5, 2014

Hi Dave.

No problem inserting new stuff now that I created the dev branch.

I probably won't test the mySQL DB part. I mean unless it was needed, I won't bother creating the DB. Besides, I run on SD Card (32 Go) only, so I tend to limit write accesses.

I'll try to understand soft/hard delete another day. More generally, I don't have any high level comment. I didn't try to understand that yet. I'd like to merge cause I dont' want you stuck, and it won't hurt anyone anyway.

Here are a few comments on your latest commits:

I don't really understand this line, but this is debug anyway. I suppose you had a reason.

self._log.debug("MySQL buffer (%s): is full? %d > %d = %s" % (self._bufferName, self.size(), self._maximumEntriesInBuffer, self.size() >= self._maximumEntriesInBuffer))

(Edit: I understand what it does, I don't understand why you'd do that.)

BTW, I try to keep line length under 80 chars. This may seem anachronical. Everybody uses wide screens. I code in an xterm through SSH, and even this simple setup can display longer lines. "Fortunately", GitHub is here to recall us the pain of lines too long to be displayed. But don't bother reworking this specifically, we can do it anytime.

In retrieveItem (and in general), try to keep the exception catching restricted to the exception source. I mean except MySQLdb... should come just after conn.close(). Then use else: statement for the general case where it went fine.

I suppose return [-1,""] is some kind of error reporting mechanism. Why not. I guess the idea is to return an error code and a message. In fact, the python way of doing this may be to create our own exception class, and catch that in the caller. We could think of doing so.

Rework what you can and rebase on dev. I can always do a little rework afterwards if I'm unhappy with the coding style, the README, or anything. It's dev branch, so no rush.

You did quite a lot of rework, already, thank you for that.

(Or is there a easy way for me to pick your commits in my dev branch while your pull request is on master ? I mean easier than you just changing the target.)

@Dave-McCraw
Copy link
Author

Hi,

If you clone your repo and add mine as an extra remote you should be able to merge this into the dev branch no problem (I'm not sure how you use git, I use SSH mode but you can easily figure out the alternative paths if you don't):

git clone git@github.com:Jerome-github/oem_gateway.git
cd oem_gateway
git remote add downstream git@github.com:Dave-McCraw/oem_gateway.git
git fetch downstream
git checkout dev
git merge remote/invertBufferDependency

I will have a look at your other comments. It was tricky to get something working in Python at all, so there are a few rough edges around the error handling etc. as you note.

@pb66
Copy link

pb66 commented Apr 6, 2014

@dave - I tried this install and it's all cloned ok, created/edited .conf & I've installed original dependencies plus python-mySQLdb, but it stalls at

pi@b2d-hdd2 ~/oem_gateway $ ./oemgateway
2014-04-06 11:16:28,895 INFO Logging level set to DEBUG
2014-04-06 11:16:28,900 INFO Opening gateway...
2014-04-06 11:16:28,904 INFO Creating dispatcher emoncms_local
2014-04-06 11:16:28,910 INFO Set up dispatcher 'emoncms_local' (buffer: memory)
2014-04-06 11:16:28,914 INFO Creating dispatcher emoncms_remote
Could not start OemGateway: init() got an unexpected keyword argument 'dbHost'

What should dbHost be set to ? tried ip & hostname with no joy

@lafrech
Copy link
Owner

lafrech commented Apr 6, 2014

Dave, I think you did quite well with Python. I'm using SSH as well. For pull request management, I expected to use the Githug interface, but I can also add your repo to my local clone and do as you say, it's logical. Don't bother too much with the rework. We can do that in several steps. Or I can send a pull request to your dev branch...

pb66, the problem seems to be that when creating the second dispatcher, the wrong class is called, and the init method doesn't know what to do with the dbHost argument.

Did you actually checkout Dave's last commit (what is the first commit appearing in git log ?) ? Could it be the dispatcher or buffer class name that is not entered correctly in your config file ?

@Dave-McCraw
Copy link
Author

It should only look for that if you are creating a MySQL buffer, have you
got an in-memory one but left the config for the DB in place?

On Sun, Apr 6, 2014 at 11:24 AM, pb66 notifications@github.com wrote:

@dave https://github.com/Dave - I tried this install and it's all
cloned ok, created/edited .conf & I've installed original dependencies plus
python-mySQLdb, but it stalls at

pi@b2d-hdd2 ~/oem_gateway $ ./oemgateway
2014-04-06 11:16:28,895 INFO Logging level set to DEBUG
2014-04-06 11:16:28,900 INFO Opening gateway...
2014-04-06 11:16:28,904 INFO Creating dispatcher emoncms_local
2014-04-06 11:16:28,910 INFO Set up dispatcher 'emoncms_local' (buffer:
memory)
2014-04-06 11:16:28,914 INFO Creating dispatcher emoncms_remote
Could not start OemGateway: init() got an unexpected keyword argument
'dbHost'

What should dbHost be set to ? tried ip & hostname with no joy

Reply to this email directly or view it on GitHubhttps://github.com//pull/21#issuecomment-39664013
.

@lafrech
Copy link
Owner

lafrech commented Apr 6, 2014

I will have a look at your other comments. It was tricky to get something working in Python at all, so there are a few rough edges around the error handling etc. as you note.

Regarding the error handling, let's keep the custom exception design for later. I was just trying to understand the [-1,""] return value.

@pb66
Copy link

pb66 commented Apr 6, 2014

Oh yeah, sorry you're right, I had changed database to memory prior to installing python-mySQLdb and forgotten to change back. I have tired a couple of packets and it seems to work now. I did notice that the dispatchers are still created despite "active =" set to false and therefore if/when creation fails the error stops further progress, should an unused/unconfigured dispatcher be able to cause a fail ?

Later today I will configure a setup that I can "switch off" the emoncms server for an hour or 2 to test the catch up.

@lafrech
Copy link
Owner

lafrech commented Apr 6, 2014

Yes, inactive dispatchers are created. They just dont do anything. If a user wants them not to be created, he can remove them from the config file (or comment them).

At the time, I wondered if, when setting active to false, the data waiting to be sent should be trashed or kept for later. I don't remember which option I chose. It is not that important.

So, to answer your question, this is not a bug, rather a design choice.

It is all about what this option is for. It was meant to allow me to neutralize a dispatcher with just an easy change in the config file. It is useful for the debug. In real life, I don't see any use case.

@lafrech lafrech removed this from the v1.1 milestone Apr 6, 2014
@pb66
Copy link

pb66 commented Apr 6, 2014

That's cool if it's meant to be that way. Do you think it may be advantageous for it to report the problem dispatcher, but then continue with the others for a more robust installation ? Although I caused the error during install on this occasion, would a real world fault stop any data being collated. eg power-failure causes a corrupt db or similar and then on reboot the program halts at this point and other servers do not get their data nor does anything get "buffered".

It would be handy to know (although not important right now) if the data is still retained when set to false as if it does this effectively is a "pause" setting which may be a useful feature, if taking a server down for planned maint/upgrade etc it can be paused and reinstated under control after the server is back on-line rather than just relying on a successful or failed delivery.

@Dave-McCraw
Copy link
Author

@pb66 - because the config is reloaded dynamically, it should really continue to operate as long as there is at least one valid buffer. I don't think that's the case though, I think it will kill the daemon (and empty the in-memory cache in the process) if there are any errors.

I haven't quite got my head around the way Python does error handling. It was harder than I thought to implement the MySQL buffer because there's no compile-time type checking etc. so you actually have to run through each possibility to rule out sudden death of the daemon.

That's something we could add as a separate branch / pull to the dev branch though.

I think for the time being we've given @jerome-github plenty of work to do! (#20)...

@lafrech
Copy link
Owner

lafrech commented Apr 7, 2014

Yes, don't worry about the error handling right now. We'll think of it in a further step.

@lafrech lafrech added this to the v1.1 milestone Apr 7, 2014
@lafrech
Copy link
Owner

lafrech commented Apr 8, 2014

I merged into dev, then did a bit of rework, mostly coding style related. Not sure my style is better, but I'd like to keep coherence. And I like 4 space tabs (which is also the choice on emoncms). I hope I didn't break anything... I didn't test the mySQL buffer.

The mySQL buffer is included, but will probably be removed for the first release, then put back.

I haven't been thinking about it, but the error handling would probably be improved by using exceptions.

I'm closing the pull request, but you can still comment here on the subject. Or open a new issue.

Again, it is not fully tested (at least by myself) but i didn't want to stall you. It's dev branch, anyway.

@lafrech lafrech closed this Apr 8, 2014
@lafrech
Copy link
Owner

lafrech commented Apr 9, 2014

@Dave-McCraw: I'm trying to reintroduce bulk send mode. The objective is to send several data sets at a time:

http://domain.tld/emoncms/input/bulk.json?apikey=12345&data=[[-10,10,1806],[-5,10,1806],[0,10,1806]]+&offset=0

This requires emoncms >= 8.0 as it depends on a recent change made by Trystan.

I just modified the code (08c500a) to use this syntax, but I still send one data set (one sample at a time), which is not really interesting...

I'd need to reproduce what I had before already: a6c75ef, 2cddea5.

To do so we'll probably want to modify the buffer API.

@lafrech
Copy link
Owner

lafrech commented Apr 9, 2014

(Sorry, pawned by Ctrl+Enter again.)

We'd need to be able to retrieve several elements at a time, and delete them after sending.

Again, I feel awkward about this DiscardLastRetrieved.

Perhaps the buffer class could provide the following

  • retrieveItems(nb_of_items_to_retrieve)
  • retrieveItem: retrieveItems(1) or retrieveItems(1)[0], to be defined
  • discardLastRetrieveItems

and inside the buffer, there'd be an attribute memorizing which data was about to be sent / or which data to keep if correctly sent. Kinda like what I did here: d184d15.

I can think of a way to do that with the InMemory buffer, perhaps even with the SQL one, but I'd rather ask for your opinion (and anyone else's).

@Dave-McCraw
Copy link
Author

Will the buffer always be asked to supply N items and then either all N succeed or all N fail?

If so, I don't think it's too hard a problem, but it would be a hassle to individually update the buffer (if, say, you ask for 100 items then send in two batches of 50, one of them fails).

Another thought: what happens if one data item is corrupt and rejected by emoncms? I'm guessing it will endlessly attempt to resend that item if so, rather than giving up after a certain number of events (not specific to bulk submissions).

Feel free to rename the prototype methods and I can alter the MySQL buffer to match, since that's the one I am testing with my home installation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants