Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InfluxDB::LineProtocol #1

Closed
domm opened this issue Jul 30, 2015 · 3 comments
Closed

InfluxDB::LineProtocol #1

domm opened this issue Jul 30, 2015 · 3 comments

Comments

@domm
Copy link

domm commented Jul 30, 2015

Hey, I've just seen this rather fresh module, and I've also recently started using InfluxDB. You might want to consider using my InfluxDB::LineProtocol module instead of your _to_line function. A quick glance at our code seems to indicate that you do not escape various character etc. And while InfluxDB::LineProtocol currently does not implement all of the fine details of the line protocol, we might collaborate to make it better (instead of wasting time on duplicate code).

https://github.com/domm/InfluxDB-LineProtocol
https://metacpan.org/pod/InfluxDB::LineProtocol

Greetings,
domm

@ajgb
Copy link
Owner

ajgb commented Jul 30, 2015

Hi,

I'm glad that you've noticed it.

I've seen your module, but it seems that it's implementation is very specific to the company that sponsored it, eg. default field named value or input data precision hardcoded to nanoseconds.

On the other hand my module (as you've noticed) is not making any conversion nor validation of the data, putting the responsibility onto the users of it. As you are aware the InfluxDB is very sensitive to the format of the data (especially to the first write), and I've taken the position that the users know the best how they do want the data to be stored.

That's why my _to_line function is extremely generic, simple and fast, and what is most important does exactly what the user asked for - creates the correctly formatted input to the /write endpoint out of the hash reference.

And that was based on the fact that in vast majority of the cases the data that users will store in InfluxDB would not contain any special characters needed escaping nor quoting.

In other cases my module would happily accept the output of your data2line function:

$cv = AE::cv;
    $db->write(
        database => 'mydb',
        precision => 'n',
        data => data2line('measurement', 42, { tag => 'foo'} ),
        on_success => $cv,
        on_error => sub {
                $cv->croak("Failed to write data: @_");
        }
);
$cv->recv;

I'll be glad to collaborate with you on extending the functionality of your module, as it seems your aim is to provide functions that automate the conversion of some complex data into line protocol expected by InfluxDB.

Thank you,
Alex

@domm
Copy link
Author

domm commented Jul 31, 2015

I've seen your module, but it seems that it's implementation is very specific to the company
that sponsored it, eg. default field named value or input data precision hardcoded to
nanoseconds.

Actually no, we just used the defaults suggested by the InfluxDB docs.

My goal was to make it very easy to write stats, without needing to know all the details of the LineProtocol (eg escaping, ordering of tags, what time precision to use, ..)

and I've taken the position that the users know the best how they do want the data to be stored
..
creates the correctly formatted input to the /write endpoint out of the hash reference.

which all asumes that the users know exactly how InfluxDB works internally. Which is a fair point, but I try to abstract the annoying part away for the user, so that there in only one source for bugs (my conversion code), and not unlimited sources (everywhere a user calls ->write)

And that was based on the fact that in vast majority of the cases the data that users will
store in InfluxDB would not contain any special characters needed escaping nor quoting.

Famous last words :-)

But in the end you are right that your module does not need to use InfluxDB::LineProtocol. as users can just call data2line themselves (as shown in your code sample)

You could just point out in the docs that when using an hash as input to write(), users MUST escape the values properly; or could use InfluxDB::LineProtocol to convert a hash to a line, and pass the line on to you're write(). That way your module stays lean and fast ( though IMO fast-and-wrongish as opposed to slowish-and-correct :-) ), but users of it still get educated on how to easily convert a hash to a proper line.

If you want I can prepare a doc patch.

And I still would love some feedback on my data2line() (missing features, bugs, pot. speed improvements..)

Greetings,
domm

@ajgb
Copy link
Owner

ajgb commented Aug 1, 2015

The fact that there is no quoting nor escaping is already documented.

The use case for using asynchronous client library and dedicated time-series database is the speed - when lots of datapoints are being written, wasting CPU cycles for unnecessary work or transferring redundant data over network is not really acceptable.

But I'm also sure that there are also users (maybe even more) who would prefer hand-holding approach your module provides.

When I have a chance I'll look into your module in more details and make some suggestions if I find anything that needs improving without breaking the spirit of your module.

Thank you,
Alex

@ajgb ajgb closed this as completed Aug 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants