
Data generation #1

Closed · at15 opened this issue Oct 22, 2016 · 5 comments

at15 (Member) commented Oct 22, 2016

Time series database writes are different from other NoSQL stores. A write typically consists of (see the sketch after this list):

  • key: a string describing the source, e.g. cpu.idle
  • timestamp: when the event happened
  • value: a numeric value, integer or float
  • tags: k => v pairs adding attributes to the data
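A minimal Go sketch of such a data point (the field names and types are my illustration, not taken from the repo):

```go
// Point is one time series write, mirroring the fields above.
type Point struct {
	Name  string            // source key, e.g. "cpu.idle"
	Time  int64             // Unix timestamp (nanoseconds)
	Value float64           // numeric value, integer or float
	Tags  map[string]string // k => v attributes
}
```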

Examples

Since generating complex data costs a lot of resources, it is wise to save the data to disk. While influx-comparison uses the bulk format of the target database, I think it's better to use a general serialization format and store the metadata in a separate file (you can even do some dirty tricks on the metadata to change the load without regenerating the data).

Serialization

at15 added this to the Investigation milestone Oct 22, 2016
at15 mentioned this issue Nov 12, 2016
at15 modified the milestones: Implementation, Investigation Nov 12, 2016
at15 added working and removed discussion labels Nov 12, 2016
at15 changed the title from Workload Write to Data generation Nov 12, 2016
at15 (Member, Author) commented Nov 12, 2016

Since protobuf is widely used, and the benchmark results show Cap'n Proto is not faster than protobuf, I will go with protobuf: https://github.com/gogo/protobuf

benchmark                                   iter           time/iter      bytes alloc               allocs
---------                                   ----           ---------      -----------               ------
BenchmarkCapNProtoMarshal-8                 2000000        578 ns/op       56 B/op        2 allocs/op
BenchmarkCapNProtoUnmarshal-8               3000000        515 ns/op      200 B/op        6 allocs/op
BenchmarkCapNProto2Marshal-8                1000000       1427 ns/op      244 B/op        3 allocs/op
BenchmarkCapNProto2Unmarshal-8              1000000       1325 ns/op      320 B/op        6 allocs/op
BenchmarkGoprotobufMarshal-8                2000000        746 ns/op      312 B/op        4 allocs/op
BenchmarkGoprotobufUnmarshal-8              2000000        978 ns/op      432 B/op        9 allocs/op
BenchmarkGogoprotobufMarshal-8             10000000        211 ns/op       64 B/op        1 allocs/op
BenchmarkGogoprotobufUnmarshal-8            5000000        289 ns/op       96 B/op        3 allocs/op
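One practical note if points are stored as protobuf on disk: protobuf messages are not self-delimiting, so writing many of them to one file needs framing. A minimal sketch of the usual uvarint length-prefix scheme (the helper names are mine, not from the repo):

```go
package gen

import (
	"bufio"
	"encoding/binary"
	"io"
)

// writeFrame prefixes one marshaled message with its uvarint length.
func writeFrame(w *bufio.Writer, msg []byte) error {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(buf[:], uint64(len(msg)))
	if _, err := w.Write(buf[:n]); err != nil {
		return err
	}
	_, err := w.Write(msg)
	return err
}

// readFrame reads one length-prefixed message back.
func readFrame(r *bufio.Reader) ([]byte, error) {
	size, err := binary.ReadUvarint(r)
	if err != nil {
		return nil, err
	}
	msg := make([]byte, size)
	_, err = io.ReadFull(r, msg)
	return msg, err
}
```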

czheo (Collaborator) commented Nov 12, 2016

Don't you think using serialization could be over-engineering? A TSV/CSV file may be enough for this kind of simple job.

at15 (Member, Author) commented Nov 12, 2016

@czheo Kind of, but the csv library in Go is harder to use in some sense (see the sketch below). I will try CSV first and see if it is easy to use; optimization can be left for later when there is a need.
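For reference, a sketch of what the stdlib encoding/csv pushes you towards: every field has to be stringified by hand, and the map-valued tags need some ad-hoc flattening (the column layout here is made up for illustration):

```go
package main

import (
	"encoding/csv"
	"os"
	"strconv"
)

func main() {
	w := csv.NewWriter(os.Stdout)
	defer w.Flush()
	// Columns: name, timestamp, value, flattened tags (all strings).
	_ = w.Write([]string{
		"machine",
		strconv.FormatInt(869077270450000000, 10),
		strconv.FormatFloat(1, 'f', -1, 64),
		"cpu_core=3,host=default-2,os=ubuntu16.04",
	})
}
```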

at15 (Member, Author) commented Nov 23, 2016

This is what we have after #13:

{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-3","os":"ubuntu16.04"}}
{"v":1,"t":869077260450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-3","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}

at15 removed the working label Nov 29, 2016
at15 (Member, Author) commented Dec 26, 2016

For JSON serialization and deserialization, using Encoder and Decoder might be better when working with reader/writer interfaces, since they reuse encoding state through an internal pool.
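A minimal sketch of that streaming pattern, reusing the assumed Point struct from above: json.NewDecoder reads directly from any io.Reader and consumes the newline-delimited values one by one, while json.NewEncoder writes to any io.Writer, so neither side buffers whole files:

```go
package main

import (
	"encoding/json"
	"io"
	"os"
	"strings"
)

// Point is the assumed shape of one generated line (see above).
type Point struct {
	V    float64           `json:"v"`
	T    int64             `json:"t"`
	Name string            `json:"name"`
	Tag  map[string]string `json:"tag"`
}

func main() {
	in := `{"v":1,"t":869077270450000000,"name":"machine","tag":{"host":"default-3"}}
{"v":1,"t":869077260450000000,"name":"machine","tag":{"host":"default-2"}}`

	dec := json.NewDecoder(strings.NewReader(in))
	enc := json.NewEncoder(os.Stdout)
	for {
		var p Point
		if err := dec.Decode(&p); err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
		enc.Encode(&p) // Encode appends a trailing newline
	}
}
```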

at15 closed this as completed in d3f0e11 Mar 24, 2017