
Data generation #1

Closed · at15 opened this issue Oct 22, 2016 · 5 comments

at15 (Member) commented Oct 22, 2016

Time series database writes are different from other NoSQL stores. A write typically consists of (see the sketch after this list):

  • key: a string describing the source, e.g. cpu.idle
  • timestamp: when the event happened
  • value: a numeric value, integer or float
  • tags: k => v pairs adding attributes to the data
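A minimal Go sketch of such a data point (the field names and types are my illustration, not taken from the repo):

```go
// Point is one time series write, mirroring the fields above.
type Point struct {
	Name  string            // source key, e.g. "cpu.idle"
	Time  int64             // Unix timestamp (nanoseconds)
	Value float64           // numeric value, integer or float
	Tags  map[string]string // k => v attributes
}
```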

Examples

Since generating complex data costs a lot of resources, it is wise to save the data to disk. While influx-comparison uses the bulk format of the target database, I think it's better to use a general serialization format and store the metadata in a separate file (you can even do some dirty tricks on the metadata to change the load without regenerating the data).

Serialization

at15 added this to the Investigation milestone Oct 22, 2016
at15 mentioned this issue Nov 12, 2016
at15 modified the milestones: Implementation, Investigation Nov 12, 2016
at15 added working and removed discussion labels Nov 12, 2016
at15 changed the title from Workload Write to Data generation Nov 12, 2016
at15 (Member, Author) commented Nov 12, 2016

Since protobuf is widely used, and the benchmark results show Cap'n Proto is not faster than protobuf, I will go with protobuf: https://github.com/gogo/protobuf

benchmark                                   iter           time/iter      bytes alloc               allocs
---------                                   ----           ---------      -----------               ------
BenchmarkCapNProtoMarshal-8                 2000000        578 ns/op       56 B/op        2 allocs/op
BenchmarkCapNProtoUnmarshal-8               3000000        515 ns/op      200 B/op        6 allocs/op
BenchmarkCapNProto2Marshal-8                1000000       1427 ns/op      244 B/op        3 allocs/op
BenchmarkCapNProto2Unmarshal-8              1000000       1325 ns/op      320 B/op        6 allocs/op
BenchmarkGoprotobufMarshal-8                2000000        746 ns/op      312 B/op        4 allocs/op
BenchmarkGoprotobufUnmarshal-8              2000000        978 ns/op      432 B/op        9 allocs/op
BenchmarkGogoprotobufMarshal-8             10000000        211 ns/op       64 B/op        1 allocs/op
BenchmarkGogoprotobufUnmarshal-8            5000000        289 ns/op       96 B/op        3 allocs/op
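One practical note if points are stored as protobuf on disk: protobuf messages are not self-delimiting, so writing many of them to one file needs framing. A minimal sketch of the usual uvarint length-prefix scheme (the helper names are mine, not from the repo):

```go
package gen

import (
	"bufio"
	"encoding/binary"
	"io"
)

// writeFrame prefixes one marshaled message with its uvarint length.
func writeFrame(w *bufio.Writer, msg []byte) error {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(buf[:], uint64(len(msg)))
	if _, err := w.Write(buf[:n]); err != nil {
		return err
	}
	_, err := w.Write(msg)
	return err
}

// readFrame reads one length-prefixed message back.
func readFrame(r *bufio.Reader) ([]byte, error) {
	size, err := binary.ReadUvarint(r)
	if err != nil {
		return nil, err
	}
	msg := make([]byte, size)
	_, err = io.ReadFull(r, msg)
	return msg, err
}
```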

czheo (Collaborator) commented Nov 12, 2016

Don't you think using serialization could be over-engineering? A TSV/CSV file may be enough for this kind of simple job.

at15 (Member, Author) commented Nov 12, 2016

@czheo Kind of, but the csv library in Go is harder to use in some sense (see the sketch below). I will try CSV first and see if it is easy to use; optimization can be left for later when there is a need.
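For reference, a sketch of what the stdlib encoding/csv pushes you towards: every field has to be stringified by hand, and the map-valued tags need some ad-hoc flattening (the column layout here is made up for illustration):

```go
package main

import (
	"encoding/csv"
	"os"
	"strconv"
)

func main() {
	w := csv.NewWriter(os.Stdout)
	defer w.Flush()
	// Columns: name, timestamp, value, flattened tags (all strings).
	_ = w.Write([]string{
		"machine",
		strconv.FormatInt(869077270450000000, 10),
		strconv.FormatFloat(1, 'f', -1, 64),
		"cpu_core=3,host=default-2,os=ubuntu16.04",
	})
}
```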

at15 (Member, Author) commented Nov 23, 2016

This is what we have after #13:

{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-3","os":"ubuntu16.04"}}
{"v":1,"t":869077260450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-3","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}
{"v":1,"t":869077270450000000,"name":"machine","tag":{"cpu_core":"3","host":"default-2","os":"ubuntu16.04"}}

at15 removed the working label Nov 29, 2016
at15 (Member, Author) commented Dec 26, 2016

For JSON serialization and deserialization, using Encoder and Decoder might be better when working with reader/writer interfaces, since they reuse encoding state through an internal pool.
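A minimal sketch of that streaming pattern, reusing the assumed Point struct from above: json.NewDecoder reads directly from any io.Reader and consumes the newline-delimited values one by one, while json.NewEncoder writes to any io.Writer, so neither side buffers whole files:

```go
package main

import (
	"encoding/json"
	"io"
	"os"
	"strings"
)

// Point is the assumed shape of one generated line (see above).
type Point struct {
	V    float64           `json:"v"`
	T    int64             `json:"t"`
	Name string            `json:"name"`
	Tag  map[string]string `json:"tag"`
}

func main() {
	in := `{"v":1,"t":869077270450000000,"name":"machine","tag":{"host":"default-3"}}
{"v":1,"t":869077260450000000,"name":"machine","tag":{"host":"default-2"}}`

	dec := json.NewDecoder(strings.NewReader(in))
	enc := json.NewEncoder(os.Stdout)
	for {
		var p Point
		if err := dec.Decode(&p); err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
		enc.Encode(&p) // Encode appends a trailing newline
	}
}
```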

at15 closed this as completed in d3f0e11 Mar 24, 2017