Duplicate information of cluster #9
Comments
Hi @zhangxin511, you should keep in mind the proposed architecture layout. As you can see, there are:
Working on both nodes. In this layout, the db backend names in influxdb-srelay.conf and the influxdb names in syncflux.conf should be the same. Take the rwha-sample.influxdb-srelay.conf as an example.
influxdb-srelay.conf (on myinfluxdb01_server)
...
[[influxdb]]
name = "myinfluxdb01"
location = "http://myinfluxdb01_server:8086/"
timeout = "10s"
[[influxdb]]
name = "myinfluxdb02"
location = "http://myinfluxdb02_server:8086/"
timeout = "10s"
[[influxcluster]]
# name = cluster id for route configs and logs
name = "ha_cluster"
# members = array of influxdb backends
members = ["myinfluxdb01","myinfluxdb02"]
log-file = "ha_cluster.log"
log-level = "info"
type = "HA"
query-router-endpoint-api = ["http://myinfluxdb01_server:4090/api/queryactive","http://myinfluxdb02_server:4090/api/queryactive"]
...
influxdb-srelay.conf (on myinfluxdb02_server)
...
[[influxdb]]
name = "myinfluxdb01"
location = "http://myinfluxdb01_server:8086/"
timeout = "10s"
[[influxdb]]
name = "myinfluxdb02"
location = "http://myinfluxdb02_server:8086/"
timeout = "10s"
[[influxcluster]]
# name = cluster id for route configs and logs
name = "ha_cluster"
# members = array of influxdb backends
members = ["myinfluxdb02","myinfluxdb01"]
log-file = "ha_cluster.log"
log-level = "info"
type = "HA"
query-router-endpoint-api = ["http://myinfluxdb02_server:4090/api/queryactive","http://myinfluxdb01_server:4090/api/queryactive"]
...
Only the order of members and query-router-endpoint-api changes, so that each node queries its own syncflux first.
syncflux.conf (on myinfluxdb01_server)
master-db = "myinfluxdb01"
slave-db = "myinfluxdb02"
[[influxdb]]
release = "1x"
name = "myinfluxdb01"
location = "http://myinfluxdb01_server:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
[[influxdb]]
release = "1x"
name = "myinfluxdb02"
location = "http://myinfluxdb02_server:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
syncflux.conf (on myinfluxdb02_server). Only the master and slave values are swapped.
master-db = "myinfluxdb02"
slave-db = "myinfluxdb01"
[[influxdb]]
release = "1x"
name = "myinfluxdb01"
location = "http://myinfluxdb01_server:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
[[influxdb]]
release = "1x"
name = "myinfluxdb02"
location = "http://myinfluxdb02_server:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
About your questions:
I hope you can see how smart-relay and syncflux can work together to build a better HA solution when we cannot run an InfluxDB Enterprise cluster. Any other questions? |
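The per-node differences described above (each node lists itself first in members and query-router-endpoint-api, and is its own master-db) can be summarised in a short Python sketch. This is only an illustration for a two-node layout: the helper name `per_node_settings` and the `NODES` mapping are hypothetical, while the backend names, port 4090, and the /api/queryactive path are taken from the configs above.

```python
# Hypothetical helper: derive the order-sensitive settings for one node
# of a two-node layout. Not part of influxdb-srelay or syncflux.

# Backend name -> server host, as used in the sample configs above.
NODES = {
    "myinfluxdb01": "myinfluxdb01_server",
    "myinfluxdb02": "myinfluxdb02_server",
}


def per_node_settings(local, nodes, port=4090):
    """Return the settings that differ per node, with `local` listed first."""
    if local not in nodes:
        raise ValueError(f"unknown backend name: {local}")
    if len(nodes) != 2:
        raise ValueError("this sketch only covers the two-node layout")
    others = [name for name in nodes if name != local]
    members = [local] + others
    endpoints = [
        f"http://{nodes[name]}:{port}/api/queryactive" for name in members
    ]
    return {
        "members": members,                      # influxdb-srelay.conf
        "query-router-endpoint-api": endpoints,  # influxdb-srelay.conf
        "master-db": local,                      # syncflux.conf
        "slave-db": others[0],                   # syncflux.conf
    }


if __name__ == "__main__":
    for node in NODES:
        print(node, per_node_settings(node, NODES))
```

Everything else in the two files (the [[influxdb]] blocks with their names and locations) stays identical on both nodes.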
I appreciate your detailed info. I tried your approach. Since I am not sure where the HA load balancing comes from, I set up only one srelay instance but kept everything else as you suggested. I still can't get the data to sync when one node is down.
Here is my setup: docker-compose.yml
The configuration file and folder structure are attached. I am sorry to bother you like this, but could you take a look and let me know what went wrong? Much appreciated! |
Hi @zhangxin511 I will check your config ASAP |
Hi @zhangxin511, the first thing I've detected is in your syncflux.toml config: db names should be the same in both engines' configs, srelay and syncflux.
influxdb-srelay.conf
[[influxdb]]
name = "myinfluxdb01"
location = "http://influx-a:8086/"
timeout = "10s"
[[influxdb]]
name = "myinfluxdb02"
location = "http://influx-b:8086/"
timeout = "10s"
syncflux-a.toml
master-db = "myinfluxdb01"
slave-db = "myinfluxdb02"
[[influxdb]]
release = "1x"
name = "myinfluxdb01"
location = "http://influx-a:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
[[influxdb]]
release = "1x"
name = "myinfluxdb02"
location = "http://influx-b:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
syncflux-b.toml
master-db = "myinfluxdb02"
slave-db = "myinfluxdb01"
[[influxdb]]
release = "1x"
name = "myinfluxdb01"
location = "http://influx-a:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
[[influxdb]]
release = "1x"
name = "myinfluxdb02"
location = "http://influx-b:8086/"
admin-user = "admin"
admin-passwd = "admin"
timeout = "10s"
Could you fix these config files and test again, please? |
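The rule above (the same db names in the srelay and syncflux configs) can be checked mechanically. The following Python sketch assumes the flat key = "value" style shown in this thread and uses a simplified line scanner rather than a full TOML parser. The function names `scan_config` and `check_db_names` are hypothetical and not part of either tool.

```python
import re

# Matches: key = "value"   (only quoted string values, as in the configs above)
_KEY_VALUE = re.compile(r'^\s*([A-Za-z0-9_-]+)\s*=\s*"([^"]*)"')
# Matches: [table] or [[array-of-tables]]
_TABLE = re.compile(r'^\s*\[\[?([A-Za-z0-9_.-]+)\]\]?\s*$')


def scan_config(text):
    """Collect [[influxdb]] names and top-level string keys from a config.

    Simplified scanner: it drops everything after '#', so it would
    mis-handle a '#' inside a quoted value.
    """
    section = None
    names = []
    top = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]
        table = _TABLE.match(line)
        if table:
            section = table.group(1)
            continue
        pair = _KEY_VALUE.match(line)
        if not pair:
            continue
        key, value = pair.groups()
        if section is None:
            top[key] = value
        elif section == "influxdb" and key == "name":
            names.append(value)
    return names, top


def check_db_names(srelay_text, syncflux_text):
    """Return a list of human-readable problems (empty if names agree)."""
    srelay_names, _ = scan_config(srelay_text)
    sync_names, top = scan_config(syncflux_text)
    problems = []
    if set(srelay_names) != set(sync_names):
        problems.append(
            f"backend names differ: srelay={sorted(set(srelay_names))} "
            f"syncflux={sorted(set(sync_names))}"
        )
    for key in ("master-db", "slave-db"):
        if top.get(key) not in sync_names:
            problems.append(
                f"{key} = {top.get(key)!r} is not a declared [[influxdb]] name"
            )
    return problems
```

Calling `check_db_names` with the text of influxdb-srelay.conf and syncflux-a.toml should return an empty list once the names match.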
@toni-moreno I was not able to make recovery work by using your suggestions alone. But after changing the docker-compose file from using I understand this is still in the early development phase, so here is a suggestion based on my issue: it looks like |
Sorry, I spoke too early. It looks like the data recovery does not ALWAYS work in my case. I have had only one good replication so far, and all the others failed. I do see the
Only once was there a good recovery, which gave this output:
It looks like this block of code is not always executed, and I have no idea why: https://github.com/toni-moreno/syncflux/blob/6627a8281cd93305f9315b6b6be325f4cdbd0dbb/pkg/agent/client.go#L594-L615 |
Hi @zhangxin511 my workmate @sbengo will review your case ASAP |
Thank you @toni-moreno for your continuous help! Let me know if you need anything else, @sbengo |
@sbengo @toni-moreno I figured out partially why my data was not recovered:
|
Hi @zhangxin511, thanks for the info and sorry for the late response! When Syncflux is initiated it gets info about the available databases and the rps and measurements attached to them (a.k.a. the schema), and currently it never refreshes it (it is read only on init). As I can see in your logs, in the failing case the schema seems to be empty, so it won't iterate over the measurements (in the linked function).
Bad case:
Working case:
Review
I think it's related to schema creation (if there was no data, the schema would be empty: only the db and rp were stored). So:
@toni-moreno opened an issue (I think it was before your comment!) asking for a schema reload: toni-moreno/syncflux#16. We have discussed it and we think we will add this feature in the next few days, so the schema will always be reloaded before the data sync process. About the timing issues/feature, we will keep discussing, but we currently don't support those cases.
Thanks, |
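To make the stale-schema behaviour described above concrete, here is a toy Python model. It is a hypothetical illustration only and not syncflux's actual Go code: the class `ToySyncAgent`, its dict-based "databases", and the `reload_before_sync` flag are all assumptions made for the sketch. It shows why a schema read once at init, while the db had no measurements yet, leads to a sync pass that copies nothing, and how reloading the schema before each sync changes that.

```python
class ToySyncAgent:
    """Toy model of an agent that copies missing points from master to slave.

    Databases are modelled as nested dicts:
        {db_name: {measurement: {timestamp: value}}}
    """

    def __init__(self, master, slave, reload_before_sync=False):
        self.master = master
        self.slave = slave
        self.reload_before_sync = reload_before_sync
        # Schema is captured once at start-up, as described in the comment above.
        self.schema = self._read_schema()

    def _read_schema(self):
        # db -> list of measurement names known at this moment
        return {db: sorted(meas) for db, meas in self.master.items()}

    def sync(self):
        """Copy points the slave is missing; return how many were copied."""
        if self.reload_before_sync:
            self.schema = self._read_schema()
        copied = 0
        for db, measurements in self.schema.items():
            # With an empty measurement list this inner loop never runs.
            for name in measurements:
                source = self.master[db][name]
                target = self.slave.setdefault(db, {}).setdefault(name, {})
                for ts, value in source.items():
                    if ts not in target:
                        target[ts] = value
                        copied += 1
        return copied


if __name__ == "__main__":
    master = {"mydb": {}}  # db exists, but no data yet at agent start-up
    slave = {"mydb": {}}
    stale = ToySyncAgent(master, slave)
    fresh = ToySyncAgent(master, slave, reload_before_sync=True)
    master["mydb"]["cpu"] = {1: 0.5, 2: 0.7}  # data arrives after init
    print("stale schema copied:", stale.sync())
    print("reloaded schema copied:", fresh.sync())
```

In this model the agent started against an empty db never sees the `cpu` measurement, which mirrors the "bad case" logs where the schema is empty.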
@sbengo Thank you for your detailed response. It would be great to backfill data based on when the data was inserted instead of purely on the time tag, because a lot of InfluxDB data is inserted by scheduled jobs rather than in real time. Lastly, I think syncflux takes time to start; there is a noticeable delay. I hope you can take a look at that. With that said, I have a full srelay setup working as you specified, so I will close this issue now. I appreciate all your help, @toni-moreno and @sbengo |
The influxcluster in rwha-sample.influxdb-srelay.conf seems to duplicate the info in query-router-endpoint-api?
For SyncFlux, we can run it as an HA cluster, with 1 master and 1 slave.
My question is:
[[influxdb]], but the example shows we can also define two endpoints in query-router-endpoint-api, which is the cluster endpoint of SyncFlux. My understanding is that we can only have one SyncFlux cluster with the existing two dbs.
hamonitor of SyncFlux works. I tested writing data when one node is down (either master or slave), but I don't see that the data was synced.