
Questions about migrating data from cluster to cluster #4624

Closed
rickchen12 opened this issue Jul 13, 2023 · 19 comments
Assignees: hagen1778
Labels: question, vmctl

@rickchen12

Is your question request related to a specific component?

vmctl

Describe the question in detail

I want to migrate monitoring data from one cluster to another cluster. After checking the docs, I found this part:
[screenshot of the cluster-to-cluster migration section from the vmctl docs]

I have some questions:

  1. There are 12 old instances and 6 new instances. If I use this command to migrate the data, and per the docs it will migrate the data for all tenants, does that mean I only need to execute this command once, on a single instance?
    ./vmctl vm-native --vm-native-src-addr=http://127.0.0.1:8481/ \
      --vm-native-dst-addr=http://127.0.0.1:8480/ \
      --vm-native-filter-match='{__name__="vm_app_uptime_seconds"}' \
      --vm-native-filter-time-start='2023-02-01T00:00:00Z' \
      --vm-native-step-interval=day \
      --vm-intercluster

Or do I need to use individual IPs, like:

./vmctl vm-native --vm-native-src-addr=http://source1:8481/ \
  --vm-native-dst-addr=http://desA:8480/ \

./vmctl vm-native --vm-native-src-addr=http://source2:8481/ \
  --vm-native-dst-addr=http://desB:8480/ \

  2. Because there are only 6 instances in the new cluster but 12 instances in the old cluster, and I used the --replicationFactor flag on the old cluster, which as far as I know backs up the data on 2 or 3 different vmstorage instances: if I use this command, since the new cluster has fewer instances than the old one, will it cause a duplicate data issue?

Kindly give me some advice, thx a lot.


@rickchen12 rickchen12 added the question The question issue label Jul 13, 2023
hagen1778 added a commit that referenced this issue Jul 14, 2023
Addresses #4624

Signed-off-by: hagen1778 <roman@victoriametrics.com>
@hagen1778
Collaborator

if I use this command to migrate the data, and per the docs it will migrate the data for all tenants, does that mean I only need to execute this command once, on a single instance?

Yes, vmctl will migrate all the data. As the source you specify the vmselect address. vmselect has access to all vmstorage nodes, hence it will return data from all of them.
Instead of an exact vmselect address you can also specify a load balancer, if you have one. In this case, the export load will be spread across the available vmselects.

because there are only 6 instances in the new cluster but 12 instances in the old cluster, and I used the --replicationFactor flag on the old cluster, which will back up the data on 2 or 3 different vmstorage instances. If I use this command, since the new cluster has fewer instances than the old one, will it cause a duplicate data issue?

It will migrate all the data from the old cluster, including replicated data. You're right, it will produce duplicates that need to be removed. To keep the same replication factor (if needed), configure vminsert on the destination with the corresponding -replicationFactor=N. See #4633
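To illustrate (a minimal sketch only; the binary paths and vmstorage addresses below are placeholders, not taken from this issue), the destination side would then run vminsert with the desired replication factor and let it re-replicate the imported stream:

  # hypothetical destination-cluster vminsert: every imported sample
  # is replicated to 2 vmstorage nodes by vminsert itself
  /path/to/vminsert-prod \
    -replicationFactor=2 \
    -storageNode=vmstorage-1:8400,vmstorage-2:8400,vmstorage-3:8400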

@hagen1778 hagen1778 self-assigned this Jul 14, 2023
hagen1778 added a commit that referenced this issue Jul 14, 2023
Addresses #4624

Signed-off-by: hagen1778 <roman@victoriametrics.com>
valyala pushed a commit that referenced this issue Jul 14, 2023
Addresses #4624

Signed-off-by: hagen1778 <roman@victoriametrics.com>
@rickchen12
Author

rickchen12 commented Jul 17, 2023

Hi @hagen1778, thanks a lot! I have another question. The data volume is very large (more than 3 months, over 20 TB), and from my tests the migration seems very slow. For example:
[screenshot of the vmctl progress output]
This was tested against a single instance, migrating only 2-3 hours of data, and it looks like it needs more than 20-30 minutes to finish. I haven't tried multiple instances behind a load balancer yet.

Could you give me some advice on it? How can I make it faster?

And after it's done, it pops up this error msg:
2023/07/17 06:51:38 migration failed: cannot get metrics from source http://10.99.41.104:8481: series request failed: unexpected response code 422: {"status":"error","errorType":"422","error":"cannot obtain values for label "name": cannot fetch label values from vmstorage nodes: cannot get label values from vmstorage 10.99.41.72:8401: cannot execute funcName="labelValues_v5" on vmstorage "10.99.41.72:8401": the number of matching timeseries exceeds 300000; either narrow down the search or increase -search.max* command-line flag values at vmselect; see https://docs.victoriametrics.com/#resource-usage-limits"}
----- Is it due to "the number of matching timeseries exceeds 300000"? How to resolve it?

And vmselect seems to have been killed automatically due to the error msg below:
{"ts":"2023-07-06T14:55:45.659+0800","level":"info","caller":"VictoriaMetrics/lib/httpserver/httpserver.go:97","msg":"pprof handlers are exposed at http://127.0.0.1:8481/debug/pprof/"}
{"ts":"2023-07-17T16:04:02.874+0800","level":"info","caller":"VictoriaMetrics/app/vmselect/querystats/querystats.go:87","msg":"enabled query stats tracking at /api/v1/status/top_queries with -search.queryStats.lastQueriesCount=20000, -search.queryStats.minQueryDuration=1ms"}
{"ts":"2023-07-17T16:06:25.642+0800","level":"panic","caller":"VictoriaMetrics/lib/fs/fs_unix.go:50","msg":"FATAL: cannot determine free disk space on "/tmp/searchResults": no such file or directory"}

how to avoid it?

And another quick question: if I just copy the vmstorage data files from the old cluster to the new cluster, like:
scp source_IP1@source_path/* des_ip1@des_path
does that work? Just wondering. LOL

thanks again!!

@hagen1778
Collaborator

hagen1778 commented Jul 20, 2023

could you give me some advice on it? how to make it faster?

See https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics

Migration speed can be adjusted via --vm-concurrency cmd-line flag, which controls the number of concurrent workers busy with processing. Please note, that each worker can load up to a single vCPU core on VictoriaMetrics. So try to set it according to allocated CPU resources of your VictoriaMetrics destination installation.

I'd also suggest changing the step interval from hours to days or weeks.
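For illustration only (addresses and values below are placeholders, not taken from this thread), a run combining both suggestions could look like this:

  ./vmctl vm-native \
    --vm-native-src-addr=http://<src-vmselect>:8481/ \
    --vm-native-dst-addr=http://<dst-vminsert>:8480/ \
    --vm-native-filter-time-start='2023-02-01T00:00:00Z' \
    --vm-native-step-interval=day \
    --vm-concurrency=8 \
    --vm-intercluster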


-----is it due to this "the number of matching timeseries exceeds 300000"?. how to resolve?.

Follow the recommendation from the error message and bump -search.maxSeries to a bigger value:

Migrating big volumes of data may result in reaching the safety limits on src side. Please verify that -search.maxExportDuration and -search.maxExportSeries were set with proper values for src. If hitting the limits, follow the recommendations here. If hitting the number of matching timeseries exceeds... error, adjust filters to match less time series or update -search.maxSeries command-line flag on vmselect/vmsingle;
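As a sketch (the limit value is arbitrary and the storage addresses are placeholders), this would mean restarting the source vmselect with a higher series limit:

  # hypothetical: raise the per-query series limit so vmctl export queries aren't rejected
  /path/to/vmselect-prod \
    -search.maxSeries=1000000 \
    -storageNode=<vmstorage-1>:8401,<vmstorage-2>:8401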


{"ts":"2023-07-17T16:06:25.642+0800","level":"panic","caller":"VictoriaMetrics/lib/fs/fs_unix.go:50","msg":"FATAL: cannot determine free disk space on "/tmp/searchResults": no such file or directory"}
how to avoid it?

vmselect persists temporary results from search queries on disk if there is not enough memory to fit them. It is recommended to configure vmselect with a PVC with a couple of GBs of space.
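A minimal sketch of that (assuming -cacheDataPath is the vmselect flag used for its on-disk cache and temporary search results; the path is a placeholder):

  mkdir -p /var/lib/vmselect-cache   # volume with a few GB of free space
  /path/to/vmselect-prod \
    -cacheDataPath=/var/lib/vmselect-cache \
    -storageNode=<vmstorage-1>:8401,<vmstorage-2>:8401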

and another quick question: if I just copy the vmstorage data files from the old cluster to the new cluster, like:
scp source_IP1@source_path/* des_ip1@des_path
does that work? just wondering. LOL

It does. See https://docs.victoriametrics.com/#from-victoriametrics

@rickchen12
Author

Thx again!! Your answers helped me a lot!!!

and another quick question: if I just copy the vmstorage data files from the old cluster to the new cluster, like:
scp source_IP1@source_path/* des_ip1@des_path
does that work? just wondering. LOL

It does. See https://docs.victoriametrics.com/#from-victoriametrics
----- Cool, but the old cluster has 12 instances and the new cluster only has 6. If I copy 2 random instances to 1 instance, does it work?

@hagen1778
Collaborator

if I copy 2 random instances to 1 instance, does it work?

No, you can't merge data from two or more storage nodes by simple copying. It can only be done with a 1-to-1 match.
You need to re-import data if you need to re-shard it to a different number of shards (vmstorages).

@rickchen12
Author

Hi @hagen1778, sorry to bother you. I already added the --vm-concurrency flag, but it seems like it doesn't work, as below:

Before adding the flag:
Command:
./vmctl-prod vm-native --vm-native-src-addr=http://10.99.41.104:8481/ --vm-native-dst-addr=http://10.99.41.179:8480/ --vm-native-filter-time-start='2023-07-24T03:00:00Z' --vm-native-step-interval=hour --vm-native-step-interval=day --vm-intercluster

log:
2023/07/24 06:20:41 Import finished! 2023/07/24 06:20:41 VictoriaMetrics importer stats: time spent while importing: 30m40.329177197s; total bytes: 171.0 GB; bytes/s: 92.9 MB; requests: 7369; requests retries: 0; 2023/07/24 06:20:41 Total time: 30m40.330029847s

After adding the flag:
Command:

./vmctl-prod vm-native --vm-native-src-addr=http://10.99.41.104:8481/ \
  --vm-native-dst-addr=http://10.99.41.179:8480/ \
  --vm-native-filter-time-start='2023-07-24T03:00:00Z' \
  --vm-native-step-interval=hour --vm-native-step-interval=day --vm-concurrency=16 --vm-intercluster

log:
2023/07/24 07:04:00 VictoriaMetrics importer stats: time spent while importing: 36m48.165801639s; total bytes: 187.1 GB; bytes/s: 84.7 MB; requests: 7369; requests retries: 0; 2023/07/24 07:04:00 Total time: 36m48.166906476s

---- Any suggestions?

@hagen1778
Collaborator

----any suggestions?.

If increasing concurrency doesn't help, you likely have a bottleneck elsewhere:

  1. Source database is maxed out by resources
  2. Destination database is maxed out by resources
  3. Network between vmctl and src/dst is maxed out

90MB/s migration speed is pretty decent. Are you sure the network between components isn't maxed out?

@rickchen12
Author

rickchen12 commented Jul 24, 2023

If increasing concurrency doesn't help, you likely have a bottleneck elsewhere:

Source database is maxed out by resources
Destination database is maxed out by resources
Network between vmctl and src/dst is maxed out
90MB/s migration speed is pretty decent. Are you sure the network between components isn't maxed out?
---- I see, maybe it's due to the network limit, thx again.

And there are many error msgs when I tried to migrate about 1 month of data.

2023/07/24 07:54:22 Initing import process from "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native" to "http://10.99.41.179:8480/insert/1:0/prometheus/api/v1/import/native" with filter 
        filter: match[]={__name__!=""}
        start: 2023-07-01T03:00:00Z for tenant 1:0
2023/07/24 07:54:22 Exploring metrics...
2023/07/24 07:54:25 Found 799103 metrics to import
2023/07/24 07:54:25 Selected time range will be split into 24 ranges according to "day" step. Requests to make: 19178472
Requests to make for tenant 1:0: 62100 / 19178472 [█▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒] 0.32%2023-07-24T07:54:35.378Z        error   VictoriaMetrics/app/vmctl/backoff/backoff.go:56 got error: failed to init export pipe: export request failed: unexpected error when performing request: Get "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native?end=2023-07-14T03%3A00%3A00Z&match%5B%5D=%7B__name__%3D%22AWSSNS_01ca_5310_8ffd_78db40f3c722_92_verifyCode_feedBack_total%22%7D&start=2023-07-13T03%3A00%3A00Z": dial tcp 10.99.41.104:8481: connect: cannot assign requested address on attempt: 1; will retry in 1s
2023-07-24T07:54:35.378Z        error   VictoriaMetrics/app/vmctl/backoff/backoff.go:56 got error: failed to init export pipe: export request failed: unexpected error when performing request: Get "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native?end=2023-07-07T03%3A00%3A00Z&match%5B%5D=%7B__name__%3D%22AWSSNS_01ca_5310_8ffd_78db40f3c722_92_verifyCode_feedBack_total%22%7D&start=2023-07-06T03%3A00%3A00Z": dial tcp 10.99.41.104:8481: connect: cannot assign requested address on attempt: 1; will retry in 1s
Requests to make for tenant 1:0: 63278 / 19178472 [█▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒] 0.33%2023-07-24T07:54:36.181Z        error   VictoriaMetrics/app/vmctl/backoff/backoff.go:56 got error: failed to init export pipe: export request failed: unexpected error when performing request: Get "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native?end=2023-07-17T03%3A00%3A00Z&match%5B%5D=%7B__name__%3D%22AWSSNS_01d2_56cc_b0f9_00ac56290a5b_7_verifyCode_feedBack_total%22%7D&start=2023-07-16T03%3A00%3A00Z": dial tcp 10.99.41.104:8481: connect: cannot assign requested address on attempt: 1; will retry in 1s
2023-07-24T07:54:36.182Z        error   VictoriaMetrics/app/vmctl/backoff/backoff.go:56 got error: failed to init export pipe: export request failed: unexpected error when performing request: Get "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native?end=2023-07-18T03%3A00%3A00Z&match%5B%5D=%7B__name__%3D%22AWSSNS_01d2_56cc_b0f9_00ac56290a5b_7_verifyCode_feedBack_total%22%7D&start=2023-07-17T03%3A00%3A00Z": dial tcp 10.99.41.104:8481: connect: cannot assign requested address on attempt: 1; will retry in 1s
2023-07-24T07:54:36.184Z        error   VictoriaMetrics/app/vmctl/backoff/backoff.go:56 got error: failed to init export pipe: export request failed: unexpected error when performing request: Get "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native?end=2023-07-19T03%3A00%3A00Z&match%5B%5D=%7B__name__%3D%22AWSSNS_01d2_56cc_b0f9_00ac56290a5b_7_verifyCode_feedBack_total%22%7D&start=2023-07-18T03%3A00%3A00Z": dial tcp 10.99.41.104:8481: connect: cannot assign requested address on attempt: 1; will retry in 1s
2023-07-24T07:54:36.184Z        error   VictoriaMetrics/app/vmctl/backoff/backoff.go:56 got error: failed to init export pipe: export request failed: unexpected error when performing request: Get "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native?end=2023-07-20T03%3A00%3A00Z&match%5B%5D=%7B__name__%3D%22AWSSNS_01d2_56cc_b0f9_00ac56290a5b_7_verifyCode_feedBack_total%22%7D&start=2023-07-19T03%3A00%3A00Z": dial tcp 10.99.41.104:8481: connect: cannot assign requested address on attempt: 1; will retry in 1s

What do these error msgs mean?

--- Sorry, I have many questions, because what I'm doing now is a migration of huge data (over 20 TB), and there are some gaps between the old and new cluster (12 instances -> 6 instances, and the CPU/memory/disk are different too). This is the first time I've done this, so I don't have any experience. T-T

@hagen1778
Collaborator

2023-07-24T07:54:36.182Z error VictoriaMetrics/app/vmctl/backoff/backoff.go:56 got error: failed to init export pipe: export request failed: unexpected error when performing request: Get "http://10.99.41.104:8481/select/1:0/prometheus/api/v1/export/native?end=2023-07-18T03%3A00%3A00Z&match%5B%5D=%7B__name__%3D%22AWSSNS_01d2_56cc_b0f9_00ac56290a5b_7_verifyCode_feedBack_total%22%7D&start=2023-07-17T03%3A00%3A00Z": dial tcp 10.99.41.104:8481: connect: cannot assign requested address on attempt: 1; will retry in 1s

Hm, the error doesn't look related to VM. It is returned by the network stack and suggests that there are not enough ports to establish new TCP connections. Do you have --vm-native-disable-http-keep-alive enabled? Can you try increasing --vm-native-step-interval=day to a week?
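If helpful, ephemeral-port exhaustion on the vmctl host can be checked with standard OS tools (a generic diagnostic, not specific to VictoriaMetrics):

  ss -s                                   # socket summary; a huge TIME-WAIT count hints at port exhaustion
  sysctl net.ipv4.ip_local_port_range     # available ephemeral port range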

@rickchen12
Author

@hagen1778 hello again LOL,
Can you try increasing --vm-native-step-interval=day to a week?
------ There is no week option; it only supports day, hour and month LOL

I checked with my peers, and it seems it's not due to a network limit, because the network specification of our instances is 12 Gbps. It seems like the '--vm-concurrency=16' flag doesn't work. Any advice?

Command:

sudo ./vmctl-prod vm-native --vm-native-src-addr=http://10.99.41.104:8481/select/1/prometheus \
  --vm-native-dst-addr=http://10.99.41.179:8480//insert/0/prometheus \
  --vm-native-filter-time-start='2023-07-01T03:00:00Z' \
  --vm-native-step-interval=month --vm-concurrency=12 --vm-native-disable-http-keep-alive

[screenshot of the vmctl run output]

And another question: if we can't decrease the migration time, I want to know whether the incremental data will be migrated after I execute the command.

For example, I execute this command at 00:00:00 28 Jul:

sudo ./vmctl-prod vm-native --vm-native-src-addr=http://10.99.41.104:8481/select/1/prometheus \
  --vm-native-dst-addr=http://10.99.41.179:8480/insert/0/prometheus --vm-native-filter-time-start='2023-07-01T03:00:00Z' \
  --vm-native-step-interval=month --vm-concurrency=12  --vm-native-disable-http-keep-alive

It will last for 60 hours or more. Will the data of these 60 hours be migrated too? Or will it capture a snapshot at 00:00:00 28 Jul and only migrate the data before this snapshot?

thx a lot.

@rickchen12
Author

And btw, I tested the bandwidth between the source instance and the destination instance with iperf3. It seems the bandwidth is about 5 Gbits/s, and the transfer speed is about 600 MB/s.

here is the test result:

Connecting to host 10.99.41.179, port 5201
[  4] local 10.99.41.104 port 40016 connected to 10.99.41.179 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   594 MBytes  4.98 Gbits/sec    0    629 KBytes       
[  4]   1.00-2.00   sec   593 MBytes  4.97 Gbits/sec    0    629 KBytes       
[  4]   2.00-3.00   sec   592 MBytes  4.97 Gbits/sec    0    664 KBytes       
[  4]   3.00-4.00   sec   591 MBytes  4.96 Gbits/sec    0    664 KBytes       
[  4]   4.00-5.00   sec   590 MBytes  4.95 Gbits/sec    0   1.06 MBytes       
[  4]   5.00-6.00   sec   592 MBytes  4.97 Gbits/sec    0   1.06 MBytes       
[  4]   6.00-7.00   sec   592 MBytes  4.97 Gbits/sec    0   1.06 MBytes       
[  4]   7.00-8.00   sec   590 MBytes  4.95 Gbits/sec    0   1.55 MBytes       
[  4]   8.00-9.00   sec   591 MBytes  4.96 Gbits/sec    0   1.55 MBytes       
[  4]   9.00-10.00  sec   590 MBytes  4.95 Gbits/sec    0   1.55 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  5.78 GBytes  4.96 Gbits/sec    0             sender
[  4]   0.00-10.00  sec  5.78 GBytes  4.96 Gbits/sec                  receiver

@hagen1778
Collaborator

------ There is no week option; it only supports day, hour and month LOL

You're right! Created a feature request #4738

I checked with my peers, and it seems it's not due to a network limit, because the network specification of our instances is 12 Gbps. It seems like the '--vm-concurrency=16' flag doesn't work. Any advice?

Network limit is one of the 3 points I listed here: #4624 (comment). Have you verified the rest?

and it will last for 60 hours or more, will the data of these 60 hours be migrated too? or will it capture a snapshot at 00:00:00 28 Jul and only migrate the data before this snapshot?

vmctl has no checkpoints for now, so it will re-import everything again. If this happens, it is recommended to configure dedupMinInterval=1ms on the vmstorage and vmselect side to remove potential duplicates after re-importing.
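For reference, a sketch of the corresponding command-line flags (assuming the cluster version; binary paths and storage addresses are placeholders). Deduplication is typically configured via -dedup.minScrapeInterval on both vmselect and vmstorage:

  /path/to/vmselect-prod  -dedup.minScrapeInterval=1ms -storageNode=<vmstorage-1>:8401
  /path/to/vmstorage-prod -dedup.minScrapeInterval=1ms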

@rickchen12
Author

rickchen12 commented Jul 31, 2023

@hagen1778, I think I'll get the key point soon, thx for your help.
network limit is one of the 3 points I listed here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4624#issuecomment-1647363148. Have you verified the rest?
----- Sorry, I missed that, but I don't know how to check the database limits.

  1. Source database is maxed out by resources
  2. Destination database is maxed out by resources
    ---- I think you mean the limits of vmstorage, right? How to verify them? Pls guide me, thx a lot.

vmctl has no checkpoints for now, so it will re-import everything again. If this happens, it is recommended to configure dedupMinInterval=1ms on the vmstorage and vmselect side to remove potential duplicates after re-importing.
----- Which means the execution time is not very important: because there are no checkpoints in vmctl, no matter how long a single run takes, vmctl will migrate the latest data anyway. Do I understand right?

For the example I mentioned before, maybe I didn't make it clear:

For the new cluster: only the deployment is finished; it doesn't receive new monitoring data from vmagent yet.
For the old cluster: running normally and receiving data from vmagent.

I executed the command on the new cluster at 00:00:00 28 Jul:

sudo ./vmctl-prod vm-native --vm-native-src-addr=http://10.99.41.104:8481/select/1/prometheus \
  --vm-native-dst-addr=http://10.99.41.179:8480/insert/0/prometheus --vm-native-filter-time-start='2023-07-01T03:00:00Z'  --vm-native-step-interval=month --vm-concurrency=12  --vm-native-disable-http-keep-alive

It will execute for 60 hours (done on 30 Jul). Since the old cluster is still receiving new monitoring data, will it also migrate the data of 29 Jul / 30 Jul from the old cluster to the new cluster during this execution period? The new cluster only receives the data that vmctl transfers, not data from vmagent.

@rickchen12
Author

rickchen12 commented Aug 2, 2023

I tested on my side; it seems like it won't include the data from the execution time period...

here is the command:

sudo ./vmctl-prod vm-native --vm-native-src-addr=http://10.99.41.104:8481/select/1/prometheus --vm-native-dst-addr=http://10.99.41.179:8480//insert/0/prometheus --vm-native-filter-time-start='2023-07-31T03:00:00Z' --vm-native-step-interval=day --vm-concurrency=6 --vm-native-disable-http-keep-alive

and here is the log:
VictoriaMetrics Native import mode

2023/08/01 05:01:50 Initing import process from "http://10.99.41.104:8481/select/1/prometheus/api/v1/export/native" to "http://10.99.41.179:8480/insert/0/prometheus/api/v1/import/native" with filter 
        filter: match[]={__name__!=""}
        start: 2023-07-31T03:00:00Z
2023/08/01 05:01:50 Exploring metrics...
2023/08/01 05:01:50 Selected time range will be split into 2 ranges according to "day" step. Requests to make: 13420
2023/08/01 09:01:21 Import finished!
2023/08/01 09:01:21 VictoriaMetrics importer stats:
  time spent while importing: 3h59m31.456531627s;
  total bytes: 488.8 GB;
  bytes/s: 34.0 MB;
  requests: 13420;
  requests retries: 0;
2023/08/01 09:01:21 Total time: 3h59m31.457462052s

And I exported the data of one metric; the latest timestamp is 1690866043 = 2023-08-01T05:00:43.000Z,
almost the same as the timestamp when I executed the command. So if I want to ensure as little data loss as possible, the only thing I can do is increase the transfer speed.

Do you have any other suggestions for this situation? @hagen1778

@rickchen12
Author

Hello, need your help, bro. @hagen1778 TOT

@hagen1778
Collaborator

almost same as the timestamp I executed the command.

Yes, the end timestamp when omitted is set to now(). You can manually set it to whatever you want.

so if I want to ensure as little data loss as possible

For minimal data loss it is recommended to start writing data to two destinations: the old and the new installation. Then use vmctl to migrate data from the old to the new installation, starting from wherever you want and ending at the moment when you started writing to both destinations. Once the migration is done, the new installation will have complete data: ingested in realtime and migrated via vmctl.
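A sketch of the double-write setup (assuming vmagent ships the data; all addresses and paths below are placeholders, not taken from this thread) is to list both clusters as remote-write targets:

  # vmagent sends every collected sample to both the old and the new cluster
  /path/to/vmagent-prod \
    -promscrape.config=/etc/vmagent/scrape.yml \
    -remoteWrite.url=http://<old-vminsert>:8480/insert/0/prometheus/api/v1/write \
    -remoteWrite.url=http://<new-vminsert>:8480/insert/0/prometheus/api/v1/write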

@rickchen12
Author

@hagen1778 thx a lot,
Yes, the end timestamp when omitted is set to now(). You can manually set it to whatever you want.
---- How? Using this flag: "--vm-native-filter-time-end"?

I tried it on my side.

this is the command:
sudo ./vmctl-prod vm-native --vm-native-src-addr=http://10.99.41.104:8481/select/1/prometheus --vm-native-dst-addr=http://10.99.41.179:8480//insert/1/prometheus --vm-native-filter-time-start='2023-08-08T03:00:00Z' --vm-native-filter-time-end "2023-08-09T15:07:00Z" --vm-native-step-interval=hour --vm-concurrency=18 --vm-native-disable-http-keep-alive

this is the log:

VictoriaMetrics Native import mode

2023/08/08 03:16:52 Initing import process from "http://10.99.41.104:8481/select/1/prometheus/api/v1/export/native" to "http://10.99.41.179:8480/insert/1/prometheus/api/v1/import/native" with filter 
        filter: match[]={__name__!=""}
        start: 2023-08-08T03:00:00Z
        end: 2023-08-09T15:07:00Z
2023/08/08 03:16:52 Exploring metrics...
2023/08/08 03:16:53 Selected time range will be split into 37 ranges according to "hour" step. Requests to make: 241129
2023/08/08 03:27:44 Import finished!
2023/08/08 03:27:44 VictoriaMetrics importer stats:
  time spent while importing: 10m51.304763228s;
  total bytes: 132.8 GB;
  bytes/s: 204.0 MB;
  requests: 241129;
  requests retries: 0;
2023/08/08 03:27:44 Total time: 10m51.305708643s

and I exported the data, the latest timestamp is 1691464723 = 2023-08-08T03:18:43.000Z

It seems like it still doesn't include the latest data.

@hagen1778
Collaborator

and I exported the data, the latest timestamp is 1691464723 = 2023-08-08T03:18:43.000Z

Is it for all metrics like this?
The migration process works like the following:

  1. vmctl explores unique time series. Let's say it finds 10.
  2. Then it breaks the configured time interval into shorter intervals according to --vm-native-step-interval.
  3. Then it performs sequential migration for each time series (p1) for each time interval (p2).

If migration takes 10 min for 10 series, it is likely each series takes 1 min to migrate. Hence, the first time series in the list will contain data from --vm-native-filter-time-start to now(), the next one from --vm-native-filter-time-start to now() + 1 min delay, etc.

In short, you shouldn't expect vmctl to migrate all the data including the last minutes. If you want to migrate from cluster A to cluster B without losing the most recent data, follow the advice described here: #4624 (comment)

@rickchen12
Author

Thx, I see. You helped me a lot. Great community! @hagen1778
