Full Sync issues easy solutions!!! #375

ALEX778899 · 2017-12-01T21:42:44Z

Dear IOTA FOUNDATION and DEVELOPERS, I'm a systems analyst and Java developer. I'm helping a lot of people set up full nodes, and I notice that many full nodes are still not fully synced after weeks: they have only a few transactions to request (around 10 in one hour) and sometimes their database is too big. Obviously they can't use their full nodes with the wallet. https://iotatangle.slack.com is also full of users having this problem.
I can guarantee there aren't hardware or software issues; they have fully synced neighbors (tcp and udp).
I analyzed a lot of powerful PCs, SERVERS, VPSes: THERE AREN'T MEMORY PROBLEMS, NO BOTTLENECKS, NO CRASHES, NO ERROR MESSAGES AND THE PROCESS IS RUNNING SMOOTHLY.

I believe IOTA will bring a real freedom and I'm supporting it as much as I can.
I would like to understand why these nodes never become totally synced (Latest Milestone Index = Latest Solid Milestone Index) and why they have so few transactions, so please can you explain whether there are any specific requirements I'm not aware of?

Waiting for your explanation
Thanks in advance

***** UPDATE **************
I FOUND AND TESTED TWO DIFFERENT SOLUTIONS FOR THE PROBLEM. (ONLY IF YOU FOLLOW THESE SUGGESTIONS AND STILL DO NOT BECOME FULLY SYNCED, FOLLOW THE OTHER INSTRUCTIONS IN #409.)

FIRST SOLUTION
I hope the developers will fix the problem soon. In the meantime, here is how to solve it:

stop the service;
delete all files in the log and db folders;
download the complete database from http://db.iota.partners/IOTA.partners-mainnetdb.tar.gz (updated every 30 minutes) and extract it into the db folder;
start the service and in max 4 hours you'll be FULLY SYNCED :-)
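
The file-handling part of these steps can be sketched in Python. This is only an illustration: the function names are made up for the example, stopping and starting the service is left to whatever your installation uses, and the thread does not say whether the archive unpacks directly into the db folder or into a subfolder, so check the result before starting the node.

```python
import shutil
import tarfile
from pathlib import Path


def clear_directory(directory: Path) -> None:
    """Delete everything inside `directory` but keep the directory itself."""
    directory.mkdir(parents=True, exist_ok=True)
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def restore_database(archive: Path, db_dir: Path, log_dir: Path) -> None:
    """Replace the node database with the contents of a downloaded archive.

    Stop the node before calling this and start it again afterwards.
    """
    clear_directory(log_dir)
    clear_directory(db_dir)
    with tarfile.open(archive, "r:gz") as tar:
        # Only unpack archives from a source you trust.
        tar.extractall(db_dir)
```

The archive is assumed to have been downloaded already, for example from the URL given in the steps above.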

SECOND SOLUTION
add LiQio's udp://94.156.128.15:14600 and udp://185.181.8.149:14600 swarm nodes as neighbors. They will add you back automatically. Once you are fully synced, remember to remove the swarm nodes.
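
If you prefer to add and later remove the swarm nodes through the node's HTTP API rather than the config file, the request bodies might look like the sketch below. It assumes the IRI API commands `addNeighbors` and `removeNeighbors` with a `uris` field; the helper name is invented for this example, and actually sending the request to your node's API port is left out.

```python
import json

# Swarm node addresses quoted in the post above.
SWARM_NODES = [
    "udp://94.156.128.15:14600",
    "udp://185.181.8.149:14600",
]


def neighbor_command(command: str, uris: list[str]) -> str:
    """Build the JSON body for a neighbor command (hypothetical helper)."""
    if command not in ("addNeighbors", "removeNeighbors"):
        raise ValueError(f"unsupported command: {command}")
    return json.dumps({"command": command, "uris": uris})


add_body = neighbor_command("addNeighbors", SWARM_NODES)
remove_body = neighbor_command("removeNeighbors", SWARM_NODES)
```

Keeping the list in one place makes it harder to forget the removal step once the node has caught up.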

mindlapse · 2017-12-02T22:23:04Z

I have 12 neighbours who are fully synced, however my node is not fully synced. When restarting it catches up to a subtangle milestone several hours before the current one and then sits there. The 'latestMilestone' is newer but it is also out of date by hours.

Traffic is flowing, but in the logs, I notice that I'm not seeing any new notifications of milestones from the coordinator.

I'm using Ubuntu in AWS with Oracle JDK 1.8.0_151. I have not been able to sync my node once since I began with IOTA about a week ago. What could be happening here? Does the full node need to have the ability to request newer milestones?

paulhandy · 2017-12-03T02:30:58Z

@ALEX778899 as you're a systems analyst and a Java developer, please provide some runtime analysis to help us see where the program is finding a bottleneck on your system. I'd be most interested to see how memory is behaving, as I would suspect this type of issue to be mainly related to memory.
Also, what memory parameters are you running with?

You must understand that it is difficult to help solve an issue with so very little context (though, judging by the all-caps brand-new username and issue title, I wouldn't be surprised to find this to be another instance of a concern troll account).

rnagler · 2017-12-03T22:04:36Z

Same with me: running on Linux Mint, 4 processors, 16 GB, Oracle Java 8.152, fiber connection, 6 working tcp connections, 2 udp, perfect communication. After 24 hours I always have the same solid milestone number. Here is my node info:
NODE INFO

appName: IRI
appVersion: 1.4.1.2
jreAvailableProcessors: 4
jreFreeMemory: 149596368
jreVersion: 1.8.0_152
jreMaxMemory: 3715629056
jreTotalMemory: 555220992
latestMilestone: WVDUPPMYDPRZVFPSBPCISWVJUJDVUVSSWCWYRWJJQXGZYDPOA9JVJ9BQXZ9HBOPYIDPTKVBKWMVFA9999
latestMilestoneIndex: 295252
latestSolidSubtangleMilestone: XTNQBAH9OPNJFELGCCZNSRCKKOLBTTBXSGLNRGIHYYNJRSXQODMBEDHGFSNYJFRGICWHKVMYYZEIZ9999
latestSolidSubtangleMilestoneIndex: 244014
neighbors: 9
packetsQueueSize: 0
time: 1512338547420
tips: 3356
transactionsToRequest: 12562
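
To make the gap in this node info explicit, here is a small sketch that computes how far the solid milestone trails the latest one, using the sync criterion from the opening post (Latest Milestone Index = Latest Solid Milestone Index). The function names are illustrative only; the two field names and values are the ones shown above.

```python
def milestone_lag(node_info: dict) -> int:
    """Number of milestones the solid subtangle trails the latest milestone."""
    return (
        node_info["latestMilestoneIndex"]
        - node_info["latestSolidSubtangleMilestoneIndex"]
    )


def is_fully_synced(node_info: dict) -> bool:
    """True when the latest and latest solid milestone indexes are equal."""
    return milestone_lag(node_info) == 0


node_info = {
    "latestMilestoneIndex": 295252,
    "latestSolidSubtangleMilestoneIndex": 244014,
}
print(milestone_lag(node_info))    # 51238
print(is_fully_synced(node_info))  # False
```

Note that a node's own latestMilestoneIndex can itself lag the network, as reported elsewhere in this thread, so a lag of zero is necessary but may not be sufficient.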

rnagler · 2017-12-03T22:08:47Z

My impression after 3 days of trying to sync is that a manual search for neighbors may lead to islands of neighbors exchanging redundant transactions. All new nodes go to https://iotatangle.slack.com and sync with nodes that have also just arrived there. Mixing with fully synced nodes would be necessary, but if you have a fully synced node you don't want to change your neighbors. So a lot of useless work is done. Maybe it helps the tangle, but the effort to become a fully synced node seems to be too high for the moment.

mindlapse · 2017-12-03T22:36:53Z

ideally all new nodes should be close to a public node, and we need more public nodes.

GhostTyper · 2017-12-03T22:38:16Z

I suggest you just start makin' a public node.

mindlapse · 2017-12-03T22:43:30Z

I would, if I could get synced. I can confirm that I have neighbours who are in sync and who are sending traffic, although my node has not yet synced. I just rescanned today and the same symptom still appears (a "latestSolidSubtangleMilestoneIndex" that doesn't move, currently stuck at 295189. The latestMilestoneIndex is also behind, at 295222 of 295272).

I'm using Oracle JDK 8 (1.8.0_151-b12). I'm running on a 4 core server with 8 GB of RAM in the us-east-1 amazon data center using a c5.xlarge instance type.

GhostTyper · 2017-12-03T22:52:39Z

I can give you my database, if you wish. Just extract it and then start your node with it.

mindlapse · 2017-12-03T23:12:28Z

That would be awesome - would you be able to share via dropbox?

GhostTyper · 2017-12-03T23:19:06Z

No, just use this link: https://iota.lukaseder.de/download.html (I'm sorry, but this service is discontinued. Use the download on iota.partners instead.)

GhostTyper · 2017-12-03T23:39:15Z

Did it work?

mindlapse · 2017-12-04T05:18:33Z

It worked! I'm in sync. Thank you so much!

rnagler · 2017-12-04T06:15:12Z

is it possible to get a fully synced database to download and to speed up syncing?

GhostTyper · 2017-12-04T06:18:07Z

I can export it again if you don't find somebody who can help you (anyone could do this). I will develop an automagical export every night at 4h UTC or something like that in the next few days.

rnagler · 2017-12-04T07:28:04Z

Please export it again

GhostTyper · 2017-12-04T07:33:57Z

Download it here: https://iota.lukaseder.de/download.html (I'm sorry, but this service is discontinued. Use the download on iota.partners instead.)

lunfardo314 · 2017-12-04T09:46:59Z

This is what I posted to Slack; I'm repeating it here.

Sync problem.
Always the same pattern:

1. Latest milestone climbs in sync with botbox while the latest SOLID milestone doesn't move.
2. When the solid milestone falls a few hundred points behind the latest one, I restart my IRI.
3. After ctrl-C or kill, IRI stops communicating but continues working on something for some 10+ min. Only then does it stop.
4. After restart, the node is immediately synced, with the SOLID milestone 2-30 points behind the "latest".
5. Then it repeats from step 1.

E.g. after a restart about an hour ago the solid milestone was at 295518 and I know it won't move until the next restart.

[10:43]
What I think it is going on.

IRI no doubt receives from neighbors all the information needed to confirm a very up-to-date solid subtangle.
It doesn't do that because it's too busy gossiping with other nodes. The backlog mounts up, occupies memory, etc.
When shutdown is requested, it stops being busy with gossiping and starts cleaning up its backlog of confirmations. That's why it is synced immediately after restart.

[10:44]
My Ubuntu has 8GB RAM, 2 cores, iri flag is -Xmx6G, swapping disabled, CPU is busy 15-30%
I think it is a bug.

rnagler · 2017-12-04T09:47:16Z

Thanks, my old db seems to have been corrupted somehow; now I can connect with the light wallet.

LiQio · 2017-12-04T10:15:51Z

I can support what @lunfardo314 summarized.
I would like to add that after moving to a far more powerful server (64GB, 8 cores, NVMe SSD) the node reached its solid sync state, although at the cost of a very large DB dir (35 GB). During synchronization RAM usage went up to 18 GB and more.

lunfardo314 · 2017-12-04T10:32:18Z

that's quite a machine for an IoT node!

onemoreitguy · 2017-12-04T13:53:57Z

Download the database.
But simply replacing the db directory content will generate java errors :(

ALEX778899 · 2017-12-04T15:06:52Z

@paulhandy I analyzed a lot of powerful PCs, SERVERS, VPSes: THERE AREN'T MEMORY PROBLEMS, NO BOTTLENECKS, NO CRASHES, NO ERROR MESSAGES AND THE PROCESS IS RUNNING SMOOTHLY.
Please test it: install a fresh node on Ubuntu/CentOS and you'll never become FULLY SYNCED!

**** HELP FOR ALL THE USERS HAVING THE PROBLEM****
I FOUND AND TESTED A TEMPORARY SOLUTION:
I hope the developers will fix it soon. In the meantime, here is how to solve the problem:

stop the service;
delete all files in the log and db folders;
download the complete database from http://db.iota.partners/IOTA.partners-mainnetdb.tar.gz (updated every 30 minutes) and extract it into the db folder;
start the service and in max 5 hours you'll be FULLY SYNCED :-)

rnagler · 2017-12-04T16:10:04Z

I wonder if one can prove in theory that this manual sync process will ever converge to fully synced nodes. In my opinion the danger of having islands of nodes syncing forever without ever being fully synced is immanent.

nimearo · 2017-12-04T20:37:53Z

Has anyone already tried to analyze thread and heap dumps, or to debug a running iri instance in this state?

eelco2k · 2017-12-04T22:23:39Z

I was also having problems with memory: my full node was crashing and not fully synced with 7 peers.
I noticed a lot of swapping (even when memory was not full), and after some investigation I saw that Debian 7 in my VM image had vm.swappiness = 60 as default. After I edited /etc/sysctl.conf to set it to 10, all my problems were gone. Memory usage still grows, but not to the extreme that caused the crash after some hours.

my steps to fix:

sudoedit /etc/sysctl.conf
Add this line vm.swappiness = 10
sudo shutdown -r now # restart system

Maybe that helps for some people...( IRI version 1.4.1.2 )

GhostTyper · 2017-12-05T05:57:22Z

Ok guys. The sad truth is: This software is in beta.

I have been recording very detailed performance statistics since I started running my open wallet node. The collected data show:

You can easily sync up even with only 1 core and 1 GB RAM, when the last snapshot is fresh.
When the last snapshot was taken about a month ago nodes with 4 cores and 4 GB RAM will have more and more problems syncing up when gone out of sync.

I guess in the current state of IRI this should just be considered "normal" and will be optimized in the future. Any attempt to fix this by tweaking system settings just won't solve the real problem. I for my part will "solve" these issues by throwing more and more hardware at my public node until the next snapshot happens.

A quick "solution" for the network would be to take a snapshot.

nimearo · 2017-12-05T21:59:37Z

From my observation of a running node I would also suggest running without the -Xmx and -Xms flags because, from my understanding, most of the memory used by iri is native memory used by rocksdb. So I would suggest adding some swap to avoid the well-known std::bad_alloc memory errors, while giving rocksdb as much physical memory as possible.

nuriel77 · 2017-12-07T18:02:26Z

@eelco2k nice find!

Just to add, there's no need to reboot:
For runtime, you can just run sudo sysctl vm.swappiness=10.
And, ofc you can add the vm.swappiness = 10 to /etc/sysctl.conf so it persists between reboots.
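
To double-check that the persistent setting is actually in place, a small parser like the following can read the text of /etc/sysctl.conf. It is a sketch only: the function name is made up, and it handles just the simple `key = value` form with `#` comments.

```python
from typing import Optional


def configured_swappiness(sysctl_conf_text: str) -> Optional[int]:
    """Return the last vm.swappiness value set in sysctl.conf text, if any."""
    value = None
    for line in sysctl_conf_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, raw = (part.strip() for part in line.split("=", 1))
        if key == "vm.swappiness":
            value = int(raw)  # later entries override earlier ones
    return value


sample = "# tuned for IRI\nvm.swappiness = 10\n"
print(configured_swappiness(sample))  # 10
```

On Linux the value currently in effect can be read from /proc/sys/vm/swappiness, which is useful for comparing the running setting with the persisted one.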

Schweigi · 2017-12-10T08:08:45Z

My node has the same problem of just not syncing up as @lunfardo314 described. The vm.swappiness adjustment didn't help and neither did any of the other tips in this thread. The question is whether there is any kind of log file which could help debug the issue, or how else one can help?

nuriel77 · 2017-12-10T08:10:57Z

@Schweigi did you try to d/l a fully synced database?
Also try to find neighbors that are fully synced, that might help too.

Schweigi · 2017-12-10T20:22:52Z

@nuriel77 I was using the Swarm nodes (according to Slack #nodesharing) to help sync my node but it made no difference.

I solved the problem now - long story short:
I noticed that the Docker version of IOTA (from this repo) uses by default -Xmx8g for Java. This was leading to a lot of crashes as my server only has 4GB of memory and Docker itself needs a little bit of memory too. Anyway, I ditched Docker completely and set the server up from scratch according to http://iota.partners and now the node is fully synced.
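
A quick sanity check for this kind of mismatch is to compare the -Xmx value with the machine's physical memory. The sketch below is an illustration under assumptions: the helper names are invented, only the k/m/g suffixes are handled, and the 1 GiB reserve for everything outside the Java heap is an arbitrary placeholder (another comment in this thread suggests rocksdb's native memory can need far more).

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_xmx(flag: str) -> int:
    """Convert a flag such as '-Xmx8g' into a number of bytes."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", flag)
    if match is None:
        raise ValueError(f"not an -Xmx flag: {flag}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def heap_fits(flag: str, physical_ram_bytes: int,
              reserve_bytes: int = 1024**3) -> bool:
    """True if the max heap plus a reserve fits into physical RAM."""
    return parse_xmx(flag) + reserve_bytes <= physical_ram_bytes


GIB = 1024**3
print(heap_fits("-Xmx8g", 4 * GIB))  # False: the Docker default on a 4GB server
print(heap_fits("-Xmx6G", 8 * GIB))  # True: the 8GB setup reported earlier
```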

LiQio · 2017-12-11T07:52:45Z

@Schweigi The -Xmx8g makes sense. But concerning the OP problem using iota.partners (and downloading an up-to-date DB) is just a workaround.

mostaruk · 2017-12-13T08:13:07Z

Hello, where would I put this database? Many thanks

nuriel77 · 2017-12-13T08:19:36Z

@mostaruk that depends on which guide/tutorial you've been following

mostaruk · 2017-12-13T08:25:38Z

@nuriel77 I am using this one https://github.com/nuriel77/iri-playbook/wiki/IOTA-Full-Node-Tutorial---Linux

nuriel77 · 2017-12-13T08:27:18Z

There's a section in the FAQ explaining this https://github.com/nuriel77/iri-playbook/wiki/IOTA-Full-Node-Tutorial---Linux#where-can-i-get-a-fully-synced-database-to-help-kick-start-my-node

If you need more help contact me on slack (nuriel77)

mostaruk · 2017-12-13T08:39:11Z

Ahh yes sorry about that. Should have probably read the FAQ! Thanks, I'll look you up on slack.

ysle · 2017-12-17T20:38:40Z

thanks for all the efforts here, but do we have any official statement here from the devs? i mean like a long term solution as a hotfix / commit to 1.4.1.3 please ?

zenmetsu · 2017-12-25T05:29:26Z

I hardly consider the OP's workaround to be viable. Here's the typical experience.
Download DB snapshot (2min - 30min depending on your net speed)
Untar DB snapshot (usually around 20sec-1min)
Start IRI...
wait...
and wait...
and wait...
during this period, which is typically about half an hour, I assume that IRI is processing the previously loaded DB. Connecting to the node will show that it is stuck at milestone 243000. CPU never spikes to above 50% on a powerful system, RAM utilization is acceptable, some frequent GC takes place within jvm, but the time spent in GC isn't excessive... disk utilization (via iostat -x) never exceeds 20% on a decent SSD... network never comes close to full utilization... so WTF is IRI doing during this period, and why TF is it taking so long if nothing in the system appears to be the bottleneck??

After this DB work is finished, the node will finally begin synchronizing to the network. It usually takes a minimum of 40 minutes to get to this point. Allow another hour for synchronization to take place, and you are getting close to 2 hours from start of DB download to point of synchronization. Considering that I'm seeing nodes getting stuck after 4-5 hours of operation, you are looking at an effective 60-70% duty cycle. The OP's solution cannot be considered a workaround, let alone a solution. :(
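
One way to read the duty-cycle estimate, assuming the 4-5 hours count time spent in sync and each recovery costs about 2 hours, is time in sync divided by total cycle time. The function name is illustrative and this interpretation is an assumption, not something the post spells out.

```python
def duty_cycle(hours_in_sync: float, hours_to_resync: float) -> float:
    """Fraction of each cycle during which the node is usable."""
    return hours_in_sync / (hours_in_sync + hours_to_resync)


for hours_in_sync in (4, 5):
    share = duty_cycle(hours_in_sync, 2)
    print(f"{hours_in_sync} h in sync, 2 h to recover: {share:.0%}")
```

Under that reading the node is usable for roughly two thirds of the time, in the neighbourhood of the 60-70% quoted above.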

ALEX778899 · 2017-12-26T21:36:05Z

@zenmetsu, the workaround is tested and viable, and it has helped a lot of new users. Please read the suggestions again; there is an update.

brunoamancio · 2018-01-03T15:00:01Z

Please read #428

ALEX778899 changed the title ~~BIG PROBLEM! TONS OF FULL NODE ARE NOT FULL SYNCHED AFTER WEEKS EVEN THEY HAVE FULL SYNCHED NEIGHBORS!!!~~ FULL SYNC PROBLEM TEMPORARY SOLUTION!!! Dec 5, 2017

ALEX778899 changed the title ~~FULL SYNC PROBLEM TEMPORARY SOLUTION!!!~~ FULL SYNC PROBLEM SOLUTION!!! Dec 17, 2017

ALEX778899 changed the title ~~FULL SYNC PROBLEM SOLUTION!!!~~ Full Sync issues easy solutions!!! Dec 28, 2017

ALEX778899 mentioned this issue Dec 28, 2017

Dear developers, please look the report, lot of users have these issues. #449

Closed

HerrMuellerluedenscheid mentioned this issue Jan 2, 2018

tip failed consistency check iotaledger/iota.py#126

Closed

iotasyncbot changed the title ~~Full Sync issues easy solutions!!!~~ IRI-249 ⁃ Full Sync issues easy solutions!!! Apr 17, 2018

anyong changed the title ~~IRI-249 ⁃ Full Sync issues easy solutions!!!~~ Full Sync issues easy solutions!!! Apr 22, 2018

alon-e closed this as completed Apr 25, 2018
