Update Script #22
Comments
Same with me. I didn't get the line 17 syntax error, but I did get the rest of the update error you had. It stopped and nothing else had updated.
|
@gazaz, @Sipheren, additionally, there is a flaw in the setup script that causes the update script to fail at the git pull: the setup script doesn't copy the hidden .git repository metadata to the destination, /opt/geth or /opt/lighthouse. Solution: |
Thanks all for the details! And of course apologies, I guess the update script wasn't tested as well as I thought after I made some major updates to the setup script a while back. Working on it! Great insight on the shopt as well! I will update and test the scripts again. update-client.sh has been updated with a fix for the primary bug, which was a typo: Also updated the setup script: Give me some time and I'll work on the steps and post them so you don't have to rebuild the clients. ETA is sometime today if all goes well. |
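The fix itself is not preserved in this thread. As a hedged sketch of what the shopt remark most likely refers to (an assumption: bash's `dotglob` option), the demo below shows why a plain `*` copy leaves `.git` behind and how `dotglob` changes that. It uses throwaway temp directories as stand-ins and does not touch /opt/geth or /opt/lighthouse.

```shell
#!/usr/bin/env bash
# Demo only: temp directories stand in for a cloned repo and its destination.
set -eu

src=$(mktemp -d)        # stands in for a freshly cloned repo
dst_glob=$(mktemp -d)   # destination for a plain "*" copy
dst_dot=$(mktemp -d)    # destination for a copy made with dotglob enabled

mkdir "$src/.git"
touch "$src/.git/HEAD" "$src/Makefile"

# Default globbing: "*" skips names starting with a dot, so .git is not copied.
cp -R "$src"/* "$dst_glob"/

# With dotglob, "*" also matches hidden entries (but never "." or "..").
shopt -s dotglob
cp -R "$src"/* "$dst_dot"/
shopt -u dotglob

if [ -d "$dst_glob/.git" ]; then echo "plain glob: .git copied"; else echo "plain glob: .git missing"; fi
if [ -d "$dst_dot/.git" ]; then echo "dotglob: .git copied"; else echo "dotglob: .git missing"; fi
```

Without the metadata directory in the destination, a later `git pull` there has no repository to work with, which matches the failure described above.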
Awesome, thanks for all the replies. |
Brilliant thanks for working on it ! |
@wishbonesr thank you so much for the detailed analysis, super helpful. It looks like because the script does (line 137) So I do agree the most straightforward way going forward is to do a quick rebuild, which thankfully should take around the same time as running the update client would have. It's fast because by default the reset-validator.sh script keeps blockchain data, so there's no need to resync everything (however, it may take a few hours or more to sync back what it needs). The only other thing is that you'll need to import your keys into lighthouse again, which should only take a few minutes for most folks (those who aren't running 100+ validators; if you are, sorry for the 100+ times you need to type in your password, but thank you very, very much for your service :). Here's what I just did to update my validator.
After you've entered the wallet password for each validator and it's complete, you should see the process complete successfully.
Then start the lighthouse validator client and after the (hopefully brief) syncing completes, you should be back to validating!
You can check the status of the clients using this command (hitting enter or spacebar to scroll down for more status information):
So just to be clear: if you want to update your clients, the update script will not work without first doing the quick rebuild. That means you get the updated scripts, reset the validator (it keeps most of the blockchain data automatically), and run the setup script again. Then you will be back validating on the latest and greatest network on earth (with the latest client updates too, as the script pulls and uses the new clients). After a few hours, the validator was back to Active, at 99% effectiveness and earning fees again. Again, apologies for the issue updating clients; I pushed fixes as soon as I could. |
Worked a treat thanks, validator updated and back printing PLS :) Thanks for the hard work. |
Also, there have been suggestions of a way to avoid the "quick rebuild" by doing something like this.
If anyone decides to tinker with this and verify it works, feel free to let us know the process. However, even with a few hours downtime of the validator, it still seems like the safest/most tested method as of now is the minimal rebuild process described in the prior post. |
Thanks for this, I am looking to go through this myself on the weekend. I have run the reset script once before during testing; it works fine and doesn't take all that long to be back up and synced. Also, major thanks for providing this repo and maintaining it in the first place. I was wasting a lot of time trying to get everything set up manually or using Docker; this script was a godsend :) Cheers |
EDIT: All good, just had to re-add the metrics flags and so on to the .service files, reload the daemon and restart the services. :) Question: after the validator-reset script is run, should the Grafana dashboards all just kick back in, or do I need to remove and re-run that script also? They are all there and set up how I like, but none seem to be getting any data. I'm guessing the new install of geth and the rest doesn't link up to the db or something? |
In case some folks find themselves unable to update this repo on their node (because you needed to edit the safety switches in the scripts), git will tell you that you need to commit first. Since this is a one-way operation, it's OK to set the HEAD of the clone back to the last clone/pull. Do this to avoid having to purge and re-clone via the URL.
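The exact command from this comment is not preserved here. One common way to do what is described, shown as a sketch rather than the author's method, is `git reset --hard`, which discards uncommitted changes to tracked files. The demo runs in a throwaway repo; the file name `setup.sh` and its `SAFETY` line are hypothetical stand-ins for an edited safety switch.

```shell
#!/usr/bin/env bash
# Demo only: a temp repo stands in for your local clone of the scripts.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'SAFETY=on' > setup.sh    # hypothetical file and setting, for illustration
git add setup.sh
git -c user.name=demo -c user.email=demo@example.com commit -q -m "upstream state"

# Simulate the local edit that makes git refuse to update the file.
echo 'SAFETY=off' > setup.sh
git status --short             # lists setup.sh as modified

# Discard uncommitted changes to tracked files. This cannot be undone.
git reset -q --hard HEAD

git status --short             # nothing listed: working tree matches the last commit
cat setup.sh
```

On a real node you would run only the reset inside your clone, then `git pull`, and then re-apply any local edits you still need.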
|
@rhmaxdotorg, EC2 - test instance error during install and/or update:
|
Good to hear, @wishbonesr! For the EC2 instance, were you using Ubuntu Linux (22.04) or Amazon Linux (probably the first choice/default)? I've only tested/supported Ubuntu 22.04 for the script, so it may work on other OSes, but they're not strictly supported. However, from googling it looks like If you did come across this using Ubuntu 22.04, this package could be added to the APT_PACKAGES list in the script. Let me know if you want to test that while you're seeing the error. It doesn't seem like it would hurt to add it anyway, but I'm trying to keep the package list as minimal as necessary to support the validators. |
@rhmaxdotorg, Note: I could only afford one validator, and it's already up and running (so no urgency on my part) - just wanted to help out. |
I ran into the same issue when using the On the other hand, I was able to test fixing the git repos and rebuilding the clients on a test instance. The steps I took were:
Due to step 7 I have not run these steps on my actual validator. Can someone please confirm that the new lighthouse binary is not where the service config file expects it to be? Or is there something wrong with this method? |
OK, I have run the above process on a live validator service and wrote a helper script; make sure to run it after switching users: The following method avoids the first alternative and saves you from having to re-add all the keys, so it is handy if you have a lot of validators. Also, see the closing thoughts for ideas on how you can reduce the validator downtime during updates by about 99%.
set -eo pipefail
cd /home/node
function cleanup() {
  echo "Cleanup started"
  git reset --hard # Set all files back to match `main`
  echo -e "\ndata/" >> .gitignore # Add the data folder to .gitignore so the following clean doesn't remove it
  echo -e "\nlighthouse/lh" >> .gitignore # Add the symbolic link created by the setup script to .gitignore
  git clean -f -d # Remove untracked files and directories that are not in `main` anymore
  git reset --hard # Restore the original .gitignore
}
echo "Cloning go pulse repo..."
git clone https://gitlab.com/pulsechaincom/go-pulse
cp -R go-pulse/.git /opt/geth
pushd /opt/geth
cleanup
popd
echo "Cloning lighthouse repo..."
git clone https://gitlab.com/pulsechaincom/lighthouse-pulse
cp -R lighthouse-pulse/.git /opt/lighthouse
pushd /opt/lighthouse
cleanup
popd
echo "Removing cloned repos..."
rm -rf go-pulse && rm -rf lighthouse-pulse
echo "Done now run the update script"
After it's done you can pull the latest from the repo: https://github.com/rhmaxdotorg/pulsechain-validator and run the update script.
Closing thoughts
The fact that the lighthouse binary is created elsewhere gives us the flexibility to update it with barely any downtime. Rust builds are dog slow, taking about 45 minutes to 1 hour, whereas Go builds (go-pulse [geth]) are very fast and don't have this problem. In the future we could leverage
Right now PLS is cheap, but being down for 1 hour could get expensive if you have a lot of validators or if PLS moons; the idea above would solve this. |
Awesome! Thanks for the script and interesting details in the closing thoughts @nicogranuja! Just a question on this part:
Just to clarify, are you suggesting any code changes to the update script? Or is adding the As the steps of setting up a validator with the setup script and then running the update-client.sh script after a new version is released shouldn't affect the
Since the script and service file want the latest lighthouse binary to point to I wondered if you tested/saw this or otherwise agree (since you seem to have a better testing environment than me right now :)
Misc
Checking your Lighthouse and Geth versions
|
Actually yes. It turns out the symbolic link created by the setup script here: https://github.com/rhmaxdotorg/pulsechain-validator/blob/646a642d7414c3fbebafcb02f5ab4dcc4c338afb/pulsechain-validator-setup.sh#L201C8-L201C8 was being deleted, since it is an untracked file. I have updated my comment above to add it to the
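To illustrate the behavior described here, the sketch below reproduces it in a throwaway repo: `git clean -f -d` removes untracked files but, without `-x`, leaves ignored paths alone. The path `lighthouse/lh` mirrors the helper script above; the link target and the other file names are made up for the demo.

```shell
#!/usr/bin/env bash
# Demo only: a temp repo stands in for /opt/lighthouse.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir lighthouse
echo '[package]' > lighthouse/Cargo.toml   # a tracked file, so the directory is tracked
echo 'target/' > .gitignore                # tracked ignore file, as in a real repo
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "upstream state"

ln -s /nonexistent/target lighthouse/lh    # untracked symlink (stand-in target)
touch stray.txt                            # untracked file that should be removed

# Same order as the helper script: ignore the link, clean, then restore .gitignore.
printf '\nlighthouse/lh\n' >> .gitignore
git clean -q -f -d
git reset -q --hard

ls -A . lighthouse
```

After this runs, the symlink survives, `stray.txt` is gone, and `.gitignore` is back to its committed content.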
Thanks for the Misc section. It confirmed my suspicion: as it turns out, the Makefile for lighthouse (or Rust itself) builds the binary in two places:
node:~$ .cargo/bin/lighthouse --version
Lighthouse Lighthouse-Pulse/v2.3.0-de8e0a0
BLS library: blst
SHA256 hardware acceleration: true
Allocator: jemalloc
Profile: release
Specs: mainnet (true), minimal (false), gnosis (false), pulsechain (true)
node:~$ /opt/lighthouse/target/release/lighthouse --version
Lighthouse Lighthouse-Pulse/v2.3.0-de8e0a0
BLS library: blst
SHA256 hardware acceleration: true
Allocator: jemalloc
Profile: release
Specs: mainnet (true), minimal (false), gnosis (false), pulsechain (true)
TLDR; everything looks good. There is no issue with the setup script's symbolic link approach, as long as we don't clear the untracked file
Proposed changes
I will propose, and if time permits raise a merge request, to reduce the downtime while upgrading the clients. The idea is simple: since we can "rug" the binaries while the clients are running, we can start the build process without stopping them. After that, we restart all 3 clients and they will start back up using the updated binaries. |
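A rough sketch of the build-first, restart-last ordering proposed above, not the actual update script. The service names and the bare `make` build commands are assumptions for illustration; only the /opt/geth and /opt/lighthouse paths come from this thread. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -eu

DRY_RUN=${DRY_RUN:-1}   # default: print commands only; set DRY_RUN=0 to execute them

# Hypothetical service names; check your own .service files.
SERVICES="geth lighthouse-beacon lighthouse-validator"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

update_clients() {
  # 1. Pull and build while the old binaries keep running (build commands are placeholders).
  run git -C /opt/geth pull
  run make -C /opt/geth
  run git -C /opt/lighthouse pull
  run make -C /opt/lighthouse
  # 2. Restart only after both builds finished; with set -e a failed build aborts before this.
  for svc in $SERVICES; do
    run sudo systemctl restart "$svc"
  done
}

update_clients
```

With this ordering the downtime is only the restart itself, instead of the whole Rust build.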
@nicogranuja excellent, thank you! Just one more thing to clarify: Do you think Again, not sure if you tested this scenario yet and some of these scenarios are harder for me to test than others. |
No, the current symbolic link should work just fine, I verified that both |
Gotcha, that makes sense! Appreciate the details and the script that gives people options to get the clients up to date (if using the old setup script):
I'll close this thread since it seems like we've captured a lot of the important notes and feedback, but feel free to ping it or open a new thread if you have more to add or new ideas. |
Hi,
I tried to use the update script with this and ended up with a few issues; I would appreciate some help if possible.
First, the script wouldn't run, as line 17 has an issue:
So I just commented out this line:
So I don't think it ended up doing anything.
Thanks