[Task] Improve Harvester upgrade responder #1849

Closed
PhanLe1010 opened this issue Jan 20, 2022 · 16 comments
Labels: kind/enhancement, not-require/test-plan, priority/0

Comments

@PhanLe1010

@PhanLe1010 PhanLe1010 added the kind/enhancement Issues that improve or augment existing functionality label Jan 20, 2022
@PhanLe1010
Author

cc @bk201

@guangbochen guangbochen added this to the v1.0.1 milestone Jan 21, 2022
@guangbochen guangbochen added the priority/1 Highly recommended to fix in this release label Jan 21, 2022
@rebeccazzzz rebeccazzzz modified the milestones: v1.0.1, v1.0.2 Feb 24, 2022
@bk201
Member

bk201 commented Mar 15, 2022

We also need to add ISOURL and ISOChecksum in the response for available upgrades.
(

ISOURL string `json:"isoURL"`
ISOChecksum string `json:"isoChecksum"`
)

@PhanLe1010
Author

@bk201

We are planning to have Harvester use the generic version of upgrade-responder at https://github.com/longhorn/upgrade-responder.

In general, a version is uniquely identified by {name, release date, tags}. That is why we hard-coded the response in that format.

Is there another way to get the ISOURL and ISOChecksum instead of putting it into the upgrade responder's response?

@bk201
Member

bk201 commented Mar 16, 2022

@PhanLe1010 Thanks for the reply. I understand that, as an upstream project, the responder wants to be generic. But it would be nice to have a way to extend the fields in the future.

@guangbochen @FrankYang0529 I guess we have two choices now:

  • Fork the project and run our own code.
  • Run the same responder version as Longhorn. Because the fields are already hard-coded in v1.0.0, we would need to ask users to manually create a Version CR for future upgrade notifications to pop up, like:
    cat > versions.yaml <<EOF
    apiVersion: harvesterhci.io/v1beta1
    kind: Version
    metadata:
      name: 1.0.1
      namespace: harvester-system
    spec:
      isoChecksum: 2765d89e5e8024347c81d36cc84b7168bf358a3a2f4da0c63164ad834a94bcbbaa13af4900aad4caa0a27550e4b4db5bf2c4726daa88664378e0c581a79ad2cb
      isoURL: http://10.10.0.1/harvester/harvester.iso
      minUpgradableVersion: 1.0.0
      releaseDate: "20211231"
      tags:
      - dev
      - test
    EOF
    
    • And we can use a different place, like GitHub releases, to store ISOURL and ISOChecksum in the next version. Any comments?
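The manual workaround in the second option could be scripted instead of typed by hand. Below is a minimal sketch in Python that renders the same Version CR from a few inputs. The field names are taken from the manifest above; the helper name `render_version_manifest` is hypothetical and not part of Harvester.

```python
def render_version_manifest(name, iso_url, iso_checksum, release_date,
                            min_upgradable_version=None, tags=()):
    """Render a harvesterhci.io/v1beta1 Version manifest as YAML text."""
    lines = [
        "apiVersion: harvesterhci.io/v1beta1",
        "kind: Version",
        "metadata:",
        f"  name: {name}",
        "  namespace: harvester-system",
        "spec:",
        # Quote the checksum so YAML always reads it as a string.
        f"  isoChecksum: '{iso_checksum}'",
        f"  isoURL: {iso_url}",
    ]
    if min_upgradable_version is not None:
        lines.append(f"  minUpgradableVersion: {min_upgradable_version}")
    # Quote the date so YAML keeps it a string rather than an integer.
    lines.append(f'  releaseDate: "{release_date}"')
    if tags:
        lines.append("  tags:")
        lines.extend(f"  - {tag}" for tag in tags)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_version_manifest(
        name="1.0.1",
        iso_url="http://10.10.0.1/harvester/harvester.iso",
        iso_checksum="<sha512 of the ISO>",  # placeholder, not a real value
        release_date="20211231",
        min_upgradable_version="1.0.0",
        tags=["dev", "test"],
    ))
```

The printed text could then be piped to `kubectl apply -f -`, which is equivalent to applying the hand-written versions.yaml.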

@FrankYang0529
Member

I think "Fork the project and run our own code" may be more friendly to users.

@PhanLe1010
Author

PhanLe1010 commented Mar 19, 2022

I like the second approach because of its benefit in the long run:

  • Yes, the disadvantage of this approach is that the v1.0.0 client already has the hard-coded logic and requires a workaround. But in the long run, fewer users will be on this version.
  • I think the checksum should be stored in a reputable source. The GitHub release page seems to be the better source.
  • Using the generic version from the Longhorn repo will reduce the operational effort in the future. We are planning to improve the database write operation. If Harvester uses the same repo as Longhorn, the team doesn't have to worry about rebuilding and testing the code.
  • We can allow extra custom fields in the response body of the generic version if this is a strong requirement.

@bk201 bk201 removed their assignment Apr 11, 2022
@rebeccazzzz rebeccazzzz added priority/0 Must be fixed in this release and removed priority/1 Highly recommended to fix in this release labels Apr 27, 2022
@guangbochen guangbochen modified the milestones: v1.0.2, v1.1.0, v1.0.3 May 15, 2022
@guangbochen
Contributor

TODO: add MinUpgradableVersion field to the upstream upgrade responder and migrate to it later.

@harvesterhci-io-github-bot

harvesterhci-io-github-bot commented Jun 17, 2022

Pre Ready-For-Testing Checklist

  • If labeled: require/HEP Has the Harvester Enhancement Proposal PR submitted?

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at: [e2e] [Task] Improve Harvester upgrade responder tests#385

  • Is there a workaround for the issue? If so, where is it documented?

  • ~Have the backend code been merged (harvester, harvester-installer, etc) (including backport-needed/*)?

  • ~Does the PR include the explanation for the fix or the feature? ~

  • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?

  • If labeled: area/ui Has the UI issue filed or ready to be merged?

  • If labeled: require/doc, require/knowledge-base Has the necessary document PR submitted or merged?

  • If NOT labeled: not-require/test-plan Has the e2e test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue?

  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?

@harvesterhci-io-github-bot

Automation e2e test issue: harvester/tests#385

@noahgildersleeve

@FrankYang0529 I went through the steps here and I didn't see a prompt for an upgrade. I was running this off of the latest master build in a virtualized environment. Is there another way I should be testing this?

The steps that were outlined seemed to test if the harvester system would be able to access influxdb, but not if influxdb was installed in the system.

@FrankYang0529
Member

Hi @noahgildersleeve, thanks for helping to test this issue. Since we changed harvester/upgrade-responder to longhorn/upgrade-responder and also changed the request body in Harvester, I would like to test whether longhorn/upgrade-responder works with the latest Harvester.

We cannot add a dev version for testing on the production server, so we need to start longhorn/upgrade-responder in a local environment. InfluxDB is a dependency of longhorn/upgrade-responder. On macOS, I used the following commands to install and start InfluxDB.

brew install influxdb@1
brew services restart influxdb@1

After you start InfluxDB, you can use the following commands to start longhorn/upgrade-responder and check whether it gives you a correct response. Make sure you have a version with the dev and test tags in config/response.json, so Harvester will create a Version CR without version comparison.

go run main.go --debug start --upgrade-response-config config/response.json --influxdb-url http://localhost:8086 --geodb geodb/GeoLite2-City.mmdb --application-name harvester

curl -X POST http://<SERVER-IP>:8314/v1/checkupgrade \
     -d '{ "appVersion": "master-head", "extraInfo": {}}' 

Once longhorn/upgrade-responder works, you can update the upgrade-checker-url setting in Harvester. Finally, you can delete the pods in the harvester-system/harvester deployment to trigger it to fetch version data from our upgrade-responder. The harvester-system/harvester deployment also sends a request to the upgrade-responder every hour.
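The curl check above can also be scripted. Here is a minimal sketch using only the Python standard library. The endpoint path, port, and request body come from the curl command; the function names are hypothetical, and the response is simply decoded as JSON without assuming its exact shape.

```python
import json
import urllib.request


def build_checkupgrade_request(server, app_version, extra_info=None):
    """Build the POST request that mirrors the curl command above."""
    body = {"appVersion": app_version, "extraInfo": extra_info or {}}
    return urllib.request.Request(
        url=f"http://{server}:8314/v1/checkupgrade",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )


def check_upgrade(server, app_version, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_checkupgrade_request(server, app_version)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumes an upgrade-responder is listening locally on port 8314.
    print(check_upgrade("localhost", "master-head"))
```

Splitting request construction from sending keeps the first function testable without a running responder.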

@noahgildersleeve

I tested this in master-96b90714-head and I'm unable to trigger the upgrade. I set up the InfluxDB instance and the command

curl -X POST http://<SERVER-IP>:8314/v1/checkupgrade \
     -d '{ "appVersion": "master-head", "extraInfo": {}}'

worked fine, but it does not seem to trigger the upgrade check when I delete the pods from harvester-system/harvester. I tried a few different URLs for the upgrade URL: the http address with ports 80 and 8314, and the same with https. I updated the response.json to be

{
  "Versions": [
    {
      "Name": "v1.0.3",
      "ReleaseDate": "2022-06-15T00:00:00Z",
      "Tags": [
        "latest",
        "test",
        "dev"
      ]
    }
  ]
}
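Because the earlier comment says a version needs both the dev and test tags for Harvester to create a Version CR without version comparison, a quick sanity check of response.json can rule out a tagging mistake. This is a minimal sketch; the key names come from the JSON above, and the function name is hypothetical.

```python
import json

REQUIRED_TAGS = {"dev", "test"}


def versions_with_dev_test_tags(response_text):
    """Return names of versions that carry both the dev and test tags."""
    config = json.loads(response_text)
    return [
        version["Name"]
        for version in config.get("Versions", [])
        if REQUIRED_TAGS.issubset(version.get("Tags", []))
    ]


if __name__ == "__main__":
    # Path as passed to --upgrade-response-config in the thread.
    with open("config/response.json", encoding="utf-8") as f:
        print(versions_with_dev_test_tags(f.read()))
```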

influxdb is running with version v1.8.10.

I attached the support bundle

supportbundle_f8c598c8-4629-42f4-ac3c-d48a8caa2e88_2022-06-22T22-44-01Z.zip

@FrankYang0529
Member

FrankYang0529 commented Jun 23, 2022

@noahgildersleeve, sorry, I know the reason. Your Harvester does get a new version from the upgrade-responder, but it also needs to send a request to the release server to get more information. Currently, we don't have a v1.0.3 version published, so it fails with the error message below. Could you change the version name to v1.0.2 and test it again? Sorry for the inconvenience. I updated the example version in harvester/tests#385.

2022-06-22T22:44:00.539384284Z time="2022-06-22T22:44:00Z" level=error msg="failed syncing version metadata: failed to download version.yaml from URL: https://releases.rancher.com/harvester/v1.0.3/version.yaml"

I tried again in my local environment. With the correct response.json, I can get an update.
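The failing URL in the log suggests that the metadata location is derived from the version name returned by the responder. The sketch below reproduces that pattern. It is inferred only from the log line above, so the real logic in Harvester may differ, and `version_metadata_url` is a hypothetical name.

```python
RELEASE_BASE_URL = "https://releases.rancher.com/harvester"


def version_metadata_url(version_name, base_url=RELEASE_BASE_URL):
    """Build the version.yaml URL for a version name (pattern from the log)."""
    return f"{base_url.rstrip('/')}/{version_name}/version.yaml"


if __name__ == "__main__":
    # A version name with no published release yields a URL that cannot be
    # downloaded, which matches the error in the log above.
    print(version_metadata_url("v1.0.3"))
```

Under this assumption, the version name in response.json has to match a directory that actually exists on the release server, which is why switching to v1.0.2 helps.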

Screen Shot 2022-06-23 at 3 56 35 PM

@TachunLin TachunLin self-assigned this Jul 11, 2022
@TachunLin

Checked the first trial on master-9f6f58ac-head

Result

Can select the prompted upgrade button by using the updated version of Harvester upgrade responder https://github.com/longhorn/upgrade-responder (v0.1.4)

  • From v1.0.2 GA to master head
    image

However, the upgrade did not seem to actually download the image; this may be related to the network environment configuration.
image

Thus I will perform a second trial to confirm.

Test Information

  • Test Environment: 1 node harvester on local kvm machine
  • Harvester version: master-9f6f58ac-head

Verify Steps

Follow the steps in harvester/tests#385 (comment)

  1. Clone longhorn/upgrade-responder and check out v0.1.4.
  2. Edit the response.json content in the config folder
{
  "Versions": [
    {
      "Name": "v1.0.2-master-head",
      "ReleaseDate": "2022-06-15T00:00:00Z",
      "Tags": [
        "latest",
        "test",
        "dev"
      ]
    }
  ]
}

  3. Install InfluxDB
zypper in influxdb
  4. Check the InfluxDB version
influx -version
  5. Check that the InfluxDB service is running
sudo systemctl status influxdb.service
  6. Run longhorn/upgrade-responder with the command:
go run main.go --debug start --upgrade-response-config config/response.json --influxdb-url http://localhost:8086 --geodb geodb/GeoLite2-City.mmdb --application-name harvester
  7. Check that the local upgrade responder is running
curl -X POST http://localhost:8314/v1/checkupgrade \
     -d '{ "appVersion": "v1.0.2", "extraInfo": {}}'
  8. Create a new folder for the http server

  9. Download the latest master head installation files
    https://releases.rancher.com/harvester/master/harvester-master-amd64.iso
    https://releases.rancher.com/harvester/master/harvester-master-initrd-amd64
    https://releases.rancher.com/harvester/master/harvester-master-rootfs-amd64.squashfs
    https://releases.rancher.com/harvester/master/harvester-master-vmlinuz-amd64
    https://releases.rancher.com/harvester/master/harvester-master-amd64.sha256

  10. Launch a python http server

python3 -m http.server
  11. Create a version.yaml with the following content
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
  name: v1.0.2-master-head
  namespace: harvester-system
spec:
  isoChecksum: '37da4e0baa273e08ee2ecca8b443dc71ae756b93abec651948286e1ee37b80cfeb69baca8798dd341e942e9c6c7404140fcda744e2876d928ab24ebd8eef6375'
  isoURL: http://192.168.1.110:8000/v1.0.2-master-head/harvester-master-amd64.iso
  releaseDate: '20220711'
  12. [optional] If you encounter connection issues, you can install the ngrok service for port forwarding
    https://snapcraft.io/install/ngrok/opensuse

  13. Open a terminal and run ngrok

  14. Run ngrok http 8000 to forward the internal port 8000 to an ngrok external link
    image

  15. Run ngrok http 8314 to forward port 8314 to an ngrok external link
    image

  16. Open Harvester settings, change the upgrade-checker-url setting to our upgrade-responder URL.
    image

  17. Change the release download URL to our http server URL
    image

  18. ssh to the harvester node, change to root, run k9s

  19. Run : deployments -> / harvester -> select the harvester node

  20. Remove the pods in deployment harvester-system/harvester to trigger a check for new versions.
    image

  21. Wait for 5 - 10 minutes.

  22. Check the Harvester dashboard and click the upgrade button
    image

@TachunLin

Verified as fixed after the second trial on master-7158c858-head (7/12). Closing this issue.

Result

Can select the prompted upgrade button by using the updated version of Harvester upgrade responder https://github.com/longhorn/upgrade-responder (v0.1.4) by using the upgrade-checker url
image

  • Can correctly connect to the file server by using the release URL setting
    image

Test Information

  • Test Environment: 1 node harvester on local kvm machine
  • Harvester version: master-7158c858-head (7/12)
  • Upgrade responder OS: Ubuntu focal 20.04

Verify Steps

Follow the steps in #1849 (comment)

  1. Clone longhorn/upgrade-responder and check out v0.1.4.
  2. Edit the response.json content in the config folder
{
  "Versions": [
    {
      "Name": "v1.0.2-master-head",
      "ReleaseDate": "2022-06-15T00:00:00Z",
      "Tags": [
        "latest",
        "test",
        "dev"
      ]
    }
  ]
}

  3. Install InfluxDB
  4. Run longhorn/upgrade-responder with the command:
go run main.go --debug start --upgrade-response-config config/response.json --influxdb-url http://localhost:8086 --geodb geodb/GeoLite2-City.mmdb --application-name harvester
  5. Check that the local upgrade responder is running
curl -X POST http://localhost:8314/v1/checkupgrade \
     -d '{ "appVersion": "v1.0.2", "extraInfo": {}}'
  6. Create a new folder v1.0.2-master-head for the http server

  7. Download the latest master head installation files
    https://releases.rancher.com/harvester/master/harvester-master-amd64.iso
    https://releases.rancher.com/harvester/master/harvester-master-initrd-amd64
    https://releases.rancher.com/harvester/master/harvester-master-rootfs-amd64.squashfs
    https://releases.rancher.com/harvester/master/harvester-master-vmlinuz-amd64
    https://releases.rancher.com/harvester/master/harvester-master-amd64.sha256

  8. Launch a python http server

python3 -m http.server
  9. Create a version.yaml with the following content
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
  name: v1.0.2-master-head
  namespace: harvester-system
spec:
  isoChecksum: '0d5999471553e767cb0c4d7d1c82b00b884e994e5856d8feb90798ace523b7aa2145a5fc245e1d0073ce7b41c490979950f3f31f60a682c971aba63d562973e5'
  isoURL: http://192.168.122.224:8000/v1.0.2-master-head/harvester-master-amd64.iso
  releaseDate: '20220712'
  10. Check the upgrade responder connection from the harvester node
curl -X POST http://192.168.122.224:8314/v1/checkupgrade -d '{ "appVersion": "v1.0.2", "extraInfo": {}}'
  11. Check the ISO download URL connection from the harvester node
curl --output /dev/null http://192.168.122.224:8000/harvester-master-amd64.iso
  12. Open Harvester settings, change the upgrade-checker-url setting to our upgrade-responder URL.
    image
  13. Change the release download URL to our http server URL
    image
  14. ssh to the harvester node, change to root, run k9s
  15. Run : deployments -> / harvester -> select the harvester node
  16. Remove the pods in deployment harvester-system/harvester to trigger a check for new versions.
    image
  17. Wait for 5 - 10 minutes.
  18. Check the Harvester dashboard and click the upgrade button
    image
  19. Select the version and start the upgrade process
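The isoChecksum values in the version.yaml files above are 128 hex characters long, which matches the length of a SHA-512 digest. Assuming that is the algorithm in use, the value for a downloaded ISO could be computed with the sketch below. The function name is hypothetical; the file is read in chunks so a multi-gigabyte ISO does not have to fit in memory.

```python
import hashlib


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # File name as downloaded in the steps above.
    print(sha512_of_file("harvester-master-amd64.iso"))
```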

@PhanLe1010
Author

There is still one item for this ticket:

Build growth rate Grafana dashboard in the livestock cluster 

I will handle this one

@noahgildersleeve noahgildersleeve added the not-require/test-plan Skip to create a e2e automation test issue label Aug 22, 2022